# 29412 Cust: AddisonWesley Au: BarYam Pg. No. i
Title: Dynamics Complex Systems
Short / Normal / Long
FMadBARYAM_29412 3/10/02 11:05 AM Page i
Studies in Nonlinearity
Series Editor: Robert L. Devaney

Ralph Abraham, Dynamics: The Geometry of Behavior
Ralph H. Abraham and Christopher D. Shaw, Dynamics: The Geometry of Behavior
Robert L. Devaney, Chaos, Fractals, and Dynamics: Computer Experiments in Mathematics
Robert L. Devaney, A First Course in Chaotic Dynamical Systems: Theory and Experiment
Robert L. Devaney, An Introduction to Chaotic Dynamical Systems, Second Edition
Robert L. Devaney, James F. Georges, Delbert L. Johnson, Chaotic Dynamical Systems Software
Gerald A. Edgar (ed.), Classics on Fractals
James Georges, Del Johnson, and Robert L. Devaney, Dynamical Systems Software
Michael McGuire, An Eye for Fractals
Steven H. Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering
Nicholas B. Tufillaro, Tyler Abbott, and Jeremiah Reilly, An Experimental Approach to Nonlinear Dynamics and Chaos
Yaneer Bar-Yam

Dynamics of Complex Systems

The Advanced Book Program
Addison-Wesley
Reading, Massachusetts
Figure 2.4.1 © 1992 Benjamin Cummings, from E. N. Marieb/Human Anatomy and Physiology. Used with permission.

Figure 7.1.1 (bottom) by Brad Smith, Elwood Linney, and the Center for In Vivo Microscopy at Duke University (A National Center for Research Resources, NIH). Used with permission.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters.

Library of Congress Cataloging-in-Publication Data

Bar-Yam, Yaneer.
Dynamics of complex systems / Yaneer Bar-Yam.
p. cm.
Includes index.
ISBN 0-201-55748-7
1. Biomathematics. 2. System theory. I. Title.
QH323.5.B358 1997
570'.15'1—DC21 96-52033
CIP

Copyright © 1997 by Yaneer Bar-Yam

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.

Addison-Wesley is an imprint of Addison Wesley Longman, Inc.

Cover design by Suzanne Heiser and Yaneer Bar-Yam
Text design by Jean Hammond
Set in 10/12.5 Minion by Carlisle Communications, LTD

1 2 3 4 5 6 7 8 9—MA—0100999897

First printing, August 1997

Find us on the World Wide Web at
http://www.aw.com/gb/
This book is dedicated with love to my family
Zvi, Miriam, Aureet and Sageet
Naomi
and our children
Shlomiya, Yavni, Maayan and Taeer
Aureet’s memory is a blessing.
Contents

Preface xi
Acknowledgments xv

0 Overview: The Dynamics of Complex Systems—Examples, Questions, Methods and Concepts 1
0.1 The Field of Complex Systems 1
0.2 Examples 2
0.3 Questions 6
0.4 Methods 8
0.5 Concepts: Emergence and Complexity 9
0.6 For the Instructor 14

1 Introduction and Preliminaries 16
1.1 Iterative Maps (and Chaos) 19
1.2 Stochastic Iterative Maps 38
1.3 Thermodynamics and Statistical Mechanics 58
1.4 Activated Processes (and Glasses) 95
1.5 Cellular Automata 112
1.6 Statistical Fields 145
1.7 Computer Simulations (Monte Carlo, Simulated Annealing) 186
1.8 Information 214
1.9 Computation 235
1.10 Fractals, Scaling and Renormalization 258

2 Neural Networks I: Subdivision and Hierarchy 295
2.1 Neural Networks: Brain and Mind 296
2.2 Attractor Networks 300
2.3 Feedforward Networks 322
2.4 Subdivided Neural Networks 328
2.5 Analysis and Simulations of Subdivided Networks 345
2.6 From Subdivision to Hierarchy 364
2.7 Subdivision as a General Phenomenon 366

3 Neural Networks II: Models of Mind 371
3.1 Sleep and Subdivision Training 372
3.2 Brain Function and Models of Mind 393

4 Protein Folding I: Size Scaling of Time 420
4.1 The Protein-Folding Problem 421
4.2 Introduction to the Models 427
4.3 Parallel Processing in a Two-Spin Model 432
4.4 Homogeneous Systems 435
4.5 Inhomogeneous Systems 458
4.6 Conclusions 471

5 Protein Folding II: Kinetic Pathways 472
5.1 Phase Space Channels as Kinetic Pathways 473
5.2 Polymer Dynamics: Scaling Theory 477
5.3 Polymer Dynamics: Simulations 488
5.4 Polymer Collapse 503

6 Life I: Evolution—Origin of Complex Organisms 528
6.1 Living Organisms and Environments 529
6.2 Evolution Theory and Phenomenology 531
6.3 Genome, Phenome and Fitness 542
6.4 Exploration, Optimization and Population Interactions 550
6.5 Reproduction and Selection by Resources and Predators 576
6.6 Collective Evolution: Genes, Organisms and Populations 604
6.7 Conclusions 619

7 Life II: Developmental Biology—Complex by Design 621
7.1 Developmental Biology: Programming a Brick 621
7.2 Differentiation: Patterns in Animal Colors 626
7.3 Developmental Tool Kit 678
7.4 Theory, Mathematical Modeling and Biology 688
7.5 Principles of Self-Organization as Organization by Design 691
7.6 Pattern Formation and Evolution 695

8 Human Civilization I: Defining Complexity 699
8.1 Motivation 699
8.2 Complexity of Mathematical Models 705
8.3 Complexity of Physical Systems 716
8.4 Complexity Estimation 759

9 Human Civilization II: A Complex(ity) Transition 782
9.1 Introduction: Complex Systems and Social Policy 783
9.2 Inside a Complex System 788
9.3 Is Human Civilization a Complex System? 791
9.4 Toward a Networked Global Economy 796
9.5 Consequences of a Transition in Complexity 815
9.6 Civilization Itself 822

Additional Readings 827
Index 839
Preface
"Complex" is a word of the times, as in the often-quoted "growing complexity of life." Science has begun to try to understand complexity in nature, a counterpoint to the traditional scientific objective of understanding the fundamental simplicity of laws of nature. It is believed, however, that even in the study of complexity there exist simple and therefore comprehensible laws. The field of study of complex systems holds that the dynamics of complex systems are founded on universal principles that may be used to describe disparate problems ranging from particle physics to the economics of societies. A corollary is that transferring ideas and results from investigators in hitherto disparate areas will cross-fertilize and lead to important new results.
In this text we introduce several of the problems of science that embody the concept of complex dynamical systems. Each is an active area of research that is at the forefront of science. Our presentation does not try to provide a comprehensive review of the research literature available in each area. Instead we use each problem as an opportunity for discussing fundamental issues that are shared among all areas and therefore can be said to unify the study of complex systems.
We do not expect it to be possible to provide a succinct definition of a complex system. Instead, we give examples of such systems and provide the elements of a definition. It is helpful to begin by describing some of the attributes that characterize complex systems. Complex systems contain a large number of mutually interacting parts. Even a few interacting objects can behave in complex ways. However, the complex systems that we are interested in have more than just a few parts. And yet there is generally a limit to the number of parts that we are interested in. If there are too many parts, even if these parts are strongly interacting, the properties of the system become the domain of conventional thermodynamics—a uniform material.
Thus far we have defined complex systems as being within the mesoscopic domain—containing more than a few, and less than too many, parts. However, the mesoscopic regime describes any physical system on a particular length scale, and this is too broad a definition for our purposes. Another characteristic of most complex dynamical systems is that they are in some sense purposive. This means that the dynamics of the system has a definable objective or function. There often is some sense in which the systems are engineered. We address this topic directly when we discuss and contrast self-organization and organization by design.
A central goal of this text is to develop models and modeling techniques that are useful when applied to all complex systems. For this we will adopt both analytic tools and computer simulation. Among the analytic techniques are statistical mechanics and stochastic dynamics. Among the computer simulation techniques are cellular automata and Monte Carlo. Since analytic treatments do not yield complete theories of complex systems, computer simulations play a key role in our understanding of how these systems work.
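As a taste of the kind of simulation tool referred to here (cellular automata are developed in Section 1.5), a one-dimensional cellular automaton can be run in a few lines. This sketch is not taken from the text: the particular rule (rule 90, each cell becomes the exclusive-or of its two neighbors), the lattice width, and the number of steps are all arbitrary illustrative choices.

```python
# Minimal 1D cellular automaton sketch (rule 90): each cell becomes the XOR
# of its two neighbors at the previous time step, with periodic boundaries.
# Rule, lattice size, and step count are illustrative assumptions.

def step(cells):
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]

def run(cells, steps):
    history = [cells]
    for _ in range(steps):
        cells = step(cells)
        history.append(cells)
    return history

if __name__ == "__main__":
    width = 31
    initial = [0] * width
    initial[width // 2] = 1  # single seed cell in the center
    for row in run(initial, 15):
        print("".join(".#"[c] for c in row))
```

Starting from a single seed, this rule generates the familiar Sierpinski-triangle pattern, a simple example of complex structure emerging from a trivially small local rule.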
The human brain is an important example of a complex system formed out of its component neurons. Computers might similarly be understood as complex interacting systems of transistors. Our brains are well suited for understanding complex systems, but not for simulating them. Why are computers better suited to simulations of complex systems? One could point to the need for precision that is the traditional domain of the computer. However, a better reason would be the difficulty the brain has in keeping track of many and arbitrary interacting objects or events—we can typically remember seven independent pieces of information at once. The reasons for this are an important part of the design of the brain that make it powerful for other purposes. The architecture of the brain will be discussed beginning in Chapter 2.
The study of the dynamics of complex systems creates a host of new interdisciplinary fields. It not only breaks down barriers between physics, chemistry and biology, but also between these disciplines and the so-called soft sciences of psychology, sociology, economics, and anthropology. As this breakdown occurs it becomes necessary to introduce or adopt a new vocabulary. Included in this new vocabulary are words that have been considered taboo in one area while being extensively used in another. These must be adopted and adapted to make them part of the interdisciplinary discourse. One example is the word "mind." While the field of biology studies the brain, the field of psychology considers the mind. However, as the study of neural networks progresses, it is anticipated that the function of the neural network will become identified with the concept of mind.
Another area in which science has traditionally been mute is in the concept of meaning or purpose. The field of science traditionally has no concept of values or valuation. Its objective is to describe natural phenomena without assigning positive or negative connotation to the description. However, the description of complex systems requires a notion of purpose, since the systems are generally purposive. Within the context of purpose there may be a concept of value and valuation. If, as we will attempt to do, we address society or civilization as a complex system and identify its purpose, then value and valuation may also become a concept that attains scientific significance. There are even further possibilities of identifying value, since the very concept of complexity allows us to identify value with complexity through its difficulty of replacement. As is usual with any scientific advance, there are both dangers and opportunities with such developments.
Finally, it is curious that the origin and fate of the universe has become an accepted subject of scientific discourse—cosmology and the big bang theory—while the fate of humankind is generally the subject of religion and science fiction. There are exceptions to this rule, particularly surrounding the field of ecology—limits to population growth, global warming—however, this is only a limited selection of topics that could be addressed. Overcoming this limitation may be only a matter of having the appropriate tools. Developing the tools to address questions about the dynamics of human civilization is appropriate within the study of complex systems. It should also be recognized that as science expands to address these issues, science itself will change as it redefines and changes other fields.
Different fields are often distinguished more by the type of questions they ask than the systems they study. A significant effort has been made in this text to articulate questions, though not always to provide complete answers, since questions that define the field of complex systems will inspire more progress than answers at this early stage in the development of the field.
Like other fields, the field of complex systems has many aspects, and any text must make choices about which material to include. We have suggested that complex systems have more than a few parts and less than too many of them. There are two approaches to this intermediate regime. The first is to consider systems with more than a few parts, but still a denumerable number—denumerable, that is, by a single person in a reasonable amount of time. The second is to consider many parts, but just fewer than too many. In the first approach the main task is to describe the behavior of a particular system and its mechanism of operation—the function of a neural network of a few to a few hundred neurons, a few-celled organism, a small protein, a few people, etc. This is done by describing completely the role of each of the parts. In the second approach, the precise number of parts is not essential, and the main task is a statistical study of a collection of systems that differ from each other but share the same structure—an ensemble of systems. This approach treats general properties of proteins, neural networks, societies, etc. In this text, we adopt the second approach. However, an interesting twist to our discussion is that we will show that any complex system requires a description as a particular few-part system. A complementary volume to the present one would consider examples of systems with only a few parts and analyze their function with a view toward extracting general principles. These principles would complement the seemingly more general analysis of the statistical approach.
The order of presentation of the topics in this text is a matter of taste. Many of the chapters are self-contained discussions of a particular system or question. The first chapter contains material that provides a foundation for the rest. Part of the role of this chapter is the introduction of "simple" models upon which the remainder of the text is based. Another role is the review of concepts and techniques that will be used in later chapters so that the text is more self-contained. Because of the interdisciplinary nature of the subject matter, the first chapter is considered to have particular importance. Some of the material should be familiar to most graduate students, while other material is found only in the professional literature. For example, basic probability theory is reviewed, as well as the concepts and properties of cellular automata. The purpose is to enable this text to be read by students and researchers with a variety of backgrounds. However, it should be apparent that digesting the variety of concepts after only a brief presentation is a difficult task. Additional sources of material are listed at the end of this text.
Throughout the book, we have sought to limit advanced formal discussions to a minimum. When possible, we select models that can be described with a simpler formalism than must be used to treat the most general case possible. Where additional layers of formalism are particularly appropriate, reference is made to other literature. Simulations are described at a level of detail that, in most cases, should enable the student to perform and expand upon the simulations described. The graphical display of such simulations should be used as an integral part of exposure to the dynamics of these systems. Such displays are generally effective in developing an intuition about what are the important or relevant properties of these systems.
Acknowledgments
This book is a composite of many ideas and reflects the efforts of many individuals that would be impossible to acknowledge. My personal efforts to compose this body of knowledge into a coherent framework for future study are also indebted to many who contributed to my own development. It is the earliest teachers, who we can no longer identify by memory, who should be acknowledged at the completion of a major effort. They and the teachers I remember from elementary school through graduate school, especially my thesis advisor John Joannopoulos, have my deepest gratitude. Consistent with their dedication, may this be a reward for their efforts.
The study of complex systems is a new endeavor, and I am grateful to a few colleagues and teachers who have inspired me to pursue this path. Charles Bennett through a few joint car trips opened my mind to the possibilities of this field and the paths less trodden that lead to it. Tom Malone, through his course on networked corporations, not only contributed significant concepts to the last chapter of this book, but also motivated the creation of my course on the dynamics of complex systems.
There are colleagues and students who have inspired or contributed to my understanding of various aspects of material covered in this text. Some of this contribution arises from reading and commenting on various aspects of this text, or through discussions of the material that eventually made its way here. In some cases the discussions were originally on unrelated matters, but because they were eventually connected to these subjects, they are here acknowledged. Roughly divided by area in correspondence with the order they appear in the text these include: Glasses—David Adler; Cellular Automata—Gerard Vichniac, Tom Toffoli, Norman Margolus, Mike Biafore, Eytan Domany, Danny Kandel; Computation—Jeff Siskind; Multigrid—Achi Brandt, Shlomi Taasan, Sorin Costiner; Neural Networks—John Hopfield, Sageet Bar-Yam, Tom Kincaid, Paul Appelbaum, Charles Yang, Reza Sadr-Lahijany, Jason Redi, Lee-Peng Lee, Hua Yang, Jerome Kagan, Ernest Hartmann; Protein Folding—Elisha Haas, Charles DeLisi, Temple Smith, Robert Davenport, David Mukamel, Mehran Kardar; Polymer Dynamics—Yitzhak Rabin, Mark Smith, Boris Ostrovsky, Gavin Crooks, Eliana DeBernardez-Clark; Evolution—Alan Perelson, Derren Pierre, Daniel Goldman, Stuart Kauffman, Les Kaufman; Developmental Biology—Irving Epstein, Lee Segel, Ainat Rogel, Evelyn Fox Keller; Complexity—Charles Bennett, Michael Werman, Michel Baranger; Human Economies and Societies—Tom Malone, Harry Bloom, Benjamin Samuels, Kosta Tsipis, Jonathan King.
A special acknowledgment is necessary to the students of my course from Boston University and MIT. Among them are students whose projects became incorporated in parts of this text and are mentioned above. The interest that my colleagues have shown by attending and participating in the course has brightened it for me and their contributions are meaningful: Lewis Lipsitz, Michel Baranger, Paul Barbone, George Wyner, Alice Davidson, Ed Siegel, Michael Werman, Larry Rudolf and Mehran Kardar.
Among the readers of this text I am particularly indebted to the detailed comments of Bruce Boghosian, and the supportive comments of the series editor Bob Devaney. I am also indebted to the support of Charles Cantor and Jerome Kagan.
I would like to acknowledge the constructive efforts of the editors at Addison-Wesley starting from the initial contact with Jack Repcheck and continuing with Jeff Robbins. I thank Lynne Reed for coordinating production, and at Carlisle Communications: Susan Steines, Bev Kraus, Faye Schilling, and Kathy Davis.
The software used for the text, graphs, figures and simulations of this book includes: Microsoft Excel and Word, Deneba Canvas, Wolfram's Mathematica, and Symantec C. The hardware includes: Macintosh Quadra, and IBM RISC workstations.
The contributions of my family, to whom this book is dedicated, cannot be described in a few words.
Yaneer Bar-Yam
Newton, Massachusetts, June 1997
0 Overview: The Dynamics of Complex Systems—Examples, Questions, Methods and Concepts
0.1 The Field of Complex Systems
The study of complex systems in a unified framework has become recognized in recent years as a new scientific discipline, the ultimate of interdisciplinary fields. It is strongly rooted in the advances that have been made in diverse fields ranging from physics to anthropology, from which it draws inspiration and to which it is relevant.

Many of the systems that surround us are complex. The goal of understanding their properties motivates much if not all of scientific inquiry. Despite the great complexity and variety of systems, universal laws and phenomena are essential to our inquiry and to our understanding. The idea that all matter is formed out of the same building blocks is one of the original concepts of science. The modern manifestation of this concept—atoms and their constituent particles—is essential to our recognition of the commonality among systems in science. The universality of constituents complements the universality of mechanical laws (classical or quantum) that govern their motion. In biology, the common molecular and cellular mechanisms of a large variety of organisms form the basis of our studies. However, even more universal than the constituents are the dynamic processes of variation and selection that in some manner cause organisms to evolve. Thus, all scientific endeavor is based, to a greater or lesser degree, on the existence of universality, which manifests itself in diverse ways. In this context, the study of complex systems as a new endeavor strives to increase our ability to understand the universality that arises when systems are highly complex.
A dictionary definition of the word "complex" is: "consisting of interconnected or interwoven parts." Why is the nature of a complex system inherently related to its parts? Simple systems are also formed out of parts. To explain the difference between simple and complex systems, the terms "interconnected" or "interwoven" are somehow essential. Qualitatively, to understand the behavior of a complex system we must understand not only the behavior of the parts but how they act together to form the behavior of the whole. It is because we cannot describe the whole without describing each part, and because each part must be described in relation to other parts, that complex systems are difficult to understand. This is relevant to another definition of "complex": "not easy to understand or analyze." These qualitative ideas about what a complex system is can be made more quantitative. Articulating them in a clear way is both essential and fruitful in pointing the way toward progress in understanding the universal properties of these systems.
For many years, professional specialization has led science to progressive isolation of individual disciplines. How is it possible that well-separated fields such as molecular biology and economics can suddenly become unified in a single discipline? How does the study of complex systems in general pertain to the detailed efforts devoted to the study of particular complex systems? In this regard one must be careful to acknowledge that there is always a dichotomy between universality and specificity. A study of universal principles does not replace detailed description of particular complex systems. However, universal principles and tools guide and simplify our inquiries into the study of specifics. For the study of complex systems, universal simplifications are particularly important. Sometimes universal principles are intuitively appreciated without being explicitly stated. However, a careful articulation of such principles can enable us to approach particular systems with a systematic guidance that is often absent in the study of complex systems.
A pictorial way of illustrating the relationship of the field of complex systems to the many other fields of science is indicated in Fig. 0.1.1. This figure shows the conventional view of science as progressively separating into disparate disciplines in order to gain knowledge about the ever larger complexity of systems. It also illustrates the view of the field of complex systems, which suggests that all complex systems have universal properties. Because each field develops tools for addressing the complexity of the systems in their domain, many of these tools can be adapted for more general use by recognizing their universal applicability. Hence the motivation for cross-disciplinary fertilization in the study of complex systems.
In Sections 0.2–0.4 we initiate our study of complex systems by discussing examples, questions and methods that are relevant to the study of complex systems. Our purpose is to introduce the field without a strong bias as to conclusions, so that the student can develop independent perspectives that may be useful in this new field—opening the way to his or her own contributions to the study of complex systems. In Section 0.5 we introduce two key concepts—emergence and complexity—that will arise through our study of complex systems in this text.
0.2 Examples
0.2.1 A few examples
What are complex systems and what properties characterize them? It is helpful to start by making a list of some examples of complex systems. Take a few minutes to make your own list. Consider actual systems rather than mathematical models (we will consider mathematical models later). Make a list of some simple things to contrast them with.
Examples of Complex Systems
Governments
Families
The human body—physiological perspective
[Figure 0.1.1 here: two panels, (a) and (b), in which the disciplines (Physics, Chemistry, Biology, Mathematics, Computer Science, Sociology, Psychology, Economics, Anthropology, Philosophy) span the range from simple systems to complex systems.]

Figure 0.1.1 Conceptual illustration of the space of scientific inquiry. (a) is the conventional view where disciplines diverge as knowledge increases because of the increasing complexity of the various systems being studied. In this view all knowledge is specific and knowledge is gained by providing more and more details. (b) illustrates the view of the field of complex systems where complex systems have universal properties. By considering the common properties of complex systems, one can approach the specifics of particular complex systems from the top of the sphere as well as from the bottom.
A person—psychosocial perspective
The brain
The ecosystem of the world
Subworld ecosystems: desert, rain forest, ocean
Weather
A corporation
A computer
Examples of Simple Systems
An oscillator
A pendulum
A spinning wheel
An orbiting planet
The purpose of thinking about examples is to develop a first understanding of the question, What makes systems complex? To begin to address this question we can start describing systems we know intuitively as complex and see what properties they share. We try this with the first two examples listed above as complex systems.
Government
• It has many different functions: military, immigration, taxation, income distribution, transportation, regulation. Each function is itself complex.
• There are different levels and types of government: local, state and federal; town meeting, council, mayoral. There are also various governmental forms in different countries.
Family
• It is a set of individuals.
• Each individual has a relationship with the other individuals.
• There is an interplay between the relationship and the qualities of the individual.
• The family has to interact with the outside world.
• There are different kinds of families: nuclear family, extended family, etc.
These descriptions focus on function and structure and diverse manifestation. We can also consider the role that time plays in complex systems. Among the properties of complex systems are change, growth and death, possibly some form of life cycle. Combining time and the environment, we would point to the ability of complex systems to adapt.
One of the issues that we will need to address is whether there are different categories of complex systems. For example, we might contrast the systems we just described with complex physical systems: hydrodynamics (fluid flow, weather), glasses, composite materials, earthquakes. In what way are these systems similar to or different from the biological or social complex systems? Can we assign function and discuss structure in the same way?
0.2.2 Central properties of complex systems
After beginning to describe complex systems, a second step is to identify commonalities. We might make a list of some of the characteristics of complex systems and assign each of them some measure or attribute that can provide a first method of classification or description.
• Elements (and their number)
• Interactions (and their strength)
• Formation/Operation (and their time scales)
• Diversity/Variability
• Environment (and its demands)
• Activit y(ies) (and its[their] objective[s])
This is a first step toward quantifying the properties of complex systems. Quantifying the last three in the list requires some method of counting possibilities. The problem of counting possibilities is central to the discussion of quantitative complexity.
0.2.3 Emergence: From elements and parts to complex systems
There are two approaches to organizing the properties of complex systems that will serve as the foundation of our discussions. The first of these is the relationship between elements, parts and the whole. Since there is only one property of the complex system that we know for sure—that it is complex—the primary question we can ask about this relationship is how the complexity of the whole is related to the complexity of the parts. As we will see, this question is a compelling question for our understanding of complex systems.
From the examples we have indicated above, it is apparent that parts of a complex system are often complex systems themselves. This is reasonable, because when the parts of a system are complex, it seems intuitive that a collection of them would also be complex. However, this is not the only possibility.
Can we describe a system composed of simple parts where the collective behavior is complex? This is an important possibility, called emergent complexity. Any complex system formed out of atoms is an example. The idea of emergent complexity is that the behaviors of many simple parts interact in such a way that the behavior of the whole is complex. Elements are those parts of a complex system that may be considered simple when describing the behavior of the whole.
Can we describe a system composed of complex parts where the collective behavior is simple? This is also possible, and it is called emergent simplicity. A useful example is a planet orbiting around a star. The behavior of the planet is quite simple, even if the planet is the Earth, with many complex systems upon it. This example illustrates the possibility that the collective system has a behavior at a different scale than its parts. On the smaller scale the system may behave in a complex way, but on the larger scale all the complex details may not be relevant.
0.2.4 What is complexity?
The second approach to the study of complex systems begins from an understanding of the relationship of systems to their descriptions. The central issue is defining quantitatively what we mean by complexity. What, after all, do we mean when we say that a system is complex? Better yet, what do we mean when we say that one system is more complex than another? Is there a way to identify the complexity of one system and to compare it with the complexity of another system? To develop a quantitative understanding of complexity we will use tools of both statistical physics and computer science—information theory and computation theory. According to this understanding, complexity is the amount of information necessary to describe a system. However, in order to arrive at a consistent definition, care must be taken to specify the level of detail provided in the description.
One of our targets is to understand how this concept of complexity is related to emergence—emergent complexity and emergent simplicity. Can we understand why information-based complexity is related to the description of elements, and how their behavior gives rise to the collective complexity of the whole system?
Section 0.5 of this overview discusses further the concepts of emergence and complexity, providing a simplified preview of the more complete discussions later in this text.
0.3 Questions
This text is structured around four questions related to the characterization of complex systems:
1. Space: What are the characteristics of the structure of complex systems? Many complex systems have substructure that extends all the way to the size of the system itself. Why is there substructure?
2. Time: How long do dynamical processes take in complex systems? Many complex systems have specific responses to changes in their environment that require changing their internal structure. How can a complex structure respond in a reasonable amount of time?
3. Self-organization and/versus organization by design: How do complex systems come into existence? What are the dynamical processes that can give rise to complex systems? Many complex systems undergo guided developmental processes as part of their formation. How are developmental processes guided?
4. Complexity: What is complexity? Complex systems have varying degrees of complexity. How do we characterize/distinguish the varying degrees of complexity?
Chapter 1 of this text plays a special role. Its ten sections introduce mathematical tools. These tools and their related concepts are integral to our understanding of complex system behavior. The main part of this book consists of eight chapters, 2–9. These
chapters are paired. Each pair discusses one of the above four questions in the context of a particular complex system. Chapters 2 and 3 discuss the role of substructure in the context of neural networks. Chapters 4 and 5 discuss the time scale of dynamics in the context of protein folding. Chapters 6 and 7 discuss the mechanisms of organization of complex systems in the context of living organisms. Chapters 8 and 9 discuss complexity in the context of human civilization. In each case the first of the pair of chapters discusses more general issues and models. The second tends to be more specialized to the system that is under discussion. There is also a pattern to the degree of analytic, simulation or qualitative treatments. In general, the first of the two chapters is more analytic, while the second relies more on simulations or qualitative treatments. Each chapter has at least some discussion of qualitative concepts in addition to the formal quantitative discussion.
Another way to regard the text is to distinguish between the two approaches summarized above. The first deals with elements and interactions. The second deals with descriptions and information. Ultimately, our objective is to relate them, but we do so using questions that progress gradually from the elements and interactions to the descriptions and information. The former dominates in earlier chapters, while the latter is important for Chapter 6 and becomes dominant in Chapters 8 and 9.
While the discussion in each chapter is presented in the context of a specific complex system, our focus is on complex systems in general. Thus, we do not attempt (nor would it be possible) to review the entire fields of neural networks, protein folding, evolution, developmental biology and social and economic sciences. Since we are interested in universal aspects of these systems, the topics we cover need not be the issues of contemporary importance in the study of these systems. Our approach is to motivate a question of interest in the context of complex systems using a particular complex system, then to step back and adopt a method of study that has relevance to all complex systems. Researchers interested in a particular complex system are as likely to find a discussion of interest to them in any one of the chapters, and should not focus on the chapter with the particular complex system in its title.
We note that the text is interrupted by questions that are, with few exceptions, solved in the text. They are given as questions to promote independent thought about the study of complex systems. Some of them develop further the analysis of a system through analytic work or through simulations. Others are designed for conceptual development. With few exceptions they should be considered integral to the text, and even if they are not solved by the reader, the solutions should be read.
Question 0.3.1 Consider a few complex systems. Make a list of their elements, interactions between these elements, the mechanism by which the system is formed and the activities in which the system is engaged.
Solution 0.3.1 The following table indicates properties of the systems that we will be discussing most intensively in this text.
Table 0.3.1: Complex Systems and Some Attributes

System             Element       Interaction                  Formation         Activity
Proteins           Amino Acids   Bonds                        Protein folding   Enzymatic activity
Nervous system     Neurons       Synapses                     Learning          Behavior,
(neural networks)                                                               thought
Physiology         Cells         Chemical messengers,         Developmental     Movement,
                                 physical support             biology           physiological functions
Life               Organisms     Reproduction, competition,   Evolution         Survival, reproduction,
                                 predation, communication                       consumption, excretion
Human economies    Human Beings  Communication, technology,   Social evolution  Same as Life?
and societies                    confrontation, cooperation                     Exploration?
0.4 Methods
When we think about methodology, we must keep purpose in mind. Our purpose in studying complex systems is to extract general principles. General principles can take many forms. Most principles are articulated as relationships between properties—when a system has the property x, then it has the property y. When possible, relationships should be quantitative and expressed as equations. In order to explore such relationships, we must construct and study mathematical models. Asking why the property x is related to the property y requires an understanding of alternatives. What else is possible? As a bonus, when we are able to generate systems with various properties, we may also be able to use them for practical applications.
All approaches that are used for the study of simple systems can be applied to the study of complex systems. However, it is important to recognize features of conventional approaches that may hamper progress in the study of complex systems. Both experimental and theoretical methods have been developed to overcome these difficulties. In this text we introduce and use methods of analysis and simulation that are particularly suited to the study of complex systems. These methods avoid standard simplifying assumptions, but use other simplifications that are better suited to our objectives. We discuss some of these in the following paragraphs.
• Don’t take it apart. Since interactions between parts of a complex system are es
sential to understanding its behavior, looking at parts by themselves is not suffi
cient. It is necessary to look at parts in the context of the whole. Similarly, a com
plex syst em int er acts with its environment, and this e nvironmental influence is
0 . 4
00adBARYAM_29412 9/5/00 7:26 PM Page 8
impor tant in describing the behavior of the system. Exper imental tools have been
developed for studying systems in situ or in vivo—in context. Theoretical analyt ic
methods such as the mean field ap proach enable parts of a system to be studied
in context. Computer simulations that tr eat a system in its entiret y also avoid
such problems.
• Don’t assume smoo t h n e s s . Mu ch of the qu a n ti t a tive stu dy of simple sys tems make s
use of d i f feren tial equ a ti on s . Di f feren tial equ a ti on s ,l i ke the wave equ a ti on ,a s su m e
that a sys tem is essen ti a lly uniform and that local details don’t matter for the be
h avi or of a sys tem on larger scales. These assu m pti ons are not gen er a lly valid for
com p l ex sys tem s . Al tern a te static models su ch as fract a l s , and dynamical models in
cluding itera t ive maps and cellular automata may be used inste ad .
• Don’t assume that only a few parameters are important. The behavior of complex
systems depends on many independent pieces of information. Developing an un
derstanding of them requires us to build mental models. However, we can onl y
have “in mind” 7±2 independent things at once. Analytic approaches, such as
scaling and renormalization,have been developed to identify the few relevant pa
rameters when this is possible. Informationbased approaches consider the col
lection of all parameters as the object of study. Computer simulations keep track
of many parameters and may be used in the st udy of dynamical processes.
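To make the second point above concrete, here is a minimal sketch of one of the non-smooth dynamical models just mentioned, an elementary (one-dimensional, two-state) cellular automaton. The particular rule number (90), lattice size and initial condition are illustrative choices, not prescriptions from the text.

```python
def ca_step(cells, rule=90):
    """One synchronous update of an elementary cellular automaton.

    Each cell's next state depends only on its left neighbor, itself, and its
    right neighbor; the 8 possible neighborhoods (read as a 3-bit number)
    index into the bits of the rule number. Boundaries wrap around.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Start from a single ON cell; rule 90 (each cell becomes the XOR of its
# neighbors) generates a Sierpinski-like pattern with structure on all scales.
cells = [0] * 31
cells[15] = 1
for _ in range(8):
    print("".join(".#"[c] for c in cells))
    cells = ca_step(cells, rule=90)
```

Despite the locality and simplicity of the update rule, no smoothness assumption holds: neighboring cells need not have similar values, which is exactly the regime where differential equations fail.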
There are also tools needed for communication of the results of studies. Conventional manuscripts and oral presentations are now being augmented by video and interactive media. Such novel approaches can increase the effectiveness of communication, particularly of the results of computer simulations. However, we should avoid the "cute picture" syndrome, where pictures are presented without accompanying discussion or analysis.
In this text, we introduce and use a variety of analytic and computer simulation methods to address the questions listed in the previous section. As mentioned in the preface, there are two general methods for studying complex systems. In the first, a specific system is selected and each of the parts as well as their interactions are identified and described. Subsequently, the objective is to show how the behavior of the whole emerges from them. The second approach considers a class of systems (ensemble), where the essential characteristics of the class are described, and statistical analysis is used to obtain properties and behaviors of the systems. In this text we focus on the latter approach.
0.5 Concepts: Emergence and Complexity
The objectives of the field of complex systems are built on fundamental concepts—emergence, complexity—about which there are common misconceptions that are addressed in this section and throughout the book. Once understood, these concepts reveal the context in which universal properties of complex systems arise and specific universal phenomena, such as the evolution of biological systems, can be better understood.
A complex system is a system formed out of many components whose behavior is emergent, that is, the behavior of the system cannot be simply inferred from the behavior of its components. The amount of information necessary to describe the behavior of such a system is a measure of its complexity. In the following sections we discuss these concepts in greater detail.
0.5.1 Emergence
It is impossible to understand complex systems without recognizing that simple atoms must somehow, in large numbers, give rise to complex collective behaviors. How and when this occurs is the simplest and yet the most profound problem that the study of complex systems faces. The problem can be approached first by developing an understanding of the term "emergence." For many, the concept of emergent behavior means that the behavior is not captured by the behavior of the parts. This is a serious misunderstanding. It arises because the collective behavior is not readily understood from the behavior of the parts. The collective behavior is, however, contained in the behavior of the parts if they are studied in the context in which they are found. To explain this, we discuss examples of emergent properties that illustrate the difference between local emergence—where collective behavior appears in a small part of the system—and global emergence—where collective behavior pertains to the system as a whole. It is the latter which is particularly relevant to the study of complex systems.
We can speak about emergence when we consider a collection of elements and the properties of the collective behavior of these elements. In conventional physics, the main arena for the study of such properties is thermodynamics and statistical mechanics. The easiest thermodynamic system to think about is a gas of particles. Two emergent properties of a gas are its pressure and temperature. The reason they are emergent is that they do not naturally arise out of the description of an individual particle. We generally describe a particle by specifying its position and velocity. Pressure and temperature become relevant only when we have many particles together. While these are emergent properties, the way they are emergent is very limited. We call them local emergent properties. The pressure and temperature are local properties of the gas. We can take a very small sample of the gas away from the rest and still define and measure the (same) pressure and temperature. Such properties, called intensive in physics, are local emergent properties. Other examples from physics of locally emergent behavior are collective modes of excitation such as sound waves, or light propagation in a medium. Phase transitions (e.g., solid to liquid) also represent a collective dynamics that is visible on a macroscopic scale, but can be seen in a microscopic sample as well.
Another example of a local emergent property is the formation of water from atoms of hydrogen and oxygen. The properties of water are not apparent in the properties of gases of oxygen or hydrogen. Neither does an isolated water molecule reveal most properties of water. However, a microscopic amount of water is sufficient.
In the study of complex systems we are particularly interested in global emergent properties. Such properties depend on the entire system. The mathematical treatment of global emergent properties requires some effort. This is one reason that emergence is not well appreciated or understood. We will discuss global emergence by summarizing the results of a classic mathematical treatment, and then discuss it in a more general manner that can be readily appreciated and is useful for semiquantitative analyses.
The classic analysis of global emergent behavior is that of an associative memory in a simple model of neural networks known as the Hopfield or attractor network. The analogy to a neural network is useful in order to be concrete and relate this model to known concepts. However, this is more generally a model of any system formed from simple elements whose states are correlated. Without such correlations, emergent behavior is impossible. Yet if all elements are correlated in a simple way, then local emergent behavior is the outcome. Thus a model must be sufficiently rich in order to capture the phenomenon of global emergent behavior. One of the important qualities of the attractor network is that it displays global emergence in a particularly elegant manner. The following few paragraphs summarize the operation of the attractor network as an associative memory.
The Hopfield network has simple binary elements that are either ON or OFF. The binary elements are an abstraction of the firing or quiescent state of neurons. The elements interact with each other to create correlations in the firing patterns. The interactions represent the role of synapses in a neural network. The network can work as a memory. Given a set of preselected patterns, it is possible to set the interactions so that these patterns are self-consistent states of the network—the network is stable when it is in these firing patterns. Even if we change some of the neurons, the original pattern will be recovered. This is an associative memory.
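A minimal sketch of such an associative memory follows, using the standard Hebbian (outer-product) prescription for setting the interactions. The network size, number of stored patterns, amount of corruption, random seed and synchronous update scheme are all illustrative choices, not details taken from the analysis summarized above.

```python
import random

def train(patterns):
    """Hebbian (outer-product) rule: J[i][j] = sum over patterns of s_i * s_j,
    with no self-coupling (J[i][i] = 0)."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    J[i][j] += p[i] * p[j]
    return J

def recall(J, state, steps=10):
    """Repeated synchronous updates: each element aligns with its local field."""
    s = list(state)
    for _ in range(steps):
        s = [1 if sum(Jij * sj for Jij, sj in zip(row, s)) >= 0 else -1 for row in J]
    return s

random.seed(0)
# Three random +/-1 patterns of 100 elements each (well below memory capacity).
patterns = [[random.choice([-1, 1]) for _ in range(100)] for _ in range(3)]
J = train(patterns)

corrupted = list(patterns[0])
for i in range(20):               # flip 20 of the 100 elements
    corrupted[i] = -corrupted[i]
recovered = recall(J, corrupted)
print(sum(a == b for a, b in zip(recovered, patterns[0])))  # near 100: pattern restored
```

With only a few stored patterns relative to the number of elements, each stored pattern is a stable state of the dynamics, and a substantially corrupted version flows back to it.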
Assume for the moment that the pattern of firing represents a sentence, such as "To be or not to be, that is the question." We can recover the complete sentence by presenting only part of it to the network; "To be or not to be, that" might be enough. We could use any part to retrieve the whole, such as "to be, that is the question." This kind of memory is to be contrasted with a computer memory, which works by assigning an address to each storage location. To access the information stored in a particular location we need to know the address. In the neural network memory, we specify part of what is located there, rather than the analogous address: Hamlet, by William Shakespeare, act 3, scene 1, line 64.
More central to our discussion, however, is that in a computer memory a particular bit of information is stored in a particular switch. By contrast, the network does not have its memory in a neuron. Instead the memory is in the synapses. In the model, there are synapses between each neuron and every other neuron. If we remove a small part of the network and look at its properties, then the number of synapses that a neuron is left with in this small part is only a small fraction of the number of synapses it started with. If there are more than a few patterns stored, then when we cut out the small part of the network it loses the ability to remember any of the patterns, even the part which would be represented by the neurons contained in this part.
This kind of behavior characterizes emergent properties. We see that emergent properties cannot be studied by physically taking a system apart and looking at the parts (reductionism). They can, however, be studied by looking at each of the parts in the context of the system as a whole. This is the nature of emergence and an indication of how it can be studied and understood.
The above discussion reflects the analysis of a relatively simple mathematical model of emergent behavior. We can, however, provide a more qualitative discussion that serves as a guide for thinking about diverse complex systems. This discussion focuses on the properties of a system when part of it is removed. Our discussion of local emergent properties suggested that taking a small part out of a large system would cause little change in the properties of the small part, or the properties of the large part. On the other hand, when a system has a global emergent property, the behavior of the small part is different in isolation than when it is part of the larger system.
If we think about the system as a whole, rather than the small part of the system, we can identify the system that has a global emergent property as being formed out of interdependent parts. The term "interdependent" is used here instead of the terms "interconnected" or "interwoven" used in the dictionary definition of "complex" quoted in Section 0.1, because neither of the latter terms pertains directly to the influence one part has on another, which is essential to the properties of a dynamic system. "Interdependent" is also distinct from "interacting," because even strong interactions do not necessarily imply interdependence of behavior. This is clear from the macroscopic properties of simple solids.
Thus, we can characterize complex systems through the effect of removal of part of the system. There are two natural possibilities. The first is that properties of the part are affected, but the rest is not affected. The second is that properties of the rest are affected by the removal of a part. It is the latter that is most appealing as a model of a truly complex system. Such a system has a collective behavior that is dependent on the behavior of all of its parts. This concept becomes more precise when we connect it to a quantitative measure of complexity.
0.5.2 Complexity
The second concept that is central to complex systems is a quantitative measure of how complex a system is. Loosely speaking, the complexity of a system is the amount of information needed in order to describe it. The complexity depends on the level of detail required in the description. A more formal definition can be understood in a simple way. If we have a system that could have many possible states, but we would like to specify which state it is actually in, then the number of binary digits (bits) we need to specify this particular state is related to the number of states that are possible. If we call the number of states Ω, then the number of bits of information needed is
I = log2(Ω) (0.5.1)
To understand this we must realize that to specify which state the system is in, we must enumerate the states. Representing each state uniquely requires as many numbers as there are states. Thus the number of states of the representation must be the same as the number of states of the system. For a string of N bits there are 2^N possible states and thus we must have

Ω = 2^N (0.5.2)
which implies that N is the same as I above. Even if we use a descriptive English text instead of numbers, there must be the same number of possible descriptions as there are states, and the information content must be the same. When the number of possible valid English sentences is properly accounted for, it turns out that the best estimate of the amount of information in English is about 1 bit per character. This means that the information content of this sentence is about 120 bits, and that of this book is about 3 × 10^6 bits.
For a microstate of a physical system, where we specify the positions and momenta of each of the particles, this can be recognized as proportional to the entropy of the system, which is defined as

S = k ln(Ω) = k ln(2) I (0.5.3)

where k = 1.38 × 10^-23 joules/kelvin is the Boltzmann constant, which is relevant to our conventional choice of units. Using measured entropies we find that entropies of order 10 bits per atom are typical. The reason k is so small is that the quantities of matter we typically consider are in units of Avogadro's number (moles), and the number of bits per mole is 6.02 × 10^23 times as large. Thus, the information in a piece of material is of order 10^24 bits.
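The conversion in Eq. (0.5.3) can be carried out for a measured entropy. As a sketch, using the standard molar entropy of copper (about 33 J/(mol K)) as an illustrative input:

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
AVOGADRO = 6.02214076e23  # atoms per mole

def bits_per_atom(molar_entropy):
    """Invert S = k ln(2) I for one mole of material:
    bits per atom = S / (N_A * k * ln 2)."""
    return molar_entropy / (AVOGADRO * K_B * math.log(2))

# Copper's standard molar entropy is roughly 33 J/(mol K):
print(bits_per_atom(33.0))  # roughly 5.7, i.e. of order 10 bits per atom
```

This is consistent with the statement above that measured entropies are of order 10 bits per atom.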
There is one point about Eq. (0.5.3) that may require some clarification. The positions and momenta of particles are real numbers whose specification might require infinitely many bits. Why isn't the information necessary to specify the microstate of a system infinite? The answer to this question comes from quantum physics, which is responsible for giving a unique value to the entropy and thus the information needed to specify a state of the system. It does this in two ways. First, it tells us that microscopic states are indistinguishable unless they differ by a discrete amount in position and momentum—a quantum difference given by Planck's constant h. Second, it indicates that particles like nuclei or atoms in their ground state are uniquely specified by this state, and are indistinguishable from each other. There is no additional information necessary to specify their internal structure. Under standard conditions, essentially all nuclei are in their lowest energy state.
The relationship of entropy and information is not accidental, of course, but it is the source of much confusion. The confusion arises because the entropy of a physical system is largest when it is in equilibrium. This suggests that the most complex system is a system in equilibrium. This is counter to our usual understanding of complex systems. Equilibrium systems have no spatial structure and do not change over time. Complex systems have substantial internal structure and this structure changes over time.
The problem is that we have used the definition of the information necessary to specify the microscopic state (microstate) of the system rather than the macroscopic state (macrostate) of the system. We need to consider the information necessary to describe the macrostate of the system in order to define what we mean by complexity. One of the important points to realize is that in order for the macrostate of the system to require a lot of information to describe it, there must be correlations in the microstate of the system. It is only when many microscopic atoms move in a coherent fashion that we can see this motion on a macroscopic scale. However, if many
microscopic atoms move together, the system must be far from equilibrium and the microscopic information (entropy) must be lower than that of an equilibrium system.
It is helpful, even essential, to define a complexity profile, which is a function of the scale of observation. To obtain the complexity profile, we observe the system at a particular length (or time) scale, ignoring all finer-scale details. Then we consider how much information is necessary to describe the observations on this scale. This solves the problem of distinguishing between a microscopic and a macroscopic description. Moreover, for different choices of scale, it explicitly captures the dependence of the complexity on the level of detail that is required in the description.
The complexity profile must be a monotonically falling function of the scale. This is because the information needed to describe a system on a larger scale must be a subset of the information needed to describe the system on a smaller scale: any finer-scale description contains the coarser-scale description. The complexity profile characterizes the properties of a complex system. If we wish to point to a particular number for the complexity of a system, it is natural to consider the complexity as the value of the complexity profile at a scale that is slightly smaller than the size of the system itself. The behavior at this scale includes the movement of the system through space, and dynamical changes of the system that are essentially the size of the system as a whole. The Earth orbiting the sun is a useful example.
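As a toy illustration of these ideas (our construction, not the book's formalism), the sketch below coarse-grains a one-dimensional binary pattern by majority rule over blocks of increasing size, and counts the bits needed to describe the coarse-grained pattern assuming independent cells. For a pattern with structure on a scale of four cells, the estimated description length falls monotonically as the observation scale grows, a crude analog of the complexity profile's monotonic fall.

```python
import math

def coarse_grain(cells, block):
    """Majority-rule coarse-graining of a list of 0/1 cells into blocks."""
    out = []
    for i in range(0, len(cells) - block + 1, block):
        blk = cells[i:i + block]
        out.append(1 if sum(blk) * 2 >= len(blk) else 0)
    return out

def naive_information(cells):
    """Bits to specify the pattern if cells are treated as independent:
    number of cells times the binary entropy of the observed 0/1 frequency."""
    n = len(cells)
    p = sum(cells) / n
    if p in (0.0, 1.0):
        return 0.0
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return n * h

# A pattern with structure on scale ~4: alternating blocks of aligned cells.
pattern = ([1] * 4 + [0] * 4) * 8
for block in (1, 2, 4, 8):
    grained = coarse_grain(pattern, block)
    print(block, len(grained), round(naive_information(grained), 1))
```

At the scale of the correlated blocks, the description collapses to almost nothing; below that scale, every cell costs a bit.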
We can make a direct connection between this definition of complexity and the discussion of the formation of a complex system out of parts. The complexity of the parts of the system is described by the complexity profile of the system evaluated on the scale of the parts. When the behavior of the system depends on the behavior of the parts, the complexity of the whole must involve a description of the parts, and thus it is large. The smaller the parts that must be described to describe the behavior of the whole, the larger the complexity of the entire system.
For the Instructor
This text is designed for use in an introductory graduate-level course, to present various concepts and methodologies of the study of complex systems and to begin to develop a common language for researchers in this new field. It has been used for a one-semester course, but the amount of material is large, and it is better to spread the material over two semesters. A two-semester course also provides more opportunities for including various other approaches to the study of complex systems, which are as valuable as the ones that are covered here and may be more familiar to the instructor.
Consistent with the objective and purpose of the field, students attending such a course tend to have a wide variety of backgrounds and interests. While this is a positive development, it causes difficulties for the syllabus and framework of the course.
One approach to a course syllabus is to include the introductory material given in Chapter 1 as an integral part of the course. It is better to interleave the later chapters with the relevant materials from Chapter 1. Such a course might proceed: 1.1–1.6; 2; 3; 4; 1.7; 5; 6; 7; 1.8–1.10; 8; 9. Including the materials of Chapter 1 allows the discussion of important mathematical methods, and addresses the diverse backgrounds of the students. Even if the introductory chapter is covered quickly (e.g., in a one-semester course), this establishes a common base of knowledge for the remainder of the course. If a high-speed approach is taken, it must be emphasized to the students that this material serves only to expose them to concepts that they are unfamiliar with, and to review concepts for those with prior knowledge of the topics covered. Unfortunately, many students are not willing to sit through such an extensive (and intense) introduction.
A second approach begins from Chapter 2 and introduces the material from Chapter 1 only as needed. The chapters that are the most technically difficult, and rely the most on Chapter 1, are Chapters 4 and 5. Thus, for a one-semester course, the subject of protein folding (Chapters 4 and 5) could be skipped. Then much of the introductory material can be omitted, with the exception of a discussion of the last part of Section 1.3, and some introduction to the subject of entropy and information either through thermodynamics (Section 1.3) or information theory (Section 1.8), preferably both. Then Chapters 2 and 3 can be covered first, followed by Chapters 6–9, with selected material introduced from Chapter 1 as is appropriate for the background of the students.
There are two additional recommendations. First, it is better to run this course as a project-based course rather than using graded homework. The varied backgrounds of students make it difficult to select and fairly grade the problems. Projects for individuals or small groups of students can be tailored to their knowledge and interests. There are many new areas of inquiry, so that projects may approach research-level contributions and be exciting for the students. Unfortunately, this means that students may not devote sufficient effort to the study of course material, and may rely largely upon exposure in lectures. There is no optimal solution to this problem. Second, if it is possible, a seminar series with lecturers who work in the field should be an integral part of the course. This provides additional exposure to the varied approaches to the study of complex systems that it is not possible for a single lecturer or text to provide.
1
Introduction and Preliminaries
Conceptual Outline
1.1  A deceptively simple model of the dynamics of a system is a deterministic iterative map applied to a single real variable. We characterize the dynamics by looking at its limiting behavior and the approach to this limiting behavior. Fixed points that attract or repel the dynamics, and cycles, are conventional limiting behaviors of a simple dynamic system. However, changing a parameter in a quadratic iterative map causes it to undergo a sequence of cycle doublings (bifurcations) until it reaches a regime of chaotic behavior which cannot be characterized in this way. This deterministic chaos reveals the potential importance of the influence of fine-scale details on large-scale behavior in the dynamics of systems.
1.2  A system that is subject to complex (external) influences has a dynamics that may be modeled statistically. The statistical treatment simplifies the complex unpredictable stochastic dynamics of a single system, to the simple predictable dynamics of an ensemble of systems subject to all possible influences. A random walk on a line is the prototype stochastic process. Over time, the random influence causes the ensemble of walkers to spread in space and form a Gaussian distribution. When there is a bias in the random walk, the walkers have a constant velocity superimposed on the spreading of the distribution.
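The drift and spreading described above are easy to check numerically. The sketch below (function and parameter names are ours) evolves an ensemble of walkers taking ±1 steps, with the bias setting the mean step; the ensemble mean grows linearly in time while the width grows only as the square root of the number of steps.

```python
import random
import statistics

def walk_ensemble(n_walkers, n_steps, bias=0.0, seed=0):
    """Final positions of n_walkers after n_steps of +/-1 steps.
    bias is the mean step, so P(step = +1) = (1 + bias) / 2."""
    rng = random.Random(seed)
    p_right = (1 + bias) / 2
    positions = []
    for _ in range(n_walkers):
        x = 0
        for _ in range(n_steps):
            x += 1 if rng.random() < p_right else -1
        positions.append(x)
    return positions

pos = walk_ensemble(2000, 100, bias=0.2)
print(round(statistics.mean(pos)))   # near bias * n_steps = 20
print(round(statistics.stdev(pos)))  # near sqrt(n_steps * (1 - bias**2)), about 10
```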
1.3  While the microscopic dynamics of physical systems is rapid and complex, the macroscopic behavior of many materials is simple, even static. Before we can understand how complex systems have complex behaviors, we must understand why materials can be simple. The origin of simplicity is an averaging over the fast microscopic dynamics on the time scale of macroscopic observations (the ergodic theorem) and an averaging over microscopic spatial variations. The averaging can be performed theoretically using an ensemble representation of the physical system that assumes all microscopic states are realized. Using this as an assumption, a statistical treatment of microscopic states describes the macroscopic equilibrium behavior of systems. The final part of Section 1.3 introduces concepts that play a central role in the rest of the book. It discusses the differences between equilibrium and complex systems. Equilibrium systems are divisible and satisfy the ergodic theorem. Complex systems
are composed out of interdependent parts and violate the ergodic theorem. They have many degrees of freedom whose time dependence is very slow on a microscopic scale.
1.4  To understand the separation of time scales between fast and slow degrees of freedom, a two-well system is a useful model. The description of a particle traveling in two wells can be simplified to the dynamics of a two-state (binary variable) system. The fast dynamics of the motion within a well is averaged by assuming that the system visits all states, represented as an ensemble. After taking the average, the dynamics of hopping between the wells is represented explicitly by the dynamics of a binary variable. The hopping rate depends exponentially on the ratio of the energy barrier and the temperature. When the temperature is low enough, the hopping is frozen. Even though the two wells are not in equilibrium with each other, equilibrium continues to hold within a well. The cooling of a two-state system serves as a simple model of a glass transition, where many microscopic degrees of freedom become frozen at the glass transition temperature.
1.5  Cellular automata are a general approach to modeling the dynamics of spatially distributed systems. Expanding the notion of an iterative map of a single variable, the variables that are updated are distributed on a lattice in space. The influence between variables is assumed to rely upon local interactions, and is homogeneous. Space and time are both discretized, and the variables are often simplified to include only a few possible states at each site. Various cellular automata can be designed to model key properties of physical and biological systems.
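As a concrete sketch of the ingredients just listed (discrete space and time, few states per site, a homogeneous local rule), the following implements a one-dimensional two-state automaton. The particular rule, each cell becoming the XOR of its two neighbors (elementary rule 90), is our choice of illustration, not one singled out by the text.

```python
def step(cells):
    """One synchronous update of a 1-D binary cellular automaton with
    periodic boundaries: each cell becomes the XOR of its two neighbors
    (elementary rule 90)."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]

cells = [0] * 11
cells[5] = 1                       # a single seed in the middle
for _ in range(4):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```

Starting from a single seed, the pattern spreads outward one site per step, the familiar Sierpinski-like growth of this rule.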
1.6  The equilibrium state of spatially distributed systems can be modeled by fields that are treated using statistical ensembles. The simplest is the Ising model, which captures the simple cooperative behavior found in magnets and many other systems. Cooperative behavior is a mechanism by which microscopic fast degrees of freedom can become slow collective degrees of freedom that violate the ergodic theorem and are visible macroscopically. Macroscopic phase transitions are the dynamics of the cooperative degrees of freedom. Cooperative behavior of many interacting elements is an important aspect of the behavior of complex systems. This should be contrasted to the two-state model (Section 1.4), where the slow dynamics occurs microscopically.
1.7  Computer simulations of models such as molecular dynamics or cellular automata provide important tools for the study of complex systems. Monte Carlo simulations enable the study of ensemble averages without necessarily describing the dynamics of a system. However, they can also be used to study random-walk dynamics. Minimization methods that use iterative progress to find a local minimum are often an important aspect of computer simulations. Simulated annealing is a method that can help find low energy states on complex energy surfaces.
1.8  We have treated systems using models without acknowledging explicitly that our objective is to describe them. All our efforts are designed to map a system onto a description of the system. For complex systems the description must be quite long, and the study of descriptions becomes essential. With this recognition, we turn
to information theory. The information contained in a communication, typically a string of characters, may be defined quantitatively as the logarithm of the number of possible messages. When different messages have distinct probabilities P in an ensemble, then the information can be identified as −ln(P), and the average information is defined accordingly. Long messages can be modeled using the same concepts as a random walk, and we can use such models to estimate the information contained in human languages such as English.
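A short numeric sketch of these definitions (the example probabilities are ours, not the book's): the information of a message of probability P is −ln P, and the average information of an ensemble is the expectation of that quantity, which is largest when the messages are equally likely.

```python
import math

def information(p):
    """Information (in nats) of a single message of probability p."""
    return -math.log(p)

def average_information(probs):
    """Ensemble-average information: sum over messages of P * (-ln P)."""
    return sum(p * information(p) for p in probs if p > 0)

# Four equally likely messages: each carries ln(4) = 1.386... nats.
print(round(information(0.25), 3))

# A biased ensemble carries less average information than a uniform one.
uniform = [0.25] * 4
biased = [0.7, 0.1, 0.1, 0.1]
print(average_information(biased) < average_information(uniform))
```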
1.9  In order to understand the relationship of information to systems, we must also understand what we can infer from information that is provided. The theory of logic is concerned with inference. It is directly linked to computation theory, which is concerned with the possible (deterministic) operations that can be performed on a string of characters. All operations on character strings can be constructed out of elementary logical (Boolean) operations on binary variables. Using Turing's model of computation, it is further shown that all computations can be performed by a universal Turing machine, as long as its input character string is suitably constructed. Computation theory is also related to our concern with the dynamics of physical systems because it explores the set of possible outcomes of discrete deterministic dynamic systems.
1.10  We return to issues of structure on microscopic and macroscopic scales by studying fractals, which are self-similar geometric objects that embody the concept of progressively increasing structure on finer and finer length scales. A general approach to the scale dependence of system properties is described by scaling theory. The renormalization group methodology enables the study of scaling properties by relating a model of a system on one scale with a model of the system on another scale. Its use is illustrated by application to the Ising model (Section 1.6), and to the bifurcation route to chaos (Section 1.1). Renormalization helps us understand the basic concept of modeling systems, and formalizes the distinction between relevant and irrelevant microscopic parameters. Relevant parameters are the microscopic parameters that can affect the macroscopic behavior. The concept of universality is the notion that a whole class of microscopic models will give rise to the same macroscopic behavior, because many parameters are irrelevant. A conceptually related computational technique, the multigrid method, is based upon representing a problem on multiple scales.
The study of complex systems begins from a set of models that capture aspects of the dynamics of simple or complex systems. These models should be sufficiently general to encompass a wide range of possibilities but have sufficient structure to capture interesting features. An exciting bonus is that even the apparently simple models discussed in this chapter introduce features that are not typically treated in the conventional science of simple systems, but are appropriate introductions to the dynamics of complex systems. Our treatment of dynamics will often consider discrete rather than continuous time. Analytic treatments are often convenient to formulate in continuous variables and differential equations; however, computer simulations are often best formulated in discrete space-time variables with well-defined intervals. Moreover, the assumption of a smooth continuum at small scales is not usually a convenient starting point for the study of complex systems. We are also generally interested not only in one example of a system but rather in a class of systems that differ from each other but share a characteristic structure. The elements of such a class of systems are collectively known as an ensemble. As we introduce and study mathematical models, we should recognize that our primary objective is to represent properties of real systems. We must therefore develop an understanding of the nature of models and modeling, and how they can pertain to either simple or complex systems.
1.1 Iterative Maps (and Chaos)
An iterative map f is a function that evolves the state of a system s in discrete time:
s(t) = f(s(t − Δt))   (1.1.1)
where s(t) describes the state of the system at time t. For convenience we will generally measure time in units of Δt, which then has the value 1, and time takes integral values starting from the initial condition at t = 0.
Many of the complex systems we will consider in this text are of the form of Eq. (1.1.1), if we allow s to be a general variable of arbitrary dimension. The generality of iterative maps is discussed at the end of this section. We start by considering several examples of iterative maps where s is a single variable. We discuss briefly the binary variable case, s = ±1. Then we discuss in greater detail two types of maps with s a real variable, s ∈ ℜ: linear maps and quadratic maps. The quadratic iterative map is a simple model that can display complex dynamics. We assume that an iterative map may be started at any initial condition allowed by a specified domain of its system variable.
1.1.1 Binary iterative maps
There are only a few binary iterative maps. Question 1.1.1 is a complete enumeration of them.*
Question 1.1.1 Enumerate all possible iterative maps where the system is described by a single binary variable, s = ±1.
Solution 1.1.1 There are only four possibilities:
s(t) = 1
s(t) = −1
s(t) = s(t − 1)
s(t) = −s(t − 1)   (1.1.2)
*Questions are an integral part of the text. They are designed to promote independent thought. The reader is encouraged to read the question, contemplate or work out an answer, and then read the solution provided in the text. The continuation of the text assumes that solutions to questions have been read.
It is instructive to consider these possibilities in some detail. The main reason there are so few possibilities is that the form of the iterative map we are using depends, at most, on the value of the system at the previous time. The first two examples are constants and don't even depend on the value of the system at the previous time. The third map can only be distinguished from the first two by observation of its behavior when presented with two different initial conditions.
The last of the four maps is the only map that has any sustained dynamics. It cycles between two values in perpetuity. We can think about this as representing an oscillator.
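The four maps of Eq. (1.1.2) can be tabulated directly. This small sketch (helper names are ours) iterates each map from both initial conditions, which also makes concrete why only observation from two different initial conditions distinguishes the third map from the two constants.

```python
# The four possible iterative maps on a binary variable s = +/-1 (Eq. 1.1.2).
maps = {
    "s(t) = 1":       lambda s: 1,
    "s(t) = -1":      lambda s: -1,
    "s(t) = s(t-1)":  lambda s: s,
    "s(t) = -s(t-1)": lambda s: -s,
}

def trajectory(f, s0, n=4):
    """First n values of the iterative map f starting from s0."""
    out = [s0]
    for _ in range(n - 1):
        out.append(f(out[-1]))
    return out

for name, f in maps.items():
    print(name, trajectory(f, +1), trajectory(f, -1))
```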
Question 1.1.2
a. In what way can the map s(t) = −s(t − 1) represent a physical oscillator?
b. How can we think of the static map, s(t) = s(t − 1), as an oscillator?
c. Can we do the same for the constant maps s(t) = 1 and s(t) = −1?
Solution 1.1.2 (a) By looking at the oscillator displacement with a strobe at half-cycle intervals, our measured values can be represented by this map. (b) By looking at an oscillator with a strobe at cycle intervals. (c) You might think we could, by picking a definite starting phase of the strobe with respect to the oscillator. However, the constant map ignores the first value; the oscillator does not.
1.1.2 Linear iterative maps: free motion, oscillation, decay and growth
The simplest example of an iterative map with s real, s ∈ ℜ, is a constant map:
s(t) = s_0   (1.1.3)
No matter what the initial value, this system always takes the particular value s_0. The constant map may seem trivial; however, it will be useful to compare the constant map with the next class of maps.
A linear iterative map with unit coefficient is a model of free motion or propagation in space:
s(t) = s(t − 1) + v   (1.1.4)
At successive times the values of s are separated by v, which plays the role of the velocity.
Question 1.1.3 Consider the case of zero velocity:
s(t) = s(t − 1)   (1.1.5)
How is this different from the constant map?
Solution 1.1.3 The two maps differ in their dependence on the initial value.
Runaway growth or decay is a multiplicative iterative map:
s(t) = g s(t − 1)   (1.1.6)
We can generate the values of this iterative map at all times by using the equivalent expression
s(t) = g^t s_0 = e^(t ln(g)) s_0   (1.1.7)
which is exponential growth or decay. The iterative map can be thought of as a sequence of snapshots of Eq. (1.1.7) at integral time. g = 1 reduces this map to the previous case.
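Equation (1.1.7) says that iterating Eq. (1.1.6) is the same as sampling an exponential at integer times. The check below (variable names are ours) compares the two for a sample g and s_0; the closed form as written applies for g > 0, the case Question 1.1.5 extends.

```python
import math

def iterate_multiplicative(g, s0, t):
    """Apply s(t) = g * s(t-1) for t steps (Eq. 1.1.6)."""
    s = s0
    for _ in range(t):
        s = g * s
    return s

def closed_form(g, s0, t):
    """Equivalent closed form s(t) = g**t * s0 = exp(t * ln g) * s0
    (Eq. 1.1.7); valid as written for g > 0."""
    return math.exp(t * math.log(g)) * s0

g, s0 = 0.8, 2.0
for t in range(5):
    assert abs(iterate_multiplicative(g, s0, t) - closed_form(g, s0, t)) < 1e-12
print("iteration matches the closed form for g =", g)
```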
Question 1.1.4 We have seen the case of free motion, and now jumped to the case of growth. What happened to accelerated motion? Usually we would consider accelerated motion as the next step after motion with a constant velocity. How can we write accelerated motion as an iterative map?
Solution 1.1.4 The description of accelerated motion requires two variables: position and velocity. The iterative map would look like:
x(t) = x(t − 1) + v(t − 1)
v(t) = v(t − 1) + a   (1.1.8)
This is a two-variable iterative map. To write this in the notation of Eq. (1.1.1) we would define s as a vector s(t) = (x(t), v(t)).
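The two-variable map of Eq. (1.1.8) can be iterated directly. In the sketch below (names are ours) we check it against its discrete closed form: with these update equations the position after t steps is x(t) = x(0) + v(0) t + a t(t − 1)/2, since the velocity used during step t is v(t − 1).

```python
def accelerate(x0, v0, a, t):
    """Iterate the two-variable map x(t) = x(t-1) + v(t-1),
    v(t) = v(t-1) + a (Eq. 1.1.8) for t steps."""
    x, v = x0, v0
    for _ in range(t):
        x, v = x + v, v + a
    return x, v

x0, v0, a = 0.0, 1.0, 2.0
for t in range(6):
    x, v = accelerate(x0, v0, a, t)
    assert v == v0 + a * t
    assert x == x0 + v0 * t + a * t * (t - 1) / 2
print("discrete accelerated motion matches its closed form")
```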
Question 1.1.5 What happens in the rightmost exponential expression in Eq. (1.1.7) when g is negative?
Solution 1.1.5 The logarithm of a negative number results in a phase iπ: ln(g) = ln(|g|) + iπ. The term iπt in the exponent alternates the sign of s(t) every time step, as one would expect from Eq. (1.1.6).
At this point, it is convenient to introduce two graphical methods for describing an iterative map. The first is the usual way of plotting the value of s as a function of time. This is shown in the left panels of Fig. 1.1.1. The second type of plot, shown in the right panels, has a different purpose. This is a plot of the iterative relation s(t) as a function of s(t − 1). On the same axis we also draw the line for the identity map s(t) = s(t − 1). These two plots enable us to graphically obtain the successive values of s as follows. Pick a starting value of s, which we can call s(0). Mark this value on the abscissa. Mark the point on the graph of s(t) that corresponds to the point whose abscissa is s(0), i.e., the point (s(0), s(1)). Draw a horizontal line to intersect the identity map. The intersection point is (s(1), s(1)). Draw a vertical line back to the iterative map. This is the point (s(1), s(2)). Successive values of s(t) are obtained by iterating this graphical procedure. A few examples are plotted in the right panels of Fig. 1.1.1.
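The graphical procedure just described visits the points (s(0), s(1)), (s(1), s(1)), (s(1), s(2)), and so on. The sketch below (function name is ours) generates that list of vertices numerically for any map f; plotting them in order reproduces the cobweb construction of the right-hand panels.

```python
def cobweb_points(f, s0, n):
    """Vertices of the cobweb construction for an iterative map f:
    alternates points on the map, (s(t-1), s(t)), with points on the
    identity line, (s(t), s(t))."""
    pts = []
    s = s0
    for _ in range(n):
        s_next = f(s)
        pts.append((s, s_next))       # point on the graph of the map
        pts.append((s_next, s_next))  # point on the identity line
        s = s_next
    return pts

# Example: the decaying linear map s(t) = 0.5 * s(t-1).
for p in cobweb_points(lambda s: 0.5 * s, 1.0, 3):
    print(p)
```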
In order to discuss the iterative maps it is helpful to recognize several features of these maps. First, intersection points of the identity map and the iterative map are the fixed points of the iterative map:
s_0 = f(s_0)   (1.1.9)
[Figure 1.1.1 (plots not reproduced): The left panels show the time-dependent value of the system variable s(t) resulting from iterative maps. The first panel (a) shows the result of iterating the constant map; (b) shows the result of adding v to the previous value during each time interval; (c)-(f) show the result of multiplying by a constant g, where each figure shows the behavior for a different range of g values: (c) g > 1, (d) 0 < g < 1, (e) −1 < g < 0, and (f) g < −1. The right panels are a different way of showing graphically the results of iterations and are constructed as follows. First plot the function f(s) (solid line), where s(t) = f(s(t − 1)). This can be thought of as plotting s(t) vs. s(t − 1). Second, plot the identity map s(t) = s(t − 1) (dashed line). Mark the initial value s(0) on the horizontal axis, and the point on the graph of s(t) that corresponds to the point whose abscissa is s(0), i.e. the point (s(0), s(1)). These are shown as squares. From the point (s(0), s(1)) draw a horizontal line to intersect the identity map. The intersection point is (s(1), s(1)). Draw a vertical line back to the iterative map. This is the point (s(1), s(2)). Successive values of s(t) are obtained by iterating this graphical procedure.]

Fixed points, not surprisingly, play an important role in iterative maps. They help us describe the state and behavior of the system after many iterations. There are two kinds of fixed points: stable and unstable. Stable fixed points are characterized by "attracting" the result of iteration of points that are nearby. More precisely, there exists
a neighborhood of points of s_0 such that for any s in this neighborhood the sequence of points
{s, f(s), f^2(s), f^3(s), …}   (1.1.10)
converges to s_0. We are using the notation f^2(s) = f(f(s)) for the second iteration, and similar notation for higher iterations. This sequence is just the time series of the iterative map for the initial condition s. Unstable fixed points have the opposite behavior, in that iteration causes the system to leave the neighborhood of s_0. The two types of fixed points are also called attracting and repelling fixed points.
The family of multiplicative iterative maps in Eq. (1.1.6) all have a fixed point at s_0 = 0. Graphically from the figures, or analytically from Eq. (1.1.7), we see that the fixed point is stable for |g| < 1 and is unstable for |g| > 1. There is also distinct behavior of the system depending on whether g is positive or negative. For g < 0 the iterations alternate from one side to the other of the fixed point, whether it is attracted to or repelled from the fixed point. Specifically, if s < s_0 then f(s) > s_0 and vice versa, or sign(s − s_0) = −sign(f(s) − s_0). For g > 0 the iteration does not alternate.
Question 1.1.6 Consider the iterative map
s(t) = g s(t − 1) + v   (1.1.11)
Convince yourself that v does not affect the nature of the fixed point, only shifts its position.
Question 1.1.7 Consider an arbitrary iterative map of the form Eq. (1.1.1), with a fixed point s_0 (Eq. (1.1.9)). If the iterative map can be expanded in a Taylor series around s_0, show that the first derivative
g = df(s)/ds |_{s=s_0}   (1.1.12)
characterizes the fixed point as follows:
For |g| < 1, s_0 is an attracting fixed point.
For |g| > 1, s_0 is a repelling fixed point.
For g < 0, iterations alternate sides in a sufficiently small neighborhood of s_0.
For g > 0, iterations remain on one side in a sufficiently small neighborhood of s_0.
Extra credit: Prove the same theorem for a differentiable function (no Taylor expansion needed) using the mean value theorem.
Solution 1.1.7 If the iterative map can be expanded in a Taylor series, we write that
f(s) = f(s_0) + g (s − s_0) + h (s − s_0)^2 + …   (1.1.13)
where g is the first derivative at s_0, and h is one-half of the second derivative at s_0. Since s_0 is a fixed point, f(s_0) = s_0, we can rewrite this as:
(f(s) − s_0) / (s − s_0) = g + h (s − s_0) + …   (1.1.14)
If we did not have any higher-order terms beyond g, then by inspection each of the four conditions that we have to prove would follow from this expression without restrictions on s. For example, if |g| > 1, then taking the magnitude of both sides shows that |f(s) − s_0| is larger than |s − s_0| and the iterations take the point s away from s_0. If g > 0, then this expression says that f(s) stays on the same side of s_0. The other conditions follow similarly.
To generalize this argument to include the higher-order terms of the expansion, we must guarantee that whichever domain g is in (g > 1, 0 < g < 1, −1 < g < 0, or g < −1), the same is also true of the whole right side. For a Taylor expansion, by choosing a small enough neighborhood |s − s_0| < ε, we can guarantee that the higher-order terms are less than any number δ we choose. We choose δ to be half of the minimum of |g − 1|, |g − 0| and |g + 1|. Then g + δ is in the same domain as g. This provides the desired guarantee and the proof is complete.
We have proven that in the vicinity of a fixed point the iterative map may be completely characterized by its first-order expansion (with the exception of the special points g = ±1, 0).
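The criterion of Question 1.1.7 is easy to apply numerically. The sketch below (function name and the finite-difference estimate of the derivative are our choices) reports the character of a fixed point from g = f'(s_0).

```python
def classify_fixed_point(f, s0, eps=1e-6):
    """Classify a fixed point s0 of f by the derivative g = f'(s0),
    estimated with a central finite difference (Question 1.1.7)."""
    g = (f(s0 + eps) - f(s0 - eps)) / (2 * eps)
    if abs(g) < 1:
        stability = "attracting"
    elif abs(g) > 1:
        stability = "repelling"
    else:
        stability = "marginal"      # the special cases g = +/-1
    side = "alternating" if g < 0 else "one-sided"
    return g, stability, side

# s(t) = g * s(t-1) has a fixed point at s0 = 0.
print(classify_fixed_point(lambda s: 0.5 * s, 0.0)[1:])
print(classify_fixed_point(lambda s: -2.0 * s, 0.0)[1:])
```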
Thus far we have not considered the special cases g = ±1, 0. The special cases g = 0 and g = 1 have already been treated as simpler iterative maps. When g = 0, the fixed point at s = 0 is so attractive that it is the result of any iteration. When g = 1 all points are fixed points.
The new special case g = −1 has a different significance. In this case all points alternate between positive and negative values, repeating every other iteration. Such repetition is a generalization of the fixed point. Whereas in the fixed-point case we repeat every iteration, here we repeat after every two iterations. This is called a 2-cycle, and we can immediately consider the more general case of an n-cycle. In this terminology a fixed point is a 1-cycle. One way to describe an n-cycle is to say that iterating n times gives back the same result, or equivalently, that a new iterative map which is the n-th fold composition of the original map, h = f^n, has a fixed point. This description would also include fixed points of f and all points that are m-cycles, where m is a divisor of n. These are excluded from the definition of the n-cycles. While we have introduced cycles using a map where all points are 2-cycles, more general iterative maps have specific sets of points that are n-cycles. The set of points of an n-cycle is called an orbit. There are a variety of properties of fixed points and cycles that can be proven for an arbitrary map. One of these is discussed in Question 1.1.8.
Question 1.1.8 Prove that there is a fixed point between any two points of a 2-cycle if the iterating function f is continuous.
Solution 1.1.8 Let the 2-cycle be written as
s_2 = f(s_1)
s_1 = f(s_2)   (1.1.15)
Consider the function h(s) = f(s) − s. h(s_1) and h(s_2) have opposite signs, and therefore, by the intermediate value theorem, there must be an s_0 between s_1 and s_2 such that h(s_0) = 0: the fixed point.
We can also generalize the definition of attr acting and repelling fixed p oints t o
consider att r acting and repelling ncycles. Attr action and repulsion for the cycle is
equivalent to the att ract ion and repulsion of the fixed point of f
n
.
1.1.3 Quadratic iterative maps: cycles and chaos

The next iterative map we will consider describes the effect of nonlinearity (self-action):

    s(t) = a s(t − 1)(1 − s(t − 1))    (1.1.16)

or equivalently

    f(s) = a s(1 − s)    (1.1.17)

This map has played a significant role in the development of the theory of dynamical systems because, even though it looks quite innocent, it has a dynamical behavior that is not described in the conventional science of simple systems. Instead, Eq. (1.1.16) is the basis of significant work on chaotic behavior, and the transition of behavior from simple to chaotic. We have chosen this form of quadratic map because it simplifies the discussion somewhat. Question 1.1.11 describes the relationship between this family of quadratic maps, parameterized by a, and what might otherwise appear to be a different family of quadratic maps.

We will focus on values of a in the range 0 < a < 4. For this range, any value of s in the interval s ∈ [0,1] stays within this interval. The minimum value f(s) = 0 occurs for s = 0, 1, and the maximal value occurs for s = 1/2. For all values of a there is a fixed point at s = 0, and there can be at most two fixed points, since a quadratic can only intersect a line (Eq. (1.1.9)) in two points.
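Iterating Eq. (1.1.16) numerically is a short loop. The following is a minimal sketch (the function and parameter names are our own, not from the text) showing a trajectory settling onto a fixed point:

```python
# Iterate the quadratic map s(t) = a*s(t-1)*(1 - s(t-1)) and watch where it settles.
# A minimal sketch; function and variable names are our own choices.

def iterate_map(a, s0, steps):
    """Return the trajectory s(0), s(1), ..., s(steps) of f(s) = a*s*(1-s)."""
    s = s0
    traj = [s]
    for _ in range(steps):
        s = a * s * (1 - s)
        traj.append(s)
    return traj

# For 1 < a < 3 the trajectory converges to the fixed point s_1 = (a - 1)/a.
a = 1.5
traj = iterate_map(a, 0.3, 100)
print(traj[-1], (a - 1) / a)
```

For a = 0.5 the same loop converges to s = 0, as discussed below.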
Taking the first derivative of the iterative map gives

    df/ds = a(1 − 2s)    (1.1.18)

At s = 0 the derivative is a, which, by Question 1.1.7, shows that s = 0 is a stable fixed point for a < 1 and an unstable fixed point for a > 1. The switching of the stability of the fixed point at s = 0 coincides with the introduction of a second fixed point in the interval [0,1] (when the slope at s = 0 is greater than one, f(s) > s for small s, and since f(1) = 0, we have that f(s_1) = s_1 for some s_1 in [0,1] by the same construction as in Question 1.1.8). We find s_1 by solving the equation

    s_1 = a s_1(1 − s_1)    (1.1.19)
    s_1 = (a − 1)/a    (1.1.20)

Substituting this into Eq. (1.1.18) gives

    df/ds |_{s_1} = 2 − a    (1.1.21)

This shows that for 1 < a < 3 the new fixed point is stable by Question 1.1.7. Moreover, the derivative is positive for 1 < a < 2, so s_1 is stable and convergence is from one side. The derivative is negative for 2 < a < 3, so s_1 is stable and alternating.
Fig. 1.1.2(a)-(c) shows the three cases: a = 0.5, a = 1.5 and a = 2.8. For a = 0.5, starting from anywhere within [0,1] leads to convergence to s = 0. When s(0) > 0.5 the first iteration takes the system to s(1) < 0.5. The closer we start to s(0) = 1, the closer to s = 0 we get in the first jump. At s(0) = 1 the convergence to 0 occurs in the first jump. A similar behavior would be found for any value of 0 < a < 1. For a = 1.5 the behavior is more complicated. Except for the points s = 0, 1, the convergence is always to the fixed point s_1 = (a − 1)/a between 0 and 1. For a = 2.8 the iterations converge to the same point; however, the convergence is alternating. Because there can be at most two fixed points for the quadratic map, one might think that this behavior would be all that would happen for 1 < a < 4. One would be wrong. The first indication that this is not the case is the instability of the fixed point at s_1 starting from a = 3.
What happens for a > 3? Both of the fixed points that we have found, and the only ones that can exist for the quadratic map, are now unstable. We know that the iteration of the map has to go somewhere, and only within [0,1]. The only possibility, within our experience, is that there is an attracting n-cycle to which the fixed points are unstable. Let us then consider the map f^2(s), whose fixed points are 2-cycles of the original map. f^2(s) is shown in the right panels of Fig. 1.1.2 for increasing values of a. The fixed points of f(s) are also fixed points of f^2(s). However, we see that two additional fixed points exist for a > 3. We can also show analytically that two fixed points are introduced at exactly a = 3:

    f^2(s) = a^2 s(1 − s)(1 − a s(1 − s))    (1.1.22)

To find the fixed points we solve:

    s = a^2 s(1 − s)(1 − a s(1 − s))    (1.1.23)

We already know two solutions of this quartic equation: the fixed points of the map f. One of these, at s = 0, is obvious. Dividing by s we have a cubic equation:

    a^3 s^3 − 2a^3 s^2 + a^2(1 + a)s + (1 − a^2) = 0    (1.1.24)
We can reduce the equation to a quadratic by dividing by (s − s_1); we simplify the algebra by dividing instead by a(s − s_1) = (as − (a − 1)):

    a^3 s^3 − 2a^3 s^2 + a^2(1 + a)s + (1 − a^2) = (as − (a − 1))(a^2 s^2 − a(a + 1)s + (a + 1))    (1.1.25)

Now we can obtain the roots of the quadratic:

    a^2 s^2 − a(a + 1)s + (a + 1) = 0    (1.1.26)

    s_2 = [(a + 1) ± √((a + 1)(a − 3))] / (2a)    (1.1.27)

This has two solutions (as it must for a 2-cycle) for a < −1 or for a > 3. The former case is not of interest to us, since we have assumed 0 < a < 4. The latter case gives the two roots that were promised. Notice that for exactly a = 3 the two roots that form the new 2-cycle are the same as the fixed point we have already found, s_1. The 2-cycle splits off from the fixed point at a = 3 when the fixed point becomes unstable. The two attracting points continue to separate as a increases. For a > 3 we expect that the result of iteration eventually settles down to the 2-cycle. The system state alternates between the two roots of Eq. (1.1.27). This is shown in Fig. 1.1.2(d).
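A quick numerical check of this, assuming the quadratic map and the roots of Eq. (1.1.27); the variable names are our own:

```python
# For a > 3 the iterates of f(s) = a*s*(1-s) settle onto the 2-cycle of
# Eq. (1.1.27). A sketch checking this numerically for a = 3.2.
import math

a = 3.2
root_plus = ((a + 1) + math.sqrt((a + 1) * (a - 3))) / (2 * a)
root_minus = ((a + 1) - math.sqrt((a + 1) * (a - 3))) / (2 * a)

s = 0.3
for _ in range(1000):  # discard the transient
    s = a * s * (1 - s)

late = []
for _ in range(4):
    late.append(s)
    s = a * s * (1 - s)

# Successive iterates alternate between the two roots of Eq. (1.1.27).
print(late)
print(root_minus, root_plus)
```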
As we continue to increase a beyond 3, the 2-cycle will itself become unstable at a point that can be calculated by setting

    df^2/ds |_{s_2} = −1    (1.1.28)
Figure 1.1.2 (pp. 28-30) Plots of the result of iterating the quadratic map f(s) = as(1 − s) for different values of a. The left and center panels are similar to the left and right panels of Fig. 1.1.1. The left panels plot s(t). The center panels describe the iteration of the map f(s) on axes corresponding to s(t) and s(t − 1). The right panels are similar to the center panels but are for the function f^2(s). The different values of a are indicated on the panels and show the changes from (a) convergence to s = 0 for a = 0.5, (b) convergence to s = (a − 1)/a for a = 1.5, (c) alternating convergence to s = (a − 1)/a for a = 2.8, (d) bifurcation (convergence to a 2-cycle) for a = 3.2, (e) second bifurcation (convergence to a 4-cycle) for a = 3.5, (f) chaotic behavior for a = 3.8.
to be a = 1 + √6 = 3.44949…. At this value of a the 2-cycle splits into a 4-cycle (Fig. 1.1.2(e)). Each of the fixed points of f^2(s) simultaneously splits into a 2-cycle; together they form a 4-cycle of the original map.
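The instability point of the 2-cycle can be checked numerically. By the chain rule (Eq. (1.1.29) below), the derivative of f^2 on the cycle is the product of the slopes of f at the two cycle points; a sketch, assuming the roots of Eq. (1.1.27):

```python
# The 2-cycle loses stability where df^2/ds = -1 at the cycle points.
# By the chain rule this derivative is f'(s+) * f'(s-); a sketch evaluating
# it at a = 1 + sqrt(6), where it should equal -1.
import math

def cycle_derivative(a):
    """df^2/ds evaluated on the 2-cycle of f(s) = a*s*(1-s), via Eq. (1.1.27)."""
    disc = math.sqrt((a + 1) * (a - 3))
    s_plus = ((a + 1) + disc) / (2 * a)
    s_minus = ((a + 1) - disc) / (2 * a)
    fprime = lambda s: a * (1 - 2 * s)   # Eq. (1.1.18)
    return fprime(s_plus) * fprime(s_minus)

a_star = 1 + math.sqrt(6)          # = 3.44949...
print(cycle_derivative(a_star))    # -1 up to rounding
print(cycle_derivative(3.2))       # magnitude < 1: the 2-cycle is stable there
```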
Question 1.1.9 Show that when f has a 2-cycle, both of the fixed points of f^2 must split simultaneously.

Solution 1.1.9 The split occurs when the fixed points become unstable, i.e., when the derivative of f^2 equals −1. We can show that the derivative is equal at the two fixed points of Eq. (1.1.27), which we call s_2^+ and s_2^−:

    df^2/ds |_{s_2} = df(f(s))/ds |_{s_2} = (df/ds |_{f(s_2)}) (df/ds |_{s_2})    (1.1.29)

where we have made use of the chain rule. Since f(s_2^+) = s_2^− and vice versa, we have shown this expression is the same whether s_2 = s_2^+ or s_2 = s_2^−.

Note: This can be generalized to show that the derivative of f^k is the same at all of its k fixed points corresponding to a k-cycle of f.
The process of taking an n-cycle into a 2n-cycle is called bifurcation. Bifurcation continues to replace the limiting behavior of the iterative map with progressively longer cycles of length 2^k. The bifurcations can be simulated. They occur at smaller and smaller intervals of a, and there is a limit point to the bifurcations at a_c = 3.56994567. Fig. 1.1.3 shows the values that are reached by the iterative map at long times, the stable cycles, as a function of a < a_c. We will discuss an algebraic treatment of the bifurcation regime in Section 1.10.
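The cascade can be seen in a few lines of simulation. The following sketch is the kind of computation behind Fig. 1.1.3; the transient length, sample count, and rounding are our own choices:

```python
# For each a, iterate past the transient and record the set of values still
# visited; the size of that set reveals the 2^k cycles. A sketch only.

def long_time_values(a, s0=0.5, transient=2000, keep=64, digits=6):
    """Distinct values (rounded) visited by f(s) = a*s*(1-s) after a long transient."""
    s = s0
    for _ in range(transient):
        s = a * s * (1 - s)
    seen = set()
    for _ in range(keep):
        seen.add(round(s, digits))
        s = a * s * (1 - s)
    return sorted(seen)

print(len(long_time_values(2.8)))   # 1 value: the stable fixed point
print(len(long_time_values(3.2)))   # 2 values: a 2-cycle
print(len(long_time_values(3.5)))   # 4 values: a 4-cycle
```

Sweeping a over a fine grid and plotting these sets against a reproduces the bifurcation diagram of Fig. 1.1.3.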
Beyond the bifurcation regime, a > a_c (Fig. 1.1.2(f)), the behavior of the iterative map can no longer be described using simple cycles that attract the iterations. The behavior in this regime has been identified with chaos. Chaos has been characterized in many ways, but one property is quite generally agreed upon: the inherent lack of predictability of the system dynamics. This is often expressed more precisely by describing the sensitivity of the system's fate to the initial conditions. A possible definition is: There exists a distance d such that for any neighborhood V of any point s it is possible to find a point s′ within the neighborhood and a number of iterations k so that f^k(s′) is farther than d away from f^k(s). This means that arbitrarily close to any point is a point that will be displaced a significant distance away by iteration. Qualitatively, there are two missing aspects of this definition: first, that the points that move far away must not be too unlikely (otherwise the system is essentially predictable), and second, that d is not too small (in which case the divergence of the dynamics may not be significant).
If we look at the definition of chaotic behavior, we see that the concept of scale plays an important role. A small distance between s and s′ turns into a large distance between f^k(s) and f^k(s′). Thus a fine-scale difference eventually becomes a large-scale difference. This is the essence of chaos as a model of complex system behavior. To understand it more fully, we can think about the state variable s not as one real variable,
but as an infinite sequence of binary variables that form its binary representation s = 0.r_1 r_2 r_3 r_4 …. Each of these binary variables represents the state of the system (the value of some quantity we can measure about the system) on a particular length scale. The higher-order bits represent the larger scales and the lower-order ones represent the finer scales. Chaotic behavior implies that the state of the first few binary variables, r_1 r_2, at a particular time is determined by the value of fine-scale variables at an earlier time. The farther back in time we look, the finer-scale variables we have to consider in order to know the present values of r_1 r_2. Because many different variables are relevant to the behavior of the system, we say that the system has a complex behavior. We will return to these issues in Chapter 8.
The influence of fine length scales on coarse ones makes iterative maps difficult to simulate by computer. Computer representations of real numbers always have finite precision. This must be taken into account if simulations of iterative maps or chaotic complex systems are performed.
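Both the sensitivity to initial conditions and the role of fine scales are easy to exhibit numerically; a sketch in which the parameter choices (a = 3.8, a separation of 10^−12) are our own:

```python
# Two trajectories of f(s) = a*s*(1-s) started 1e-12 apart in the chaotic
# regime: the fine-scale difference grows until it is of order one, after
# which the trajectories are effectively unrelated. A sketch only.

a = 3.8
s, s2 = 0.4, 0.4 + 1e-12
gap = []
for _ in range(100):
    s = a * s * (1 - s)
    s2 = a * s2 * (1 - s2)
    gap.append(abs(s - s2))

print(gap[0])    # still tiny: short-term prediction works
print(max(gap))  # of order one: long-term predictability is lost
```

The same mechanism amplifies the round-off error of finite-precision arithmetic, which is why two simulations of the chaotic regime with different precisions eventually disagree completely.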
Figure 1.1.3 A plot of values of s visited by the quadratic map f(s) = as(1 − s) after many iterations as a function of a, including stable points, cycles and chaotic behavior. The different regimes are readily apparent. For a < 1 the stable point is s = 0. For 1 < a < 3 the stable point is at s_0 = (a − 1)/a. For 3 < a < a_c, with a_c = 3.56994567, there is a bifurcation cascade with 2-cycles then 4-cycles, etc. 2^k-cycles for all values of k appear in progressively narrower regions of a. Beyond 4-cycles they cannot be seen in this plot. For a > a_c there is chaotic behavior. There are regions of s values that are not visited and regions that are visited in the long-time behavior of the quadratic map in the chaotic regime which this figure does not fully illustrate.
Another significant point about the iterative map as a model of a complex system is that there is nothing outside of the system that is influencing it. All of the information we need to describe the behavior is contained in the precise value of s. The complex behavior arises from the way the different parts of the system, the fine and coarse scales, affect each other.
Question 1.1.10 Why isn't the iterative map in the chaotic regime equivalent to picking a number at random?

Solution 1.1.10 We can still predict the behavior of the iterative map over a few iterations. It is only when we iterate long enough that the map becomes unpredictable. More specifically, the continuity of the function f(s) guarantees that for s and s′ close together, f(s) and f(s′) will also be close together. Specifically, given an ε it is possible to find a δ such that for |s − s′| < δ, |f(s) − f(s′)| < ε. For the family of functions we have been considering, we only need to set δ < ε/a, since then we have:

    |f(s) − f(s′)| = a|s(1 − s) − s′(1 − s′)| = a|s − s′||1 − (s + s′)| < a|s − s′| < ε    (1.1.30)

Thus if we fix the number of iterations to be k, we can always find two points close enough so that |f^k(s′) − f^k(s)| < ε by setting |s − s′| < ε/a^k.
The tuning of the parameter a leading from simple convergent behavior through cycle bifurcation to chaos has been identified as a universal description of the appearance of chaotic behavior from the simple behavior of many systems. How do we take a complicated real system and map it onto a discrete-time iterative map? We must define a system variable and then take snapshots of it at fixed intervals (or at least well-defined intervals). The snapshots correspond to an iterative map. Often there is a natural choice for the interval that simplifies the iterative behavior. We can then check to see if there is bifurcation and chaos in the real system when parameters that control the system behavior are varied.

One of the earliest examples of the application of iterative maps is to the study of heart attacks. Heart attacks occur in many different ways. One kind of heart attack is known as fibrillation. Fibrillation is characterized by chaotic and ineffective heart muscle contractions. It has been suggested that bifurcation may be observed in heartbeats as a period doubling (two heartbeats that are inequivalent). If correct, this may serve as a warning that the heart structure, due to various changes in heart tissue parameters, may be approaching fibrillation. Another system where more detailed studies have suggested that bifurcation occurs as a route to chaotic behavior is that of turbulent flows in hydrodynamic systems. A subtlety in the application of the ideas of bifurcation and chaos to physical systems is that physical systems are better modeled as having an increasing number of degrees of freedom at finer scales. This is to be contrasted with a system modeled by a single real number, which has the same number of degrees of freedom (represented by the binary variables above) at each length scale.
1.1.4 Are all dynamical systems iterative maps?

How general is the iterative map as a tool for describing the dynamics of systems? There are three apparent limitations of iterative maps that we will consider modifying later. Eq. (1.1.1):

a. describes the homogeneous evolution of a system, since f itself does not depend on time,
b. describes a system where the state of the system at time t depends only on the state of the system at time t − Δt, and
c. describes a deterministic evolution of a system.

We can, however, bypass these limitations and keep the same form of the iterative map if we are willing to let s describe not just the present state of the system but also

a. the state of the system and all other factors that might affect its evolution in time,
b. the state of the system at the present time and sufficiently many previous times, and
c. the probability that the system is in a particular state.

Taking these caveats together, all of the systems we will consider are iterative maps, which therefore appear to be quite general. Generality, however, can be quite useless, since we want to discard as much information as possible when describing a system.

Another way to argue the generality of the iterative map is through the laws of classical or quantum dynamics. If we consider s to be a variable that describes the positions and velocities of all particles in a system, all closed systems described by classical mechanics can be described as deterministic iterative maps. Quantum evolution of a closed system may also be described by an iterative map if s describes the wave function of the system. However, our intent is not necessarily to describe microscopic dynamics, but rather the dynamics of variables that we consider to be relevant in describing a system. In this case we are not always guaranteed that a deterministic iterative map is sufficient. We will discuss relevant generalizations, first to stochastic maps, in Section 1.2.
Extra Credit Question 1.1.11 Show that the system of quadratic iterative maps

    s(t) = s(t − 1)^2 + k    (1.1.31)

is essentially equivalent in its dynamical properties to the iterative maps we have considered in Eq. (1.1.16).

Solution 1.1.11 Two iterative maps are equivalent in their properties if we can perform a time-independent one-to-one map of the time-dependent system states from one case to the other. We will attempt to transform the family of quadratic maps given in this problem to the one of Eq. (1.1.16) using a linear map valid at all times

    s(t) = m s′(t) + b    (1.1.32)
By direct substitution this leads to:

    m s′(t) + b = (m s′(t − 1) + b)^2 + k    (1.1.33)

We must now choose the values of m and b so as to obtain the form of Eq. (1.1.16).

    s′(t) = m s′(t − 1)(s′(t − 1) + 2b/m) + (b^2 + k − b)/m    (1.1.34)

For a correct placement of the minus signs in the parenthesis we need:

    s′(t) = (−m) s′(t − 1)(−2b/m − s′(t − 1)) + (b^2 + k − b)/m    (1.1.35)

or

    2b/m = −1    (1.1.36)
    b^2 − b + k = 0    (1.1.37)

giving

    b = (1 ± √(1 − 4k))/2    (1.1.38)
    a = −m = 2b = 1 ± √(1 − 4k)    (1.1.39)

We see that for k < 1/4 we have two solutions. These solutions give all possible (positive and negative) values of a.

What about k > 1/4? It turns out that this case is not very interesting compared to the rich behavior for k < 1/4, since there are no finite fixed points, and therefore by Question 1.1.8 no 2-cycles (it is not hard to generalize this to n-cycles). To confirm this, verify that iterations diverge to +∞ from any initial condition.

Note: The family of maps in this question is the one extensively analyzed by Devaney in his excellent textbook A First Course in Chaotic Dynamical Systems.

Extra Credit Question 1.1.12 You are given a problem to solve which, when reduced to mathematical form, looks like

    s = f_c(s)    (1.1.40)

where f is a complicated function that depends on a parameter c. You know that there is a solution of this equation in the vicinity of s_0. To solve this equation you try to iterate it (Newton's method) and it works, since you find that f^k(s_0) converges nicely to a solution. Now, however, you realize that you need to solve this problem for a slightly different value of the parameter c, and when you try to iterate the equation you can't get the value of s to converge. Instead the values start to oscillate and then behave in a completely erratic
way. Suggest a solution for this problem and see if it works for the function f_c(s) = cs(1 − s), c = 3.8, s_0 = 0.5. A solution is given in stages (a)-(c) below.
Solution 1.1.12(a) A common resolution of this problem is to consider iterating the function:

    h_c(s) = αs + (1 − α) f_c(s)    (1.1.41)

where we can adjust α to obtain rapid convergence. Note that solutions of

    s = h_c(s)    (1.1.42)

are the same as solutions of the original problem.
Question 1.1.12(b) Explain why this could work.

Solution 1.1.12(b) The derivative of this function at a fixed point can be controlled by the value of α. It is a linear interpolation between the fixed-point derivative of f_c and 1. If the fixed point is unstable and oscillating, the derivative of f_c must be less than −1, and the interpolation should help.

We can also explain this result without appealing to our work on iterative maps by noting that if the iteration is causing us to overshoot the mark, it makes sense to mix the value s we start from with the value we get from f_c(s) to get a better estimate.
Question 1.1.12(c) Explain how to pick α.

Solution 1.1.12(c) If the solution is oscillating, then it makes sense to assume that the fixed point is in between successive values, and the distance is revealed by how much further it gets each time; i.e., we assume that the iteration is essentially a linear map near the fixed point, and we adjust α so that we compensate exactly for the overshoot of f_c.

Using two trial iterations, a linear approximation to f_c at s_0 looks like:

    s_2 = f_c(s_1) ≈ g(s_1 − s_0) + s_0
    s_3 = f_c(s_2) ≈ g(s_2 − s_0) + s_0    (1.1.43)

Adopting the linear approximation as a definition of g we have:

    g ≡ (s_3 − s_2)/(s_2 − s_1)    (1.1.44)

Set up α so that the first iteration of the modified system will take you to the desired answer:

    s_0 = αs_1 + (1 − α) f_c(s_1)    (1.1.45)

or

    s_0 − s_1 = (1 − α)(f_c(s_1) − s_1) = (1 − α)(s_2 − s_1)    (1.1.46)
    (1 − α) = (s_0 − s_1)/(s_2 − s_1)    (1.1.47)
01adBARYAM_29412 3/10/02 10:15 AM Page 37
To eliminate the unknown s
0
we use Eq. (1.1.43) to obtain:
(1.1.48)
(1.1.49)
or
(1.1.50)
(1.1.51)
It is easy to check, using the formula in t er ms of g, that the modified iter a
tion has a zero der ivative at s
0
when we use the approximate linear forms for
f
c
. This means we have the best convergence possible using the information
from two iterations of f
c
. We then use the value of to iterate to convergence.
Tr y it!
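Taking up the "Try it!": a sketch of the scheme for f_c(s) = cs(1 − s), c = 3.8, s_0 = 0.5. The variable names are our own; plain iteration of f_c oscillates erratically here, while the mixed map h_c converges quickly:

```python
# Relaxation scheme of Question 1.1.12: estimate the local slope g from two
# trial iterations (Eq. 1.1.44), pick alpha from Eq. (1.1.51), then iterate
# h_c(s) = alpha*s + (1 - alpha)*f_c(s). A sketch only.

c = 3.8
f = lambda s: c * s * (1 - s)

# Two trial iterations from the starting guess.
s1 = 0.5
s2 = f(s1)
s3 = f(s2)

g = (s3 - s2) / (s2 - s1)      # Eq. (1.1.44)
alpha = -g / (1 - g)           # Eq. (1.1.51)

h = lambda s: alpha * s + (1 - alpha) * f(s)
s = s1
for _ in range(50):
    s = h(s)

print(s)             # converges to the fixed point of f_c
print((c - 1) / c)   # the exact fixed point, (c - 1)/c
```

Even though the fixed point of f_c is unstable (its slope is 2 − c = −1.8), the interpolated map has a small slope there, so the iteration converges.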
1.2 Stochastic Iterative Maps
Many of the systems we would like to consider are described by system variables whose value at the next time step we cannot predict with complete certainty. The uncertainty may arise from many sources, including the existence of interactions and parameters that are too complicated or not very relevant to our problem. We are then faced with describing a system in which the outcome of an iteration is probabilistic and not deterministic. Such systems are called stochastic systems. There are several ways to describe such systems mathematically. One of them is to consider the outcome of a particular update to be selected from a set of possible values. The probability of each of the possible values must be specified. This description is not really a model of a single system, because each realization of the system will do something different. Instead, this is a model of a collection of systems, an ensemble. Our task is to study the properties of this ensemble.

A stochastic system is generally described by the time evolution of random variables. We begin the discussion by defining a random variable. A random variable s is defined by its probability distribution P_s(s′), which describes the likelihood that s has the value s′. If s is a continuous variable, then P_s(s′)ds′ is the probability that s resides between s′ and s′ + ds′. Note that the subscript is the variable name rather than an index. For example, s might be a binary variable that can have the value +1 or −1. P_s(1) is the probability that s = 1 and P_s(−1) is the probability that s = −1. If s is the outcome of an unbiased coin toss, with heads called 1 and tails called −1, both of these values are 1/2. When no confusion can arise, the notation P_s(s′) is abbreviated to P(s), where s may be either the variable or the value. The sum over all possible values of the probability must be 1:

    Σ_{s′} P_s(s′) = 1    (1.2.1)
In the discussion of a system described by random variables, we often would like to know the average value of some quantity Q(s) that depends in a definite way on the value of the stochastic variable s. This average is given by:

    <Q(s)> = Σ_{s′} P_s(s′) Q(s′)    (1.2.2)

Note that the average is a linear operation.

We now consider the case of a time-dependent random variable. Rather than describing the time dependence of the variable s(t), we describe the time dependence of the probability distribution P_s(s′;t). Similar to the iterative map, we can consider the case where the outcome only depends on the value of the system variable at a previous time, and the transition probabilities do not depend explicitly on time. Such systems are called Markov chains. The transition probabilities from a state at a particular time to the next discrete time are written:

    P_s(s′(t) | s′(t − 1))    (1.2.3)

P_s is used as the notation for the transition probability, since it is also the probability distribution of s at time t, given a particular value s′(t − 1) at the previous time. The use of a time index for the arguments illustrates the use of the transition probability. P_s(1|1) is the probability that when s = 1 at time t − 1, then s = 1 at time t. P_s(−1|1) is the probability that when s = 1 at time t − 1, then s = −1 at time t. The transition probabilities, along with the initial probability distribution of the system P_s(s′; t = 0), determine the time-dependent ensemble that we are interested in. Assuming that we don't lose systems on the way, the transition probabilities of Eq. (1.2.3) must satisfy:

    Σ_{s″} P_s(s″ | s′) = 1    (1.2.4)

This states that no matter what the value of the system variable is at a particular time, it must reach some value at the next time.

The stochastic system described by transition probabilities can be written as an iterative map on the probability distribution P(s):

    P_s(s′;t) = Σ_{s″} P_s(s′ | s″) P_s(s″;t − 1)    (1.2.5)

It may be more intuitive to write this using the notation

    P_s(s′(t);t) = Σ_{s′(t−1)} P_s(s′(t) | s′(t − 1)) P_s(s′(t − 1);t − 1)    (1.2.6)

in which case it may be sufficient, though hazardous, to write the abbreviated form

    P(s(t)) = Σ_{s(t−1)} P(s(t) | s(t − 1)) P(s(t − 1))    (1.2.7)
It is important to recognize that the time evolution equation for the probability is linear. The linear evolution of this system (Eq. (1.2.5)) guarantees that superposition applies. If we start with an initial distribution P(s;0) = P_1(s;0) + P_2(s;0) at time t = 0, then we could find the result at time t by separately looking at the evolution of each of the probabilities P_1(s;0) and P_2(s;0). Explicitly, we can write P(s;t) = P_1(s;t) + P_2(s;t). The meaning of this equation should be well noted. The right side of the equation is the sum of the evolved probabilities P_1(s;t) and P_2(s;t). This linearity is a direct consequence of the independence of different members of the ensemble and says nothing about the complexity of the dynamics.

We note that ultimately we are interested in the behavior of a particular system s(t) that only has one value of s at every time t. The ensemble describes how many such systems will behave. Analytically it is easier to describe the ensemble as a whole; however, simulations may also be used to observe the behavior of a single system.
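The iterative map on probabilities, Eq. (1.2.5), can be sketched for a two-state system. The transition probabilities below are illustrative assumptions of ours, not values from the text:

```python
# Eq. (1.2.5) as code for a two-state system s in {+1, -1}.
# P[(s_new, s_old)] is the transition probability P(s_new | s_old);
# for each s_old the outgoing probabilities sum to 1 (Eq. 1.2.4).
P = {(1, 1): 0.9, (-1, 1): 0.1,
     (1, -1): 0.4, (-1, -1): 0.6}

def step(dist):
    """One application of Eq. (1.2.5) to the distribution dist[s]."""
    return {s_new: sum(P[(s_new, s_old)] * dist[s_old] for s_old in (1, -1))
            for s_new in (1, -1)}

dist = {1: 1.0, -1: 0.0}   # start certain that s = +1
for _ in range(50):
    dist = step(dist)

print(dist)  # approaches the steady-state distribution of this chain
```

Because the update is linear in dist, evolving a sum of two initial distributions gives the sum of their separate evolutions, which is the superposition property noted above.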
1.2.1 Random walk

Stochastic systems with only one binary variable might seem to be trivial, but we will devote quite a bit of attention to this problem. We begin by considering the simplest possible binary stochastic system. This is the system which corresponds to a coin toss. Ideally, for each toss there is equal probability of heads (s = +1) or tails (s = −1), and there is no memory from one toss to the next. The ensemble at each time is independent of time and has an equal probability of ±1:

    P(s;t) = (1/2)δ_{s,1} + (1/2)δ_{s,−1}    (1.2.8)

where the discrete delta function is defined by

    δ_{i,j} = 1 if i = j, 0 if i ≠ j    (1.2.9)

Since Eq. (1.2.8) is independent of what happens at all previous times, the evolution of the state variable is given by the same expression

    P(s′ | s) = (1/2)δ_{s′,1} + (1/2)δ_{s′,−1}    (1.2.10)

We can illustrate the evaluation of the average of a function of s at time t:

    <Q(s)>_t = Σ_{s′=±1} Q(s′) P_s(s′;t) = Σ_{s′=±1} Q(s′) [(1/2)δ_{s′,1} + (1/2)δ_{s′,−1}] = (1/2) Σ_{s′=±1} Q(s′)    (1.2.11)

For example, if we just take Q(s) to be s itself, we have the average of the system variable:

    <s>_t = (1/2) Σ_{s′=±1} s′ = 0    (1.2.12)
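A sampled ensemble reproduces these averages. A sketch, with the sample size and random seed as our own choices:

```python
# Check Eqs. (1.2.11)-(1.2.12) by sampling the fair-coin ensemble:
# the average of Q(s) approaches (Q(1) + Q(-1))/2, and <s> approaches 0.
import random

random.seed(0)
tosses = [random.choice([1, -1]) for _ in range(100000)]

mean_s = sum(tosses) / len(tosses)                 # estimates <s> = 0
mean_s2 = sum(s * s for s in tosses) / len(tosses)  # estimates <s^2> = 1

print(mean_s)   # near 0, within statistical fluctuations
print(mean_s2)  # exactly 1, since s^2 = 1 for s = +/-1
```

The sample mean differs from the ensemble average <s> = 0 by fluctuations of order 1/√N, a point taken up in the treatment of the random walk.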
Question 1.2.1 Will you win more fair coin tosses if (a) you pick heads
every time, or if (b) you alternate heads and tails, or if (c) you pick heads
or tails at random, or if (d) you pick heads and tails by some other system?
Explain why.

Solution 1.2.1 In general, we cannot predict the number of coin tosses that
will be won; we can only estimate it based on the chance of winning.
Assuming a fair coin means that this is the best that can be done. Any of the
possibilities (a)–(c) gives the same chance of winning. In none of these ways
of gambling does the choice you make correlate with the result of the coin
toss. The only system (d) that can help is one in which you have some information
about what the result of the toss will be, like betting on the known result after the
coin is tossed. A way to write this formally is to write the probability distribution
of the choice that you are making. This choice is also a stochastic
process. Calling the choice c(t), the four possibilities mentioned are:
(a)  $$P(c;t) = \delta_{c,1}$$  (1.2.13)

(b)  $$P(c;t) = \frac{1+(-1)^t}{2}\,\delta_{c,1} + \frac{1-(-1)^t}{2}\,\delta_{c,-1} = \operatorname{mod}_2(t)\,\delta_{c,1} + \operatorname{mod}_2(t+1)\,\delta_{c,-1}$$  (1.2.14)

(c)  $$P(c;t) = \frac{1}{2}\delta_{c,1} + \frac{1}{2}\delta_{c,-1}$$  (1.2.15)

(d)  $$P(c;t) = \delta_{c,s(t)}$$  (1.2.16)

It is sufficient to show that the average probability of winning is the
same in each of (a)–(c) and is just 1/2. We follow through the manipulations
in order to illustrate some concepts in the treatment of more than one stochastic
variable. We have to sum over the probabilities of each of the possible
values of the coin toss and each of the values of the choices, adding up
the probability that they coincide at a particular time t:

$$\langle \delta_{c,s} \rangle = \sum_{s'} \sum_{c'} \delta_{c',s'}\, P_s(s';t)\, P_c(c';t)$$  (1.2.17)

This expression assumes that the values of the coin toss and the value of
the choice are independent, so that the joint probability of having a particular
value of s and a particular value of c is the product of the probabilities of
each of the variables independently:

$$P_{s,c}(s',c';t) = P_s(s';t)\,P_c(c';t)$$  (1.2.18)

that is, the probabilities of independent variables factor. This is valid in cases
(a)–(c) and not in case (d), where the probability of c occurring is explicitly
a function of the value of s.

We evaluate the probability of winning in each case (a) through (c) using
$$\langle \delta_{c,s} \rangle = \sum_{s'} \sum_{c'} \delta_{c',s'} \left( \frac{1}{2}\delta_{s',1} + \frac{1}{2}\delta_{s',-1} \right) P_c(c';t) = \sum_{c'} \left( \frac{1}{2}\delta_{c',1} + \frac{1}{2}\delta_{c',-1} \right) P_c(c';t) = \frac{1}{2}P_c(1;t) + \frac{1}{2}P_c(-1;t) = \frac{1}{2}$$  (1.2.19)

where the last equality follows from the normalization of the probability (the
sum over all possibilities must be 1, Eq. (1.2.1)) and does not depend at all
on the distribution. This shows that the independence of the variables guarantees
that the probability of a win is just 1/2.

For the last case (d) the trivial answer, that a win is guaranteed by this
method of gambling, can be arrived at formally by evaluating

$$\langle \delta_{c,s} \rangle = \sum_{s'} \sum_{c'} \delta_{c',s'}\, P_{s,c}(s',c';t)$$  (1.2.20)

The value of s at time t is independent of the value of c, but the value of c depends
on the value of s. The joint probability P_{s,c}(s′,c′;t) may be written as the
product of the probability of a particular value of s = s′ times the conditional
probability P_c(c′|s′;t) of a particular value of c = c′ given the assumed value
of s:

$$\langle \delta_{c,s} \rangle = \sum_{s'} \sum_{c'} \delta_{c',s'}\, P_s(s';t)\, P_c(c'|s';t) = \sum_{s'} \sum_{c'} \delta_{c',s'}\, P_s(s';t)\, \delta_{c',s'} = \sum_{s'} P_s(s';t) = 1$$  (1.2.21)

The next step in our analysis of the binary stochastic system is to consider the behavior
of the sum of s(t) over a particular number of time steps. This sum is the difference
between the total number of heads and the total number of tails. It is equivalent
to asking how much you will win or lose if you gamble an equal amount of money
on each coin toss after a certain number of bets. This problem is known as a random
walk, and we will define it as a consideration of the state variable

$$d(t) = \sum_{t'=1}^{t} s(t')$$  (1.2.22)

The way to write the evolution of the state variable is:

$$P(d'|d) = \frac{1}{2}\delta_{d',d+1} + \frac{1}{2}\delta_{d',d-1}$$  (1.2.23)

Thus a random walk considers a state variable d that can take integer values d ∈ {…,
−1, 0, 1, …}. At every time step, d(t) can only move to a value one higher or one lower
than where it is. We assume that the probability of a step to the right (higher) is equal
to that of a step to the left (lower). For convenience, we assume (with no loss of generality)
that the system starts at position d(0) = 0. This is built into Eq. (1.2.22). Because
of the symmetry of the system under a shift of the origin, this is equivalent to considering
any other starting point. Once we solve for the probability distribution of d at
time t, because of superposition we can also find the result of evolving any initial
probability distribution P(d;t = 0).
We can picture the random walk as that of a drunk who has difficulty consistently
moving forward. Our model of this walk assumes that the drunk is equally
likely to take a step forward or backward. Starting at position 0, he moves to either
+1 or −1. Let's say it was +1. Next he moves to +2 or back to 0. Let's say it was 0. Next
to +1 or −1. Let's say it was +1. Next to +2 or 0. Let's say +2. Next to +3 or +1. Let's
say +1. And so on.
What is the value of the system variable d(t) at time t? This is equivalent to asking
how far the walk has progressed after t steps. Of course there is no way to know how
far a particular system goes without watching it. The average distance over the ensemble
of systems is the average over all possible values of s(t). This average is given
by applying Eq. (1.2.2) or Eq. (1.2.11) to all of the variables s(t):

$$\langle d(t) \rangle = \frac{1}{2}\sum_{s(1)=\pm 1} \frac{1}{2}\sum_{s(2)=\pm 1} \frac{1}{2}\sum_{s(3)=\pm 1} \cdots \frac{1}{2}\sum_{s(t)=\pm 1} d(t) = \sum_{t'=1}^{t} \langle s(t') \rangle = 0$$  (1.2.24)

The average is written out explicitly on the first line using Eq. (1.2.11). The second-line
expression can be arrived at either directly or from the linearity of the average. The final
answer is clear, since it is equally likely for the walker to move to the right as to the
left.

We can also ask what is a typical distance traveled by a particular walker. By typical
distance we mean how far from the starting point. This can either be defined by
the average absolute value of the distance, or, as is more commonly accepted, the root
mean square (RMS) distance:

$$\sigma(t) = \sqrt{\langle d(t)^2 \rangle}$$  (1.2.25)

$$\langle d(t)^2 \rangle = \left\langle \left( \sum_{t'=1}^{t} s(t') \right)^2 \right\rangle = \left\langle \sum_{t',t''=1}^{t} s(t')s(t'') \right\rangle = \sum_{t',t''=1}^{t} \langle s(t')s(t'') \rangle$$  (1.2.26)

To evaluate the average of the product of the two steps, we treat differently the case in
which they are the same step and the case in which they are different steps. When the two steps
are the same one we use s(t) = ±1 to obtain:

$$\langle s(t)^2 \rangle = \langle 1 \rangle = 1$$  (1.2.27)

which follows from the normalization of the probability (or is obvious). To evaluate
the average of the product of two steps at different times we need the joint probability
of s(t) and s(t′). This is the probability that each of them will take a particular
value. Because we have assumed that the steps are independent, the joint probability is
the product of the probabilities for each one separately:
$$P(s(t), s(t')) = P(s(t))\,P(s(t')) \qquad t \neq t'$$  (1.2.28)

so that, for example, there is a 1/4 chance that s(t) = +1 and s(t′) = −1. The independence
of the two steps leads the average of the product of the two steps to factor:

$$\langle s(t)s(t') \rangle = \sum_{s(t),s(t')} P(s(t),s(t'))\,s(t)s(t') = \sum_{s(t),s(t')} P(s(t))P(s(t'))\,s(t)s(t') = \langle s(t) \rangle \langle s(t') \rangle = 0 \qquad t \neq t'$$  (1.2.29)

This is zero, since either of the averages is zero. We have the combined result:

$$\langle s(t)s(t') \rangle = \delta_{t,t'}$$  (1.2.30)

and finally:

$$\langle d(t)^2 \rangle = \sum_{t',t''=1}^{t} \langle s(t')s(t'') \rangle = \sum_{t',t''=1}^{t} \delta_{t',t''} = \sum_{t'=1}^{t} 1 = t$$  (1.2.31)

This gives the classic and important result that a random walk travels a typical distance
that grows as the square root of the number of steps taken: σ(t) = √t.

We can now consider more completely the probability distribution of the position
of the walker at time t. The probability distribution at t = 0 may be written:

$$P(d;0) = \delta_{d,0}$$  (1.2.32)

After the first time step the probability distribution changes to

$$P(d;1) = \frac{1}{2}\delta_{d,1} + \frac{1}{2}\delta_{d,-1}$$  (1.2.33)

this results from the definition d(1) = s(1). After the second step d(2) = s(1) + s(2) it
is:

$$P(d;2) = \frac{1}{4}\delta_{d,2} + \frac{1}{2}\delta_{d,0} + \frac{1}{4}\delta_{d,-2}$$  (1.2.34)

More generally, it is not difficult to see that the probabilities are given by normalized
binomial coefficients, since the number of ones chosen out of t steps is equivalent to
the number of powers of x in (1 + x)^t. To reach a position d after t steps we must take
(t + d)/2 steps to the right and (t − d)/2 steps to the left. The sum of these is the number
of steps t and their difference is d. Since each choice has 1/2 probability we have:
$$P(d;t) = \frac{1}{2^t}\binom{t}{(d+t)/2}\,\delta^{\mathrm{odd\mbox{-}even}}_{t,d} = \frac{1}{2^t}\,\frac{t!}{[(d+t)/2]!\,[(t-d)/2]!}\,\delta^{\mathrm{odd\mbox{-}even}}_{t,d}, \qquad \delta^{\mathrm{odd\mbox{-}even}}_{t,d} = \frac{1+(-1)^{t+d}}{2}$$  (1.2.35)

where the unusual delta function imposes the condition that d takes only odd or only
even values depending on whether t is odd or even.

Let us now consider what happens after a long time. The probability distribution
spreads out, and a single step is a small distance compared to the typical distance traveled.
We can consider d and t to be continuous variables where both conditions
d,t >> 1 are satisfied. Moreover, we can also consider |d| << t, because the chance that
all steps will be taken in one direction becomes very small. This enables us to use
Stirling's approximation to the factorial

$$x! \sim \sqrt{2\pi x}\,e^{-x}x^x \qquad\qquad \ln(x!) \sim x(\ln x - 1) + \ln\sqrt{2\pi x}$$  (1.2.36)

For large t it also makes sense not to restrict d to be either odd or even. In order to allow
both, we, in effect, interpolate and then take only half of the probability we have
in Eq. (1.2.35). This leads to the expression:

$$P(d,t) = \sqrt{\frac{t}{2\pi(t-d)(t+d)}}\;\frac{t^t\,e^{-t}}{2^t\,[(d+t)/2]^{[(d+t)/2]}\,[(t-d)/2]^{[(t-d)/2]}\,e^{-(d+t)/2-(t-d)/2}} = \bigl(2\pi t(1-x^2)\bigr)^{-1/2}(1+x)^{-[(1+x)t/2]}(1-x)^{-[(1-x)t/2]}$$  (1.2.37)

where we have defined x = d/t. To approximate this expression it is easier to consider
it in logarithmic form:

$$\ln(P(d,t)) = -(t/2)\bigl[(1+x)\ln(1+x) + (1-x)\ln(1-x)\bigr] - (1/2)\ln\bigl(2\pi t(1-x^2)\bigr) \approx -(t/2)\bigl[(1+x)(x - x^2/2 + \cdots) + (1-x)(-x - x^2/2 + \cdots)\bigr] - (1/2)\ln(2\pi t) = -tx^2/2 - \ln\sqrt{2\pi t}$$  (1.2.38)

or exponentiating:

$$P(d,t) = \frac{1}{\sqrt{2\pi t}}\,e^{-d^2/2t} = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-d^2/2\sigma^2}$$  (1.2.39)
The prefactor of the exponential, 1/√(2πt), originates from the factor √(2πx) in
Eq. (1.2.36). It is independent of d and takes care of the normalization of the probability.
The result is a Gaussian distribution. Questions 1.2.2–1.2.5 investigate higher-order
corrections to the Gaussian distribution.
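The agreement between the exact binomial of Eq. (1.2.35) and the Gaussian of Eq. (1.2.39) can be seen numerically. This is a sketch of ours (function names are not from the text); the Gaussian is doubled because the exact walk occupies only every other site, undoing the interpolation factor of 1/2 discussed above:

```python
import math

def p_exact(d, t):
    """Eq. (1.2.35): the normalized binomial coefficient, zero when d and t
    have opposite parity."""
    if (d + t) % 2 != 0 or abs(d) > t:
        return 0.0
    return math.comb(t, (d + t) // 2) / 2 ** t

def p_gauss(d, t):
    """Eq. (1.2.39), doubled for comparison at the parity-allowed sites."""
    return 2.0 / math.sqrt(2 * math.pi * t) * math.exp(-d * d / (2 * t))

# At t = 100 the two expressions agree to better than 1% over the typical
# range |d| <= 2*sqrt(t).
```

The residual discrepancy is exactly the higher-order correction worked out in Questions 1.2.2–1.2.5.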
Question 1.2.2 In order to obtain a correction to the Gaussian distribution
we must add a correction term to Stirling's approximation:

$$x! \sim \sqrt{2\pi x}\,e^{-x}x^x\left(1 + \frac{1}{12x} + \cdots\right) \qquad\qquad \ln(x!) \sim x(\ln x - 1) + \ln\sqrt{2\pi x} + \ln\left(1 + \frac{1}{12x} + \cdots\right)$$  (1.2.40)

Using this expression, find the first correction term to Eq. (1.2.37).

Solution 1.2.2 The correction term in Stirling's approximation contributes
a factor to Eq. (1.2.37) which is (for convenience we write here c = 1/12):

$$\frac{(1 + c/t)}{(1 + 2c/(t+d))(1 + 2c/(t-d))} = \left(1 - \frac{3c}{t} + \cdots\right) = \left(1 - \frac{1}{4t} + \cdots\right)$$  (1.2.41)

where we have only kept the largest correction term, neglecting d compared
to t. Note that the correction term vanishes as t becomes large.

Question 1.2.3 Keeping additional terms of the expansion in Eq. (1.2.38),
and the result of Question 1.2.2, find the first-order correction terms to
the Gaussian distribution.

Solution 1.2.3 Correction terms in Eq. (1.2.38) arise from several places.
We want to keep all terms that are of order 1/t. To do this we must keep in
mind that a typical distance traveled is d ~ √t, so that x = d/t ~ 1/√t. The next
terms are obtained from:

$$\begin{aligned} \ln(P(d,t)) &= -(t/2)\bigl[(1+x)\ln(1+x) + (1-x)\ln(1-x)\bigr] - (1/2)\ln\bigl(2\pi t(1-x^2)\bigr) + \ln(1 - 1/4t) \\ &\approx -(t/2)\Bigl[(1+x)\Bigl(x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 \cdots\Bigr) + (1-x)\Bigl(-x - \tfrac{1}{2}x^2 - \tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 \cdots\Bigr)\Bigr] - \ln\sqrt{2\pi t} - (1/2)\ln(1-x^2) + \ln(1 - 1/4t) \\ &\approx -(t/2)\Bigl[\Bigl(x + x^2 - \tfrac{1}{2}x^2 - \tfrac{1}{2}x^3 + \tfrac{1}{3}x^3 + \tfrac{1}{3}x^4 - \tfrac{1}{4}x^4 \cdots\Bigr) + \Bigl(-x + x^2 - \tfrac{1}{2}x^2 + \tfrac{1}{2}x^3 - \tfrac{1}{3}x^3 + \tfrac{1}{3}x^4 - \tfrac{1}{4}x^4 \cdots\Bigr)\Bigr] - \ln\sqrt{2\pi t} + (x^2/2 + \cdots) + (-1/4t + \cdots) \\ &= -tx^2/2 - \ln\sqrt{2\pi t} - tx^4/12 + x^2/2 - 1/4t \end{aligned}$$  (1.2.42)

This gives us a distribution:
$$P(d,t) = \frac{1}{\sqrt{2\pi t}}\,e^{-d^2/2t}\,e^{-d^4/12t^3 + d^2/2t^2 - 1/4t}$$  (1.2.43)

Question 1.2.4 What is the size of the additional factor? Estimate the
size of this term as t becomes large.

Solution 1.2.4 The typical value of the variable d is its root mean square
value σ = √t. At this value the additional term gives a factor

$$e^{1/6t}$$  (1.2.44)

which approaches 1 as time increases.

Question 1.2.5 What is the fractional error that we will make if we neglect
this term after one hundred steps? After ten thousand steps?

Solution 1.2.5 After one hundred time steps the walker has traveled a typical
distance of ten steps. We generally approximate the probability of arriving
at this distance using Eq. (1.2.39). The fractional error in the probability
of arriving at this distance according to Eq. (1.2.44) is 1 − e^{1/6t} ≈ −1/6t =
−0.00167. So already at a distance of ten steps the error is less than 0.2%.

It is much less likely for the walker to arrive at the distance 2σ = 20. The
ratio of the probability to arrive at 20 compared to 10 is e^{−2}/e^{−0.5} ~ 0.22. If
we want to know the error of this smaller-probability case we would write
(1 − e^{−16/12t + 4/2t − 1/4t}) = (1 − e^{5/12t}) ≈ −0.0042, which is a larger but still small
error.

After ten thousand steps the errors are smaller than the errors at one
hundred steps by a factor of one hundred.
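The 1/6t estimate of Solution 1.2.5 can be confirmed directly. In this sketch (ours, not from the text) the exact binomial probability at t = 100, d = 10 is compared with the Gaussian approximation, doubled for the odd/even interpolation:

```python
import math

t, d = 100, 10                      # one hundred steps, typical distance sqrt(t)
exact = math.comb(t, (d + t) // 2) / 2 ** t
# Eq. (1.2.39), doubled since the walk occupies only every other site:
gauss = 2.0 / math.sqrt(2 * math.pi * t) * math.exp(-d * d / (2 * t))
rel_error = exact / gauss - 1.0     # Eq. (1.2.44) predicts about +1/(6t)
```

The Gaussian slightly underestimates the exact probability at the typical distance, in agreement with the sign of the correction factor e^{1/6t}.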
1.2.2 Generalized random walk and the central limit theorem

We can generalize the random walk by allowing a variety of steps from the current location
of the walker to sites nearby, not only to the adjacent sites and not only to integer
locations. If we restrict ourselves to steps that on average are balanced left and
right and are not too long ranged, we can show that all such systems have the same
behavior as the simplest random walk at long enough times (and characteristically
not even for very long times). This is the content of the central limit theorem. It says
that summing any set of independent random variables eventually leads to a Gaussian
distribution of probabilities, which is the same distribution as the one we arrived at
for the random walk. The reason that the same distribution arises is that successive iteration
of the probability update equation, Eq. (1.2.7), smoothes out the distribution,
and the only relevant information that survives is the width of the distribution, which
is given by σ(t). The proof given below makes use of a Fourier transform and can be
skipped by readers who are not well acquainted with transforms. In the next section
we will also include a bias in the random walk. For long times this can be described as
an average motion superimposed on the unbiased random walk. We start with the unbiased
random walk.

Each step of the random walk is described by the state variable s(t) at time t. The
probability of a particular step size is an unspecified function that is independent of
time:

$$P(s;t) = f(s)$$  (1.2.45)

We treat the case of integer values of s. The continuum case is Question 1.2.6. The absence
of bias in the random walk is described by setting the average displacement in
a single step to zero:

$$\langle s \rangle = \sum_s s\,f(s) = 0$$  (1.2.46)

The statement above that each step is not too long ranged is mathematically just that
the mean square displacement in a single step has a well-defined value (i.e., is not
infinite):

$$\langle s^2 \rangle = \sum_s s^2 f(s) = \sigma_0^2$$  (1.2.47)

Eqs. (1.2.45)–(1.2.47) hold at all times.

We can still evaluate the average of d(t) and the RMS value of d(t) directly using
the linearity of the average:

$$\langle d(t) \rangle = \left\langle \sum_{t'=1}^{t} s(t') \right\rangle = t\langle s \rangle = 0$$  (1.2.48)

$$\langle d(t)^2 \rangle = \left\langle \left( \sum_{t'=1}^{t} s(t') \right)^2 \right\rangle = \sum_{t',t''=1}^{t} \langle s(t')s(t'') \rangle$$  (1.2.49)

Since s(t′) and s(t″) are independent for t′ ≠ t″, as in Eq. (1.2.29), the average
factors:

$$\langle s(t')s(t'') \rangle = \langle s(t') \rangle \langle s(t'') \rangle = 0 \qquad t' \neq t''$$  (1.2.50)

Thus, all terms t′ ≠ t″ are zero by Eq. (1.2.46). We have:

$$\langle d(t)^2 \rangle = \sum_{t'=1}^{t} \langle s(t')^2 \rangle = t\sigma_0^2$$  (1.2.51)

This means that the typical value of d(t) is σ₀√t.

To obtain the full distribution of the random walk state variable d(t) we have to
sum the stochastic variables s(t). Since d(t) = d(t − 1) + s(t), the probability of transition
from d(t − 1) to d(t) is f(d(t) − d(t − 1)) or:
$$P(d'|d) = f(d' - d)$$  (1.2.52)

We can now write the time evolution equation and iterate it t times to get P(d;t).

$$P(d;t) = \sum_{d'} P(d|d')P(d';t-1) = \sum_{d'} f(d - d')P(d';t-1)$$  (1.2.53)

This is a convolution, so the most convenient way to effect a t-fold iteration is in
Fourier space. The Fourier representation of the probability and transition functions
for integral d is:

$$\tilde{P}(k;t) \equiv \sum_d e^{-ikd}P(d;t) \qquad\qquad \tilde{f}(k) \equiv \sum_s e^{-iks}f(s)$$  (1.2.54)

We use a Fourier series because of the restriction to integer values of d. Once we solve
the problem using the Fourier representation, the probability distribution is recovered
from the inverse formula:

$$P(d;t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ikd}\,\tilde{P}(k;t)$$  (1.2.55)

which is proved

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ikd}\,\tilde{P}(k;t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ikd}\sum_{d'} e^{-ikd'}P(d';t) = \sum_{d'} P(d';t)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ik(d-d')} = \sum_{d'} P(d';t)\,\delta_{d,d'} = P(d;t)$$  (1.2.56)

using the expression:

$$\delta_{d,d'} = \frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ik(d-d')}$$  (1.2.57)

Applying Eq. (1.2.54) to Eq. (1.2.53):

$$\tilde{P}(k;t) = \sum_d e^{-ikd}\sum_{d'} f(d-d')P(d';t-1) = \sum_{d'}\sum_d e^{-ik(d-d')}e^{-ikd'}f(d-d')P(d';t-1) = \sum_{d'}\sum_{d''} e^{-ikd''}e^{-ikd'}f(d'')P(d';t-1) = \left[\sum_{d''} e^{-ikd''}f(d'')\right]\left[\sum_{d'} e^{-ikd'}P(d';t-1)\right] = \tilde{f}(k)\,\tilde{P}(k;t-1)$$  (1.2.58)
we can iterate the equation to obtain:

$$\tilde{P}(k;t) = \tilde{f}(k)\,\tilde{P}(k;t-1) = \tilde{f}(k)^t$$  (1.2.59)

where we use the definition d(1) = s(1) that ensures that P(d;1) = P(s;1) = f(d).

For large t the walker has traveled a large distance, so we are interested in variations
of the probability P(d;t) over large distances. Thus, in Fourier space we are concerned
with small values of k. To simplify Eq. (1.2.59) for large t we expand f̃(k) near
k = 0. From Eq. (1.2.54) we can directly evaluate the derivatives of f̃(k) at k = 0 in terms
of averages:

$$\left.\frac{d^n \tilde{f}(k)}{dk^n}\right|_{k=0} = \sum_s (-is)^n f(s) = (-i)^n \langle s^n \rangle$$  (1.2.60)

We can use this expression to evaluate the terms of a Taylor expansion of f̃(k):

$$\tilde{f}(k) = \tilde{f}(0) + \left.\frac{\partial \tilde{f}(k)}{\partial k}\right|_{k=0} k + \frac{1}{2}\left.\frac{\partial^2 \tilde{f}(k)}{\partial k^2}\right|_{k=0} k^2 + \cdots$$  (1.2.61)

$$\tilde{f}(k) = \langle 1 \rangle - i\langle s \rangle k - \frac{1}{2}\langle s^2 \rangle k^2 + \cdots$$  (1.2.62)

Using the normalization of the probability (<1> = 1), and Eqs. (1.2.46) and (1.2.47),
gives us:

$$\tilde{P}(k;t) = \left(1 - \frac{1}{2}\sigma_0^2 k^2 + \cdots\right)^t$$  (1.2.63)

We must now remember that a typical value of d(t), from its RMS value, is σ₀√t. By the
properties of the Fourier transform, this implies that a typical value of k that we must
consider in Eq. (1.2.63) varies with time as 1/√t. The next term in the expansion, cubic
in k, would give rise to a term that is smaller by this factor, and therefore becomes
unimportant at long times. If we write k = q/√t, then it becomes clearer how to write
Eq. (1.2.63) using a limiting expression for large t:

$$\tilde{P}(k;t) = \left(1 - \frac{1}{2}\frac{\sigma_0^2 q^2}{t} + \cdots\right)^t \sim e^{-\sigma_0^2 q^2/2} = e^{-t\sigma_0^2 k^2/2}$$  (1.2.64)

This Gaussian, when Fourier transformed back to an expression in d, gives us a
Gaussian as follows:

$$P(d;t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} dk\, e^{ikd}e^{-t\sigma_0^2 k^2/2} \cong \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{ikd}e^{-t\sigma_0^2 k^2/2}$$  (1.2.65)
We have extended the integral because the decaying exponential becomes narrow as t
increases. The integral is performed by completing the square in the exponent, giving:

$$P(d;t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{-d^2/2t\sigma_0^2}\, e^{-(t\sigma_0^2 k^2 - 2ikd - d^2/t\sigma_0^2)/2} = \frac{1}{\sqrt{2\pi t\sigma_0^2}}\, e^{-d^2/2t\sigma_0^2}$$  (1.2.66)

or equivalently:

$$P(d;t) = \frac{1}{\sqrt{2\pi \sigma(t)^2}}\, e^{-d^2/2\sigma(t)^2}$$  (1.2.67)

which is the same as Eq. (1.2.39).
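The theorem can be illustrated concretely by iterating the convolution of Eq. (1.2.53) for a step distribution f(s) that is not binary. In this sketch (the particular step values and probabilities are our own choice, arranged so that <s> = 0), the distribution after 100 steps has exactly the width tσ₀² and its peak is close to the Gaussian of Eq. (1.2.67):

```python
import math

def convolve_step(P, f):
    """One iteration of Eq. (1.2.53): P(d;t) = sum_d' f(d-d') P(d';t-1)."""
    Q = {}
    for d, p in P.items():
        for s, ps in f.items():
            Q[d + s] = Q.get(d + s, 0.0) + ps * p
    return Q

f = {-1: 0.5, 0: 0.25, 2: 0.25}    # <s> = 0, <s^2> = sigma_0^2 = 1.5
t = 100
P = {0: 1.0}
for _ in range(t):
    P = convolve_step(P, f)

mean = sum(d * p for d, p in P.items())        # stays at 0
var = sum(d * d * p for d, p in P.items())     # should equal t * sigma_0^2
gauss_peak = 1.0 / math.sqrt(2 * math.pi * t * 1.5)   # Eq. (1.2.67) at d = 0
```

Even though the single-step distribution is skewed, the accumulated distribution approaches the symmetric Gaussian, which is the content of the central limit theorem.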
Question 1.2.6 Prove the central limit theorem when s takes a continuum
of values.

Solution 1.2.6 The proof follows the same course as the integer-valued
case. We must define the appropriate averages and the transform. The average
of s is still zero, and the mean square displacement is defined similarly:

$$\langle s \rangle = \int ds\, s\,f(s) = 0$$  (1.2.46′)

$$\langle s^2 \rangle = \int ds\, s^2 f(s) = \sigma_0^2$$  (1.2.47′)

To avoid problems of notation we substitute the variable x for the state variable
d:

$$\langle x(t) \rangle = \left\langle \sum_{t'=1}^{t} s(t') \right\rangle = t\langle s \rangle = 0$$  (1.2.48′)

Skipping steps that are the same we find:

$$\langle x(t)^2 \rangle = \left\langle \left( \sum_{t'=1}^{t} s(t') \right)^2 \right\rangle = \sum_{t'=1}^{t} \langle s(t')^2 \rangle = t\sigma_0^2$$  (1.2.51′)

since s(t′) and s(t″) are still independent for t′ ≠ t″. Eq. (1.2.53) is also essentially
unchanged:

$$P(x;t) = \int dx'\, f(x - x')\,P(x';t-1)$$  (1.2.53′)

The transform and inverse transform must now be defined using

$$\tilde{f}(k) \equiv \int ds\, e^{-iks}f(s) \qquad\qquad \tilde{P}(k;t) \equiv \int dx\, e^{-ikx}P(x;t)$$  (1.2.54′)
$$P(x;t) = \frac{1}{2\pi}\int dk\, e^{ikx}\,\tilde{P}(k;t)$$  (1.2.55′)

The latter is proved using the properties of the Dirac (continuum) delta
function:

$$\delta(x - x') = \frac{1}{2\pi}\int dk\, e^{ik(x-x')} \qquad\qquad \int dx'\, \delta(x - x')\,g(x') = g(x)$$  (1.2.56′)

where the latter equation holds for an arbitrary function g(x).

The remainder of the derivation carries forward unchanged.

1.2.3 Biased random walk

We now return to the simple random walk with binary steps of ±1. The model we consider
is a random walk that is biased in one direction. Each time a step is taken there
is a probability P₊ for a step of +1, that is different from the probability P₋ for a step
of −1, or:

$$P(s;t) = P_+\delta_{s,1} + P_-\delta_{s,-1}$$  (1.2.68)

$$P(d'|d) = P_+\delta_{d',d+1} + P_-\delta_{d',d-1}$$  (1.2.69)

where

$$P_+ + P_- = 1$$  (1.2.70)

What is the average distance traveled in time t?

$$\langle d(t) \rangle = \sum_{t'=1}^{t} \langle s(t') \rangle = \sum_{t'=1}^{t} (P_+ - P_-) = t(P_+ - P_-)$$  (1.2.71)

This equation justifies defining the mean velocity as

$$v = P_+ - P_-$$  (1.2.72)

Since we already have an average displacement, it doesn't make sense to also ask
for a typical displacement, as we did with the random walk: the typical displacement
is the average one. However, we can ask about the spread of the displacements around
the average displacement

$$\sigma(t)^2 = \langle (d(t) - \langle d(t) \rangle)^2 \rangle = \langle d(t)^2 \rangle - 2\langle d(t) \rangle^2 + \langle d(t) \rangle^2 = \langle d(t)^2 \rangle - \langle d(t) \rangle^2$$  (1.2.73)

This is called the standard deviation and it reduces to the RMS distance in the unbiased
case. For many purposes σ(t) plays the same role in the biased random walk as
in the unbiased random walk. From Eq. (1.2.71) and Eq. (1.2.72) the second term is
(vt)². The first term is:
$$\langle d(t)^2 \rangle = \left\langle \left( \sum_{t'=1}^{t} s(t') \right)^2 \right\rangle = \sum_{t',t''=1}^{t} \langle s(t')s(t'') \rangle = \sum_{t',t''=1}^{t} \left[ \delta_{t',t''} + (1 - \delta_{t',t''})(P_+^2 + P_-^2 - 2P_+P_-) \right] = t + t(t-1)v^2 = t^2v^2 + t(1 - v^2)$$  (1.2.74)

Substituting in Eq. (1.2.73):

$$\sigma(t)^2 = t(1 - v^2)$$  (1.2.75)

It is interesting to consider this expression in the two limits v = 1 and v = 0. For v = 1
the walk is deterministic, P₊ = 1 and P₋ = 0, and there is no element of chance; the
walker always walks to the right. This is equivalent to the iterative map Eq. (1.1.4). Our
result Eq. (1.2.75) is that σ = 0, as it must be for a deterministic system. However, for
smaller velocities, the spreading of the systems increases until at v = 0 we recover the
case of the unbiased random walk.

The complete probability distribution is given by:

$$P(d;t) = P_+^{(d+t)/2}\, P_-^{(t-d)/2}\, \binom{t}{(d+t)/2}\,\delta^{\mathrm{odd\mbox{-}even}}_{t,d}$$  (1.2.76)

For large t the distribution can be found as we did for the unbiased random walk. The
work is left to Question 1.2.7.
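Eqs. (1.2.71) and (1.2.75) can be checked by evolving the biased distribution exactly. A short sketch of ours (the value P₊ = 0.6 is an arbitrary choice for illustration):

```python
def evolve_biased(p_plus, t):
    """Iterate the biased transition rule, Eq. (1.2.69), from d(0) = 0."""
    P = {0: 1.0}
    for _ in range(t):
        Q = {}
        for d, p in P.items():
            Q[d + 1] = Q.get(d + 1, 0.0) + p_plus * p
            Q[d - 1] = Q.get(d - 1, 0.0) + (1.0 - p_plus) * p
        P = Q
    return P

p_plus, t = 0.6, 50
v = 2 * p_plus - 1                          # v = P+ - P-, Eq. (1.2.72)
P = evolve_biased(p_plus, t)
mean = sum(d * p for d, p in P.items())     # should equal v*t, Eq. (1.2.71)
var = sum(d * d * p for d, p in P.items()) - mean ** 2   # t*(1 - v^2), Eq. (1.2.75)
```

With these numbers the mean is vt = 10 and the variance is t(1 − v²) = 48, smaller than the unbiased value t = 50, as the discussion of the two limits above suggests.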
Question 1.2.7 Find the long time (continuum) distribution for the biased
random walk.

Solution 1.2.7 We use the Stirling approximation as before and take the logarithm
of the probability. In addition to the expression from the first line of
Eq. (1.2.38) we have an additional factor due to the coefficient of Eq. (1.2.76)
which appears in place of the factor of 1/2^t. We again define x = d/t, and divide
by 2 to allow both odd and even integers. We obtain the expression:

$$\ln(P(d,t)) = (t/2)\bigl[(1+x)\ln 2P_+ + (1-x)\ln 2P_-\bigr] - (t/2)\bigl[(1+x)\ln(1+x) + (1-x)\ln(1-x)\bigr] - (1/2)\ln\bigl(2\pi t(1-x^2)\bigr)$$  (1.2.77)

It makes the most sense to expand this around the mean of x, <x> = v. To
simplify the notation we can use Eq. (1.2.70) and Eq. (1.2.72) to write:

$$P_+ = (1+v)/2 \qquad\qquad P_- = (1-v)/2$$  (1.2.78)

With these substitutions we have:

$$\ln(P(d,t)) = (t/2)\bigl[(1+x)\ln(1+v) + (1-x)\ln(1-v)\bigr] - (t/2)\bigl[(1+x)\ln(1+x) + (1-x)\ln(1-x)\bigr] - (1/2)\ln\bigl(2\pi t(1-x^2)\bigr)$$  (1.2.79)
We expand the first two terms in a Taylor expansion around the mean of x
and expand the third term inside the logarithm. The first term of Eq. (1.2.79)
has only a constant and linear term in a Taylor expansion. These cancel the
constant and the first derivative of the Taylor expansion of the second term
of Eq. (1.2.79) at x = v. Higher derivatives arise only from the second term:

$$\begin{aligned} \ln(P(d,t)) &= -(t/2)\left[\frac{1}{(1-v^2)}(x-v)^2 + \frac{2v}{3(1-v^2)^2}(x-v)^3 + \cdots\right] - (1/2)\ln\bigl(2\pi t[(1-v^2) - 2v(x-v) + \cdots]\bigr) \\ &= -\left[\frac{(d-vt)^2}{2\sigma(t)^2} + \frac{v(d-vt)^3}{3\sigma(t)^4} + \cdots\right] - (1/2)\ln\bigl(2\pi(\sigma(t)^2 - 2v(d-vt) + \cdots)\bigr) \end{aligned}$$  (1.2.80)

In the last line we have restored d and used Eq. (1.2.75). Keeping only the first
terms in both expansions gives us:

$$P(d;t) = \frac{1}{\sqrt{2\pi\sigma(t)^2}}\, e^{-(d-vt)^2/2\sigma(t)^2}$$  (1.2.81)

which is a Gaussian distribution around the mean we obtained before. This
implies that aside from the constant velocity, and a slightly modified standard
deviation, the distribution remains unchanged.

The second term in both expansions in Eq. (1.2.80) becomes small in the
limit of large t, as long as we are not interested in the tail of the distribution.
Values of (d − vt) relevant to the main part of the distribution are given by
the standard deviation, σ(t). The second terms in Eq. (1.2.80) are thus reduced
by a factor of σ(t) compared to the first terms in the series. Since σ(t)
grows as the square root of the time, they become insignificant for long
times. The convergence is slower, however, than in the unbiased random
walk (Questions 1.2.2–1.2.5).

Question 1.2.8 You are a manager of a casino and are told by the owner
that you have a cash flow problem. In order to survive, you have to make
sure that nine out of ten working days you have a profit. Assume that the only
game in your casino is a roulette wheel. Bets are limited to only red or black
with a 2:1 payoff. The roulette wheel has an equal number of red numbers
and black numbers and one green number (the house always wins on green).
Assume that people make a fixed number of 10⁶ total $1 bets on the roulette
wheel in each day.

a. What is the maximum number of red numbers on the roulette wheel
that will still allow you to achieve your objective?

b. With this number of red numbers, how much money do you make on
average in each day?
Solution 1.2.8 The casino wins $1 for every wrong bet and loses $1 for
every right bet. The results of bets at the casino are equivalent to a random
walk with a bias given by:

$$P_+ = (N_{red} + 1)/(N_{red} + N_{black} + 1)$$  (1.2.82)

$$P_- = N_{black}/(N_{red} + N_{black} + 1)$$  (1.2.83)

where, as the manager, we consider positive the wins of the casino. The color
subscripts can be used interchangeably, since the number of red and black is
equal. The velocity of the random walk is given by:

$$v = 1/(2N_{red} + 1)$$  (1.2.84)

To calculate the probability that the casino will lose on a particular day we
must sum the probability that the random walk after 10⁶ steps will result in
a negative number. We approximate the sum by an integral over the distribution
of Eq. (1.2.81). To avoid problems of notation we replace d with y:

$$P_{loss} = \int_{-\infty}^{0} dy\, P(y;t=10^6) = \frac{1}{\sqrt{2\pi\sigma(t)^2}}\int_{-\infty}^{0} dy\, e^{-(y-vt)^2/2\sigma(t)^2} = \frac{1}{\sqrt{2\pi\sigma(t)^2}}\int_{-\infty}^{-vt} dy'\, e^{-(y')^2/2\sigma(t)^2} = \frac{1}{\sqrt{\pi}}\int_{z_0}^{\infty} dz\, e^{-z^2} = \frac{1}{2}\bigl(1 - \operatorname{erf}(z_0)\bigr)$$  (1.2.85)

$$z = -y'/\sqrt{2\sigma(t)^2} \qquad\qquad z_0 = vt/\sqrt{2\sigma(t)^2} = vt/\sqrt{2t(1-v^2)}$$  (1.2.86)

We have written the probability of loss in a day in terms of the error function
erf(x), the integral of a Gaussian defined by

$$\operatorname{erf}(z_0) \equiv \frac{2}{\sqrt{\pi}}\int_{0}^{z_0} dz\, e^{-z^2}$$  (1.2.87)

Since

$$\operatorname{erf}(\infty) = 1$$  (1.2.88)

we have the expression

$$\bigl(1 - \operatorname{erf}(z_0)\bigr) \equiv \frac{2}{\sqrt{\pi}}\int_{z_0}^{\infty} dz\, e^{-z^2}$$  (1.2.89)

which is also known as the complementary error function erfc(x).
To obtain the desired constraint on the number of red numbers, or
equivalently on the velocity, we invert Eq. (1.2.85) to find a value of v that
gives the desired P_loss = 0.1, or erf(z_0) = 0.8. Looking up the error function or
using iterative guessing on an appropriate computer gives z_0 = 0.9062.
Inverting Eq. (1.2.86) gives:

    v = \frac{1}{\sqrt{t}/(\sqrt{2} z_0) - 1} \approx \sqrt{2} z_0 / \sqrt{t}    (1.2.90)

The approximation holds because t is large. The numerical result is v = 0.0013.
This gives us the desired number of each color (inverting Eq. (1.2.84)) of
N_red = 371. Of course the result is a very large number and the problem of
winning nine out of ten days is a very conservative problem for a casino. Even
if we insist on winning ninety-nine out of one hundred days we would have
erf(z_0) = 0.98, z_0 = 1.645, v = 0.0018 and N_red = 275. The profits per day in
each case are given by vt, which is approximately $1,300 and $1,800 respectively.
Of course this is much less than for bets on a more realistic roulette
wheel. Eventually, as we reduce the chance of the casino losing and z_0 becomes
larger, we might become concerned that we are describing the properties of
the tail of the distribution when we calculate the fraction of days the casino
might lose, and Eq. (1.2.85) will not be very accurate. However, it is not dif-
ficult to see that casinos do not have cash flow problems.
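This inversion is easy to reproduce numerically. The sketch below is illustrative only: it takes the 10^6-step day and the Gaussian approximation of Eqs. (1.2.85)-(1.2.86) as given, and the names `p_loss` and `solve_v` are our own.

```python
import math

T = 1.0e6  # bets per day

def p_loss(v, t=T):
    # Eqs. (1.2.85)-(1.2.86): P_loss = (1/2)(1 - erf(z0)), z0 = vt/sqrt(2t(1-v^2))
    z0 = v * t / math.sqrt(2 * t * (1 - v * v))
    return 0.5 * (1 - math.erf(z0))

def solve_v(target, t=T):
    # p_loss decreases as the bias v grows, so bisect v until P_loss hits target
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if p_loss(mid, t) > target:
            lo = mid   # losing too often: need a larger bias
        else:
            hi = mid
    return 0.5 * (lo + hi)

v = solve_v(0.1)          # casino loses one day in ten
n_red = (1 / v - 1) / 2   # invert Eq. (1.2.84): v = 1/(2 N_red + 1)
print(round(v, 4))        # about 0.0013, in line with the text's estimate
```

The bisection reproduces v of order 10^-3; small differences from the rounded numbers quoted in the text come from the approximations used there.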
In order to generalize the proof of the central limit theorem to the case of a bi-
ased random walk, we can treat the continuum case most simply by considering the
system variable \hat{x}, where (using d -> x for the continuum case):

    \hat{x} = x - \langle x \rangle_t = x - t \langle s \rangle = x - vt    (1.2.91)

Only x is a stochastic variable on the right side; v and t are numbers. Since iterations of
this variable would satisfy the conditions for the generalized random walk, the gener-
alization of the Gaussian distribution to Eq. (1.2.81) is proved. The discrete case is more
difficult to prove because we cannot shift the variable d by arbitrary amounts and con-
tinue to consider it as discrete. We can argue the discrete case to be valid on the basis
of the result for the continuum case, but a separate proof can be constructed as well.
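The shifted variable can be checked by simulation. A minimal sketch (illustrative parameters, seeded for reproducibility): simulate many biased +/-1 walks and confirm that x - vt has mean near zero and variance near t(1 - v^2).

```python
import random

random.seed(0)
t, walkers = 100, 5000
p_plus = 0.55                # probability of a +1 step (illustrative)
v = 2 * p_plus - 1           # mean step <s> = 0.1

xhat = []
for _ in range(walkers):
    x = sum(1 if random.random() < p_plus else -1 for _ in range(t))
    xhat.append(x - v * t)   # the shifted variable of Eq. (1.2.91)

mean = sum(xhat) / walkers
var = sum(d * d for d in xhat) / walkers
# mean is near 0 and var is near t(1 - v^2) = 99, as the biased-walk
# generalization of the Gaussian distribution predicts
```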
1.2.4 Master equation approach

The Master equation is an alternative approach to stochastic systems, an alternative to
Eq. (1.2.5), that is usually applied when time is continuous. We develop it starting
from the discrete time case. We can rewrite Eq. (1.2.5) in the form of a difference
equation for a particular probability P(s). Beginning from:

    P(s;t) = P(s;t-1) + \left[ \sum_{s'} P(s|s') P(s';t-1) - P(s;t-1) \right]    (1.2.92)
we extract the term where the system remains in the same state:

    P(s;t) = P(s;t-1) + \left[ \sum_{s' \ne s} P(s|s') P(s';t-1) + P(s|s) P(s;t-1) - P(s;t-1) \right]    (1.2.93)

We use the normalization of probability to write it in terms of the transitions away
from this site:

    P(s;t) = P(s;t-1) + \left[ \sum_{s' \ne s} P(s|s') P(s';t-1) + \left( 1 - \sum_{s' \ne s} P(s'|s) \right) P(s;t-1) - P(s;t-1) \right]    (1.2.94)

Canceling the terms in the bracket that refer only to the probability P(s;t - 1), we write
this as a difference equation. On the right appear only the probabilities at different
values of the state variable (s' \ne s):

    P(s;t) - P(s;t-1) = \sum_{s' \ne s} \left( P(s|s') P(s';t-1) - P(s'|s) P(s;t-1) \right)    (1.2.95)

To write the continuum form we reintroduce the time difference between steps \Delta t:

    \frac{P(s;t) - P(s;t-\Delta t)}{\Delta t} = \sum_{s' \ne s} \left( \frac{P(s|s')}{\Delta t} P(s';t-\Delta t) - \frac{P(s'|s)}{\Delta t} P(s;t-\Delta t) \right)    (1.2.96)

When the limit of \Delta t -> 0 is meaningful, it is possible to make the change to the
equation

    \dot{P}(s;t) = \sum_{s' \ne s} \left( R(s|s') P(s';t) - R(s'|s) P(s;t) \right)    (1.2.97)

where the ratio P(s|s')/\Delta t has been replaced by the rate of transition R(s|s').
Eq. (1.2.97) is called the Master equation and we can consider Eq. (1.2.95) as the dis-
crete time analog.

The Master equation has a simple interpretation: the rate of change of the prob-
ability of a particular state is the total rate at which probability is being added into that
state from all other states, minus the total rate at which probability is leaving the state.
Probability is acting like a fluid that is flowing to or from a particular state and is be-
ing conserved, as it must be. Eq. (1.2.97) is very much like the continuity equation of
fluid flow, where the density of the fluid at a particular place changes according to how
much is flowing to that location or from it. We will construct and use the Master equa-
tion approach to discuss the problem of relaxation in activated processes in
Section 1.4.
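The discrete-time form, Eq. (1.2.95), is simple to iterate directly. A minimal sketch; the three-state transition probabilities below are invented for illustration:

```python
# W[s][sp] = P(s|sp): probability per step of a transition from state sp to s
W = [[0.0, 0.2, 0.1],
     [0.3, 0.0, 0.4],
     [0.2, 0.1, 0.0]]
n = 3

def step(P):
    # Eq. (1.2.95): gain from other states minus loss to other states
    return [P[s]
            + sum(W[s][sp] * P[sp] for sp in range(n) if sp != s)
            - sum(W[sp][s] for sp in range(n) if sp != s) * P[s]
            for s in range(n)]

P = [1.0, 0.0, 0.0]       # start with all probability in state 0
for _ in range(500):
    P = step(P)
# total probability stays 1 (the "fluid" is conserved), and P settles to a
# steady state where gain and loss balance for every state
```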
1.3 Thermodynamics and Statistical Mechanics
The field of thermodynamics is easiest to understand in the context of Newtonian
mechanics. Newtonian mechanics describes the effect of forces on objects.
Thermodynamics describes the effect of heat transfer on objects. When heat is trans-
ferred, the temperature of an object changes. Temperature and heat are also intimately
related to energy. A hot gas in a piston has a high pressure and it can do mechanical
work by applying a force to a piston. By Newtonian mechanics the work is directly re-
lated to a transfer of energy. The laws of Newtonian mechanics are simplest to de-
scribe using the abstract concept of a point object with mass but no internal struc-
ture. The analogous abstraction for thermodynamic laws is a material that is in
equilibrium and (even better) homogeneous. It turns out that even the descrip-
tion of the equilibrium properties of materials is so rich and varied that this is still a
primary focus of active research today.

Statistical mechanics begins as an effort to explain the laws of thermodynamics
by considering the microscopic application of Newton's laws. Microscopically, the
temperature of a gas is found to be related to the kinetic motion of the gas molecules.
Heat transfer is the transfer of Newtonian energy from one object to another. The sta-
tistical treatment of the many particles of a material, with a key set of assumptions,
reveals that thermodynamic laws are a natural consequence of many microscopic par-
ticles interacting with each other. Our studies of complex systems will lead us to dis-
cuss the properties of systems composed of many interacting parts. The concepts and
tools of statistical mechanics will play an important role in these studies, as will the
laws of thermodynamics that emerge from them. Thermodynamics also begins to
teach us how to think about systems interacting with each other.
1.3.1 Thermodynamics
Thermodynamics describes macroscopic pieces of material in equilibrium in terms of
macroscopic parameters. Thermodynamics was developed as a result of experi-
ence/experiment and, like Newton's laws, is to be understood as a set of self-consistent
definitions and equations. As with Newtonian mechanics, where in its simplest form
objects are point particles and friction is ignored, the discussion assumes an idealiza-
tion that is directly experienced only in special circumstances. However, the funda-
mental laws, once understood, can be widely applied. The central quantities that are
to be defined and related are the energy U, temperature T, entropy S, pressure P, the
mass (which we write as the number of particles) N, and volume V. For magnets, the
quantities should include the magnetization M, and the magnetic field H. Other
macroscopic quantities that are relevant may be added as necessary within the frame-
work developed by thermodynamics. Like Newtonian mechanics, a key aspect of ther-
modynamics is to understand how systems can be acted upon or can act upon each
other. In addition to the quantities that describe the state of a system, there are two
quantities that describe actions that may be made on a system to change its state: work
and heat transfer.
The equations that relate the macroscopic quantities are known as the zeroth,
first and second laws of thermodynamics. Much of the difficulty in understanding
thermodynamics arises from the way the entropy appears as an essential but counter-
intuitive quantity. It is more easily understood in the context of a statistical treatment
included below. A second source of difficulty is that even a seemingly simple material
system, such as a piece of metal in a room, is actually quite complicated thermody-
namically. Under usual circumstances the metal is not in equilibrium but is emitting
a vapor of its own atoms. A thermodynamic treatment of the metal requires consid-
eration not only of the metal but also the vapor and even the air that applies a pres-
sure upon the metal. It is therefore generally simplest to consider the thermodynam-
ics of a gas confined in a closed (and inert) chamber as a model thermodynamic
system. We will discuss this example in detail in Question 1.3.1. The translational mo-
tion of the whole system, treated by Newtonian mechanics, is ignored.

We begin by defining the concept of equilibrium. A system left in isolation for a
long enough time achieves a macroscopic state that does not vary in time. The system
in an unchanging state is said to be in equilibrium. Thermodynamics also relies upon
a particular type of equilibrium known as thermal equilibrium. Two systems can be
brought together in such a way that they interact only by transferring heat from one
to the other. The systems are said to be in thermal contact. An example would be two
gases separated by a fixed but thermally conducting wall. After a long enough time the
system composed of the combination of the two original systems will be in equilib-
rium. We say that the two systems are in thermal equilibrium with each other. We can
generalize the definition of thermal equilibrium to include systems that are not in
contact. We say that any two systems are in thermal equilibrium with each other if
they do not change their (macroscopic) state when they are brought into thermal con-
tact. Thermal equilibrium does not imply that the system is homogeneous; for exam-
ple, the two gases may be at different pressures.

The zeroth law of thermodynamics states that if two systems are in thermal equi-
librium with a third they are in thermal equilibrium with each other. This is not ob-
vious without experience with macroscopic objects. The zeroth law implies that the
interaction that occurs during thermal contact is not specific to the materials, it is in
some sense weak, and it matters not how many or how big are the systems that are in
contact. It enables us to define the temperature T as a quantity which is the same for
all systems in thermal equilibrium. A more specific definition of the temperature
must wait till the second law of thermodynamics. We also define the concept of a ther-
mal reservoir as a very large system such that any system that we are interested in,
when brought into contact with the thermal reservoir, will change its state by trans-
ferring heat to or from the reservoir until it is in equilibrium with the reservoir, but
the transfer of heat will not affect the temperature of the reservoir.

Quite basic to the formulation and assumptions of thermodynamics is that the
macroscopic state of an isolated system in equilibrium is completely defined by a
specification of three parameters: energy, mass and volume (U,N,V). For magnets we
must add the magnetization M; we will leave this case for later. The confinement of
the system to a volume V is understood to result from some form of containment.
The state of a system can be characterized by the force per unit area, the pressure
P, exerted by the system on the container or by the container on the system, which
are the same. Since in equilibrium a system is uniquely described by the three quan-
tities (U,N,V), these determine all the other quantities, such as the pressure P and
temperature T. Strictly speaking, temperature and pressure are only defined for a sys-
tem in equilibrium, while the quantities (U,N,V) have meaning both in and out of
equilibrium.

It is assumed that for a homogeneous material, changing the size of the system by
adding more material in equilibrium at the same pressure and temperature changes
the mass, number of particles N, volume V and energy U, in direct proportion to each
other. Equivalently, it is assumed that cutting the system into smaller parts results in
each subpart retaining the same properties in proportion to each other (see Figs. 1.3.1
and 1.3.2). This means that these quantities are additive for different parts of a system
whether isolated or in thermal contact or full equilibrium:

    U = \sum_\alpha U_\alpha
    V = \sum_\alpha V_\alpha    (1.3.1)
    N = \sum_\alpha N_\alpha

where \alpha indexes the parts of the system. This would not be true if the parts of the sys-
tem were strongly interacting in such a way that the energy depended on the relative
location of the parts. Properties such as (U,N,V) that are proportional to the size of
the system are called extensive quantities. Intensive quantities are properties that do
not change with the size of the system at a given pressure and temperature. The ratio
of two extensive quantities is an intensive quantity. Examples are the particle density
N/V and the energy density U/V. The assumption of the existence of extensive and in-
tensive quantities is also far from trivial, and corresponds to the intuition that for a
macroscopic object, the local properties of the system do not depend on the size of the
system. Thus a material may be cut into two parts, or a small part may be separated
from a large part, without affecting its local properties.

The simplest thermodynamic systems are homogeneous ones, like a gas in an in-
ert container. However we can also use Eq. (1.3.1) for an inhomogeneous system. For
example, a sealed container with water inside will reach a state where both water and
vapor are in equilibrium with each other. The use of intensive quantities and the pro-
portionality of extensive quantities to each other applies only within a single phase,
a single homogeneous part of the system, either water or vapor. However, the addi-
tivity of extensive quantities in Eq. (1.3.1) still applies to the whole system. A
homogeneous as well as a heterogeneous system may contain different chemical
species. In this case the quantity N is replaced by the number of each chemical species
N_i, and the first line of Eq. (1.3.1) may be replaced by a similar equation for each species.
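Extensivity amounts to a one-line consistency check. A minimal sketch (the numerical values are arbitrary illustration, not data from the text):

```python
# Split a homogeneous system as in Eq. (1.3.1): extensive quantities divide
# in proportion, while intensive ratios such as the density N/V do not change.
U, N, V = 240.0, 3.0e22, 1.5           # arbitrary energy, particle number, volume
alpha = 0.3                            # fraction taken by the first part

parts = [(alpha * U, alpha * N, alpha * V),
         ((1 - alpha) * U, (1 - alpha) * N, (1 - alpha) * V)]

# additivity: the parts sum back to the whole
total = [sum(q) for q in zip(*parts)]
# intensivity: each part has the same density as the whole
densities = [n / v for (_, n, v) in parts]
```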
Figure 1.3.1 Thermodynamics considers macroscopic materials. A basic assumption is that
cutting a system into two parts will not affect the local properties of the material and that
the energy U, mass (or number of particles) N and the volume V will be divided in the same
proportion: a system (U, N, V) separates into parts (αU, αN, αV) and ((1−α)U, (1−α)N, (1−α)V).
The process of separation is assumed to leave the materials under the same conditions of
pressure and temperature.

Figure 1.3.2 The assumption that the local properties of a system are unaffected by subdi-
vision applies also to the case where a small part of a much larger system is removed. The lo-
cal properties, both of the small system and of the large system, are assumed to remain un-
changed. Even though the small system is much smaller than the original system, the small
system is understood to be a macroscopic piece of material. Thus it retains the same local
properties it had as part of the larger system.
The first law of thermodynamics describes how the energy of a system may
change. The energy of an isolated system is conserved. There are two macroscopic
processes that can change the energy of a system when the number of particles is fixed.
The first is work, in the sense of applying a force over a distance, such as driving a pis-
ton that compresses a gas. The second is heat transfer. This may be written as:

    dU = q + w    (1.3.2)

where q is the heat transfer into the system, w is the work done on the system and U
is the internal energy of the system. The differential d signifies the incremental change
in the quantity U as a result of the incremental process of heat transfer and work. The
work performed on a gas (or other system) is the force times the distance applied, F dx,
where we write F as the magnitude of the force and dx as an incremental distance.
Since the force is the pressure times the area, F = PA, the work is equal to the pressure
times the volume change, or:

    w = -PA dx = -P dV    (1.3.3)

The negative sign arises because positive work on the system, increasing the system's
energy, occurs when the volume change is negative. Pressure is defined to be positive.
If two systems act upon each other, then the energy transferred consists of both
the work and heat transfer. Each of these is separately equal in magnitude and op-
posite in sign:

    dU_1 = q_{21} + w_{21}
    dU_2 = q_{12} + w_{12}
    q_{12} = -q_{21}    (1.3.4)
    w_{12} = -w_{21}

where q_{21} is the heat transfer from system 2 to system 1, and w_{21} is the work performed
by system 2 on system 1. q_{12} and w_{12} are similarly defined. The last line of Eq. (1.3.4)
follows from Newton's third law. The other equations follow from setting dU = 0 (Eq.
(1.3.2)) for the total system, composed of both of the systems acting upon each other.
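The bookkeeping of Eq. (1.3.4) can be sketched directly; the numerical values below are arbitrary illustration:

```python
# Two systems exchanging heat and work, Eq. (1.3.4). The sign conventions
# make the total energy change of the isolated pair exactly zero.
q21, w21 = 3.0, -1.2     # heat and work delivered by system 2 to system 1
q12, w12 = -q21, -w21    # equal in magnitude, opposite in sign

dU1 = q21 + w21          # Eq. (1.3.2) applied to system 1
dU2 = q12 + w12          # Eq. (1.3.2) applied to system 2
# dU1 + dU2 == 0: energy lost by one system is gained by the other
```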
The second law of thermodynamics, given in the following few paragraphs, de-
scribes a few key aspects of the relationship of the equilibrium state with nonequilib-
rium states. The statement of the second law is essentially a definition and description
of properties of the entropy. Entropy enables us to describe the process of approach
to equilibrium. In the natural course of events, any system in isolation will change its
state toward equilibrium. A system which is not in equilibrium must therefore un-
dergo an irreversible process leading to equilibrium. The process is irreversible be-
cause the reverse process would take us away from equilibrium, which is impossible
for a macroscopic system. Reversible change can occur if the state of a system in equi-
librium is changed by transfer of heat or by work in such a way (slowly) that it always
remains in equilibrium.

For every macroscopic state of a system (not necessarily in equilibrium) there ex-
ists a quantity S called the entropy of the system. The change in S is positive for any
natural process (change toward equilibrium) of an isolated system:

    dS \geq 0    (1.3.5)
For an isolated system, equality holds only in equilibrium when no change occurs.
The converse is also true: any possible change that increases S is a natural process.
Therefore, for an isolated system S achieves its maximum value for the equilibrium
state.

The second property of the entropy describes how it is affected by the processes
of work and heat transfer during reversible processes. The entropy is affected only by
heat transfer and not by work. If we only perform work and do not transfer heat, the
entropy is constant. Such processes where q = 0 are called adiabatic processes. For adi-
abatic processes dS = 0.

The third property of the entropy is that it is extensive:

    S = \sum_\alpha S_\alpha    (1.3.6)

Since in equilibrium the state of the system is defined by the macroscopic quan-
tities (U,N,V), S is a function of them, S = S(U,N,V), in equilibrium. The fourth
property of the entropy is that if we keep the size of the system constant by fixing both
the number of particles N and the volume V, then the change in entropy S with in-
creasing energy U is always positive:

    \left( \frac{\partial S}{\partial U} \right)_{N,V} > 0    (1.3.7)

where the subscripts denote the (values of the) constant quantities. Because of this we
can also invert the function S = S(U,N,V) to obtain the energy U in terms of S, N and
V: U = U(S,N,V).

Finally, we mention that the zero of the entropy is arbitrary in classical treat-
ments. The zero of entropy does attain significance in statistical treatments that in-
clude quantum effects.

Having described the properties of the entropy for a single system, we can now
reconsider the problem of two interacting systems. Since the entropy describes the
process of equilibration, we consider the process by which two systems equilibrate
thermally. According to the zeroth law, when the two systems are in equilibrium they
are at the same temperature. The two systems are assumed to be isolated from any
other influence, so that together they form an isolated system with energy U_t and en-
tropy S_t. Each of the subsystems is itself in equilibrium, but they are at different tem-
peratures initially, and therefore heat is transferred to achieve equilibrium. The heat
transfer is assumed to be performed in a reversible fashion, slowly. The two subsys-
tems are also assumed to have a fixed number of particles N_1, N_2 and volume V_1, V_2.
No work is done, only heat is transferred. The energies of the two systems U_1 and U_2
and entropies S_1 and S_2 are not fixed.

The transfer of heat results in a transfer of energy between the two systems ac-
cording to Eq. (1.3.4), since the total energy

    U_t = U_1 + U_2    (1.3.8)
is conserved, we have

    dU_t = dU_1 + dU_2 = 0    (1.3.9)

We will consider the process of equilibration twice. The first time we will iden-
tify the equilibrium condition and the second time we will describe the equilibration.
At equilibrium the entropy of the whole system is maximized. Variation of the en-
tropy with respect to any internal parameter will give zero at equilibrium. We can con-
sider the change in the entropy of the system as a function of how much of the energy
is allocated to the first system:

    \frac{dS_t}{dU_1} = \frac{dS_1}{dU_1} + \frac{dS_2}{dU_1} = 0    (1.3.10)

in equilibrium. Since the total energy is fixed, using Eq. (1.3.9) we have:

    \frac{dS_t}{dU_1} = \frac{dS_1}{dU_1} - \frac{dS_2}{dU_2} = 0    (1.3.11)

or

    \frac{dS_1}{dU_1} = \frac{dS_2}{dU_2}    (1.3.12)

in equilibrium. By the definition of the temperature, any function of the derivative of
the entropy with respect to energy could be used as the temperature. It is conventional
to define the temperature T using:

    \frac{1}{T} = \left( \frac{dS}{dU} \right)_{N,V}    (1.3.13)

This definition corresponds to the Kelvin temperature scale. The units of temperature
also define the units of the entropy. This definition has the advantage that heat always
flows from the system at higher temperature to the system at lower temperature.

To prove this last statement, consider a natural small transfer of heat from one
system to the other. The transfer must result in the two systems raising their collective
entropy:

    dS_t = dS_1 + dS_2 \geq 0    (1.3.14)

We rewrite the change in entropy of each system in terms of the change in energy. We
recall that N and V are fixed for each of the two systems and the entropy is a function
only of the three macroscopic parameters (U,N,V). The change in S for each system
may be written as:

    dS_1 = \left( \frac{\partial S}{\partial U} \right)_{N_1,V_1} dU_1
    dS_2 = \left( \frac{\partial S}{\partial U} \right)_{N_2,V_2} dU_2    (1.3.15)
to arrive at:

    \left( \frac{\partial S}{\partial U} \right)_{N_1,V_1} dU_1 + \left( \frac{\partial S}{\partial U} \right)_{N_2,V_2} dU_2 \geq 0    (1.3.16)

or, using Eq. (1.3.9) and the definition of the temperature (Eq. (1.3.13)), we have:

    \left[ \frac{1}{T_1} - \frac{1}{T_2} \right] dU_1 \geq 0    (1.3.17)

or:

    (T_2 - T_1) dU_1 \geq 0    (1.3.18)

This implies that a natural process of heat transfer results in the energy of the first sys-
tem increasing (dU_1 > 0) if the temperature of the second system is greater than the
first ((T_2 - T_1) > 0), or conversely, if the temperature of the second system is less than
the temperature of the first.
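A toy numerical check of Eq. (1.3.18) can be made with an invented entropy function S(U) = C ln U (not a material from the text), for which the definition 1/T = dS/dU gives T = U/C:

```python
import math

def S(U, C=1.0):
    # illustrative entropy function; chosen only so that T = U/C
    return C * math.log(U)

U1, U2 = 2.0, 8.0        # with C = 1: T1 = 2 and T2 = 8, so system 2 is hotter
dU = 0.01                # small heat transfer from system 2 to system 1
dSt = (S(U1 + dU) - S(U1)) + (S(U2 - dU) - S(U2))
# dSt > 0: moving energy from the hotter to the colder system raises the
# total entropy, consistent with (T2 - T1) dU1 >= 0; the reverse transfer
# would lower it and so cannot occur naturally
```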
Using the definition of temperature, we can also rewrite the expression for the
change in the energy of a system due to heat transfer or work, Eq. (1.3.2). The new ex-
pression is restricted to reversible processes. As in Eq. (1.3.2), N is still fixed.
Considering only reversible processes means we consider only equilibrium states of
the system, so we can write the energy as a function of the entropy: U = U(S,N,V).
Since a reversible process changes the entropy and volume while keeping this function
valid, we can write the change in energy for a reversible process as

    dU = \left( \frac{\partial U}{\partial S} \right)_{N,V} dS + \left( \frac{\partial U}{\partial V} \right)_{N,S} dV
       = T dS + \left( \frac{\partial U}{\partial V} \right)_{N,S} dV    (1.3.19)

The first term reflects the effect of a change in entropy and the second reflects the
change in volume. The change in entropy is related to heat transfer but not to work.
If work is done and no heat is transferred, then the first term is zero. Comparing the
second term to Eq. (1.3.2) we find

    P = -\left( \frac{\partial U}{\partial V} \right)_{N,S}    (1.3.20)

and the incremental change in energy for a reversible process can be written:

    dU = T dS - P dV    (1.3.21)

This relationship enables us to make direct experimental measurements of entropy
changes. The work done on a system, in a reversible or irreversible process, changes
the energy of the system by a known amount. This energy can then be extracted in a
reversible process in the form of heat. When the system returns to its original state, we
can quantify the amount of heat transferred as a form of energy. Measured heat trans-
fer can then be related to entropy changes using q = T dS.

Our treatment of the fundamentals of thermodynamics was brief and does not
contain the many applications necessary for a detailed understanding. The properties
of S that we have described are sufficient to provide a systematic treatment of the ther-
modynamics of macroscopic bodies. However, the entropy is more understandable
from a microscopic (statistical) description of matter. In the next section we intro-
duce the statistical treatment that enables contact between a microscopic picture and
the macroscopic thermodynamic treatment of matter. We will use it to give micro-
scopic meaning to the entropy and temperature. Once we have developed the micro-
scopic picture we will discuss two applications. The first application, the ideal gas, is
discussed in Section 1.3.3. The discussion of the second application, the Ising model
of magnetic systems, is postponed to Section 1.6.
1.3.2 The macroscopic state from microscopic statistics
In order to develop a microscopic understanding of the macroscopic properties of
matter we must begin by restating the nature of the systems that thermodynamics de-
scribes. Even when developing a microscopic picture, the thermodynamic assump-
tions are relied upon as guides. Macroscopic systems are assumed to have an extremely
large number N of individual particles (e.g., at a scale of 10^23) in a volume V. Because
the size of these systems is so large, they are typically investigated by considering the
limit of N -> \infty and V -> \infty, while the density n = N/V remains constant. This is called
the thermodynamic limit. Various properties of the system are separated into exten-
sive and intensive quantities. Extensive quantities are proportional to the size of the
system. Intensive quantities are independent of the size of the system. This reflects the
intuition that local properties of a macroscopic object do not depend on the size of
the system. As in Figs. 1.3.1 and 1.3.2, the system may be cut into two parts, or a small
part may be separated from a large part without affecting its local properties.

The total energy U of an isolated system in equilibrium, along with the number
of particles N and volume V, defines the macroscopic state (macrostate) of an isolated
system in equilibrium. Microscopically, the energy of the system E is given in classical
mechanics in terms of the complete specification of the individual particle positions,
momenta and interaction potentials. Together these define the microscopic state (mi-
crostate) of the system. The microstate is defined differently in quantum mechanics
but similar considerations apply. When we describe the system microscopically we use
the notation E rather than U to describe the energy. The reason for this difference is
that macroscopically the energy U has some degree of fuzziness in its definition,
though the degree of fuzziness will not enter into our considerations. Moreover, U
may also be used to describe the energy of a system that is in thermal equilibrium with
another system. However, thinking microscopically, the energy of such a system is not
well defined, since thermal contact allows the exchange of energy between the two sys-
tems. We should also distinguish between the microscopic and macroscopic concepts
of the number of particles and the volume, but since we will not make use of this dis-
tinction, we will not do so.
There are many possible microstates that correspond to a particular macrostate
of the system specified only by U,N,V. We now make a key assumption of statistical
mechanics: that all of the possible microstates of the system occur with equal prob-
ability. The number of these microstates \Omega(U,N,V), which by definition depends on
the macroscopic parameters, turns out to be central to statistical mechanics and is di-
rectly related to the entropy. Thus it determines many of the thermodynamic proper-
ties of the system, and can be discussed even though we are not always able to obtain
it explicitly.
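For very small model systems \Omega can be enumerated exactly. A sketch, using a two-level-spin model that is our own illustration rather than a system from the text: count the microstates of N spins of which exactly E are excited.

```python
from itertools import combinations

def omega(N, E):
    # number of microstates of N two-level spins with exactly E excited:
    # each choice of which spins are excited is one distinct microstate
    return sum(1 for _ in combinations(range(N), E))

o1 = omega(6, 2)   # 15 microstates, the binomial coefficient C(6, 2)
o2 = omega(5, 3)   # 10 microstates

# for two noninteracting subsystems, every pairing of a microstate of one
# with a microstate of the other is a distinct joint microstate, so the
# counts multiply (cf. Eq. (1.3.22)):
ot = sum(1 for a in combinations(range(6), 2)
           for b in combinations(range(5), 3))
```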
We consider again the problem of interacting systems. As before, we consider two
systems (Fig. 1.3.3) that are in equilibrium separately, with state variables (U_1,N_1,V_1)
and (U_2,N_2,V_2). The systems have a number of microstates \Omega_1(U_1,N_1,V_1) and
\Omega_2(U_2,N_2,V_2) respectively. It is not necessary that the two systems be formed of the
same material or have the same functional form of \Omega(U,N,V), so the function \Omega is
also labeled by the system index. The two systems interact in a limited way, so that they
can exchange only energy. The number of particles and volume of each system re-
mains fixed. Conservation of energy requires that the total energy U_t = U_1 + U_2 re-
mains fixed, but energy may be transferred from one system to the other. As before,
our objective is to identify when energy transfer stops and equilibrium is reached.

Consider the number of microstates of the whole system \Omega_t. This number is a
function not only of the total energy of the system but also of how the energy is allo-
cated between the systems. So, we write \Omega_t(U_1,U_2), and we assume that at any time
the energy of each of the two systems is well defined. Moreover, the interaction be-
tween the two systems is sufficiently weak so that the number of states of each system
Figure 1.3.3 Illustration of a system formed out of two parts. The text discusses this system when energy is transferred from one part to the other. The transfer of energy on a microscopic scale is equivalent to the transfer of heat on a macroscopic scale, since the two systems are not allowed to change their number of particles or their volume. [The figure labels the two parts by (U_1, N_1, V_1), Ω_1(U_1, N_1, V_1), S_1(U_1, N_1, V_1) and (U_2, N_2, V_2), Ω_2(U_2, N_2, V_2), S_2(U_2, N_2, V_2), and the combined system by (U_t, N_t, V_t), Ω_t(U_t, N_t, V_t), S_t(U_t, N_t, V_t).]
may be counted independently. Then the total number of microstates is the product
of the number of microstates of each of the two systems separately.
$$\Omega_t(U_1, U_2) = \Omega_1(U_1)\,\Omega_2(U_2) \tag{1.3.22}$$

where we have dropped the arguments N and V, since they are fixed throughout this discussion. When energy is transferred, the number of microstates of each of the two systems is changed. When will the transfer of energy stop? Left on its own, the system will evolve until it reaches the most probable separation of energy. Since any particular state is equally likely, the most probable separation of energy is the separation that gives rise to the greatest possible number of states. When the number of particles is large, the greatest number of states corresponding to a particular energy separation is much larger than the number of states corresponding to any other possible separation. Thus any other possibility is completely negligible. No matter when we look at the system, it will be in a state with the most likely separation of the energy. For a macroscopic system, it is impossible for a spontaneous transfer of energy to occur that moves the system away from equilibrium.
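The dominance of the most probable energy split can be seen in a small counting experiment. The sketch below (a minimal illustration, not a system discussed in the text) uses two "Einstein solid" toy systems, n oscillators sharing q energy quanta with Ω(q, n) = C(q + n − 1, n − 1), and evaluates Ω_1(q_1)Ω_2(q_t − q_1) for every possible split of the energy:

```python
from math import comb

def omega(q, n):
    # Number of microstates of n oscillators sharing q quanta:
    # the stars-and-bars count C(q + n - 1, n - 1).
    return comb(q + n - 1, n - 1)

def split_distribution(q_total, n1, n2):
    # Omega_t(q1) = Omega_1(q1) * Omega_2(q_total - q1), as in Eq. (1.3.22),
    # normalized so that each entry is the probability of that split.
    counts = [omega(q1, n1) * omega(q_total - q1, n2)
              for q1 in range(q_total + 1)]
    total = sum(counts)
    return [c / total for c in counts]

p = split_distribution(100, 50, 50)
peak = max(range(len(p)), key=p.__getitem__)
print(peak)            # most probable split: q1 = 50, by symmetry
print(sum(p[40:61]))   # probability mass within +/-10 quanta of the peak
```

Even for only 100 quanta shared between two sets of 50 oscillators, most of the probability already sits within ±10 quanta of the even split; for macroscopic numbers of particles the peak becomes overwhelmingly sharp, which is the statement in the paragraph above.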
The last paragraph implies that the transfer of energy from one system to the other stops when Ω_t reaches its maximum value. Since U_t = U_1 + U_2, we can find the maximum value of the number of microstates using:

$$\frac{\partial \Omega_t(U_1, U_t - U_1)}{\partial U_1} = 0 = \frac{\partial \Omega_1(U_1)}{\partial U_1}\,\Omega_2(U_t - U_1) + \Omega_1(U_1)\,\frac{\partial \Omega_2(U_t - U_1)}{\partial U_1}, \qquad 0 = \frac{\partial \Omega_1(U_1)}{\partial U_1}\,\Omega_2(U_2) - \Omega_1(U_1)\,\frac{\partial \Omega_2(U_2)}{\partial U_2} \tag{1.3.23}$$

or

$$\frac{1}{\Omega_1(U_1)}\frac{\partial \Omega_1(U_1)}{\partial U_1} = \frac{1}{\Omega_2(U_2)}\frac{\partial \Omega_2(U_2)}{\partial U_2}, \qquad \frac{\partial \ln \Omega_1(U_1)}{\partial U_1} = \frac{\partial \ln \Omega_2(U_2)}{\partial U_2} \tag{1.3.24}$$

The equivalence of these quantities is analogous to the equivalence of the temperature of the two systems in equilibrium. Since the derivatives in the last equation are performed at constant N and V, it appears, by analogy to Eq. (1.3.12), that we can identify the entropy as:

$$S = k \ln(\Omega(E, N, V)) \tag{1.3.25}$$

The constant k, known as the Boltzmann constant, is needed to ensure correspondence of the microscopic counting of states with the macroscopic units of the entropy, as defined by the relationship of Eq. (1.3.13), once the units of temperature and energy are defined.

The entropy as defined by Eq. (1.3.25) can be shown to satisfy all of the properties of the thermodynamic entropy in the last section. We have argued that an isolated
system evolves its macrostate in such a way that it maximizes the number of microstates that correspond to the macrostate. By Eq. (1.3.25), this is the same as the first property of the entropy in Eq. (1.3.5), the maximization of the entropy in equilibrium. Interestingly, demonstrating the second property of the entropy, that it does not change during an adiabatic process, requires further formal developments relating entropy to information that will be discussed in Sections 1.7 and 1.8. We will connect the two discussions and thus be able to demonstrate the second property of the entropy in Chapter 8 (Section 8.3.2).

The extensive property of the entropy follows from Eq. (1.3.22). This also means that the number of states at a particular energy grows exponentially with the size of the system. More properly, we can say that the experimental observation that the entropy is extensive suggests that the interaction between macroscopic materials, or parts of a single macroscopic material, is such that the microstates of each part of the system may be enumerated independently.

The number of microstates can be shown by simple examples to increase with the energy of the system. This corresponds to Eq. (1.3.7). There are also examples where this can be violated, though this will not enter into our discussions.
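Both the typical growth of the number of states with energy and an exception to it can be checked by direct counting. The two toy systems below, oscillators sharing quanta and a set of two-state spins, are illustrative assumptions introduced here, not systems discussed in the text:

```python
from math import comb

# 10 oscillators sharing q quanta: Omega = C(q + 9, 9),
# which grows monotonically with the energy q.
osc = [comb(q + 9, 9) for q in range(20)]
assert all(a < b for a, b in zip(osc, osc[1:]))

# 20 two-state spins with energy proportional to the number k of up spins:
# Omega = C(20, k) rises up to k = 10 and then falls again -- the kind of
# exception noted in the text, where more energy does not mean more states.
spins = [comb(20, k) for k in range(21)]
print(spins.index(max(spins)))   # peak at k = 10, not at maximum energy
```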
We consider next a second example of interacting systems that enables us to evaluate the meaning of a system in equilibrium with a reservoir at a temperature T. We consider a small part of a much larger system (Fig. 1.3.4). No assumption is necessary regarding the size of the small system; it may be either microscopic or macroscopic. Because of the contact of the small system with the large system, its energy is not
Figure 1.3.4 In order to understand temperature we consider a closed system composed of a large and a small system, or equivalently a small system which is part of a much larger system. The larger system serves as a thermal reservoir, transferring energy to and from the small system without affecting its own temperature. A microscopic description of this process in terms of a single microscopic state of the small system leads to the Boltzmann probability. An analysis in terms of the macroscopic state of the small system leads to the principle of minimization of the free energy to obtain the equilibrium state of a system at a fixed temperature. This principle replaces the principle of maximization of the entropy, which only applies for a closed system. [The figure labels the whole system by (U_t, N_t, V_t, T), a microstate of the small system by E({x, p}), N, V, and its macrostate by U, N, V.]
always the same. Energy will be transferred back and forth between the small and large systems. The essential assumption is that the contact between the large and small system does not affect any other aspect of the description of the small system. This means that the small system is in some sense independent of the large system, despite the energy transfer. This is true if the small system is itself macroscopic, but it may also be valid for certain microscopic systems. We also assume that the small system and the large system have fixed numbers of particles and volumes.
Our objective is to consider the probability that a particular microstate of the small system will be realized. A microstate is identified by all of the microscopic parameters necessary to completely define this state. We use the notation {x, p} to denote these coordinates. The probability that this particular state will be realized is given by the fraction of states of the whole system for which the small system attains this state. Because there is only one such state for the small system, the probability that this state will be realized is given by (is proportional to) a count of the number of states of the rest of the system. Since the large system is macroscopic, we can count this number by using the macroscopic expression for the number of states of the large system:

$$P(\{x, p\}) \propto \Omega_R(U_t - E(\{x, p\}),\, N_t - N,\, V_t - V) \tag{1.3.26}$$
where E({x,p}), N, V are the energy, number of particles and volume of the small system, respectively. E({x,p}) is a function of the microscopic parameters {x,p}. U_t, N_t, V_t are the energy, number of particles and volume of the whole system, including both the small and large systems. Ω_R is the number of states of the large subsystem (reservoir). Since the number of states generally grows faster than linearly as a function of the energy, we use a Taylor expansion of its logarithm (or equivalently a Taylor expansion of the entropy) to find

$$\ln \Omega_R(U_t - E(\{x,p\}), N_t - N, V_t - V) = \ln \Omega_R(U_t, N_t - N, V_t - V) + \left(\frac{\partial \ln \Omega_R(U_t, N_t - N, V_t - V)}{\partial U_t}\right)_{N_t, V_t}(-E(\{x,p\})) = \ln \Omega_R(U_t, N_t - N, V_t - V) + \frac{1}{kT}(-E(\{x,p\})) \tag{1.3.27}$$

where we have not expanded in the number of particles and the volume because they are unchanging. We take only the first term in the expansion, because the size of the small system is assumed to be much smaller than the size of the whole system. Exponentiating gives the relative probability of this particular microscopic state:

$$\Omega_R(U_t - E(\{x,p\}), N_t - N, V_t - V) = \Omega_R(U_t, N_t - N, V_t - V)\, e^{-E(\{x,p\})/kT} \tag{1.3.28}$$

The probability of this particular state must be normalized so that the sum over all states is one. Since we are normalizing the probability anyway, the constant coefficient does not affect the result. This gives us the Boltzmann probability distribution:

$$P(\{x,p\}) = \frac{1}{Z}\, e^{-E(\{x,p\})/kT}, \qquad Z = \sum_{\{x,p\}} e^{-E(\{x,p\})/kT} \tag{1.3.29}$$

Eq. (1.3.29) is independent of the states of the large system and depends only on the microscopic description of the states of the small system. It is this expression which generally provides the most convenient starting point for a connection between the microscopic description of a system and macroscopic thermodynamics. It identifies the probability that a particular microscopic state will be realized when the system has a well-defined temperature T. In this way it also provides a microscopic meaning to the macroscopic temperature T. It is emphasized that Eq. (1.3.29) describes both microscopic and macroscopic systems in equilibrium at a temperature T.

The probability of occurrence of a particular state should be related to the description of a system in terms of an ensemble. We have found by Eq. (1.3.29) that a system in thermal equilibrium at a temperature T is represented by an ensemble that is formed by taking each of the states in proportion to its Boltzmann probability. This ensemble is known as the canonical ensemble. The canonical ensemble should be contrasted with the assumption that each state has equal probability for isolated systems at a particular energy. The ensemble of fixed energy and equal a priori probability is known as the microcanonical ensemble. The canonical ensemble is both easier to discuss analytically and easier to connect with the physical world. It will be generally assumed in what follows.
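The Boltzmann probability of Eq. (1.3.29) is easy to exercise numerically. The sketch below builds the canonical ensemble for a hypothetical system with a handful of discrete energy levels (the levels themselves are made up for illustration) and checks that the probabilities are normalized and weighted toward low energy:

```python
from math import exp

def boltzmann(energies, kT):
    # P = exp(-E/kT) / Z, with Z the sum over all states, as in Eq. (1.3.29).
    weights = [exp(-E / kT) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights], Z

levels = [0.0, 1.0, 2.0, 3.0]        # hypothetical energy levels
probs, Z = boltzmann(levels, kT=1.0)

print(abs(sum(probs) - 1.0) < 1e-12)    # normalization
print(probs[0] > probs[1] > probs[2])   # lower-energy states more probable
```

Raising kT flattens the distribution toward equal probabilities, while lowering it concentrates the weight in the lowest-energy state, which is the microscopic content of the temperature.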
We can use the Boltzmann probability and the definition of the canonical ensemble to obtain all of the thermodynamic quantities. The macroscopic energy is given by the average over the microscopic energy using:

$$U = \frac{1}{Z} \sum_{\{x,p\}} E(\{x,p\})\, e^{-E(\{x,p\})/kT} \tag{1.3.30}$$

For a macroscopic system, the average value of the energy will always be observed in any specific measurement, despite the Boltzmann probability that allows all energies. This is because the number of states of the system rises rapidly with the energy. This rapid growth and the exponential decrease of the probability with the energy result in a sharp peak in the probability distribution as a function of energy. The sharp peak in the probability distribution means that the probability of any other energy is negligible. This is discussed below in Question 1.3.1.

For an isolated macroscopic system, we were able to identify the equilibrium state from among other states of the system using the principle of the maximization of the entropy. There is a similar procedure for a macroscopic system in contact with a thermal reservoir at a fixed temperature T. The important point to recognize is that when we had a closed system, the energy was fixed. Now, however, the objective becomes to identify the energy at equilibrium. Of course, the energy is given by the average in
Eq. (1.3.30). However, to generalize the concept of maximizing the entropy, it is simplest to reconsider the problem of the system in contact with the reservoir when the small system is also macroscopic.

Instead of considering the probability of a particular microstate of well-defined energy E, we consider the probability of a macroscopic state of the system with an energy U. In this case, we find the equilibrium state of the system by maximizing the number of states of the whole system, or alternatively of the entropy:

$$\ln \Omega(U,N,V) + \ln \Omega_R(U_t - U, N_t - N, V_t - V) = S(U,N,V)/k + S_R(U_t - U, N_t - N, V_t - V)/k = S(U,N,V)/k + S_R(U_t, N_t - N, V_t - V)/k + \frac{1}{kT}(-U) \tag{1.3.31}$$

To find the equilibrium state, we must maximize this expression for the entropy of the whole system. We can again ignore the constant second term. This leaves us with quantities that characterize only the small system we are interested in, and the temperature of the reservoir. Thus we can find the equilibrium state by maximizing the quantity

$$S - U/T \tag{1.3.32}$$

It is conventional to rewrite this and, rather than maximizing the function in Eq. (1.3.32), to minimize the function known as the free energy:

$$F = U - TS \tag{1.3.33}$$

This suggests a simple physical significance of the process of change toward equilibrium. At a fixed temperature, the system seeks to minimize its energy and maximize its entropy at the same time. The relative importance of the entropy compared to the energy is set by the temperature. For high temperature, the entropy becomes more dominant, and the energy rises in order to increase the entropy. At low temperature, the energy becomes more dominant, and the energy is lowered at the expense of the entropy. This is the precise statement of the observation that "everything flows downhill." The energy-entropy competition is a balance that is rightly considered as one of the most basic of physical phenomena.

We can obtain a microscopic expression for the free energy by an exercise that begins from a microscopic expression for the entropy:

$$S = k \ln(\Omega) = k \ln\!\left(\sum_{\{x,p\}} \delta_{E(\{x,p\}),U}\right) \tag{1.3.34}$$

The summation is over all microscopic states. The delta function is 1 only when E({x,p}) = U. Thus the sum counts all of the microscopic states with energy U. Strictly speaking, the delta function is assumed to be slightly "fuzzy," so that it gives 1 when E({x,p}) differs from U by a small amount on a macroscopic scale, but by a large amount in terms of the differences between energies of microstates. We can then write
$$S = k \ln(\Omega) = k \ln\!\left(\sum_{\{x,p\}} \delta_{E(\{x,p\}),U}\, e^{-E(\{x,p\})/kT}\, e^{U/kT}\right) = \frac{U}{T} + k \ln\!\left(\sum_{\{x,p\}} \delta_{E(\{x,p\}),U}\, e^{-E(\{x,p\})/kT}\right) \tag{1.3.35}$$

Let us compare the sum in the logarithm with the expression for Z in Eq. (1.3.29). We will argue that they are the same. This discussion hinges on the rapid increase in the number of states as the energy increases. Because of this rapid growth, the value of Z in Eq. (1.3.29) actually comes from only a narrow region of energy. We know from the expression for the energy average, Eq. (1.3.30), that this narrow region of energy must be at the energy U. This implies that for all intents and purposes the quantity in the brackets of Eq. (1.3.35) is equivalent to Z. This argument leads to the expression:

$$S = \frac{U}{T} + k \ln Z \tag{1.3.36}$$

Comparing with Eq. (1.3.33) we have

$$F = -kT \ln Z \tag{1.3.37}$$

Since the Boltzmann probability is a convenient starting point, this expression for the free energy is often simpler to evaluate than the expression for the entropy, Eq. (1.3.34). A calculation of the free energy using Eq. (1.3.37) provides contact between microscopic models and the macroscopic behavior of thermodynamic systems. The Boltzmann normalization Z, which is directly related to the free energy, is also known as the partition function. We can obtain other thermodynamic quantities directly from the free energy. For example, we rewrite the expression for the energy, Eq. (1.3.30), as:

$$U = \frac{1}{Z} \sum_{\{x,p\}} E(\{x,p\})\, e^{-\beta E(\{x,p\})} = -\frac{\partial \ln Z}{\partial \beta} = \frac{\partial (\beta F)}{\partial \beta} \tag{1.3.38}$$

where we use the notation β = 1/kT. The entropy can be obtained using this expression for the energy and Eq. (1.3.33) or (1.3.36).

Question 1.3.1 Consider the possibility that the macroscopic energy of a system in contact with a thermal reservoir will deviate from its typical value U. To do this, expand the probability distribution of macroscopic energies of a system in contact with a reservoir around this value. How large are the deviations that occur?

Solution 1.3.1 We considered Eq. (1.3.31) in order to optimize the entropy and find the typical value of the energy U. We now consider it again to find the distribution of probabilities of values of the energy around the value U, similar to the way we discussed the distribution of microscopic states {x, p} in Eq. (1.3.27). To do this we distinguish between the observed value of the
energy U′ and U. Note that we consider U′ to be a macroscopic energy, though the same derivation could be used to obtain the distribution of microscopic energies. The probability of U′ is given by:

$$P(U') \propto \Omega(U', N, V)\,\Omega_R(U_t - U', N_t - N, V_t - V) = e^{S(U')/k + S_R(U_t - U')/k} \tag{1.3.39}$$

In the latter form we ignore the fixed arguments N and V. We expand the logarithm of this expression around the expected value of the energy U:

$$\frac{S(U')}{k} + \frac{S_R(U_t - U')}{k} = \frac{S(U)}{k} + \frac{S_R(U_t - U)}{k} + \frac{1}{2k}\frac{d^2 S(U)}{dU^2}(U - U')^2 + \frac{1}{2k}\frac{d^2 S_R(U_t - U)}{dU_t^2}(U - U')^2 \tag{1.3.40}$$

where we have kept terms to second order. The first-order terms, which are of the form (1/kT)(U′ − U), have opposite signs and therefore cancel. This implies that the probability is a maximum at the expected energy U. The second derivative of the entropy can be evaluated using:

$$\frac{d^2 S(U)}{dU^2} = \frac{d}{dU}\frac{1}{T} = -\frac{1}{T^2}\frac{1}{dU/dT} = -\frac{1}{T^2 C_V} \tag{1.3.41}$$

where C_V is known as the specific heat at constant volume. For our purposes, its only relevant property is that it is an extensive quantity. We can obtain a similar expression for the reservoir and define the reservoir specific heat C_VR. Thus the probability is:

$$P(U') \propto e^{-(1/2kT^2)(1/C_V + 1/C_{VR})(U - U')^2} \approx e^{-(1/2kT^2)(1/C_V)(U - U')^2} \tag{1.3.42}$$

where we have left out the (constant) terms that do not depend on U′. Because C_V and C_VR are extensive quantities and the reservoir is much bigger than the small system, we can neglect 1/C_VR compared to 1/C_V. The result is a Gaussian distribution (Eq. (1.2.39)) with a standard deviation

$$\sigma = T\sqrt{k C_V} \tag{1.3.43}$$

This describes the characteristic deviation of the energy U′ from the average or typical energy U. However, since C_V is extensive, the square root means that the deviation is proportional to √N. Note that the result is consistent with a random walk of N steps. So for a large system of N ∼ 10²³ particles, the possible deviation in the energy is smaller than the energy by a factor of (we are neglecting everything but the N dependence) 10⁻¹², i.e., it is undetectable. Thus the energy of a thermodynamic system is very well defined.
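The order-of-magnitude estimate above follows directly from Eq. (1.3.43). The sketch below assumes a monatomic ideal gas, U = (3/2)NkT and C_V = (3/2)Nk (a model chosen here only for concreteness; any system with U and C_V proportional to N gives the same scaling), so that σ/U = 1/√(3N/2):

```python
from math import sqrt

k = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0          # any temperature; it cancels in the ratio
N = 1e23

U = 1.5 * N * k * T        # energy of a monatomic ideal gas (assumed model)
C_V = 1.5 * N * k          # its heat capacity
sigma = T * sqrt(k * C_V)  # Eq. (1.3.43)

print(sigma / U)           # ~2.6e-12: of order 10^-12, as stated in the text
```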
1.3.3 Kinetic theory of gases and pressure

In the previous section, we described the microscopic analog of temperature and entropy. We assumed that the microscopic analog of energy was understood, and we developed the concept of free energy and its microscopic analog. One quantity that we
have not discussed microscopically is the pressure. Pressure is a Newtonian concept: the force per unit area. For various reasons, it is helpful for us to consider the microscopic origin of pressure for the example of a simplified model of a gas called an ideal gas. In Question 1.3.2 we use the ideal gas as an example of the thermodynamic and statistical analysis of materials.

An ideal gas is composed of indistinguishable point particles with a mass m but with no internal structure or size. The interaction between the particles is neglected, so that the energy is just their kinetic energy. The particles do interact with the walls of the container in which the gas is confined. This interaction is simply that of reflection: when the particle is incident on a wall, the component of its velocity perpendicular to the wall is reversed. Energy is conserved. This is in accordance with the expectation from Newton's laws for the collision of a small mass with a much larger mass object.
To obtain an expression for the pressure, we must suffer with some notational hazards, as the pressure P, the probability of a particular velocity P(v) and the momentum of a particular particle p_i are all designated by the letter P, but with different case, arguments or subscripts. A bold letter F is used briefly for the force, and otherwise F is used for the free energy. We rely largely upon context to distinguish them. Since the objective of using an established notation is to make contact with known concepts, this situation is sometimes preferable to introducing a new notation.
Because of the absence of collisions between different particles of the gas, there is no communication between them, and each of the particles bounces around the container on its own course. The pressure on the container walls is given by the force per unit area exerted on the walls, as illustrated in Fig. 1.3.5. The force is given by the action of the wall on the gas that is needed to reverse the momenta of the incident particles between t and t + ∆t:

$$P = \frac{F}{A} = \frac{1}{A\,\Delta t} \sum_i m\,\Delta v_i \tag{1.3.44}$$

where F is the magnitude of the force on the wall. The latter expression relates the pressure to the change in the momenta of incident particles per unit area of the wall. A is a small but still macroscopic area, so that this part of the wall is flat. Microscopic roughness of the surface is neglected. The change in velocity ∆v_i of the particles during the time ∆t is zero for particles that are not incident on the wall. Particles that hit the wall between t and t + ∆t are moving in the direction of the wall at time t and are near enough to the wall to reach it during ∆t. Faster particles can reach the wall from farther away, but only the velocity perpendicular to the wall matters. Denoting this velocity component as v⊥, the maximum distance is v⊥∆t (see Fig. 1.3.5).

If the particles had velocity only perpendicular to the wall and no velocity parallel to the wall, then we could count the incident particles as those in a volume Av⊥∆t. We can use the same expression even when particles have a velocity parallel to the surface, because the parallel velocity takes particles out of and into this volume equally.
Another way to say this is that for a particular parallel velocity we count the particles in a sheared box with the same height and base and therefore the same volume. The total number of particles in the volume, (N/V)Av⊥∆t, is the volume times the density (N/V).
Within the volume Av⊥∆t, the number of particles that have the velocity v⊥ is given by the number of particles in this volume times the probability P(v⊥) that a particle has its perpendicular velocity component equal to v⊥. Thus the number of
Figure 1.3.5 Illustration of a gas of ideal particles in a container near one of the walls. Particles incident on the wall are reflected, reversing their velocity perpendicular to the wall, and not affecting the other components of their velocity. The wall experiences a pressure due to the collisions and applies the same pressure to the gas. To calculate the pressure we must count the number of particles in a unit of time ∆t with a particular perpendicular velocity v⊥ that hit an area A. This is equivalent to counting the number of particles with the velocity v⊥ in the box shown with one of its sides of length ∆t v⊥. Particles with velocity v⊥ will hit the wall if and only if they are in the box. The same volume of particles applies if the particles also have a velocity parallel to the surface, since this just skews the box, as shown, leaving its height and base area the same.
particles incident on the wall with a particular velocity perpendicular to the wall v⊥ is given by

$$\frac{N}{V}\, A\, P(v_\perp)\, v_\perp\, \Delta t \tag{1.3.45}$$

The total change in momentum is found by multiplying this by the change in momentum of a single particle reflected by the collision, 2mv⊥, and integrating over all velocities:

$$\sum_i m\,\Delta v_i = \frac{1}{V} N A\,\Delta t \int_0^\infty dv_\perp\, P(v_\perp)\, v_\perp\,(2 m v_\perp) \tag{1.3.46}$$

Divide this by A∆t to obtain the change in momentum per unit time per unit area, which is the pressure (Eq. (1.3.44)):

$$P = \frac{1}{V} N \int_0^\infty dv_\perp\, P(v_\perp)\, v_\perp\,(2 m v_\perp) \tag{1.3.47}$$

We rewrite this in terms of the average squared velocity perpendicular to the surface:

$$P = \frac{N}{V}\, m\, 2\int_0^\infty dv_\perp\, P(v_\perp)\, v_\perp^2 = \frac{N}{V}\, m \int_{-\infty}^{\infty} dv_\perp\, P(v_\perp)\, v_\perp^2 = \frac{N}{V}\, m\, \langle v_\perp^2 \rangle \tag{1.3.48}$$

where the equal probability of having positive and negative velocities enables us to extend the integral to −∞ while eliminating the factor of two. We can rewrite Eq. (1.3.48) in terms of the average square magnitude of the total velocity. There are three components of the velocity (two parallel to the surface). The squares of the velocity components add to give the total velocity squared, and the averages are equal:

$$\langle v^2 \rangle = \langle v_\perp^2 + v_2^2 + v_3^2 \rangle = 3\,\langle v_\perp^2 \rangle \tag{1.3.49}$$

where v is the magnitude of the particle velocity. The pressure is:

$$P = \frac{N}{V}\, m\, \frac{1}{3}\, \langle v^2 \rangle \tag{1.3.50}$$

Note that the wall does not influence the probability of having a particular velocity nearby. Eq. (1.3.50) is a microscopic expression for the pressure, which we can calculate using the Boltzmann probability from Eq. (1.3.29). We do this as part of Question 1.3.2.
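The momentum-transfer accounting of Eqs. (1.3.44)–(1.3.48) can be checked by a direct simulation: place non-interacting particles at random in a box, draw perpendicular velocities from a symmetric distribution, tally the momentum 2mv⊥ delivered by each particle that reaches the wall within ∆t, and compare with (N/V)m⟨v⊥²⟩. The Gaussian velocity choice and all parameter values below are illustrative assumptions:

```python
import random

random.seed(0)
m, L, dt, N = 1.0, 1.0, 0.01, 200_000
A, V = L * L, L ** 3

transfer = 0.0
for _ in range(N):
    x = random.uniform(0.0, L)      # position coordinate toward the wall
    v = random.gauss(0.0, 1.0)      # perpendicular velocity component
    if v > 0 and x + v * dt > L:    # reaches the wall within dt
        transfer += 2 * m * v       # reflection reverses the momentum

p_sim = transfer / (A * dt)         # Eq. (1.3.44): momentum per time per area
p_theory = (N / V) * m * 1.0        # (N/V) m <v_perp^2>, with <v_perp^2> = 1
print(abs(p_sim / p_theory - 1.0) < 0.1)
```

The agreement improves as more particles are used, since the statistical error of the tally shrinks as the square root of the number of wall hits.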
Question 1.3.2 Develop the statistical description of the ideal gas by obtaining expressions for the thermodynamic quantities Z, F, U, S and P, in terms of N, V, and T. For hints read the first three paragraphs of the solution.

Solution 1.3.2 The primary task of statistics is counting. To treat the ideal gas we must count the number of microscopic states to obtain the entropy,
or sum over the Boltzmann probability to obtain Z and the free energy. The ideal gas presents us with two difficulties. The first is that each particle has a continuum of possible locations. The second is that we must treat the particles as microscopically indistinguishable. To solve the first problem, we have to set some interval of position at which we will call a particle here different from a particle there. Moreover, since a particle at any location may have many different velocities, we must also choose a difference of velocities that will be considered as distinct. We define the interval of position to be ∆x and the interval of momentum to be ∆p. In each spatial dimension, the positions between x and x + ∆x correspond to a single state, and the momenta between p and p + ∆p correspond to a single state. Thus we consider as one state of the system a particle which has position and momenta in a six-dimensional box of a size ∆x³∆p³. The size of this box enters only as a constant in classical statistical mechanics, and we will not be concerned with its value. Quantum mechanics identifies it with ∆x³∆p³ = h³, where h is Planck's constant, and for convenience we adopt this notation for the unit volume for counting.

There is a subtle but important choice that we have made. We have chosen to make the counting intervals have a fixed width ∆p in the momentum. From classical mechanics, it is not entirely clear that we should make the intervals of fixed width in the momentum or, for example, make them fixed in the energy ∆E. In the latter case we would count a single state between E and E + ∆E. Since the energy is proportional to the square of the momentum, this would give a different counting. Quantum mechanics provides an unambiguous answer: the momentum intervals are fixed.

To solve the problem of the indistinguishability of the particles, we must remember, every time we count the number of states of the system, to divide by the number of possible ways there are to interchange the particles, which is N!.
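The N! correction enters the free energy below through Stirling's approximation, ln N! ≈ N ln N − N, which the text invokes as Eq. (1.2.36). Its accuracy for large N is easy to confirm (lgamma(n + 1) gives ln(n!) exactly, up to floating-point precision):

```python
from math import lgamma, log

def stirling_error(n):
    # Relative error of the approximation ln(n!) ~ n ln(n) - n.
    exact = lgamma(n + 1)           # ln(n!)
    return abs(exact - (n * log(n) - n)) / exact

for n in (10, 1_000, 1_000_000):
    print(n, stirling_error(n))
```

The neglected terms grow only logarithmically with N, so the relative error vanishes on a macroscopic scale, exactly as the derivation of Eq. (1.3.58) assumes.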
The energy of the ideal gas is given by the kinetic energy of all of the particles:

$$E(\{x, p\}) = \sum_{i=1}^{N} \frac{1}{2} m v_i^2 = \sum_{i=1}^{N} \frac{p_i^2}{2m} \tag{1.3.51}$$

where the velocity and momentum of a particle are three-dimensional vectors with magnitudes v_i and p_i respectively. We start by calculating the partition function (Boltzmann normalization) Z from Eq. (1.3.29):

$$Z = \frac{1}{N!} \sum_{\{x,p\}} e^{-\sum_{i=1}^{N} p_i^2/2mkT} = \frac{1}{N!} \int \prod_{i=1}^{N} \frac{d^3x_i\, d^3p_i}{h^3}\; e^{-\sum_{i=1}^{N} p_i^2/2mkT} \tag{1.3.52}$$

where the integral is to be evaluated over all possible locations of each of the N particles of the system. We have also included the correction to overcounting, N!. Since the particles do not see each other, the energy is a sum
over each particle energy. The integrals separate and we have:

$$Z = \frac{1}{N!}\left(\frac{1}{h^3}\int e^{-p^2/2mkT}\,d^3x\,d^3p\right)^{N} \qquad (1.3.53)$$

The position integral gives the volume V, immediately giving the dependence of Z on this macroscopic quantity. The integral over momentum can be evaluated giving:

$$\int e^{-p^2/2mkT}\,d^3p = 4\pi\int_0^\infty p^2\,dp\;e^{-p^2/2mkT} = 4\pi(2mkT)^{3/2}\int_0^\infty y^2\,dy\;e^{-y^2}$$
$$= 4\pi(2mkT)^{3/2}\left(-\frac{\partial}{\partial a}\right)_{a=1}\int_0^\infty dy\;e^{-ay^2} = 4\pi(2mkT)^{3/2}\left(-\frac{\partial}{\partial a}\right)_{a=1}\frac{1}{2}\sqrt{\frac{\pi}{a}} = (2\pi mkT)^{3/2} \qquad (1.3.54)$$

and we have that

$$Z(V,T,N) = \frac{V^N}{N!}\left(\frac{2\pi mkT}{h^2}\right)^{3N/2} \qquad (1.3.55)$$

We could have simplified the integration by recognizing that each component of the momentum p_x, p_y and p_z can be integrated separately, giving 3N independent one-dimensional integrals and leading more succinctly to the result. The result can also be written in terms of a natural length λ(T) that depends on temperature (and mass):

$$\lambda(T) = (h^2/2\pi mkT)^{1/2} \qquad (1.3.56)$$

$$Z(V,T,N) = \frac{V^N}{N!\,\lambda(T)^{3N}} \qquad (1.3.57)$$

From the partition function we obtain the free energy, making use of Stirling's approximation (Eq. (1.2.36)):

$$F = kTN(\ln N - 1) - kTN\ln(V/\lambda(T)^3) \qquad (1.3.58)$$

where we have neglected terms that grow less than linearly with N. Terms that vary as ln(N) vanish on a macroscopic scale. In this form it might appear that we have a problem, since the N ln(N) term from Stirling's approximation to the factorial does not scale in proportion to the size of the system, while F is an extensive quantity. However, we must also note the N ln(V) term, which we can combine with the N ln(N) term so that the extensive nature is apparent:

$$F = kTN[\ln(N\lambda(T)^3/V) - 1] \qquad (1.3.59)$$
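The momentum integral in Eq. (1.3.54) is easy to check numerically. The sketch below (plain Python; the choice m = k = T = 1 is an arbitrary unit convention for this illustration) compares a trapezoid-rule evaluation of 4π∫₀^∞ p² e^{−p²/2mkT} dp against the closed form (2πmkT)^{3/2}:

```python
import math

def momentum_integral(mkT: float = 1.0, p_max: float = 20.0, n: int = 200000) -> float:
    """Evaluate 4*pi * int_0^p_max p^2 exp(-p^2 / (2*m*k*T)) dp by the trapezoid rule.
    The tail beyond p_max = 20 is negligible for mkT = 1."""
    dp = p_max / n
    total = 0.0
    for i in range(n + 1):
        p = i * dp
        w = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * p * p * math.exp(-p * p / (2.0 * mkT))
    return 4.0 * math.pi * total * dp

numeric = momentum_integral()            # with m = k = T = 1
closed_form = (2.0 * math.pi) ** 1.5     # (2*pi*m*k*T)**(3/2)
print(numeric, closed_form)
```

The two values agree to high precision, confirming the Gaussian-integral steps above.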
It is interesting that the factor of N!, and thus the indistinguishability of particles, is necessary for the free energy to be extensive. If the particles were distinguishable, then cutting the system in two would result in a different counting, since we would lose the states corresponding to particles switching from one part to the other. If we combined the two systems back together, there would be an effect due to the mixing of the distinguishable particles (Question 1.3.3).
The energy may be obtained from Eq. (1.3.38) (any of the forms) as:

$$U = \frac{3}{2}NkT \qquad (1.3.60)$$

which provides an example of the equipartition theorem, which says that each degree of freedom (position-momentum pair) of the system carries kT/2 of energy in equilibrium. Each of the three spatial coordinates of each particle is one degree of freedom.

The expression for the entropy (S = (U − F)/T)

$$S = kN[\ln(V/N\lambda(T)^3) + 5/2] \qquad (1.3.61)$$

shows that the entropy per particle S/N grows logarithmically with the volume per particle V/N. Using the expression for U, it may be written in the form S(U, N, V).

Finally, the pressure may be obtained from Eq. (1.3.20), but we must be careful to keep N and S constant rather than T. We have

$$P = -\left(\frac{\partial U}{\partial V}\right)_{N,S} = -\frac{3}{2}Nk\left(\frac{\partial T}{\partial V}\right)_{N,S} \qquad (1.3.62)$$

Taking the same derivative of the entropy Eq. (1.3.61) gives us (the derivative of S with S fixed is zero):

$$0 = \frac{1}{V} + \frac{3}{2}\frac{1}{T}\left(\frac{\partial T}{\partial V}\right)_{N,S} \qquad (1.3.63)$$

Substituting, we obtain the ideal gas equation of state:

$$PV = NkT \qquad (1.3.64)$$

which we can also obtain from the microscopic expression for the pressure—Eq. (1.3.50). We describe two ways to do this. One way to obtain the pressure from the microscopic expression is to evaluate first the average of the energy

$$U = \langle E(\{x,p\})\rangle = \frac{1}{2}\sum_{i=1}^{N} m\langle v_i^2\rangle = N\,\frac{1}{2}m\langle v^2\rangle \qquad (1.3.65)$$

This may be substituted into Eq. (1.3.60) to obtain
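The closed-form results of this derivation can be cross-checked against one another. A minimal sketch in reduced units (k = h = m = 1, an arbitrary convention for this illustration) verifies that the free energy, entropy and energy expressions, Eqs. (1.3.59)-(1.3.61), satisfy the identity S = (U − F)/T, and that the pressure obeys Eq. (1.3.64):

```python
import math

k = 1.0  # Boltzmann constant in reduced units (assumption for this sketch)

def thermal_length(T: float, m: float = 1.0, h: float = 1.0) -> float:
    """lambda(T) = (h^2 / (2*pi*m*k*T))**(1/2), Eq. (1.3.56)."""
    return math.sqrt(h * h / (2.0 * math.pi * m * k * T))

def free_energy(N, V, T):
    """F = k*T*N*(ln(N*lambda^3/V) - 1), Eq. (1.3.59)."""
    return k * T * N * (math.log(N * thermal_length(T) ** 3 / V) - 1.0)

def entropy(N, V, T):
    """S = k*N*(ln(V/(N*lambda^3)) + 5/2), Eq. (1.3.61)."""
    return k * N * (math.log(V / (N * thermal_length(T) ** 3)) + 2.5)

def energy(N, T):
    """U = (3/2)*N*k*T, Eq. (1.3.60)."""
    return 1.5 * N * k * T

N, V, T = 1000.0, 50.0, 2.0
F, S, U = free_energy(N, V, T), entropy(N, V, T), energy(N, T)
P = N * k * T / V                 # Eq. (1.3.64)
print(P * V, N * k * T)           # ideal gas law
print(S, (U - F) / T)             # thermodynamic identity S = (U - F)/T
```

Both printed pairs agree, as they must for any N, V, T.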
$$\frac{1}{2}m\langle v^2\rangle = \frac{3}{2}kT \qquad (1.3.66)$$

which may be substituted directly into Eq. (1.3.50). Another way is to obtain the average squared velocity directly. In averaging the velocity, it doesn't matter which particle we choose. We choose the first particle:

$$\langle v_1^2\rangle = 3\langle v_{1\perp}^2\rangle = \frac{\dfrac{1}{N!}\displaystyle\int \prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{h^3}\;3v_{1\perp}^2\,e^{-\sum_{i=1}^{N}p_i^2/2mkT}}{\dfrac{1}{N!}\displaystyle\int \prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{h^3}\;e^{-\sum_{i=1}^{N}p_i^2/2mkT}} \qquad (1.3.67)$$

where we have further chosen to average over only one of the components of the velocity of this particle and multiply by three. The denominator is the normalization constant Z. Note that the factor 1/N!, due to the indistinguishability of particles, appears in the numerator in any ensemble average as well as in the denominator, and cancels. It does not affect the Boltzmann probability when issues of distinguishability are not involved.

There are 6N integrals in the numerator and in the denominator of Eq. (1.3.67). All integrals factor into one-dimensional integrals. Each integral in the numerator is the same as the corresponding one in the denominator, except for the one that involves the particular component of the velocity we are interested in. We cancel all other integrals and obtain:

$$\langle v_1^2\rangle = 3\langle v_{1\perp}^2\rangle = 3\,\frac{\int v_{1\perp}^2\,e^{-p_{1\perp}^2/2mkT}\,dp_{1\perp}}{\int e^{-p_{1\perp}^2/2mkT}\,dp_{1\perp}} = 3\left(\frac{2kT}{m}\right)\frac{\int y^2 e^{-y^2}\,dy}{\int e^{-y^2}\,dy} = 3\left(\frac{2kT}{m}\right)\left(\frac{1}{2}\right) \qquad (1.3.68)$$

The integral is performed by the same technique as used in Eq. (1.3.54). The result is the same as by the other methods.

Question 1.3.3 An insulated box is divided into two compartments by a partition. The two compartments contain two different ideal gases at the same pressure P and temperature T. The first gas has N₁ particles and the second has N₂ particles. The partition is punctured. Calculate the resulting change in thermodynamic parameters (N, V, U, P, S, T, F). What changes in the analysis if the two gases are the same, i.e., if they are composed of the same type of molecules?

Solution 1.3.3 By additivity the extrinsic properties of the whole system before the puncture are (Eq. (1.3.59)–Eq. (1.3.61)):
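The average in Eq. (1.3.67) can also be estimated by direct sampling. In the sketch below (with m = kT = 1, an arbitrary unit choice), each Cartesian momentum component is drawn from the Gaussian Boltzmann weight e^{−p²/2mkT} (variance mkT), and the Monte Carlo average of v² = (p/m)² should approach 3kT/m:

```python
import math
import random

def mean_squared_velocity(m=1.0, kT=1.0, n_samples=200000, seed=0):
    """Sample momenta from the Boltzmann weight exp(-p^2/(2*m*k*T)); each
    Cartesian component is Gaussian with variance m*kT. Average v^2 = (p/m)^2."""
    rng = random.Random(seed)
    sigma = math.sqrt(m * kT)
    total = 0.0
    for _ in range(n_samples):
        px, py, pz = (rng.gauss(0.0, sigma) for _ in range(3))
        total += (px * px + py * py + pz * pz) / (m * m)
    return total / n_samples

est = mean_squared_velocity()
print(est)   # expect a value close to 3*kT/m = 3
```

The statistical error shrinks as 1/√(n_samples), so the estimate settles near 3 to within a fraction of a percent.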
$$U_0 = U_1 + U_2 = \frac{3}{2}(N_1+N_2)kT$$
$$V_0 = V_1 + V_2$$
$$S_0 = kN_1[\ln(V_1/N_1\lambda(T)^3)+5/2] + kN_2[\ln(V_2/N_2\lambda(T)^3)+5/2]$$
$$F_0 = kTN_1[\ln(N_1\lambda(T)^3/V_1)-1] + kTN_2[\ln(N_2\lambda(T)^3/V_2)-1] \qquad (1.3.69)$$

The pressure is intrinsic, so before the puncture it is (Eq. (1.3.64)):

$$P_0 = N_1kT/V_1 = N_2kT/V_2 \qquad (1.3.70)$$

After the puncture, the total energy remains the same, because the whole system is isolated. Because the two gases do not interact with each other even when they are mixed, their properties continue to add after the puncture. However, each gas now occupies the whole volume, V₁ + V₂. The expression for the energy as a function of temperature remains the same, so the temperature is also unchanged. The pressure in the container is now additive: it is the sum of the pressures of each of the gases:

$$P = N_1kT/(V_1+V_2) + N_2kT/(V_1+V_2) = P_0 \qquad (1.3.71)$$

i.e., the pressure is unchanged as well.

The only changes are in the entropy and the free energy. Because the two gases do not interact with each other, as with other quantities, we can write the total entropy as a sum over the entropy of each gas separately:

$$S = kN_1[\ln((V_1+V_2)/N_1\lambda(T)^3)+5/2] + kN_2[\ln((V_1+V_2)/N_2\lambda(T)^3)+5/2]$$
$$= S_0 + (N_1+N_2)k\ln(V_1+V_2) - N_1k\ln(V_1) - N_2k\ln(V_2) \qquad (1.3.72)$$

If we simplify to the case V₁ = V₂, we have S = S₀ + (N₁+N₂)k ln(2). Since the energy is unchanged, by the relationship of free energy and entropy (Eq. (1.3.33)) we have:

$$F = F_0 - T(S - S_0) \qquad (1.3.73)$$

If the two gases are composed of the same molecule, there is no change in thermodynamic parameters as a result of a puncture. Mathematically, the difference is that we replace Eq. (1.3.72) with:

$$S = k(N_1+N_2)[\ln((V_1+V_2)/(N_1+N_2)\lambda(T)^3)+5/2] = S_0 \qquad (1.3.74)$$

where this is equal to the original entropy because of the relationship N₁/V₁ = N₂/V₂ from Eq. (1.3.70). This example illustrates the effect of indistinguishability. The entropy increases after the puncture when the gases are different, but not when they are the same.

Question 1.3.4 An ideal gas is in one compartment of a two-compartment sealed and thermally insulated box. The compartment it is in has a volume V₁. It has an energy U₀ and a number of particles N₀. The second compartment has volume V₂ and is empty. Write expressions for the changes in all thermodynamic parameters (N, V, U, P, S, T, F) if

a. the barrier between the two compartments is punctured and the gas expands to fill the box.

b. the barrier is moved slowly, like a piston, expanding the gas to fill the box.
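The entropy bookkeeping in Solution 1.3.3 can be checked numerically before turning to the next solution. A minimal sketch in reduced units (k = 1; the λ(T)³ factor is set to 1 since it cancels in every difference at fixed T):

```python
import math

k = 1.0  # Boltzmann constant in reduced units (assumption for this sketch)

def S_ideal(N, V, lam3=1.0):
    """Eq. (1.3.61): S = k*N*(ln(V/(N*lam^3)) + 5/2). lam^3 cancels in differences."""
    return k * N * (math.log(V / (N * lam3)) + 2.5)

N1, V1 = 300.0, 1.0
N2, V2 = 300.0, 1.0   # equal densities, as required by equal P and T

S_before = S_ideal(N1, V1) + S_ideal(N2, V2)
# Different gases: each expands into V1 + V2 (Eq. (1.3.72))
S_diff = S_ideal(N1, V1 + V2) + S_ideal(N2, V1 + V2)
# Same gas: one gas of N1 + N2 particles in V1 + V2 (Eq. (1.3.74))
S_same = S_ideal(N1 + N2, V1 + V2)

print(S_diff - S_before)   # mixing entropy, (N1 + N2)*k*ln(2) for V1 = V2
print(S_same - S_before)   # zero for identical gases
```

The first difference reproduces the (N₁+N₂)k ln(2) mixing entropy; the second vanishes, exactly as the indistinguishability argument requires.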
Solution 1.3.4 Recognizing what is conserved simplifies the solution of this type of problem.

a. The energy U and the number of particles N are conserved. Since the volume change is given to us explicitly, the expressions for T (Eq. (1.3.60)), F (Eq. (1.3.59)), S (Eq. (1.3.61)), and P (Eq. (1.3.64)) in terms of these quantities can be used.

$$N = N_0$$
$$U = U_0$$
$$V = V_1 + V_2$$
$$T = T_0 \qquad (1.3.75)$$
$$F = kTN[\ln(N\lambda(T)^3/(V_1+V_2)) - 1] = F_0 - kTN\ln((V_1+V_2)/V_1)$$
$$S = kN[\ln((V_1+V_2)/N\lambda(T)^3) + 5/2] = S_0 + kN\ln((V_1+V_2)/V_1)$$
$$P = NkT/V = NkT/(V_1+V_2) = P_0\,V_1/(V_1+V_2)$$

b. The process is reversible and no heat is transferred, thus it is adiabatic—the entropy is conserved. The number of particles is also conserved:

$$N = N_0 \qquad S = S_0 \qquad (1.3.76)$$

Our main task is to calculate the effect of the work done by the gas pressure on the piston. This causes the energy of the gas to decrease, and the temperature decreases as well. One way to find the change in temperature is to use the conservation of entropy, and Eq. (1.3.61), to obtain that V/λ(T)³ is a constant and therefore:

$$T \propto V^{-2/3} \qquad (1.3.77)$$

Thus the temperature is given by:

$$T = T_0\left(\frac{V_1+V_2}{V_1}\right)^{-2/3} \qquad (1.3.78)$$

Since the temperature and energy are proportional to each other (Eq. (1.3.60)), similarly:

$$U = U_0\left(\frac{V_1+V_2}{V_1}\right)^{-2/3} \qquad (1.3.79)$$
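Both parts of Solution 1.3.4 can be verified with the entropy expression Eq. (1.3.61). The sketch below works in reduced units (h = m = k = 1, an arbitrary convention), doubles the volume, and compares the two processes:

```python
import math

k = 1.0  # reduced units (assumption for this sketch)

def lam3(T):
    """lambda(T)^3 with h = m = 1, from Eq. (1.3.56)."""
    return (1.0 / (2.0 * math.pi * T)) ** 1.5

def S(N, V, T):
    """Eq. (1.3.61)."""
    return k * N * (math.log(V / (N * lam3(T))) + 2.5)

N, V1, V2, T0 = 100.0, 1.0, 1.0, 1.5
V = V1 + V2

# (a) puncture: U and T unchanged, entropy grows
S_a = S(N, V, T0)
# (b) slow piston: entropy conserved, T drops as Eq. (1.3.78)
T_b = T0 * (V / V1) ** (-2.0 / 3.0)
S_b = S(N, V, T_b)

print(S_a - S(N, V1, T0))   # entropy gain of the free expansion, N*k*ln(2)
print(S_b - S(N, V1, T0))   # zero: the adiabatic expansion conserves entropy
```

The free expansion gains Nk ln(V/V₁) of entropy, while the T ∝ V^{−2/3} cooling of the piston case keeps S fixed, confirming Eqs. (1.3.75)-(1.3.78).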
The free-energy expression in Eq. (1.3.59) changes only through the temperature prefactor:

$$F = kTN[\ln(N\lambda(T)^3/V) - 1] = F_0\,\frac{T}{T_0} = F_0\left(\frac{V_1+V_2}{V_1}\right)^{-2/3} \qquad (1.3.80)$$

Finally, the pressure (Eq. (1.3.64)):

$$P = NkT/V = P_0\,\frac{TV_0}{T_0V} = P_0\left(\frac{V_1+V_2}{V_1}\right)^{-5/3} \qquad (1.3.81)$$

The ideal gas illustrates the significance of the Boltzmann distribution. Consider a single particle. We can treat it either as part of the large system or as a subsystem in its own right. In the ideal gas, without any interactions, its energy would not change. Thus the particle would not be described by the Boltzmann probability in Eq. (1.3.29). However, we can allow the ideal gas model to include a weak or infrequent interaction (collision) between particles which changes the particle's energy. Over a long time compared to the time between collisions, the particle will explore all possible positions in space and all possible momenta. The probability of its being at a particular position and momentum (in a region d³x d³p) is given by the Boltzmann distribution:

$$\frac{e^{-p^2/2mkT}\,d^3p\,d^3x/h^3}{\int e^{-p^2/2mkT}\,d^3p\,d^3x/h^3} \qquad (1.3.82)$$

Instead of considering the trajectory of this particular particle and the effects of the (unspecified) collisions, we can think of an ensemble that represents this particular particle in contact with a thermal reservoir. The ensemble would be composed of many different particles in different boxes. There is no need to have more than one particle in the system. We do need to have some mechanism for energy to be transferred to and from the particle instead of collisions with other particles. This could happen as a result of the collisions with the walls of the box if the vibrations of the walls give energy to the particle or absorb energy from the particle. If the wall is at the temperature T, this would also give rise to the same Boltzmann distribution for the particle. The probability of a particular particle in a particular box being in a particular location with a particular momentum would be given by the same Boltzmann probability.

Using the Boltzmann probability distribution for the velocity, we could calculate the average velocity of the particle as:
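The ratio of integrals that defines this average can be evaluated by simple one-dimensional quadrature over a single momentum component; with m = kT = 1 (an arbitrary unit choice for this sketch) the result should be 3kT/m = 3:

```python
import math

def avg_v2(m=1.0, kT=1.0, p_max=30.0, n=100000):
    """<v^2> = 3 * int (p/m)^2 w(p) dp / int w(p) dp, with w(p) = exp(-p^2/(2*m*kT)),
    evaluated over one momentum component by the trapezoid rule."""
    dp = 2.0 * p_max / n
    num = den = 0.0
    for i in range(n + 1):
        p = -p_max + i * dp
        w = math.exp(-p * p / (2.0 * m * kT))
        wt = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        num += wt * (p / m) ** 2 * w
        den += wt * w
    return 3.0 * num / den

v2 = avg_v2()
print(v2)   # should be very close to 3*kT/m = 3
```

This quadrature route complements the sampling estimate used earlier: both converge on the same equipartition value.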
$$\langle v^2\rangle = 3\langle v_\perp^2\rangle = 3\,\frac{\int v_\perp^2\,e^{-p^2/2mkT}\,d^3p\,d^3x/h^3}{\int e^{-p^2/2mkT}\,d^3p\,d^3x/h^3} = 3\,\frac{\int v_\perp^2\,e^{-p_\perp^2/2mkT}\,dp_\perp}{\int e^{-p_\perp^2/2mkT}\,dp_\perp} = \frac{3kT}{m} \qquad (1.3.83)$$

which is the same result as we obtained for the ideal gas in the last part of Question 1.3.2. We could even consider one coordinate of one particle as a separate system and arrive at the same conclusion. Our description of systems is actually a description of coordinates.

There are differences when we consider the particle to be a member of an ensemble and as one particle of a gas. In the ensemble, we do not need to consider the distinguishability of particles. This does not affect any of the properties of a single particle.

This discussion shows that the ideal gas model may be viewed as quite close to the basic concept of an ensemble. Generalize the physical particle in three dimensions to a point with coordinates that describe a complete system. These coordinates change in time as the system evolves according to the rules of its dynamics. The ensemble represents this system in the same way as the ideal gas is the ensemble of the particle. The lack of interaction between the different members of the ensemble, and the existence of a transfer of energy to and from each of the systems to generate the Boltzmann probability, is the same in each of the cases. This analogy is helpful when thinking about the nature of the ensemble.

1.3.4 Phase transitions—first and second order

In the previous section we constructed some of the underpinnings of thermodynamics and their connection with microscopic descriptions of materials using statistical mechanics. One of the central conclusions was that by minimizing the free energy we can find the equilibrium state of a material that has a fixed number of particles, volume and temperature. Once the free energy is minimized to obtain the equilibrium state of the material, the energy, entropy and pressure are uniquely determined. The free energy is also a function of the temperature, the volume and the number of particles.

One of the important properties of materials is that they can change their properties suddenly when the temperature is changed by a small amount. Examples of this are the transition of a solid to a liquid, or a liquid to a gas. Such a change is known as a phase transition. Each well-defined state of the material is considered a particular phase. Let us consider the process of minimizing the free energy as we vary the temperature. Each of the properties of the material will, in general, change smoothly as the temperature is varied. However, special circumstances might occur when the minimization of the free energy at one temperature results in a very different set of
properties of the material from this minimization at a slightly different temperature. This is illustrated in a series of frames in Fig. 1.3.6, where a schematic of the free energy as a function of some macroscopic parameter is illustrated.

The temperature at which the jump in properties of the material occurs is called the critical or transition temperature, T_c. In general, all of the properties of the material except for the free energy jump discontinuously at T_c. This kind of phase transition is known as a first-order phase transition. Some of the properties of a first-order phase transition are that the two phases can coexist at the transition temperature, so that part of the material is in one phase and part in the other. An example is ice floating in water. If we start from a temperature below the transition temperature—with ice—and add heat to the system gradually, the temperature will rise until we reach the transition temperature. Then the temperature will stay fixed as the material converts from one phase to the other—from ice to water. Once the whole system is converted to the higher-temperature phase, the temperature will start to increase again.

[Figure 1.3.6: free-energy curves at temperatures T_c − 2∆T through T_c + 2∆T. Caption:] Each of the curves represents the variation of the free energy of a system as a function of macroscopic parameters. The different curves are for different temperatures. As the temperature is varied, the minimum of the free energy all of a sudden switches from one set of macroscopic parameters to another. This is a first-order phase transition, like the melting of ice to form water, or the boiling of water to form steam. Below the ice-to-water phase transition the macroscopic parameters that describe ice are the minimum of the free energy, while above the phase transition the macroscopic parameters that describe water are the minimum of the free energy.
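The switching of minima sketched in Fig. 1.3.6 can be illustrated with a toy free-energy function. The double-well form below is purely illustrative (it is an assumption of this sketch, not taken from the text): a linear tilt proportional to T − T_c swaps which minimum is global, so the equilibrium macroscopic parameter jumps discontinuously at T_c:

```python
def free_energy(x, T, Tc=1.0, a=0.5):
    """Schematic free energy with two competing minima (cf. Fig. 1.3.6):
    a double well tilted linearly by (T - Tc)."""
    return (x * x - 1.0) ** 2 + a * (T - Tc) * x

def equilibrium_x(T, n=4001):
    """Macroscopic parameter minimizing the free energy at temperature T
    (brute-force grid search over [-2, 2])."""
    xs = [-2.0 + 4.0 * i / (n - 1) for i in range(n)]
    return min(xs, key=lambda x: free_energy(x, T))

for T in (0.8, 0.9, 1.1, 1.2):
    print(T, round(equilibrium_x(T), 3))
# the minimizing x sits near +1 below Tc and jumps to near -1 above Tc
```

The free energy itself varies smoothly with T, yet the location of its global minimum, and hence every other property, changes discontinuously: the signature of a first-order transition.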
The temperature T_c at which a transition occurs depends on the number of particles and the volume of the system. Alternatively, it may be considered a function of the pressure. We can draw a phase-transition diagram (Fig. 1.3.7) that shows the transition temperature as a function of pressure. Each region of such a diagram corresponds to a particular phase.

There is another kind of phase transition, known as a second-order phase transition, where the energy and the pressure do not change discontinuously at the phase transition point. Instead, they change continuously, but they are nonanalytic at the transition temperature. A common way that this can occur is illustrated in Fig. 1.3.8. In this case the single minimum of the free energy breaks into two minima as a function of temperature. The temperature at which the two minima appear is the transition temperature. Such a second-order transition is often coupled to the existence of first-order transitions. Below the second-order transition temperature, when the two minima exist, the variation of the pressure can change the relative energy of the two minima and cause a first-order transition to occur. The first-order transition occurs at a particular pressure P_c(T) for each temperature below the second-order transition temperature. This gives rise to a line of first-order phase transitions. Above the second-order transition temperature, there is only one minimum, so that there are also no first-order transitions. Thus, the second-order transition point occurs as the end of a line of first-order transitions. A second-order transition is found at the end of the liquid-to-vapor phase line of water in Fig. 1.3.7.

[Figure 1.3.7: schematic P–T diagram with regions labeled ice, water and steam, and first- and second-order transition points marked. Caption:] Schematic phase diagram of H₂O showing three phases—ice, water and steam. Each of the regions shows the domain of pressures and temperatures at which a pure phase is in equilibrium. The lines show phase transition temperatures, T_c(P), or phase transition pressures, P_c(T). The different ways of crossing lines have different names. Ice to water: melting; ice to steam: sublimation; water to steam: boiling; water to ice: freezing; steam to water: condensation; steam to ice: condensation to frost. The transition line from water to steam ends at a point of high pressure and temperature where the two become indistinguishable. At this high pressure steam is compressed till it has a density approaching that of water, and at this high temperature water molecules are energetic like a vapor. This special point is a second-order phase transition point (see Fig. 1.3.8).

[Figure 1.3.8: free-energy curves at temperatures T_c − 3∆T through T_c + 2∆T. Caption:] Similar to Fig. 1.3.6, each of the curves represents the variation of the free energy of a system as a function of macroscopic parameters. In this case, however, the phase transition occurs when two minima emerge from one. This is a second-order phase transition. Below the temperature at which the second-order phase transition occurs, varying the pressure can give rise to a first-order phase transition by changing the relative energies of the two minima (see Figs. 1.3.6 and 1.3.7).
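The emergence of two minima in Fig. 1.3.8 can likewise be illustrated with a toy (Landau-style) free energy, F(m) = (T − T_c)m² + m⁴. This particular form is an illustrative assumption, not taken from the text. Its minimizer grows continuously from zero below T_c, so there is no jump, but the dependence on T is nonanalytic at T_c:

```python
import math

def order_parameter(T, Tc=1.0):
    """Analytic minimizer of the schematic free energy F(m) = (T - Tc)*m**2 + m**4
    (cf. Fig. 1.3.8). Above Tc the single minimum is m = 0; below Tc two minima
    appear at +-sqrt((Tc - T)/2); the positive branch is returned."""
    if T >= Tc:
        return 0.0
    return math.sqrt((Tc - T) / 2.0)

for T in (1.2, 1.1, 1.0, 0.9, 0.8):
    print(T, order_parameter(T))
# the minimum moves away from zero continuously below Tc: no discontinuous jump,
# but the derivative with respect to T is singular at Tc
```

The continuous but nonanalytic behavior of the minimum is exactly what distinguishes a second-order transition from the discontinuous jump of a first-order one.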
The properties of second-order phase transitions have been extensively studied because of interesting phenomena that are associated with them. Unlike a first-order phase transition, there is no coexistence of two phases at the phase transition, because there is only one phase at that point. Instead, there exist large fluctuations in the local properties of the material at the phase transition. A suggestion of why this occurs can be seen from Fig. 1.3.8, where the free energy is seen to be very flat at the phase transition. This results in large excursions (fluctuations) of all the properties of the system except the free energy. These excursions, however, are not coherent over the whole material. Instead, they occur at every length scale from the microscopic on up. The closer a material is to the phase transition, the longer are the length scales that are affected. As the temperature is varied so that the system moves away from the transition temperature, the fluctuations disappear, first on the longest length scales and then on shorter and shorter length scales. Because at the phase transition itself even the macroscopic length scales are affected, thermodynamics itself had to be carefully rethought in the vicinity of second-order phase transitions. The methodology that has been developed, the renormalization group, is an important tool in the investigation of phase transitions. We will discuss it in Section 1.10. We note that, to be consistent with Question 1.3.1, the specific heat C_V must diverge at a second-order phase transition, where energy fluctuations can be large.
1.3.5 Use of thermodynamics and statistical mechanics in describing the real world

How do we generalize the notions of thermodynamics that we have just described to apply to more realistic situations? The assumptions of thermodynamics—that systems are in equilibrium and that dividing them into parts leads to unchanged local properties—do not generally apply. The breakdown of the assumptions of thermodynamics occurs for even simple materials, but they are more radically violated when we consider biological organisms like trees or people. We still are able to measure their temperature. How do we extend thermodynamics to apply to these systems?

We can start by considering a system quite close to the thermodynamic ideal—a pure piece of material that is not in equilibrium. For example, a glass of water in a room. We generally have no trouble placing a thermometer in the glass and measuring the temperature of the water. We know it is not in equilibrium, because if we wait it will evaporate to become a vapor spread out throughout the room (even if we simplify by considering the room closed). Moreover, if we wait longer (a few hundred years to a few tens of thousands of years), the glass itself will flow and cover the table or flow down to the floor, and at least part of it will also sublime to a vapor. The table will undergo its own processes of deterioration. These effects will occur even in an idealized closed room without considerations of various external influences or traffic through the room. There is one essential concept that allows us to continue to apply thermodynamic principles to these materials, and measure the temperature of the water, glass or table, and generally to discover that they are at the same (or close to the same) temperature. The concept is the separation of time scales. This concept is as basic as the other principles of thermodynamics. It plays an essential role in discussions
of the dynamics of physical systems and in particular of the dynamics of complex systems. The separation of time scales assumes that our observations of systems have a limited time resolution and are performed over a limited time. The processes that occur in a material are then separated into fast processes that are much faster than the time resolution of our observation, slow processes that occur on longer time scales than the duration of observation, and dynamic processes that occur on the time scale of our observation. Macroscopic averages are assumed to be averages over the fast processes. Thermodynamics allows us to deal with the slow and the fast processes, but only in very limited ways with the dynamic processes. The dynamic processes are dealt with separately by Newtonian mechanics.

Slow processes establish the framework in which thermodynamics can be applied. In formal terms, the ensemble that we use in thermodynamics assumes that all the parameters of the system described by slow processes are fixed. To describe a system using statistical mechanics, we consider all of the slowly varying parameters of the system to be fixed and assume that equilibrium applies to all of the fast processes. Specifically, we assume that all possible arrangements of the fast coordinates exist in the ensemble with a probability given by the Boltzmann probability. Generally, though not always, it is the microscopic processes that are fast. To justify this we can consider that an atom in a solid vibrates at a rate of 10¹⁰–10¹² times per second, and a gas molecule at room temperature travels five hundred meters per second. These are, however, only a couple of select examples.
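The five-hundred-meters-per-second figure follows from the equipartition result ⟨v²⟩ = 3kT/m of Eq. (1.3.83). A quick check for a nitrogen molecule at room temperature (the molecular mass of N₂ is taken as approximately 28 atomic mass units):

```python
import math

k_B = 1.380649e-23                  # Boltzmann constant, J/K
m_N2 = 28.0 * 1.66053906660e-27     # approximate mass of an N2 molecule, kg

def rms_speed(T, m):
    """v_rms = sqrt(3*k*T/m), from <v^2> = 3kT/m (Eq. (1.3.83))."""
    return math.sqrt(3.0 * k_B * T / m)

v = rms_speed(300.0, m_N2)
print(v)   # roughly five hundred meters per second at room temperature
```

The computed root-mean-square speed comes out near 500 m/s, consistent with the figure quoted in the text.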
Sometimes we may still choose to perform our analysis by averaging over many possible values of the slow coordinates. When we do this we have two kinds of ensembles—the ensemble of the fast coordinates and the ensemble of the different values of the slow coordinates. These ensembles are called the annealed and quenched ensembles. For example, say we have a glass of water in which there is an ice cube. There are fast processes that correspond to the motion of the water molecules and the vibrations of the ice molecules, and there are also slow processes corresponding to the movement of the ice in the water. Let's say we want to determine the average amount of ice. If we perform several measurements that determine the coordinates and size of the ice, we may want to average the size we find over all the measurements, even though they are measurements corresponding to different locations of the ice. In contrast, if we wanted to measure the motion of the ice, averaging the measurements of location would be absurd.
Closely related to the discussion of fast coordinates is the ergodic theorem. The ergodic theorem states that a measurement performed on a system by averaging a property over a long time is the same as taking the average over the ensemble of the fast coordinates. This theorem is used to relate experimental measurements that are assumed to occur over long times to theoretically obtained averages over ensembles. The ergodic theorem is not a theorem in the sense that it has been proven in general, but rather a statement of a property that applies to some macroscopic systems and is known not to apply to others. The objective is to identify when it applies. When it does not apply, the solution is to identify which quantities may be averaged and which may not, often by separating fast and slow coordinates or equivalently by identifying quantities conserved by the fast dynamics of the system.
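The content of the ergodic theorem can be illustrated with the simplest possible dynamics: a two-state Markov chain (an illustrative toy of this sketch, not a model from the text). Its long-time average should match the average over its stationary ensemble:

```python
import random

def time_average(p01=0.3, p10=0.2, steps=200000, seed=1):
    """Long-time average of the state (0 or 1) of a two-state Markov chain,
    where p01 and p10 are the per-step switching probabilities."""
    rng = random.Random(seed)
    s, total = 0, 0
    for _ in range(steps):
        if s == 0 and rng.random() < p01:
            s = 1
        elif s == 1 and rng.random() < p10:
            s = 0
        total += s
    return total / steps

ta = time_average()
stationary = 0.3 / (0.3 + 0.2)   # ensemble average over the stationary distribution
print(ta, stationary)
```

For this ergodic chain the two averages agree; a chain with an unreachable state (a quantity conserved by the dynamics) would be a minimal example of ergodicity breaking.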
Experimental measurements also generally average properties over large regions of space compared to microscopic lengths. It is this spatial averaging rather than time averaging that often enables the ensemble average to stand for experimental measurements when the microscopic processes are not fast compared to the measurement time. For example, materials are often formed of microscopic grains and have many dislocations. The grain boundaries and dislocations do move, but they often change very slowly over time. When experiments are sensitive to their properties, they often average over the effects of many grains and dislocations, because they do not have sufficient resolution to see a single grain boundary or dislocation.
In order to determine what is the relevant ensemble for a particular experiment, both the effects of time and space averaging must be considered. Technically, this requires an understanding of the correlation in space and time of the properties of an individual system. More conceptually, measurements that are made for particular quantities are in effect made over many independent systems, both in space and in time, and therefore correspond to an ensemble average. The existence of correlation is the opposite of independence. The key question (as in the case of the ideal gas) becomes what is the interval of space and time that corresponds to an independent system. These quantities are known as the correlation length and the correlation time. If we are able to describe theoretically the ensemble over a correlation length and correlation time, then by appropriate averaging we can describe the measurement.
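The notions of a correlation time and of counting independent samples can be made concrete with a toy correlated sequence. The AR(1) process below and the 2τ rule of thumb for independent samples are illustrative assumptions of this sketch, not constructions from the text:

```python
import math
import random

def ar1_series(phi=0.9, n=100000, seed=2):
    """Correlated sequence x_{t+1} = phi * x_t + Gaussian noise: a toy 'fast coordinate'."""
    rng = random.Random(seed)
    x, xs = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

xs = ar1_series()
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
lag1 = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(len(xs) - 1)) / (len(xs) - 1)

phi_hat = lag1 / var                    # estimated lag-1 correlation
tau = -1.0 / math.log(phi_hat)          # correlation time in steps
n_independent = len(xs) / (2.0 * tau)   # rough count of independent samples
print(phi_hat, tau, n_independent)
```

A measurement averaged over the full sequence is effectively an average over only n_independent systems, which is how time averaging can stand in for an ensemble average.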
In summary, the program of use of thermodynamics in the real world is to use the separation of the different time scales to apply equilibrium concepts to the fast degrees of freedom and discuss their influence on the dynamic degrees of freedom, while keeping fixed the slow degrees of freedom. The use of ensembles simplifies consideration of these systems by systematizing the application of equilibrium concepts to the fast degrees of freedom.
1.3.6 From thermodynamics to complex systems
Our objective in this book is to consider the dynamics of complex systems. While, as discussed in the previous section, we will use the principles of thermodynamics to help us in this analysis, another important reason to review thermodynamics is to recognize what complex systems are not. Thermodynamics describes macroscopic systems without structure or dynamics. The task of thermodynamics is to relate the very few macroscopic parameters to each other. It suggests that these are the only relevant parameters in the description of these systems. Materials and complex systems are both formed out of many interacting parts. The ideal gas example described a material where the interaction between the particles was weak. However, thermodynamics also describes solids, where the interaction is strong. Having decided that complex systems are not described fully by thermodynamics, we must ask, Where do the assumptions of thermodynamics break down? There are several ways the assumptions may break down, and each one is significant and plays a role in our investigation of
Th e r m o d y n a m i c s a n d s t a t i s t i c a l m e c h a n i c s 91
# 29412 Cust: AddisonWesley Au: BarYam Pg. No. 91
Title: Dynamics Complex Systems
Shor t / Normal / Long
01adBARYAM_29412 3/10/02 10:16 AM Page 91
complex systems. Since we have not yet examined particular examples of complex systems, this discussion must be quite abstract. However, it will be useful as we study complex systems to refer back to this discussion. The abstract statements will have concrete realizations when we construct models of complex systems.
The assumptions of thermodynamics separate into space-related and time-related assumptions. The first we discuss is the divisibility of a macroscopic material. Fig. 1.3.2 (page 61) illustrates the property of divisibility. In this process, a small part of a system is separated from a large part of the system without affecting the local properties of the material. This is inherent in the use of extensive and intensive quantities. Such divisibility is not true of systems typically considered to be complex systems. Consider, for example, a person as a complex system that cannot be separated and continue to have the same properties. In words, we would say that complex systems are formed out of not only interacting, but also interdependent parts. Since both thermodynamic and complex systems are formed out of interacting parts, it is the concept of interdependency that must distinguish them. We will dedicate a few paragraphs to defining a sense in which “interdependent” can have a more precise meaning.
We must first address a simple way in which a system may have a nonextensive energy and still not be a complex system. If we look closely at the properties of a material, say a piece of metal or a cup of water, we discover that its surface is different from the bulk. By separating the material into pieces, the surface area of the material is changed. For macroscopic materials, this generally does not affect the bulk properties of the material. A characteristic way to identify surface properties, such as the surface energy, is through their dependence on particle number. The surface energy scales as N^{2/3}, in contrast to the extensive bulk energy, which is linear in N. This kind of correction can be incorporated directly in a slightly more detailed treatment of thermodynamics, where every macroscopic parameter has a surface term. The presence of such surface terms is not sufficient to identify a material as a complex system. For this reason, we are careful to identify complex systems by requiring that the scenario of Fig. 1.3.2 is violated by changes in the local (i.e., everywhere, including the bulk) properties of the system, rather than just the surface.
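The statement about the N-dependence of the surface term is easy to check numerically. The sketch below (illustrative coefficients, not taken from the text) shows that a surface energy proportional to N^{2/3} becomes a vanishing fraction of an extensive bulk energy proportional to N.

```python
# Bulk energy scales as N, surface energy as N**(2/3), so the surface
# fraction scales as N**(-1/3) and vanishes in the macroscopic limit.
def surface_fraction(n, e_bulk=1.0, e_surf=1.0):
    """Fraction of the total energy contributed by the surface term."""
    bulk = e_bulk * n
    surf = e_surf * n ** (2.0 / 3.0)
    return surf / (bulk + surf)

for n in (10**3, 10**9, 10**23):
    print(n, surface_fraction(n))
```

For a macroscopic particle number of order 10^23, the surface fraction is of order 10^{-8}, which is why ordinary thermodynamics can neglect it.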
It may be asked whether the notion of “local properties” is sufficiently well defined as we are using it. In principle, it is not. For now, we adopt this notion from thermodynamics. When only a few properties, like the energy and entropy, are relevant, “affect locally” is a precise concept. Later we would like to replace the use of local thermodynamic properties with a more general concept: the behavior of the system.
How is the scenario of Fig. 1.3.2 violated for a complex system? We can find that the local properties of the small part are affected without affecting the local properties of the large part. Or we can find that the local properties of the large part are affected as well. The distinction between these two ways of affecting the system is important, because it can enable us to distinguish between different kinds of complex systems. It will be helpful to name them for later reference. We call the first category of systems complex materials; the second category we call complex organisms.
Why don’t we also include the possibility that the large part is affected but not the small part? At this point it makes sense to consider generic subdivision rather than special subdivision. By generic subdivision, we mean the ensemble of possible subdivisions rather than a particular one. Once we are considering complex systems, the effect of removal of part of a system may depend on which part is removed. However, when we are trying to understand whether or not we have a complex system, we can limit ourselves to considering the generic effects of removing a part of the system. For this reason we do not consider the possibility that subdivision affects the large system and not the small. This might be possible for the removal of a particular small part, but it would be surprising to discover a system where this is generically true.
Two examples may help to illustrate the different classes of complex systems. At least superficially, plants are complex materials, while animals are complex organisms. The reason that plants are complex materials is that the cutting of parts of a plant, such as leaves, a branch, or a root, typically does not affect the local properties of the rest of the plant, but does affect the excised part. For animals this is not generically the case. However, it would be better to argue that plants are in an intermediate category, where some divisions, such as cutting out a lateral section of a tree trunk, affect both small and large parts, while others affect only the smaller part. For animals, essentially all divisions affect both small and large parts. We believe that complex organisms play a special role in the study of complex system behavior. The essential quality of a complex organism is that its properties are tied to the existence of all of its parts.
How large is the small part we are talking about? Loss of a few cells from the skin of an animal will not generally affect it. As the size of the removed portion is decreased, it may be expected that the influence on the local properties of the larger system will be reduced. This leads to the concept of a robust complex system. Qualitatively, the larger the part that can be removed from a complex system without affecting its local properties, the more robust the system is. We see that a complex material is the limiting case of a highly robust complex system.
The flip side of subdivision of a system is aggregation. For thermodynamic systems, subdivision and aggregation are the same, but for complex systems they are quite different. One of the questions that will concern us is what happens when we place a few or many complex systems together. Generally we expect that the individual complex systems will interact with each other. However, one of the points we can make at this time is that just placing together many complex systems, trees or people, does not make a larger complex system by the criteria of subdivision. Thus, a collection of complex systems may result in a system that behaves as a thermodynamic system under subdivision: separating it into parts does not affect the behavior of the parts.
The topic of bringing together many pieces or subdividing into many parts is also quite distinct from the topic of subdivision by removal of a single part. This brings us to a second assumption we will discuss. Thermodynamic systems are assumed to be composed of a very large number of particles. What about complex systems? We know that the number of molecules in a cup of water is not greater than the number of molecules in a human being. And yet, we understand that this is not quite the right point. We should not be counting the number of water molecules in the person; instead we might count the number of cells, which is much smaller. Thus appears the problem of counting the number of components of a system. In the context of correlations in materials, this was briefly discussed at the end of the last section. Let us assume for the moment that we know how to count the number of components. It seems clear that systems with only a few components should not be treated by thermodynamics. One of the interesting questions we will discuss is whether in the limit of a very large number of components we will always have a thermodynamic system. Stated in a simpler way from the point of view of the study of complex systems, the question becomes: how large is too large, or how many is too many? From the thermodynamic perspective the question is, Under what circumstances do we end up with the thermodynamic limit?
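One quantitative hallmark of the thermodynamic limit can be sketched numerically (the distribution used here is an illustrative choice, not from the text): for N independent components, the relative fluctuation of an extensive quantity shrinks as 1/sqrt(N), so a system with enough components looks sharp and thermodynamic.

```python
import numpy as np

rng = np.random.default_rng(1)

def relative_fluctuation(n_components, n_samples=400):
    """Std/mean of a sum of n_components independent unit-mean terms."""
    totals = rng.exponential(1.0, size=(n_samples, n_components)).sum(axis=1)
    return totals.std() / totals.mean()

# The relative fluctuation falls off as 1/sqrt(N): the thermodynamic
# limit emerges as the number of independent components grows.
for n in (10, 100, 10_000):
    print(n, relative_fluctuation(n))
```

Whether a complex system reaches this limit depends on how many *independent* components it has, which is exactly the counting problem raised in the text.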
We now switch to a discussion of time-related assumptions. One of the basic assumptions of thermodynamics is the ergodic theorem, which enables the description of a single system using an ensemble. When the ergodic theorem breaks down, as discussed in the previous section, additional fixed or quenched variables become important. This is the same as saying that there are significant differences between different examples of the macroscopic system we are interested in. This is a necessary condition for the existence of a complex system. The alternative would be that all realizations of the system would be the same, which does not coincide with intuitive notions of complexity. We will discuss several examples of the breaking of the ergodic theorem later. The simplest example is a magnet. The orientation of the magnet is an additional parameter that must be specified, and therefore the ergodic theorem is violated for this system. Any system that breaks symmetry violates the ergodic theorem. However, we do not accept a magnet as a complex system. Therefore we can assume that the breaking of ergodicity is a necessary but not sufficient condition for complexity. All of the systems we will discuss break ergodicity, and therefore it is always necessary to specify which coordinates of the complex system are fixed and which are to be assumed to be so rapidly varying that they can be assigned equilibrium Boltzmann probabilities.
A special case of the breaking of the ergodic theorem, but one that strikes even more deeply at the assumptions of thermodynamics, is a violation of the separation of time scales. If there are dynamical processes that occur on every time scale, then it becomes impossible to treat the system using the conventional separation of scales into fast, slow, and dynamic processes. As we will discuss in Section 1.10, the techniques of renormalization that are used in phase transitions to deal with the existence of many spatial scales may also be used to describe systems changing on many time scales.
Finally, inherent in thermodynamics, the concept of equilibrium, and the ergodic theorem is the assumption that the initial condition of the system does not matter. For a complex system, the initial condition of the system does matter over the time scales relevant to our observation. This brings us back to the concept of correlation time. The correlation time describes the length of time over which the initial conditions are relevant to the dynamics. This means that our observation of a complex system must be shorter than a correlation time. The spatial analog, the correlation length, describes the effects of surfaces on the system. The discussion of the effects of subdivision also implies that the system must be smaller than a correlation length. This means that complex systems change their internal structure (adapt) to conditions at their boundaries. Thus, a suggestive though incomplete summary of our discussion of complexity in the context of thermodynamics is that a complex system is contained within a single correlation distance and correlation time.
1.4 Activated Processes (and Glasses)
In the last section we saw figures (Fig. 1.3.7) showing the free energy as a function of a macroscopic parameter with two minima. In this section we analyze a single-particle system that has a potential energy with a similar shape (Fig. 1.4.1). The particle is in equilibrium with a thermal reservoir. If the average energy is lower than the energy of the barrier between the two wells, then the particle generally resides for a time in one well and then switches to the other. At very low temperatures, in equilibrium, it will be more and more likely to be in the lower well and less likely to be in the higher well. We use this model to think about a system with two possible states, where one state is higher in energy than the other. If we start the system in the higher-energy state, the system will relax to the lower-energy state. Because the process of relaxation is enabled or accelerated by energy from the thermal reservoir, we say that it is activated.
[Figure 1.4.1 shows a double-well potential E(x), with minima at x_1 and x_{-1} at energies E_1 and E_{-1}, separated by a barrier of height E_B at x_B.]

Figure 1.4.1: Illustration of the potential energy of a system that has two local minimum-energy configurations, x_1 and x_{-1}. When the temperature is lower than the energy barriers E_B − E_{-1} and E_B − E_1, the system may be considered as a two-state system with transitions between the states. The relative probability of the two states varies with temperature and the relative energy of the bottoms of the two wells. The rate of transition also varies with temperature. When the system is cooled systematically, the two-state system is a simple model of a glass (Fig. 1.4.2). At low temperatures the system cannot move from one well to the other, but is in equilibrium within a single well.
1.4.1 Two-state systems
It might seem that a system with only two different states would be easy to analyze. Eventually we will reach a simple problem. However, building the simple model will require us to identify some questions and approximations relevant to our understanding of the application of this model to physical systems (e.g., the problem of protein folding found in Chapter 4). Rather than jumping to the simple two-state problem (Eq. (1.4.40) below), we begin from a particle in a double-well potential. The kinetics and thermodynamics of this system give some additional content to the thermodynamic discussion of the previous section and introduce new concepts.
We consider Fig. 1.4.1 as describing the potential energy V(x) experienced by a classical particle in one dimension. The region to the right of x_B is called the right well and the region to the left is called the left well. A classical trajectory of the particle with conserved energy would consist of the particle bouncing back and forth within the potential well between two points that are the solutions of the equation V(x) = E, where E is the total energy of the particle. The kinetic energy at any time is given by

$$E(x,p) - V(x) = \frac{1}{2}mv^2 \tag{1.4.1}$$

which determines the magnitude of the velocity at any position but not the direction. The velocity switches direction at every bounce. When the energy is larger than E_B, there is only one distinct trajectory at each energy. For energies larger than E_1 but smaller than E_B, there are two possible trajectories, one in the right well (to the right of x_B) and one in the left well. Below E_1, which is the minimum energy of the right well, there is again only one trajectory possible, in the left well. Below E_{-1} there are no possible locations for the particle.

If we consider this system in isolation, there is no possibility that the particle will change from one trajectory to another. Our first objective is to enable the particle to be in contact with some other system (or coordinate) with which it can transfer energy and momentum. For example, we could imagine that the particle is one of many moving in the double well, like the ideal gas. Sometimes there are collisions that change the energy and direction of the motion. The same effect would be found for many other ways we could imagine the particle interacting with other systems. The main approximation, however, is that the interaction of the particle with the rest of the universe occurs only over short times. Most of the time it acts as if it were by itself in the potential well. The particle follows a trajectory and has an energy that is the sum of its kinetic and potential energies (Eq. (1.4.1)). There is no need to describe the energy associated with the interaction with the other systems. All of the other particles of the gas (or whatever picture we imagine) form the thermal reservoir, which has a well-defined temperature T.

We can increase the rate of collisions between the system and the reservoir without changing our description. Then the particle does not go very far before it forgets the direction it was traveling in and the energy that it had. But as long as the collisions themselves occur over a short time compared to the time between collisions, any time we look at the particle, it has a well-defined energy and momentum. From moment
to moment, the kinetic energy and momentum change unpredictably. Still, the position of the particle must change continuously in time. This scenario is known as diffusive motion. The different times are related by:

collision (interaction) time << time between collisions << transit time

where the transit time is the time between bounces from the walls of the potential well if there were no collisions (the period of oscillation of a particle in the well). The particle undergoes a kind of random walk, with its direction and velocity changing randomly from moment to moment. We will assume this scenario in our treatment of this system.
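A minimal simulation of this diffusive scenario, assuming overdamped Langevin dynamics in a quartic double well (the potential, temperature, and time step are illustrative choices, not taken from the text): the thermal kicks erase velocity memory between collisions, and the particle occasionally hops over the barrier.

```python
import numpy as np

rng = np.random.default_rng(2)

# Quartic double well V(x) = (x**2 - 1)**2, minima near x = -1 and x = +1,
# barrier of height 1 at x = 0.
def force(x):
    return -4.0 * x * (x * x - 1.0)

kT, dt, n_steps = 0.3, 1e-3, 400_000
sigma = np.sqrt(2.0 * kT * dt)         # thermal noise amplitude per step
noise = rng.normal(0.0, 1.0, n_steps)

x = -1.0                               # start in the left well
steps_right = 0
for i in range(n_steps):
    # Overdamped Langevin update: drift down the force plus a random kick.
    x += force(x) * dt + sigma * noise[i]
    if x > 0.0:
        steps_right += 1

# With barrier/kT of order 3, hopping is activated but observable: the
# particle spends long stretches in each well and visits both sides.
print(steps_right / n_steps)
```

The fraction of time spent on the right fluctuates from run to run; only over times containing many hops does it settle toward the equilibrium value.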
When the particle is in contact with a thermal reservoir, the laws of thermodynamics apply. The Boltzmann probability gives the probability that the particle is found at position x with momentum p:

$$P(x,p) = e^{-E(x,p)/kT}/Z \qquad Z = \sum_{x,p} e^{-E(x,p)/kT} = \frac{1}{h}\int dx\,dp\; e^{-E(x,p)/kT} \tag{1.4.2}$$

Formally, this expression describes a large number of independent systems that make up a canonical ensemble. The ensemble of systems provides a formally precise way of describing probabilities as the number of systems in the ensemble with a particular value of the position and momentum. As in the previous section, Z guarantees that the sum over all probabilities is 1. The factor of h is not relevant in what follows, but for completeness we keep it and associate it with the momentum integral, so that $\sum_p \to \int dp/h$.

If we are interested in the position of the particle, and are not interested in its momentum, we can simplify this expression by integrating over all values of the momentum. Since the energy separates into kinetic and potential energy:

$$P(x) = \frac{e^{-V(x)/kT}\int (dp/h)\, e^{-p^2/2mkT}}{\int dx\; e^{-V(x)/kT}\int (dp/h)\, e^{-p^2/2mkT}} = \frac{e^{-V(x)/kT}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.3}$$

The resulting expression looks similar to our original expression. Its meaning is somewhat different, however, because V(x) is only the potential energy of the system. Since the kinetic energy contributes equivalently to the probability at every location, V(x) determines the probability at every x. An expression of the form e^{-E/kT} is known as the Boltzmann factor of E. Thus Eq. (1.4.3) says that the probability P(x) is proportional to the Boltzmann factor of V(x). We will use this same trick to describe the probability of being to the right or to the left of x_B in terms of the minimum energy of each well.

To simplify to a two-state system, we must define a variable that specifies only which of the two wells the particle is in. So we label the system by s = ±1, where s = +1 if x > x_B and s = −1 if x < x_B for a particular realization of the system at a particular time, or:
$$s = \mathrm{sign}(x - x_B) \tag{1.4.4}$$

Probabilistically, the case x = x_B never happens and therefore does not have to be accounted for.
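Eq. (1.4.3) can be evaluated numerically for any concrete potential. The sketch below (a tilted quartic double well, an illustrative stand-in for the V(x) of Fig. 1.4.1) normalizes the Boltzmann factor on a grid and shows the probability concentrating in the lower well as the temperature falls.

```python
import numpy as np

# Tilted quartic double well: the well near x = -1 sits lower in energy
# than the well near x = +1.
def V(x):
    return (x * x - 1.0) ** 2 + 0.2 * x

x = np.linspace(-3.0, 3.0, 20001)
dx = x[1] - x[0]

def p_of_x(kT):
    """Normalized P(x) from the Boltzmann factor, as in Eq. (1.4.3)."""
    w = np.exp(-V(x) / kT)
    return w / (w.sum() * dx)

for kT in (1.0, 0.3, 0.1):
    p = p_of_x(kT)
    p_left = p[x < 0.0].sum() * dx     # probability of the lower well
    print(kT, p_left)
```

As kT drops well below the energy difference between the two well bottoms, essentially all of the probability collects in the lower well, which is the two-state behavior derived next.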
We can calculate the probability P(s) of the system having the value s = +1 using:

$$P(1) = \frac{\int_{x_B}^{\infty} dx\; e^{-V(x)/kT}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.5}$$

The largest contribution to this probability occurs where V(x) is smallest. We assume that kT is small compared to E_B; then the value of the integral is dominated by the region immediately in the vicinity of the minimum energy. Describing this as a two-state system is only meaningful when this is true. We simplify the integral by expanding it in the vicinity of the minimum energy and keeping only the quadratic term:

$$V(x) = E_1 + \frac{1}{2}m\omega_1^2(x - x_1)^2 + \ldots = E_1 + \frac{1}{2}k_1(x - x_1)^2 + \ldots \tag{1.4.6}$$

where

$$k_1 = m\omega_1^2 = \left.\frac{d^2V(x)}{dx^2}\right|_{x_1} \tag{1.4.7}$$

is the effective spring constant and $\omega_1$ is the frequency of small oscillations in the right well. We can now write Eq. (1.4.5) in the form

$$P(1) = \frac{e^{-E_1/kT}\int_{x_B}^{\infty} dx\; e^{-k_1(x - x_1)^2/2kT}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.8}$$

Because the integrand in the numerator falls rapidly away from the point x = x_1, we could extend the lower limit to −∞. Similarly, the probability of being in the left well is:

$$P(-1) = \frac{e^{-E_{-1}/kT}\int_{-\infty}^{x_B} dx\; e^{-k_{-1}(x - x_{-1})^2/2kT}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.9}$$

Here the upper limit of the integral could be extended to ∞. It is simplest to assume that k_1 = k_{-1}. This assumption, that the shapes of the wells are the same, does not significantly affect most of the discussion (Questions 1.4.1–1.4.2). The two probabilities are proportional to a new constant times the Boltzmann factor e^{-E/kT} of the energy at the bottom of each well. This can be seen even without performing the integrals in Eq. (1.4.8) and Eq. (1.4.9). We redefine Z for the two-state representation:
$$P(1) = \frac{e^{-E_1/kT}}{Z_s} \tag{1.4.10}$$

$$P(-1) = \frac{e^{-E_{-1}/kT}}{Z_s} \tag{1.4.11}$$

The new normalization Z_s can be obtained from:

$$P(1) + P(-1) = 1 \tag{1.4.12}$$

giving

$$Z_s = e^{-E_1/kT} + e^{-E_{-1}/kT} \tag{1.4.13}$$

which is different from the value in Eq. (1.4.2). We arrive at the desired two-state result:

$$P(1) = \frac{e^{-E_1/kT}}{e^{-E_1/kT} + e^{-E_{-1}/kT}} = \frac{1}{1 + e^{(E_1 - E_{-1})/kT}} = f(E_1 - E_{-1}) \tag{1.4.14}$$

where f is the Fermi probability or Fermi function:

$$f(x) = \frac{1}{1 + e^{x/kT}} \tag{1.4.15}$$

Readers who were introduced to the Fermi function in quantum statistics should note that it is not unique to that field; it occurs any time there are exactly two different possibilities. Similarly,

$$P(-1) = \frac{e^{-E_{-1}/kT}}{e^{-E_1/kT} + e^{-E_{-1}/kT}} = \frac{1}{1 + e^{(E_{-1} - E_1)/kT}} = f(E_{-1} - E_1) \tag{1.4.16}$$

which is consistent with Eq. (1.4.12) above since

$$f(x) + f(-x) = 1 \tag{1.4.17}$$

Question 1.4.1 Discuss how k_1 ≠ k_{-1} would affect the results for the two-state system in equilibrium. Obtain expressions for the probabilities in each of the wells.

Solution 1.4.1 Extending the integrals to ±∞, as described in the text after Eq. (1.4.8) and Eq. (1.4.9), we obtain:

$$P(1) = \frac{e^{-E_1/kT}\sqrt{2\pi kT/k_1}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.18}$$

$$P(-1) = \frac{e^{-E_{-1}/kT}\sqrt{2\pi kT/k_{-1}}}{\int dx\; e^{-V(x)/kT}} \tag{1.4.19}$$
Because of the approximate extension of the integrals, we are no longer guaranteed that the sum of these probabilities is 1. However, within the accuracy of the approximation, we can reimpose the normalization condition. Before we do so, we choose to rewrite $k_1 = m\omega_1^2 = m(2\pi\nu_1)^2$, where $\nu_1$ is the natural frequency of the well. We then ignore all common factors in the two probabilities and write

$$P(1) = \frac{\nu_1^{-1}\, e^{-E_1/kT}}{Z'_s} \tag{1.4.20}$$

$$P(-1) = \frac{\nu_{-1}^{-1}\, e^{-E_{-1}/kT}}{Z'_s} \tag{1.4.21}$$

$$Z'_s = \nu_1^{-1}\, e^{-E_1/kT} + \nu_{-1}^{-1}\, e^{-E_{-1}/kT} \tag{1.4.22}$$

Or we can write, as in Eq. (1.4.14),

$$P(1) = \frac{1}{1 + (\nu_1/\nu_{-1})\, e^{(E_1 - E_{-1})/kT}} \tag{1.4.23}$$

and similarly for P(−1).

Question 1.4.2 Redefine the energies E_1 and E_{-1} to include the effect of the difference between k_1 and k_{-1}, so that the probability P(1) (Eq. (1.4.23)) can be written like Eq. (1.4.14) with the new energies. How is the result related to the concepts of free energy and entropy?

Solution 1.4.2 We define the new energy of the right well as

$$F_1 = E_1 + kT\ln(\nu_1) \tag{1.4.24}$$

This definition can be seen to recover Eq. (1.4.23) from the form of Eq. (1.4.14) as

$$P(1) = f(F_1 - F_{-1}) \tag{1.4.25}$$

Eq. (1.4.24) is very reminiscent of the definition of the free energy, Eq. (1.3.33), if we use the expression for the entropy:

$$S_1 = -k\ln(\nu_1) \tag{1.4.26}$$

Note that if we consider the temperature dependence, Eq. (1.4.25) is not identical in its behavior to Eq. (1.4.14). The free energy, F_1, depends on T, while the energy at the bottom of the well, E_1, does not.

In Question 1.4.2, Eq. (1.4.24), we have defined what might be interpreted as a free energy of the right well. In Section 1.3 we defined only the free energy of the system as a whole. The new free energy is for part of the ensemble rather than the whole ensemble. We can do this quite generally. Start by identifying a certain subset of all
possible states of a system. For example, s = 1 in Eq. (1.4.4). Then we define the free energy using the expression:

$$F_s(1) = -kT\ln\left(\sum_{\{x,p\}} \delta_{s,1}\; e^{-E(\{x,p\})/kT}\right) = -kT\ln(Z_1) \tag{1.4.27}$$

This is similar to the usual expression for the free energy in terms of the partition function Z, but the sum is only over the subset of states. When there is no ambiguity, we often drop the subscript and write this as F(1). From this definition we see that the probability of being in the subset of states is proportional to the Boltzmann factor of the free energy

$$P(1) \propto e^{-F_s(1)/kT} \tag{1.4.28}$$

If we have several different subsets that account for all possibilities, then we can normalize Eq. (1.4.28) to find the probability itself. If we do this for the left and right wells, we immediately arrive at the expressions for the probabilities in Eq. (1.4.14) and Eq. (1.4.16), with E_1 and E_{-1} replaced by F_s(1) and F_s(−1) respectively. From Eq. (1.4.28) we see that for a collection of states, the free energy plays the same role as the energy in the Boltzmann probability.

We note that Eq. (1.4.24) is not the same as Eq. (1.4.27). However, as long as the relative free energy is the same, F_1 − F_{-1} = F_s(1) − F_s(−1), the normalized probability is unchanged. When k_1 = k_{-1}, the entropic part of the free energy is the same for both wells. Then direct use of the energy instead of the free energy is valid, as in Eq. (1.4.14). We can evaluate the free energy of Eq. (1.4.27), including the momentum integral:

$$Z_1 = \int_{x_B}^{\infty} dx \int \frac{dp}{h}\, e^{-E(x,p)/kT} = \int_{x_B}^{\infty} dx\; e^{-V(x)/kT} \int \frac{dp}{h}\, e^{-p^2/2mkT} \approx e^{-E_1/kT} \int_{x_B}^{\infty} dx\; e^{-k_1(x - x_1)^2/2kT}\; \frac{\sqrt{2\pi mkT}}{h} \approx e^{-E_1/kT}\, \sqrt{\frac{m}{k_1}}\, \frac{2\pi kT}{h} = e^{-E_1/kT}\, \frac{kT}{h\nu_1} \tag{1.4.29}$$

$$F_s(1) = E_1 + kT\ln(h\nu_1/kT) \tag{1.4.30}$$

where we have used the definition of the well oscillation frequency above Eq. (1.4.20) to simplify the expression. A similar expression holds for Z_{-1}. The result would be exact for a pure harmonic well.

The new definition of the free energy of a set of states can also be used to understand the treatment of macroscopic systems, specifically to explain why the energy is determined by minimizing the free energy. Partition the possible microstates by the value of the energy, as in Eq. (1.3.35). Define the free energy as a function of the energy analogous to Eq. (1.4.27):

$$F(U) = -kT\ln\left(\sum_{\{x,p\}} \delta_{E(\{x,p\}),U}\; e^{-E(\{x,p\})/kT}\right) \tag{1.4.31}$$
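The closed form obtained in Eq. (1.4.29) is easy to verify numerically for a pure harmonic well, for which it is exact. The sketch below (illustrative parameter values, in units where h = 1) compares a direct phase-space integral for Z_1 with the formula e^{-E_1/kT} kT/(h ν_1).

```python
import numpy as np

# Pure harmonic well V(x) = E1 + 0.5*k1*(x - x1)**2, in units with h = 1.
m, k1, E1, x1, kT, h = 1.0, 4.0, 0.5, 1.0, 0.3, 1.0
nu1 = np.sqrt(k1 / m) / (2.0 * np.pi)   # omega_1 = sqrt(k1/m) = 2*pi*nu_1

x = np.linspace(x1 - 6.0, x1 + 6.0, 4001)
p = np.linspace(-20.0, 20.0, 4001)
dx, dp = x[1] - x[0], p[1] - p[0]

# Z1 = integral dx (dp/h) exp(-E(x,p)/kT): the x and p integrals factor.
boltz_x = np.exp(-(E1 + 0.5 * k1 * (x - x1) ** 2) / kT)
boltz_p = np.exp(-p ** 2 / (2.0 * m * kT))
Z1_numeric = boltz_x.sum() * dx * boltz_p.sum() * dp / h

# Closed form from Eq. (1.4.29): Z1 = exp(-E1/kT) * kT / (h * nu1).
Z1_formula = np.exp(-E1 / kT) * kT / (h * nu1)
print(Z1_numeric, Z1_formula)
```

For an anharmonic well (a genuine double well cut at x_B), the same comparison would show the formula as an approximation controlled by kT relative to the barrier height.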
Since the relative probability of each value of the energy is given by Eq. (1.4.32), the most likely energy is given by the lowest free energy. For a macroscopic system, the most likely value is so much more likely than any other value that it is observed in any measurement. This can immediately be generalized. The minimization of the free energy gives not only the value of the energy but the value of any macroscopic parameter.
1.4.2 Relaxation of a two-state system
To investigate the kinetics of the two-state system, we assume an ensemble of systems that is not an equilibrium ensemble. Instead, the ensemble is characterized by a time-dependent probability of occupying the two wells, Eq. (1.4.33). Normalization continues to hold at every time, Eq. (1.4.34). For example, we might consider starting a system in the upper well and seeing how the system evolves in time. Or we might consider starting a system in the lower well and seeing how the system evolves in time. We answer the question using the time-evolving probabilities that describe an ensemble of systems with the same starting condition. To achieve this objective, we construct a differential equation describing the rate of change of the probability of being in a particular well in terms of the rate at which systems move from one well to the other. This is just the Master equation approach of Section 1.2.4.
The systems that make tr ansitions from the left to the right well are the ones that
cross the point x · x
B
. More precisely, the rate at which transitions occur is the prob
ability current per unit time of systems at x
B
, moving toward the right. Similar to Eq.
(1.3.47) used to obtain the pressure of an ideal gas on a wall,the number of par t icles
crossing x
B
is the probabilit y of syst ems at x
B
with velocit y v, times their velocity:
(1.4.35)
where J(1 −1) is the number o f systems p er unit time moving fr om the left to the
r ight. Ther e is a hidden assumption in Eq. (1.4.35). We have adopted a notation that
treats all syst ems on the left together. When we are considering transitions,this is only
valid if a syst em that crosses x · x
B
from right to left makes it down into the well on
the left, and thus does not immediately cross back over to the side it came from.
We further assume that in each well the syst ems are in equilibrium, even when
the two wells are not in equilibr ium with each other. This means that the probabilit y
of being in a par t icular location in the right well is given by:
J(1  −1) · (dp / h) vP(x
B
, p;t)
0
∞
∫
P(1;t) +P( −1;t ) ·1
P(1) →P(1;t )
P( −1) →P( −1;t )
P(U) ∝e
−F(U ) /kT
P(x, p; t) = P(1;t) e^{−E(x,p)/kT} / Z_1,   Z_1 = ∫_{x_B}^∞ dx ∫ (dp/h) e^{−E(x,p)/kT}    (1.4.36)

In equilibrium, this statement is true because then P(1) = Z_1/Z. Eq. (1.4.36) presumes that the rate of collisions between the particle and the thermal reservoir is faster than both the rate at which the system goes from one well to the other and the frequency of oscillation in a well.

In order to evaluate the transition rate in Eq. (1.4.35), we need the probability at x_B. We assume that the systems that cross x_B moving from the left well to the right well (i.e., moving to the right) are in equilibrium with the systems in the left well from which they came. Systems that are moving from the right well to the left have the equilibrium distribution characteristic of the right well. With these assumptions, the rate at which systems hop from the left to the right is given by:

J(1|−1) = ∫_0^∞ (dp/h)(p/m) P(−1;t) e^{−(E_B + p²/2m)/kT} / Z_{−1} = P(−1;t) e^{−E_B/kT} (kT/h) / Z_{−1}    (1.4.37)

We find using Eq. (1.4.29) that the current of systems can be written in terms of a transition rate per system:

J(1|−1) = R(1|−1) P(−1;t),   R(1|−1) = ν_{−1} e^{−(E_B − E_{−1})/kT}    (1.4.38)

Similarly, the current and rate at which systems hop from the right to the left are given by:

J(−1|1) = R(−1|1) P(1;t),   R(−1|1) = ν_1 e^{−(E_B − E_1)/kT}    (1.4.39)

When k_1 = k_{−1}, then ν_1 = ν_{−1}. We continue to deal with this case for simplicity and define ν = ν_1 = ν_{−1}. The expressions for the rate of transition suggest the interpretation that the frequency ν is the rate of attempts to cross the barrier, while the probability of crossing in each attempt is given by the Boltzmann factor, which gives the likelihood that the energy exceeds the barrier. While this interpretation is appealing, and is often given, it is misleading. It is better to consider the frequency as describing the width of the well in which the particle wanders: the wider the well is, the less likely is a barrier crossing. This interpretation survives better when more general cases are considered.

The transition rates enable us to construct the time variation of the probability of occupying each of the wells. This gives us the coupled equations for the two probabilities:

Ṗ(1;t) = R(1|−1) P(−1;t) − R(−1|1) P(1;t)
Ṗ(−1;t) = R(−1|1) P(1;t) − R(1|−1) P(−1;t)    (1.4.40)
These are the Master equations (Eq. (1.2.86)) for the two-state system. We have arrived at these equations by introducing a set of assumptions for treating the kinetics of a single particle. The equations are much more general, since they say only that there is a rate of transition between one state of the system and the other. It is the correspondence between the two-state system and the moving particle that we have established in Eqs. (1.4.38) and (1.4.39). This correspondence is approximate. Eq. (1.4.40) does not rely upon the relationship between E_B and the rate at which systems move from one well to the other. However, it does rely upon the assumption that we need to know only which well the system is in to specify its rate of transition to the other well. On average this is always true, but it would not be a good description of the system, for example, if energy is conserved and the key question determining the kinetics is whether the particle has more or less energy than the barrier E_B.

We can solve the coupled equations in Eq. (1.4.40) directly. Both equations are not necessary, given the normalization constraint Eq. (1.4.34). Substituting P(−1;t) = 1 − P(1;t) we have the equation

Ṗ(1;t) = R(1|−1) − P(1;t)(R(1|−1) + R(−1|1))    (1.4.41)

We can rewrite this in terms of the equilibrium value of the probability. By definition, this is the value at which the time derivative vanishes:

P(1;∞) = R(1|−1)/(R(1|−1) + R(−1|1)) = f(E_1 − E_{−1})    (1.4.42)

where the right-hand side follows from Eq. (1.4.38) and Eq. (1.4.39) and is consistent with Eq. (1.4.13), as it must be. Using this expression, Eq. (1.4.41) becomes

Ṗ(1;t) = (P(1;∞) − P(1;t))/τ    (1.4.43)

where we have defined an additional quantity

1/τ = R(1|−1) + R(−1|1) = ν(e^{−(E_B − E_1)/kT} + e^{−(E_B − E_{−1})/kT})    (1.4.44)

The solution of Eq. (1.4.43) is

P(1;t) = (P(1;0) − P(1;∞)) e^{−t/τ} + P(1;∞)    (1.4.45)

This solution describes a decaying exponential that changes the probability from the starting value to the equilibrium value. This explains the definition of τ, called the relaxation time. Since it is inversely related to the sum of the rates of transition between the wells, it is a typical time taken by a system to hop between the wells. The relaxation time does not depend on the starting probability. We note that the solution of Eq. (1.4.41) does not depend on the explicit form of P(1;∞) or τ. The definitions implied by the first equal signs in Eq. (1.4.42) and Eq. (1.4.44) are sufficient. Also, as can be quickly checked, we can replace the index 1 with the index −1 without changing anything else in Eq. (1.4.45). The other equations are valid (by symmetry) after the substitution 1 ↔ −1.
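The exponential relaxation can be checked against a direct numerical integration of the two-state master equation; the rate values in this sketch are illustrative choices, not tied to any particular physical parameters.

```python
import math

R_UP = 0.05     # R(1|-1): rate of hops into the upper well (illustrative)
R_DOWN = 0.20   # R(-1|1): rate of hops out of the upper well (illustrative)

P_INF = R_UP / (R_UP + R_DOWN)   # equilibrium probability of the upper well
TAU = 1.0 / (R_UP + R_DOWN)      # relaxation time

def p_exact(t, p0):
    """Analytic solution: exponential relaxation to the equilibrium value."""
    return (p0 - P_INF) * math.exp(-t / TAU) + P_INF

def p_numeric(t, p0, dt=1e-4):
    """Euler integration of dP/dt = R_UP - P*(R_UP + R_DOWN)."""
    p = p0
    for _ in range(int(t / dt)):
        p += dt * (R_UP - p * (R_UP + R_DOWN))
    return p

print(p_numeric(10.0, 1.0), p_exact(10.0, 1.0))
```

The two values agree to within the Euler step error, and both approach P_INF for times long compared with TAU, independent of the starting probability.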
There are several intuitive relationships between the equilibrium probabilities and the transition rates that may be written down. The first is that the ratio of the equilibrium probabilities is the ratio of the transition rates:

P_1(∞)/P_{−1}(∞) = R(1|−1)/R(−1|1)    (1.4.46)

The second is that the equilibrium probability divided by the relaxation time is the rate of transition into that well:

P_1(∞)/τ = R(1|−1)    (1.4.47)

Question 1.4.3 Eq. (1.4.45) implies that the relaxation time of the system depends largely on the smaller of the two energy barriers, E_B − E_1 and E_B − E_{−1}. For Fig. 1.4.1 the smaller barrier is E_B − E_1. Since the relaxation time is independent of the starting probability, this barrier controls the rate of relaxation whether we start the system from the lower well or the upper well. Why does the barrier E_B − E_1 control the relaxation rate when we start from the lower well?

Solution 1.4.3 Even though the rate of transition from the lower well to the upper well is controlled by E_B − E_{−1}, the fraction of the ensemble that must make the transition in order to reach equilibrium depends on E_1. The higher it is, the fewer systems must make the transition from s = −1 to s = 1. Taking this into consideration implies that the time to reach equilibrium depends on E_B − E_1 rather than E_B − E_{−1}.

1.4.3 Glass transition

Glasses are materials that, when cooled from the liquid, do not undergo a conventional transition to a solid. Instead their viscosity increases, and in the vicinity of a particular temperature it becomes so large that on a reasonable time scale they can be treated as solids. However, on long enough time scales, they flow as liquids. We will model the glass transition using a two-state system by considering what happens as we cool down the two-state system. At high enough temperatures, the system hops back and forth between the two minima with rates given by Eqs. (1.4.38) and (1.4.39). ν is a microscopic quantity; it might be a vibration rate in the material. Even if the barriers are higher than the temperature, E_B − E_{±1} >> kT, the system will still be able to hop back and forth quite rapidly from a macroscopic perspective.

As the system is cooled down, the hopping back and forth slows down. At some point the typical time between hops will become longer than the time over which we are observing the system. Systems in the higher well will stay there. Systems in the lower well will stay there. This means that the population in each well becomes fixed. Even when we continue to cool the system down, there will be no change, and the ensemble will no longer be in equilibrium. Within each well the system will continue to have a probability distribution for its energy given by the Boltzmann probability, but the relative
populations of the two wells will no longer be described by the equilibrium Boltzmann probability.

To gain a feeling for the numbers, a typical atomic vibration rate is 10^12/sec. For a barrier of 1 eV, at twice room temperature, kT ≈ 0.05 eV (600°K), the transition rate would be of order 10^3/sec. This is quite slow from a microscopic perspective, but at room temperature it would be only 10^{−6}/sec, about one transition every ten days.
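These estimates follow directly from the Arrhenius form R = ν e^{−E_B/kT} and are quick to verify:

```python
import math

NU = 1.0e12   # typical atomic vibration rate, 1/sec
E_B = 1.0     # barrier height in eV

def rate(kT):
    """Transition rate R = nu * exp(-E_B / kT), with kT in eV."""
    return NU * math.exp(-E_B / kT)

print(rate(0.05))    # kT at ~600 K: of order 1e3 per second
print(rate(0.025))   # kT at room temperature: of order 1e-6 per second
```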
The rate at which we cool the system down plays an essential role. If we cool faster, then the temperature at which transitions stop is higher. If we cool at a slower rate, then the temperature at which the transitions stop is lower. This is found to be the case for glass transitions, where the cooling rate determines the departure point from the equilibrium trajectory of the system, and the eventual properties of the glass are also determined by the cooling rate. Rapid cooling is called quenching. If we raise the temperature and lower it slowly, the procedure is called annealing.

Using the model two-state system we can simulate what would happen if we perform an experiment of cooling a system that becomes a glass. Fig. 1.4.2 shows the probability of being in the upper well as a function of the temperature as the system is cooled down. The curves depart from the equilibrium curve in the vicinity of a transition temperature we might call a freezing transition, because the kinetics become frozen. The glass transition is not a transition like a first- or second-order transition (Section 1.3.4), because it is a transition of the kinetics rather than of the equilibrium structure of the system. Below the freezing transition, the relative probability of the system being in the upper well is given approximately by the equilibrium probability at the transition.

The freezing transition of the relative population of the upper state and the lower state is only a simple model of the glass transition; however, it is also more widely applicable. The freezing does not depend on cooperative effects of many particles. To find examples, a natural place to look is the dynamics of individual atoms in solids. Potential energies with two wells occur for impurities, defects and even bulk atoms in a solid. Impurities may have two different local configurations that differ in energy and are separated by a barrier. This is a direct analog of our model two-state system. When the temperature is lowered, the relative population of the two configurations becomes frozen. If we raise the temperature, the system can equilibrate again.

It is also possible to artificially cause impurity configurations to have unequal energies. One way is to apply uniaxial stress to a crystal, squeezing it along one axis. If an impurity resides in a bond between two bulk atoms, applying stress will raise the energy of impurities in bonds oriented along the stress axis compared to bonds perpendicular to it. If we start at a relatively high temperature, apply stress and then cool down the material, we can freeze unequal populations of the impurity. If we have a way of measuring relaxation, then by raising the temperature gradually and observing when the defects begin to equilibrate, we can discover the barrier to relaxation. This is one of the few methods available to study the kinetics of impurity reorientation in solids.
The two-state system provides us with an example of how a simple system may not be able to equilibrate over experimental time scales. It also shows how an equilibrium ensemble can be used to treat relative probabilities within a subset of states. Because the motion within a particular well is fast, the relative probabilities of different positions or momenta within a well may be described using the Boltzmann probability. At the same time, the relative probability of finding a system in each of the two wells depends on the initial conditions and the history of the system: what temperatures the system experienced and for how long. At sufficiently low temperatures, this relative probability may be treated as fixed. Systems that are in the higher well may be assumed to stay there. At intermediate temperatures, a treatment of the dynamics of the transition between the two wells can (and must) be included. This manifests a violation of the ergodic theorem due to the divergence of the time scale
Figure 1.4.2 Plot of the fraction of the systems in the higher-energy well, P(1;t), as a function of temperature T. The equilibrium value P(1;∞) is shown with the dashed line. The solid lines show what happens when the system is cooled from a high temperature at a particular cooling rate. The example given uses E_1 − E_{−1} = 0.1 eV and E_B − E_{−1} = 1.0 eV. Both wells have oscillation frequencies of ν = 10^12/sec. The fastest cooling rate is 200°K/sec and each successive curve is cooled at a rate that is half as fast, with the slowest rate being 0.4°K/sec. For every cooling rate the system stops making transitions between the wells at a particular temperature, which is analogous to a glass transition in this system. Below this temperature the probability becomes essentially fixed.
for equilibration between the two wells. Thus we have identified many of the features that are necessary in describing nonequilibrium systems: divergent time scales, violation of the ergodic theorem, frozen and dynamic coordinates. We have illustrated a method for treating systems where there is a separation of long time scales and short time scales.

Question 1.4.4 Write a program that can generate the time dependence of the two-state system for a specified time history. Reproduce Fig. 1.4.2. For an additional "experiment," try the following quenching and annealing sequence:

a. Starting from a high enough temperature to be in equilibrium, cool the system at a rate of 10°K/sec down to T = 0.
b. Heat the system up to temperature T_a and keep it there for one second.
c. Cool the system back down to T = 0 at a rate of 100°K/sec.

Plot the results as a function of T_a. Describe and explain them in words.
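A minimal sketch of the cooling part of such a program is below. It uses the Fig. 1.4.2 parameters (E_1 − E_{−1} = 0.1 eV, E_B − E_{−1} = 1.0 eV, ν = 10^12/sec) with a single illustrative rate of 100°K/sec; the temperature step size is an arbitrary choice. At each small temperature step it relaxes P(1) toward its equilibrium value using the local relaxation rate 1/τ = R(1|−1) + R(−1|1).

```python
import math

KB = 8.617e-5            # Boltzmann constant in eV/K
NU = 1.0e12              # attempt frequency nu (1/sec), as in Fig. 1.4.2
E_UP = 1.0               # E_B - E_{-1} in eV: barrier seen from the lower well
E_DOWN = 0.9             # E_B - E_1 in eV: barrier seen from the upper well

def p_eq(T):
    """Equilibrium probability of the upper well, P(1;inf)."""
    e = math.exp(-(E_UP - E_DOWN) / (KB * T))  # underflows safely to 0 at low T
    return e / (1.0 + e)

def cool(T0, rate, dT=0.01):
    """Cool from equilibrium at T0 toward T = 0 at `rate` K/sec; return P(1)."""
    P, T = p_eq(T0), T0
    dt = dT / rate                  # time spent in each small temperature step
    while T > dT:
        r_up = NU * math.exp(-E_UP / (KB * T))      # R(1|-1)
        r_down = NU * math.exp(-E_DOWN / (KB * T))  # R(-1|1)
        # exact exponential relaxation toward p_eq(T) over dt;
        # (r_up + r_down) = 1/tau, and expm1 keeps the step stable for any dt/tau
        P += (p_eq(T) - P) * -math.expm1(-dt * (r_up + r_down))
        T -= dT
    return P

P_frozen = cool(T0=600.0, rate=100.0)
print(P_frozen)    # freezes near the equilibrium probability at roughly 400 K
```

Running the same loop for a list of cooling rates and recording P against T reproduces the family of curves in Fig. 1.4.2; the quench-and-anneal sequence of the question only requires adding a heating segment and a one-second hold at T_a.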
1.4.4 Diffusion

In this section we briefly consider a multiwell system. An example is illustrated in Fig. 1.4.3, where the potential well depths and barriers vary from site to site. A simpler case is found in Fig. 1.4.4, where all the well depths and barriers are the same. A concrete example would be an interstitial impurity in an ideal crystal. The impurity lives in a periodic energy that repeats every integral multiple of an elementary length a.

We can apply the same analysis from the previous section to describe what happens to a system that begins from a particular well at x = 0. Over time, the system makes transitions left and right at random, in a manner that is reminiscent of a random walk. We will see in a moment that the connection with the random walk is valid but requires some additional discussion.
Figure 1.4.3 Illustration of a multiple-well system with barrier heights and well depths that vary from site to site: the potential V(x) has well depths E_{−1}, E_0, E_1, E_2, … and barriers E_B(1|0), E_B(2|1), …. We focus on the uniform system in Fig. 1.4.4.
The probability of the system being in a particular well is changed by probability currents into the well and out from the well. Systems can move to or from the well immediately to their right and immediately to their left. The Master equation for the ith well in Fig. 1.4.3 is:

Ṗ(i;t) = R(i|i−1) P(i−1;t) + R(i|i+1) P(i+1;t) − (R(i+1|i) + R(i−1|i)) P(i;t)    (1.4.48)

R(i+1|i) = ν_i e^{−(E_B(i+1|i) − E_i)/kT}
R(i−1|i) = ν_i e^{−(E_B(i|i−1) − E_i)/kT}    (1.4.49)

where E_i is the depth of the ith well and E_B(i+1|i) is the barrier to its right. For the periodic system of Fig. 1.4.4 (ν_i → ν, E_i → E_0, E_B(i+1|i) → E_B) this simplifies to:

Ṗ(i;t) = R(P(i−1;t) + P(i+1;t) − 2P(i;t))    (1.4.50)

R = ν e^{−(E_B − E_0)/kT}    (1.4.51)

Since we are already describing a continuum differential equation in time, it is convenient to consider long times and write a continuum equation in space as well. Allowing a change in notation, we write

P(i;t) → P(x_i;t)    (1.4.52)

Introducing the elementary distance a between wells, we can rewrite Eq. (1.4.50) using:

(P(i−1;t) + P(i+1;t) − 2P(i;t))/a² → (P(x_i−a;t) + P(x_i+a;t) − 2P(x_i;t))/a² → ∂²P(x;t)/∂x²    (1.4.53)
Figure 1.4.4 When the barrier heights E_B and well depths E_0 are the same, as illustrated, the long-time behavior of this system is described by the diffusion equation. The evolution of the system is controlled by hopping events from one well to the other. The net effect over long times is the same as for the random walk discussed in Section 1.2.
where the last expression assumes a is small on the scale of interest. Thus the continuum version of Eq. (1.4.50) is the conventional diffusion equation:

Ṗ(x;t) = D ∂²P(x;t)/∂x²    (1.4.54)

The diffusion constant D is given by:

D = a²R = a²ν e^{−(E_B − E_0)/kT}    (1.4.55)

The solution of the diffusion equation, Eq. (1.4.54), depends on the initial conditions that are chosen. If we consider an ensemble of systems that start in one well and spread out over time, the solution, which can be checked by substitution, is the Gaussian distribution found for the random walk in Section 1.2:

P(x,t) = (1/√(4πDt)) e^{−x²/4Dt} = (1/(√(2π)σ)) e^{−x²/2σ²},   σ² = 2Dt    (1.4.56)

We see that motion in a set of uniform wells after a long time reduces to that of a random walk.

How does the similarity to the random walk arise? This might appear to be a natural result, since we showed that the Gaussian distribution is quite general using the central limit theorem. The scenario here, however, is quite different. The central limit theorem was proven in Section 1.2.2 for the case of a distribution of probabilities of steps taken at specific time intervals. Here we have a time continuum. Hopping events may happen at any time. Consider the case where we start from a particular well. Our differential equation describes a system that might hop to the next well at any time. A hop is an event, and we might concern ourselves with the distribution of such events in time. We have assumed that these events are uncorrelated. There are unphysical consequences of this assumption. For example, no matter how small an interval of time we choose, the particle has some probability of traveling arbitrarily far away. This is not necessarily a correct microscopic picture, but it is the continuum model we have developed.

There is a procedure to convert the event-controlled hopping motion between wells into a random walk that takes steps with a certain probability at specific time intervals. We must select a time interval. For this time interval, we evaluate the total probability that hops move a system from its original position to each of the possible positions of the system. This would give us the function f(s) in Eq. (1.2.34). As long as the mean square displacement is finite, the central limit theorem continues to apply to the probability distribution after a long enough time. The generality of the conclusion also implies that the result is more widely applicable than the assumptions indicate. However, there is a counterexample in Question 1.4.5.

Question 1.4.5 Discuss the case of a particle that is not in contact with a thermal reservoir moving in the multiple-well system (energy is conserved).
Solution 1.4.5 If the energy of the system is lower than E_B, the system stays in a single well, bouncing back and forth. A model that describes how transitions occur between wells would just say there are none.

For the case where the energy is larger than E_B, the system will move with a periodically varying velocity in one direction. There is a problem in selecting an ensemble to describe it. If we choose the ensemble with only one system moving in one direction, then it is described as a deterministic walk. This description is consistent with the motion of the system. However, we might also think to describe the system using an ensemble consisting of particles with the same energy. In this case it would be one particle moving to the right and one moving to the left. Taking an interval of time to be the time needed to move to the next well, we would find a transition probability of 1/2 to move to the right and the same to the left. This would lead to a conventional random walk and would give us an incorrect result for all later times.

This example illustrates the need for an assumption that has not yet been explicitly mentioned. The ensemble must describe systems that can make transitions to each other. Since the energy-conserving systems cannot switch directions, the ensemble cannot include both directions. It is enough, however, for there to be a small nonzero probability for the system to switch directions for the central limit theorem to apply. This means that over long enough times, the distribution will be Gaussian. Over short times, however, the probability distributions from the random walk model and from an almost ballistic system would not be very similar.
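The approach of the hopping dynamics to the random-walk Gaussian can also be seen numerically, by integrating the uniform master equation, Eq. (1.4.50), from a start concentrated in a single well and checking that the variance grows as σ² = 2Dt with D = a²R. In this sketch a = 1 and the rate and step sizes are arbitrary illustrative values.

```python
L = 201                  # number of wells, large enough that the edges stay empty
R = 1.0                  # hopping rate between neighboring wells
DT, STEPS = 0.01, 1000   # Euler time step and number of steps

P = [0.0] * L
P[L // 2] = 1.0          # start in a single well at x = 0

for _ in range(STEPS):
    # Euler update of Eq. (1.4.50): dP_i/dt = R (P_{i-1} + P_{i+1} - 2 P_i)
    P = [P[i] + DT * R * ((P[i - 1] if i > 0 else 0.0)
                          + (P[i + 1] if i < L - 1 else 0.0)
                          - 2.0 * P[i])
         for i in range(L)]

x = [i - L // 2 for i in range(L)]          # well positions in units of a
mean = sum(xi * p for xi, p in zip(x, P))
var = sum(xi * xi * p for xi, p in zip(x, P)) - mean ** 2
print(mean, var)     # mean stays 0; variance matches 2*D*t with D = a^2 R
```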
We can generalize the multiple-well picture to describe a biased random walk. The potential we would use is a "washboard potential," illustrated in Fig. 1.4.5. The Master equation is:

Ṗ(i;t) = R_+ P(i−1;t) + R_− P(i+1;t) − (R_+ + R_−) P(i;t)    (1.4.57)

R_+ = ν e^{−ΔE_+/kT}
R_− = ν e^{−ΔE_−/kT}    (1.4.58)

To obtain the continuum limit, replace i → x: P(i+1;t) → P(x+a,t) and P(i−1;t) → P(x−a,t), and expand in a Taylor series to second order in a to obtain:

Ṗ(x;t) = −v ∂P(x;t)/∂x + D ∂²P(x;t)/∂x²    (1.4.59)

v = a(R_+ − R_−)
D = a²(R_+ + R_−)/2    (1.4.60)
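The drift and diffusion coefficients of Eq. (1.4.60) can be checked the same way, by integrating the biased master equation, Eq. (1.4.57): the mean should advance as vt while the variance grows as 2Dt. The rates below are arbitrary illustrative values, with a = 1.

```python
L = 201                       # number of wells; edges remain effectively empty
R_PLUS, R_MINUS = 0.6, 0.4    # hopping rates to the right and to the left
DT, STEPS = 0.01, 1000
T = DT * STEPS

P = [0.0] * L
P[L // 2] = 1.0               # start in a single well at x = 0

for _ in range(STEPS):
    # Euler update of Eq. (1.4.57):
    # dP_i/dt = R_+ P_{i-1} + R_- P_{i+1} - (R_+ + R_-) P_i
    P = [P[i] + DT * (R_PLUS * (P[i - 1] if i > 0 else 0.0)
                      + R_MINUS * (P[i + 1] if i < L - 1 else 0.0)
                      - (R_PLUS + R_MINUS) * P[i])
         for i in range(L)]

x = [i - L // 2 for i in range(L)]
mean = sum(xi * p for xi, p in zip(x, P))
var = sum(xi * xi * p for xi, p in zip(x, P)) - mean ** 2

v = R_PLUS - R_MINUS               # drift velocity, Eq. (1.4.60) with a = 1
D = (R_PLUS + R_MINUS) / 2.0       # diffusion constant, Eq. (1.4.60)
print(mean, var)                   # compare with v*T and 2*D*T
```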
The solution is a moving Gaussian:

P(x,t) = (1/√(4πDt)) e^{−(x−vt)²/4Dt} = (1/(√(2π)σ)) e^{−(x−vt)²/2σ²},   σ² = 2Dt    (1.4.61)

Since the description of diffusive motion always allows the system to stay where it is, there is a limit to the degree of bias that can occur in the random walk. For this limit set R_− = 0. Then D = av/2, and the spreading of the probability is given by σ = √(avt). This shows that unlike the biased random walk in Section 1.2, diffusive motion on a washboard with a given spacing a cannot describe ballistic or deterministic motion in a single direction.

1.5 Cellular Automata

The first four sections of this chapter were dedicated to systems in which the existence of many parameters (degrees of freedom) describing the system is hidden in one way or another. In this section we begin to describe systems where many degrees of freedom are explicitly represented. Cellular automata (CA) form a general class of models of dynamical systems which are appealingly simple and yet capture a rich variety of behavior. This has made them a favorite tool for studying the generic behavior of, and for modeling, complex dynamical systems. Historically, CA are also intimately related to the development of concepts of computers and computation. This connection continues to be a theme often found in discussions of CA. Moreover, despite the wide differences between CA and conventional computer architectures, CA are convenient for
Figure 1.4.5 The biased random walk is also found in a multiple-well system when the illustrated washboard potential is used, with barriers ΔE_+ to the right and ΔE_− to the left. The velocity of the system is given by the difference in hopping rates to the right and to the left.
computer simulations in general and parallel computer simulations in particular. Thus CA have gained importance with the increasing use of simulations in the development of our understanding of complex systems and their behavior.
1.5.1 Deterministic cellular automata

The concept of cellular automata begins from the concept of space and the locality of influence. We assume that the system we would like to represent is distributed in space, and that nearby regions of space have more to do with each other than regions far apart. The idea that regions nearby have greater influence upon each other is often associated with a limit (such as the speed of light) to how fast information about what is happening in one place can move to another place.*

Once we have a system spread out in space, we mark off the space into cells. We then use a set of variables to describe what is happening at a given instant of time in a particular cell:

s(i, j, k; t) = s(x_i, y_j, z_k; t)    (1.5.1)

where i, j, k are integers (i, j, k ∈ Z), and this notation is for a three-dimensional space (3-d). We can also describe automata in one or two dimensions (1-d or 2-d) or in more than three dimensions. The time dependence of the cell variables is given by an iterative rule:

s(i, j, k; t) = R({s(i′ − i, j′ − j, k′ − k; t − 1)}_{i′, j′, k′ ∈ Z})    (1.5.2)

where the rule R is shown as a function of the values of all the variables at the previous time, at positions relative to that of the cell s(i, j, k; t − 1). The rule is assumed to be the same everywhere in the space; there is no space index on the rule. Differences between what is happening at different locations in the space are due only to the values of the variables, not the update rule. The rule is also homogeneous in time; i.e., the rule is the same at different times.

The locality of the rule shows up in the form of the rule. It is assumed to give the value of a particular cell variable at the next time only in terms of the values of cells in the vicinity of the cell at the previous time. The set of these cells is known as its neighborhood. For example, the rule might depend only on the values of twenty-seven cells in a cube centered on the location of the cell itself. The indices of these cells are obtained by independently incrementing or decrementing once, or leaving the same, each of the indices:

s(i, j, k; t) = R(s(i ± 1,0; j ± 1,0; k ± 1,0; t − 1))    (1.5.3)
*These assumptions are both reasonable and valid for many systems. However, there are systems where this is not the most natural set of assumptions. For example, when there are widely divergent speeds of propagation of different quantities (e.g., light and sound) it may be convenient to represent one as instantaneous (light) and the other as propagating (sound). On a fundamental level, Einstein, Podolsky and Rosen carefully formulated the simple assumptions of local influence and found that quantum mechanics violates these simple assumptions. A complete understanding of the nature of their paradox has yet to be reached.
where the infor mal notation i t 1,0 is the set {i − 1,i,i + 1}. In this case there are a to
tal of twentyseven cells upon which the update rule R(s) depends. The neighbor hood
could be smaller or larger than this example.
CA can be usefully simplified to the point where each cell is a single binary var i
able. As usual, the binary variable may use the notation {0,1}, {−1,1}, {ON,OFF} o r
{↑,↓}. The ter minology is often suggested by the system to be described. Two 1d ex
amples are given in Question 1.5.1 and Fig. 1.5.1. For these 1d cases we can show the
time evolution of a CA in a single figure, where the time axis runs vert ically down the
page and the horizontal axis is the space axis.Each figure is a CA spacetime diagr am
that illust rates a par ticular histor y.
In these examples, a finite space is used rather than an infinite space. We can define various boundary conditions at the edges. The most common is to use a periodic boundary condition where the space wraps around to itself. The one-dimensional examples can be described as circles. A two-dimensional example would be a torus and a three-dimensional example would be a generalized torus. Periodic boundary conditions are convenient, because there is no special position in the space. Some care must be taken in considering the boundary conditions even in this case, because there are rules where the behavior depends on the size of the space. Another standard kind of boundary condition arises from setting all of the values of the variables outside the finite space of interest to a particular value such as 0.
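These two kinds of boundary condition are simple to express in code. A minimal sketch in Python (the helper name `neighbor` and its arguments are ours, for illustration only):

```python
def neighbor(s, i, boundary="periodic"):
    """Return the cell value at index i of a 1-d CA state s,
    applying the chosen boundary condition for out-of-range i."""
    N = len(s)
    if boundary == "periodic":      # the space wraps around into a circle
        return s[i % N]
    if boundary == "fixed":         # cells outside the space are held at 0
        return s[i] if 0 <= i < N else 0
    raise ValueError(boundary)

# With periodic boundaries no position in the space is special:
s = [1, 0, 0, 1]
print(neighbor(s, -1))              # left neighbor of cell 0 is cell 3 -> 1
print(neighbor(s, -1, "fixed"))     # outside the space -> 0
```

Update rules can then be written once and used with either boundary convention.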
Question 1.5.1 Fill in the evolution of the two rules of Fig. 1.5.1. The first CA (Fig. 1.5.1(a)) is the majority rule that sets a cell to the majority of the three cells consisting of itself and its two neighbors at the previous time. This can be written using s(i; t) = ±1 as:

s(i; t + 1) = sign(s(i − 1; t) + s(i; t) + s(i + 1; t)) (1.5.4)

In the figure {−1, +1} are represented by {↑, ↓} respectively.

The second CA (Fig. 1.5.1(b)), called the mod2 rule, is obtained by setting the ith cell to be OFF if the number of ON squares in the neighborhood is even, and ON if this number is odd. To write this in a simple form use s(i; t) = {0, 1}. Then:

s(i; t + 1) = mod2(s(i − 1; t) + s(i; t) + s(i + 1; t)) (1.5.5)
Solution 1.5.1 Notes:
1. The first rule (a) becomes trivial almost immediately, since it achieves a fixed state after only two updates. Many CA, as well as many physical systems on a macroscopic scale, behave this way.
2. Be careful about the boundary conditions when updating the rules, particularly for rule (b).
3. The second rule (b) goes through a sequence of states very different from each other. Surprisingly, it will recover the initial configuration after eight updates.
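The two rules of Question 1.5.1 can be simulated directly. A sketch in Python with periodic boundaries, using {−1,+1} for the majority rule (Eq. (1.5.4)) and {0,1} for the mod2 rule (Eq. (1.5.5)); the function names are ours:

```python
def majority_step(s):
    # s is a list of -1/+1 values; each cell becomes the sign of the sum
    # of itself and its two neighbors (Eq. 1.5.4), periodic boundaries
    N = len(s)
    return [1 if s[(i - 1) % N] + s[i] + s[(i + 1) % N] > 0 else -1
            for i in range(N)]

def mod2_step(s):
    # s is a list of 0/1 values; sum of the three-cell neighborhood
    # modulo 2 (Eq. 1.5.5), periodic boundaries
    N = len(s)
    return [(s[(i - 1) % N] + s[i] + s[(i + 1) % N]) % 2 for i in range(N)]

# A sixteen-cell state recurs after eight mod-2 updates (Question 1.5.2):
s0 = [1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0]
s = s0
for _ in range(8):
    s = mod2_step(s)
print(s == s0)   # True
```

Running `majority_step` repeatedly from a random state shows the rapid freezing described in note 1.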
[Figure 1.5.1: two space-time grids, one for each rule. Each grid shows the rule icon, the initial conditions at t = 0, the first time step at t = 1, and empty rows t = 2, ..., 9 to be filled in. The two sixteen-cell rows preserved from the figure are 1 0 1 1 0 0 0 1 0 1 0 0 1 1 0 0 and 1 0 0 0 1 0 1 1 0 1 1 1 0 0 1 1.]
Figure 1.5.1 Two examples of one-dimensional (1-d) cellular automata. The top row in each case gives the initial conditions. The value of a cell at a particular time is given by a rule that depends on the values of the cells in its neighborhood at the previous time. For these rules the neighborhood consists of three cells: the cell itself and the two cells on either side. The first time step is shown below the initial conditions for (a) the majority rule, where each cell is equal to the value of the majority of the cells in its neighborhood at the previous time, and (b) the mod2 rule, which sums the values of the cells in the neighborhood modulo two to obtain the value of the cell at the next time. The rules are written in Question 1.5.1. The rest of the time steps are to be filled in as part of this question.
Question 1.5.2 The evolution of the mod2 rule is periodic in time. After eight updates, the initial state of the system is recovered in Fig. 1.5.1(b). Because the state of the system at a particular time determines uniquely the state at every succeeding time, this is an 8-cycle that will repeat itself. There are sixteen cells in the space shown in Fig. 1.5.1(b). Is the number of cells connected with the length of the cycle? Try a space that has eight cells (Fig. 1.5.2(a)).
Solution 1.5.2 For a space with eight cells, the maximum length of a cycle is four. We could also use an initial condition that has a space periodicity of four in a space with eight cells (Fig. 1.5.2(b)). Then the cycle length would only be two. From these examples we see that the mod2 rule returns to the initial value after a time that depends upon the size of the space. More precisely, it depends on the periodicity of the initial conditions. The time periodicity (cycle length) for these examples is simply related to the space periodicity.
Question 1.5.3 Look at the mod2 rule in a space with six cells (Fig. 1.5.2(c)) and in a space with five cells (Fig. 1.5.2(d)). What can you conclude from these trials?

Solution 1.5.3 The mod2 rule can behave quite differently depending on the periodicity of the space it is in. The examples in Questions 1.5.1 and 1.5.2 considered only spaces with a periodicity given by 2^k for some k. The new examples in this question show that the evolution of the rule may lead to a fixed point much like the majority rule. More than one initial condition leads to the same fixed point. Both the example shown and the fixed point itself do. Systematic analyses of the cycles and fixed points (cycles of period one) for this and other rules of this type, and various boundary conditions, have been performed.
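The dependence of the cycle length on the size of the space and on the initial conditions can be checked numerically. A sketch (the update function repeats Eq. (1.5.5); the helper names are ours):

```python
def mod2_step(s):
    # Eq. (1.5.5) with periodic boundaries; states are tuples so they
    # can be used as dictionary keys
    N = len(s)
    return tuple((s[(i - 1) % N] + s[i] + s[(i + 1) % N]) % 2 for i in range(N))

def cycle_and_transient(s0):
    """Iterate the mod-2 rule until a state repeats; return the number of
    steps before the cycle is entered and the cycle length."""
    seen = {s0: 0}
    s, t = s0, 0
    while True:
        s, t = mod2_step(s), t + 1
        if s in seen:
            return seen[s], t - seen[s]
        seen[s] = t

# Eight cells: a single ON cell cycles with period four.
print(cycle_and_transient((1, 0, 0, 0, 0, 0, 0, 0)))   # (0, 4)
# Six cells (divisible by three): a fixed point is reached after two steps.
print(cycle_and_transient((1, 0, 0, 0, 0, 0)))         # (2, 1)
```

A spatial periodicity of four in the eight-cell space reduces the cycle length to two, in agreement with Solution 1.5.2.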
The choice of initial conditions is an important aspect of the operation of many CA. Computer investigations of CA often begin by assuming a “seed” consisting of a single cell with the value +1 (a single ON cell) and all the rest −1 (OFF). Alternatively, the initial conditions may be chosen to be random: s(i, j, k; 0) = ±1 with equal probability. The behavior of the system with a particular initial condition may be assumed to be generic, or some quantity may be averaged over different choices of initial conditions.
Like the iterative maps we considered in Section 1.1, the CA dynamics may be described in terms of cycles and attractors. As long as we consider only binary variables and a finite space, the dynamics must repeat itself after no more than a number of steps equal to the number of possible states of the system. This number grows exponentially with the size of the space. There are 2^N states of the system when there are a total of N cells. For 100 cells the length of the longest possible cycle would be of order 10^30. To consider such a long time for a small space may seem an unusual model of space-time. For most analogies of CA with physical systems, this model of space-time is not the most appropriate. We might restrict the notion of cycles to apply only when their length does not grow exponentially with the size of the system.
Rules can be distinguished from each other and classified according to a variety of features they may possess. For example, some rules are reversible and others are not. Any reversible rule takes each state onto a unique successor. Otherwise it would be impossible to construct a single-valued inverse mapping. Even when a rule is reversible, it is not guaranteed that the inverse rule is itself a CA, since it may not depend only on the local values of the variables. An example is given in Question 1.5.5.
[Figure 1.5.2: four space-time grids, each showing the rule icon, the initial conditions at t = 0, the first time step, and empty rows t = 2, ..., 9 to be filled in. The rows preserved from the figure are 0 1 1 1 0 0 0 1 and 0 0 1 0 1 0 1 1 for (a), and 0 1 1 1 0 1 1 1 and 0 0 1 0 0 0 1 0 for (b); the rows for (c) and (d) are only partially recoverable (fragments 0 1 1 1 0 / 0 0 1 0 1 and 0 1 1 1 / 0 0 1 0).]
Figure 1.5.2 Four additional examples for the mod2 rule that have different initial conditions with specific periodicity: (a) is periodic in 8 cells, (b) is periodic in 4 cells, though it is shown embedded in a space of periodicity 8, (c) is periodic in 6 cells, (d) is periodic in 5 cells. By filling in the spaces it is possible to learn about the effect of different periodicities on the iterative properties of the mod2 rule. In particular, the length of the repeat time (cycle length) depends on the spatial periodicity. The cycle length may also depend on the specific initial conditions.
Question 1.5.4 Which, if any, of the two rules in Fig. 1.5.1 is reversible?

Solution 1.5.4 The majority rule is not reversible, because locally we cannot identify in the next time step the difference between sequences that contain (11111) and (11011), since both result in a middle three of (111).

A discussion of the mod2 rule is more involved, since we must take into consideration the size of the space. In the examples of Questions 1.5.1–1.5.3 we see that in the space of six cells the rule is not reversible. In this case several initial conditions lead to the same result. The other examples all appear to be reversible, since each initial condition is part of a cycle that can be run backward to invert the rule. It turns out to be possible to construct explicitly the inverse of the mod2 rule. This is done in Question 1.5.5.
Extra Credit Question 1.5.5 Find the inverse of the mod2 rule, when this is possible. This question involves some careful algebraic manipulation and may be skipped.
Solution 1.5.5 To find the inverse of the mod2 rule, it is useful to recall that equality modulo 2 satisfies simple addition properties including:

s1 = s2 ⇒ s1 + s = s2 + s mod 2 (1.5.6)

as well as the special property:

2s = 0 mod 2 (1.5.7)

Together these imply that variables may be moved from one side of the equality to the other:

s1 + s = s2 ⇒ s1 = s2 + s mod 2 (1.5.8)

Our task is to find the value of all s(i; t) from the values of s(j; t + 1) that are assumed known. Using Eq. (1.5.8), the mod2 update rule (Eq. (1.5.5))

s(i; t + 1) = (s(i − 1; t) + s(i; t) + s(i + 1; t)) mod 2 (1.5.9)

can be rewritten to give us the value of a cell in a layer in terms of the next layer and its own neighbors:

s(i − 1; t) = s(i; t + 1) + s(i; t) + s(i + 1; t) mod 2 (1.5.10)

Substitute the same equation for the second term on the right (using one higher index) to obtain

s(i − 1; t) = s(i; t + 1) + [s(i + 1; t + 1) + s(i + 1; t) + s(i + 2; t)] + s(i + 1; t) mod 2 (1.5.11)

The last term cancels against the middle term of the brackets and we have:

s(i − 1; t) = s(i; t + 1) + s(i + 1; t + 1) + s(i + 2; t) mod 2 (1.5.12)

It is convenient to rewrite this with one higher index:

s(i; t) = s(i + 1; t + 1) + s(i + 2; t + 1) + s(i + 3; t) mod 2 (1.5.13)
Interestingly, this is actually the solution we have been looking for, though some discussion is necessary to show this. On the right side of the equation appear three cell values. Two of them are from the time t + 1, and one from the time t that we are trying to reconstruct. Since the two cell values from t + 1 are assumed known, we must know only s(i + 3; t) in order to obtain s(i; t). We can iterate this expression and see that instead we need to know s(i + 6; t) as follows:

s(i; t) = s(i + 1; t + 1) + s(i + 2; t + 1) + s(i + 4; t + 1) + s(i + 5; t + 1) + s(i + 6; t) mod 2 (1.5.14)
There are two possible cases that we must deal with at this point. The first is that the number of cells is divisible by three, and the second is that it is not. If the number of cells N is divisible by three, then after iterating Eq. (1.5.13) a total of N/3 times we will have an expression that looks like

s(i; t) = s(i + 1; t + 1) + s(i + 2; t + 1) + s(i + 4; t + 1) + s(i + 5; t + 1) + . . . + s(i + N − 2; t + 1) + s(i + N − 1; t + 1) + s(i; t) mod 2 (1.5.15)

where we have used the property of the periodic boundary conditions to set s(i + N; t) = s(i; t). We can cancel this value from both sides of the equation. What is left is an equation that states that the sum over particular values of the cell variables at time t + 1 must be zero:

0 = s(i + 1; t + 1) + s(i + 2; t + 1) + s(i + 4; t + 1) + s(i + 5; t + 1) + . . . + s(i + N − 2; t + 1) + s(i + N − 1; t + 1) mod 2 (1.5.16)

This means that any set of cell values that is the result of a mod2 rule update must satisfy this condition. Consequently, not all possible sets of cell values can be a result of mod2 updates. Thus the rule is not one-to-one and is not invertible when N is divisible by three.
When N is not divisible by three, this problem does not arise, because we must go around the cell ring three times before we get back to s(i; t). In this case, the analogous equation to Eq. (1.5.16) would have every cell value appearing exactly twice on the right of the equation. This is because each cell appears in two out of the three travels around the ring. Since the cell values all appear twice, they cancel, and the equation is the tautology 0 = 0. Thus in this case there is no restriction on the result of the mod2 rule.
We almost have a full procedure for reconstructing s(i; t). Choose the value of one particular cell variable, say s(1; t) = 0. From Eq. (1.5.13), obtain in sequence each of the cell variables s(N − 2; t), s(N − 5; t), . . . By going
around the ring three times we can find uniquely all of the values. We now have to decide whether our original choice was correct. This can be done by directly applying the mod2 rule to find the value of, say, s(1; t + 1). If we obtain the right value, then we have the right choice; if the wrong value, then all we have to do is switch all of the cell values to their opposites. How do we know this is correct?

There was only one other possible choice for the value of s(1; t) = 1. If we were to choose this case we would find that each cell value was the opposite, or one's complement, 1 − s(i; t), of the value we found. This can be seen from Eq. (1.5.13). Moreover, the mod2 rule preserves complementation, which means that if we complement all of the values of s(i; t) we will find the complements of the values of s(i; t + 1). The proof is direct:

1 − s(i; t + 1) = 1 − (s(i − 1; t) + s(i; t) + s(i + 1; t))
= (1 − s(i − 1; t)) + (1 − s(i; t)) + (1 − s(i + 1; t)) − 2 mod 2 (1.5.17)
= (1 − s(i − 1; t)) + (1 − s(i; t)) + (1 − s(i + 1; t))

Thus we can find the unique predecessor for the cell values s(i; t + 1). With some care it is possible to write down a fully algebraic expression for the value of s(i; t) by implementing this procedure algebraically. The result for
N = 3k + 1 is:

s(i; t) = s(i; t + 1) + Σ_{j=1}^{(N−1)/3} ( s(i + 3j − 2; t + 1) + s(i + 3j; t + 1) ) mod 2 (1.5.18)
A similar result for N = 3k + 2 can also be found.

Note that the inverse of the mod2 rule is not a CA because it is not a local rule.
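The conclusion that the mod2 rule is one-to-one exactly when N is not divisible by three can be checked by brute force for small spaces. A sketch (the function names are ours):

```python
from itertools import product

def mod2_step(s):
    # Eq. (1.5.5) with periodic boundaries, on tuples of 0/1 values
    N = len(s)
    return tuple((s[(i - 1) % N] + s[i] + s[(i + 1) % N]) % 2 for i in range(N))

def is_invertible(N):
    """The rule is one-to-one iff all 2^N states have distinct successors."""
    images = {mod2_step(s) for s in product((0, 1), repeat=N)}
    return len(images) == 2 ** N

print(is_invertible(5))   # True  (5 is not divisible by 3)
print(is_invertible(6))   # False (6 is divisible by 3)
```

The enumeration grows as 2^N, so this check is only practical for small spaces, which is exactly where it is useful as a sanity check on the algebra above.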
One of the interesting ways to classify CA—introduced by Wolfram—separates them into four classes depending on the nature of their limiting behavior. This scheme is particularly interesting for us, since it begins to identify the concept of complex behavior, which we will address more fully in a later chapter. The notion of complex behavior in a spatially distributed system is at least in part distinct from the concept of chaotic behavior that we have discussed previously. Specifically, the classification scheme is:

Class-one CA: evolve to a fixed homogeneous state
Class-two CA: evolve to fixed inhomogeneous states or cycles
Class-three CA: evolve to chaotic or aperiodic behavior
Class-four CA: evolve to complex localized structures

One example of each class is given in Fig. 1.5.3. It is assumed that the length of the cycles in class-two automata does not grow as the size of the space increases. This classification scheme has not yet found a firm foundation in analytical work and is supported largely by observation of simulations of various CA.
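The four totalistic rules of Fig. 1.5.3 can be written compactly. A sketch in Python with periodic boundaries (the ON sums are taken from the figure caption; the function and variable names are ours):

```python
import random

def totalistic_step(s, on_sums):
    # Five-cell totalistic rule: a cell is ON at time t exactly when the
    # sum over its five-cell neighborhood at time t-1 lies in on_sums
    N = len(s)
    return [1 if sum(s[(i + d) % N] for d in (-2, -1, 0, 1, 2)) in on_sums
            else 0 for i in range(N)]

# The ON sums for panels (a)-(d) of Fig. 1.5.3:
rules = {"a": {2}, "b": {3}, "c": {1, 2}, "d": {2, 4}}

random.seed(0)
s = [random.randint(0, 1) for _ in range(100)]   # random initial conditions
for _ in range(100):                              # 100 updates, as in the figure
    s = totalistic_step(s, rules["c"])
```

Printing each row of `s` as filled and empty characters reproduces space-time diagrams of the kind shown in the figure.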
Figure 1.5.3 Illustration of four CA update rules with random initial conditions in a periodic space with a period of 100 cells. The initial conditions are shown at the top and time proceeds downward. Each is updated for 100 steps. ON cells are indicated as filled squares; OFF cells are not shown. Each of the rules gives the value of a cell in terms of a neighborhood of five cells at the previous time. The neighborhood consists of the cell itself and the two cells to the left and to the right. The rules are known as “totalistic” rules since they depend only on the sum of the variables in the neighborhood. Using the notation s_i = 0, 1, the rules may be represented using

σ_i(t) = s_{i−2}(t − 1) + s_{i−1}(t − 1) + s_i(t − 1) + s_{i+1}(t − 1) + s_{i+2}(t − 1)

by specifying the values of σ_i(t) for which s_i(t) is ON. These are (a) only σ_i(t) = 2, (b) only σ_i(t) = 3, (c) σ_i(t) = 1 and 2, and (d) σ_i(t) = 2 and 4. See paper 1.3 in Wolfram's collection of articles on CA.
It has been suggested that class-four automata have properties that enable them to be used as computers. Or, more precisely, to simulate a computer by setting the initial conditions to a set of data representing both the program and the input to the program. The result of the computation is to be obtained by looking some time later at the state of the system. A criterion that is clearly necessary for an automaton to be able to act as a computer is that the result of the dynamics is sensitive to the initial conditions. We will discuss the topic of computation further in Section 1.8.

The flip side of the use of a CA as a model of computation is to design a computer that will simulate CA with high efficiency. Such machines have been built, and are called cellular automaton machines (CAMs).
1.5.2 2-d cellular automata

Two- and three-dimensional CA provide more opportunities for contact with physical systems. We illustrate by describing an example of a 2-d CA that might serve as a simple model of droplet growth during condensation. The rule, illustrated in part pictorially in Fig. 1.5.4, may be described by saying that a particular cell with four or
Figure 1.5.4 Illustration of a 2-d CA that may be thought of as a simple model of droplet condensation. The rule sets a cell to be ON (condensed) if four or more of its neighbors are condensed in the previous time, and OFF (uncondensed) otherwise. There are a total of 2^9 = 512 possible initial configurations; of these only 10 are shown. The ones on the left have 4 or more cells condensed and the ones on the right have fewer than 4 condensed. This rule is explained further by Fig. 1.5.6 and simulated in Fig. 1.5.5.
more “condensed” neighbors at time t is condensed at time t + 1. Neighbors are counted from the 3 × 3 square region surrounding the cell, including the cell itself.

Fig. 1.5.5 shows a simulation of this rule starting from a random initial starting point of approximately 25% condensed (ON) and 75% uncondensed (OFF) cells. Over the first few updates, the random arrangement of dots resolves into droplets, where isolated condensed cells disappear and regions of higher density become the droplets. Then over a longer time, the droplets grow and reach a stable configuration.
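The condensation rule described above can be sketched as follows (plain Python with periodic boundaries; the function and variable names are ours):

```python
import random

def condense_step(grid):
    # grid[i][j] is 1 (condensed) or 0 (uncondensed); a cell condenses when
    # at least four of the nine cells in its 3 x 3 neighborhood, itself
    # included, are condensed; periodic boundaries in both directions
    n = len(grid)
    return [[1 if sum(grid[(i + di) % n][(j + dj) % n]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)) >= 4 else 0
             for j in range(n)]
            for i in range(n)]

random.seed(1)
grid = [[1 if random.random() < 0.25 else 0 for _ in range(64)]
        for _ in range(64)]                      # ~25% condensed initially
for _ in range(60):                              # droplets form and stabilize
    grid = condense_step(grid)
```

An isolated condensed cell disappears in one step, while a 2 × 2 condensed block is already a stable (convex) droplet, in agreement with the boundary analysis below.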
The characteristics of this rule may be understood by considering the properties of boundaries between condensed and uncondensed regions, as shown in Fig. 1.5.6. Boundaries that are vertical, horizontal or at a 45° diagonal are stable. Other boundaries will move, increasing the size of the condensed region. Moreover, a concave corner of stable edges is not stable. It will grow to increase the condensed region. On the other hand, a convex corner is stable. This means that convex droplets are stable when they are formed of the stable edges.

It can be shown that for this size space, the 25% initial filling is a transition density, where sometimes the result will fill the space and sometimes it will not. For higher densities, the system almost always reaches an end point where the whole space is condensed. For lower densities, the system almost always reaches a stable set of droplets.
This example illustrates an important point about the dynamics of many systems, which is the existence of phase transitions in the kinetics of the system. Such phase transitions are similar in some ways to the thermodynamic phase transitions that describe the equilibrium state of a system changing from, for example, a solid to a liquid. The kinetic phase transitions may arise from the choice of initial conditions, as they did in this example. Alternatively, the phase transition may occur when we consider the behavior of a class of CA as a function of a parameter. The parameter gradually changes the local kinetics of the system; however, measures of its behavior may change abruptly at a particular value. Such transitions are also common in CA when the outcome of a particular update is not deterministic but stochastic, as discussed in Section 1.5.4.
1.5.3 Conway's Game of Life

One of the most popular CA is known as Conway's Game of Life. Conceptually, it is designed to capture in a simple way the reproduction and death of biological organisms. It is based on a model where, locally, if there are too few organisms or too many organisms the organisms will disappear. On the other hand, if the number of organisms is just right, they will multiply. Quite surprisingly, the model takes on a life of its own with a rich dynamical behavior that is best understood by direct observation.

The specific rule is defined in terms of the 3 × 3 neighborhood that was used in the last section. The rule, illustrated in Fig. 1.5.7, specifies that when there are fewer than three or more than four ON (populated) cells in the neighborhood, the central cell will be OFF (unpopulated) at the next time. If there are three ON cells, the central cell will be ON at the next time. If there are four ON cells, then the central cell will keep its previous state—ON if it was ON and OFF if it was OFF.
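The rule as stated, counting the cell itself within its 3 × 3 neighborhood, can be sketched directly; a minimal Python version with periodic boundaries (the function names are ours):

```python
def life_step(grid):
    # total counts the ON cells in the full 3 x 3 neighborhood,
    # the central cell included, with periodic boundaries
    n, m = len(grid), len(grid[0])
    new = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = sum(grid[(i + di) % n][(j + dj) % m]
                        for di in (-1, 0, 1) for dj in (-1, 0, 1))
            if total == 3:
                new[i][j] = 1            # exactly three: ON
            elif total == 4:
                new[i][j] = grid[i][j]   # exactly four: keep previous state
            else:
                new[i][j] = 0            # fewer than three or more than four: OFF
    return new

# The horizontal bar of three ON cells mentioned below oscillates between
# horizontal and vertical with period two:
g = [[0] * 5 for _ in range(5)]
g[2][1] = g[2][2] = g[2][3] = 1
g2 = life_step(life_step(g))
print(g2 == g)   # True
```

This counting convention is equivalent to the usual statement of Life in terms of the eight surrounding neighbors (birth on three, survival on two or three).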
Figure 1.5.5 Simulation of the condensation CA described in Fig. 1.5.4. The initial conditions are chosen by setting randomly each site ON with a probability of 1 in 4. The initial few steps result in isolated ON sites disappearing and small ragged droplets of ON sites forming in higher-density regions. The droplets grow and smoothen their boundaries until at the sixtieth frame a static arrangement of convex droplets is reached. The first few steps are shown on the first page. Every tenth step is shown on the second page up to the sixtieth.
Figure 1.5.5 Continued. The initial occupation probability of 1 in 4 is near a phase transition in the kinetics of this model for a space of this size. For slightly higher densities the final configuration consists of a droplet covering the whole space. For slightly lower densities the final configuration is of isolated droplets. At a probability of 1 in 4 either may occur depending on the specific initial state.
Figure 1.5.6 The droplet condensation model of Fig. 1.5.4 may be understood by noting that certain boundaries between condensed and uncondensed regions are stable. A completely stable shape is illustrated in the upper left. It is composed of boundaries that are horizontal, vertical or diagonal at 45°. A boundary that is at a different angle, such as shown on the upper right, will move, causing the droplet to grow. On a longer length scale a stable shape (droplet) is illustrated in the bottom figure. A simulation of this rule starting from a random initial condition is shown in Fig. 1.5.5.
Fig. 1.5.8 shows a simulation of the rule starting from the same initial conditions used for the condensation rule in the last section. Three sequential frames are shown, then after 100 steps an additional three frames are shown. Frames are also shown after 200 and 300 steps. After this amount of time the rule still has dynamic activity from frame to frame in some regions of the system, while others are apparently static or undergo simple cyclic behavior. An example of cyclic behavior may be seen in several places where there are horizontal bars of three ON cells that switch every time step between horizontal and vertical. There are many more complex local structures that repeat cyclically with much longer repeat cycles. Moreover, there are special structures called gliders that translate in space as they cycle through a set of configurations. The simplest glider is shown in Fig. 1.5.9, along with a structure called a glider gun, which creates them periodically.

We can make a connection between Conway's Game of Life and the quadratic iterative map considered in Section 1.1. The rich behavior of the iterative map was found because, for low values of the variable the iteration would increase its value, while for
Figure 1.5.7 The CA rule Conway's Game of Life is illustrated for a few cases. When there are fewer than three or more than four neighbors in the 3 × 3 region the central cell is OFF in the next step. When there are three neighbors the central cell is ON in the next step. When there are four neighbors the central cell retains its current value in the next step. This rule was designed to capture some ideas about biological organism reproduction and death, where too few organisms would lead to disappearance because of lack of reproduction and too many would lead to overpopulation and death due to exhaustion of resources. The rule is simulated in Figs. 1.5.8 and 1.5.9.
Figure 1.5.8 Simulation of Conway's Game of Life starting from the same initial conditions as used in Fig. 1.5.5 for the condensation rule, where 1 in 4 cells are ON. Unlike the condensation rule there remains an active step-by-step evolution of the population of ON cells for many cycles. Illustrated are the three initial steps, and three successive steps each starting at steps 100, 200 and 300.
Figure 1.5.8 Continued. After the initial activity that occurs everywhere, the pattern of activity consists of regions that are active and regions that are static or have short cyclical activity. However, the active regions move over time around the whole space leading to changes everywhere. Eventually, after a longer time than illustrated here, the whole space becomes either static or has short cyclical activity. The time taken to relax to this state increases with the size of the space.
Figure 1.5.9 Special initial conditions simulated using Conway's Game of Life result in structures of ON cells called gliders that travel in space while progressing cyclically through a set of configurations. Several of the simplest type of gliders are shown moving toward the lower right. The more complex set of ON cells on the left, bounded by a 2 × 2 square of ON cells on top and bottom, is a glider gun. The glider gun cycles through 30 configurations during which a single glider is emitted. The stream of gliders moving to the lower right resulted from the activity of the glider gun.
high values the iteration would decrease its value. Conway's Game of Life and other CA that exhibit interesting behavior also contain similar nonlinear feedback. Moreover, the spatial arrangement and coupling of the cells gives rise to a variety of new behaviors.
1 . 5 . 4 St ocha st ic cellula r a ut oma t a
In addition to the deterministic automaton of Eq. (1.5.3), we can define a stochastic automaton by the probabilities of transition from one state of the system to another:

P({s(i,j,k;t)} | {s(i,j,k;t−1)})  (1.5.19)

This general stochastic rule for the 2^N states of the system may be simplified. We have assumed for the deterministic rule that the rule for updating one cell may be performed independently of others. The analog for the stochastic rule is that the update probabilities for each of the cells are independent. If this is the case, then the total probability may be written as the product of the probabilities of each cell value. Moreover, if the rule is local, the probability for the update of a particular cell will depend only on the values of the cell variables in the neighborhood of the cell we are considering:

P({s(i,j,k;t)} | {s(i,j,k;t−1)}) = ∏_{i,j,k} P_0(s(i,j,k;t) | N(i,j,k;t−1))  (1.5.20)

where we have used the notation N(i,j,k;t) to indicate the values of the cell variables in the neighborhood of (i,j,k). For example, we might consider modifying the droplet condensation model so that a cell value is set to be ON with a certain probability (depending on the number of ON neighbors) and OFF otherwise.
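As a concrete sketch of Eq. (1.5.20), the following fragment updates every cell of a 2d periodic grid independently, with an ON probability that depends on the number of ON cells among the cell and its four neighbors. The probability table `p_on` is an illustrative assumption, not a rule from the text; it loosely mimics the modified droplet condensation model just described.

```python
import random

def stochastic_step(grid, p_on, rng):
    """One update of a stochastic 2d CA on a periodic grid.

    p_on[n] is the (assumed) probability that a cell becomes ON given
    n ON cells among itself and its four neighbors.
    """
    L = len(grid)
    new = [[0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            n = (grid[i][j] + grid[(i - 1) % L][j] + grid[(i + 1) % L][j]
                 + grid[i][(j - 1) % L] + grid[i][(j + 1) % L])
            # each cell updates independently: the product form of Eq. (1.5.20)
            new[i][j] = 1 if rng.random() < p_on[n] else 0
    return new

rng = random.Random(0)
grid = [[rng.randint(0, 1) for _ in range(8)] for _ in range(8)]
p_on = [0.0, 0.05, 0.3, 0.9, 1.0, 1.0]  # assumed probabilities for n = 0..5
for _ in range(10):
    grid = stochastic_step(grid, p_on, rng)
```

Because every cell draws its own random number against its own conditional probability, the joint update probability factors exactly as in Eq. (1.5.20).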
Stochastic automata can be thought of as modeling the effects of noise, and more specifically the ensemble of a dynamic system that is subject to thermal noise. There is another way to make the analogy between the dynamics of a CA and a thermodynamic system that is exact—if we consider not the space of the automaton but the d + 1 dimensional space-time. Consider the ensemble of all possible histories of the CA. If we have a three-dimensional space, then the histories are a set of variables with four indices {s(i,j,k,t)}. The probability of a particular set of these variables occurring (the probability of this history) is given by

P({s(i,j,k,t)}) = P({s(i,j,k;0)}) ∏_t ∏_{i,j,k} P_0(s(i,j,k;t) | N(i,j,k;t−1))  (1.5.21)

This expression is the product of the probabilities of each update occurring in the history. The first factor on the right is the probability of a particular initial state in the ensemble we are considering. If we consider only one starting configuration, its probability would be one and the others zero.
We can relate the probability in Eq. (1.5.21) to thermodynamics using the Boltzmann probability. We simply set it to the expression for the Boltzmann probability at a particular temperature T:

P({s(i,j,k,t)}) = e^(−E({s(i,j,k,t)})/kT)  (1.5.22)

There is no need to include the normalization constant Z because the probabilities are automatically normalized. What we have done is to define the energy of the particular state as:

E({s(i,j,k,t)}) = −kT ln(P({s(i,j,k,t)}))  (1.5.23)
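A toy numerical illustration of Eqs. (1.5.21)-(1.5.23): for a single cell with a hypothetical transition table `P0` (the values are assumptions, chosen only so the probabilities out of each state are normalized), the probability of a time history is the product of the per-step update probabilities, and the corresponding energy follows from E = −kT ln P.

```python
import math

# Hypothetical single-cell transition table P0[(s_next, s_prev)],
# normalized so that probabilities out of each state sum to one.
P0 = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}

def history_probability(history, p_initial=1.0):
    """Probability of one time history: the product form of Eq. (1.5.21)."""
    p = p_initial
    for s_prev, s_next in zip(history, history[1:]):
        p *= P0[(s_next, s_prev)]
    return p

kT = 1.0                      # work in units where kT = 1
P = history_probability([0, 1, 1, 0])   # = 0.1 * 0.6 * 0.4 = 0.024
E = -kT * math.log(P)         # Eq. (1.5.23): E = -kT ln P
```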
This expression shows that any d dimensional automaton can be related to a d + 1 dimensional system described by equilibrium Boltzmann probabilities. The ensemble of the d + 1 dimensional system is the set of time histories of the automaton.

There is an important cautionary note about the conclusion reached in the last paragraph. While it is true that time histories are directly related to the ensemble of a thermodynamic system, there is a hidden danger in this analogy. These are not typical thermodynamic systems, and therefore our intuition about how they should behave is not trustworthy. For example, the time direction may be very different from any of the space directions. For the d + 1 dimensional thermodynamic system, this means that one of the directions must be singled out. This kind of asymmetry does occur in thermodynamic systems, but it is not standard. Another example of the difference between thermodynamic systems and CA is in their sensitivity to boundary conditions. We have seen that many CA are quite sensitive to their initial conditions. While we have shown this for deterministic automata, it continues to be true for many stochastic automata as well. The analog of the initial conditions in a d + 1 dimensional thermodynamic system is the surface or boundary conditions. Thermodynamic systems are typically insensitive to their boundary conditions. However, the relationship in Eq. (1.5.23) suggests that at least some thermodynamic systems are quite sensitive to their boundary conditions. An interesting use of this analogy is to attempt to discover special thermodynamic systems whose behavior mimics the interesting behavior of CA.
1.5.5 CA generalizations
There are a variety of generalizations of the simplest version of CA which are useful in developing models of particular systems. In this section we briefly describe a few of them, as illustrated in Fig. 1.5.10.

It is often convenient to consider more than one variable at a particular site. One way to think about this is as multiple spaces (planes in 2d, lines in 1d) that are coupled to each other. We could think about each space as a different physical quantity. For example, one might represent a magnetic field and the other an electric field. Another possibility is that we might use one space as a thermal reservoir. The system we are actually interested in might be simulated in one space and the thermal reservoir in another. By considering various combinations of multiple spaces representing a physical system, the nature of the physical system can become quite rich in its structure.
We can also consider the update rule to be a compound rule formed of a sequence of steps. Each of the steps updates the cells. The whole rule consists of cycling through the set of individual step rules. For example, our update rule might consist of two different steps. The first one is performed on every odd step and the second is performed on every even step. We could reduce this to the previous single-update-step case by looking at the composite of the first and second steps. This is the same as looking at only every even state of the system. We could also reduce this to a multiple space rule, where both the odd and even states are combined together to be a single step. However, it may be more convenient at times to think about the system as performing a cycle of update steps.
Finally, we can allow the state of the system at a particular time to depend on the state of the system at several previous times, not just on the state of the system at the previous time. A rule might depend on the most recent state of the system and the previous one as well. Such a rule is also equivalent to a rule with multiple spaces, by considering both the present state of the system and its predecessor as two spaces. One use of considering rules that depend on more than one time is to enable systematic construction of reversible deterministic rules from nonreversible rules. Let the original (not necessarily invertible) rule be R(N(i,j,k;t)). A new invertible rule can be written using the form

s(i,j,k;t) = mod_2(R(N(i,j,k;t−1)) + s(i,j,k;t−2))  (1.5.24)

The inverse of the update rule is immediately constructed using the properties of addition modulo 2 (Eq. (1.5.8)) as:

s(i,j,k;t−2) = mod_2(R(N(i,j,k;t−1)) + s(i,j,k;t))  (1.5.25)
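The construction of Eqs. (1.5.24) and (1.5.25) can be checked directly. In the sketch below, R is a hypothetical non-invertible neighborhood rule (majority of three cells on a 1d ring, chosen only for illustration); the second-order rule built from it is exactly invertible, because adding R(N(t−1)) twice modulo 2 cancels.

```python
def R(left, center, right):
    # a hypothetical, non-invertible neighborhood rule: majority of three
    return 1 if left + center + right >= 2 else 0

def step(s_prev, s_prev2):
    """Eq. (1.5.24): s(t) = mod_2(R(N(t-1)) + s(t-2)), on a 1d ring."""
    L = len(s_prev)
    return [(R(s_prev[(i - 1) % L], s_prev[i], s_prev[(i + 1) % L])
             + s_prev2[i]) % 2 for i in range(L)]

def step_back(s_prev, s_now):
    """Eq. (1.5.25): recover s(t-2) from s(t-1) and s(t)."""
    L = len(s_prev)
    return [(R(s_prev[(i - 1) % L], s_prev[i], s_prev[(i + 1) % L])
             + s_now[i]) % 2 for i in range(L)]

s0 = [0, 1, 1, 0, 1, 0, 0, 1]
s1 = [1, 0, 1, 1, 0, 0, 1, 0]
s2 = step(s1, s0)
assert step_back(s1, s2) == s0   # the past state is recovered exactly
```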
1.5.6 Conserved quantities and Margolus dynamics
Standard CA are not well suited to the description of systems with constraints or conservation laws. For example, if we want to conserve the number of ON cells, we must establish a rule where turning OFF one cell (switching it from ON to OFF) is tied to turning ON another cell. The standard rule considers each cell separately when an update is performed. This makes it difficult to guarantee that when this particular cell is turned OFF then another one will be turned ON. There are many examples of physical systems where the conservation of quantities such as number of particles, energy and momentum is central to their behavior.

A systematic way to construct CA that describe systems with conserved quantities has been developed. Rules of this kind are known as partitioned CA or Margolus rules (Fig. 1.5.11). These rules separate the space into nonoverlapping partitions (also known as neighborhoods). The new value of each cell in a partition is given in terms of the previous values of the cells in the same partition. This is different from the conventional automaton, since the local rule has more than one output as well as more than one input. Such a rule is not sufficient in itself to describe the system update, since there is no communication in a single update between different partitions. The complete rule must specify how the partitions are shifted after each update with respect to the underlying space. This shifting is an essential part of the dynamical rule that restores the cellular symmetry of the space.

The convenience of this kind of CA is that specification of the rule gives us direct control of the dynamics within each partition, and therefore we can impose conservation rules within the partition. Once the conservation rule is imposed inside the partition, it will be maintained globally—throughout the whole space and through every time step. Fig. 1.5.12 illustrates a rule that conserves the number of ON cells inside a 2 × 2 neighborhood. The ON cells may be thought of as particles whose num-
Figure 1.5.10 Schematic illustrations of several modifications of the simplest CA rule. The basic CA rule updates a set of spatially arrayed cell variables shown in (a). The first modification uses more than one variable in each cell. Conceptually this may be thought of as describing a set of coupled spaces, where the case of two spaces is shown in (b). The second modification makes use of a compound rule that combines several different rules, where the case of two rules is shown in (c). The third modification shown in (d) makes use of a rule that depends on not just the most recent value of the cell variables but also the previous one. Both (c) and (d) may be described as special cases of (b) where two successive values of the cell variables are considered instead as occurring at the same time in different spaces.
[Figure 1.5.11 panels: conventional CA rule; partitioned (Margolus) CA rule; partition alternation]
Figure 1.5.11 Partitioned CA (Margolus rules) enable the imposition of conservation laws in a direct way. A conventional CA gives the value of an individual cell in terms of the previous values of cells in its neighborhood (top). A partitioned CA gives the value of several cells in a particular partition in terms of the previous values of the same cells (center). This enables conservation rules to be imposed directly within a particular partition. An example is given in Fig. 1.5.12. In addition to the rule for updating the partition, the dynamics must specify how the partitions are to be shifted from step to step. For example (bottom), the use of a 2 × 2 partition may be implemented by alternating the partitions from the solid lines to the dashed lines. Every even update the dashed lines are used and every odd update the solid lines are used to partition the space. This restores the cellular periodicity of the space and enables the cells to communicate with each other, which is not possible without the shifting of partitions.
ber is conserved. The only requirement is that each of the possible arrangements of particles on the left results in an arrangement on the right with the same number of particles. This rule is augmented by specifying that the 2 × 2 partitions are shifted by a single cell to the right and down after every update. The motion of these particles is that of an unusual gas of particles.
The rule shown is only one of many possible rules that use this 2 × 2 neighborhood and conserve the number of particles. Some of these rules have additional properties or symmetries. A rule that is constructed to conserve particles may or may not be reversible. The one illustrated in Fig. 1.5.12 is not reversible: there is more than one predecessor for particular values of the cell variables. This can be seen from the two mappings on the lower left that have the same output but different input. A rule that conserves particles also may or may not have a particular symmetry, such as a symmetry of reflection. A symmetry of reflection means that reflection of a configuration across a particular axis before application of the rule results in the same effect as reflection after application of the rule.
The existence of a well-defined set of rules that conserves the number of particles enables us to choose to study one of them for a specific reason. Alternatively, by randomly constructing a rule which conserves the number of particles, we can learn what particle conservation does in a dynamical system independent of other regularities of the system such as reversibility and reflection or rotation symmetries. More systematically, it is possible to consider the class of automata that conserve particle number and investigate their properties.
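A minimal sketch of a particle-conserving Margolus CA follows. Since the specific rule table of Fig. 1.5.12 is not reproduced here, the example uses an assumed stand-in rule (a clockwise rotation of the contents of each 2 × 2 partition), which is a permutation of the four cells and therefore conserves the number of ON cells by construction; the partitions alternate between two offsets from step to step, as described above.

```python
import random

def margolus_step(grid, offset):
    """Rotate the contents of each 2 x 2 partition clockwise (periodic grid).

    The rotation is a permutation of the four cells, so the number of ON
    cells is conserved by construction.  This is an assumed stand-in for
    the specific rule table of Fig. 1.5.12.
    """
    L = len(grid)
    new = [row[:] for row in grid]
    for bi in range(offset, L + offset, 2):
        for bj in range(offset, L + offset, 2):
            i0, i1 = bi % L, (bi + 1) % L
            j0, j1 = bj % L, (bj + 1) % L
            new[i0][j0] = grid[i1][j0]   # top-left     <- bottom-left
            new[i0][j1] = grid[i0][j0]   # top-right    <- top-left
            new[i1][j1] = grid[i0][j1]   # bottom-right <- top-right
            new[i1][j0] = grid[i1][j1]   # bottom-left  <- bottom-right
    return new

rng = random.Random(1)
grid = [[rng.randint(0, 1) for _ in range(8)] for _ in range(8)]
n0 = sum(map(sum, grid))
for t in range(20):
    grid = margolus_step(grid, offset=t % 2)   # shift the partitions each update
assert sum(map(sum, grid)) == n0               # particle number conserved
```

Any other assignment of outputs to inputs that preserves the count of ON cells in each partition would serve equally well, which is exactly the freedom discussed in the text.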
Question 1.5.6 Design a 2d Margolus CA that represents a particle or chemical reaction: A + B ↔ C. Discuss some of the parameters that must be set and how you could use symmetries and conservation laws to set them.
Solution 1.5.6 We could use a 2 × 2 partition just like that in Fig. 1.5.12. On each of the four squares there can appear any one of the four possibilities (0, A, B, C). There are 4^4 = 256 different initial conditions of the partition. Each of these must be paired with one final condition, if the rule is deterministic. If the rule is probabilistic, then probabilities must be assigned for each possible transition.

To represent a chemical reaction, we choose cases where A and B are adjacent (horizontally or vertically) and replace them with a C and a 0. If we prefer to be consistent, we can always place the C where A was before. To go the other direction, we take cases where C is next to a 0 and replace them with an A and a B. One question we might ask is: Do we want to have a reaction whenever it is possible, or do we want to assign some probability for the reaction? The latter case is more interesting, and we would have to use a probabilistic CA to represent it. In addition to the reaction, the rule would include particle motion similar to that in Fig. 1.5.12.

To apply symmetries, we could assume that reflection along horizontal or vertical axes, or rotation of the partition by 90° before the update, will have the same effect as a reflection or rotation of the partition after the
update. We could also assume that A, B and C move in the same way when they are by themselves. Moreover, we might assume that the rule is symmetric under the transformation A ↔ B.

There is a simpler approach that requires enumerating many fewer states. We choose a 2 × 1 rectangular partition that has only two cells, and 4^2 = 16 possible states. Of these, four do not change: [A,A], [B,B], [C,C] and [0,0].
Figure 1.5.12 Illustration of a particular 2d Margolus rule that preserves the number of ON cells, which may be thought of as particles in a gas. The requirement for conservation of number of particles is that every initial configuration is matched with a final configuration having the same number of ON cells. This particular rule does not observe conventional symmetries such as reflection or rotation symmetries that might be expected in a typical gas. Many rules that conserve particles may be constructed in this framework by changing around the final states while preserving the number of particles in each case.
Eight others are paired because the cell values can be switched to achieve particle motion (with a certain probability): [A,0] ↔ [0,A], [B,0] ↔ [0,B], [C,A] ↔ [A,C], and [C,B] ↔ [B,C]. Finally, the last four, [C,0], [0,C], [A,B] and [B,A], can participate in reactions. If the rule is deterministic, they must be paired in a unique way for possible transitions. Otherwise, each possibility can be assigned a probability: [C,0] ↔ [A,B], [0,C] ↔ [B,A], [C,0] ↔ [B,A] and [0,C] ↔ [A,B]. The switching of the particles without undergoing reaction for these states may also be allowed with a certain probability. Thus, each of the four states can have a nonzero transition probability to each of the others. These probabilities may be related by the symmetries mentioned before. Once we have determined the update rule for the 2 × 1 partition, we can choose several ways to map the partitions onto the plane. The simplest are obtained by dividing each of the 2 × 2 partitions in Fig. 1.5.11 horizontally or vertically. This gives a total of four ways to partition the plane. These four can alternate when we simulate this CA.
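The 2 × 1 partition rule of this solution can be sketched in code. For brevity the sketch runs on a 1d ring rather than the plane, and the reaction and swap probabilities are illustrative assumptions; the point is that the invariants of A + B ↔ C (the counts #A + #C and #B + #C) are conserved under any choice of these probabilities.

```python
import random

EMPTY, A, B, C = 0, 1, 2, 3

def pair_update(x, y, rng, p_react=0.5, p_swap=0.5):
    """Update one 2 x 1 partition [x, y].

    Reaction pairs [A,B], [B,A], [C,0], [0,C] react with probability
    p_react; any pair may also swap its contents (particle motion)
    with probability p_swap.  The probability values are assumptions.
    """
    if {x, y} == {A, B} or (x, y) in ((C, EMPTY), (EMPTY, C)):
        if rng.random() < p_react:
            # A + B -> C + 0, or C + 0 -> A + B
            return (C, EMPTY) if {x, y} == {A, B} else (A, B)
    if rng.random() < p_swap:
        return (y, x)   # particle motion by switching cell contents
    return (x, y)

rng = random.Random(2)
line = [A, B, EMPTY, C, B, A, EMPTY, C]
n_AC = line.count(A) + line.count(C)   # invariant of A + B <-> C
n_BC = line.count(B) + line.count(C)   # likewise conserved
for t in range(50):
    for i in range(t % 2, len(line) + t % 2, 2):   # alternate the pairings
        j, k = i % len(line), (i + 1) % len(line)
        line[j], line[k] = pair_update(line[j], line[k], rng)
```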
1.5.7 Differential equations and CA
Cellular automata are an alternative to differential equations for the modeling of physical systems. Differential equations, when modeled numerically on a computer, are often discretized in order to perform integrals. This discretization is an approximation that might be considered essentially equivalent to setting up a locally discrete dynamical system that in the macroscopic limit reduces to the differential equation. Why not then start from a discrete system and prove its relevance to the problem of interest? This a priori approach can provide distinct computational advantages. This argument might lead us to consider CA as an approximation to differential equations. However, it is possible to adopt an even more direct approach and say that differential equations are themselves an approximation to aspects of physical reality. CA are a different but equally valid approach to approximating this reality. In general, differential equations are more convenient for analytic solution, while CA are more convenient for simulations. Since complex systems of differential equations are often solved numerically anyway, the alternative use of CA appears to be worth systematic consideration.
While both cellular automata and differential equations can be used to model macroscopic systems, this should not be taken to mean that the relationship between differential equations and CA is simple. Recognizing a CA analog to a standard differential equation may be a difficult problem. One of the most extensive efforts to use CA for simulation of a system more commonly known by its differential equation is the problem of hydrodynamics. Hydrodynamics is typically modeled by the Navier-Stokes equation. A type of CA called a lattice gas (Section 1.5.8) has been designed that, on a length scale that is large compared to the cellular scale, reproduces the behavior of the Navier-Stokes equation. The difficulties of solving the differential equation for specific boundary conditions make this CA a powerful tool for studying hydrodynamic flow.
A frequently occurring differential equation is the wave equation. The wave equation describes an elastic medium that is approximated as a continuum. The wave equation emerges as the continuum limit of a large variety of systems. It is to be expected that many CA will also display wavelike properties. Here we use a simple example to illustrate one way that wavelike properties may arise. We also show how the analogy may be quite different than intuition might suggest. The wave equation written in 1d as

∂²f/∂t² = c² ∂²f/∂x²  (1.5.26)

has two types of solutions that are waves traveling to the right and to the left with wave vectors k and frequencies of oscillation ω_k = ck:

f = Σ_k ( A_k e^{i(kx − ω_k t)} + B_k e^{i(kx + ω_k t)} )  (1.5.27)

A particular solution is obtained by choosing the coefficients A_k and B_k. These solutions may also be written in real space in the form:

f = Ã(x − ct) + B̃(x + ct)  (1.5.28)

where

Ã(x) = Σ_k A_k e^{ikx},  B̃(x) = Σ_k B_k e^{ikx}  (1.5.29)

are two arbitrary functions that specify the initial conditions of the wave in an infinite space.
We can construct a CA analog of the wave equation as illustrated in Fig. 1.5.13. It should be understood that the wave equation will arise only as a continuum or long wave limit of the CA dynamics. However, we are not restricted to considering a model that mimics a vibrating elastic medium. The rule we construct consists of a 1d partitioned space dynamics. Each update, adjacent cells are paired into partitions of two cells each. The pairing switches from update to update, analogous to the 2d example in Fig. 1.5.11. The dynamics consists solely of switching the contents of the two adjacent cells in a single partition. Starting from a particular initial configuration, it can be seen that the contents of the odd cells moves systematically in one direction (right in the figure), while the contents of the even cells moves in the opposite direction (left in the figure). The movement proceeds at a constant velocity of c = 1 cell/update. Thus we identify the contents of the odd cells as the rightward traveling wave, and the even cells as the leftward traveling wave.

The dynamics of this CA is the same as the dynamics of the wave equation of Eq. (1.5.28) in an infinite space. The only requirement is to encode appropriately the initial conditions Ã(x), B̃(x) in the cells. If we use variables with values in the conven-
tional real continuum, s_i ∈ ℝ, then the (discretized) waves may be encoded directly. If a binary representation s_i = ±1 is used, the local average over odd cells represents the right traveling wave Ã(x − ct), and the local average over even cells represents the left traveling wave B̃(x + ct).
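The Margolus swap rule of Fig. 1.5.13 takes only a few lines of code. In this sketch a pulse stored at an odd-indexed cell travels one cell per update in one direction and an even-indexed pulse in the other; which subset moves right and which moves left depends on the phase of the partition alternation, so the directions here need not match the figure.

```python
def wave_step(cells, offset):
    """Fig. 1.5.13: swap the two cells in each partition of a 1d ring."""
    L = len(cells)
    out = cells[:]
    for i in range(offset, L + offset, 2):
        j, k = i % L, (i + 1) % L
        out[j], out[k] = cells[k], cells[j]
    return out

L = 16
cells = [0] * L
cells[3] = 1   # an odd-indexed pulse
cells[8] = 1   # an even-indexed pulse
for t in range(4):
    cells = wave_step(cells, offset=t % 2)   # alternate the pairings
# with this phase, the odd pulse has moved 4 cells left and the even
# pulse 4 cells right: cells[15] == 1 and cells[12] == 1
```

The two counter-propagating populations never mix, which is the CA counterpart of the independent right- and left-moving solutions of Eq. (1.5.28).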
1.5.8 Lattice gases
A lattice gas is a type of CA designed to model gases or liquids of colliding particles. Lattice gases are formulated in a way that enables the collisions to conserve momentum as well as number of particles. Momentum is represented by setting the velocity of each particle to a discrete set of possibilities. A simple example, the HPP gas, is illustrated in Fig. 1.5.14. Each cell contains four binary variables that represent the presence (or absence) of particles with unit velocity in the four compass directions NESW. In the figure, the presence of a particle in a cell is indicated by an arrow. There can be up to four particles at each site. Each particle present in a single cell must have a distinct velocity.
Figure 1.5.13 A simple 1d CA using a Margolus rule, which switches the values of the two adjacent cells in the partition, can be used to model the wave equation. The partitions alternate between the two possible ways of partitioning the cells every time step. It can be seen that the initial state is propagated in time so that the odd (even) cells move at a fixed rate of one cell per update to the right (left). The solutions of the wave equation likewise consist of a right and left traveling wave. The initial conditions of the wave equation solution are the analog of the initial condition of the cells in the CA.
The dynamics of the HPP gas is performed in two steps that alternate: propagation and collision. In the propagation step, particles move from the cell they are in to the neighboring cell in the direction of their motion. In the collision step, each cell acts independently, changing the particles from incoming to outgoing according to prespecified collision rules. The rule for the HPP gas is illustrated in Fig. 1.5.15. Because of momentum conservation in this rule, there are only two possibilities for changes in the particle velocity as a result of a collision. A similar lattice gas, the FHP gas, which is implemented on a hexagonal lattice of cells rather than a square lattice, has been proven to give rise to the Navier-Stokes hydrodynamic equations on a macroscopic scale. Due to properties of the square lattice in two dimensions, this behavior does not occur for the HPP gas. One way to understand the limitation of the square lattice is to realize that for the HPP gas (Fig. 1.5.14), momentum is conserved in any individual horizontal or vertical stripe of cells. This type of conservation law is not satisfied by hydrodynamics.
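A minimal HPP sketch: each cell holds four bits for particles moving N, E, S, W; updates alternate propagation with the collision rule of Fig. 1.5.15, in which only head-on pairs scatter. The grid size and initial particles are arbitrary choices made for the example; the check at the end confirms that total momentum is unchanged.

```python
# directions encoded as bit indices: 0=N, 1=E, 2=S, 3=W
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # (row, col) displacement, N = up

def propagate(grid):
    """Move each particle one cell in the direction of its velocity (periodic)."""
    L = len(grid)
    new = [[[0, 0, 0, 0] for _ in range(L)] for _ in range(L)]
    for i in range(L):
        for j in range(L):
            for d, (di, dj) in enumerate(DIRS):
                if grid[i][j][d]:
                    new[(i + di) % L][(j + dj) % L][d] = 1
    return new

def collide(grid):
    """HPP collision rule (Fig. 1.5.15): only head-on pairs scatter."""
    for row in grid:
        for cell in row:
            if cell == [1, 0, 1, 0]:     # N + S -> E + W
                cell[:] = [0, 1, 0, 1]
            elif cell == [0, 1, 0, 1]:   # E + W -> N + S
                cell[:] = [1, 0, 1, 0]
    return grid

def momentum(grid):
    px = py = 0
    for row in grid:
        for cell in row:
            for d, (di, dj) in enumerate(DIRS):
                if cell[d]:
                    py += di
                    px += dj
    return (px, py)

L = 8
grid = [[[0, 0, 0, 0] for _ in range(L)] for _ in range(L)]
grid[2][4][0] = 1   # a particle moving N
grid[0][4][2] = 1   # a particle moving S, headed for a collision
p0 = momentum(grid)
for _ in range(6):
    grid = collide(propagate(grid))
assert momentum(grid) == p0   # momentum conserved through the collisions
```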
1.5.9 Material growth
One of the natural physical systems to model using CA is the problem of layer-by-layer material growth such as is achieved in molecular beam epitaxy. There are many areas of study of the growth of materials. For example, in cases where the material is formed of only a single type of atom, it is the surface structure during growth that is of interest. Here, we focus on an example of an alloy formed of several different atoms, where the growth of the atoms is precisely layer by layer. In this case the surface structure is simple, but the relative abundance and location of different atoms in the material is of interest. The simplest case is when the atoms are found on a lattice that is prespecified; it is only the type of atom that may vary.

The analogy with a CA is established by considering each layer of atoms, when it is deposited, as represented by a 2d CA at a particular time. As shown in Fig. 1.5.16, the cell values of the automaton represent the type of atom at a particular site. The values of the cells at a particular time are preserved as the atoms of the layer deposited at that time. It is the time history of the CA that is to be interpreted as representing the structure of the alloy. This picture assumes that once an atom is incorporated in a complete layer it does not move.

In order to construct the CA, we assume that the probability of a particular atom being deposited at a particular location depends on the atoms residing in the layer immediately preceding it. The stochastic CA rule in the form of Eq. (1.5.20) specifies the probability of attaching each kind of atom to every possible atomic environment in the previous layer.
We can illustrate how this might work by describing a specific example. There exist alloys formed out of a mixture of gallium, arsenic and silicon. A material formed of equal proportions of gallium and arsenic forms a GaAs crystal, which is exactly like a silicon crystal, except the Ga and As atoms alternate in positions. When we put silicon together with GaAs then the silicon can substitute for either the Ga or the As atoms. If there is more Si than GaAs, then the crystal is essentially a Si crystal with small regions of GaAs, and isolated Ga and As. If there is more GaAs than Si, then the
Figure 1.5.14 Illustration of the update of the HPP lattice gas. In a lattice gas, binary variables in each cell indicate the presence of particles with a particular velocity. Here there are four possible particles in each cell with unit velocities in the four compass directions, NESW. Pictorially the presence of a particle is indicated by an arrow in the direction of its velocity. Updating the lattice gas consists of two steps: propagating the particles according to their velocities, and allowing the particles to collide according to a collision rule. The propagation step consists of moving particles from each cell into the neighboring cells in the direction of their motion. The collision step consists of each cell independently changing the velocities of its particles. The HPP collision rule is shown in Fig. 1.5.15, and implemented here from the middle to the bottom panel. For convenience in viewing the different steps the arrows in this figure alternate between incoming and outgoing. Particles before propagation (top) are shown as outward arrows from the center of the cell. After the propagation step (middle) they are shown as incoming arrows. After collision (bottom) they are again shown as outgoing arrows.
Figure 1.5.16 Illustration of the time history of a CA and its use to model the structure of a material (alloy) formed by layer-by-layer growth. Each horizontal dashed line represents a layer of the material. The alloy has three types of atoms. The configuration of atoms in each layer depends only on the atoms in the layer preceding it. The type of atom, indicated in the figure by filled, empty and shaded dots, is determined by the values of the cell variables of the CA at a particular time, s_i(t) = ±1, 0. The time history of the CA is the structure of the material.
Figure 1.5.15 The collision rule for the HPP lattice gas. With the exception of the case of two particles coming in from N and S and leaving from E and W, or vice versa (dashed box), there are no changes in the particle velocities as a result of collisions in this rule. Momentum conservation does not allow any other changes.
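The two-step update (propagation, then collision) can be sketched in code. This is a minimal illustration, not from the text: the representation (four L x L 0/1 occupation grids, one per direction, with periodic boundaries and the y index increasing northward) and the function name are assumptions.

```python
def hpp_step(n, e, s, w, L):
    """One HPP update; n, e, s, w are L x L 0/1 grids of particles moving N, E, S, W."""
    # Propagation: a particle moving N at (x, y-1) arrives at (x, y), and so on.
    n2 = [[n[x][(y - 1) % L] for y in range(L)] for x in range(L)]
    s2 = [[s[x][(y + 1) % L] for y in range(L)] for x in range(L)]
    e2 = [[e[(x - 1) % L][y] for y in range(L)] for x in range(L)]
    w2 = [[w[(x + 1) % L][y] for y in range(L)] for x in range(L)]
    # Collision: only the head-on two-particle cells change (dashed box in Fig. 1.5.15).
    for x in range(L):
        for y in range(L):
            ns = n2[x][y] and s2[x][y] and not (e2[x][y] or w2[x][y])
            ew = e2[x][y] and w2[x][y] and not (n2[x][y] or s2[x][y])
            if ns:        # N + S pair leaves as E + W
                n2[x][y] = s2[x][y] = 0
                e2[x][y] = w2[x][y] = 1
            elif ew:      # E + W pair leaves as N + S
                e2[x][y] = w2[x][y] = 0
                n2[x][y] = s2[x][y] = 1
    return n2, e2, s2, w2
```

Both steps conserve the number of particles, and the collision rule conserves momentum cell by cell, as the caption notes.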
crystal will be essentially a GaAs crystal with isolated Si atoms. We can model the growth of the alloys formed by different relative proportions of GaAs and Si of the form (GaAs)_{1-x}Si_x using a CA. Each cell of the CA has a variable with three possible values s_i = ±1, 0 that represent the occupation of a crystal site by Ga, As and Si respectively. The CA rule (Eq. (1.5.20)) would then be constructed by assuming different probabilities for adding a Si, Ga and As atom at the surface. For example, the likelihood of finding a Ga next to a Ga atom or an As next to an As is small, so the probability of adding a Ga on top of a Ga can be set to be much smaller than other probabilities. The probability of an Si atom, s_i = 0, could be varied to reflect different concentrations of Si in the growth. Then we would be able to observe how the structure of the material changes as the Si concentration changes.
This is one of many examples of physical, chemical and biological systems that have been modeled using CA to capture some of their dynamical properties. We will encounter others in later chapters.
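The growth rule described above can be sketched as a short program. This is an illustrative sketch only, not the text's Eq. (1.5.20): the value coding (+1 = Ga, -1 = As, 0 = Si), the specific probabilities, and the simplification that each new atom depends only on the atom directly beneath it are all assumptions made for the example.

```python
import random

def grow_layer(prev, x_si, rng):
    """Choose each atom of a new layer from the atom directly below it."""
    new = []
    for below in prev:
        # Si is deposited with probability x_si regardless of the atom below.
        if rng.random() < x_si:
            new.append(0)
        elif below == 1:      # Ga below: strongly favor As on top (assumed 0.95)
            new.append(-1 if rng.random() < 0.95 else 1)
        elif below == -1:     # As below: strongly favor Ga on top
            new.append(1 if rng.random() < 0.95 else -1)
        else:                 # Si below: Ga or As equally likely
            new.append(rng.choice([1, -1]))
    return new

def grow_crystal(width, layers, x_si, seed=0):
    """Return the CA time history; it is the structure of the material."""
    rng = random.Random(seed)
    history = [[rng.choice([1, -1]) for _ in range(width)]]
    for _ in range(layers - 1):
        history.append(grow_layer(history[-1], x_si, rng))
    return history

structure = grow_crystal(width=20, layers=10, x_si=0.1)
```

Varying `x_si` then plays the role of varying the Si concentration in the growth.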
1.6 Statistical Fields
In real systems, as well as in kinetic models such as the cellular automata (CA) discussed in the previous section, we are often interested in finding the state of a system—the time-averaged (equilibrium) ensemble when cycles or randomness are present—that arises after the fast initial kinetic processes have occurred. Our objective in this section is to treat systems with many degrees of freedom using the tools of equilibrium statistical mechanics (Section 1.3). These tools describe the equilibrium ensemble directly rather than the time evolution. The simplest example is a collection of interacting binary variables, which is in many ways analogous to the simplest of the CA models. This model is known as the Ising model, and was introduced originally to describe the properties of magnets. Each of the individual variables corresponds to a microscopic magnetic region that arises due to the orbital motion of an electron or the internal degree of freedom known as the spin of the electron.
The Ising model is the simplest model of interacting degrees of freedom. Each of the variables is binary, and the interactions between them are specified by only one parameter—the strength of the interaction. Remarkably, many complex systems we will be considering can be modeled by the Ising model as a first approximation. We will use several versions of the Ising model to discuss neural networks in Chapter 2 and proteins in Chapter 4. The reason for the usefulness of this model is the very existence of interactions between the elements. This interaction is not present in simpler models and results in various behaviors that can be used to understand some of the key aspects of complex systems. The concepts and tools that are used to study the Ising model may also be transferred to more complicated models. It should be understood, however, that the Ising model is a simplistic model of magnets as well as of other systems.
In Section 1.3 we considered the ideal gas with collisions. The collisions were a form of interaction. However, these interactions were incidental to the model because they were assumed to be so short that they were not present during observation. This is no longer true in the Ising model.
1.6.1 The Ising model without interactions
The Ising model describes the energy of a collection of elements (spins) represented by binary variables. It is so simple that there is no kinetics, only an energy E[{s_i}]. Later we will discuss how to reintroduce a dynamics for this model. The absence of a dynamics is not a problem for the study of the equilibrium properties of the system, since the Boltzmann probability (Eq. (1.3.29)) depends only upon the energy. The energy is specified as a function of the values of the binary variables {s_i = ±1}. Unless necessary, we will use one index for all of the spin variables regardless of dimensionality. The use of the term "spin" originates from the magnetic analogy. There is no other specific term, so we adopt this terminology. The term "spin" emphasizes that the binary variable represents the state of a physical entity such that the collection of spins is the system we are interested in. A spin can be illustrated as an arrow of fixed length (see Fig. 1.6.1). The value of the binary variable describes its orientation, where +1 indicates a spin oriented in the positive z direction (UP), and -1 indicates a spin oriented in the negative z direction (DOWN).
Before we consider the effects of interactions between the spins, we start by considering a system where there are no interactions. We can write the energy of such a system as:

E[\{s_i\}] = \sum_i e_i(s_i)    (1.6.1)

where e_i(s_i) is the energy of the ith spin, which does not depend on the values of any of the other spins. Since the s_i are binary, we can write this as:

E[\{s_i\}] = \frac{1}{2}\sum_i \left[ \bigl(e_i(1) - e_i(-1)\bigr)s_i + \bigl(e_i(1) + e_i(-1)\bigr) \right] = E_0 - \sum_i h_i s_i \;\to\; -\sum_i h_i s_i    (1.6.2)

All of the terms that do not depend on the spin variables have been collected together into a constant. We set this constant to zero by redefining the energy scale. The quantities {h_i} describe the energy due to the orientation of the spins. In the magnetic system they correspond to an external magnetic field that varies from location to location. Like small magnets, spins try to orient along the magnetic field. A spin oriented along the magnetic field (s_i and h_i have the same sign) has a lower energy than if it is antiparallel to the magnetic field. As in Eq. (1.6.2), the contribution of the magnetic field to the energy is -|h_i| (+|h_i|) when the spin is parallel (antiparallel) to the field direction. When convenient we will simplify to the case of a uniform magnetic field, h_i = h.
When the spins are noninteracting, the Ising model reduces to a collection of two-state systems that we investigated in Section 1.4. Later, when we introduce interactions between the spins, there will be differences. For the noninteracting case we can write the probability for a particular configuration of the spins using the Boltzmann probability:

P[\{s_i\}] = \frac{e^{-\beta E[\{s_i\}]}}{Z} = \frac{e^{\beta \sum_i h_i s_i}}{Z} = \frac{\prod_i e^{\beta h_i s_i}}{Z}    (1.6.3)
Figure 1.6.1 One way to visualize the Ising model is as a spatial array of binary variables called spins, represented as UP or DOWN arrows. A one-dimensional (1-d) example with all spins UP is shown on top. The middle and lower figures show two-dimensional (2-d) arrays which have all spins UP (middle) or have some spins UP and some spins DOWN (bottom).
where \beta = 1/kT. The partition function Z is given by:

Z = \sum_{\{s_i\}} e^{-\beta E[\{s_i\}]} = \sum_{\{s_i\}} \prod_i e^{\beta h_i s_i} = \prod_i \sum_{s_i = \pm 1} e^{\beta h_i s_i} = \prod_i \left( e^{\beta h_i} + e^{-\beta h_i} \right)    (1.6.4)

where the second-to-last equality replaces the sum over all possible values of the spin variables with a sum over each spin variable s_i = ±1 within the product. Thus the probability factors as:

P[\{s_i\}] = \prod_i P(s_i) = \prod_i \frac{e^{\beta h_i s_i}}{e^{\beta h_i} + e^{-\beta h_i}}    (1.6.5)

This is a product over the result we found for the probability of the two-state system (Eq. (1.4.14)) if we write the energy of a single spin using the notation E_i(s_i) = -h_i s_i.
Now that we have many spin variables, we can investigate the thermodynamics of this model by writing down the free energy and entropy of this model. This is discussed in Question 1.6.1.

Question 1.6.1 Evaluate the thermodynamic free energy, energy and entropy for the Ising model without interactions.

Solution 1.6.1 The free energy is given in terms of the partition function by Eq. (1.3.37):

F = -kT \ln(Z) = -kT \sum_i \ln\left( e^{\beta h_i} + e^{-\beta h_i} \right) = -kT \sum_i \ln\left( 2\cosh(\beta h_i) \right)    (1.6.6)

The latter expression is a more common way of writing this result.
The thermodynamic energy of the system is found from Eq. (1.3.38) as

U = -\frac{\partial \ln(Z)}{\partial \beta} = -\sum_i \frac{h_i \left( e^{\beta h_i} - e^{-\beta h_i} \right)}{e^{\beta h_i} + e^{-\beta h_i}} = -\sum_i h_i \tanh(\beta h_i)    (1.6.7)

There is another way to obtain the same result. The thermodynamic energy is the average energy of the system (Eq. (1.3.30)), which can be evaluated directly:

U = \langle E[\{s_i\}] \rangle = \left\langle -\sum_i h_i s_i \right\rangle = -\sum_i h_i \langle s_i \rangle = -\sum_i h_i \sum_{s_i = \pm 1} s_i P(s_i) = -\sum_i h_i \tanh(\beta h_i)    (1.6.8)

which is the same as before. We have used the possibility of writing the probability of a single spin variable independent of the others in order to perform this average. It is convenient to define the local magnetization m_i as the average value of a particular spin variable:

m_i = \langle s_i \rangle = \sum_{s_i = \pm 1} s_i P_{s_i}(s_i) = P_{s_i}(1) - P_{s_i}(-1)    (1.6.9)
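The chain of equalities in Eqs. (1.6.4) and (1.6.6) can be checked numerically by brute-force enumeration over all spin configurations. A sketch (the particular field values are arbitrary choices for the check):

```python
import math
from itertools import product

beta = 0.7
h = [0.3, -1.1, 0.5]          # arbitrary local fields h_i

# Eq. (1.6.4): sum over all 2^N configurations versus the factored product.
Z_sum = sum(math.exp(beta * sum(hi * si for hi, si in zip(h, s)))
            for s in product([1, -1], repeat=len(h)))
Z_prod = math.prod(math.exp(beta * hi) + math.exp(-beta * hi) for hi in h)
assert abs(Z_sum - Z_prod) < 1e-9

# Eq. (1.6.6): the free energy F = -kT ln Z in its 2 cosh form.
kT = 1.0 / beta
F = -kT * sum(math.log(2 * math.cosh(beta * hi)) for hi in h)
assert abs(F - (-kT * math.log(Z_sum))) < 1e-9
```

The enumeration grows as 2^N, so the factored form of Eq. (1.6.4) is what makes large noninteracting systems tractable.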
Or, using Eq. (1.6.5):

m_i = \langle s_i \rangle = \tanh(\beta h_i)    (1.6.10)

In Fig. 1.6.2, the magnetization at a particular site is plotted as a function of the magnetic field for several different temperatures (\beta = 1/kT). The magnetization increases with increasing magnetic field and with decreasing temperature until it saturates asymptotically to a value of +1 or -1. In terms of the magnetization the energy is:

U = -\sum_i h_i m_i    (1.6.11)

We can calculate the entropy of the Ising model using Eq. (1.3.36):

S = k\beta U + k\ln Z = -k\beta \sum_i h_i \tanh(\beta h_i) + k \sum_i \ln\left( 2\cosh(\beta h_i) \right)    (1.6.12)
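Eqs. (1.6.10)–(1.6.12) can be verified for a single spin by direct evaluation of the Boltzmann average. A sketch, with k set to 1:

```python
import math

def single_spin(beta, h):
    """Magnetization, energy and entropy of one spin in field h (k = 1)."""
    Z = math.exp(beta * h) + math.exp(-beta * h)        # Eq. (1.6.4), one spin
    m = (math.exp(beta * h) - math.exp(-beta * h)) / Z  # P(1) - P(-1), Eq. (1.6.9)
    U = -h * m                                          # Eq. (1.6.11)
    S = beta * U + math.log(Z)                          # Eq. (1.6.12) with k = 1
    return m, U, S

m, U, S = single_spin(beta=2.0, h=0.4)
assert abs(m - math.tanh(2.0 * 0.4)) < 1e-12            # Eq. (1.6.10)

m0, U0, S0 = single_spin(beta=2.0, h=0.0)
assert abs(S0 - math.log(2)) < 1e-12                    # zero field: S = k ln 2
```

The second check anticipates the discussion below Eq. (1.6.16): at zero field the two states are equally likely and the entropy per spin is maximal, k ln(2).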
Figure 1.6.2 Plot of the magnetization at a particular site as a function of the magnetic field for independent spins in a magnetic field. The magnetization is the average of the spin value, so the magnetization shows the degree to which the spin is aligned with the magnetic field. The different curves are for several temperatures, \beta = 0.5, 1, 2 (\beta = 1/kT). The magnetization has the same sign as the magnetic field. The magnitude of the magnetization increases with increasing magnetic field. Increasing temperature, however, decreases the alignment due to increased random motion of the spins. The maximum magnitude of the magnetization is 1, corresponding to a fully aligned spin.
which is not particularly enlightening. However, we can rewrite this in terms of the magnetization using the identity:

\cosh(x) = \frac{1}{\sqrt{1 - \tanh^2(x)}}    (1.6.13)

and the inverse of Eq. (1.6.10):

\beta h_i = \frac{1}{2}\ln\left( \frac{1 + m_i}{1 - m_i} \right)    (1.6.14)

Substituting into Eq. (1.6.12) gives

S = -k \sum_i m_i \frac{1}{2}\ln\left( \frac{1 + m_i}{1 - m_i} \right) + kN\ln(2) - k\frac{1}{2}\sum_i \ln\left( 1 - m_i^2 \right)    (1.6.15)

Rearranging slightly, we have:

S = k\left[ N\ln(2) - \sum_i \frac{1}{2}\left( (1 + m_i)\ln(1 + m_i) + (1 - m_i)\ln(1 - m_i) \right) \right]    (1.6.16)

The final expression can be derived, at least for the case when all m_i are the same, by counting the number of states directly. It is worth deriving the entropy twice, because it may be used more generally than this treatment indicates. We will assume that all h_i = h are the same. The energy then depends only on the total magnetization:

E[\{s_i\}] = -h\sum_i s_i, \qquad U = -h\sum_i m_i = -hNm    (1.6.17)

To obtain the entropy from the counting of states (Eq. (1.3.25)) we evaluate the number of states within a particular narrow energy range. Since the energy is the sum over the values of the spins, it may also be written as the difference between the number of UP spins N(1) and DOWN spins N(-1):

E[\{s_i\}] = -h(N(1) - N(-1))    (1.6.18)

Thus, to find the entropy for a particular energy we must count how many states there are with a particular number of UP and DOWN spins. Moreover, flipping a spin from DOWN to UP causes a fixed increment in the energy. Thus there is no need to include in the counting the width of the energy interval in which we are counting states. The number of states with N(1) UP spins and N(-1) DOWN spins is:

\Omega(E, N) = \binom{N}{N(1)} = \frac{N!}{N(1)!\,N(-1)!}    (1.6.19)
The entropy can be written using Stirling's approximation (Eq. (1.2.27)), neglecting terms that are less than of order N, as:

S = k\ln(\Omega(E, N)) = k[N(\ln N - 1) - N(1)(\ln N(1) - 1) - N(-1)(\ln N(-1) - 1)]
  = k[N\ln N - N(1)\ln N(1) - N(-1)\ln N(-1)]    (1.6.20)

the latter following from N = N(1) + N(-1). To simplify this expression further, we write it in terms of the magnetization. Using P_{s_i}(-1) + P_{s_i}(1) = 1 and Eq. (1.6.9) for the magnetization, we have the probability that a particular spin is UP or DOWN in terms of the magnetization as:

P_{s_i}(1) = (1 + m)/2
P_{s_i}(-1) = (1 - m)/2    (1.6.21)

Since there are many spins in the system, we can obtain the number of UP and DOWN spins using

N(1) = N P_{s_i}(1) = N(1 + m)/2
N(-1) = N P_{s_i}(-1) = N(1 - m)/2    (1.6.22)

Using these expressions, Eq. (1.6.20) becomes the same as Eq. (1.6.16), with h_i = h.
There is an important difference between the two derivations, in that the second assumed that all of the magnetic fields were the same. Thus, the first derivation appears more general. However, since the original system has no interactions, we could consider each of the spins with its own field h_i as a separate system. If we want to calculate the entropy of the individual spin, we would consider an ensemble of such spins. The ensemble consists of many spins with the same field h = h_i. The derivation of the entropy using the ensemble would be identical to the derivation we have just given, except that at the end we would divide by the number of different systems in the ensemble N. Adding together the entropies of different spins would then give exactly Eq. (1.6.16).
The entropy of a spin from Eq. (1.6.16) is maximal for a magnetization of zero, when it has the value k ln(2). From the original definition of the entropy, this corresponds to the case when there are exactly two different possible states of the system. It thus corresponds to the case where the probability of each state s = ±1 is 1/2. The minimal entropy is for either m = 1 or m = -1—when there is only one possible state of the spin, so the entropy must be zero.
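The agreement between the exact counting entropy of Eq. (1.6.19) and the Stirling form of Eq. (1.6.16) can be checked numerically. A sketch with k = 1 and a uniform magnetization (the particular N and m are arbitrary):

```python
import math

def entropy_exact(N, m):
    """ln of Eq. (1.6.19), computed with log-gamma: ln(N! / (N_up! N_down!))."""
    n_up = round(N * (1 + m) / 2)          # Eq. (1.6.22)
    n_dn = N - n_up
    return math.lgamma(N + 1) - math.lgamma(n_up + 1) - math.lgamma(n_dn + 1)

def entropy_stirling(N, m):
    """Eq. (1.6.16) for uniform m: S/N = ln 2 - [(1+m)ln(1+m) + (1-m)ln(1-m)]/2."""
    return N * (math.log(2)
                - 0.5 * ((1 + m) * math.log(1 + m) + (1 - m) * math.log(1 - m)))

N, m = 100000, 0.4
exact = entropy_exact(N, m)
approx = entropy_stirling(N, m)
# The neglected terms are of order ln N, so the relative error shrinks with N.
assert abs(exact - approx) / exact < 1e-3
```

For this N the discrepancy is of order (ln N)/N relative to the total, illustrating why dropping the sub-extensive terms is harmless in the thermodynamic limit.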
1.6.2 The Ising model
We now add the essential aspect of the Ising model—interactions between the spins. The location of the spins in space was unimportant in the case of the noninteracting model. However, for the interacting model, we consider the spins to be located on a periodic lattice in space. Similar to the CA models of Section 1.5, we allow the spins to interact only with their nearest neighbors. It is conventional to interpret neighbors strictly as the spins with the shortest Euclidean distance from a particular site. This means that for a cubic lattice there are two, four and six neighbors in one, two and three dimensions respectively. We will assume that the interaction with each of the neighbors is the same, and we write the energy as:

E[\{s_i\}] = -\sum_i h_i s_i - J \sum_{\langle ij \rangle} s_i s_j    (1.6.23)

The notation \langle ij \rangle under the summation indicates that the sum is to be performed over all i and j that are nearest neighbors. For example, in one dimension this could be written as:

E[\{s_i\}] = -\sum_i h_i s_i - J \sum_i s_i s_{i+1}    (1.6.24)

If we wanted to emphasize that each spin interacts with its two neighbors, we could write this as

E[\{s_i\}] = -\sum_i h_i s_i - J \frac{1}{2}\sum_i \left( s_i s_{i+1} + s_i s_{i-1} \right)    (1.6.25)

where the factor of 1/2 corrects for the double counting of the interaction between every two neighboring spins. In two and three dimensions (2-d and 3-d), there is need of additional indices to represent the spatial dependence. We could write the energy in 2-d as:

E[\{s_{i,j}\}] = -\sum_{i,j} h_{i,j} s_{i,j} - J \sum_{i,j} \left( s_{i,j} s_{i+1,j} + s_{i,j} s_{i,j+1} \right)    (1.6.26)

and in 3-d as:

E[\{s_{i,j,k}\}] = -\sum_{i,j,k} h_{i,j,k} s_{i,j,k} - J \sum_{i,j,k} \left( s_{i,j,k} s_{i+1,j,k} + s_{i,j,k} s_{i,j+1,k} + s_{i,j,k} s_{i,j,k+1} \right)    (1.6.27)

In these sums, each nearest-neighbor pair appears only once. We will be able to hide the additional indices in 2-d and 3-d by using the nearest-neighbor notation \langle ij \rangle as in Eq. (1.6.23).
The interaction J between spins may arise from many different sources. Similar to the derivation of h_i in Eq. (1.6.2), this is the only form that an interaction between two spins can take (Question 1.6.2). There are two distinct possibilities for the behavior of the system depending on the sign of the interaction. Either the interaction tries to orient the spins in the same direction (J > 0) or in the opposite direction (J < 0). The former is called a ferromagnet and is the common form of a magnet. The other is called an antiferromagnet (Section 1.6.4) and has very different external properties, but can be represented by the same model, with J having the opposite sign.

Question 1.6.2 Show that the form of the interaction given in Eq. (1.6.24), Jss', is the most general interaction between two spins.

Solution 1.6.2 We write as a general form of the energy of two spins:
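The one-dimensional energy, Eq. (1.6.24), translates directly into code. A sketch; the periodic boundary condition (the last spin couples back to the first) is an assumption added here so that every spin has exactly two neighbors:

```python
def ising_energy_1d(s, h, J):
    """E[{s_i}] = -sum_i h_i s_i - J sum_i s_i s_{i+1}, periodic boundaries."""
    N = len(s)
    field_term = -sum(h[i] * s[i] for i in range(N))
    bond_term = -J * sum(s[i] * s[(i + 1) % N] for i in range(N))
    return field_term + bond_term

# All spins UP in zero field: each of the N bonds contributes -J.
s = [1] * 6
assert ising_energy_1d(s, h=[0.0] * 6, J=1.0) == -6.0

# Flipping one spin breaks two bonds, raising the energy by 2 * 2J.
s[2] = -1
assert ising_energy_1d(s, h=[0.0] * 6, J=1.0) == -2.0
```

The second check makes the ferromagnetic case (J > 0) concrete: misaligned neighbors cost energy, which is what drives the collective alignment discussed below.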
e(s, s') = e(1,1)\frac{(1+s)(1+s')}{4} + e(1,-1)\frac{(1+s)(1-s')}{4} + e(-1,1)\frac{(1-s)(1+s')}{4} + e(-1,-1)\frac{(1-s)(1-s')}{4}    (1.6.28)

If we expand this we will find a constant term, terms that are linear in s and s', and a term that is proportional to ss'. The linear terms give rise to the local field h_i, and the final term is the interaction. There are other possible interactions that could be written that would include three or more spins.

In a magnetic system, each microscopic spin is itself the source of a small magnetic field. Magnets have the property that they can be the source of a macroscopic magnetic field. When a material is a source of a magnetic field, we say that it is magnetized. The magnetic field arises from constructive superposition of the microscopic sources of the magnetic field that we represent as spins. In effect, the small spins combine together to form a large spin. We have seen in Section 1.6.1 that when there is a magnetic field h_i, each spin will orient itself with the magnetic field. This means that in an external field—a field due to a source outside of the magnet—there will be a macroscopic orientation of the spins, and they will in turn give rise to a magnetic field. Magnets, however, can be the source of a magnetic field even when there is no external field. This occurs only below a particular temperature known as the Curie temperature of the material. At higher temperatures, a magnetization exists only in an external magnetic field. The Ising model captures this behavior by showing that the interactions between the spins can cause a spontaneous orientation of the spins without any external field. The spontaneous magnetization is a collective phenomenon. It would not exist for an isolated spin, or even for a small collection of interacting spins.
Ultimately, the reason that the spontaneous magnetization is a collective phenomenon has more to do with the kinetics than the thermodynamics of the system. The spontaneous magnetization must occur in a particular direction. Without an external field, there is no reason for any particular direction, but the system must choose one. In our case, it must choose between one of two possibilities—UP or DOWN. Once the magnetization occurs, it breaks a symmetry of the system, because we can now tell the difference between UP and DOWN on the macroscopic scale. At this point, the kinetics of the system must reenter. If the system were able to flip between UP and DOWN very rapidly, we would not be able to measure either case. However, we know that if all of the spins have to flip at once, the likelihood of this happening becomes vanishingly small as the number of spins grows. Thus for a large number of spins in a macroscopic material, this flipping becomes slower than our observation of the magnet. On the other hand, if we had only a few spins, they would still flip back and forth. It is this property of the system that makes the spontaneous magnetization a collective phenomenon.
Returning briefly to the discussion at the end of Section 1.3, we see that by choosing a direction for the magnetization, the magnet breaks the ergodic theorem. It is no longer possible to represent the system using an ensemble with all possible states of
the system. We must exclude half of the states—those that have the opposite magnetization. The reason, as we described there, is the existence of a slow process, or a long time scale, that prevents the system from going from one choice of magnetization to the other.
The existence of a spontaneous magnetization arises because of the energy lowering of the system when neighboring spins align with each other. At sufficiently low temperatures, this causes the system to align collectively one way or the other. Above the Curie temperature, T_c, the energy gain by alignment is destroyed by the temperature-induced random flipping of individual spins. We say that the higher-temperature phase is a disordered phase, as compared to the ordered low-temperature phase, where all spins are aligned. When we think about this thermodynamically, the disorder is an effect of optimizing the entropy, which promotes the disordered state and competes with the energy as the temperature is increased.
1.6.3 Mean field theory
Despite the simplicity of the Ising model, it has never been solved exactly except in one dimension, and in two dimensions for h_i = 0. The techniques that are useful in these cases do not generalize well. We will emphasize instead a powerful approximation technique for describing systems of many interacting parts known as the mean field approximation. The idea of this approximation is to treat a single element of the system under the average influence of the rest of the system. The key to doing this correctly is to recognize that this average must be performed self-consistently. The meaning of self-consistency will be described shortly. The mean field approximation cannot be applied to all interacting systems. However, when it can be, it enables the system to be understood in a direct way.
To use the mean field approximation, we single out a particular spin s_i and find the effective field (or mean field) h_i' that it experiences. This field is obtained by replacing all variables in the energy by their average values, except for s_i. This leads to an effective energy E_MF(s_i) for s_i. To obtain it we can neglect all terms in the energy (Eq. (1.6.23)) that do not include s_i:

E_{MF}(s_i) = -h_i s_i - J s_i \sum_{j\,nn} \langle s_j \rangle = -h_i' s_i, \qquad h_i' = h_i + J \sum_{j\,nn} \langle s_j \rangle    (1.6.29)

The sum is over all nearest neighbors of s_i. If we are able to find the mean field h_i', then we can solve this interacting Ising model using the solution of the Ising model without interactions. The problem is that in order to find the field we have to know the average value of the spins, which in turn depends on the effective fields. This is the self-consistency. We will develop a single algebraic equation for the solution. It is interesting first to consider this problem when the external fields h_i are zero. Eq. (1.6.29) shows that a mean field might still exist. When the external field is zero, each of the spin variables has the same equation. We might guess that the average value of the spin in one location will be the same as that in any other location:
m = m_i = \langle s_i \rangle    (1.6.30)
In this case our equations become:

h_i' = zJm    (1.6.31)

where z is the number of nearest neighbors, known as the coordination number of the system. Eq. (1.6.10) gives us the value of the average magnetization when the spin is subject to a field. Using this same expression under the influence of the mean field we have

m = \tanh(\beta h_i') = \tanh(\beta zJm)    (1.6.32)

This is the self-consistent equation, which gives the value of the magnetization in terms of itself. The solution of this equation may be found graphically, as illustrated in Fig. 1.6.3, by plotting the functions y = m and y = \tanh(\beta zJm) and finding their intersections. There is always a solution m = 0. In addition, for values of \beta zJ > 1, there are two more solutions related by a change of sign, m = \pm m_0(\beta zJ), where we name the positive solution m_0(\beta zJ). When \beta zJ = 1, the line y = m is tangent to the plot of y = \tanh(\beta zJm) at m = 0. For values \beta zJ > 1, the curve y = \tanh(\beta zJm) must rise above the line y = m for small positive m and then cross it. The crossing point is the solution m_0(\beta zJ). m_0(\beta zJ) approaches one asymptotically as \beta zJ \to \infty, e.g. as the temperature goes to zero. A plot of m_0(\beta zJ) from a numerical solution of Eq. (1.6.32) is shown in Fig. 1.6.4.
We see that there are two different regimes for this model, with a transition at a temperature T_c given by \beta zJ = 1, or

kT_c = zJ    (1.6.33)
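Besides the graphical construction, the self-consistent equation (1.6.32) can be solved numerically by fixed-point iteration. A sketch; the starting seed and tolerances are arbitrary choices:

```python
import math

def mean_field_m(beta_zJ, tol=1e-12, max_iter=100000):
    """Iterate m <- tanh(beta z J m) from a positive seed until it converges."""
    m = 0.5                    # positive seed selects the m >= 0 branch
    for _ in range(max_iter):
        m_new = math.tanh(beta_zJ * m)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

assert mean_field_m(0.5) < 1e-6    # above T_c (beta z J < 1): only m = 0
assert mean_field_m(2.0) > 0.9     # below T_c: spontaneous magnetization m_0
```

The iteration converges to m_0(\beta zJ) when \beta zJ > 1 and collapses to zero otherwise, reproducing the transition at kT_c = zJ.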
To understand what is happening, it is helpful to look at the energy U(m) and the free energy F(m) as a function of the magnetization, assuming that all spins have the same magnetization. We will treat the magnetization as a parameter that can be varied. The actual magnetization is determined by minimizing the free energy.
To determine the energy, we must average Eq. (1.6.23), which includes a product of spins on neighboring sites. The mean field approximation treats each spin as if it were independent of other spins except for their average field. This implies that we have neglected correlations between the value of one spin and the others around it. Assuming that the spins are uncorrelated means the average over the product of two spins may be approximated by the product over the averages:

\langle s_i s_j \rangle \approx \langle s_i \rangle \langle s_j \rangle = m^2    (1.6.34)

The average over the energy without any external fields is then:

U(m) = \left\langle -J \sum_{\langle ij \rangle} s_i s_j \right\rangle = -\frac{1}{2} NJzm^2    (1.6.35)
The factor of 1/2 arises because we count each interaction only once (see Eqs. (1.6.24)–(1.6.27)). A sum over the average of E_MF(s_i) would give twice as much, due to counting each of the interactions twice.
Since we have fixed the magnetization of all spins to be the same, we can use the entropy we found in Question 1.6.1 to obtain the free energy as:
F(m) = -\frac{1}{2}NJzm^2 - NkT\left[ \ln(2) - \frac{1}{2}\left( (1+m)\ln(1+m) + (1-m)\ln(1-m) \right) \right]    (1.6.36)

This free energy is plotted in Fig. 1.6.5 as a function of m for various values of kT/Jz. We see that the behavior of this system is precisely the behavior of a second-order phase transition described in Section 1.3. Above the transition temperature T_c there is only one possible phase, and below T_c there are two phases of equal energy. Question 1.6.3 clarifies a technical point in this derivation, and Question 1.6.4 generalizes the solution to include nonzero magnetic fields h_i ≠ 0.

Figure 1.6.3 Graphical solution of Eq. (1.6.32), m = \tanh(\beta zJm), by plotting both the left- and right-hand sides of the equation as a function of m and looking for the intersections. m = 0 is always a solution. To consider other possible solutions, we note that both functions are antisymmetric in m, so we need only consider positive values of m. For every positive solution there is a negative solution of equal magnitude. When \beta zJ = 1 the slope of both sides of the equation is the same at m = 0. For \beta zJ > 1 the slope of the right side is greater than that of the left side at m = 0. For large positive values of m the right side of the equation is always less than the left side. Thus for \beta zJ > 1 there must be an additional solution. The solution is plotted in Fig. 1.6.4.
Question 1.6.3 Show that the minima of the free energy are the solutions of Eq. (1.6.32). This shows that our derivation is internally consistent—specifically, that our two ways of defining the mean field approximation, first using Eq. (1.6.29) and then using Eq. (1.6.34), are compatible.

Solution 1.6.3 Taking the derivative of the free energy, Eq. (1.6.36), with respect to m and setting it to zero gives:

0 = -Jzm + kT\frac{1}{2}\left( \ln(1+m) - \ln(1-m) \right)    (1.6.37)

Recognizing the inverse of tanh, as in Eq. (1.6.14), gives back Eq. (1.6.32) as desired.
Statistical fields
[Figure 1.6.4: m(zJ) plotted for zJ from 0 to 2, with m between 0 and 1.]

Figure 1.6.4 The mean field approximation solution of the Ising model gives the magnetization (average value of the spin) as a solution of Eq. (1.6.32). The solution is shown as a function of zJ. As discussed in Fig. 1.6.3 and the text, for zJ > 1 there are three solutions. Only the positive one is shown. The solution m = 0 is unstable, as can be seen by analysis of the free energy shown in Fig. 1.6.5. The other solution is the negative of that shown.
[Figure 1.6.5 (three panels of free energy versus m, −1 ≤ m ≤ 1): (a) h = 0 with kT = 0.8, 0.9, 1.0, 1.1; (b) h = 0.1 with kT = 0.8, 0.9, 1.0, 1.1; (c) kT = 0.8 with h = 0.1, 0.05, 0, −0.05.]
Solution 1.6.4 Applying an external magnetic field breaks the symmetry between the two different minima in the energy that we have found. In this case we have, instead of Eq. (1.6.29),

$E_{MF}(s_i) = -h_i' s_i \qquad h_i' = h + zJm$  (1.6.38)

The self-consistent equation, instead of Eq. (1.6.32), is:

$m = \tanh(h + zJm)$  (1.6.39)

Averaging over the energy gives:

$U(m) = \left\langle -h\sum_i s_i - J\sum_{\langle ij \rangle} s_i s_j \right\rangle = -Nhm - \tfrac{1}{2}NJzm^2$  (1.6.40)

The entropy is unchanged, so the free energy becomes:

$F(m) = -Nhm - \tfrac{1}{2}NJzm^2 - NkT\left[\ln(2) - \tfrac{1}{2}\bigl((1+m)\ln(1+m) + (1-m)\ln(1-m)\bigr)\right]$  (1.6.41)

Several plots are shown in Fig. 1.6.5. Above the kT_c of Eq. (1.6.33), the application of an external magnetic field gives rise to a magnetization by shifting the location of the single minimum. Below this temperature there is a tilting of the two minima. Thus, going from a positive to a negative value of h would give an abrupt transition, a first-order transition which occurs at exactly h = 0.

In discussing the mean field equations, we have assumed that we could specify the magnetization as a parameter to be optimized. However, the prescription we have from thermodynamics is that we should take all possible states of the system with a Boltzmann probability. What is the justification for limiting ourselves to only one value of the magnetization? We can argue that in a macroscopic system, the optimal
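The tilting of the minima and the switch at h = 0 can be checked by evaluating the free energy of Eq. (1.6.41) per spin on a grid of m values. This is a minimal sketch under our own parameter choices (zJ = 1, kT = 0.8, h = ±0.05); the function and variable names are ours, not from the text.

```python
import math

def free_energy_per_spin(m, h, kT, zJ=1.0):
    """Mean field free energy per spin, Eq. (1.6.41) divided by N."""
    entropy = math.log(2) - 0.5 * ((1 + m) * math.log(1 + m)
                                   + (1 - m) * math.log(1 - m))
    return -h * m - 0.5 * zJ * m * m - kT * entropy

def magnetization_at_minimum(h, kT, n=4001):
    """Locate the global minimum of F(m) on a grid of m values in (-1, 1)."""
    ms = [-0.9999 + 1.9998 * k / (n - 1) for k in range(n)]
    return min(ms, key=lambda m: free_energy_per_spin(m, h, kT))

# Below the transition (kT/zJ = 0.8 < 1) the minimum sits at a nonzero m
# whose sign follows the sign of h: the first-order jump at h = 0.
for h in (0.05, -0.05):
    print(h, round(magnetization_at_minimum(h, kT=0.8), 3))
```

A grid search is crude but sufficient here: the free energy is smooth and the two competing minima are well separated, so the global minimum flips sign discontinuously as h passes through zero.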
Figure 1.6.5 Plots of the mean field approximation to the free energy. (a) shows the free energy for h = 0 as a function of m for various values of kT. The free energy and kT are measured in units of Jz. As the temperature is lowered below kT/zJ = 1 there are two minima instead of one (shown by arrows). These minima are the solutions of Eq. (1.6.32) (see Question 1.6.3). The solutions are illustrated in Fig. 1.6.4. (b) shows the same curves as (a) but with a magnetic field h/zJ = 0.1. The location of the minimum gives the value of the magnetization. The magnetic field causes a magnetization to exist at all temperatures, but it is larger at lower temperatures. At the lowest temperature shown, kT/zJ = 0.8, the effect of the phase transition can be seen in the beginnings of a second (metastable) minimum at negative values of the magnetization. (c) shows plots at a fixed temperature of kT/zJ = 0.8 for different values of the magnetic field. As the value of the field goes from positive to negative, the minimum of the free energy switches from positive to negative values discontinuously. At exactly h = 0 there is a discontinuous jump from positive to negative magnetization: a first-order phase transition.
value of the magnetization will so dominate other magnetizations that any other possibility is negligible. This is reasonable except for the case when the magnetic field is close to zero, below T_c, and we have two equally likely magnetizations. In this case, the usual justification does not hold, though it is often implicitly applied. A more complete justification requires a discussion of kinetics given in Section 1.6.6.

Using the results of Question 1.6.4, we can draw a phase diagram like that illustrated in Section 1.3 for water (Fig. 1.3.7). The phase diagram of the Ising model (Fig. 1.6.6) describes the transitions as a function of temperature and magnetic field h. It is very simple for the case of the magnetic system, since the first-order phase transition line lies along the h = 0 axis and ends at the second-order transition point given by Eq. (1.6.33).
1.6.4 Antiferromagnets
We found the existence of a phase transition in the last section from the self-consistent mean field result (Eq. (1.6.32)), which showed that there was a nonzero magnetization for zJ > 1. This condition is satisfied for small enough temperature as long as J > 0. What about the case of J < 0? There are no additional solutions of Eq. (1.6.32) for this case. Does this mean there is no phase transition? Actually, it means that one of our assumptions is not a good one. When J < 0, each spin would like (has a lower energy if…) its neighbors to antialign rather than align their spins. However, we have assumed that all spins have the same magnetization, Eq. (1.6.30). The self-consistent equation assumes and does not guarantee that all spins have the same magnetization. This assumption is not a good one when the spins are trying to antialign.
[Figure 1.6.6: phase diagram in the (h, kT) plane; the line of first-order transitions lies along h = 0 and ends at kT_c = zJ.]

Figure 1.6.6 The phase diagram of the Ising model found from the mean field approximation. The line of first-order phase transitions at h = 0 ends at the second-order phase transition point given by Eq. (1.6.33). For positive values of h there is a net positive magnetization and for negative values there is a negative magnetization. The change through h = 0 is continuous above the second-order transition point, and discontinuous below it.
[Figure 1.6.7: a square lattice (top) with a few sites labeled by indices (0,0), (1,0), (1,1), (1,−1), (0,1), (−1,1), (−1,0), (−1,−1), and a hexagonal lattice (bottom).]

Figure 1.6.7 In order to obtain mean field equations for the antiferromagnetic case J < 0 we consider a square lattice (top) and label every site according to the sum of its rectilinear indices as odd (open circles) or even (filled circles). A few sites are shown with indices. Each site is understood to be the location of a spin. We then invert the spins (redefine them by s → −s) that are on odd sites and find that the new system satisfies the same equations as the ferromagnet. The same trick works for any bipartite lattice; for example the hexagonal lattice shown (bottom). By using this trick we learn that at low temperatures the system will have a spontaneous magnetism that is positive on odd sites and negative on even sites, or the opposite.
We can solve the case of a sys tem wit h J < 0 on a squ a re or cubic latti ce direct ly us
ing a tri ck . We label ever y spin by indices (i , j) in 2d, as indicated in Fig. 1 . 6 . 7 , or (i , j, k)
in 3d. Th en we con s i der sep a ra tely the spins whose indices sum to an odd nu m ber
( “odd spins”) and those whose indices sum to an even nu m ber ( “even spins” ) . No te
that all the nei gh bor s of an odd spin are even and all nei gh bor s of an even spin are od d .
Now we inver t all of the odd spins. Ex p l i c i t ly we define new spin va ri a bles in 3d as
s′
ijk
· (−1)
i +j +k
s
ijk
(1.6.42)
In terms of these new spins,the energy without an ext ernal magnetic field is the same
as before, except that each ter m in the sum has a single additional factor of (–1). There
is only one factor of (−1) b ecause ever y nearest neighbor pair has one odd and one
even spin. Thus:
(1.6.43)
We have com p l eted the tr a n s form a ti on by defining a new inter acti on J′ · –J > 0. In
terms of the new va ri a bl e s , we are back to t he fer rom a gn et . The soluti on is t he
s a m e , and bel ow the tem pera tu r e given by k T
c
· zJ′ t h ere wi ll be a spon t a n eo u s
m a gn eti z a ti on of t he new spin va ri a bl e s . What happens in ter ms of the or i gi n a l
va ri a bles? Th ey become ant i a l i gn ed . All of the even spins have magn eti z a ti on in
one directi on , U P, and t he odd spins have magn eti z a ti on in the oppo s i te direct i on ,
DOW N, or vi ce vers a . This lowers t he en er gy of t he sys tem , because the nega tive in
ter act i on J < 0 means t hat all of t he nei gh boring spins want to ant i a l i gn . This is
c a ll ed an anti fer rom a gn et .
The t r ick we have used to solve the antif er romagnet works for certain kinds of
per iodic ar rangements of spins called bipar tite lattices. A bipar tite lattice can be di
vided into two lattices so that all the nearest neighbors of a member of one lattice are
member s of the other lattice. This is exactly what we need in o rder for our r edefini
tion of the spin variables to work. Many lattices are bipartit e,including the cubic lat
tice and the hexagonal honeycomb lattice illustrated in Fig. 1.6.7. However, the trian
gular lattice, illustrated in Fig. 1.6.8, is not.
The t riangular lattice exemplifies an important concept in interacting syst ems
known as frust ration. Consider what happens when we t r y to assign magnet izations
to each of the spins on a tr iangular lattice in an effor t to create a configuration with a
lower energy than a disordered system. We start at a position marked (1) on Fig. 1.6.8
and assign it a magnetization of m. Then, since it wants its neighbors to be an
tialigned, we assign position (2) a magnet ization of −m. What do we do with the spin
at (3)? It has interactions both with the spin at (1) and with the spin at (2). These in
ter act ions would have it be antiparallel with both—an impossible task. We say that the
spin at (3) is frustrated,since it cannot simultaneously satisfy the conflicting demands
upon it. It should not come as a surprise that the phenomenon of frustr ation becomes
a commonplace occur rence in more complex syst ems. We might even say that frus
t ration is a source of complexit y.
E[{
′
s
i
}] · −J s
i
s
j
<ij >
∑
· −( −J)
′
s
i
′
s
j
<ij>
∑
· −
′
J
′
s
i
′
s
j
<ij >
∑
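The sublattice inversion of Eq. (1.6.42) and the resulting identity Eq. (1.6.43) can be verified directly on a small periodic square lattice. This sketch is our own construction (names and lattice size are ours): it checks that a random spin configuration with coupling J < 0 has exactly the same energy as the inverted configuration with J' = −J.

```python
import random

def energy(spins, J):
    """E = -J * sum over nearest-neighbor pairs on an L x L periodic square lattice."""
    L = len(spins)
    total = 0
    for i in range(L):
        for j in range(L):
            # count each bond once: right and down neighbors (periodic)
            total += spins[i][j] * (spins[i][(j + 1) % L] + spins[(i + 1) % L][j])
    return -J * total

random.seed(0)
L = 6  # even L, so the two sublattices are compatible with the periodic boundaries
spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
flipped = [[(-1) ** (i + j) * spins[i][j] for j in range(L)] for i in range(L)]

J = -1.0
print(energy(spins, J), energy(flipped, -J))  # the two energies coincide
```

Every bond joins one odd and one even site, so flipping the odd sublattice changes the sign of every product s_i s_j, which is exactly compensated by replacing J with J' = −J.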
[Figure 1.6.8: a triangular lattice (top) with three sites labeled (1), (2), (3), and (bottom) the triangular lattice with magnetizations superposed from the hexagonal lattice of Fig. 1.6.7.]

Figure 1.6.8 A triangular lattice (top) is not a bipartite lattice. In this case we cannot solve the antiferromagnet J < 0 by the same method as used for the square lattice (see Fig. 1.6.7). If we try to assign magnetizations to different sites we find that assigning a magnetization to site (1) would lead site (2) to be antialigned. This combination would, however, require site (3) to be antialigned to both sites (1) and (2), which is impossible. We say that site (3) is "frustrated." The bottom illustration shows what happens when we take the hexagonal lattice from Fig. 1.6.7 and superpose the magnetizations on the triangular lattice, leaving the additional sites (shaded) as unmagnetized (see Questions 1.6.5–1.6.7).
Question 1.6.5 Despite the existence of frustration, it is possible to construct a state with lower energy than a completely disordered state on the triangular lattice. Construct one of them and evaluate its free energy.
Solution 1.6.5 We construct the state by extending the process discussed in the text for assigning magnetizations to individual sites. We start by assigning a magnetization m to site (1) in Fig. 1.6.8 and −m to site (2). Because site (3) is frustrated, we assign it no magnetization. We continue by assigning magnetizations to any site that already has two neighbors that are assigned magnetizations. We assign a magnetization of m when the neighbors are −m and 0, a magnetization of −m when the neighbors are m and 0, and a magnetization of 0 when the neighbors are m and −m. This gives the illustration at the bottom of Fig. 1.6.8. Comparing with Fig. 1.6.7, we see that the magnetized sites correspond to the honeycomb lattice. One-third of the triangular lattice sites have a magnetization of +m, one-third have −m, and one-third have 0. Each magnetized site has three neighbors of the opposite magnetization and three unmagnetized neighbors. The free energy of this state is given by:

$F(m) = NJm^2 - \tfrac{1}{3}NkT\ln(2) - \tfrac{2}{3}NkT\left[\ln(2) - \tfrac{1}{2}\bigl((1+m)\ln(1+m) + (1-m)\ln(1-m)\bigr)\right]$  (1.6.44)

The first term is the energy. Each nearest neighbor pair of spins that are antialigned provides an energy Jm². Let us call this a bond between two spins. There are a total of three interactions for every spin (each spin interacts with six other spins, but we can count each interaction only once). However, on average there is only one out of three interactions that is a bond in this system. To count the bonds, note that one out of three spins (with m_i = 0) has no bonds, while the other two out of three spins each have three bonds. This gives a total of six bonds for three sites, but each bond must be counted only once for a pair of interacting spins. We divide by two to get three bonds for three spins, or an average of one bond per site. The second term in Eq. (1.6.44) is the entropy of the N/3 unmagnetized sites, and the third term is the entropy of the 2N/3 magnetized sites.

There is another way to systematically construct a state with an energy lower than a completely disordered state. Assign magnetizations +m and −m alternately along one straight line (a one-dimensional antiferromagnet). Then skip both neighboring lines by setting all of their magnetizations to zero. Then repeat the antiferromagnetic line on the next parallel line. This configuration of alternating antiferromagnetic lines is also lower in energy than the disordered state, but it is higher in energy than the configuration shown in Fig. 1.6.8 at low enough temperatures, as discussed in the next question.
Question 1.6.6 Show that the state illustrated on the bottom of Fig. 1.6.8 has the lowest possible free energy as the temperature goes to zero, at least in the mean field approximation.
Solution 1.6.6 As the temperature goes to zero, the entropic contribution to the free energy is irrelevant. The energy of the Ising model is minimized in the mean field approximation when the magnetization is +1 if the local effective field is positive, or −1 if it is negative. The magnetization is arbitrary if the effective field is zero. If we consider three spins arranged in a triangle, the lowest possible energy of the three interactions between them is given by having one with m = +1, one with m = −1 and the other arbitrary. This is forced, because we must have at least one +1 and one −1, and then the other is arbitrary. This is the optimal energy for any triangle of interactions. The configuration of Fig. 1.6.8 achieves this optimal arrangement for all triangles and therefore must give the lowest possible energy of any state.
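The "forced" argument for a single triangle can be confirmed by exhaustive enumeration. In this sketch (ours, with J = −1) we list all eight configurations of three mutually coupled spins and their energies E = −J(s₁s₂ + s₂s₃ + s₃s₁):

```python
from itertools import product

J = -1  # antiferromagnetic coupling
energies = {}
for s in product((-1, 1), repeat=3):
    # the three bonds of a single triangle
    energies[s] = -J * (s[0] * s[1] + s[1] * s[2] + s[2] * s[0])

ground = min(energies.values())
ground_states = [s for s, E in energies.items() if E == ground]
print(ground, len(ground_states))
```

The minimum energy is −1 rather than −3: at least one of the three bonds is always unsatisfied, and the six ground states are exactly the configurations with one +1, one −1, and one arbitrary spin, as described above.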
Question 1.6.7 In the case of the ferromagnet and the antiferromagnet, we found that there were two different states of the system with the same energy at low temperatures. How many states are there of the kind shown in Fig. 1.6.8 and described in Questions 1.6.5 and 1.6.6?
Solution 1.6.7 There are two ways to count the states. The first is to count the number of distinct magnetization structures. This counting is as follows. Once we assign the values of the magnetization on a single triangle, we have determined them everywhere in the system. This follows by inspection or by induction on the size of the assigned triangle. Since we can assign arbitrarily the three different magnetizations (m, −m, 0) within a triangle, there are a total of six such distinct magnetization structures.

We can also count how many distinct arrangements of spins there are. This is relevant at low temperatures when we want to know the possible states at the lowest energy. We see that there are 2^{N/3} arrangements of the arbitrary spins for each of the magnetizations. If we want to count all of the states, we can almost multiply this number by 6. We have to correct this slightly because of states where the arbitrary spins are all aligned UP or DOWN. There are two of these for each arrangement of the magnetizations, and these will be counted twice. Making this correction gives 6(2^{N/3} − 1) states. We see that frustration gives rise to a large number of lowest energy states.

We have not yet proven that these are the only states with the lowest energy. This follows from the requirement that every triangle must have its lowest possible energy, and the observation that setting the value of the magnetizations of one triangle then forces the values of all other magnetizations uniquely.
Question 1.6.8 We discovered that our assumption that all spins should have the same magnetization does not always apply. How do we know that we found the lowest energy in the case of the ferromagnet? Answer this for the case of h = 0 and T = 0.
Solution 1.6.8 To minimize the energy, we can consider each term of the energy, which is just the product of spins on adjacent sites. The minimum possible value for each term of a ferromagnet occurs for aligned spins. The two states we found at T = 0, with m_i = 1 and m_i = −1, are the only possible states with all spins aligned. Since they give the minimum possible energy, they must be the correct states.
1.6.5 Beyond mean field theory (correlations)
Mean field theory treats only the average orientation of each spin and assumes that spins are uncorrelated. This implies that when one spin changes its sign, the other spins do not respond. Since the spins are interacting, this must not be true in a more complete treatment. We expect that even above T_c, nearby spins align to each other. Below T_c, nearby spins should be more aligned than would be suggested by the average magnetization. Alignment of spins implies their values are correlated. How do we quantify the concept of correlation? When two spins are correlated they are more likely to have the same value. So we might define the correlation of two spins as the average of the product of the spins:

$\langle s_i s_j \rangle = \sum_{s_i, s_j} s_i s_j P(s_i, s_j) = P_{s_i s_j}(1,1) + P_{s_i s_j}(-1,-1) - P_{s_i s_j}(-1,1) - P_{s_i s_j}(1,-1)$  (1.6.45)

According to this definition, they are correlated if they are both always +1, so that P_{s_i s_j}(1,1) = 1. Then ⟨s_i s_j⟩ achieves its maximum possible value +1. The problem with this definition is that when s_i and s_j are both always +1 they are completely independent of each other, because each one is +1 independently of the other. Our concept of correlation is the opposite of independence. We know that if spins are independent, then their joint probability distribution factors (see Section 1.2)

$P(s_i, s_j) = P(s_i)P(s_j)$  (1.6.46)

Thus we define the correlation as a measure of the departure of the joint probability from the product of the individual probabilities:

$\sum_{s_i, s_j} s_i s_j \bigl(P(s_i, s_j) - P(s_i)P(s_j)\bigr) = \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle$  (1.6.47)

This definition means that when the correlation is zero, we can say that s_i and s_j are independent. However, we must be careful not to assume that they are not aligned with each other. Eq. (1.6.45) measures the spin alignment.

Question 1.6.9 One way to think about the difference between Eq. (1.6.45) and Eq. (1.6.47) is by considering a hierarchy of correlations. The first kind of correlation is of individual spins with themselves and is just the average of the spin. The second kind are correlations between pairs of spins that are not contained in the first kind. Define the next kind of correlation in the hierarchy that would describe correlations between three spins but exclude the correlations that appear in the first two.
Solution 1.6.9 The first three elements in the hierarchy of correlations are:

$\langle s_i \rangle$

$\langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle$  (1.6.48)

$\langle s_i s_j s_k \rangle - \langle s_i s_j \rangle \langle s_k \rangle - \langle s_i s_k \rangle \langle s_j \rangle - \langle s_j s_k \rangle \langle s_i \rangle + 2 \langle s_i \rangle \langle s_j \rangle \langle s_k \rangle$

The expression for the correlation of three spins can be checked by seeing what happens if the variables are independent. When variables are independent, the average of their product is the same as the product of their averages. Then all averages become products of averages of single variables and everything cancels. Similarly, if the first two variables s_i and s_j are correlated and the last one s_k is independent of them, then the first two terms cancel and the last three terms also cancel. Thus, this expression measures the correlations of three variables that are not present in any two of them.
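The cancellation argument can be checked by exact enumeration over joint distributions of three ±1 variables. This sketch is ours; the parity example in the last lines (s_k = s_i s_j) is our own illustration of a pure three-spin correlation, not from the text.

```python
from itertools import product

def avg(p, f):
    """Expectation of f(s) under a distribution p over spin triples s."""
    return sum(prob * f(s) for s, prob in p.items())

def connected3(p):
    """Third element of the hierarchy: the three-spin connected correlation."""
    m = [avg(p, lambda s, a=a: s[a]) for a in range(3)]
    m01 = avg(p, lambda s: s[0] * s[1])
    m02 = avg(p, lambda s: s[0] * s[2])
    m12 = avg(p, lambda s: s[1] * s[2])
    m012 = avg(p, lambda s: s[0] * s[1] * s[2])
    return m012 - m01 * m[2] - m02 * m[1] - m12 * m[0] + 2 * m[0] * m[1] * m[2]

# independent biased spins: the connected correlation vanishes
pi = {+1: 0.7, -1: 0.3}
indep = {(a, b, c): pi[a] * pi[b] * pi[c]
         for a, b, c in product((-1, 1), repeat=3)}
# s_k = s_i * s_j with unbiased independent s_i, s_j: every single and pair
# average vanishes, so the three-spin term carries all of the correlation
parity = {(a, b, a * b): 0.25 for a, b in product((-1, 1), repeat=2)}
print(connected3(indep), connected3(parity))
```

The first value vanishes (up to rounding), while the second equals 1: three variables can be strongly correlated even when every pair of them looks independent.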
Question 1.6.10 To see the difference between Eqs. (1.6.45) and (1.6.47), evaluate them for two cases: (a) s_i is always equal to 1 and s_j is always equal to −1, and (b) s_i is always the opposite of s_j but each of them averages to zero (i.e., is equally likely to be +1 or −1).
Solution 1.6.10

a. P_{s_i s_j}(1,−1) = 1, so ⟨s_i s_j⟩ = −1, but ⟨s_i s_j⟩ − ⟨s_i⟩⟨s_j⟩ = 0.

b. ⟨s_i s_j⟩ = −1, and ⟨s_i s_j⟩ − ⟨s_i⟩⟨s_j⟩ = −1.
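Both cases can be restated as joint distributions over the four spin pairs and evaluated mechanically. The following short sketch (ours) reproduces the two answers:

```python
def alignment_and_correlation(p):
    """p maps (si, sj) -> probability; returns (<si sj>, <si sj> - <si><sj>)."""
    pairs = [(si, sj, prob) for (si, sj), prob in p.items()]
    align = sum(si * sj * prob for si, sj, prob in pairs)
    mi = sum(si * prob for si, sj, prob in pairs)
    mj = sum(sj * prob for si, sj, prob in pairs)
    return align, align - mi * mj

# (a) si always +1, sj always -1: aligned value -1, zero correlation
case_a = {(1, -1): 1.0}
# (b) si always opposite to sj, each unbiased: correlation -1
case_b = {(1, -1): 0.5, (-1, 1): 0.5}
print(alignment_and_correlation(case_a))
print(alignment_and_correlation(case_b))
```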
Comparing Eq. (1.6.34) with Eq. (1.6.47), we see that correlations measure the departure of the system from mean field theory. When there is an average magnetization, such as there is below T_c in a ferromagnet, the effect of the average magnetization is removed by our definition of the correlation. This can also be seen from rewriting the expression for correlations as:

$\langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle = \bigl\langle (s_i - \langle s_i \rangle)(s_j - \langle s_j \rangle) \bigr\rangle$  (1.6.49)

Correlations measure the behavior of the difference between the spin and its average value. In the rest of this section we discuss qualitatively the correlations that are found in a ferromagnet and the breakdown of the mean field approximation.
The energy of a ferromagnet is determined by the alignment of neighboring spins. Positive correlations between neighboring spins reduce its energy. Positive or negative correlations diminish the possible configurations of spins and therefore reduce the entropy. At very high temperatures, the competition between the energy and the entropy is dominated by the entropy, so there should be no correlations and each spin is independent. At low temperatures, well below the transition temperature, the average value of the spins is close to one. For example, for zJ = 2, which corresponds to T = T_c/2, the value of m_0(zJ) is 0.96 (see Fig. 1.6.4). So the correlations given by Eq. (1.6.47) play almost no role. Correlations are most significant near T_c, so it is near the transition that the mean field approximation is least valid.
For all T > T_c and for h = 0, the magnetization is zero. However, starting from high temperature, the correlation between neighboring spins increases as the temperature is lowered. Moreover, the correlation of one spin with its neighbors, and their correlation with their neighbors, induces a correlation of each spin with spins farther away. The distance over which spins are correlated increases as the temperature decreases. The correlation decays exponentially, so a correlation length ξ(T) may be defined as the decay constant of the correlation:

$\langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle \propto e^{-r_{ij}/\xi(T)}$  (1.6.50)

where r_ij is the Euclidean distance between s_i and s_j. At T_c the correlation length diverges. This is one way to think about how the phase transition occurs. The divergence of the correlation length implies that two spins anywhere in the system become correlated. As mentioned previously, in order for the instantaneous magnetization to be measured, there must also be a divergence of the relaxation time between opposite values of the magnetization. This will be discussed in Sections 1.6.6 and 1.6.7.
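A concrete illustration of Eq. (1.6.50) comes from the 1d Ising chain, for which the correlation ⟨s_i s_j⟩ = tanh(J/kT)^{|i−j|} is a standard exact result (not derived in the text). The sketch below (ours) computes the implied correlation length ξ = −1/ln tanh(J/kT), which grows without bound only as T → 0, consistent with the absence of a finite-temperature transition in 1d discussed later in this section.

```python
import math

def corr_length_1d(J, kT):
    """Correlation length of the 1d Ising chain, where the exact result
    <s_i s_j> = tanh(J/kT)**|i-j| has the form of Eq. (1.6.50) with
    xi = -1 / ln(tanh(J/kT))."""
    return -1.0 / math.log(math.tanh(J / kT))

# the correlation length grows monotonically, and without bound, as T -> 0
for kT in (2.0, 1.0, 0.5, 0.25):
    print(kT, corr_length_1d(1.0, kT))
```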
For temperatures just below T_c, the average magnetization is small. The correlation length of the spins is large. The average alignment (Eq. (1.6.45)) is essentially the same as the correlation (Eq. (1.6.47)). However, as T is further reduced below T_c, the average magnetization grows precipitously and the correlation measures the difference between the spin-spin alignment and the average spin value. Both the correlation and the correlation length decrease away from T_c. As the temperature goes to zero, the correlation length also goes to zero, even as the correlation itself vanishes.
At T = T_c there is a special circumstance where the correlation length is infinite. This does not mean that the correlation is unchanged as a function of the distance between spins, r_ij. Since the magnetization is zero, the correlation is the same as the spin alignment. If the alignment did not decay with distance, the magnetization would be unity, which is not correct. The infinite correlation length corresponds to power law rather than exponential decay of the correlations. A power law decay of the correlations is more gradual than exponential and implies that there is no characteristic size for the correlations: we can find correlated regions of spins that are of any size. Since the correlated regions fluctuate, we say that there are fluctuations on every length scale.
The existence of correlations on every length scale near the phase transition and the breakdown of the mean field approximation that neglects these correlations played an important role in the development of the theory of phase transitions. The discrepancy between mean field predictions and experiment was one of the great unsolved problems of statistical physics. The development of renormalization techniques that directly consider the behavior of the system on different length scales solved this problem. This will be discussed in greater detail in Section 1.10.
In Section 1.3 we discussed the nature of ensemble averages and indicated that one of the central issues was determining the size of an independent system. For the Ising model and other systems that are spatially uniform, it is the correlation length that determines the size of an independent system. If a physical system is much larger than a correlation length, then the system is self-averaging, in that experimental measurements average over many independent samples. We see that far from a phase transition, uniform systems are generally self-averaging; near a phase transition, the physical size of a system may enter in a more essential way.
The mean field approximation is sufficient to capture the collective behavior of the Ising model. However, even T_c is not given correctly by mean field theory, and indeed it is difficult to calculate. The actual transition temperature differs from the mean field value by a factor that depends on the dimensionality and structure of the lattice. In 1d, the failure of mean field theory is most severe, since there is actually no real transition. Magnetization does not occur, except in the limit of T → 0. The reason that there is no magnetization in 1d is that there is always a finite probability that at some point along the chain there will be a switch from having spins DOWN to having spins UP. This is true no matter how low the temperature is. The probability of such a boundary between UP and DOWN spins decreases exponentially with the temperature. It is given by 1/(1 + e^{2J/kT}) ≈ e^{−2J/kT} at low temperature. Even one such boundary destroys the average magnetization for an arbitrarily large system. While formally there is no phase transition in one dimension, under some circumstances the exponentially growing distance between boundaries may have consequences like a phase transition. The effect is, however, much more gradual than the actual phase transitions in 2d and 3d.
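The low-temperature approximation quoted above is easy to check numerically. This short sketch (ours) compares the exact boundary probability 1/(1 + e^{2J/kT}) with e^{−2J/kT}, and prints the implied typical spacing between boundaries:

```python
import math

def boundary_probability(J, kT):
    """Exact probability that a given bond is an UP/DOWN domain boundary."""
    return 1.0 / (1.0 + math.exp(2.0 * J / kT))

for kT in (1.0, 0.5, 0.2):
    exact = boundary_probability(1.0, kT)
    approx = math.exp(-2.0 / kT)
    # typical spacing between boundaries, ~1/exact, grows exponentially as T -> 0
    print(kT, exact, approx, 1.0 / exact)
```

The two expressions agree closely already at kT/J = 0.2, and the spacing between boundaries grows exponentially, which is the "gradual" 1d effect described above.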
The mean field approximation improves as the dimensionality increases. This is a consequence of the increase in the number of neighbors. As the number of neighbors increases, the averaging used for determining the mean field becomes more reliable as a measure of the environment of the spin. This is an important point that deserves some thought. As the number of different influences on a particular variable increases, they become better represented as an average influence. Thus in 3d, the mean field approximation is better than in 2d. Moreover, it turns out that rather than just gradually improving as the number of dimensions increases, for 4d the mean field approximation becomes essentially exact for many of the properties of importance in phase transitions. This happens because correlations become irrelevant on long length scales in more than 4d. The number of effective neighbors of a spin also increases if we increase the range of the interactions. Several different models with long-range interactions are discussed in the following section.
The Ising model has no built-in dynamics; however, we often discuss fluctuations in this model. The simplest fluctuation would be a single spin flipping in time. Unless the average value of a spin is +1 or −1, a spin must spend some time in each state. We can see that the presence of correlations implies that there must be fluctuations in time that affect more than one spin. This is easiest to see if we consider a system above the transition, where the average magnetization is zero. When one spin has the value +1, then the average magnetization of spins around it will be positive. On average, a region of spins will tend to flip together from one sign to the other. The amount of time that the region takes to flip depends on the length of the correlations. We have defined correlations in space between two spins. We could generalize the definition in Eq. (1.6.47) to allow the indices i and j to refer to different times as well as spatial positions. This would tell us about the fluctuations over time in the system. The analog of the correlation length Eq. (1.6.50) would be the relaxation time (Eq. (1.6.69) below).
The Ising model is useful for describing a large variety of systems; however, there are many other statistical models using more complex variables and interactions that have been used to represent various physical systems. In general, these models are treated first using the mean field approximation. For each model, there is a lower dimension (the lower critical dimension) below which the mean field results are completely invalid. There is also an upper critical dimension, where mean field is exact. These dimensions are not necessarily the same as for the Ising model.
1.6.6 Long-range interactions and the spin glass
Long-range interactions enable the Ising model to serve as a model of systems that are much more complex than might be expected from the magnetic analog that motivated its original introduction. If we just consider ferromagnetic interactions separately, the model with long-range interactions actually behaves more simply. If we just consider antiferromagnetic interactions, larger scale patterns of UP and DOWN spins arise. When we include both negative and positive interactions together, there will be additional features that enable a richer behavior. We will start by considering the case of ferromagnetic long-range interactions.

The primary effect of the increase in the range of ferromagnetic interactions is improvement of the mean field approximation. There are several ways to model interactions that extend beyond nearest neighbors in the Ising model. We could set a sphere of a particular radius r_0 around each spin and consider all of the spins within the sphere to be neighbors of the spin at the center.
E[\{s_i\}] = -h \sum_i s_i - \frac{1}{2} J \sum_{r_{ij} < r_0} s_i s_j    (1.6.51)
Here we do not restrict the summations over i and j in the second term, so we explicitly include a factor of 1/2 to avoid counting interactions twice. Alternatively, we could use an interaction J(r_ij) that decays either exponentially or as a power law with distance from each spin:
E[\{s_i\}] = -h \sum_i s_i - \frac{1}{2} \sum_{i,j} J(r_{ij}) s_i s_j    (1.6.52)
In both Eqs. (1.6.51) and (1.6.52) the self-interaction terms i = j are generally to be excluded. Since s_i^2 = 1 they only add a constant to the energy.
Quite generally and independent of the range or even the variability of interactions, when all interactions are ferromagnetic, J > 0, then all the spins will align at low temperatures. The mean field approximation may be used to estimate the behavior. All cases then reduce to the same free energy (Eq. (1.6.36) or Eq. (1.6.41)) with a measure of the strength of the interactions replacing zJ. The only difference from the nearest-neighbor model then relates to the accuracy of the mean field approximation. It is simplest to consider the model of a fixed interaction strength with a cutoff length. The mean field is accurate when the correlation length is shorter than the interaction distance. When this occurs, a spin is interacting with other spins that are uncorrelated with it. The averaging used to obtain the mean field is then correct. Thus the approximation improves if the interaction between spins becomes longer ranged. However, the correlation length becomes arbitrarily long near the phase transition. Thus, for longer interaction lengths, the mean field approximation holds closer to T_c but eventually becomes inaccurate in a narrow temperature range around T_c. There is one model for which the mean field approximation is exact independent of temperature or dimension. This is a model of infinite range interactions discussed in Question 1.6.11. The distance-dependent interaction model of Eq. (1.6.52) can be shown to behave like a finite-range interaction model for interactions that decay more rapidly than 1/r in 3d. For weaker decay than 1/r this model is essentially the same as the long-range interaction model of Question 1.6.11. Interactions that decay as 1/r are a borderline case.
Question 1.6.11 Solve the Ising model with infinite ranged interactions in a uniform magnetic field. The infinite range means that all spins interact with the same interaction strength. In order to keep the energy extensive (proportional to the volume) we must make the interactions between pairs of spins weaker as the system becomes larger, so replace J → J/N. The energy is given by:

E[\{s_i\}] = -h \sum_i s_i - \frac{1}{2N} J \sum_{i,j} s_i s_j    (1.6.53)

For simplicity, keep the i = j terms in the second sum even though they add only a constant.
Solution 1.6.11 We can solve this problem exactly by rewriting the energy in terms of a collective coordinate which is the average over the spin variables

m = \frac{1}{N} \sum_i s_i    (1.6.54)

in terms of which the energy becomes:

E(\{s_i\}) = -hNm - \frac{1}{2} J N m^2    (1.6.55)

This is the same as the mean field Eq. (1.6.39) with the substitution Jz → J. Here the equation is exact. The result for the entropy is the same as before, since we have fixed the average value of the spin by Eq. (1.6.54). The solution for the value of m for h = 0 is given by Eq. (1.6.32) and Fig. 1.6.4. For h ≠ 0 the discussion in Question 1.6.4 applies.
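Since Eq. (1.6.55) reproduces the mean field free energy with Jz → J, the magnetization satisfies the standard mean field self-consistency condition m = tanh((Jm + h)/kT), which is exact for this model. A sketch solving it by fixed-point iteration, with illustrative parameter values:

```python
import math

def solve_m(J, h, kT, m0=0.9, iters=1000):
    """Iterate m = tanh((J*m + h)/kT) to a fixed point.
    Starting from m0 > 0 selects the positive branch when h = 0."""
    m = m0
    for _ in range(iters):
        m = math.tanh((J * m + h) / kT)
    return m

# Below the transition (kT < J) a spontaneous magnetization appears
# even at h = 0; above it (kT > J) the only solution is m = 0.
print(solve_m(J=1.0, h=0.0, kT=0.5))   # spontaneous m, roughly 0.96
print(solve_m(J=1.0, h=0.0, kT=2.0))   # decays to 0
```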
The case of antiferromagnetic interactions will be considered in greater detail in Chapter 7. If all interactions are antiferromagnetic, J < 0, then extending the range of the interactions tends to reduce their effect, because it is impossible for neighboring spins to be antialigned and lower the energy. To be antialigned with a neighbor is to be aligned with a second neighbor. However, by forming patches of UP and DOWN spins it is possible to lower the energy. In an infinite-ranged antiferromagnetic system, all possible states with zero magnetization have the same lowest energy at h = 0.
This can be seen from the energy expression in Eq. (1.6.55). In this sense, frustration from many sources is almost the same as no interaction.

In addition to the ferromagnet and antiferromagnet, there is a third possibility where there are both positive and negative interactions. The physical systems that have motivated the study of such models are known as spin glasses. These are materials where magnetic atoms are found or placed in a nonmagnetic host. The randomly placed magnetic sites interact via long-range interactions that oscillate in sign with distance. Because of the randomness in the location of the spins, there is a randomness in the interactions between them. Experimentally, it is found that such systems also undergo a transition that has been compared to a glass transition, and therefore these systems have become known as spin glasses.
A model for these materials, known as the Sherrington-Kirkpatrick spin glass, makes use of the Ising model with infinite-range random interactions:

E[\{s_i\}] = -\frac{1}{2N} \sum_{ij} J_{ij} s_i s_j    (1.6.56)

J_{ij} = \pm J

The interactions J_{ij} are fixed uncorrelated random variables—quenched variables. The properties of this system are to be averaged over the random variables J_{ij}, but only after it is solved.
Similar to the ferromagnetic or antiferromagnetic Ising model, at high temperatures kT >> J the spin glass model has a disordered phase where spins do not feel the effect of the interactions beyond the existence of correlations. As the temperature is lowered, the system undergoes a transition that is easiest to describe as a breaking of ergodicity. Because of the random interactions, some arrangements of spins are much lower in energy than others. As with the case of the antiferromagnet on a triangular lattice, there are many of these low-energy states. The difference between any two of these states is large, so that changing from one state to the other would involve the flipping of a finite fraction of the spins of the system. Such a flipping would have to be cooperative, so that overcoming the barrier between low-energy states becomes impossible below the transition temperature during any reasonable time. The low-energy states have been shown to be organized into a hierarchy determined by the size of the overlaps between them.
Question 1.6.12 Solve a model that includes a special set of correlated random interactions of the type of the Sherrington-Kirkpatrick model, where the interactions can be written in the separable form

J_{ij} = \xi_i \xi_j, \quad \xi_i = \pm 1    (1.6.57)

This is the Mattis model. For simplicity, keep the terms where i = j.
Solution 1.6.12 We can solve this problem by defining a new set of variables

s_i' = \xi_i s_i    (1.6.58)
In terms of these variables the energy becomes:

E[\{s_i\}] = -\frac{1}{2N} \sum_{ij} \xi_i \xi_j s_i s_j = -\frac{1}{2N} \sum_{ij} s_i' s_j'    (1.6.59)

which is the same as the ferromagnetic Ising model. The phase transition of this model would lead to a spontaneous magnetization of the new variables. This corresponds to a net orientation of the spins toward (or opposite) the state s_i = \xi_i. This can be seen from

m = \langle s_i' \rangle = \xi_i \langle s_i \rangle    (1.6.60)

This model shows that a set of mixed interactions can cause the system to choose a particular low-energy state that behaves like the ordered state found in the ferromagnet. By extension, this makes it plausible that fully random interactions lead to a variety of low-energy states.
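The gauge transformation of Eq. (1.6.58) can be checked numerically: the Mattis energy of any configuration {s_i} equals the infinite-range ferromagnetic energy of the transformed configuration {ξ_i s_i}. A sketch with a small random instance (the size N and the seed are arbitrary choices):

```python
import random

def mattis_energy(s, xi):
    """E = -(1/2N) sum_ij J_ij s_i s_j with J_ij = xi_i*xi_j,
    self-interaction terms i == j retained as in Eq. (1.6.57)."""
    N = len(s)
    return -sum(xi[i] * xi[j] * s[i] * s[j]
                for i in range(N) for j in range(N)) / (2 * N)

def ferro_energy(sp):
    """E = -(1/2N) sum_ij s'_i s'_j: the infinite-range ferromagnet."""
    N = len(sp)
    return -sum(sp[i] * sp[j] for i in range(N) for j in range(N)) / (2 * N)

random.seed(0)
N = 20
xi = [random.choice([-1, 1]) for _ in range(N)]
s = [random.choice([-1, 1]) for _ in range(N)]
sp = [xi[i] * s[i] for i in range(N)]          # Eq. (1.6.58)
print(mattis_energy(s, xi), ferro_energy(sp))  # identical values

# The lowest-energy states are s_i = xi_i and s_i = -xi_i, which map
# onto the two aligned ferromagnetic states; their energy is -N/2.
print(mattis_energy(xi, xi))  # -10.0 for N = 20
```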
The existence of a large number of randomly located energy minima in the spin glass might suggest that by engineering such a system we could control where the minima occur. Then we might use the spin glass as a memory. The Mattis model provides a clue to how this might be accomplished. The use of an outer product representation for the matrix of interactions turns out to be closely related to the model developed by Hebb for biological imprinting of memories on the brain. The engineering of minima in a long-range-interaction Ising model is precisely the model developed by Hopfield for the behavior of neural networks that we will discuss in Chapter 2.

In the ferromagnet and antiferromagnet, there were intuitive ways to deal with the breaking of ergodicity, because we could easily define a macroscopic parameter (the magnetization) that differentiated between different macroscopic states of the system. More general ways to do this have been developed for the spin glass and applied to the study of neural networks.
1.6.7 Kinetics of the Ising model
We have introduced the Ising model without the benefit of a dynamics. There are many choices of dynamics that would lead to the equilibrium ensemble given by the Ising model. One of the most natural would arise from considering each spin to have the two-state system dynamics of Section 1.4. In this dynamics, transitions between UP and DOWN occur across an intermediate barrier that sets the transition rate. We call this the activated dynamics and will use it to discuss protein folding in Chapter 4 because it can be motivated microscopically. The activated dynamics describes a continuous rate of transition for each of the spins. It is often convenient to consider transitions as occurring at discrete times. A particularly simple dynamics of this kind was introduced by Glauber for the Ising model. It also corresponds to the dynamics popular in studies of neural networks that we will discuss in Chapter 2. In this section we will show that the two different dynamics are quite closely related. In Section 1.7 we will consider several other forms of dynamics when we discuss Monte Carlo simulations.
If there are many different possible ways to assign a dynamics to the Ising model, how do we know which one is correct? As for the model itself, it is necessary to consider the system that is being modeled in order to determine which kinetics is appropriate. However, we expect that there are many different choices for the kinetics that will provide essentially the same results as long as we consider its long-time behavior. The central limit theorem in Section 1.2 shows that in a stochastic process, many independent steps lead to the same Gaussian distribution of probabilities, independent of the specific steps that are taken. Similarly, if we choose a dynamics for the Ising model that allows individual spin flips, the behavior of processes that involve many spin flips should not depend on the specific dynamics chosen. Having said this, we emphasize that the conditions under which different dynamic rules provide the same long-time behavior are not fully established. This problem is essentially the same as the problem of classifying dynamic systems in general. We will discuss it in more detail in Section 1.7.
Both the activated dynamics and the Glauber dynamics assume that each spin relaxes from its present state toward its equilibrium distribution. Relaxation of each spin is independent of other spins. The equilibrium distribution is determined by the relative energy of its UP and DOWN state at a particular time. The energy difference between having the ith spin s_i UP and DOWN is:

E_{+i}(\{s_j\}_{j \ne i}) = E(s_i = +1, \{s_j\}_{j \ne i}) - E(s_i = -1, \{s_j\}_{j \ne i})    (1.6.61)

The probability of the spin being UP or DOWN is given by Eq. (1.4.14) as:

P_{s_i}(1) = \frac{1}{1 + e^{E_{+i}/kT}} = f(E_{+i})    (1.6.62)

P_{s_i}(-1) = 1 - f(E_{+i}) = f(-E_{+i})    (1.6.63)
In the activated dynamics, all spins perform transitions at all times with rates R(1|−1) and R(−1|1) given by Eqs. (1.4.38) and (1.4.39) with a site-dependent energy barrier E_{Bi} that sets the relaxation time for the dynamics τ_i. As with the two-state system, it is assumed that each transition occurs essentially instantaneously. The choice of the barrier E_{Bi} is quite important for the kinetics, particularly since it may also depend on the state of other spins with which the ith spin interacts. As soon as one of the spins makes a transition, all of the spins with which it interacts must change their rate of relaxation accordingly. Instead of considering directly the rate of transition, we can consider the evolution of the probability using the Master equation, Eq. (1.4.40) or (1.4.43). This would be convenient for Master equation treatments of the whole system. However, the necessity of keeping track of all of the probabilities makes this impractical for all but simple considerations.
Glauber dynamics is simpler in that it considers only one spin at a time. The system is updated in equal time intervals. Each time interval is divided into N small time increments. During each time increment, we select a particular spin and only consider its dynamics. The selected spin then relaxes completely in the sense that its state is set to be UP or DOWN according to its equilibrium probability, Eq. (1.6.62). The transitions of different spins occur sequentially and are not otherwise coupled. The way we select which spin to update is an essential part of the Glauber dynamics. The simplest and most commonly used approach is to select a spin at random in each time increment. This means that we do not guarantee that every spin is selected during a time interval consisting of N spin updates. Likewise, some spins will be updated more than once in a time interval. On average, however, every spin is updated once per time interval.
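The update rule just described can be sketched in code. This minimal implementation assumes, for illustration only, a 1d nearest-neighbor chain with periodic boundaries; the lattice, parameters, and random seed are choices made here, not part of the text:

```python
import math
import random

def glauber_step(s, J, h, kT, rng):
    """One Glauber time interval: N random single-spin updates.
    Each selected spin is set UP with its equilibrium probability
    P(1) = 1/(1 + exp(E_{+i}/kT)), Eq. (1.6.62), where E_{+i} is the
    energy difference between its UP and DOWN states."""
    N = len(s)
    for _ in range(N):                    # N increments per interval;
        i = rng.randrange(N)              # random selection: some spins
        nbr = s[i - 1] + s[(i + 1) % N]   # repeat, some are skipped
        E_plus = 2.0 * (-J * nbr - h)     # E(s_i = +1) - E(s_i = -1)
        p_up = 1.0 / (1.0 + math.exp(E_plus / kT))
        s[i] = 1 if rng.random() < p_up else -1

rng = random.Random(0)
s = [rng.choice([-1, 1]) for _ in range(100)]
for _ in range(200):                      # 200 intervals of physical time
    glauber_step(s, J=1.0, h=0.5, kT=1.0, rng=rng)
print(sum(s) / len(s))                    # field h > 0 biases m toward +1
```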
In order to show that the Glauber dynamics are intimately related to the activated dynamics, we begin by considering how we would implement the activated dynamics on an ensemble of independent two-state systems whose dynamics are completely determined by the relaxation time τ = (R(1|−1) + R(−1|1))^{−1} (Eq. (1.4.44)). We can think about this ensemble as representing the dynamics of a single two-state system, or, in a sense that will become clear, as representing a noninteracting Ising model. The total number of spins in our ensemble is N. At time t the ensemble is described by the number of UP spins given by NP(1;t) and the number of DOWN spins NP(−1;t).

We describe the activated dynamics of the ensemble using a small time interval ∆t, which eventually we would like to make as small as possible. During the interval of time ∆t, which is much smaller than the relaxation time τ, a certain number of spins make transitions. The probability that a particular spin will make a transition from UP to DOWN is given by R(−1|1)∆t. The total number of spins making a transition from DOWN to UP, and from UP to DOWN, is:

NP(-1;t) R(1|-1) \Delta t
NP(1;t) R(-1|1) \Delta t    (1.6.64)

respectively. To implement the dynamics, we must randomly pick out of the whole ensemble this number of UP spins and DOWN spins and flip them. The result would be a new number of UP and DOWN spins NP(1;t + ∆t) and NP(−1;t + ∆t). The process would then be repeated.
It might seem that there is no reason to randomly pick the ensemble elements to flip, because the result is the same if we rearrange the spins arbitrarily. However, if each spin represents an identifiable physical system (e.g., one spin out of a noninteracting Ising model) that is performing an internal dynamics we are representing, then we must randomly pick the spins to flip.
It is somewhat inconvenient to have to worry about selecting a particular number of UP and DOWN spins separately. We can modify our prescription so that we select a subset of the spins regardless of orientation. To achieve this, we must allow that some of the selected spins will be flipped and some will not. We select a fraction λ of the spins of the ensemble. The number of these that are DOWN is λNP(−1;t). In order to flip the same number of spins from DOWN to UP as in Eq. (1.6.64), we must flip UP a fraction R(1|−1)∆t/λ of the λNP(−1;t) spins. Consequently, the fraction of spins we do not flip is (1 − R(1|−1)∆t/λ). Similarly, the number of selected UP spins is λNP(1;t), the fraction of these to be flipped is R(−1|1)∆t/λ, and the fraction we do not flip is (1 − R(−1|1)∆t/λ). In order for these expressions to make sense (to be positive) λ must be large enough so that at least one spin will be flipped. This implies λ > max(R(1|−1)∆t, R(−1|1)∆t). Moreover, we do not want λ to be larger than it must be because this will just force us to select additional spins we will not be flipping. A convenient choice would be to take

\lambda = (R(1|-1) + R(-1|1)) \Delta t = \Delta t / \tau    (1.6.65)
The consequences of this choice are quite interesting, since we find that the fraction of selected DOWN spins to be flipped UP is R(1|−1)/(R(1|−1) + R(−1|1)) = P(1), the equilibrium fraction of UP spins. The fraction not to be flipped is the equilibrium fraction of DOWN spins. Similarly, the fraction of selected UP spins that are to be flipped DOWN is the equilibrium fraction of DOWN spins, and the fraction to be left UP is the equilibrium fraction of UP spins. Consequently, the outcome of the dynamics of the selected spin does not depend at all on the initial state of the spin. The revised prescription for the dynamics is to select a fraction λ of spins from the ensemble and set them according to their equilibrium probability.
We still must choose the time interval ∆t. The smallest time interval that makes sense is the interval for which the number of selected spins would be just one. A smaller number would mean that sometimes we would not choose any spins. Setting the number of selected spins λN = 1 using Eq. (1.6.65) gives:

\Delta t = \frac{1}{N(R(1|-1) + R(-1|1))} = \frac{\tau}{N}    (1.6.66)

which also implies the condition ∆t << τ, and means that the approximation of a finite time increment ∆t is directly coupled to the size of the ensemble. Our new prescription is that we select a single spin and set it UP or DOWN according to its equilibrium probability. This would be the prescription of Glauber dynamics if the ensemble were considered to be the Ising model without interactions. Thus for a noninteracting Ising model, the Glauber dynamics and the activated dynamics are the same. So far we have made no approximation except the finite size of the ensemble. We still have one more step to go to apply this to the interacting Ising model.
The activated dynamics is a stochastic dynamics, so it does not make sense to discuss only the dynamics of a particular system but the dynamics of an ensemble of Ising models. At any moment, the activated dynamics treats the Ising model as a collection of several kinds of spins. Each kind of spin is identified by a particular value of E_+ and E_B. These parameters are controlled by the local environment of the spin. The dynamics is not concerned with the source of these quantities, only their values. The dynamics are that of an ensemble consisting of several kinds of spins with a different number N_k of each kind of spin, where k indexes the kind of spin. According to the result of the previous paragraph, and specifically Eq. (1.6.65), we can perform this dynamics over a time interval ∆t by selecting N_k ∆t/τ_k spins of each kind and updating them according to the Glauber method. This is strictly applicable only for an ensemble of Ising systems. If the Ising system that we are considering contains many correlation lengths, Eq. (1.6.50), then it represents the ensemble by itself. Thus for a large enough Ising model, we can apply this to a single system.
If we want to select spins arbitrarily, rather than of a particular kind, we must make the assumption that all of the relaxation times are the same, τ_k → τ. This assumption means that we would select a total number of spins:

\sum_k N_k \frac{\Delta t}{\tau_k} \to N \frac{\Delta t}{\tau}    (1.6.67)

As before, ∆t may also be chosen so that in each time interval only one spin is selected.
Using two assumptions, we have been able to derive the Glauber dynamics directly from the activated dynamics. One of the assumptions is that the dynamics must be considered to apply only as the dynamics of an ensemble. Even though both dynamics are stochastic dynamics, applying the Glauber dynamics directly to a single system is only the same as the activated dynamics for a large enough system. The second assumption is the equivalence of the relaxation times τ_k. When is this assumption valid? The expression for the relaxation time in terms of the two-state system is given by Eq. (1.4.44) as

1/\tau = (R(1|-1) + R(-1|1)) = \nu \left( e^{-(E_B - E_1)/kT} + e^{-(E_B - E_{-1})/kT} \right)    (1.6.68)
When the relative energy of the two states E_1 and E_{−1} varies between different spins, this τ will in general vary. The size of the relaxation time is largely controlled by the smaller of the two energy differences E_B − E_1 and E_B − E_{−1}. Thus, maintaining the same relaxation time would require that the smaller energy difference is nearly constant. This is essential, because the relaxation time changes exponentially with the energy difference.
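This exponential sensitivity is easy to see numerically from Eq. (1.6.68). The sketch below sets the microscopic rate prefactor to 1 and measures all energies in units of kT, both purely illustrative choices:

```python
import math

def relaxation_time(E_B, E_1, E_m1, kT=1.0):
    """1/tau = exp(-(E_B - E_1)/kT) + exp(-(E_B - E_m1)/kT),
    with the rate prefactor set to 1 for illustration."""
    return 1.0 / (math.exp(-(E_B - E_1) / kT) +
                  math.exp(-(E_B - E_m1) / kT))

# tau is dominated by the smaller barrier difference (here E_B - E_1,
# since E_1 is the higher-energy state):
print(relaxation_time(5.0, 2.0, 0.0))  # gaps 3 and 5
print(relaxation_time(5.0, 2.0, 1.0))  # shifting E_-1 changes tau modestly
print(relaxation_time(4.0, 2.0, 0.0))  # lowering E_B by 1 shrinks tau by e
```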
We have shown that the Glauber dynamics and the activated dynamics are closely related despite appearing to be quite different. We have also found how to generalize the Glauber dynamics if we must allow different relaxation times for different spins. Finally, we have found that the time increment for a single spin update corresponds to τ/N. This means that a single Glauber time step consisting of N spin updates corresponds to a physical time τ, the microscopic relaxation time of the individual spins.
At this point we have introduced a dynamics for the Ising model, and it should be possible for us to investigate questions about its kinetics. Often questions about the kinetics may be described in terms of time correlations. Like the correlation length, we can introduce a correlation time τ_s that is given by the decay of the spin-spin correlation

\langle s_i(t') s_i(t) \rangle - \langle s_i \rangle^2 \propto e^{-|t - t'|/\tau_s}    (1.6.69)

For the case of a relaxing two-state system, the correlation time is the relaxation time τ. This follows from Eq. (1.4.45), with some attention to notation as described in Question 1.6.13.
Question 1.6.13 Show that for a two-state system, the correlation time τ_s is the relaxation time τ.
Solution 1.6.13 The difficulty in this question is restoring some of the notational details that we have been leaving out for convenience. From Eq. (1.6.45) we have for the average:

\langle s_i(t') s_i(t) \rangle = P_{s_i(t'),s_i(t)}(1,1) + P_{s_i(t'),s_i(t)}(-1,-1) - P_{s_i(t'),s_i(t)}(1,-1) - P_{s_i(t'),s_i(t)}(-1,1)    (1.6.70)

Let's assume that t' > t; then each of these joint probabilities of the form P_{s_i(t'),s_i(t)}(s_2,s_1) is given by the probability that the two-state system starts in the state s_1 at time t, multiplied by the probability that it will evolve from s_1 into s_2 at time t':

P_{s_i(t'),s_i(t)}(s_2,s_1) = P_{s_i(t'),s_i(t)}(s_2|s_1) P_{s_i(t)}(s_1)    (1.6.71)

The first factor on the right is called the conditional probability. The probability for a particular state of the spin is the equilibrium probability that we wrote as P(1) and P(−1). The conditional probabilities satisfy P_{s_i(t'),s_i(t)}(1|s_1) + P_{s_i(t'),s_i(t)}(−1|s_1) = 1, so we can simplify Eq. (1.6.70) to:

\langle s_i(t') s_i(t) \rangle = (2 P_{s_i(t'),s_i(t)}(1|1) - 1) P(1) + (2 P_{s_i(t'),s_i(t)}(-1|-1) - 1) P(-1)    (1.6.72)

The evolution of the probabilities is described by Eq. (1.4.45), repeated here:

P(1;t) = (P(1;0) - P(1;\infty)) e^{-t/\tau} + P(1;\infty)    (1.6.73)

Since the conditional probability assumes a definite value for the initial state (e.g., P(1;0) = 1 for P_{s(t'),s(t)}(1|1)), we have:

P_{s(t'),s(t)}(1|1) = (1 - P(1)) e^{-(t'-t)/\tau} + P(1)
P_{s(t'),s(t)}(-1|-1) = (1 - P(-1)) e^{-(t'-t)/\tau} + P(-1)    (1.6.74)

Inserting these into Eq. (1.6.72) gives:

\langle s_i(t') s_i(t) \rangle = (2[(1 - P(1)) e^{-(t'-t)/\tau} + P(1)] - 1) P(1) + (2[(1 - P(-1)) e^{-(t'-t)/\tau} + P(-1)] - 1) P(-1)
= 4 P(1) P(-1) e^{-(t'-t)/\tau} + (P(1) - P(-1))^2    (1.6.75)

The constant term on the right is the same as the square of the average of the spin:

\langle s_i(t) \rangle^2 = (P(1) - P(-1))^2    (1.6.76)

Inserting into Eq. (1.6.69) leads to the desired result (we have assumed that t' > t):

\langle s_i(t') s_i(t) \rangle - \langle s_i(t) \rangle^2 = 4 P(1) P(-1) e^{-(t'-t)/\tau} \propto e^{-(t'-t)/\tau}    (1.6.77)
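The identity derived in Eqs. (1.6.72)-(1.6.77) can be verified numerically by computing the correlation from the conditional probabilities and comparing it with the closed form; the rates used below are illustrative:

```python
import math

def correlation(R_up, R_down, t):
    """Return (<s(t')s(t)> - <s>^2, 4*P(1)*P(-1)*exp(-t/tau)),
    the first computed from Eqs. (1.6.72), (1.6.74), (1.6.76), the
    second the closed form of Eq. (1.6.77), with t = t' - t.
    R_up = R(1|-1), R_down = R(-1|1)."""
    tau = 1.0 / (R_up + R_down)
    P1 = R_up * tau             # equilibrium probabilities
    Pm1 = R_down * tau
    # Eq. (1.6.74): conditional probabilities of remaining in a state
    P11 = (1.0 - P1) * math.exp(-t / tau) + P1
    Pm1m1 = (1.0 - Pm1) * math.exp(-t / tau) + Pm1
    # Eq. (1.6.72) minus the squared mean, Eq. (1.6.76)
    corr = (2*P11 - 1)*P1 + (2*Pm1m1 - 1)*Pm1 - (P1 - Pm1)**2
    return corr, 4.0 * P1 * Pm1 * math.exp(-t / tau)

for t in (0.0, 0.5, 2.0):
    lhs, rhs = correlation(R_up=2.0, R_down=1.0, t=t)
    print(t, lhs, rhs)   # the two expressions agree at every t
```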
From the beginning of our discussion of the Ising model, a central issue has been the breaking of the ergodic theorem associated with the spontaneous magnetization. Now that we have introduced a kinetic model, we will tackle this problem directly. First we describe the problem fully. The ergodic theorem states that a time average may be replaced by an ensemble average. In the ensemble, all possible states of the system are included with their Boltzmann probability. Without formal justification, we have treated the spontaneous magnetization of the Ising model at h = 0 as a macroscopically observable quantity. According to our prescription, this is not the case. Let us perform the average <s_i> over the ensemble at T = 0 and h = 0. There are two possible states of the system with the same energy, one with {s_i = 1} and one with {s_i = −1}. Since they must occur with equal probability by our assumption, we have that the average <s_i> is zero.

This argument breaks down because of the kinetics of the system that prevents a transition from one state to the other during the course of a measurement. Thus we measure only one of the two possible states and find a magnetization of 1 or −1. How can we prove that this system breaks the ergodic theorem? The most direct test is to start from a system with a slightly positive magnetic field near T = 0 where the magnetization is +1, and reverse the sign of the magnetic field. In this case the equilibrium state of the system should have a magnetization of −1. Instead the system will maintain its magnetization as +1 for a long time before eventually switching from one to the other. The process of switching corresponds to the kinetics of a first-order transition.
1.6.8 Kinetics of a first-order phase transition
In this section we discuss the first-order transition kinetics in the Ising model. Similar arguments apply to other first-order transitions like the freezing or boiling of water. If we start with an Ising model in equilibrium at a temperature T < T_c and a small positive magnetic field h << zJ, the magnetization of the system is essentially m_0(zJ). If we change the magnetic field suddenly to a small negative value, the equilibrium state of the system is −m_0(zJ); however, the system will require some time to change its magnetization. The change in the magnetic field has very little effect on the energy of an individual spin s_i. This energy is mostly due to the interaction with its neighbors, with a relatively small contribution due to the external field. Most of the time the neighbors are oriented UP, and this makes the spin have a lower energy when it is UP. This gives rise to the magnetization m_0(zJ). Until s_i's neighbors change their average magnetization, s_i has no reason to change its magnetization. But then neither do the neighbors. Thus, because each spin is in its own local equilibrium, the process that eventually equilibrates the system requires a cooperative effect involving more than one spin. The process by which such a first-order transition occurs is not the simultaneous switching of all of the spins from one value to the other. This would require an impossibly long time. Instead the transition occurs by nucleation and growth of the equilibrium phase.

It is easiest to describe the nucleation process when T is sufficiently less than T_c, so that the spins are almost always +1. In mean field, already for T < 0.737T_c the
probability of a spin being UP is greater than 90% (P(1) = (1 + m)/2 > 0.9), and for T < 0.61T_c the probability of a spin being UP is greater than 95%. As long as T is greater than zero, individual spins will flip from time to time. However, even though the magnetic field would like them to be DOWN, their local environment consisting of UP spins does not. Since the interaction with their neighbors is stronger than the interaction with the external field, the spin will generally flip back UP after a short time. There is a smaller probability that a second spin, a neighbor of the first spin, will also flip DOWN. Because one of the neighbors of the second spin is already DOWN, there is a lower energy cost than for the first one. However, the energy of the second spin is still higher when it is DOWN, and the spins will generally flip back, first one then the other. There is an even smaller probability that three interacting spins will flip DOWN. The existence of two DOWN spins makes it more likely for the third to do so. If the first two spins were neighbors, then the third spin can have only one of them as its neighbor. So it still costs some energy to flip DOWN the third spin. If there are three spins flipped DOWN in an L shape, the spin that completes a 2 × 2 square has two neighbors that are +1 and two neighbors that are –1, so the interactions with its neighbors cancel. The external field then gives a preference for it to be DOWN. There is still a high probability that several of the spins that are DOWN will flip UP and the little cluster will then disappear. Fig. 1.6.9 shows various clusters and their energies compared to a uniform region of +1 spins. As more spins are added, the internal region of the cluster becomes composed of spins that have four neighbors that are all DOWN. Beyond a certain size (see Question 1.6.14) the cluster of DOWN spins will grow, because adding spins lowers the energy of the system. At some point the growing region of DOWN spins encounters another region of DOWN spins and the whole system reaches its new equilibrium state, where most spins are DOWN.
Question 1.6.14 Using an estimate of how the energy of large clusters of DOWN spins grows, show that large enough clusters must have a lower energy than the same region if it were composed of UP spins.
Solution 1.6.14 The energy of a cluster of DOWN spins is given by its interaction with the external magnetic field and the number of antialigned bonds that form its boundary. The change in energy due to the external magnetic field is exactly 2hN_c, which is proportional to the number of spins in the
Figure 1.6.9 Illustration of small clusters of DOWN spins shown as filled dark squares residing in a background of UP spins on a square lattice. The energies for creating the clusters are shown. The magnetic field, h, is negative. The formation of such clusters is the first step towards nucleation of a DOWN region when the system undergoes a first-order transition from UP to DOWN. The energy is counted by the number of spins that are DOWN times the magnetic field strength, plus the interaction strength times the number of antialigned neighboring spins, which is the length of the boundary of the cluster. In a first-order transition, as the size of the clusters grows the gain from orienting toward the magnetic field eventually becomes greater than the loss from the boundary energy. Then the cluster becomes more likely to grow than shrink. See Question 1.6.14 and Fig. 1.6.10.
[Figure 1.6.9 cluster creation energies: 2h+8J; 4h+12J; 6h+16J; 8h+16J; 8h+20J; 10h+20J; 10h+24J; 12h+22J; 12h+24J; 12h+26J]
cluster N_c. This is negative since h is negative. The energy of the boundary is proportional to the number of antialigned bonds, and it is always positive. Because every additional antialigned bond raises the cluster energy, the boundary of the cluster tends to be smooth at low temperatures. Therefore, we can estimate the boundary energy using a simple shape like a square or circular cluster in 2d (a cube or ball in 3d). Either way the energy will increase as f J N_c^((d−1)/d), where d is the dimensionality and f is a constant accounting for the shape. Since the negative contribution to the energy increases in proportion to the area (volume) of the cluster, and the positive contribution to the energy increases in proportion to the perimeter (surface area) of the cluster, the negative term eventually wins. Once a cluster is large enough so that its energy is dominated by the interaction with the magnetic field, then, on average, adding an additional spin to the cluster will lower the system energy.
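The estimate in this solution can be checked with a short numerical sketch. The Python fragment below evaluates the cluster energy E(N_c) = 2hN_c + f J N_c^((d−1)/d) from the solution above and locates the maximum of the barrier; the particular values of h, J and the shape factor f are illustrative assumptions, not values fixed by the text.

```python
# Hypothetical parameters for illustration: h, J and f are assumptions.
def cluster_energy(n_c, h=-0.25, J=1.0, f=4.0, d=2):
    """Energy of a compact cluster of n_c DOWN spins relative to the
    uniform UP phase: field term 2*h*n_c plus boundary term f*J*n_c^((d-1)/d)."""
    return 2 * h * n_c + f * J * n_c ** ((d - 1) / d)

# Setting dE/dN_c = 0 gives the critical cluster size
# N_c* = (f*J*(d-1) / (2*|h|*d))**d.
h, J, f, d = -0.25, 1.0, 4.0, 2
n_crit = (f * J * (d - 1) / (2 * abs(h) * d)) ** d   # = 16 for these numbers

energies = [cluster_energy(n) for n in range(1, 200)]
# The energy rises to a maximum near n_crit and then decreases without
# bound, so large enough clusters are always lower in energy than UP.
```

With these illustrative numbers the barrier peaks at N_c = 16, after which each added spin lowers the energy, which is the content of the question.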
Question 1.6.15 Without looking at Fig. 1.6.9, construct all of the different possible clusters of as many as five DOWN spins. Label them with their energy.

Solution 1.6.15 See Fig. 1.6.9.
The scenario just described, known as nucleation and growth, is generally responsible for the kinetics of first-order transitions. We can illustrate the process schematically (Fig. 1.6.10) using a one-dimensional plot indicating the energy per spin of a cluster as a function of the number of atoms in the cluster. The energy of the cluster increases at first when there are very few spins in the cluster, and then decreases once it is large enough. Eventually the energy decreases linearly with the number of spins in the cluster. The decrease per spin is the energy difference per spin between the two phases. The first cluster size that is “over the hump” is known as the critical cluster. The process of reaching this cluster is known as nucleation. A first estimate of the time to nucleate a critical cluster at a particular place in space is given by the inverse of the Boltzmann factor of the highest energy barrier in Fig. 1.6.10. This corresponds to the rate of transition over the barrier given by a two-state system with this same barrier (see Eq. (1.4.38) and Eq. (1.4.44)). The size of the critical cluster depends on the magnitude of the magnetic field. A larger magnetic field implies a smaller critical cluster. Once the critical cluster is reached, the kinetics corresponds to the biased diffusion described at the end of Section 1.4. The primary difficulty with an illustration such as Fig. 1.6.10 is that it is one-dimensional. We would need to show the energy of each type of cluster and all of the ways one cluster can transform into another. Moreover, the clusters themselves may move in space and merge or separate. In Fig. 1.6.11 we show frames from a simulation of nucleation in the Ising model using Glauber dynamics. The frames illustrate the process of nucleation and growth.
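A simulation of this kind can be sketched in a few lines. The Python fragment below implements single-spin Glauber dynamics for a 2d Ising model with the parameters quoted in the caption of Fig. 1.6.11 (T = zJ/3 with z = 4 and k = 1, h = −0.25); the smaller lattice and run length are illustrative choices so that it runs quickly, not values from the text.

```python
import math
import random

def glauber_step(spins, L, J=1.0, h=-0.25, T=4.0 / 3.0):
    """One Glauber update: pick a random site and set it UP with the
    equilibrium probability determined by its current neighborhood."""
    i, j = random.randrange(L), random.randrange(L)
    nbr = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
           + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
    e_eff = J * nbr + h                       # effective field on the spin
    p_up = 1.0 / (1.0 + math.exp(-2.0 * e_eff / T))
    spins[i][j] = 1 if random.random() < p_up else -1

random.seed(0)
L = 20                                        # smaller than the 60 x 60 space
spins = [[1] * L for _ in range(L)]           # start in the metastable UP phase
for _ in range(400 * L * L):                  # 400 sweeps of N = L*L updates
    glauber_step(spins, L)
m = sum(map(sum, spins)) / (L * L)
# The UP phase persists until a critical nucleus happens to form; on long
# enough runs the magnetization m ends near -1, as in Fig. 1.6.11.
```

Because nucleation is a rare event, the waiting time fluctuates strongly from run to run and grows as the lattice shrinks.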
Experimental studies of nucleation kinetics are sometimes quite difficult. In physical systems, impurities often lower the barrier to nucleation and therefore control the rate at which the first-order transition occurs. This can be a problem for the investigation of the inherent nucleation because of the need to study highly purified
systems. However, this sensitivity should be understood as an opportunity for control over the kinetics. It is similar to the sensitivity of electrical properties to dopant impurities in a semiconductor, which enables the construction of semiconductor devices. There is at least one direct example of the control of the kinetics of a first-order transition. Before describing the example, we review a few properties of the water-to-ice transition. The temperature of the water-to-ice transition can be lowered significantly by the addition of impurities. The freezing temperature of salty ocean water is lower than that of pure water. This suppression is thermodynamic in origin, which means that the T_c is actually lower. There exist fish that live in subzero-degree ocean water whose blood has less salt than the surrounding ocean. These fish use a family of so-called antifreeze proteins that are believed to kinetically suppress the freezing of their blood. Instead of lowering the freezing temperature, these proteins suppress ice nucleation.
The existence of a long nucleation time implies that it is often possible to create metastable materials. For example, supercooled water is water whose temperature has been lowered below its freezing point. For many years, particle physicists used a superheated fluid to detect elementary particles. Ultrapure liquids in large tanks were
Figure 1.6.10 Schematic illustration of the energies that control the kinetics of a first-order phase transition. The horizontal axis is the size of a cluster of DOWN spins N_c that are the equilibrium phase. The cluster is in a background of UP spins that are the metastable phase. The vertical axis is the energy E of the cluster. Initially the energy increases with cluster size until the cluster reaches the critical cluster size. Then the energy decreases. Each spin flip has its own barrier to overcome, leading to a washboard potential. The highest barrier E_Bmax that the system must overcome to create a critical nucleus controls the rate of nucleation. This is similar to the relaxation of a two-level system discussed in Section 1.4. However, this simple picture neglects the many different possible clusters and the many ways they can convert into each other by the flipping of spins. A few different types of clusters are shown in Fig. 1.6.9.
[Figure 1.6.11 panels at t = 200, 240, 280, 320, 360, 400]

Figure 1.6.11 Frames from a simulation illustrating nucleation and growth in an Ising model in 2d. The temperature is T = zJ/3 and the magnetic field is h = −0.25. Glauber dynamics was used. Each time step consists of N updates where the space size is N = 60 × 60. Frames shown are in intervals of 40 time steps. The first frame shown is at t = 200 steps after the beginning of the simulation. Black squares are DOWN spins and white areas are UP spins. The
metastability of the UP phase is seen in the existence of only a few DOWN spins until the frame at t = 320. All earlier frames are qualitatively the same as the frames at t = 200, 240 and 280. A critical nucleus forms between t = 280 and t = 320. This nucleus grows systematically until the final frame when the whole system is in the equilibrium DOWN phase.

[Figure 1.6.11, second page of panels: t = 440, 480, 520, 560, 600, 640]
suddenly shifted above their boiling temperature. Small bubbles would then nucleate along the ionization trail left by charged particles moving through the tank. The bubbles could be photographed and the tracks of the particles identified. Such detectors were called bubble chambers. This methodology has been largely abandoned in favor of electronic detectors. There is a limit to how far a system can be supercooled or superheated. The limit is easy to understand in the Ising model. If a system with a positive magnetization m is subject to a negative magnetic field of magnitude greater than zJm, then each individual spin will flip DOWN independent of its neighbors. This is the ultimate limit for nucleation kinetics.
1.6.9 Connections between CA and the Ising model
Our primary objective throughout this section is the investigation of the equilibrium properties of interacting systems. It is useful, once again, to consider the relationship between the equilibrium ensemble and the kinetic CA we considered in Section 1.5. When a deterministic CA evolves to a unique steady state independent of the initial conditions, we can identify the final state as the T = 0 equilibrium ensemble. This is, however, not the way we usually consider the relationship between a dynamic system and its equilibrium condition. Instead, the equilibrium state of a system is generally regarded as the time average over microscopic dynamics. Thus when we use the CA to represent a microscopic dynamics, we could also identify a long time average of a CA as the equilibrium ensemble. Alternatively, we can consider a stochastic CA that evolves to a unique steady-state distribution where the steady state is the equilibrium ensemble of a suitably defined energy function.
1.7 Computer Simulations (Monte Carlo, Simulated Annealing)
Computer simulations enable us to investigate the properties of dynamical systems by directly studying the properties of particular models. Originally, the introduction of computer simulation was viewed by many researchers as an undesirable adjunct to analytic theory. Currently, simulations play such an important role in scientific studies that many analytic results are not believed unless they are tested by computer simulation. In part, this reflects the understanding that analytic investigations often require approximations that are not necessary in computer simulations. When a series of approximations has been made as part of an analytic study, a computer simulation of the original problem can directly test the approximations. If the approximations are validated, the analytic results often generalize the simulation results. In many other cases, simulations can be used to investigate systems where analytic results are unknown.
1.7.1 Molecular dynamics and deterministic simulations
The simulation of systems composed of microscopic Newtonian particles that experience forces due to interparticle interactions and external fields is called molecular dynamics. The techniques of molecular dynamics simulations, which integrate
Newton’s laws for individual particles, have been developed to optimize the efficiency of computer simulation and to take advantage of parallel computer architectures. Typically, these methods implement a discrete iterative map (Section 1.1) for the particle positions. The most common (Verlet) form is:

r(t) = 2r(t − ∆t) − r(t − 2∆t) + ∆t² a(t − ∆t)   (1.7.1)

where a(t) = F(t)/m, with F(t) the force on the particle calculated from models for interparticle and external forces. As in Section 1.1, time would be measured in units of the time interval ∆t for convenience and efficiency of implementation. Eq. (1.7.1) is algebraically equivalent to the iterative map in Question 1.1.4, which is written as an update of both position and velocity:

r(t) = r(t − ∆t) + ∆t v(t − ∆t/2)
v(t + ∆t/2) = v(t − ∆t/2) + ∆t a(t)   (1.7.2)

As indicated, the velocity is interpreted to be at half-integral times, though this does not affect the result of the iterative map.
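As a concrete illustration, the update of Eq. (1.7.2) can be coded directly. The sketch below applies it to a single particle in a harmonic well, a(x) = −(k/m)x; the spring constant, time step and run length are illustrative choices, not values from the text.

```python
def leapfrog(x, v_half, a, dt, steps):
    """Iterate Eq. (1.7.2): r(t) = r(t - dt) + dt*v(t - dt/2),
    then v(t + dt/2) = v(t - dt/2) + dt*a(t)."""
    for _ in range(steps):
        x = x + dt * v_half
        v_half = v_half + dt * a(x)
    return x, v_half

k_over_m = 1.0
a = lambda x: -k_over_m * x          # harmonic force, unit frequency
dt = 0.01

# Start at x = 1, v = 0; initialize the half-step velocity consistently.
x0, v0 = 1.0, 0.0
v_half = v0 + 0.5 * dt * a(x0)
x, v_half = leapfrog(x0, v_half, a, dt, steps=628)   # about one period, 2*pi
# x returns close to 1.0, reflecting the stability of the Verlet map.
```

The same loop with a different function a(x) serves for any conservative force model.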
For most such simulations of physical systems, the accuracy is limited by the use of models for interatomic interactions. Modern efforts attempt to improve upon this approach by calculating forces from quantum mechanics. However, such simulations are very limited in the number of particles and the duration of a simulation. A useful measure of the extent of a simulation is the product N t_max of the amount of physical time t_max and the number of particles that are simulated N. Even without quantum mechanical forces, molecular dynamics simulations are still far from being able to describe systems on a space and time scale comparable to human senses. However, there are many questions that can be addressed regarding microscopic properties of molecules and materials.
The development of appropriate simplified macroscopic descriptions of physical systems is an essential aspect of our understanding of these systems. These models may be based directly upon macroscopic phenomenology obtained from experiment. We may also make use of the microscopic information obtained from various sources, including both theory and experiment, to inform our choice of macroscopic models. It is more difficult, but important as a strategy for the description of both simple and complex systems, to develop systematic methods that enable macroscopic models to be obtained directly from microscopic models. The development of such methods is still in its infancy, and it is intimately related to the issues of emergent simplicity and complexity discussed in Chapter 8.
Abstract mathematical models that describe the deterministic dynamics for various systems, whether represented in the form of differential equations or deterministic cellular automata (CA, Section 1.5), enable computer simulation and study through integration of the differential equations or through simulation of the CA. The effects of external influences, not incorporated in the parameters of the model, may be modeled using stochastic variables (Section 1.2). Such models, whether of fluids or of galaxies, describe the macroscopic behavior of physical systems by assuming that the microscopic (e.g., molecular) motion is irrelevant to the macroscopic
phenomena being described. The microscopic behavior is summarized by parameters such as density, elasticity or viscosity. Such model simulations enable us to describe macroscopic phenomena on a large range of spatial and temporal scales.
1.7.2 Monte Carlo simulations
In our investigations of various systems, we are often interested in average quantities rather than a complete description of the dynamics. This was particularly apparent in Section 1.3, when equilibrium thermodynamic properties of systems were discussed. The ergodic theorem (Section 1.3.5) suggested that we can use an ensemble average instead of the space-time average of an experiment. The ensemble average enables us to treat problems analytically, when we cannot integrate the dynamics explicitly. For example, we studied equilibrium properties of the Ising model in Section 1.6 without reference to its dynamics. We were able to obtain estimates of its free energy, energy and magnetization by averaging various quantities using ensemble probabilities.

However, we also found that there were quite severe limits to our analytic capabilities even for the simplest Ising model. It was necessary to use the mean field approximation to obtain results analytically. The essential difficulty that we face in performing ensemble averages for complex systems, and even for the simple Ising model, is that the averages have to be performed over the many possible states of the system. For as few as one hundred spins, the number of possible states of the system, 2^100, is so large that we cannot average over all of the possible states. This suggests that we consider approximate numerical techniques for studying the ensemble averages. In order to perform the averages without summing over all the states, we must find some way to select a representative sample of the possible states.
Monte Carlo simulations were developed to enable numerical averages to be performed efficiently. They play a central role in the use of computers in science. Monte Carlo can be thought of as a general way of estimating averages by selecting a limited sample of states of the system over which the averages are performed. In order to optimize convergence of the average, we take advantage of information that is known about the system to select the limited sample. As we will see, under some circumstances, the sequence of states selected in a Monte Carlo simulation may itself be used as a model of the dynamics of a system. Then, if we are careful about designing the Monte Carlo, we can separate the time scales of a system by treating the fast degrees of freedom using an ensemble average and still treat explicitly the dynamic degrees of freedom.
To introduce the concept of Monte Carlo simulation, we consider finding the average of a function f(s), where the system variable s has the probability P(s). For simplicity, we take s to be a single real variable in the range [−1,+1]. The average can be approximated by a sum over equally spaced values s_i:

<f(s)> = ∫_{−1}^{1} f(s)P(s) ds ≈ Σ_{s_i} f(s_i)P(s_i) ∆s = (1/M) Σ_{n=−M}^{M} f(n/M) P(n/M)   (1.7.3)

This formula works well if the functions f(s) and P(s) are reasonably smooth and uniform in magnitude. However, when they are not smooth, this sum can be a very inef
ficient way to perform the integral. Consider this integral when P(s) is a Gaussian, and f(s) is a constant:

<f(s)> ∝ ∫_{−1}^{1} e^{−s²/2σ²} ds ≈ (1/M) Σ_{n=−M}^{M} e^{−(n/M)²/2σ²}   (1.7.4)

A plot of the integrand in Fig. 1.7.1 shows that for σ << 1 we are performing the integral by summing many values that are essentially zero. These values contribute nothing to the result and require as much computational effort as the comparatively few points that do contribute to the integral near s = 0, where the function is large. The few points near s = 0 will not give a very accurate estimate of the integral. Thus, most of the computational work is being wasted and the integral is not accurately evaluated. If we want to improve the accuracy of the sum, we have to increase the value of M. This means we will be summing many more points that are almost zero.

To avoid this problem, we would like to focus our attention on the region in Eq. (1.7.4) where the integrand is large. This can be done by changing how we select the points where we perform the average. Instead of picking the points at equal intervals along the line, we pick them with a probability given by P(s). This is the same as saying that we have an ensemble representing the system with the state variable s. Then we perform the ensemble average:

<f(s)> = ∫ f(s)P(s) ds = (1/N) Σ^N_{s:P(s)} f(s)   (1.7.5)
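The difference between the two prescriptions is easy to see numerically. The Python sketch below evaluates <f(s)> for f(s) = s² under a narrow Gaussian P(s), once with the uniform grid of Eq. (1.7.4) and once by drawing samples with probability P(s) as in Eq. (1.7.5); the value of σ and the sample sizes are illustrative choices.

```python
import math
import random

sigma = 0.05
f = lambda s: s * s                    # a smooth function to average

# Uniform grid of Eq. (1.7.4): most of the 2M+1 points sit where the
# Gaussian weight is vanishingly small and contribute essentially nothing.
M = 1000
weight = lambda s: math.exp(-s * s / (2 * sigma ** 2))
Z = sum(weight(n / M) for n in range(-M, M + 1))
grid_avg = sum(f(n / M) * weight(n / M) for n in range(-M, M + 1)) / Z

# Ensemble average of Eq. (1.7.5): draw s with probability P(s) directly,
# so every sample lands in the region that dominates the integral.
random.seed(1)
samples = [random.gauss(0.0, sigma) for _ in range(20000)]
mc_avg = sum(f(s) for s in samples) / len(samples)

# Both estimates approach <s^2> = sigma**2 = 0.0025, but the sampled sum
# wastes no effort on points where P(s) is negligible.
```

The advantage grows rapidly with dimensionality, which is why sampling is the only feasible route for the multidimensional Boltzmann averages discussed below.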
Figure 1.7.1 Plot of the Gaussian distribution illustrating that an integral that is performed by uniform sampling will use a lot of points to represent regions where the Gaussian is vanishingly small. The problem gets worse as σ becomes smaller compared to the region over which the integral must be performed. It is much worse in typical multidimensional averages where the Boltzmann probability is used. Monte Carlo simulations make such integrals computationally feasible by sampling the integrand in regions of high probability.
The latter expression represents the sum over N values of s, where these values have the probability distribution P(s). We have implicitly assumed that the function f(s) is relatively smooth compared to P(s). In Eq. (1.7.5) we have replaced the integral with a sum over an ensemble. The problem we now face is to obtain the members of the ensemble with probability P(s). To do this we will invert the ergodic theorem of Section 1.3.5.
Since Section 1.3 we have described an ensemble as representing a system, if the dynamics of the system satisfied the ergodic theorem. We now turn this around and say that the ensemble sum in Eq. (1.7.5) can be represented by any dynamics that satisfies the ergodic theorem, and which has as its equilibrium probability P(s). To do this we introduce a time variable t that, for our current purposes, just indicates the order of terms in the sum we are performing. The value of s appearing in the t-th term would be s(t). We then rewrite the ergodic theorem by considering the time average as an approximation to the ensemble average (rather than the opposite):

<f(s)> = (1/T) Σ_{t=1}^{T} f(s(t))   (1.7.6)

The problem remains to sequentially generate the states s(t), or, in other words, to specify the dynamics of the system. If we know the probability P(s), and s is a few binary or real variables, this may be done directly with the assistance of a random number generator (Question 1.7.1). However, often the system coordinate s represents a large number of variables. A more serious problem is that for models of physical systems, we generally don't know the probability distribution explicitly.

Thermodynamic systems are described by the Boltzmann probability (Section 1.3):

P({x,p}) = (1/Z) e^{−E({x,p})/kT},   Z = Σ_{{x,p}} e^{−E({x,p})/kT}   (1.7.7)

where {x,p} are the microscopic coordinates of the system, and E({x,p}) is the microscopic energy. An example of a quantity we might want to calculate would be the average energy:

U = (1/Z) Σ_{{x,p}} E({x,p}) e^{−E({x,p})/kT}   (1.7.8)

In many cases, as discussed in Section 1.4, the quantity that we would like to find the average of depends only on the position of particles and not on their momenta. We then write more generally

P(s) = (1/Z_s) e^{−F(s)/kT},   Z_s = Σ_s e^{−F(s)/kT}   (1.7.9)
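For the simple case just mentioned, where P(s) is known explicitly and s is a single variable, states can be generated directly from a uniform random number generator by inverting the cumulative distribution. The sketch below does this for a small discrete state space; the particular states and probabilities are arbitrary illustrative choices.

```python
import bisect
import random

states = [-1, 0, 1]                    # an illustrative three-state variable s
probs = [0.2, 0.5, 0.3]                # its assumed probabilities P(s)

# Cumulative distribution: a uniform u in [0,1) falls in bin i with
# probability probs[i] (inverse-transform sampling).
cum = []
total = 0.0
for p in probs:
    total += p
    cum.append(total)

def draw(rng):
    u = rng.random()
    # min() guards against u landing beyond the last bin by roundoff.
    return states[min(bisect.bisect_right(cum, u), len(states) - 1)]

rng = random.Random(2)
N = 100_000
counts = {s: 0 for s in states}
for _ in range(N):
    counts[draw(rng)] += 1
# counts[s] / N converges to P(s) as N grows.
```

When s stands for many coupled variables and only relative probabilities are available, this direct method fails, which is what motivates the Markov-chain construction that follows.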
where we use the system state variable s to represent the r elevant coordinates of the
system. We make no assumpt ion about the dimensionality of the coordinate s which
may, for example, be the coordinates {x} of all of the par ticles. F(s) is the fr ee energy
of the set of states associated with the coordinate s. A precise definition, which indi
cates both the variable s and its value s ′, is given in Eq. (1.4.27):
(1.7.10)
We note that Eq. (1.7.9) is often written using the notatio n E(s) (the energy of s) in
stead of F( s) (the free energy of s) ,t hough F(s) is more cor rect. An average we might
calculate, of a quantit y Q(s), would be:
(1.7.11)
where Q(s) is assumed t o depend only on the var iable s and not directly on {x,p}.
The problem with the evaluation of either Eq. (1.7.8) or Eq. (1.7.11) is that the
Boltzmann probability does not explicitly give us the probability of a particular state.
In order to find the actual probabilit y, we need to find the partition function Z. To cal
culate Z we need to perform a sum over all states of the syst em, which is computa
tionally impossible. Indeed,if we were able to calculate Z, then,as discussed in Section
1.3, we would know the free energy and all the other thermodynamic proper ties of the
system. So a prescription that relies upon knowing the act ual value of the probabilit y
doesn’t help us. However, it turns out that we don’t need to know the act ual proba
bility in order to constr uct a dynamics for the syst em, only the r elat ive probabilities
of part icular states. The relative probability of two states, P( s) / P(s′),is directly given
by the Boltzmann probability in ter ms of their relat ive energy:
P(s) / P(s′) · e
−(F(s) −F( s′)) / kT
(1.7.12)
This is the key to Monte Carlo simulations. It is also a natural result, since a system that is evolving in time does not know global properties that relate to all of its possible states. It only knows properties that are related to the energy it has, and how this energy changes with its configuration. In classical mechanics, the change of energy with configuration would be the force experienced by a particle.
Our task is to describe a dynamics that generates a sequence of states of a system s(t) with the proper probability distribution, P(s). The classical (Newtonian) approach to dynamics implies that a deterministic dynamics exists which is responsible for generating the sequence of states of a physical system. In order to generate the equilibrium ensemble, however, there must be contact with a thermal reservoir. Energy transfer between the system and the reservoir introduces an external interaction that disrupts the system's deterministic dynamics.
We will make our task simpler by allowing ourselves to consider a stochastic Markov chain (Section 1.2) as the dynamics of the system. The Markov chain is described by the probability P_s(s′|s″) of the system in a state s = s″ making a transition
F_s(s′) = −kT ln( Σ_{{x,p}} δ_{s,s′} e^{−E({x,p})/kT} )   (1.7.10)

U = (1/Z) Σ_s Q(s) e^{−F(s)/kT}   (1.7.11)
to the state s = s′. A particular sequence s(t) is generated by starting from one configuration and choosing its successors using the transition probabilities.
The general formulation of a Markov chain includes the classical Newtonian dynamics and can also incorporate the effects of a thermal reservoir. However, it is generally convenient and useful to use a Monte Carlo simulation to evaluate averages that do not depend on the momenta, as in Eq. (1.7.11). There are some drawbacks to this approach. It limits the properties of the system whose averages can be evaluated. Systems where interactions between particles depend on their momenta cannot be easily included. Moreover, averages of quantities that depend on both the momentum and the position of particles cannot be performed. However, if the energy separates into potential and kinetic energies as follows:
(1.7.13)
then averages over all quantities that just depend on momenta (such as the kinetic energy) can be evaluated directly without need for numerical computation. These averages are the same as those of an ideal gas. Monte Carlo simulations can then be used to perform the average over quantities that depend only upon position {x}, or more generally, on position-related variables s. Thus, in the remainder of this section we focus on describing Markov chains for systems described only by position-related variables s.
As described in Section 1.2, we can think about the Markov dynamics as a dynamics of the probability rather than the dynamics of a system. Then the dynamics are specified by
(1.7.14)
In order for the stochastic dynamics to represent the ensemble, we must have the time average over the probability distribution P_s(s′;t) equal to the ensemble probability. This is true for a long enough time average if the probability converges to the ensemble probability distribution, which is a steady-state distribution of the Markov chain:
(1.7.15)
Thermodynamics and stochastic Markov chains meet when we construct the Markov chain so that the Boltzmann probability, Eq. (1.7.9), is the limiting distribution.
We now make use of the Perron-Frobenius theorem (see Section 1.7.4 below), which says that a Markov chain governed by a set of transition probabilities P_s(s′|s″) converges to a unique limiting probability distribution as long as it is irreducible and acyclic. Irreducible means that there exist possible paths between each state and all other possible states of the system. This does not mean that all states of the system are connected by nonzero transition probabilities. There can be transition probabilities that are zero. However, it must be impossible to separate the states into two sets for which there are no transitions from one set to the other. Acyclic means that the system is not ballistic: the states are not organized by the transition matrix into a ring
E({x,p}) = V({x}) + Σ_i p_i²/2m   (1.7.13)

P_s(s′;t) = Σ_{s″} P_s(s′|s″) P_s(s″;t − 1)   (1.7.14)

P_s(s′) = P_s(s′;∞) = Σ_{s″} P_s(s′|s″) P_s(s″;∞)   (1.7.15)
with a deterministic flow around it. There may be currents, but they must not be deterministic. It is sufficient for there to be a single state which has a nonzero probability of making a transition to itself for this condition to be satisfied; thus it is often assumed and unstated.
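The convergence guaranteed by the Perron-Frobenius theorem is easy to see numerically. The sketch below (an illustration, not from the text; the 3-state transition matrix is an arbitrary example) iterates the probability dynamics of Eq. (1.7.14) from two different starting distributions of an irreducible, acyclic chain; both approach the same limiting distribution:

```python
def evolve(P, p, steps):
    """Iterate p(t) = P p(t-1), where P[i][j] is the probability of a
    transition from state j to state i (columns sum to one)."""
    n = len(p)
    for _ in range(steps):
        p = [sum(P[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p

if __name__ == "__main__":
    # Irreducible: every state can reach every other state.
    # Acyclic: state 0 has a nonzero probability of staying put.
    P = [[0.5, 0.3, 0.0],
         [0.5, 0.2, 0.6],
         [0.0, 0.5, 0.4]]
    a = evolve(P, [1.0, 0.0, 0.0], 100)
    b = evolve(P, [0.0, 0.0, 1.0], 100)
    print(max(abs(x - y) for x, y in zip(a, b)))  # essentially zero
```

Note the zero entries: irreducibility does not require every transition probability to be nonzero, only that no set of states is cut off from the rest.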
We can now summarize the problem of identifying the desired Markov chain. We must construct a matrix P_s(s′|s″) that satisfies three properties. First, it must be an allowable transition matrix. This means that it must be nonnegative, P_s(s″|s′) ≥ 0, and satisfy the normalization condition (Eq. (1.2.4)):
(1.7.16)
Second, it must have the desired probability distribution, Eq. (1.7.9), as a fixed point. Third, it must not be reducible: it is possible to construct a path between any two states of the system.
These conditions are sufficient to guarantee that a long enough Markov chain will be a good approximation to the desired ensemble. There is no guarantee that the convergence will be rapid. As we have seen in Section 1.4, in the case of the glass transition, the ergodic theorem may be violated on all practical time scales for systems that are following a particular dynamics. This applies to realistic or artificial dynamics. In general such violations of the ergodic theorem, or even just slow convergence of averages, are due to energy barriers or entropy "bottlenecks" that prevent the system from reaching all possible configurations of the system in any reasonable time. Such obstacles must be determined for each system that is studied, and are sometimes but not always apparent. It should be understood that different dynamics will satisfy the conditions of the ergodic theorem over very different time scales. The equivalence of results of an average performed using two distinct dynamics is only guaranteed if they are both simulated for long enough so that each satisfies the ergodic theorem.
Our discussion here also gives some additional insights into the conditions under which the ergodic theorem applies to the actual dynamics of physical systems. We note that any proof of the applicability of the ergodic theorem to a real system requires considering the actual dynamics rather than a model stochastic process. When the ergodic theorem does not apply to the actual dynamics, then the use of a Monte Carlo simulation for performing an average must be considered carefully. It will not give the same results if it satisfies the ergodic theorem while the real system does not.
We are still faced with the task of selecting values for the transition probabilities P_s(s′|s″) that satisfy the three requirements given above. We can simplify our search for transition probabilities P_s(s′|s″) for use in Monte Carlo simulations by imposing the additional constraint of microscopic reversibility, also known as detailed balance:

P_s(s″|s′) P_s(s′;∞) = P_s(s′|s″) P_s(s″;∞)   (1.7.17)

This equation implies that the transition currents between two states of the system are equal and therefore cancel in the steady state, Eq. (1.7.15). It corresponds to true equilibrium, as would be present in a physical system. Detailed balance implies the steady-state condition, but is not required by it. Steady state can also include currents that do
Σ_{s″} P_s(s″|s′) = 1   (1.7.16)
not change in time. We can prove that Eq. (1.7.17) implies Eq. (1.7.15) by summing over s′:
(1.7.18)
We do not yet have an explicit prescription for P_s(s′|s″). There is still a tremendous flexibility in determining the transition probabilities. One prescription that enables direct implementation, called Metropolis Monte Carlo, is:
(1.7.19)
These expressions specify the transition probability P_s(s′|s″) in terms of a symmetric stochastic matrix α(s′|s″). α(s′|s″) is independent of the limiting equilibrium distribution. The constraint associated with the limiting distribution has been incorporated explicitly into Eq. (1.7.19). It satisfies detailed balance by direct substitution in Eq. (1.7.17), since for P_s(s′) ≥ P_s(s″) (similarly for the opposite) we have
(1.7.20)
The symmetry of the matrix α(s′|s″) is essential to the proof of detailed balance. One must often be careful in the design of specific algorithms to ensure this property. It is also important to note that the limiting probability appears in Eq. (1.7.19) only in the form of a ratio P_s(s′)/P_s(s″), which can be given directly by the Boltzmann distribution.
To understand Metropolis Monte Carlo, it is helpful to describe a few examples. We first describe the movement of the system in terms of the underlying stochastic process specified by α(s′|s″), which is independent of the limiting distribution. This means that the limiting distribution of the underlying process is uniform over the whole space of possible states.
A standard way to choose the matrix α(s′|s″) is to set it to be constant for a few states s′ that are near s″. For example, the simplest random walk is such a case, since it allows a probability of 1/2 for the system to move to the right and to the left. If s is a continuous variable, we could choose a distance r₀ and allow the walker to take a step anywhere within the distance r₀ with equal probability. Both the discrete and continuous random walk have d-dimensional analogs or, for a system of interacting particles, N-dimensional analogs. When there is more than one dimension, we can choose to move in all dimensions simultaneously. Alternatively, we can choose to move in only one of the dimensions in each step. For an Ising model (Section 1.6), we could allow equal probability for any one of the spins to flip.
Once we have specified the underlying stochastic process, we generate the sequence of Monte Carlo steps by applying it. However, we must modify the probabilities according to Eq. (1.7.19). This takes the form of choosing a step, but sometimes rejecting it rather than taking it. When a step is rejected, the system does not change
Σ_{s′} P_s(s″|s′) P_s(s′;∞) = Σ_{s′} P_s(s′|s″) P_s(s″;∞) = P_s(s″;∞)   (1.7.18)

P_s(s′|s″) = α(s′|s″)   [P_s(s′)/P_s(s″) ≥ 1, s′ ≠ s″]
P_s(s′|s″) = α(s′|s″) [P_s(s′)/P_s(s″)]   [P_s(s′)/P_s(s″) < 1, s′ ≠ s″]
P_s(s″|s″) = 1 − Σ_{s′≠s″} P_s(s′|s″)   (1.7.19)

P_s(s″|s′) P_s(s′) = α(s″|s′) [P_s(s″)/P_s(s′)] P_s(s′) = α(s′|s″) P_s(s″) = P_s(s′|s″) P_s(s″)   (1.7.20)
its state. This gives rise to the third equation in Eq. (1.7.19), where the system does not move. Specifically, we can implement the Monte Carlo process according to the following prescription:
1. Pick one of the possible moves allowed by the underlying process. The selection is random from all of the possible moves. This guarantees that we are selecting it with the underlying probability α(s′|s″).
2. Calculate the ratio of probabilities between the location we are going to, compared to the location we are coming from
P_s(s′) / P_s(s″) = e^{−(E(s′) − E(s″))/kT}   (1.7.21)
If this ratio of probabilities is greater than one, which means the energy is lower where we are going, the step is accepted. This gives the probability for the process to occur as α(s′|s″), which agrees with the first line of Eq. (1.7.19). However, if this ratio is less than one, we accept it with a probability given by the ratio. For example, if the ratio is 0.6, we accept the move 60% of the time. If the move is rejected, the system stays in its original location. Thus, if the energy where we are trying to go is higher, we do not accept it all the time, only some of the time. The likelihood that we accept it decreases the higher the energy is.
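The prescription above can be implemented in a few lines. The following sketch (an illustration, not from the text; the five-state ring and its energies are arbitrary) uses a symmetric local move, a step to one of the two neighboring states, as the underlying process, applies the accept/reject rule, and compares the resulting frequencies of the states with the Boltzmann probabilities:

```python
import math
import random

def metropolis_chain(energies, kT, steps, seed=0):
    """Metropolis Monte Carlo on a ring of discrete states: propose one of
    the two neighbors with probability 1/2 each (a symmetric alpha), then
    accept the move with probability min(1, exp(-dE/kT))."""
    rng = random.Random(seed)
    n = len(energies)
    s = 0
    counts = [0] * n
    for _ in range(steps):
        t = (s + rng.choice((-1, 1))) % n          # symmetric local move
        dE = energies[t] - energies[s]
        if dE <= 0 or rng.random() < math.exp(-dE / kT):
            s = t                                  # accept the step
        counts[s] += 1                             # a rejected step recounts s
    return [c / steps for c in counts]

if __name__ == "__main__":
    E, kT = [0.0, 1.0, 0.5, 2.0, 0.3], 1.0
    freq = metropolis_chain(E, kT, 400_000)
    Z = sum(math.exp(-e / kT) for e in E)
    boltz = [math.exp(-e / kT) / Z for e in E]
    print(max(abs(f - b) for f, b in zip(freq, boltz)))  # small sampling error
```

Counting the unchanged state after a rejected move is essential; dropping rejected steps would not generate the Boltzmann distribution.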
The Metropolis Monte Carlo prescription makes logical sense. It tends to move the system to regions of lower energy. This must be the case in order for the final distribution to satisfy the Boltzmann probability. However, it also allows the system to climb up in energy so that it can reach, with a lower probability, states of higher energy. The ability to climb in energy also enables the system to get over barriers such as the one in the two-state system in Section 1.4.
For the Ising model, we can see that the Monte Carlo dynamics that uses all single spin flips as its underlying stochastic process is not the same as the Glauber dynamics (Section 1.6.7), but is similar. Both begin by selecting a particular spin. After selection of the spin, the Monte Carlo will set the spin to be the opposite with a probability:

min(1, e^{−(E(1) − E(−1))/kT})   (1.7.22)
This means that if the energy is lower for the spin to flip, it is flipped. If it is higher, it may still flip with the indicated probability. This is different from the Glauber prescription, which sets the selected spin to UP or DOWN according to its equilibrium probability (Eq. (1.6.61)–Eq. (1.6.63)). The difference between the two schemes can be shown by plotting the probability of a selected spin being UP as a function of the energy difference between UP and DOWN, E₊ = E(1) − E(−1) (Fig. 1.7.2). The Glauber dynamics prescription is independent of the starting value of the spin. The Metropolis Monte Carlo prescription is not. The latter causes more changes, since the spin is more likely to flip. Unlike the Monte Carlo prescription, the Glauber dynamics explicitly requires knowledge of the probabilities themselves. For a single spin flip in an Ising system this is fine, because there are only two possible states and the probabilities depend only on E₊. However, this is difficult to generalize when a system has many more possible states.
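The two single-spin rules can be compared side by side. In the sketch below (an illustration, not from the text), each rule is written as the probability that the selected spin ends UP, with E₊ = E(1) − E(−1); both leave the equilibrium probability 1/(1 + e^{E₊/kT}) unchanged, even though the Metropolis probabilities depend on the starting value of the spin:

```python
import math

def glauber_up(Eplus, kT):
    """Glauber rule: set the spin UP with its equilibrium probability,
    independent of its current value (Eplus = E(1) - E(-1))."""
    return 1.0 / (1.0 + math.exp(Eplus / kT))

def metropolis_up(Eplus, kT, previous):
    """Metropolis rule: flip the selected spin with probability
    min(1, exp(-dE/kT)), where dE is the energy change of the flip."""
    if previous == -1:
        return min(1.0, math.exp(-Eplus / kT))   # DOWN -> UP is the flip
    return 1.0 - min(1.0, math.exp(Eplus / kT))  # UP stays UP if the flip fails

if __name__ == "__main__":
    kT = 1.0
    for Eplus in (-2.0, 0.5, 2.0):
        pi_up = glauber_up(Eplus, kT)            # equilibrium probability
        m_up = metropolis_up(Eplus, kT, +1)
        m_dn = metropolis_up(Eplus, kT, -1)
        # One update applied to the equilibrium ensemble must return it:
        print(Eplus, pi_up * m_up + (1.0 - pi_up) * m_dn - pi_up)
```

The printed residuals are zero up to roundoff, illustrating that the two dynamics share the same limiting distribution while making different numbers of spin changes along the way.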
There is a way to generalize further the use of Monte Carlo by recognizing that we do not even have to use the correct equilibrium probability distribution when generating the time series. The generalized expression for an arbitrary probability distribution P′(s) is:
(1.7.23)
The subscript P(s) indicates that the average assumes that s has the probability distribution P(s) rather than P′(s). This equation generalizes Eq. (1.7.5). The problem with this expression is that it requires that we know explicitly the probabilities P(s) and P′(s). This can be remedied. We illustrate for a specific case, where we use the Boltzmann distribution at one temperature to evaluate the average at another temperature:
(1.7.24)
The ratio of partition functions can be directly evaluated as an average:
⟨f(s)⟩_{P(s)} = ∫ f(s) [P(s)/P′(s)] P′(s) ds = (1/N) Σ_{s:P′(s)}^{N} f(s) P(s)/P′(s)   (1.7.23)

⟨f(s)⟩_{P(s)} = (1/N) Σ_{s:P′(s)}^{N} f(s) P(s)/P′(s) = (Z′/Z) (1/N) Σ_{s:P′(s)}^{N} f(s) e^{−E(s)(1/kT − 1/k′T)}   (1.7.24)
Figure 1.7.2: Illustration of the difference between Metropolis Monte Carlo and Glauber dynamics for the update of a spin in an Ising model. The plots show the probability P_s(1;t) of a spin being UP at time t, as a function of E₊/kT. The Glauber dynamics probability does not depend on the starting value of the spin. There are two curves for the Monte Carlo probability, for s(t − 1) = 1 and s(t − 1) = −1.
(1.7.25)
Thus we have the expression:
(1.7.26)
This means that we can obtain the average at various temperatures using only a single Monte Carlo simulation. However, the whole point of using the ensemble average is to ensure that the average converges rapidly. This may not happen if the ensemble temperature T′ is much different from the temperature T. On the other hand, there are circumstances where the function f(s) may have an energy dependence that makes it better to perform the average using an ensemble that is not the equilibrium ensemble.
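The reweighting can be tried on a toy system. In the sketch below (not from the text; the energies and temperatures are arbitrary, and for simplicity the samples are drawn independently from the Boltzmann distribution at k′T rather than generated by a Markov chain), an average at temperature kT is estimated from samples taken at k′T, following the form of Eq. (1.7.26):

```python
import math
import random

def reweighted_average(f, energies, kT_target, kT_sim, n_samples, seed=0):
    """Estimate <f> at kT_target from states sampled at kT_sim by weighting
    each sample with exp(-E(s)(1/kT_target - 1/kT_sim))."""
    rng = random.Random(seed)
    weights = [math.exp(-e / kT_sim) for e in energies]
    states = rng.choices(range(len(energies)), weights=weights, k=n_samples)
    delta = 1.0 / kT_target - 1.0 / kT_sim
    num = sum(f(s) * math.exp(-energies[s] * delta) for s in states)
    den = sum(math.exp(-energies[s] * delta) for s in states)
    return num / den

if __name__ == "__main__":
    E = [0.0, 1.0, 0.5, 2.0, 0.3]
    est = reweighted_average(lambda s: E[s], E, 0.8, 1.2, 100_000)
    Z = sum(math.exp(-e / 0.8) for e in E)
    exact = sum(e * math.exp(-e / 0.8) for e in E) / Z
    print(est, exact)  # the estimate should be close to the exact average
```

As the text warns, the farther kT is from k′T, the fewer samples carry significant weight and the slower the convergence of the estimate.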
The approach of Monte Carlo simulations to the study of statistical averages ensures that we do not have to be concerned that the dynamics we are using for the system is a real dynamics. The result is the same for a broad class of artificial dynamics. The generality provides a great flexibility; however, this is also a limitation. We cannot use the Monte Carlo dynamics to study dynamics. We can only use it to perform statistical averages. Must we be resigned to this limitation? The answer, at least in part, is no. The reason is rooted in the central limit theorem. For example, the implementations of Metropolis Monte Carlo and the Glauber dynamics are quite different. We know that in the limit of long enough times, the distribution of configurations generated by both is the same. We expect that since each of them flips only one spin, if we are interested in changes in many spins, the two should give comparable results in the sense of the central limit theorem. This means that aside from an overall scale factor, the time evolution of the distribution of probabilities for long times is the same. Since we already know that the limiting distribution is the same in both cases, we are asserting that the approach to this limiting distribution, which is the long time dynamics, is the same.
The claim that for a large number of steps all dynamics is the same is not true about all possible Monte Carlo dynamics. If we allow all of the spins in an Ising model to change their values in one step of the underlying dynamics α(s′|s″), then this step would be equivalent to many steps in a dynamics that allows only one spin to flip at a time. In order for two different dynamics to give the same results, there are two types of constraints that are necessary. First, both must have similar kinds of allowed steps. Specifically, we define steps to the naturally proximate configurations as local moves. As long as the Monte Carlo allows only local moves, the long time dynamics should be the same. Such dynamics correspond to a local diffusion in the space of possible
Z/Z′ = Σ_s e^{−E(s)/kT} / Σ_s e^{−E(s)/k′T} = [Σ_s e^{−E(s)(1/kT − 1/k′T)} e^{−E(s)/k′T}] / [Σ_s e^{−E(s)/k′T}] = ⟨e^{−E(s)(1/kT − 1/k′T)}⟩_{P′(s)} = (1/N) Σ_{s:P′(s)}^{N} e^{−E(s)(1/kT − 1/k′T)}   (1.7.25)

⟨f(s)⟩_{P(s)} = [Σ_{s:P′(s)}^{N} f(s) e^{−E(s)(1/kT − 1/k′T)}] / [Σ_{s:P′(s)}^{N} e^{−E(s)(1/kT − 1/k′T)}]   (1.7.26)
configurations of the system. More generally, two different dynamics should be the same if configuration changes that require many steps in one also require many steps in the other. The second type of constraint is related to symmetries of the problem. A lack of bias in the random walk was necessary to guarantee that the Gaussian distribution resulted from a generalized random walk in Section 1.2. For systems with more than one dimension, we must also ensure that there is no relative bias between motion in different directions.
We can think about Monte Carlo dynamics as diffusive dynamics of a system that interacts frequently with a reservoir. There are properties of more realistic dynamics that are not reproduced by such configuration Monte Carlo simulations. Correlations between steps are not incorporated because of the assumptions underlying Markov chains. This rules out ballistic motion, and exact or approximate momentum conservation. Momentum conservation can be included if both position and momentum are included as system coordinates. The method called Brownian dynamics incorporates both ballistic and diffusive dynamics in the same simulation. However, if correlations in the dynamics of a system have a shorter range than the motion we are interested in, momentum conservation may not matter to results that are of interest, and conventional Monte Carlo simulations can be used directly.
In summary, Monte Carlo simulations are designed to reproduce an ensemble rather than the dynamics of a particular system. As such, they are ideally suited to investigating the equilibrium properties of thermodynamic systems. However, Monte Carlo dynamics with local moves often mimic the dynamics of real systems. Thus, Monte Carlo simulations may be used to investigate the dynamics of systems when they are appropriately designed. This property will be used in Chapter 5 to simulate the dynamics of long polymers.
There is a flip side to the design of Monte Carlo dynamics to simulate actual dynamics. If our objective is the traditional objective of a Monte Carlo simulation, of obtaining an ensemble average, then the ability to simulate dynamics may not be an advantage. In some systems, the real dynamics is slow and we would prefer to speed up the process. This can often be done by knowingly introducing nonlocal moves that displace the state of the system by large distances in the space of conformations. Such nonlocal Monte Carlo dynamics have been designed for various systems. In particular, both local and nonlocal Monte Carlo dynamics for the problem of polymer dynamics will be described in Chapter 5.
Question 1.7.1: In order to perform Monte Carlo simulations, we must be able to choose steps at random and accept or reject steps with a certain probability. These operations require the availability of random numbers. We might think of the source of these random numbers as a thermal reservoir. Computers are specifically designed to be completely deterministic. This means that inherently there is no randomness in their operation. To obtain random numbers in a computer simulation requires a deterministic algorithm that generates a sequence of numbers that look random but are not random. Such sequences are called pseudo-random numbers. Random
numbers should not be correlated to each other. However, using pseudo-random numbers, if we start a program over again we must get exactly the same sequence of numbers. The difficulties associated with the generation of random numbers are central to performing Monte Carlo computer simulations. If we assume that we have random numbers, and they are not really uncorrelated, then our results may very well be incorrect. Nevertheless, pseudo-random numbers often give results that are consistent with those expected from random numbers.
There are a variety of techniques to generate pseudo-random numbers. Many of these pseudo-random number generators are designed to provide, with equal "probability," an integer between 0 and the maximal integer possible. The maximum integer used by a particular routine on a particular machine should be checked before using it in a simulation. Some use a standard short integer which is represented by 16 bits (2 bytes). One bit represents the unused sign of the integer. This leaves 15 bits for the magnitude of the number. The pseudo-random number thus ranges up to 2^15 − 1 = 32767. An example of a routine that provides pseudo-random integers is the subroutine rand() in the ANSI C library, which is executed using a line such as:

k = rand();   (1.7.27)
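A concrete example of such a deterministic algorithm is a linear congruential generator. The sketch below (an illustration, not from the text; the multiplier and increment are commonly quoted example constants, not those of any particular machine's rand()) produces 15-bit integers and shows that restarting with the same seed repeats the sequence exactly:

```python
def lcg15(seed):
    """A minimal linear congruential generator yielding 15-bit integers,
    in the spirit of many historical rand() implementations."""
    state = seed
    while True:
        state = (1103515245 * state + 12345) % (1 << 31)
        yield (state >> 16) & 0x7FFF   # a pseudo-random value in 0..32767

if __name__ == "__main__":
    g1 = lcg15(1)
    a = [next(g1) for _ in range(5)]
    g2 = lcg15(1)                      # restart with the same seed...
    b = [next(g2) for _ in range(5)]
    print(a == b, all(0 <= x <= 32767 for x in a))
```

Discarding the low-order bits of the internal state (the right shift above) reflects the caution discussed in the solution below: the lowest-order bits of such generators tend to be the most correlated.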
The following three questions discuss how to use such a pseudo-random number generator. Assume that it provides a standard short integer.
1. Explain how to use a pseudo-random number generator to choose a move in a Metropolis Monte Carlo simulation, Eq. (1.7.19).
2. Explain how to use a pseudo-random number generator to accept or reject a move in a Metropolis Monte Carlo simulation, Eq. (1.7.19).
3. Explain how to use a pseudo-random number generator to provide values of x with a probability P(x), for x in the interval [0,1]. Hint: Use two pseudo-random numbers every step.
Solution 1.7.1:
1. Given the necessity of choosing one out of M possible moves, we create a one-to-one mapping between the M moves and the integers {0, ..., M − 1}. If M is smaller than 2^15, we can use the value of k = rand() to determine which move is taken next. If k is larger than M − 1, we don't make any move. If M is much smaller than 2^15, then we can use only some of the bits of k. This avoids making many unused calls to rand(). Fewer bits can be obtained using a modulo operation. For example, if M = 10 we might use k modulo 16. We could also ignore values above 32759, and use k modulo 10. This also causes each move to occur with equal frequency. However, a standard word of caution about using only a few bits is that we shouldn't use the lowest order bits (i.e., the units, twos and fours bits), because they tend to be more correlated than the
higher order bits. Thus it may be best first to divide k by a small number, like 8 (or equivalently to shift the bits to the right), if it is desired to use fewer bits. If M is larger than 2^15, it is necessary to use more than one call to rand() (or a random number generator that provides a 4-byte integer) so that all possible moves are accounted for.
2. Given the necessity of determining whether to accept a move with the probability P, we compare 2^15 P with a number given by k = rand(). If the former is bigger we accept the move, and if it is smaller we reject the move.
3. One way to do this is to generate two random numbers r₁ and r₂. Dividing both by 32767 (or 2^15), we use the first random number to be the location in the interval, x = r₁/32767. However, we use this location only if the second random number r₂/32767 is smaller than P(x). If the random number is not used, we generate two more and proceed. This means that we will use the position x with a probability P(x), as desired. Because it is necessary to generate many random numbers that are rejected, this method for generating numbers for use in performing the integral Eq. (1.7.3) is only useful if evaluations of the function f(x) are much more costly than random number generation.
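The method of part 3 can be written out directly. In this sketch (an illustration, not from the text), weight(x) plays the role of P(x), assumed to be at most 1; two 15-bit pseudo-random numbers are consumed per attempt, and the hypothetical choice weight(x) = x yields accepted samples with density 2x on [0,1], whose mean is 2/3:

```python
import random

def sample_px(weight, n_samples, seed=0):
    """Generate values x in [0,1] with probability proportional to
    weight(x) <= 1, using two 15-bit pseudo-random integers per attempt."""
    rng = random.Random(seed)
    out = []
    while len(out) < n_samples:
        r1 = rng.getrandbits(15)       # plays the role of k = rand()
        r2 = rng.getrandbits(15)
        x = r1 / 32767.0               # candidate location in [0,1]
        if r2 / 32767.0 < weight(x):   # keep x with probability weight(x)
            out.append(x)
    return out

if __name__ == "__main__":
    xs = sample_px(lambda x: x, 50_000)
    print(sum(xs) / len(xs))           # close to the exact mean 2/3
```

Roughly half of the attempts are rejected for this weight, illustrating the cost noted above: the method is attractive only when evaluating f(x) is far more expensive than drawing pseudo-random numbers.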
Question 1.7.2: To compare the errors that arise from conventional numerical integration and Monte Carlo sampling, we return to Eq. (1.7.4) and Eq. (1.7.5) in this and the following question. We choose two integrals that can be evaluated analytically and for which the errors can also be evaluated analytically.
Evaluate two examples of the integral ∫P(x)f(x)dx over the interval x ∈ [−1,1]. For the first example (1) take f(x) = 1, and for the second (2) f(x) = x. In both cases assume the probability distribution is an exponential
(1.7.28)
where the normalization constant A is given by the expression in square brackets.
Calculate the two integrals exactly (analytically). Then evaluate approximations to the integrals using sums over N equally spaced points, Eq. (1.7.4). These sums can also be evaluated analytically. To improve the result of the sum, you can use Simpson's rule. This modifies Eq. (1.7.4) only by subtracting 1/2 of the value of the integrand at the first and last points. The errors in evaluation of the same integral by Monte Carlo simulation are to be calculated in Question 1.7.3.
Solution 1.7.2:
1. The value of the integral of P(x) is unity, as required by normalization. If we use a sum over equally spaced points we would have:
P(x) = Ae^{−λx} = [λ/(e^{λ} − e^{−λ})] e^{−λx}   (1.7.28)
(1.7.29)
where we used the temporary definition a = e^{−λ/M} to obtain
(1.7.30)
Expanding the answer in powers of λ/M gives:
(1.7.31)
The second term can be eliminated by noting that the sum could be evaluated using Simpson's rule by subtracting 1/2 of the contribution of the end points. Then the third term gives an error of λ²/12M². This is the error in the numerical approximation to the average of f(x) = 1.
2. For f(x) = x the exact integral is:
(1.7.32)
while the sum is:
With some assistance from Mathematica, the expansion to second order in λ/M is:
A ∫_{−1}^{1} dx e^{−λx} ≈ (A/M) Σ_{n=−M}^{M} e^{−λ(n/M)} = (A/M) Σ_{n=−M}^{M} aⁿ   (1.7.29)

A ∫_{−1}^{1} dx e^{−λx} ≈ (A/M) (a^{M+1} − a^{−M})/(a − 1) = (A/M) (e^{−λ} e^{−λ/M} − e^{λ})/(e^{−λ/M} − 1)   (1.7.30)

A ∫_{−1}^{1} dx e^{−λx} ≈ A (e^{λ} − e^{−λ})/λ + A (e^{λ} + e^{−λ})/2M + A λ (e^{λ} − e^{−λ})/12M² + … = 1 + λ/(2M tanh(λ)) + λ²/12M² + …   (1.7.31)

A ∫_{−1}^{1} dx x e^{−λx} = −A (d/dλ) ∫_{−1}^{1} dx e^{−λx} = −A (d/dλ) [(e^{λ} − e^{−λ})/λ] = −coth(λ) + 1/λ   (1.7.32)

A ∫_{−1}^{1} dx x e^{−λx} ≈ (A/M²) Σ_{n=−M}^{M} n e^{−λ(n/M)} = (A/M²) Σ_{n=−M}^{M} n aⁿ = (A/M²) a (d/da) Σ_{n=−M}^{M} aⁿ = (A/M²) a (d/da) [(a^{M+1} − a^{−M})/(a − 1)] = (A/M²) ((M + 1) a^{M+1} + M a^{−M})/(a − 1) − (A/M²) a (a^{M+1} − a^{−M})/(a − 1)²   (1.7.33)

= A (e^{λ} − e^{−λ})/λ² − A (e^{λ} + e^{−λ})/λ − (A/2M)(e^{λ} − e^{−λ}) − (A/12M²)[(e^{λ} − e^{−λ}) + λ(e^{λ} + e^{−λ})] + … = 1/λ − coth(λ) − λ/2M − (λ + λ² coth(λ))/12M² + …   (1.7.34)
The first two terms are the correct result. The third term can be seen to be eliminated using Simpson's rule. The fourth term is the error.
Question 1.7.3: Estimate the errors in performing the same integrals as in Question 1.7.2 using a Monte Carlo ensemble sampling with N terms, as in Eq. (1.7.5). It is not necessary to evaluate the integrals to evaluate the errors.
Solution 1.7.3:
1. The errors in performing the integral for f(x) = 1 are zero, since the Monte Carlo sampling would be given by the expression:
(1.7.35)
One way to think about this result is that Monte Carlo takes advantage of the normalization of the probability, which the technique of summing the integrand over equally spaced points cannot do. This knowledge makes this integral trivial, but it is also of use in performing other integrals.
2. To evaluate the error for the integral over f(x) = x we use an argument based on the sampling error of different regions of the integral. We break up the domain [−1,1] into q regions of size Δx = 2/q. Each region is assumed to have a significant number of samples. The number of these samples is approximately given by:

NP(x)Δx   (1.7.36)

If this were the exact number of samples as q increased, then the integral would be exact. However, since we are picking the points at random, there will be a deviation in the number of these from this ideal value. The typical deviation, according to the discussion in Section 1.2 of random walks, is the square root of this number. Thus the error in the sum
(1.7.37)
from a particular interval Δx is

(NP(x)Δx)^{1/2} f(x)   (1.7.38)

Since this error could have either a positive or negative sign, we must take the square root of the sum of the squares of the error in each region to give us the total error:
(1.7.39)
P( x) f (x)
∫
−
1
N
f ( x)
s :P (s )
N
∑
≈
1
N
NP( x) x f ( x)
2
∑
≈
1
N
P(x ) f ( x)
2
∫
f ( x)
s :P (s )
N
∑
<1 >
P( s )
·
1
N
1
s :P (s )
N
∑
·1
202 I n t r o d u c t i o n a n d P r e l i m i n a r i e s
# 29412 Cust: AddisonWesley Au: BarYam Pg. No. 202
Title: Dynamics Complex Systems
Shor t / Normal / Long
01adBARYAM_29412 3/10/02 10:17 AM Page 202
For f(x) = x the integral in the square root is:

∫ A e^{−αx} f(x)² dx = ∫ A e^{−αx} x² dx = A (d²/dα²)[(e^α − e^{−α})/α] = (A/α)(e^α − e^{−α})(2/α² − (2/α) coth(α) + 1)	(1.7.40)

The approach of Monte Carlo is useful when the exponential is rapidly decaying. In this case, α >> 1, and we keep only the third term and have an error that is just of magnitude 1/√N. Comparing with the sum over equally spaced points from Question 1.7.2, we see that the error in Monte Carlo is independent of α for large α, while it grows for the sum over equally spaced points. This is the crucial advantage of the Monte Carlo method. However, for a fixed value of α we also see that the error is more slowly decreasing with N than the sum over equally spaced points. So when a large number of samples is possible, the sum over equally spaced points is more rapidly convergent.
Question 1.7.4 How would the discrete nature of the integer random numbers described in Question 1.7.1 affect the ensemble sampling? Answer qualitatively. Is there a limit to the accuracy of the integral in this case?
Solution 1.7.4 The integer random numbers introduce two additional sources of error, one due to the sampling interval along the x axis and the other due to the imperfect approximation of P(x). In the limit of a large number of samples, each of the possible values along the x axis would be sampled equally. Thus, the ensemble sum would reduce to a sum of the integrand over equally spaced points. The number of points is given by the largest integer used (e.g., 2^15). This limits the accuracy accordingly.
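The zero error for f(x) = 1 and the 1/√N error for f(x) = x can both be seen in a small simulation. This sketch samples P(x) ∝ e^{−αx} on [−1, 1] by inverting its cumulative distribution; the sampling method and all parameter values are illustrative assumptions, not constructions from the text:

```python
import math
import random

def sample(alpha, rng):
    # draw x in [-1, 1] from P(x) proportional to exp(-alpha*x), via the inverse CDF
    u = rng.random()
    return -math.log(math.exp(alpha) - u * (math.exp(alpha) - math.exp(-alpha))) / alpha

def mc_estimate(f, alpha, N, seed=0):
    # ensemble average (1/N) * sum of f(s) over N samples s drawn from P(s)
    rng = random.Random(seed)
    return sum(f(sample(alpha, rng)) for _ in range(N)) / N

alpha, N = 1.0, 100_000
est_one = mc_estimate(lambda x: 1.0, alpha, N)   # exactly 1, for any alpha and N
exact_x = 1 / alpha - 1 / math.tanh(alpha)       # <x> under the normalized P(x)
est_x = mc_estimate(lambda x: x, alpha, N)       # agrees with exact_x to ~1/sqrt(N)
print(est_one, est_x, exact_x)
```

The f = 1 estimate carries no sampling error at all, while the f = x estimate fluctuates around the exact value with the 1/√N spread discussed above.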
1.7.3 Perron-Frobenius theorem

The Perron-Frobenius theorem is tied to our understanding of the ergodic theorem and the use of Monte Carlo simulations for the representation of ensemble averages. The theorem only applies to a system with a finite space of possible states. It says that a transition matrix that is irreducible must ultimately lead to a stable limiting probability distribution. This distribution is unique, and thus depends only on the transition matrix and not on the initial conditions. The Perron-Frobenius theorem assumes an irreducible matrix, so that starting from any state, there is some path by which it is possible to reach every other state of the system. If this is not the case, then the theorem can be applied to each subset of states whose transition matrix is irreducible.

In a more general form than we will discuss, the Perron-Frobenius theorem deals with the effect of matrix multiplication when all of the elements of a matrix are positive. We will consider it only for the case of a transition matrix in a Markov chain, which also satisfies the normalization condition, Eq. (1.7.16). In this case, the proof of the Perron-Frobenius theorem follows from the statement that there cannot be any eigenvalues of the transition matrix that are larger than one. Otherwise there would be a vector that would increase everywhere upon matrix multiplication. This is not
possible, because probability is conserved. Thus if the probability increases in one place it must decrease someplace else, and tend toward the limiting distribution.
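The content of the theorem is easy to see numerically: repeated application of an irreducible transition matrix drives any initial distribution to the same limit. A minimal sketch (the 3-state matrix below is a hand-picked illustration, with each column P(s′|s) summing to one):

```python
def step(P, v):
    # one step of the Markov chain: v'(s') = sum_s P(s'|s) v(s)
    return [sum(P[sp][s] * v[s] for s in range(len(v))) for sp in range(len(P))]

P = [[0.5, 0.2, 0.3],
     [0.3, 0.6, 0.1],
     [0.2, 0.2, 0.6]]

v = [1.0, 0.0, 0.0]   # all probability in one state
w = [0.0, 0.5, 0.5]   # a different initial distribution
for _ in range(200):
    v = step(P, v)
    w = step(P, w)
print(v)
print(w)   # the two limits coincide: the limiting distribution is unique
```

Both runs converge to the same stationary distribution, independent of the initial conditions, as the theorem requires.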
A difficulty in the proof of the theorem arises from dealing with the case in which there are deterministic currents through the system: e.g., ballistic motion in a circular path. An example for a two-state system would be

P(1|1) = 0	P(1|−1) = 1
P(−1|1) = 1	P(−1|−1) = 0	(1.7.41)

In this case, a system in the state s = +1 goes into s = −1, and a system in the state s = −1 goes into s = +1. The limiting behavior of this Markov chain is of two probabilities that alternate in position without ever settling down into a limiting distribution. An example with three states would be

P(1|1) = 0	P(1|2) = 1	P(1|3) = 1
P(2|1) = .5	P(2|2) = 0	P(2|3) = 0	(1.7.42)
P(3|1) = .5	P(3|2) = 0	P(3|3) = 0

Half of the systems with s = 1 make transitions to s = 2 and half to s = 3. All systems with s = 2 and s = 3 make transitions to s = 1. In this case there is also a cyclical behavior that does not disappear over time. These examples are special cases, and the proof shows that they are special. It is sufficient, for example, for there to be a single state where there is some possibility of staying in the same state. Once this is true, these examples of cyclic currents do not apply and the system will settle down into a limiting distribution.
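The two-state example can be run directly, along with a modified chain in which some probability of staying put is added (the modified matrix is only an illustration of the "some possibility of staying in the same state" condition, not a construction from the text):

```python
def step(P, v):
    # one step of the Markov chain: v'(s') = sum_s P(s'|s) v(s)
    return [sum(P[sp][s] * v[s] for s in range(len(v))) for sp in range(len(P))]

cyclic = [[0.0, 1.0],
          [1.0, 0.0]]   # Eq. (1.7.41): the probabilities alternate forever
lazy = [[0.5, 0.5],
        [0.5, 0.5]]     # averaged with staying put: the cycling disappears

v = [1.0, 0.0]
w = [1.0, 0.0]
for _ in range(3):
    v = step(cyclic, v)
    w = step(lazy, w)
print(v)   # [0.0, 1.0]: still oscillating after an odd number of steps
print(w)   # [0.5, 0.5]: settled into the limiting distribution
```

The first chain never settles down; the second reaches its limiting distribution immediately once a chance of remaining in place is present.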
We will prove the Perron-Frobenius theorem in a few steps enumerated below. The proof is provided for completeness and reference, and can be skipped without significant loss for the purposes of this book. The proof relies upon properties of the eigenvectors and eigenvalues of the transition matrix. The eigenvectors need not always be positive, real or satisfy the normalization condition that is usually applied to probability distributions, P(s). Thus we use v(s) to indicate complex vectors that have a value at every possible state of the system.
Given an irreducible real nonnegative matrix (P(s′|s) ≥ 0) satisfying

∑_{s′} P(s′|s) = 1	(1.7.43)

we have:

1. Applying P(s′|s) cannot increase the value of all elements of a nonnegative vector, v(s′) ≥ 0:

min_{s′} [ (1/v(s′)) ∑_s P(s′|s) v(s) ] ≤ 1	(1.7.44)

To avoid infinities, we can assume that the minimization only includes s′ such that v(s′) ≠ 0.
Proof: Assume that Eq. (1.7.44) is not true. In this case

∑_s P(s′|s) v(s) > v(s′)	(1.7.45)

for all v(s′) ≠ 0, which implies

∑_{s′} ∑_s P(s′|s) v(s) > ∑_{s′} v(s′)	(1.7.46)

Using Eq. (1.7.43), the left is the same as the right and the inequality is impossible.

2. The magnitude of eigenvalues of P(s′|s) is not greater than one.

Proof: Let v(s) be an eigenvector of P(s′|s) with eigenvalue λ:

∑_s P(s′|s) v(s) = λ v(s′)	(1.7.47)

Then:

∑_s P(s′|s) |v(s)| ≥ |λ| |v(s′)|	(1.7.48)

This inequality follows because each term in the sum on the left has been made positive. If all terms started with the same phase, then equality holds. Otherwise, inequality holds. Comparing Eq. (1.7.48) with Eq. (1.7.44), we see that |λ| ≤ 1.

If |λ| = 1, then equality must hold in Eq. (1.7.48), and this implies that |v(s)|, the vector whose elements are the magnitudes of v(s), is an eigenvector with eigenvalue 1. Steps 3–5 show that there is one such vector which is strictly positive (greater than zero) everywhere.
3. P(s′|s) has an eigenvector with eigenvalue λ = 1. We use the notation v₁(s) for this vector.

Proof: The existence of such an eigenvector follows from the existence of an eigenvector of the transpose matrix with eigenvalue λ = 1. Eq. (1.7.43) implies that the vector v(s) = 1 (one everywhere) is an eigenvector of the transpose matrix with eigenvalue λ = 1. Thus v₁(s) exists, and by step 2 we can take it to be real and nonnegative, v₁(s) ≥ 0. We can, however, assume more, as the following shows.
4. An eigenvector of P(s′|s) with eigenvalue 1 must be strictly positive, v₁(s) > 0.

Proof: Define a new Markov chain given by the transition matrix

Q(s′|s) = (P(s′|s) + δ_{s,s′}) / 2	(1.7.49)

Applying Q(s′|s) N − 1 times to any vector v₁(s) ≥ 0 must yield a vector that is strictly positive. This follows because P(s′|s) is irreducible. Starting with unit probability at any one value of s, after N − 1 steps we will move some probability everywhere. Also, by the construction of Q(s′|s), any s which has a nonzero probability at one time will continue to have a nonzero probability at all later times. By linear superposition, this applies to any initial probability distribution. It also applies to any unnormalized vector v₁(s) ≥ 0. Moreover, if v₁(s) is an eigenvector of P(s′|s) with eigenvalue one, then it
is also an eigenvector of Q(s′|s) with the same eigenvalue. Since applying Q(s′|s) to v₁(s) changes nothing, applying it N − 1 times also changes nothing. We have just proven that v₁(s) must be strictly positive.
5. There is only one linearly independent eigenvector of P(s′|s) with eigenvalue λ = 1.

Proof: Assume there are two such eigenvectors: v₁(s) and v₂(s). Then we can make a linear combination c₁v₁(s) + c₂v₂(s), so that at least one of the elements is zero and others are positive. This linear combination is also an eigenvector of P(s′|s) with eigenvalue λ = 1, which violates step 4. Thus there is exactly one eigenvector of P(s′|s) with eigenvalue λ = 1, v₁(s):

∑_s P(s′|s) v₁(s) = v₁(s′)	(1.7.50)
6. Either P(s′|s) has only one eigenvalue λ with |λ| = 1 (in which case λ = 1), or it can be written as a cyclical flow.

Proof: Steps 2 and 5 imply that all eigenvectors of P(s′|s) with eigenvalues λᵢ satisfying |λᵢ| = 1 can be written as:

vᵢ(s) = Dᵢ(s) v₁(s) = e^{iθᵢ(s)} v₁(s)	(1.7.51)

As indicated, Dᵢ(s) is a vector with elements of magnitude one, |Dᵢ(s)| = 1. We can write

∑_s P(s′|s) Dᵢ(s) v₁(s) = λᵢ Dᵢ(s′) v₁(s′)	(1.7.52)

There cannot be any terms in the sum on the left of Eq. (1.7.52) that add terms of different phase. If there were, then we would have a smaller magnitude than adding the absolute values, which would not agree with Eq. (1.7.50). Thus we can assign all of the elements of Dᵢ(s) into groups that have the same phase. P(s′|s) cannot allow transitions to occur from any two of these groups into the same group. Since P(s′|s) is irreducible, the only remaining possibility is that the different groups are connected in a ring with the first mapped onto the second, and the second mapped onto the third, and so on until we return to the first group. In particular, if there are any transitions between a site and itself this would violate the requirements and we could have no complex eigenvalues.
7. A Markov chain governed by an irreducible transition matrix, which has only one eigenvector v₁(s) with |λ| = 1, has a limiting distribution over long enough times which is proportional to this eigenvector. Using P^t(s′|s) to represent the effect of applying P(s′|s) t times, we must prove that:

lim_{t→∞} v(s′;t) = lim_{t→∞} ∑_s P^t(s′|s) v(s) = c v₁(s′)	(1.7.53)

for v(s) ≥ 0. The coefficient c depends on the normalization of v(s) and v₁(s). If both are normalized so that the total probability is one, then conservation of probability implies that c = 1.
Proof: We write the matrix P(s′|s) in the Jordan normal form using a similarity transformation. In matrix notation:

P = S⁻¹ J S	(1.7.54)

J consists of a block diagonal matrix. Each of the block matrices along the diagonal is of the form

N =	[ λ 1 0 0 ]
	[ 0 λ ⋱ 0 ]
	[ 0 0 ⋱ 1 ]
	[ 0 0 0 λ ]	(1.7.55)

where λ is an eigenvalue of P. In this block the only nonzero elements are λs on the diagonal, and 1s just above the diagonal.

Since P^t = S⁻¹ J^t S, we consider J^t, which consists of diagonal blocks N^t. We prove that N^t → 0 as t → ∞ for |λ| < 1. This can be shown by evaluating explicitly the matrix elements. The qth element above the diagonal of N^t is:

(t choose q) λ^{t−q}	(1.7.56)

which vanishes as t → ∞.
Since 1 is an eigenvalue with only one eigenvector, there must be one 1 × 1 block along the diagonal of J for the eigenvalue 1. Then J^t as t → ∞ has only one nonzero element which is a 1 on the diagonal. Eq. (1.7.53) follows, because applying the matrix P^t always results in the unique column of S⁻¹ that corresponds to the nonzero diagonal element of J^t. By our assumptions, this column must be proportional to v₁(s). This completes our proof and discussion of the Perron-Frobenius theorem.
1.7.4 Minimization

At low temperatures, a thermodynamic system in equilibrium will be found in its minimum energy configuration. For this and other reasons, it is often useful to identify the minimum energy configuration of a system without describing the full ensemble. There are also many other problems that can be formulated as minimization or optimization problems.

Minimization problems are often described in a d-dimensional space of continuous variables. When there is only a single valley in the parameter space of the problem, there are a variety of techniques that can be used to obtain this minimum. They may be classified into direct search and gradient-based techniques. In this section we focus on the single-valley problem. In Section 1.7.5 we will discuss what happens when there is more than one valley.

Direct search techniques involve evaluating the energy at various locations and closing in on the minimum energy. In one dimension, search techniques can be very effective. The key to a search is bracketing the minimum energy. Then
each energy evaluation is used to geometrically shrink the possible domain of the minimum.
We start in one dimension by looking at the energy at two positions s₁ and s₂ that are near each other. If the left of the two positions s₁ is higher in energy, E(s₁) > E(s₂), then the minimum must be to its right. This follows from our assumption that there is only a single valley: the energy rises monotonically away from the minimum and therefore cannot be lower than E(s₂) anywhere to the left of s₁. Evaluating the energy at a third location s₃ to the right of s₂ further restricts the possible locations of the minimum. If E(s₃) is also greater than the middle energy location, E(s₃) > E(s₂), then the minimum must lie between s₁ and s₃. Thus, we have successfully bracketed the minimum. Otherwise, we have that E(s₃) < E(s₂), and the minimum must lie to the right of s₂. In this case we look at the energy at a location s₄ to the right of s₃. This process is continued until the energy minimum is bracketed. To avoid taking many steps to the right, the size of the steps to the right can be taken to be an increasing geometric series, or may be based on an extrapolation of the function using the values that are available.
Once the energy minimum is bracketed, the segment is bisected again and again to locate the energy minimum. This is an iterative process. We describe a simple version of this process that can be easily implemented. An iteration begins with three locations s₁ < s₂ < s₃. The values of the energy at these locations satisfy E(s₁), E(s₃) > E(s₂). Thus the minimum is between s₁ and s₃. We choose a new location s₄, which in even steps is s₄ = (s₁ + s₂)/2 and in odd steps is s₄ = (s₂ + s₃)/2. Then we eliminate either s₁ or s₃. The one that is eliminated is the one next to s₂ if E(s₂) > E(s₄), or the one next to s₄ if E(s₂) < E(s₄). The remaining three locations are relabeled to be s₁, s₂, s₃ for the next step. Iterations stop when the distance between s₁ and s₃ is smaller than an error tolerance which is set in advance. More sophisticated versions of this algorithm use improved methods for selecting s₄ that accelerate the convergence.
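The alternating bisection just described can be implemented in a few lines. This sketch follows the scheme above; the quadratic test energy, the starting bracket, and the tolerance are arbitrary illustrative choices:

```python
def minimize_1d(E, s1, s2, s3, tol=1e-9):
    # requires s1 < s2 < s3 with E(s1), E(s3) > E(s2) (a bracketed minimum)
    even = True
    for _ in range(1000):
        if s3 - s1 <= tol:
            break
        if even:
            s4 = (s1 + s2) / 2.0
            if E(s2) > E(s4):
                s1, s2, s3 = s1, s4, s2   # eliminate s3, the one next to s2
            else:
                s1, s2, s3 = s4, s2, s3   # eliminate s1, the one next to s4
        else:
            s4 = (s2 + s3) / 2.0
            if E(s2) > E(s4):
                s1, s2, s3 = s2, s4, s3   # eliminate s1, the one next to s2
            else:
                s1, s2, s3 = s1, s2, s4   # eliminate s3, the one next to s4
        even = not even
    return s2

print(minimize_1d(lambda s: (s - 0.3) ** 2, -1.0, 0.0, 2.0))  # close to 0.3
```

Each relabeling preserves the bracketing condition, and the bracket width shrinks geometrically, so the loop terminates quickly for any single-valley function.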
In higher-dimension spaces, direct search can be used. However, mapping a multidimensional energy surface is much more difficult. Moreover, the exact logic that enables an energy minimum to be bracketed within a particular domain in one dimension is not possible in higher-dimension spaces. Thus, techniques that make use of a gradient of the function are typically used even if the gradient must be numerically evaluated. The most common gradient-based minimization techniques include steepest descent, second order and conjugate gradient.
Steepest descent techniques involve taking steps in the direction of the most rapid descent direction as determined by the gradient of the energy. This is the same as using a first-order expansion of the energy to determine the direction of motion toward lower energy. Illustrating first in one dimension, we start from a position s₁ and write the expansion as:

E(s) = E(s₁) + (s − s₁) dE(s)/ds |_{s₁} + O((s − s₁)²)	(1.7.57)
We now take a step in the direction of the minimum by setting:

s₂ = s₁ − c dE(s)/ds |_{s₁}	(1.7.58)

From the expansion we see that for small enough c, E(s₂) must be smaller than E(s₁). The problem is to carefully select c so that we do not go too far. If we go too far we may reach beyond the energy minimum and increase the energy. We also do not want to make such a small step that many steps will be needed to reach the minimum. We can think of the sequence of configurations we pick as a time sequence, and the process we use to pick the next location as an iterative map. Then the minimum energy configuration is a fixed point of the iterative map given by Eq. (1.7.58). From a point near to the minimum we can have all of the behaviors described in Section 1.1: stable (converging) and unstable (diverging), both of these with or without alternation from side to side of the minimum. Of particular relevance is the discussion in Question 1.1.12 that suggests how c may be chosen to stabilize the iterative map and obtain rapid convergence.
When s is a multidimensional variable, Eq. (1.7.57) and Eq. (1.7.58) both continue to apply as long as the derivative is replaced by the gradient:

E(s) = E(s₁) + (s − s₁) ⋅ ∇_s E(s) |_{s₁} + O((s − s₁)²)	(1.7.59)

s₂ = s₁ − c ∇_s E(s) |_{s₁}	(1.7.60)

Since the direction opposite to the gradient is the direction in which the energy decreases most rapidly, this is known as a steepest descent technique. For the multidimensional case it is more difficult to choose a consistent value of c, since the behavior of the function may not be the same in different directions. The value of c may be chosen "on the fly" by making sure that the new energy is smaller than the old. If the current value of c gives a value E(s₂) which is larger than E(s₁) then c is reduced. We can improve upon this by looking along the direction of the gradient and considering the energy to be a function of c:

E(s₁ − c ∇_s E(s) |_{s₁})	(1.7.61)

Then c can be chosen by finding the actual minimum in this direction using the search technique that works well in one dimension.
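The "on the fly" choice of c can be sketched as follows: take the steepest descent step of Eq. (1.7.60), and halve c whenever the step would raise the energy. The quadratic bowl used here (with different curvatures in the two directions) and all parameter values are illustrative assumptions:

```python
def steepest_descent(E, grad, s, c=1.0, steps=300):
    # Eq. (1.7.60), with c reduced whenever a step would increase the energy
    for _ in range(steps):
        trial = [si - c * gi for si, gi in zip(s, grad(s))]
        if E(trial) < E(s):
            s = trial
        else:
            c /= 2.0
    return s

E = lambda s: s[0] ** 2 + 10 * s[1] ** 2     # narrow well: curvatures 2 and 20
grad = lambda s: [2 * s[0], 20 * s[1]]
s_min = steepest_descent(E, grad, [1.0, 1.0])
print(s_min)   # near [0, 0]
```

The initial c is too large for the stiff direction, so the first few trial steps are rejected until c shrinks enough to stabilize the iterative map; after that the energy decreases monotonically.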
Gradient techniques work well when different directions in the energy have the same behavior in the vicinity of the minimum energy. This means that the second derivative in different directions is approximately the same. If the second derivatives are very different in different directions, then the gradient technique tends to bounce back and forth perpendicular to the direction in which the second derivative is very small, without making much progress toward the minimum (Fig. 1.7.3). Improvements of the gradient technique fall into two classes. One class of techniques makes direct use of the second derivatives, the other does not. If we expand the energy to second order at the present best guess for the minimum energy location s₁ we have

E(s) = E(s₁) + (s − s₁) ⋅ ∇_s E(s) |_{s₁} + (s − s₁) ⋅ ∇_s ∇_s E(s) |_{s₁} ⋅ (s − s₁) + O((s − s₁)³)	(1.7.62)
Setting the gradient of this expression to zero gives the next approximation for the minimum energy location s₂ as:

s₂ = s₁ − (1/2) [ ∇_s ∇_s E(s) |_{s₁} ]⁻¹ ⋅ ∇_s E(s) |_{s₁}	(1.7.63)
This, in effect, gives a better description of the value of c for Eq. (1.7.60), which turns out to be a matrix inversely related to the second-order derivatives. Steps are large in directions in which the second derivative is small. If the second derivatives are not easily available, approximate second derivatives are used that may be improved upon as the minimization is being performed. Because of the need to evaluate the matrix of second-order derivatives and invert the matrix, this approach is not often convenient. In addition, the use of second derivatives assumes that the expansion is valid all the way to the minimum energy. For many minimization problems, this is not valid enough to be a useful approximation. Fortunately, there is a second approach called the conjugate gradient technique that often works as well and sometimes better.

Conjugate gradient techniques make use of the gradient but are designed to avoid the difficulties associated with long narrow wells where the steepest descent techniques result in oscillations. This is done by starting from a steepest descent in the first step of the minimization. In the second step, the displacement is taken to be along a direction that does not include the direction taken in the first step. Explicitly, let vᵢ be the direction taken in the ith step, then the first two directions would be:

v₁ = −∇_s E(s) |_{s₁}
v₂ = −∇_s E(s) |_{s₂} + v₁ (v₁ ⋅ ∇_s E(s) |_{s₂}) / (v₁ ⋅ v₁)	(1.7.64)
This ensures that v₂ is orthogonal to v₁. Subsequent directions are made orthogonal to some number of previous steps. The use of orthogonal directions avoids much of the problem of bouncing back and forth in the energy well.
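The construction of the second direction in Eq. (1.7.64) can be checked numerically. Everything here (the quadratic energy, the step size t along v₁) is an arbitrary illustration; the point is only that v₂ comes out orthogonal to v₁:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

grad = lambda s: [2 * s[0], 200 * s[1]]      # gradient of a long narrow well

s1 = [1.0, 1.0]
v1 = [-g for g in grad(s1)]                  # first direction: steepest descent
t = 0.005                                    # some step along v1 to reach s2
s2 = [si + t * vi for si, vi in zip(s1, v1)]
g2 = grad(s2)
v2 = [-gi + vi * dot(v1, g2) / dot(v1, v1) for gi, vi in zip(g2, v1)]  # Eq. (1.7.64)
print(dot(v1, v2))   # ~0: the second direction excludes the first
```

The inner product vanishes by construction: the component of the new gradient along v₁ is projected out before the second step is taken.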
Monte Carlo simulation can also be used to find minimum energy configurations if the simulations are done at zero temperature. A zero temperature Monte Carlo means that the steps taken always reduce the energy of the system. This approach works not only for continuous variables, but also for the discrete variables like in the Ising model. For the Ising model, the zero temperature Monte Carlo described above
Figure 1.7.3 Illustration of the difficulties in finding a minimum energy by steepest descent when the second derivative is very different in different directions. The steps tend to oscillate and do not make progress toward the minimum along the flat direction.
and the zero temperature Glauber dynamics are the same. Every selected spin is placed in its low energy orientation, aligned with the local effective field.
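For a concrete picture, here is a zero-temperature Monte Carlo for a small one-dimensional Ising chain (free boundary conditions; the chain length and the seeds are arbitrary illustrative choices). Each selected spin is set along its local effective field, so the energy can only decrease:

```python
import random

def energy(spins):
    # nearest-neighbor ferromagnet: E = -sum_i s_i * s_{i+1}
    return -sum(a * b for a, b in zip(spins, spins[1:]))

def zero_T_monte_carlo(spins, sweeps=20, seed=0):
    # align each randomly selected spin with its local effective field
    rng = random.Random(seed)
    N = len(spins)
    for _ in range(sweeps * N):
        i = rng.randrange(N)
        field = (spins[i - 1] if i > 0 else 0) + (spins[i + 1] if i < N - 1 else 0)
        if field != 0:               # field = 0: flipping costs no energy
            spins[i] = 1 if field > 0 else -1
    return spins

rng = random.Random(42)
spins = [rng.choice([-1, 1]) for _ in range(20)]
e0 = energy(spins)
e1 = energy(zero_T_monte_carlo(spins))
print(e0, e1)   # the energy never increases under these dynamics
```

The chain typically freezes into a few aligned domains rather than the uniform ground state, which is exactly the kind of trapping in a local minimum that the next paragraphs discuss.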
None of these techniques are suited to finding the minimum energy configuration if there are multiple energy minima, and we do not know if we are located near the correct minimum energy location. One way to address this problem is to start from various initial configurations and to look for the local minimum nearby. By doing this many times it might be possible to identify the global minimum energy. This works when there are only a few different energy minima. There are no techniques that guarantee finding the global minimum energy for an arbitrary energy function E(s). However, by using Monte Carlo simulations that are not at T = 0, a systematic approach called simulated annealing has been developed to try to identify the global minimum.
1.7.5 Simulated annealing

Simulated annealing was introduced relatively recently as an approach to finding the global minimum when the energy or other optimization function contains many local minima. The approach is based on the physical process of heating a system and cooling it down slowly. The minimum energy for many simple materials is a crystal. If a material is heated to a liquid or vapor phase and cooled rapidly, the material does not crystallize. It solidifies as a glass or amorphous solid. On the other hand, if it is cooled slowly, crystals may form. If the material is formed out of several different kinds of atoms, the cooling may also result in phase separation into particular compounds or atomic solids. The separated compounds are lower in energy than a rapidly cooled mixture.
Simulated annealing works in much the same way. A Monte Carlo simulation is started at a high temperature. Then the temperature is lowered according to a cooling schedule until the temperature is so low that no additional movements are likely. If the procedure is effective, the final energy should be the lowest energy of the simulation. We could also keep track of the energy during the simulation and take the lowest value, and the configuration at which the lowest value was reached.
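A minimal simulated annealing sketch for a one-dimensional double well follows. The geometric cooling schedule, the Metropolis acceptance rule, and all numerical parameters are illustrative choices (the design of cooling schedules is discussed separately below); started in the higher well, the run can still end near the global minimum:

```python
import math
import random

def anneal(E, s, T0=2.0, cooling=0.9997, steps=50_000, seed=1):
    # Metropolis Monte Carlo with a slowly decreasing temperature,
    # keeping track of the lowest-energy configuration visited
    rng = random.Random(seed)
    T = T0
    best_s, best_E = s, E(s)
    for _ in range(steps):
        trial = s + rng.uniform(-0.5, 0.5)
        dE = E(trial) - E(s)
        if dE < 0 or rng.random() < math.exp(-dE / T):
            s = trial
            if E(s) < best_E:
                best_s, best_E = s, E(s)
        T *= cooling
    return best_s

# double well: local minimum near s = -0.96, global minimum near s = +1.04
E = lambda s: (s * s - 1) ** 2 - 0.3 * s
best = anneal(E, -1.0)   # started in the wrong well
print(best)
```

At high temperature the walker crosses the barrier freely and visits both wells; as the temperature falls it settles, and the recorded best configuration is typically in the deeper well.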
In general, simulated annealing improves upon methods that find only a local minimum energy, such as steepest descent, discussed in the previous section. For some problems, the improvement is substantial. Even if the minimum energy that is found is not the absolute minimum in energy of the system, it may be close. For example, in problems where there are many configurations that have roughly the same low energy, simulated annealing may find one of the low-energy configurations.

However, simulated annealing does not work well for all problems, and for some problems it fails completely. It is also true that annealing of physical materials does not always result in the lowest energy conformation. Many materials, even when cooled slowly, result in polycrystalline materials, disordered solids and mixtures. When it is important for technological reasons to reach the lowest energy state, special techniques are often used. For example, the best crystal we know how to make is silicon. In order to form a good silicon crystal, it is grown using careful nonuniform cooling. A single crystal can be gradually pulled from a liquid that solidifies only on the surfaces of the existing crystal. Another technique for forming crystals is growth from the vapor phase, where atoms are deposited on a previously formed crystal that serves as a template for the continuing growth. The difficulties inherent in obtaining materials in their lowest energy state are also apparent in simulations.
In Section 1.4 we considered the cooling of a two-state system as a model of a glass transition. We can think about this simulation to give us clues about why both physical and simulated annealing sometimes fail to find low energy states of the system. We saw that using a constant cooling rate leaves some systems stuck in the higher energy well. When there are many such high energy wells then the system will not be successful in finding a low energy state. The problem becomes more difficult if the height of the energy barrier between the two wells is much larger than the energy difference between the upper and lower wells. In this case, at higher temperatures the system does not care which well it is in. At low temperatures when it would like to be in the lower energy well, it cannot overcome the barrier. How well the annealing works in finding a low energy state depends on whether we care about the energy scale characteristic of the barrier, or characteristic of the energy difference between the two minima.
There is another characteristic of the energy that can help or hurt the effectiveness of simulated annealing. Consider a system where there are many local minimum energy states (Fig. 1.7.4). We can think about the effect of high temperatures as placing the system in one of the many wells of the energy minima. These wells are called basins of attraction. A system in a particular basin of attraction will go into the minimum energy configuration of the basin if we suddenly cool to zero temperature. We

Figure 1.7.4 Schematic plot of a system energy E(s) as a function of a system coordinate s. In simulated annealing, the location of a minimum energy is sought by starting from a high temperature Monte Carlo and cooling the system to a low temperature. At the high temperature the system has a high kinetic energy and explores all of the possible configurations. As the temperature is cooled it descends into one of the wells, called basins of attraction, and cannot escape. Finally, when the temperature is very low it loses all kinetic energy and sits in the bottom of the well. Minima with larger basins of attraction are more likely to capture the system. Simulated annealing works best when the lowest-energy minima have the largest basins of attraction.
also can see that the gradual cooling in simulated annealing will result in low-energy states if the size of the basin of attraction increases with the depth of the well. This means that at high temperatures the system is more likely to be in the basin of attraction of a lower energy minimum. Thus, simulated annealing works best when energy varies in the space in such a way that deep energy minima also have large basins of attraction. This is sometimes but not always true, both in physical systems and in mathematical optimization problems.
Another way to improve the performance of simulated annealing is to introduce nonlocal Monte Carlo steps. If we understand the characteristics of the energy, we can design steps that take us through energy barriers. The problem with this approach is that if we don't know the energy surface well enough, then moving around in the space by arbitrary nonlocal steps will result in attempts to move to locations where the energy is high. These steps will be rejected by the Monte Carlo and the nonlocal moves will not help. An example where nonlocal Monte Carlo moves can help is treatments of low-energy atomic configurations in solids. Nonlocal steps can allow atoms to move through each other, switching their relative positions, instead of trying to move gradually around each other.
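These ideas can be illustrated by a minimal Metropolis-style simulated annealing sketch. The potential, move sizes, jump probability, and schedule below are illustrative choices introduced here, not taken from the text; the sketch mixes small local steps with occasional large nonlocal jumps that can carry the system directly through an energy barrier.

```python
import math
import random

def energy(x):
    # An illustrative rugged 1D potential with two wells: the global
    # minimum is near x ~ -2.3, a shallower local minimum near x ~ +2.1.
    return 0.1 * x**4 - x**2 + 0.5 * x

def anneal(steps=20000, T0=5.0, seed=0):
    rng = random.Random(seed)
    x = rng.uniform(-3.0, 3.0)
    best_x, best_E = x, energy(x)
    for t in range(1, steps + 1):
        T = T0 / math.log(t + 1)  # logarithmic cooling, in the form of Eq. (1.7.65)
        # Mostly small local steps; occasionally a large nonlocal jump
        # that can take the system through an energy barrier.
        if rng.random() < 0.9:
            x_new = x + rng.gauss(0.0, 0.1)
        else:
            x_new = rng.uniform(-4.0, 4.0)
        dE = energy(x_new) - energy(x)
        # Metropolis acceptance: downhill moves always, uphill moves
        # with Boltzmann probability exp(-dE/T).
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            x = x_new
            if energy(x) < best_E:
                best_x, best_E = x, energy(x)
    return best_x, best_E

best_x, best_E = anneal()
```

With the nonlocal jumps included, the walk finds the deeper well even when it starts in the basin of the shallow one; removing the jump branch makes the outcome depend on the initial basin, as discussed above.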
Finally, for the success of simulated annealing, it is often necessary to design the cooling schedule carefully. Generally, the slower the cooling, the more likely the simulation will end up in a low-energy state. However, given a finite amount of computer and human time, it is impossible to allow an arbitrarily slow cooling. Often there are particular temperatures where the cooling rate is crucial. This happens at phase transitions, such as at the liquid-to-solid phase boundary. If we know of such a transition, then we can cool rapidly down to the transition, cool very slowly in its vicinity, and then speed up thereafter. The most difficult problems are those where there are barriers of varying heights, leading to a need to cool slowly at all temperatures.
For some problems the cooling rate should be slowed as the temperature becomes lower. One way to achieve this is to use a logarithmic cooling schedule. For example, we set the temperature T(t) at time step t of the Monte Carlo to be:

T(t) = T_0 / ln(t/t_0 + 1)   (1.7.65)
where t_0 and T_0 are parameters that must be chosen for the particular problem. In Question 1.7.5 we show that for the two-state system, if kT_0 > (E_B − E_1), then the system will always relax into its ground state.
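A small sketch of this schedule (the parameter values and the function name `log_schedule` are illustrative choices, not from the text):

```python
import math

def log_schedule(t, t0, T0):
    # Logarithmic cooling schedule of Eq. (1.7.65): T(t) = T0 / ln(t/t0 + 1)
    return T0 / math.log(t / t0 + 1.0)

# The temperature falls monotonically, and the cooling rate itself
# slows as the temperature becomes lower:
temps = [log_schedule(t, t0=1.0, T0=2.0) for t in (1, 10, 100, 1000, 10000)]
```

Successive drops in temperature shrink as t grows, which is exactly the "cool more slowly at lower temperature" behavior the text calls for.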
Question 1.7.5: Show that by using a logarithmic cooling schedule, Eq. (1.7.65), where kT_0 > (E_B − E_1), the two-state system of Section 1.4 always relaxes into the ground state. To simplify the problem, consider an incremental time ∆t during which the temperature is fixed. Show that the system will still relax to the equilibrium probability over this incremental time, even at low temperatures.
Solution 1.7.5: We write the solution of the time evolution during the incremental time ∆t from Eq. (1.4.45) as:

P(1; t + ∆t) − P(1; ∞) = (P(1; t) − P(1; ∞)) e^{−∆t/τ(t)}   (1.7.66)
where P(1; ∞) is the equilibrium value of the probability for the temperature T(t), and τ(t) is the relaxation time at the temperature T(t). In order for relaxation to occur we must have e^{−∆t/τ(t)} << 1, or equivalently:

∆t/τ(t) >> 1   (1.7.67)

We calculate τ(t) from Eq. (1.4.44):

1/τ(t) = ν (e^{−(E_B − E_1)/kT(t)} + e^{−(E_B − E_{−1})/kT(t)}) > 2ν e^{−(E_B − E_1)/kT(t)} = 2ν (t/t_0 + 1)^{−α}   (1.7.68)
where we have substituted Eq. (1.7.65) and defined α = (E_B − E_1)/kT_0. We make the reasonable assumption that we start our annealing at a high temperature, where relaxation is not a problem. Then by the time we get to the low temperatures that are of interest, t >> t_0, so:

1/τ(t) > 2ν (t/t_0)^{−α}   (1.7.69)

and, taking the incremental time to be comparable to the elapsed time, ∆t ∼ t (for the logarithmic schedule the temperature changes appreciably only over times of order t):

∆t/τ(t) > 2ν t_0^α t^{1−α}   (1.7.70)

For α < 1 the right-hand side increases with time, and thus the relaxation improves with time according to Eq. (1.7.67). If relaxation occurs at higher temperatures, it will continue to occur at all lower temperatures despite the increasing relaxation time.
1.8 Information
Ultimately, our ability to quantify complexity (How complex is it?) requires a quantification of information (How much information does it take to describe it?). In this section, we discuss information. We will also need the computation theory described in Section 1.9 to discuss complexity in Chapter 8. A quantitative theory of information was developed by Shannon to describe the problem of communication; specifically, how much information can be communicated through a transmission channel (e.g., a telephone line) with a specified alphabet of letters and a rate at which letters can be transmitted. The simplest example is a binary alphabet consisting of two characters (digits) with a fixed rate of binary digits (bits) per second. However, the theory is general enough to describe quite arbitrary alphabets, letters of variable duration such as are involved in Morse code, or even continuous sound with a specified bandwidth. We will not consider many of the additional applications; our objective is to establish the basic concepts.
1.8.1 The amount of information in a message
We start by considering the information content of a string of digits s = (s_1 s_2 ... s_N). One might naively expect that information is contained in the state of each digit. However, when we receive a digit, we not only receive information about what the digit is, but
also what the digit is not. Let us assume that a digit in the string of digits we receive is the number 1. How much information does this provide? We can contrast two different scenarios—binary and hexadecimal digits:
1. There were two possibilities for the number, either 0 or 1.
2. There were sixteen possibilities for the number {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}.
In which of these did the “1” communicate more information? Since the first case provides us with the information that it is “not 0,” while the second provides us with the information that it is “not 0,” “not 2,” “not 3,” etc., the second provides more information. Thus there is more information in a digit that can have sixteen states than in a digit that can have only two states. We can quantify this difference if we consider a binary representation of hexadecimal digits {0000, 0001, 0010, 0011, …, 1111}. It takes four binary digits to represent one hexadecimal digit. The hexadecimal number 1 is represented as 0001 in binary form and uses four binary digits. Thus a hexadecimal 1 contains four times as much information as a binary 1.
We note that the amount of information does not depend on the particular value that is taken by the digit. For hexadecimal digits, consider the case of a digit that has the value 5. Is there any difference in the amount of information given by the 5 than if it were 1? No, either number contains the same amount of information.
This illustrates that information is actually contained in the distinction between the state of a digit and the other possible states the digit may have. In order to quantify the concept of information, we must specify the number of possible states. Counting states is precisely what we did when we defined the entropy of a system in Section 1.3. We will see that it makes sense to define the information content of a string in the same way as the entropy—the logarithm of the number Ω of possible states of the string:

I(s) = log_2(Ω)   (1.8.1)
By convention, the information is defined using the logarithm base two. Thus, the information contained in a single binary digit, which has two possible states, is log_2(2) = 1. More generally, the number of possible states of a string of N bits, with each bit taking one of two values (0 or 1), is 2^N. Thus the information in a string of N bits is (in what follows the function log( ) will be assumed to be base two):

I(s) = log(2^N) = N   (1.8.2)

Eq. (1.8.2) says that each bit provides one unit of information. This is consistent with the intuition that the amount of information grows linearly with the length of the string. The logarithm is essential, because the number of possible states grows exponentially with the length of the string, while the information grows linearly.
It is important to recognize that the definition of information we have given assumes that each of the possible realizations of the string has equal a priori probability. We use the phrase a priori to emphasize that this refers to the probability prior to receipt of the string—once the string has arrived there is only one possibility.
To think about the role of probability we must discuss further the nature of the message that is being communicated. We construct a scenario involving a sender and a receiver of a message. In order to make sure that the recipient of the message could not have known the message in advance (so there is information to communicate), we assume that the sender of the information is sending the result of a random occurrence, like the flipping of a coin or the throwing of a die. To enable some additional flexibility, we assume that the random occurrence is the drawing of a ball from a bag. This enables us to construct messages that have different probabilities. To be specific, we assume there are ten balls in the bag numbered from 0 to 9. All of them are red except the ball marked 0, which is green. The person communicating the message only reports whether the ball drawn from the bag is red (using the digit 1) or green (using the digit 0). The recipient of the message is assumed to know about the setup. If the recipient receives the number 0, he then knows exactly which ball was selected, and all that were not selected. However, if he receives a 1, this provides less information, because he only knows that one of nine was selected, not which one. We notice that the digit 1 is nine times as likely to occur as the digit 0. This suggests that a higher probability digit contains less information than a lower probability digit.
We generalize the definition of the information content of a string of digits to allow for the possibility that different strings have different probabilities. We assume that the string is one of an ensemble of possible messages, and we define the information as:

I(s) = −log(P(s))   (1.8.3)

where P(s) is the probability of the occurrence of the message s in the ensemble. Note that in the case of equal a priori probability, P(s) = 1/Ω, Eq. (1.8.3) reduces to Eq. (1.8.1). The use of probabilities in the definition of information makes sense in one of two cases: (1) the recipient knows the probabilities that represent the conventions of the transmission, or (2) a large number of independent messages are sent, and we are considering the information communicated by one of them. Then we can approximate the probability of a message by its proportion of appearance among the messages sent. We will discuss these points in greater detail later.
Question 1.8.1 Calculate the information, according to Eq. (1.8.3), that is provided by a single digit in the example given in the text of drawing red and green balls from a bag.

Solution 1.8.1 For the case of a 0, the information is the same as that of a decimal digit:

I(0) = −log(1/10) ≈ 3.32   (1.8.4)

For the case of a 1 the information is

I(1) = −log(9/10) ≈ 0.152   (1.8.5)
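These two values can be checked directly from Eq. (1.8.3); a small sketch (the helper name `info_bits` is introduced here for illustration):

```python
import math

def info_bits(p):
    # Information in bits of receiving an event of probability p, Eq. (1.8.3)
    return -math.log2(p)

i_green = info_bits(1 / 10)  # the digit 0: the single green ball
i_red = info_bits(9 / 10)    # the digit 1: one of the nine red balls
```

The more probable digit indeed carries less information, as argued above.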
We can specialize the definition of information in Eq. (1.8.3) to a message s = (s_1 s_2 ... s_N) composed of individual characters (bits, hexadecimal characters, ASCII characters, decimals, etc.) that are completely independent of each other
(for example, each corresponding to the result of a separate coin toss). This means that the total probability of the message is the product of the probability of each character, P(s) = ∏_i P(s_i). Then the information content of the message is given by:

I(s) = −∑_i log(P(s_i))   (1.8.6)

If all of the characters have equal probability and there are k possible characters in the alphabet, then P(s_i) = 1/k, and the information content is:

I(s) = N log(k)   (1.8.7)

For the case of binary digits, this reduces to Eq. (1.8.2). For other cases, like the hexadecimal case, k = 16, this continues to make sense: the information I = 4N corresponds to the requirement of representing each hexadecimal digit with four bits. Note that the previous assumption of equal a priori probability for the whole string is stronger than the independence of the digits and implies it.
Question 1.8.2 Apply the definition of information content in Eq. (1.8.3) to each of the following cases. Assume messages consist of a total of N bits subject to the following constraints (aside from the constraints, assume equal probabilities):
1. Every even bit is 1.
2. Every (odd, even) pair of bits is either 11 or 00.
3. Every eighth bit is a parity bit (the sum modulo 2 of the previous seven bits).
Solution 1.8.2: In each case, we first give an intuitive argument, and then we show that Eq. (1.8.3) or Eq. (1.8.6) gives the same result.
1. The only information that is transferred is the state of the odd bits. This means that only half of the bits contain information. The total information is N/2. To apply Eq. (1.8.6), we see that the even bits, which always have the value 1, have a probability P(1) = 1, which contributes no information. Note that we never have to consider the case P(0) = 0 for these bits, which is good, because by the formula it would give infinite information. The odd bits, with equal probabilities P(1) = P(0) = 1/2, give an information of one for either value received.
2. Every pair of bits contains only two possibilities, giving us the equivalent of one bit of information rather than two. This means that the total information is N/2. To apply Eq. (1.8.6), we have to consider every (odd, even) pair of bits as a single character. These characters can never have the value 01 or 10, and they have the value 11 or 00 with probability P(11) = P(00) = 1/2, which gives the expected result. We will see later that there is another way to think about this example by using conditional probabilities.
3. The number of independent pieces of information is 7N/8. To see this from Eq. (1.8.6), we group each set of eight bits together and consider
them as a single character (a byte). There are only 2^7 different possibilities for each byte, and each one has equal probability according to our constraints and assumptions. This gives the desired result.
Note: Such representations are used to check for noise in transmission. If there is noise, the redundancy of the eighth bit provides additional information. The noise-dependent amount of additional information can also be quantified; however, we will not discuss it here.
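The counting in case 3 can be checked by brute force: enumerating all 8-bit bytes that satisfy the parity constraint leaves 2^7 = 128 equally likely states, i.e. 7 bits of information per 8 transmitted (a small sketch; the variable names are ours):

```python
import math
from itertools import product

# All 8-bit bytes whose eighth bit is the parity (sum modulo 2)
# of the previous seven bits.
valid_bytes = [bits for bits in product((0, 1), repeat=8)
               if bits[7] == sum(bits[:7]) % 2]

# Information per byte is the log of the number of possible states, Eq. (1.8.1).
info_per_byte = math.log2(len(valid_bytes))
```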
Question 1.8.3 Consider a transmission of English characters using an ASCII representation. ASCII characters are the conventional method for computer representation of English text, including small and capital letters, numerals and punctuation. Discuss (do not evaluate for this question) how you would determine the information content of a message. We will evaluate the information content of English in a later question.

Solution 1.8.3 In ASCII, characters are represented using eight bits. Some of the possible combinations of bits are not used at all. Some are used very infrequently. One way to determine the information content of a message is to assume a model where each of the characters is independent. To calculate the information content using this assumption, we must find the probability of occurrence of each character in a sample text. Using these probabilities, the formula Eq. (1.8.6) could be applied. However, this assumes that the likelihood of occurrence of a character is independent of the preceding characters, which is not correct.
Question 1.8.4: Assume that you know in advance that the number of ones in a long binary message is M. The total number of bits is N. What is the information content of the message? Is it similar to the information content of a message of N independent binary characters where the probability that any character is one is P(1) = M/N?

Solution 1.8.4: We count the number of possible messages with M ones and take the logarithm to obtain the information as

I = log( N! / (M!(N − M)!) )   (1.8.8)

We can show that this is almost the same as the information of a message of the same length with a particular probability of ones, P(1) = M/N, by use of the first two terms of Stirling's approximation, Eq. (1.2.36), assuming 1 << M << N (a correction to this would grow logarithmically with N and can be found using the additional terms in Eq. (1.2.36)):

I ≈ N(log(N) − 1) − M(log(M) − 1) − (N − M)(log(N − M) − 1)
  = −N[ P(1) log(P(1)) + (1 − P(1)) log(1 − P(1)) ]   (1.8.9)
This is the information from a string of independent characters where P(1) = M/N. For such a string, the number of ones is approximately NP(1) and the number of zeros N(1 − P(1)) (see also Question 1.8.7).
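The agreement between the exact count of Eq. (1.8.8) and the entropy form of Eq. (1.8.9) can be checked numerically; a sketch with arbitrarily chosen N and M:

```python
import math

def info_count(N, M):
    # Eq. (1.8.8): log2 of the number of N-bit strings with exactly M ones
    return math.log2(math.comb(N, M))

def info_entropy(N, M):
    # Eq. (1.8.9): N times the per-bit information with P(1) = M/N
    p = M / N
    return -N * (p * math.log2(p) + (1 - p) * math.log2(1 - p))

exact = info_count(1000, 300)
approx = info_entropy(1000, 300)
```

The difference between the two is a few bits out of nearly nine hundred, consistent with a correction that grows only logarithmically with N.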
1.8.2 Characterizing sources of information
The information content of a particular message is defined in terms of the probability that it, out of all possible messages, will be received. This means that we are characterizing not just a message but the source of the message. A direct characterization of the source is not the information of a particular message, but the average information over the ensemble of possible messages. For a set of possible messages with a given probability distribution P(s) this is:

<I> = −∑_s P(s) log(P(s))   (1.8.10)
If the messages are composed out of characters s = (s_1 s_2 ... s_N), and each character is determined independently with a probability P(s_i), then we can write the information content as:

<I> = −∑_s (∏_i P(s_i)) log(∏_{i′} P(s_{i′})) = −∑_s (∏_i P(s_i)) ∑_{i′} log(P(s_{i′}))   (1.8.11)

We can move the factor in parentheses inside the inner sum and interchange the order of the summations:

<I> = −∑_{i′} ∑_{{s_i}, i≠i′} (∏_{i≠i′} P(s_i)) ∑_{s_{i′}} P(s_{i′}) log(P(s_{i′}))   (1.8.12)

The latter expression results from recognizing that the sum over all possible states is a sum over all possible values of each of the letters. The sum and product can be interchanged:

∑_{{s_i}, i≠i′} ∏_{i≠i′} P(s_i) = ∏_{i≠i′} ∑_{s_i} P(s_i) = 1   (1.8.13)

giving the result:

<I> = −∑_{i′} ∑_{s_{i′}} P(s_{i′}) log(P(s_{i′}))   (1.8.14)

This shows that the average information content of the whole message is the average information content of each character summed over the whole character string. If the characters have the same probability, this is just the average information content of an individual character times the number of characters. If all letters of the alphabet have the same probability, this reduces to Eq. (1.8.7).
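The equality of Eq. (1.8.10) and Eq. (1.8.14) for independent characters can be verified by direct enumeration; a small sketch with arbitrarily chosen probabilities:

```python
import math
from itertools import product

p = {0: 0.9, 1: 0.1}  # per-character probabilities, characters independent
N = 4

# Eq. (1.8.10): average information by summing over all possible strings.
avg_direct = 0.0
for s in product(p, repeat=N):
    P = math.prod(p[c] for c in s)
    avg_direct -= P * math.log2(P)

# Eq. (1.8.14): N times the average information of one character.
avg_per_char = -N * sum(q * math.log2(q) for q in p.values())
```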
The average information content of a binary variable is given by:

<I> = −(P(1) log(P(1)) + P(0) log(P(0)))   (1.8.15)

Aside from the use of a logarithm base two, this is the same as the entropy of a spin (Section 1.6) with two possible states s = ±1 (see Question 1.8.5). The maximum information content occurs when the probabilities are equal, and the information goes to zero when one of the two becomes one and the other zero (see Fig. 1.8.1). The information reflects the uncertainty in, or the lack of advance knowledge about, the value received.
Question 1.8.5 Show that the expression for the entropy S given in Eq. (1.6.16) of a set of noninteracting binary spins is the same as the information content defined in Eq. (1.8.15), aside from a normalization constant k ln(2). Consider the binary notation s_i = 0 to be the same as s_i = −1 for the spins.
Figure 1.8.1 Plots of functions related to the information content of a message with probability P. −log(P) is the information content of a single message of probability P. −P log(P) is the contribution of this message to the average information given by the source. While the information content of a message diverges as P goes to zero, it appears less frequently, so its contribution to the average information goes to zero. If there are only two possible messages, or two possible (binary) characters with probability P and 1 − P, then the average information given by the source per message or per character is given by −P log(P) − (1 − P) log(1 − P).
Solution 1.8.5 The local magnetization m_i is the average value of a particular spin variable:

m_i = P_{s_i}(1) − P_{s_i}(−1)   (1.8.16)

Using P_{s_i}(1) + P_{s_i}(−1) = 1 we have:

P_{s_i}(1) = (1 + m_i)/2
P_{s_i}(−1) = (1 − m_i)/2   (1.8.17)

Inserting these expressions into Eq. (1.8.15) and summing over a set of binary variables leads to the expression:

I = N − (1/2) ∑_i [ (1 + m_i) log(1 + m_i) + (1 − m_i) log(1 − m_i) ] = S / (k ln(2))   (1.8.18)

The result is more general than this derivation suggests and will be discussed further in Chapter 8.
Question 1.8.6 For a given set of possible messages, prove that the ensemble where all messages have equal probability provides the highest average information.
Solution 1.8.6 Since the sum over all probabilities is a fixed number (1), we consider what happens when we transfer some probability from one message to another. We start with the information given by

<I> = −P(s′) log(P(s′)) − P(s″) log(P(s″)) − ∑_{s≠s′,s″} P(s) log(P(s))   (1.8.19)

and after shifting a probability of ε from one to the other we have:

<I′> = −(P(s′) + ε) log(P(s′) + ε) − (P(s″) − ε) log(P(s″) − ε) − ∑_{s≠s′,s″} P(s) log(P(s))   (1.8.20)

We need to expand the change in information to first nonzero order in ε. We simplify the task by using the expression:

<I′> − <I> = f(P(s′) + ε) − f(P(s′)) + f(P(s″) − ε) − f(P(s″))   (1.8.21)

where

f(x) = −x log(x)   (1.8.22)

Taking a derivative, we have

(d/dx) f(x) = −(log(x) + 1)   (1.8.23)

This gives the result:

<I′> − <I> = −ε (log(P(s′)) − log(P(s″)))   (1.8.24)
Since log(x) is a monotonically increasing function, we see that the average information increases (<I′> − <I> > 0) when a probability ε > 0 is transferred from a higher-probability character to a lower-probability character (P(s″) > P(s′) ⇒ −ε(log(P(s′)) − log(P(s″))) > 0). Thus, any change of the probability toward a more uniform probability distribution increases the average information.
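The conclusion can be illustrated numerically: moving probability toward uniformity raises <I>, and the uniform ensemble maximizes it (a sketch with arbitrary illustrative distributions):

```python
import math

def avg_info(probs):
    # <I> = -sum P log2 P over an ensemble of messages, Eq. (1.8.10)
    return -sum(q * math.log2(q) for q in probs if q > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
# The skewed ensemble after transferring probability 0.1 from the most
# likely message to a less likely one (i.e., a step toward uniformity):
shifted = [0.6, 0.2, 0.1, 0.1]
```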
Question 1.8.7 A source produces strings of characters of length N. Each character that appears in the string is independently selected from an alphabet of characters with probabilities P(s_i). Write an expression for the probability P(s) of a typical string of characters. Show that this expression implies that the string gives N times the average information content of an individual character. Does this mean that every string must give this amount of information?
Solution 1.8.7 For a long string, each character will appear NP(s_i) times. The probability of such a string is:

P(s) = ∏_{s_i} P(s_i)^{NP(s_i)}   (1.8.25)

The information content is:

I(s) = −log(P(s)) = −N ∑_{s_i} P(s_i) log(P(s_i))   (1.8.26)

which is N times the average information of a single character. This is the information of a typical string. A particular string might have information significantly different from this. However, as the number of characters in the string increases, by the central limit theorem (Section 1.2), the fraction of times a particular character appears (i.e., the distance traveled in a random walk divided by the total number of steps) becomes more narrowly distributed around the expected probability P(s_i). This means the proportion of messages whose information content differs from the typical value decreases with increasing message length.
1.8.3 Correlations between characters
Thus far we have considered characters that are independent of each other. We can also consider characters whose values are correlated. We describe the case of two correlated characters. Because there are two characters, the notation must be more complete. As discussed in Section 1.2, we use the notation P_{s_1,s_2}(s′_1, s′_2) to denote the probability that in the same string the character s_1 takes the value s′_1 and the variable s_2 takes the value s′_2. The average information contained in the two characters is given by:

<I_{s_1,s_2}> = −∑_{s′_1,s′_2} P_{s_1,s_2}(s′_1, s′_2) log(P_{s_1,s_2}(s′_1, s′_2))   (1.8.27)
Note that the notation I(s_1, s_2) is often used for this expression. We use <I_{s_1,s_2}> because it is not a function of the values of the characters—it is the average information carried by the characters labeled by s_1 and s_2. We can compare the information content of the two characters with the information content of each character separately:

<I_{s_1}> = −∑_{s′_1,s′_2} P_{s_1,s_2}(s′_1, s′_2) log(∑_{s″_2} P_{s_1,s_2}(s′_1, s″_2))   (1.8.28)

<I_{s_2}> = −∑_{s′_1,s′_2} P_{s_1,s_2}(s′_1, s′_2) log(∑_{s″_1} P_{s_1,s_2}(s″_1, s′_2))   (1.8.29)

It is possible to show (see Question 1.8.8) the inequalities:

<I_{s_2}> + <I_{s_1}> ≥ <I_{s_1,s_2}> ≥ <I_{s_2}>, <I_{s_1}>   (1.8.30)

The right inequality means that we receive more information from both characters than from either one separately. The left inequality means that the information we receive from both characters together cannot exceed the sum of the information from each separately. It can be less if the characters are dependent on each other. In this case, receiving one character reduces the information given by the second.

The relationship between the information from a character s_1 and the information from the same character after we know another character s_2 can be investigated by defining a contingent or conditional probability:

P_{s_1,s_2}(s′_1 | s′_2) = P_{s_1,s_2}(s′_1, s′_2) / ∑_{s″_1} P_{s_1,s_2}(s″_1, s′_2)   (1.8.31)

This is the probability that s_1 takes the value s′_1 assuming that s_2 takes the value s′_2. We used this notation in Section 1.2 to describe the transitions from one value to the next in a chain of events (random walk). Here we are using it more generally. We could recover the previous meaning by writing the transition probability as P_s(s′_1 | s′_2) = P_{s(t),s(t−1)}(s′_1 | s′_2). In this section we will be concerned with the more general definition, Eq. (1.8.31).

We can find the information content of the character s_1 when s_2 takes the value s′_2:

<I_{s_1}>_{s_2=s′_2} = −∑_{s′_1} P_{s_1,s_2}(s′_1 | s′_2) log(P_{s_1,s_2}(s′_1 | s′_2))   (1.8.32)
This can be averaged over possible values of s_2, giving us the average information content of the character s_1 when the character s_2 is known:

<<I_{s_1|s_2}>> ≡ < <I_{s_1}>_{s_2=s′_2} > = −∑_{s′_2} P_{s_2}(s′_2) ∑_{s′_1} P_{s_1,s_2}(s′_1 | s′_2) log(P_{s_1,s_2}(s′_1 | s′_2))   (1.8.33)

The average we have taken should be carefully understood. The unconventional double-average notation is used to indicate that the two averages are of a different nature. One way to think about it is as treating the information content of a dynamic variable s_1 when s_2 is a quenched (frozen) random variable. We can rewrite this in terms of the information content of the two characters, and the information content of the character s_2 by itself, as follows:

<<I_{s_1|s_2}>> = −∑_{s′_1,s′_2} P_{s_1,s_2}(s′_1, s′_2) [ log(P_{s_1,s_2}(s′_1, s′_2)) − log(∑_{s″_1} P_{s_1,s_2}(s″_1, s′_2)) ] = <I_{s_1,s_2}> − <I_{s_2}>   (1.8.34)

Thus we have:

<I_{s_1,s_2}> = <I_{s_1}> + <<I_{s_2|s_1}>> = <I_{s_2}> + <<I_{s_1|s_2}>>   (1.8.35)

This is the intuitive result that the information content given by both characters is the same as the information content gained by sequentially obtaining the information from the characters. Once the first character is known, the second character provides only the information given by the conditional probabilities. There is no reason to restrict the use of Eq. (1.8.27)–Eq. (1.8.35) to the case where s_1 is a single character and s_2 is a single character. It applies equally well if s_1 is one set of characters and s_2 is another set of characters.
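Both inequalities of Eq. (1.8.30) and the decomposition Eq. (1.8.35) can be checked on a small correlated joint distribution (the distribution below is an arbitrary illustrative choice):

```python
import math

# Joint probabilities P_{s1,s2}(s1', s2') for two correlated binary characters.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    # Average information -sum P log2 P of a distribution
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

I_joint = H(P)  # <I_{s1,s2}>, Eq. (1.8.27)

# Marginal probabilities of each character separately.
P1 = {a: sum(p for (x, y), p in P.items() if x == a) for a in (0, 1)}
P2 = {b: sum(p for (x, y), p in P.items() if y == b) for b in (0, 1)}
I1, I2 = H(P1), H(P2)

# Conditional information <<I_{s1|s2}>> computed directly from Eq. (1.8.33),
# using the conditional probabilities of Eq. (1.8.31).
I_cond = 0.0
for b in (0, 1):
    for a in (0, 1):
        p_cond = P[(a, b)] / P2[b]
        I_cond -= P2[b] * p_cond * math.log2(p_cond)
```

For this correlated choice, I_joint lies strictly between max(I1, I2) and I1 + I2, and I_cond agrees with I_joint − I2 as Eq. (1.8.34) requires.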
Question 1.8.8 Prove the inequalities in Eq. (1.8.30). Hints for the left inequality:
1. It is helpful to use Eq. (1.8.35).
2. Use convexity (f(〈x〉) > 〈f(x)〉) of the function f(x) = −x log(x).
Solution 1.8.8 The right inequality in Eq. (1.8.30) follows from the inequality:

∑_{s″_1} P_{s_1,s_2}(s″_1, s′_2) > P_{s_1,s_2}(s′_1, s′_2)   (1.8.36)
The logarithm is a monotonically increasing function, so we can take the logarithm:

$$
\log\Bigl(\sum_{s_1''} P_{s_1,s_2}(s_1'',s_2')\Bigr) \ge \log\bigl(P_{s_1,s_2}(s_1',s_2')\bigr)
\tag{1.8.37}
$$

Changing sign and averaging leads to the desired result:

$$
\langle I_{s_2}\rangle = -\sum_{s_1',s_2'} P_{s_1,s_2}(s_1',s_2')\,\log\Bigl(\sum_{s_1''} P_{s_1,s_2}(s_1'',s_2')\Bigr)
\le -\sum_{s_1',s_2'} P_{s_1,s_2}(s_1',s_2')\,\log\bigl(P_{s_1,s_2}(s_1',s_2')\bigr) = \langle I_{s_1,s_2}\rangle
\tag{1.8.38}
$$

The left inequality in Eq. (1.8.30) may be proven from Eq. (1.8.35) and the intuitive inequality

$$
\langle I_{s_1}\rangle \ge \langle\langle I_{s_1\mid s_2}\rangle\rangle
\tag{1.8.39}
$$

To prove this inequality we make use of the convexity of the function f(x) = −x log(x). Convexity of a function means that its value always lies above line segments (secants) that begin and end at points along its graph. Algebraically:

$$
f\bigl((ax+by)/(a+b)\bigr) > \bigl(af(x)+bf(y)\bigr)/(a+b)
\tag{1.8.40}
$$

More generally, taking a set of values of x and averaging over them gives:

$$
f(\langle x\rangle) > \langle f(x)\rangle
\tag{1.8.41}
$$

Convexity of f(x) follows from the observation that

$$
\frac{d^2 f}{dx^2} = -\frac{1}{x\ln(2)} < 0
\tag{1.8.42}
$$

for all x > 0, which is where the function f(x) is defined. We then note the relationship:

$$
P_{s_1}(s_1') = \sum_{s_2'} P_{s_2}(s_2')\,P_{s_1,s_2}(s_1'\mid s_2') = \bigl\langle P_{s_1,s_2}(s_1'\mid s_2')\bigr\rangle_{s_2}
\tag{1.8.43}
$$

where, to simplify the following equations, we use a subscript to indicate the average with respect to s_2. The desired result follows from applying convexity as follows:

$$
\begin{aligned}
\langle I_{s_1}\rangle
&= -\sum_{s_1'} P_{s_1}(s_1')\,\log\bigl(P_{s_1}(s_1')\bigr)
= \sum_{s_1'} f\bigl(P_{s_1}(s_1')\bigr)
= \sum_{s_1'} f\Bigl(\bigl\langle P_{s_1,s_2}(s_1'\mid s_2')\bigr\rangle_{s_2}\Bigr) \\
&> \sum_{s_1'} \Bigl\langle f\bigl(P_{s_1,s_2}(s_1'\mid s_2')\bigr)\Bigr\rangle_{s_2}
= -\sum_{s_2'} P_{s_2}(s_2') \sum_{s_1'} P_{s_1,s_2}(s_1'\mid s_2')\,\log\bigl(P_{s_1,s_2}(s_1'\mid s_2')\bigr)
= \langle\langle I_{s_1\mid s_2}\rangle\rangle
\end{aligned}
\tag{1.8.44}
$$
the final equality following from the definition in Eq. (1.8.33). We can now make use of Eq. (1.8.35) to obtain the desired result.

1.8.4 Ergodic sources

We consider a source that provides arbitrarily long messages, or simply continues to give characters at a particular rate. Even though the messages are infinitely long, they are still considered elements of an ensemble. It is then convenient to measure the average information per character. The characterization of such an information source is simplified if each (long) message contains within it a complete sampling of the possibilities. This means that if we wait long enough, the entire ensemble of possible character sequences will be represented in any single message. This is the same kind of property as an ergodic system discussed in Section 1.3. By analogy, such sources are known as ergodic sources. For an ergodic source, not only the characters appear with their ensemble probabilities, but also the pairs of characters, the triples of characters, and so on.

For ergodic sources, the information from an ensemble average over all possible messages is the same as the information for a particular long string. To write this down we need a notation that allows variable-length messages. We write s^N = (s_1 s_2 … s_N), where N is the length of the string. The average information content per character may be written as:

$$
\langle i_s\rangle = \lim_{N\to\infty} \frac{\langle I_{s^N}\rangle}{N}
= -\lim_{N\to\infty} \frac{1}{N}\sum_{s^N} P(s^N)\,\log\bigl(P(s^N)\bigr)
= -\lim_{N\to\infty} \frac{1}{N}\log\bigl(P(s^N)\bigr)
\tag{1.8.45}
$$

The rightmost equality is valid for an ergodic source. An example of an ergodic source is a source that provides independent characters—i.e., selects each character from an ensemble. For this case, Eq. (1.8.45) was shown in Question 1.8.7. More generally, for a source to be ergodic, long enough strings must break up into independent substrings, or substrings that are more and more independent as their length increases.

Assuming that N is large enough, we can use the limit in Eq. (1.8.45) and write:

$$
P(s^N) \approx 2^{-N\langle i_s\rangle}
\tag{1.8.46}
$$

Thus, for large enough N, there is a set of strings that are equally likely to be generated by the source. The number of these strings is

$$
2^{N\langle i_s\rangle}
\tag{1.8.47}
$$

Since any string of characters is possible, in principle, this statement must be formally understood as saying that the total probability of all other strings becomes arbitrarily small.

If the string of characters is a Markov chain (Section 1.2), so that the probability of each character depends only on the previous character, then there are general conditions that can ensure that the source is ergodic. Similar to the discussion of Monte Carlo simulations in Section 1.7, for the source to be ergodic, the transition probabilities between characters must be irreducible and acyclic. Irreducibility guarantees that all characters are accessible from any starting character. The acyclic property guarantees that starting from one substring, all other substrings are accessible. Thus, if we can reach any particular substring, it will appear with the same frequency in all long strings.
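The equality in Eq. (1.8.45) between the ensemble average and the information rate of a single long string can be checked numerically for the simplest ergodic source, one that emits independent characters. The following Python sketch is only an illustration; the probability p = 0.1 and the string lengths are arbitrary choices:

```python
import math
import random

def entropy_per_char(p):
    """Ensemble average information per character: -sum_c P(c) log2 P(c)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def empirical_rate(p, N, seed=0):
    """-(1/N) log2 P(s^N) for one sampled string of an independent binary source."""
    rng = random.Random(seed)
    log_p = 0.0
    for _ in range(N):
        c = 1 if rng.random() < p else 0
        log_p += math.log2(p if c == 1 else 1 - p)
    return -log_p / N

p = 0.1
print(entropy_per_char(p))          # about 0.469 bits per character
for N in (100, 10000, 1000000):
    # approaches the ensemble value as N grows, as Eq. (1.8.45) asserts
    print(N, empirical_rate(p, N))
```

For this source the 2^{N⟨i_s⟩} typical strings of Eq. (1.8.47) are those whose fraction of 1s is close to p.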
We can generalize the usual Markov chain by allowing the probability of a character to depend on several (n) previous characters. A Markov chain may be constructed to represent such a chain by defining new characters, where each new character is formed out of a substring of n characters. Then each new character depends only on the previous one. The essential behavior of a Markov chain that is important here is that correlations measured along the chain of characters disappear exponentially. Thus, the statistical behavior of the chain in one place is independent of what it was in the sufficiently far past. The number of characters over which the correlations disappear is the correlation length. By allowing sufficiently many correlation lengths along the string—segments that are statistically independent—the average properties of one string will be the same as any other such string.

Question 1.8.9 Consider ergodic sources that are Markov chains with two characters s_i = ±1 and transition probabilities:

a. P(1|1) = 0.999, P(−1|1) = 0.001, P(−1|−1) = 0.5, P(1|−1) = 0.5
b. P(1|1) = 0.999, P(−1|1) = 0.001, P(−1|−1) = 0.999, P(1|−1) = 0.001
c. P(1|1) = 0.999, P(−1|1) = 0.001, P(−1|−1) = 0.001, P(1|−1) = 0.999
d. P(1|1) = 0.001, P(−1|1) = 0.999, P(−1|−1) = 0.5, P(1|−1) = 0.5
e. P(1|1) = 0.001, P(−1|1) = 0.999, P(−1|−1) = 0.999, P(1|−1) = 0.001
f. P(1|1) = 0.001, P(−1|1) = 0.999, P(−1|−1) = 0.001, P(1|−1) = 0.999

Describe the appearance of the strings generated by each source, and (roughly) its correlation length.

Solution 1.8.9 (a) has long regions of 1s of typical length 1000. In between there are short strings of −1s of average length 2 = 1 + 1/2 + 1/4 + … (there is a probability of 1/2 that a second character will be −1, and a probability of 1/4 that both the second and third will be −1, etc.). (b) has long regions of 1s and long regions of −1s, both of typical length 1000. (c) is like (a) except the regions of −1s are of length 1. (d) has no extended regions of 1 or −1 but has slightly longer regions of −1s. (e) inverts (c). (f) has regions of alternating 1 and −1 of length 1000 before switching to the other possibility (odd and even indices are switched). We see that the characteristic correlation length is of order 1000 in (a), (b), (c), (e) and (f), and of order 2 in (d).
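These run-length descriptions can be checked by direct simulation. A minimal Python sketch for case (a); the chain length and seed are arbitrary choices:

```python
import random

def run_lengths(p_stay_1, p_stay_m1, n_steps=200000, seed=0):
    """Simulate a two-character (+1/-1) Markov chain and collect run lengths.

    p_stay_1  = P(1|1):   probability of staying at +1,
    p_stay_m1 = P(-1|-1): probability of staying at -1.
    Returns the average run length of +1 regions and of -1 regions."""
    rng = random.Random(seed)
    s = 1          # current character
    length = 1     # length of the current run
    runs = {1: [], -1: []}
    for _ in range(n_steps):
        stay = p_stay_1 if s == 1 else p_stay_m1
        if rng.random() < stay:
            length += 1
        else:
            runs[s].append(length)  # record the completed run and switch
            s = -s
            length = 1
    return (sum(runs[1]) / len(runs[1]), sum(runs[-1]) / len(runs[-1]))

# Case (a): P(1|1) = 0.999, P(-1|-1) = 0.5
avg1, avgm1 = run_lengths(0.999, 0.5)
print(avg1, avgm1)   # roughly 1000 and 2, as described in the solution
```

The mean run length at a character with staying probability q is 1/(1 − q), which gives the 1000 and 2 quoted above.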
We have considered in detail the problem of determining the information content of a message, or the average information generated by a source, when the characteristics of the source are well defined. The source was characterized by the ensemble of possible messages and their probabilities. However, we do not usually have a
well-defined characterization of a source of messages, so a more practical question is to determine the information content from the message itself. The definitions that we have provided do not guide us in determining the information of an arbitrary message. We must have a model for the source. The model must be constructed out of the information we have—the string of characters it produces. One possibility is to model the source as ergodic. An ergodic source can be modeled in two ways, as a source of independent substrings or as a generalized Markov chain where characters depend on a certain number of previous characters. In each case we construct not one, but an infinite sequence of models. The models are designed so that if the source is ergodic then the information estimates given by the models converge to give the correct information content.

There is a natural sequence of independent substring models indexed by the number of characters in the substrings n. The first model is that of a source producing independent characters with a probability specified by their frequency of occurrence in the message. The second model would be a source producing pairs of correlated characters so that every pair of characters is described by the probability given by their occurrence (we allow character pairs to overlap in the message). The third model would be that of a source producing triples of correlated characters, and so on. We use each of these models to estimate the information. The nth model estimate of the information per character given by the source is:

$$
\langle i_s\rangle_{1,n} = -\lim_{N\to\infty} \frac{1}{n} \sum_{s^n} \tilde P_N(s^n)\,\log\bigl(\tilde P_N(s^n)\bigr)
\tag{1.8.48}
$$

where we indicate using the subscript 1,n that this is an estimate obtained using the first type of model (independent substring model) with substrings of length n. We also make use of an approximate probability for the substring, defined as

$$
\tilde P_N(s^n) = N(s^n)/(N - n + 1)
\tag{1.8.49}
$$

where N(s^n) is the number of times s^n appears in the string of length N. The information of the source might then be estimated as the limit n → ∞ of Eq. (1.8.48):

$$
\langle i_s\rangle = -\lim_{n\to\infty}\lim_{N\to\infty} \frac{1}{n} \sum_{s^n} \tilde P_N(s^n)\,\log\bigl(\tilde P_N(s^n)\bigr)
\tag{1.8.50}
$$

For an ergodic source, we can see that this converges to the information of the message. The n limit converges monotonically from above. This is because the additional information in s^{n+1} given by the character s_{n+1} is less than the information added by each previous character (see Eq. (1.8.59) below). Thus, the estimate of information per character based on s^n is higher than the estimate based on s^{n+1}. Therefore, for each value of n, the estimate ⟨i_s⟩_{1,n} is an upper bound on the information given by the source.

How large does N have to be? Since we must have a reasonable sample of the occurrence of substrings in order to estimate their probability, we can only estimate probabilities of substrings that are much shorter than the length of the string. The number of possible substrings grows exponentially with n as k^n, where k is the number of possible characters. If substrings occur with roughly similar probabilities, then to estimate the probability of a substring of length n would require at least a string of length k^n characters. Thus, taking the large N limit should be understood to correspond to N greater than k^n. This is a very severe requirement. This means that to study a model of English character strings of length n = 10 (ignoring upper and lower case, numbers and punctuation) would require 26^10 ~ 10^14 characters. This is roughly the number of characters in all of the books in the Library of Congress (see Question 1.8.15).
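The estimate of Eq. (1.8.48)–(1.8.49) is straightforward to implement for small n. A Python sketch, using overlapping substring counts; the alternating test string is a toy illustration, not English:

```python
import math
from collections import Counter

def info_substring_model(text, n):
    """<i_s>_{1,n}: information per character from the independent substring
    model, using the empirical probabilities of Eq. (1.8.49)."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = len(text) - n + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values()) / n

text = "ABABABABABABABABABAB"
print(info_substring_model(text, 1))  # 1.0 bit/character: A and B equally likely
print(info_substring_model(text, 2))  # ~0.5: pairs capture the alternation
```

As discussed above, the estimate decreases with n, and a trustworthy estimate at length n requires a string much longer than k^n.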
The generalized Markov chain model assumes a particular character is dependent only on the n previous characters. Since the first n characters do not provide a significant amount of information for a very long chain (N >> n), we can obtain the average information per character from the incremental information given by a character. Thus, for the nth generalized Markov chain model we have the estimate:

$$
\langle i_s\rangle_{2,n} = \langle\langle I_{s_n \mid s^{n-1}}\rangle\rangle
= -\lim_{N\to\infty} \sum_{s^{n-1}} \tilde P_N(s^{n-1}) \sum_{s_n} \tilde P(s_n \mid s^{n-1})\,\log\bigl(\tilde P(s_n \mid s^{n-1})\bigr)
\tag{1.8.51}
$$

where we define the approximate conditional probability using:

$$
\tilde P_N(s_n \mid s^{n-1}) = N(s^{n-1} s_n)/N(s^{n-1})
\tag{1.8.52}
$$

Taking the limit n → ∞ we have an estimate of the information of the source per character:

$$
\langle i_s\rangle = -\lim_{n\to\infty}\lim_{N\to\infty} \sum_{s^{n-1}} \tilde P_N(s^{n-1}) \sum_{s_n} \tilde P(s_n \mid s^{n-1})\,\log\bigl(\tilde P(s_n \mid s^{n-1})\bigr)
\tag{1.8.53}
$$

This also converges from above as a function of n for large enough N. For a given n, a Markov chain model takes into account more correlations than the previous independent substring model and thus gives a better estimate of the information (Question 1.8.10).

Question 1.8.10 Prove that the Markov chain model gives a better estimate of the information for ergodic sources than the independent substring model for a particular n. Assume the limit N → ∞, so that the estimated probabilities become actual and we can substitute P̃_N → P in Eq. (1.8.48) and Eq. (1.8.51).

Solution 1.8.10 The information in a substring of length n is given by the sum of the information provided incrementally by each character, where the previous characters are known. We derive this statement algebraically (Eq. (1.8.59)) and use it to prove the desired result. Taking the N limit in Eq. (1.8.48), we define the nth approximation using the independent substring model as:

$$
\langle i_s\rangle_{1,n} = -\frac{1}{n} \sum_{s^n} P(s^n)\,\log\bigl(P(s^n)\bigr)
\tag{1.8.54}
$$
and for the nth generalized Markov chain model we take the same limit in Eq. (1.8.51):

$$
\langle i_s\rangle_{2,n} = -\sum_{s^{n-1}} P(s^{n-1}) \sum_{s_n} P(s_n \mid s^{n-1})\,\log\bigl(P(s_n \mid s^{n-1})\bigr)
\tag{1.8.55}
$$

To relate these expressions to each other, follow the derivation of Eq. (1.8.34), or use it with the substitutions s_1 → s^{n−1} and s_2 → s_n, to obtain

$$
\langle i_s\rangle_{2,n} = -\sum_{s^{n-1},\,s_n} P(s^{n-1}s_n)\left[\log\bigl(P(s^{n-1}s_n)\bigr) - \log\Bigl(\sum_{s_n} P(s^{n-1}s_n)\Bigr)\right]
\tag{1.8.56}
$$

Using the identities

$$
P(s^{n-1}s_n) = P(s^n), \qquad P(s^{n-1}) = \sum_{s_n} P(s^{n-1}s_n)
\tag{1.8.57}
$$

this can be rewritten as:

$$
\langle i_s\rangle_{2,n} = n\,\langle i_s\rangle_{1,n} - (n-1)\,\langle i_s\rangle_{1,n-1}
\tag{1.8.58}
$$

This result can be summed over n′ from 1 to n (the n = 1 case is ⟨i_s⟩_{2,1} = ⟨i_s⟩_{1,1}) to obtain:

$$
\sum_{n'=1}^{n} \langle i_s\rangle_{2,n'} = n\,\langle i_s\rangle_{1,n}
\tag{1.8.59}
$$

Since ⟨i_s⟩_{2,n} is monotonically decreasing in n, and ⟨i_s⟩_{1,n} is seen from this expression to be an average over ⟨i_s⟩_{2,n′} with lower values of n′, we must have

$$
\langle i_s\rangle_{2,n} \le \langle i_s\rangle_{1,n}
\tag{1.8.60}
$$

as desired.

Question 1.8.11 We have shown that the two models—the independent substring model and the generalized Markov chain model—are upper bounds to the information in a string. How good is the upper bound? Think up an example that shows that it can be terrible for both, but better for the Markov chain.

Solution 1.8.11 Consider the example of a long string formed out of a repeating substring, for example (000000010000000100000001…). The average information content per character of this string is zero. This is because once the repeat structure has become established, there is no more information. Any model that gives a nonzero estimate of the information content per
character will make a great error in its estimate of the information content of the string, which is N times as much as the information per character.

For the independent substring model, the estimate is never zero. For the Markov chain model it is nonzero until n reaches the repeat distance. A Markov model with n the same size or larger than the repeat length will give the correct answer of zero information per character. This means that even for the Markov chain model, the information estimate does not work very well for n less than the repeat distance.
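The contrast described in this solution can be made concrete with a short Python sketch. For a string with repeat length 8, the independent substring estimate ⟨i_s⟩_{1,n} decays only as log2(8)/n = 3/n, while the Markov estimate ⟨i_s⟩_{2,n}, computed here as the difference of block entropies via Eq. (1.8.58), drops to essentially zero once n reaches the repeat distance:

```python
import math
from collections import Counter

def block_entropy(s, n):
    """Entropy (bits) of overlapping n-character blocks, with the
    empirical probabilities of Eq. (1.8.49)."""
    counts = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    total = len(s) - n + 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())

s = "00000001" * 1000   # repeat length 8; true information per character is 0
for n in (2, 4, 8, 9, 12):
    i_1n = block_entropy(s, n) / n                        # <i_s>_{1,n}
    i_2n = block_entropy(s, n) - block_entropy(s, n - 1)  # <i_s>_{2,n}, Eq. (1.8.58)
    print(n, round(i_1n, 3), round(i_2n, 3))
```

The substring estimate is still 0.25 bits/character at n = 12, while the Markov estimate is already zero at n = 8, in line with the discussion above.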
Question 1.8.12 Write a computer program to estimate the information in English and find the estimate. For simple, easy-to-compute estimates, use single-character probabilities, two-character probabilities, and a Markov chain model for individual characters. These correspond to the above definitions of ⟨i_s⟩_{2,1} = ⟨i_s⟩_{1,1}, ⟨i_s⟩_{1,2}, and ⟨i_s⟩_{2,2} respectively.

Solution 1.8.12 A program that evaluates the information content using single-character probabilities applied to the text (excluding equations) of Section 1.8 of this book gives an estimate of information content of 4.4 bits/character. Two-character probabilities give 3.8 bits/character, and the one-character Markov chain model gives 3.3 bits/character. A chapter of a book by Mark Twain gives similar results. These estimates are decreasing in magnitude, consistent with the discussion in the text. They are also still quite high as estimates of the information in English per character.

The best estimates are based upon human guessing of the next character in a written text. Such experiments with human subjects give estimates of the lower and upper bounds of information content per character of English text. These are 0.6 and 1.2 bits/character. This range is significantly below the estimates we obtained using simple models. Remarkably, these estimates suggest that it is enough to give only one in four to one in eight characters of English in order for text to be decipherable.
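One way the program described in this solution might be organized is sketched below. The short pangram string is only a placeholder; to reproduce estimates of the kind quoted above, substitute a long English text. The Markov estimate uses the identity ⟨i_s⟩_{2,2} = 2⟨i_s⟩_{1,2} − ⟨i_s⟩_{1,1} from Eq. (1.8.58):

```python
import math
from collections import Counter

def block_entropy(text, n):
    """Entropy (bits) of overlapping n-character blocks of the text."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = len(text) - n + 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def estimates(text):
    h1, h2 = block_entropy(text, 1), block_entropy(text, 2)
    return {"<i>_{1,1}": h1,       # single-character probabilities
            "<i>_{1,2}": h2 / 2,   # two-character probabilities
            "<i>_{2,2}": h2 - h1}  # one-character Markov chain, via Eq. (1.8.58)

# Placeholder input; use a long English text for a meaningful estimate.
sample = "the quick brown fox jumps over the lazy dog " * 50
for name, value in estimates(sample).items():
    print(name, round(value, 2))
```

On a long English corpus the three estimates should decrease in the order ⟨i⟩_{1,1}, ⟨i⟩_{1,2}, ⟨i⟩_{2,2}, consistent with the monotonicity results above; the exact values depend on the corpus.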
Question 1.8.13 Construct an example illustrating how correlations can arise between characters over longer than, say, ten characters. These correlations would not be represented by any reasonable character-based Markov chain model. Is there an example of this type relevant to the English language?

Solution 1.8.13 Example 1: If we have information that is read from a matrix row by row, where the matrix entries have correlations between rows, then there will be correlations that are longer than the length of the matrix rows.

Example 2: We can think about successive English sentences as rows of a matrix. We would expect to find correlations between rows (i.e., between words found in adjacent sentences) rather than just between letters.
Question 1.8.14 Estimate the amount of information in a typical book (order of magnitude is sufficient). Use the best estimate of information content per character of English text of about 1 bit per character.

Solution 1.8.14 A rough estimate can be made as follows: A 200-page novel with 60 characters per line and 30 lines per page has 4 × 10^5 characters. Textbooks can have several times this many characters. A dictionary, which is significantly longer than a typical book, might have 2 × 10^7 characters. Thus we might use an order-of-magnitude value of 10^6 bits per book.

Question 1.8.15 Obtain an estimate of the number of characters (and thus the number of bits of information) in the Library of Congress. Assume an average of 10^6 characters per book.

Solution 1.8.15 According to information provided by the Library of Congress, there are presently (in 1996) 16 million books classified according to the Library of Congress classification system, 13 million other books at the Library of Congress, and approximately 80 million other items such as newspapers, maps and films. Thus, with 10^7–10^8 book equivalents, we estimate the number of characters as 10^13–10^14.

Inherent in the notion of quantifying information content is the understanding that the same information can be communicated in different ways, as long as the amount of information that can be transmitted is sufficient. Thus we can use binary, decimal, hexadecimal or typed letters to communicate both numbers and letters. Information can be communicated using any set of (two or more) characters. The presumption is that there is a way of translating from one to another. Translation operations are called codes; the act of translation is encoding or decoding. Among possible codes are those that are invertible. Encoding a message cannot add information; it might, however, lose information (Question 1.8.16). Invertible codes must preserve the amount of information.

Once we have determined the information content, we can compare different ways of writing the same information. Assume that one source generates a message of length N characters with information I. Then a different source may transmit the same information using fewer characters. Even if characters are generated at the same rate, the information may be more rapidly transmitted by one source than another. In particular, regardless of the value of N, by definition of information content, we could have communicated the same information using a binary string of length I. It is, however, impossible to use fewer than I bits, because the maximum information a binary message can contain is equal to its length. This amount of information occurs for a source with equal a priori probability.

Encoding the information in a shorter form is equivalent to data compression. Thus a completely compressed binary data string would have an amount of information given by its length. The source of such a message would be characterized as a source of messages with equal a priori probability—a random source. We see that randomness and information are related. Without a translation (decoding) function it would be impossible to distinguish the completely compressed information from random numbers. Moreover, a random string could not be compressed.
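The connection between randomness and incompressibility can be illustrated with a general-purpose compressor standing in for the encoding operation. The choice of zlib, the byte counts, and the repeated pattern are arbitrary; any similar compressor and inputs would show the same contrast:

```python
import random
import zlib

rng = random.Random(0)
random_bytes = bytes(rng.randrange(256) for _ in range(100000))  # high information
repeated_bytes = b"00000001" * 12500        # same length, very little information

print(len(zlib.compress(random_bytes)))     # slightly above 100000: no compression possible
print(len(zlib.compress(repeated_bytes)))   # a tiny fraction of the original length
```

The random string comes out slightly longer than it went in (the compressor adds bookkeeping overhead), while the repetitive string collapses to a small description of its repeat structure.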
Question 1.8.16 Prove that an encoding operation that takes a message as input and converts it into another well-defined message (i.e., for a particular input message, the same output message is always given) cannot add information but may reduce it. Describe the necessary conditions for it to keep the same amount of information.

Solution 1.8.16 Our definition of information relies upon the specification of the ensemble of possible messages. Consider this ensemble and assume that each message appears in the ensemble a number of times in proportion to its probability, like the bag with red and green balls. The effect of a coding operation is to label each ball with the new message (code) that will be delivered after the coding operation. The amount of information depends not on the nature of the label, but rather on the number of balls with the same label. The requirement that a particular message is encoded in a well-defined way means that two balls that start with the same message cannot be labeled with different codes. However, it is possible for balls with different original messages to be labeled the same. The average information is not changed if and only if all distinct messages are labeled with distinct codes. If any distinct messages become identified by the same label, the information is reduced.

We can prove this conclusion algebraically using the result of Question 1.8.8, which showed that transferring probability from a less likely to a more likely case reduces the information content. Here we are, in effect, transferring all of the probability from the less likely to the more likely case. The change in information upon labeling two distinct messages with the same code is given by (f(x) = −x log(x), as in Question 1.8.8):

$$
\Delta I = f\bigl(P(s_1)+P(s_2)\bigr) - \bigl(f(P(s_1)) + f(P(s_2))\bigr)
= \bigl(f(P(s_1)+P(s_2)) + f(0)\bigr) - \bigl(f(P(s_1)) + f(P(s_2))\bigr) < 0
\tag{1.8.61}
$$

where the inequality follows because f(x) is convex in the range 0 < x < 1.
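Eq. (1.8.61) is easy to check numerically. A minimal sketch; the probability values are arbitrary illustrations:

```python
import math

def f(x):
    """f(x) = -x log2(x), with f(0) = 0, as in Question 1.8.8."""
    return 0.0 if x == 0 else -x * math.log2(x)

def delta_I(p1, p2):
    """Change in average information when two distinct messages with
    probabilities p1 and p2 receive the same label, Eq. (1.8.61)."""
    return f(p1 + p2) - (f(p1) + f(p2))

print(delta_I(0.25, 0.25))  # -0.5: merging the labels loses half a bit
print(delta_I(0.1, 0.4))    # negative for any 0 < p1, p2 with p1 + p2 <= 1
```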
1.8.5 Human communication

The theory of information, like other theories, relies upon idealized constructs that are useful in establishing the essential concepts, but do not capture all features of real systems. In particular, the definition and discussion of information rely upon sources that transmit the result of random occurrences, which, by definition, cannot be known by the recipient. The sources are also completely described by specifying the nature of the random process. This model for the nature of the source and the recipient does not adequately capture the attributes of communication between human beings. The theory of information can be applied directly to address questions about
information channels and the characterization of communication in general. It can also be used to develop an understanding of the complexity of systems. In this section, however, we will consider some additional issues that should be kept in mind when applying the theory to communication between human beings. These issues will arise again in Chapter 8.

The definition of information content relies heavily on the concepts of probability, ensembles, and processes that generate arbitrarily many characters. These concepts are fraught with practical and philosophical difficulties—when there is only one transmitted message, how can we say there were many that were possible? A book may be considered as a single communication. A book has finite length and, for a particular author and a particular reader, is a unique communication. In order to understand both the strengths and the limitations of applying the theory of information, it is necessary to recognize that the information content of a message depends on the information that the recipient of the message already has, in particular the information that the recipient has about the source. In the discussion above, a clear distinction has been made. The only information that characterizes the source is in the ensemble probabilities P(s). The information transmitted by a single message is distinct from the ensemble probabilities and is quantified by I(s). It is assumed that the characterization of the source is completely known to the recipient. The content of the message is completely unknown (and unknowable in advance) to the recipient.

A slightly more difficult example to consider is that of a recipient who does not know the characterization of the source. However, such a characterization in terms of an ensemble P(s) does exist. Under these circumstances, the amount of information transferred by a message would be more than the amount of information given by I(s). However, the maximum amount of information that could be transferred would be the sum of the information in the message and the information necessary to characterize the source by specifying the probabilities P(s). This upper bound on the information that can be transferred is only useful if the amount of information necessary to characterize the source is small compared to the information in the message.

The difficulty with discussing human communication is that the amount of information necessary to fully characterize the source (one human being) is generally much larger than the information transmitted by a particular message. Similarly, the amount of information possessed by the recipient (another human being) is much larger than the information contained in a particular message. Thus it is reasonable to assume that the recipient does not have a full characterization of the source. It is also reasonable to assume that the model that the recipient has about the source is more sophisticated than a typical Markov chain model, even though it is a simplified model of a human being. The information contained in a message is, in a sense, the additional information not contained in the original model possessed by the recipient. This is consistent with the above discussion, but it also recognizes that specifying the probabilities of the ensemble may require a significant amount of information. It may also be convenient to summarize this information by a different type of model than a Markov chain model.
Once the specific model and information that the recipient has about the source enter into an evaluation of the information transfer, there is a certain and quite reasonable degree of relativity in the amount of information transferred. An extreme example would be if the recipient has already received a long message and knows the same message is being repeated; then no new information is being transmitted. A person who has memorized the Gettysburg Address will receive very little new information upon hearing or reading it again. The prior knowledge is part of the model possessed by the recipient about the source.

Can we incorporate this in our definition of information? In every case where we have measured the information of a message, we have made use of a model of the source of the information. The underlying assumption is that this model is possessed by the recipient. It should now be recognized that there is a certain amount of information necessary to describe this model. As long as the amount of information in the model is small compared to the amount of information in the message, we can say that we have an absolute estimate of the information content of the message. As soon as the information content of the model approaches that of the message itself, the amount of information transferred is sensitive to exactly what information is known. It might be possible to develop a theory of information that incorporates the information in the model, and thus to arrive at a more absolute measure of information. Alternatively, it might be necessary to develop a theory that considers the recipient and source more completely, since in actual communication between human beings, both are nonergodic systems possessed of a large amount of information. There is significant overlap of the information possessed by the recipient and the source. Moreover, this common information is essential to the communication itself.

One effort to arrive at a universal definition of the information content of a message has been made by formally quantifying the information contained in models. The resulting information measure, Kolmogorov complexity, is based on computation theory, discussed in the next section. While there is some success with this approach, two difficulties remain. In order for a universal definition of information to be agreed upon, models must still have an information content which is less than the message—knowledge possessed must be smaller than that received. Also, to calculate the information contained in a particular message is essentially impossible, since it requires computational effort that grows exponentially with the length of the message. In any practical case, the amount of information contained in a message must be estimated using a limited set of models of the source. The utilization of a limited set of models means that any estimate of the information in a message is an upper bound.
1.9 Computation

The theory of computation describes the operations that we perform on numbers,
including addition, subtraction, multiplication and division. More generally, a
computation is a sequence of operations each of which has a definite, well-defined
result. The fundamental study of such operations is the theory of logic. Logical
operations do not necessarily act upon numbers, but rather upon abstract objects
called statements. Statements can be combined together using operators such as
AND and OR, and acted upon by the negation operation NOT. The theory of logic
and the theory of computation are at root the same. All computations that have
been conceived of can be constructed out of logical operations. We will discuss this
equivalence in some detail.
We also discuss a further equivalence, generally less well appreciated, between
computation and deterministic time evolution. The theory of computation strives to
describe the class of all possible discrete deterministic or causal systems.
Computations are essentially causal relationships. Computation theory is designed to
capture all such possible relationships. It is thus essential to our understanding not
just of the behavior of computers, or of human logic, but also to the understanding
of causal relationships in all physical systems. A counterpoint to this association of
computation and causality is the recognition that certain classes of deterministic
dynamical systems are capable of the property known as universal computation.

One of the central findings of the theory of computation is that many apparently
different formulations of computation turn out to be equivalent. The sense in which
they are equivalent is that each one can simulate the other. In the early years of
computation theory, there was an effort to describe sets of operations that would be more
powerful than others. When all of them were shown to be equivalent it became
generally accepted (the Church-Turing hypothesis) that there is a well-defined set of
possible computations realized by any of several conceptual formulations. This has
become known as the theory of universal computation.
1.9.1 Propositional logic
Logic is the study of reasoning, inference and deduction. Propositional logic describes
the manipulation of statements that are either true or false. It assumes that there
exists a set of statements that are either true or false at a particular time, but not both.
Logic then provides the possibility of using an assumed set of relationships between
the statements to determine the truth or falsehood of other statements.
For example, the statements Q1 = "I am standing" and Q2 = "I am sitting" may be
related by the assumption: Q1 is true implies that Q2 is not true. Using this
assumption, it is understood that a statement "Q1 AND Q2" must be false. The falsehood
depends only on the relationship between the two sentences and not on the particular
meaning of the sentences. This suggests that an abstract construction that describes
mechanisms of inference can be developed. This abstract construction is propositional
logic.

Propositional logic is formed out of statements (propositions) that may be true
(T) or false (F), and operations. The operations are described by their actions upon
statements. Since the only concern of logic is the truth or falsehood of statements, we
can describe the operations through tables of truth values (truth tables) as follows.
NOT (^) is an operator that acts on a single statement (a unary operator) to form a
new statement. If Q is a statement then ^Q (read "not Q") is the symbolic
representation of "It is not true that Q." The truth of ^Q is directly (causally) related to the
truth of Q by the relationship in the table:

Q    ^Q
T    F                    (1.9.1)
F    T

The value of the truth or falsehood of Q is shown in the left column and the
corresponding value of the truth or falsehood of ^Q is given in the right column.
Similarly, we can write the truth tables for the operations AND (&) and OR (∨):

Q1   Q2   Q1&Q2
T    T    T
T    F    F              (1.9.2)
F    T    F
F    F    F

Q1   Q2   Q1∨Q2
T    T    T
T    F    T              (1.9.3)
F    T    T
F    F    F

As the tables show, Q1&Q2 is only true if both Q1 is true and Q2 is true. Q1∨Q2 is only
false if both Q1 is false and Q2 is false.
Propositional logic includes logical theorems as statements. For example, the
statement Q1 is true if and only if Q2 is true can also be written as a binary operation
Q1 ≡ Q2 with the truth table:

Q1   Q2   Q1≡Q2
T    T    T
T    F    F              (1.9.4)
F    T    F
F    F    T
Another binary operation is the statement Q1 implies Q2, Q1 ⇒ Q2. When this
statement is translated into propositional logic, there is a difficulty that is usually
bypassed by the following convention:

Q1   Q2   Q1⇒Q2
T    T    T
T    F    F              (1.9.5)
F    T    T
F    F    T

The difficulty is that the last two lines suggest that when the antecedent Q1 is false, the
implication is true, whether or not the consequent Q2 is true. For example, the
statement "If I had wings then I could fly" is as true a statement as "If I had wings then
I couldn't fly," or the statement "If I had wings then potatoes would be flat." The
problem originates in the necessity of assuming that the result is true or false in a unique
way based upon the truth values of Q1 and Q2. Other information is not admissible,
and a third choice of "nonsense" or "incomplete information provided" is not allowed
within propositional logic. Another way to think about this problem is to say that
there are many operators that can be formed with definite outcomes. Regardless of
how we relate these operators to our own logical processes, we can study the system
of operators that can be formed in this way. This is a model, but not a complete one,
for human logic. Or, if we choose to define logic as described by this system, then
human thought (as reflected by the meaning of the word "implies") is not fully
characterized by logic (as reflected by the meaning of the operation "⇒").
In addition to unary and binary operations that can act upon statements to form
other statements, it is necessary to have parentheses that differentiate the order of
operations to be performed. For example a statement ((Q1 ≡ Q2)&(^Q3)∨Q1) is a series
of operations on primitive statements that starts from the innermost parenthesis and
progresses outward. As in this example, there may be more than one innermost
parenthesis. To be definite, we could insist that the order of performing these operations is
from left to right. However, this order does not affect any result.
Within the context of propositional logic, it is possible to describe a systematic
mechanism for proving statements that are composed of primitive statements. There
are several conclusions that can be arrived at regarding a particular statement. A
tautology is a statement that is always true regardless of the truth or falsehood of its
component statements. Tautologies are also called theorems. A contradiction is a
statement that is always false. Examples are given in Question 1.9.1.
Question 1.9.1 Evaluate the truth table of:

a. (Q1 ⇒ Q2) ∨ ((^Q2)&Q1)
b. (^(Q1 ⇒ Q2)) ≡ ((^Q1) ∨ Q2)

Identify which is a tautology and which is a contradiction.
Solution 1.9.1 Build up the truth table piece by piece:

a. Tautology:

Q1   Q2   Q1⇒Q2   (^Q2)&Q1   (Q1⇒Q2)∨((^Q2)&Q1)
T    T    T       F          T
T    F    F       T          T
F    T    T       F          T
F    F    T       F          T
                                                  (1.9.6)
b. Contradiction:

Q1   Q2   ^(Q1⇒Q2)   (^Q1)∨Q2   (^(Q1⇒Q2)) ≡ ((^Q1)∨Q2)
T    T    F          T          F
T    F    T          F          F
F    T    F          T          F
F    F    F          T          F
                                                  (1.9.7)
Question 1.9.2 Construct a theorem (tautology) from a contradiction.

Solution 1.9.2 By negation.
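Tables like (1.9.6) and (1.9.7) can also be checked mechanically by enumerating every truth assignment. A minimal sketch in Python (the helper name `implies` is our own; it encodes the convention of table (1.9.5)):

```python
from itertools import product

def implies(p, q):
    # the convention of table (1.9.5): false only when p is true and q is false
    return (not p) or q

assignments = list(product([True, False], repeat=2))
# (a) (Q1 => Q2) OR ((NOT Q2) AND Q1): true on every line, a tautology
a = [implies(q1, q2) or ((not q2) and q1) for q1, q2 in assignments]
# (b) (NOT (Q1 => Q2)) == ((NOT Q1) OR Q2): false on every line, a contradiction
b = [(not implies(q1, q2)) == ((not q1) or q2) for q1, q2 in assignments]
print(all(a), any(b))  # prints: True False
```

The same enumeration scales to any number of variables by raising `repeat`.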
1.9.2 Boolean algebra
Propositional logic is a particular example of a more general symbolic system known
as a Boolean algebra. Set theory, with the operators complement, union and
intersection, is another example of a Boolean algebra. The formulation of a Boolean algebra
is convenient because within this more general framework a number of important
theorems can be proven. They then hold for propositional logic, set theory and other
Boolean algebras.

A Boolean algebra is a set of elements B = {Q1, Q2, …}, a unary operator (^), and
two binary operators, for which we adopt the notation (+,•), that satisfy the following
properties for all Q1, Q2, Q3 in B:
1. Closure: ^Q1, Q1+Q2, and Q1•Q2 are in B
2. Commutative law: Q1+Q2 = Q2+Q1, and Q1•Q2 = Q2•Q1
3. Distributive law: Q1•(Q2+Q3) = (Q1•Q2)+(Q1•Q3) and
   Q1+(Q2•Q3) = (Q1+Q2)•(Q1+Q3)
4. Existence of identity elements, 0 and 1: Q1+0 = Q1, and Q1•1 = Q1
5. Complementarity law: Q1+(^Q1) = 1 and Q1•(^Q1) = 0
The statements of properties 2 through 5 consist of equalities. These equalities
indicate that the element of the set that results from operations on the left is the same as
the element resulting from operations on the right. Note particularly the second part
of the distributive law and the complementarity law, which would not be valid if we
interpreted + as addition and • as multiplication.

Assumptions 1 to 5 allow the proof of additional properties as follows:
6. Associative property: Q1+(Q2+Q3) = (Q1+Q2)+Q3 and Q1•(Q2•Q3) = (Q1•Q2)•Q3
7. Idempotent property: Q1+Q1 = Q1 and Q1•Q1 = Q1
8. Identity elements are nulls: Q1+1 = 1 and Q1•0 = 0
9. Involution property: ^(^Q1) = Q1
10. Absorption property: Q1+(Q1•Q2) = Q1 and Q1•(Q1+Q2) = Q1
11. DeMorgan's Laws: ^(Q1+Q2) = (^Q1)•(^Q2) and ^(Q1•Q2) = (^Q1)+(^Q2)
To identify propositional logic as a Boolean algebra we use the set B = {T,F} and
map the operations of propositional logic to Boolean operations as follows: (^ to ^),
(∨ to +) and (& to •). The identity elements are mapped: (1 to T) and (0 to F). The proof
of the Boolean properties for propositional logic is given as Question 1.9.3.
Question 1.9.3 Prove that the identification of propositional logic as a
Boolean algebra is correct.

Solution 1.9.3 (1) is trivial; (2) is the invariance of the truth tables of
Q1&Q2, Q1∨Q2 to interchange of values of Q1 and Q2; (3) requires comparison
of the truth tables of Q1∨(Q2&Q3) and (Q1∨Q2)&(Q1∨Q3) (see below).
Comparison of the truth tables of Q1&(Q2∨Q3) and (Q1&Q2)∨(Q1&Q3) is
done similarly.

Q1   Q2   Q3   Q2&Q3   Q1∨(Q2&Q3)   Q1∨Q2   Q1∨Q3   (Q1∨Q2)&(Q1∨Q3)
T    T    T    T       T            T       T       T
T    T    F    F       T            T       T       T
T    F    T    F       T            T       T       T
T    F    F    F       T            T       T       T
F    T    T    T       T            T       T       T
F    T    F    F       F            T       F       F
F    F    T    F       F            F       T       F
F    F    F    F       F            F       F       F
                                                    (1.9.8)

(4) requires verifying Q1&T = Q1, and Q1∨F = Q1 (see the truth tables for & and ∨
above); (5) requires constructing a truth table for Q∨^Q and verifying that it
is always T (see below). Similarly, the truth table for Q&^Q shows that it is
always F.

Q    ^Q   Q∨^Q
T    F    T              (1.9.9)
F    T    T
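The identification can also be confirmed exhaustively by a short program. The sketch below (naming is our own) checks properties 3 through 5 for B = {T, F}, with ∨ mapped to + and & mapped to •:

```python
from itertools import product

B = (True, False)  # propositional logic as a Boolean algebra: B = {T, F}

for q1, q2, q3 in product(B, repeat=3):
    # distributive law (property 3), in both forms
    assert (q1 and (q2 or q3)) == ((q1 and q2) or (q1 and q3))
    assert (q1 or (q2 and q3)) == ((q1 or q2) and (q1 or q3))

for q1 in B:
    # identity elements (property 4), with 1 -> T and 0 -> F
    assert (q1 and True) == q1 and (q1 or False) == q1
    # complementarity law (property 5)
    assert bool(q1 or (not q1)) is True and bool(q1 and (not q1)) is False

print("properties 3-5 hold for B = {T, F}")
```

Because B has only two elements, the exhaustive check over all assignments constitutes a complete proof for this algebra.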
1.9.3 Completeness

Our objective is to show that an arbitrary truth table, an arbitrary logical statement,
can be constructed out of only a few logical operations. Truth tables are also
equivalent to numerical functions, specifically, functions of binary variables that have
binary results (binary functions of binary variables). This can be seen using the
Boolean representation of T and F as {1,0}, which is more familiar as a binary notation
for numerical functions. For example, we can write the AND and OR operations
(functions) also as:
Q1   Q2   Q1•Q2   Q1+Q2
1    1    1       1
1    0    0       1              (1.9.10)
0    1    0       1
0    0    0       0
Similarly for all truth tables, a logical operation is a binary function of a set of binary
variables. Thus, the ability to form an arbitrary truth table from a few logical
operators is the same as the ability to form an arbitrary binary function of binary variables
from these same logical operators.
To prove this ability, we use the properties of the Boolean algebra to
systematically discuss truth tables. We first construct an alternative Boolean expression for
Q1+Q2 by a procedure that can be generalized to arbitrary truth tables. The procedure
is to look at each line in the truth table that contains an outcome of 1 and write an
expression that provides unity for that line only. Then we combine the lines to achieve
the desired table. Q1•Q2 is only unity for the first line, as can be seen from its column.
Similarly, Q1•(^Q2) is unity for the second line and (^Q1)•Q2 is unity for the third
line. Using the properties of + we can then combine the terms together in the form:

Q1•Q2 + Q1•(^Q2) + (^Q1)•Q2              (1.9.11)

Using associative and identity properties, this gives the same result as Q1+Q2.
We have replaced a simple expression with a much more complicated expression
in Eq. (1.9.11). The motivation for doing this is that the same procedure can be used
to represent any truth table. The general form we have constructed is called the
disjunctive normal form. We can construct a disjunctive normal representation for an
arbitrary binary function of binary variables. For example, given a specific binary
function of binary variables, f(Q1,Q2,Q3), we construct its truth table, e.g.,

Q1   Q2   Q3   f(Q1,Q2,Q3)
1    1    1    1
1    0    1    0
0    1    1    1
0    0    1    0              (1.9.12)
1    1    0    0
1    0    0    1
0    1    0    0
0    0    0    0
The disjunctive normal form is given by:

f(Q1,Q2,Q3) = Q1•Q2•Q3 + (^Q1)•Q2•Q3 + Q1•(^Q2)•(^Q3)              (1.9.13)

as can be verified by inspection. An analogous construction can represent any binary
function.
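The construction generalizes directly to code. The sketch below (the function names are our own) builds the minterms of the disjunctive normal form from any truth table and checks the result against the function of table (1.9.12):

```python
from itertools import product

def dnf_terms(f, n):
    """One product term per truth-table line with outcome 1; each term is a
    list of (variable index, negated?) literals."""
    return [[(i, v == 0) for i, v in enumerate(values)]
            for values in product([1, 0], repeat=n) if f(*values)]

def eval_dnf(terms, values):
    # + over the terms, and . over the (possibly negated) literals in each term
    return int(any(all(values[i] == (0 if neg else 1) for i, neg in term)
                   for term in terms))

# the function of table (1.9.12): outcome 1 on three lines
f = lambda q1, q2, q3: int((q1, q2, q3) in {(1, 1, 1), (0, 1, 1), (1, 0, 0)})

terms = dnf_terms(f, 3)
assert len(terms) == 3  # the three product terms of Eq. (1.9.13)
assert all(eval_dnf(terms, v) == f(*v) for v in product([1, 0], repeat=3))
```

Any binary function of n binary variables can be passed to `dnf_terms` in the same way.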
We have demonstrated that an arbitrary truth table can be constructed out of the
three operations (^, +, •). We say that these form a complete set of operations. Since
there are 2^n lines in a truth table formed out of n binary variables, there are 2^(2^n)
possible functions of these n binary variables. Each is specified by a particular choice of
the 2^n possible outcomes. We have achieved a dramatic simplification by recognizing
that all of them can be written in terms of only three operators. We also know that at
most (1/2)n2^n (^) operations, (n − 1)2^n (•) operations and 2^n − 1 (+) operations are
necessary. This is the number of operations needed to represent the identity function
1 in disjunctive normal form.
It is possible to further simplify the set of operations required. We can eliminate
either the + or the • operations and still have a complete set. To prove this we need only
display an expression for either of them in terms of the remaining operations:

Q1•Q2 = ^((^Q1)+(^Q2))
                                          (1.9.14)
Q1+Q2 = ^((^Q1)•(^Q2))
Question 1.9.4 Verify Eq. (1.9.14).

Solution 1.9.4 They may be verified using DeMorgan's Laws and the
involution property, or by construction of the truth tables, e.g.:

Q1   Q2   ^Q1   ^Q2   Q1•Q2   (^Q1)+(^Q2)
1    1    0     0     1       0
1    0    0     1     0       1
0    1    1     0     0       1
0    0    1     1     0       1
                                          (1.9.15)
It is possible to go one step further and identify binary operations that can
represent all possible functions of binary variables. Two possibilities are the NAND (ˆ&)
and NOR (ˆ∨) operations defined by:

Q1 ˆ& Q2 = ^(Q1 & Q2) → ^(Q1•Q2)
                                          (1.9.16)
Q1 ˆ∨ Q2 = ^(Q1 ∨ Q2) → ^(Q1+Q2)

Both the logical and Boolean forms are written above. The truth tables of these
operators are:

Q1   Q2   ^(Q1•Q2)   ^(Q1+Q2)
1    1    0          0
1    0    1          0              (1.9.17)
0    1    1          0
0    0    1          1
We can prove that each is complete by itself (capable of representing all binary
functions of binary variables) by showing that they are capable of representing one of the
earlier complete sets. We prove the case for the NAND operation and leave the NOR
operation to Question 1.9.5.
^Q1 = ^(Q1•Q1) = Q1 ˆ& Q1
                                          (1.9.18)
(Q1•Q2) = ^(^(Q1•Q2)) = ^(Q1 ˆ& Q2) = (Q1 ˆ& Q2) ˆ& (Q1 ˆ& Q2)
Question 1.9.5 Verify completeness of the NOR operation.

Solution 1.9.5 We can use the same formulas as in the proof of the
completeness of NAND by replacing • with + and ˆ& with ˆ∨ everywhere.
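The completeness of NAND is easy to check by machine. A sketch using 1 and 0 for T and F (the names are our own):

```python
NAND = lambda p, q: 1 - p * q  # left column of table (1.9.17)

NOT = lambda p: NAND(p, p)                       # Eq. (1.9.18), first line
AND = lambda p, q: NAND(NAND(p, q), NAND(p, q))  # Eq. (1.9.18), second line
OR = lambda p, q: NAND(NOT(p), NOT(q))           # via DeMorgan, as in Eq. (1.9.14)

for p in (0, 1):
    assert NOT(p) == 1 - p
    for q in (0, 1):
        assert AND(p, q) == p * q      # agrees with the . operation
        assert OR(p, q) == max(p, q)   # agrees with the + operation
print("NAND alone generates NOT, AND and OR")
```

Replacing `1 - p * q` with `1 - max(p, q)` and swapping the roles of AND and OR gives the corresponding check for NOR.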
1.9.4 Turing machines

We have found that logical operators can represent any binary function of binary
variables. This means that all well-defined mathematical operations on integers can be
represented in this way. One of the implications is that we might make machines out
of physical elements, each of which is capable of performing a Boolean operation.
Such a machine would calculate a mathematical function and spare us a tedious task.

We can graphically display the operations of a machine performing a series of
Boolean operations as shown in Fig. 1.9.1. This is a simplified symbolic form similar
to forms used in the design of computer logic circuits.
By looking carefully at Fig. 1.9.1 we see that there are several additional kinds of
actions that are necessary in addition to the elementary Boolean operation. These
actions are indicated by the lines that might be thought of as wires. One action is to
transfer information from the location where it is input into the system, to the place
where it is used. The second is to duplicate the information. Duplication is
represented in the figure by a branching of the lines. The branching enables the same
Figure 1.9.1 Graphical representation of Boolean operations. The top figure shows a
graphical element representing the NOR operation Q1 ˆ∨ Q2 = ^(Q1 ∨ Q2). In the bottom figure
we combine several operations together with lines (wires) indicating input, output, data
duplication and transfer to form the AND operation, (Q1 ˆ∨ Q1) ˆ∨ (Q2 ˆ∨ Q2) =
(^Q1) ˆ∨ (^Q2) = Q1&Q2. This equation may be used to prove completeness of the NOR
operation.
information to be used in more than one place. Additional implicit actions involve
timing, because the representation makes an assumption that time causes the
information to be moved and acted upon in a sequence from left to right. It is also
necessary to have mechanisms for input and output.

The kind of mathematical machine we just described is limited to performing
one prespecified function of its inputs. The process of making machines is time
consuming. To physically rearrange components to make a new function would be
inconvenient. Thus it is useful to ask whether we might design a machine such that part
of its input could include a specification of the mathematical operation to be
performed. Both information describing the mathematical function, and the numbers on
which it is to be performed, would be encoded in the input, which could be described
as a string of binary characters.
This discussion suggests that we should systematically consider the properties
of machines able to perform computations. The theory of computation is a
self-consistent discussion of abstract machines that perform a sequence of prespecified
well-defined operations. It extends the concept of universality that was discussed
for logical operations. While the theory of logic determined that all Boolean functions
could be represented using elementary logic operations, the theory of computation
endeavors to establish what is possible to compute using a sequence of more general
elementary operations. For this discussion many of the practical matters of computer
design are not essential. The key question is to establish a relationship between
machines that might be constructed and mathematical functions that may be computed.
Part of the problem is to define what a computation is.
There are several alternative models of computation that have been shown to be
equivalent in a formal sense, since each one of them can simulate any other. Turing
introduced a class of machines that represent a particular model of computation.
Rather than maintaining information in wires, Turing machines (Fig. 1.9.2) use a
storage device that can be read and written to. The storage is represented as an
infinite one-dimensional tape marked into squares. On the tape can be written
characters, one to a square. The total number of possible characters, the alphabet, is finite.
These characters are often taken to be digits plus a set of markers (delimiters). In
addition to the characters, the tape squares can also be blank. All of the tape is blank
except for a finite number of nonblank places. Operations on the tape are performed by
a roving read-write head that has a specified (finite) number of internal storage
elements and a simple kind of program encoded in it. We can treat the program as a table
similar to the tables discussed in the context of logic. The table operation acts upon
the value of the tape at the current location of the head, and the value of storage
elements within the read head. The result of an operation is not just a single binary value.
Instead it corresponds to a change in the state of the tape at the current location
(write), a change in the internal memory of the head, and a shift of the location of the
head by one square either to the left or to the right.
We can also think about a Turing machine (TM) as a dynamic system. The
internal table does not change in time. The internal state s(t), the current location l(t), the
current character a(t) and the tape c(t) are all functions of time. The table consists of
a set of instructions or rules of the form {move, s′, a′, s, a} corresponding to a
deterministic transition matrix. s and a are the current internal state and current tape
character respectively. s′ and a′ are the new internal state and character. The move to be
made is either right or left (R or L).

Using either conceptual model, the TM starts from an initial state and location
and a specified tape. In each time interval the TM head performs the following
operations:

1. Read the current tape character
2. Find the instruction that corresponds to the existing combination of (s,a)
3. Change the internal memory to the corresponding s′
4. Write the tape with the corresponding character a′
5. Move the head to the left or right as specified by the move command

When the TM head reaches a special internal state known as the halt state, then the
outcome of the computation may be read from the tape. For simplicity, in what
follows we will indicate entering the halt state by a move H, which is to halt.
The best way to understand the operation of a TM is to construct particular
tables that perform particular actions (Question 1.9.6). In addition to logical
Figure 1.9.2 Turing's model of computation, the Turing machine (TM), consists of a tape
divided into squares with characters of a finite alphabet written on it. A roving "head,"
indicated by the triangle, has a finite number of internal states and acts by reading and writing
the tape according to a prespecified table of rules. Each rule consists of a command to read
the tape, write the tape, change the internal state of the TM head and move either to the left
or right. A simplified table is shown consisting of several rules of the form {move, s′, a′, s, a},
where a and a′ are possible tape characters, s and s′ are possible states of the head, and the
move takes the head right (R), left (L) or halts (H). Each update the TM starts by finding
the rule {move, s′, a′, s, a} in the table such that a is the character on the tape at the current
location of the head, and s is its current state. The tape is written with the corresponding a′
and the state of the TM head is changed to s′. Then the TM head moves according to the
corresponding right or left move. The illustration simplifies the characters to binary digits 0 and 1
and the states of the TM head to s1 and s2; the sample table shown is {R, s2, 1, s1, 1},
{L, s1, 1, s1, 0}, {R, s1, 1, s2, 0}, {H, s2, 1, s2, 1}.
operations, the possible actions include moving and copying characters.
Constructing particular actions using a TM is tedious, in large part because the
movements of the head are limited to a single displacement right or left. Actual
computers use direct addressing that enables access to a particular storage location
in memory using a number (address) specifying its location. TMs do not generally
use this because the tape is arbitrarily long, so that an address is an arbitrarily
large number, requiring an arbitrarily large storage in the internal state of the head.
Infinite storage in the head is not part of the computational model.
Question 1.9.6 The following TM table is designed to move a string of
binary characters (0 and 1) that are located to the left of a special marker
M to blank squares on the tape to the right of the M and then to stop on the
M. Blank squares are indicated by B. The internal states of the head are
indicated by s1, s2, . . . These are not italicized, since they are values rather than
variables. The movements of the head right and left are indicated by R and
L. As mentioned above, we indicate entering the halt state by a movement H.
Each line has the form {move, s′, a′, s, a}.

Read over the program and convince yourself that it does what it is
supposed to. Describe how it works. The TM must start from state s1 and must
be located at the leftmost nonblank character. The line numbering is only for
convenience in describing the TM, and has no role in its operation.

 1. R s2 B s1 0
 2. R s3 B s1 1
 3. R s2 0 s2 0
 4. R s2 1 s2 1
 5. R s2 M s2 M
 6. R s3 0 s3 0
 7. R s3 1 s3 1
 8. R s3 M s3 M              (1.9.19)
 9. L s4 0 s2 B
10. L s4 1 s3 B
11. L s4 0 s4 0
12. L s4 1 s4 1
13. L s4 M s4 M
14. R s1 B s4 B
15. H s1 M s1 M
Solution 1.9.6 This TM works by (lines 1 or 2) reading a nonblank
character (0 or 1) into the internal state of the head; 0 is represented by s2 and 1
is represented by s3. The character that is read is set to a blank B. Then the
TM moves to the right, ignoring all of the tape characters 0, 1 or M (lines 3
through 8) until it reaches a blank B. It writes the stored character (lines 9 or
10), changing its state to s4. This state specifies moving to the left, ignoring
all characters 0, 1 or M (lines 11 through 13) until it reaches a blank B. Then
(line 14) it moves one step right and resets its state to s1. This starts the
procedure from the beginning. If it encounters the marker M in the state s1
instead of a character to be copied, then it halts (line 15).
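A table like (1.9.19) can be executed directly by a few lines of code. The simulator below is our own minimal sketch of the TM update rule; the tape is a dictionary from position to character, with unwritten squares blank:

```python
def run_tm(rules, tape, state="s1", pos=0):
    """Run a TM whose rules map (state, char) -> (move, new state, new char)."""
    while True:
        a = tape.get(pos, "B")
        move, s_new, a_new = rules[(state, a)]
        tape[pos], state = a_new, s_new  # write the tape, update the head state
        if move == "H":
            return tape, pos
        pos += 1 if move == "R" else -1  # move one square right or left

# table (1.9.19): copy the string left of the marker M to the right of M
rules = {
    ("s1", "0"): ("R", "s2", "B"), ("s1", "1"): ("R", "s3", "B"),
    ("s2", "0"): ("R", "s2", "0"), ("s2", "1"): ("R", "s2", "1"),
    ("s2", "M"): ("R", "s2", "M"), ("s3", "0"): ("R", "s3", "0"),
    ("s3", "1"): ("R", "s3", "1"), ("s3", "M"): ("R", "s3", "M"),
    ("s2", "B"): ("L", "s4", "0"), ("s3", "B"): ("L", "s4", "1"),
    ("s4", "0"): ("L", "s4", "0"), ("s4", "1"): ("L", "s4", "1"),
    ("s4", "M"): ("L", "s4", "M"), ("s4", "B"): ("R", "s1", "B"),
    ("s1", "M"): ("H", "s1", "M"),
}
tape = dict(enumerate("101M"))  # the string 101 to the left of the marker M
tape, pos = run_tm(rules, tape)
right = "".join(tape.get(i, "B") for i in range(4, 7))
print(right, tape[pos])  # prints: 101 M -- the copy appears right of M; head halts on M
```

Tracing a few updates by hand against the simulator's output is a good way to internalize steps 1 through 5 of the update rule.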
Since each character can also be represented by a set of other characters (i.e., 2 in
binary is 10), we can allow the TM head to read and write not one but a finite
prespecified number of characters without making a fundamental change. The following
TM, which acts upon pairs of characters and moves on the tape by two characters at
a time, is the same as the one given in Question 1.9.6.

 1. 01 01 00 00 01
 2. 01 11 00 00 11
 3. 01 01 01 01 01
 4. 01 01 11 01 11
 5. 01 01 10 01 10
 6. 01 11 01 11 01
 7. 01 11 11 11 11
 8. 01 11 10 11 10              (1.9.20)
 9. 10 10 01 01 00
10. 10 10 11 11 00
11. 10 10 01 10 01
12. 10 10 11 10 11
13. 10 10 10 10 10
14. 01 00 00 10 00
15. 00 00 10 00 10
The particular choice of the mapping from characters and internal states onto the
binary representation is not unique. This choice is characterized by using the left and
right bits to represent different aspects. In columns 3 or 5, which represent the tape
characters, the right bit represents the type of element (marker or digit), and the left
represents which element or marker it is: 00 represents the blank B, 10 represents M,
01 represents the digit 0, and 11 represents the digit 1. In columns 2 or 4, which
represent the state of the head, the states s1 and s4 are represented by 00 and 10, s2 and s3
are represented by 01 and 11 respectively. In column 1, moving right is 01, left is 10,
and halt is 00.
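Under this mapping (which, again, is only one possible choice), table (1.9.20) can be regenerated mechanically from table (1.9.19); a sketch:

```python
move_code = {"R": "01", "L": "10", "H": "00"}
state_code = {"s1": "00", "s2": "01", "s3": "11", "s4": "10"}
char_code = {"B": "00", "0": "01", "1": "11", "M": "10"}

# table (1.9.19), each rule of the form {move, s', a', s, a}
rules = [
    ("R", "s2", "B", "s1", "0"), ("R", "s3", "B", "s1", "1"),
    ("R", "s2", "0", "s2", "0"), ("R", "s2", "1", "s2", "1"),
    ("R", "s2", "M", "s2", "M"), ("R", "s3", "0", "s3", "0"),
    ("R", "s3", "1", "s3", "1"), ("R", "s3", "M", "s3", "M"),
    ("L", "s4", "0", "s2", "B"), ("L", "s4", "1", "s3", "B"),
    ("L", "s4", "0", "s4", "0"), ("L", "s4", "1", "s4", "1"),
    ("L", "s4", "M", "s4", "M"), ("R", "s1", "B", "s4", "B"),
    ("H", "s1", "M", "s1", "M"),
]

encoded = [" ".join([move_code[m], state_code[s2], char_code[a2],
                     state_code[s1], char_code[a1]])
           for m, s2, a2, s1, a1 in rules]
print(encoded[0], "/", encoded[-1])  # prints: 01 01 00 00 01 / 00 00 10 00 10
```

Each printed line reproduces the corresponding line of table (1.9.20).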
The architecture of a TM is very general and allows for a large variety of actions
using complex tables. However, all TMs can be simulated by transferring all of the
responsibility for the table and data to the tape. A TM that can simulate all TMs is called
a universal Turing machine (UTM). As with other TMs, the responsibility of
arranging the information lies with the "programmer." The UTM works by representing the
table, current state, and current letter on the UTM tape. We will describe the essential
concepts in building a UTM but will not explicitly build one.
The UTM acts on its own set of characters with its own set of internal states. In
order to use it to simulate an arbitrary TM, we have to represent the TM on the tape
of the UTM in the characters that the UTM can operate on. On the UTM tape, we
must be able to represent four types of entities: a TM character, the state of the TM head, the movement to be taken by the TM head, and markers that indicate to the UTM what is where on the tape. The markers are special to the UTM and must be carefully distinguished from the other three. For later reference, we will build a particular type of UTM where the tape can be completely represented in binary.
The UTM tape has three parts: the part that represents the table of the TM, a work area, and the part that represents the tape of the TM (Fig. 1.9.3). To represent the tape and table of a particular but arbitrary TM, we start with a binary representation of its alphabet and of its internal states

a1 → 00000,  a2 → 00001,  a3 → 00010, …
s1 → 000,  s2 → 001, …    (1.9.21)
where we keep the left zeros, as needed for the number of bits in the longest binary number. We then make a doubled binary representation like that used in the previous example, where each bit becomes two bits with the low-order bit a 1. The doubled binary notation will enable us to distinguish between UTM markers and all other entities on the tape. Thus we have:
a1 → 01 01 01 01 01,  a2 → 01 01 01 01 11,  a3 → 01 01 01 11 01, …
s1 → 01 01 01,  s2 → 01 01 11, …    (1.9.22)
These labels of characters and states are in a sense arbitrary, since the transition table is what gives them meaning.
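The doubled-binary step can be made concrete with a short sketch (the function name is mine, not the book's): each ordinary bit b becomes the pair b followed by 1, so every pair that encodes data ends in 1.

```python
def double_bits(bits: str) -> str:
    """Doubled-binary encoding: each bit b becomes the pair b + '1',
    so every data pair ends in 1 (marker pairs, later, will end in 0)."""
    return " ".join(b + "1" for b in bits)

# Characters a1, a2, a3 as in Eqs. (1.9.21)-(1.9.22):
print(double_bits("00000"))  # a1 -> 01 01 01 01 01
print(double_bits("00001"))  # a2 -> 01 01 01 01 11
print(double_bits("00010"))  # a3 -> 01 01 01 11 01
```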
We also encode the movement commands. The movement commands are not arbitrary, since the UTM must know how to interpret them. We have allowed the TM to displace more than one character, so we must encode a set of movements such as R1, L1, R2, L2, and H. These correspond respectively to moving one character right, one character left, two characters right, two characters left, and entering the halt state.
Because the UTM must understand the move that is to be made, we must agree once
Figure 1.9.3 The universal Turing machine (UTM) is a special TM that can simulate the computation performed by any other TM. The UTM does this by executing the rules of the TM that are encoded on the tape of the UTM. There are three parts to the UTM tape: the part where the TM table is encoded (on the left), the part where the tape of the TM is encoded (on the right), and a workspace (in the middle) where information representing the current state of the TM head, the current character of the TM tape, and the movement command are encoded. See the text for a description of the operation of the UTM based on its own rule table.
[Figure: the UTM tape, showing the TM table, the workspace (current character, internal state, move) delimited by the markers M4 and M5, and the TM tape with its M1 markers.]
and for all on a coding of these movements. We use the lowest-order bit as a direction bit (1 = right, 0 = left) and the rest of the bits as the number of displacements in binary

R1 → 011,  R2 → 101, …,
L1 → 010,  L2 → 100, …,    (1.9.23)
H → 000 or 001
The doubled binary representation is as before: each bit becomes two bits with the low-order bit a 1,

R1 → 01 11 11,  R2 → 11 01 11, …,
L1 → 01 11 01,  L2 → 11 01 01, …,    (1.9.24)
H → 01 01 01 or 01 01 11
Care is necessary in the UTM design because we do not know in advance how many types of TM moves are possible. We also don't know how many characters or internal states the TM has. This means that we don't know the length of their binary representations.
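A hedged sketch of the movement encoding of Eqs. (1.9.23)-(1.9.24), assuming the fixed 3-bit width of the example above (the function names are illustrative, not the book's):

```python
def encode_move(direction: str, n: int, width: int = 3) -> str:
    """Low-order bit = direction (1 = right, 0 = left); the remaining
    bits hold the displacement n in binary, as in Eq. (1.9.23).
    Halt can be encoded as all zeros (n = 0)."""
    bit = 1 if direction == "R" else 0
    return format((n << 1) | bit, f"0{width}b")

def doubled(bits: str) -> str:
    """Doubled-binary form of Eq. (1.9.24): each bit b becomes b + '1'."""
    return " ".join(b + "1" for b in bits)

print(encode_move("R", 1))           # 011
print(doubled(encode_move("R", 1)))  # 01 11 11
print(doubled(encode_move("L", 2)))  # 11 01 01
```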
We need a number of markers that indicate to the UTM the beginning and end of the encoded characters, states and movements described above. We also need markers to distinguish different regions of the tape. A sufficient set of markers is:

M1—the beginning of a TM character,
M2—the beginning of a TM internal state,
M3—the beginning of a TM table entry, which is also the beginning of a movement command,
M4—a separator between the TM table and the workspace,
M5—a separator between the workspace and the TM tape,
M6—the beginning of the current TM character (the location of the TM head),
M7—the identified TM table entry to be used in the current step, and
B—the blank, which we include among the markers.
Depending on the design of the UTM, these markers need not all be distinct. In any case, we encode them also in binary

B → 000,  M1 → 001,  M2 → 010, …    (1.9.25)

and then in doubled binary form, where the second bit of each pair is now zero:

B → 00 00 00,  M1 → 00 00 10,  M2 → 00 10 00, …    (1.9.26)
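Since data pairs end in 1 and marker pairs end in 0, a reader of the tape can classify every 2-bit pair without knowing any lengths in advance. A minimal sketch (the function name is mine):

```python
def classify_pairs(tape: str):
    """Split a doubled-binary tape into 2-bit pairs and label each:
    pairs ending in 1 carry data bits, pairs ending in 0 belong to
    markers (Eqs. 1.9.22 and 1.9.26). Returns (kind, underlying bit)."""
    return [("data" if p[1] == "1" else "marker", p[0])
            for p in tape.split()]

print(classify_pairs("00 00 10"))        # M1: three marker pairs
print(classify_pairs("01 01 01 01 11"))  # a2: five data pairs
```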
We are now in a position to encode both the tape and table of the TM on the tape of the UTM. The representation of the table consists of a sequence of representations of the lines of the table, L1 L2 …, where each line is represented by the doubled binary representation of

M3 M2 s′ M1 a′ M2 s M1 a    (1.9.27)
The markers are definite, but the characters, states and movements correspond to those in a particular line in the table. The UTM representation of the tape of the TM, a1 a2 …, is a doubled binary representation of

M1 a1 M1 a2 M1 a3 …    (1.9.28)
The workspace starts with the character M4 and ends with the character M5. There is room enough for the representation of the current TM machine state, the current tape character and the movement command to be executed. At a particular time in execution it appears as:

M4 M2 s M1 a M5    (1.9.29)
We describe in general terms the operation of the UTM using this representation of a TM. Before execution we must indicate the starting location of the TM head and its initial state. This is done by changing the corresponding marker M1 to M6 (at the UTM tape location to the left of the character corresponding to the initial location of the TM), and the initial state of the TM is encoded in the workspace after M2.
The UTM starts from the leftmost nonblank character of its tape. It moves to the right until it encounters M6. It then copies the character after M6 into the work area after M1. It compares the values of (s, a) in the work area with all of the possible (s, a) pairs in the transition table until it finds the same pair. It marks this table entry with M7. The corresponding s′ from the table is copied into the work area after M2. The corresponding a′ is copied to the tape after M6. The corresponding movement command is copied to the work area after M4. If the movement command is H, the TM halts. Otherwise, the marker M6 is moved according to the value of the movement command. It is moved one step at a time (i.e., the marker M6 is switched with the adjacent M1) while decrementing the displacement digits of the movement command (all but the rightmost bit) and moving in the direction specified by the rightmost bit. When the movement command is decremented to zero, the UTM begins the cycle again by copying the character after M6 into the work area.
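The cycle just described can be sketched at the level of the decoded table rather than on the doubled-binary tape itself. The dictionary table and sparse tape below are illustrative assumptions of mine, not the book's construction:

```python
def simulate_tm(table, tape, state, pos, max_steps=10_000):
    """Schematic of the UTM cycle: look up (state, character) in the
    table, write a', enter s', and move the head; halt on 'H'.
    `table` maps (s, a) -> (s_new, a_new, move), move = ('R', n),
    ('L', n), or 'H'. Blank cells read as 'B'."""
    cells = dict(enumerate(tape))  # sparse tape
    for _ in range(max_steps):
        a = cells.get(pos, "B")
        s_new, a_new, move = table[(state, a)]
        cells[pos] = a_new
        if move == "H":
            break
        direction, n = move
        pos += n if direction == "R" else -n
        state = s_new
    return state, [cells[i] for i in sorted(cells)]

# Hypothetical one-state example: overwrite 0s with 1s, halt at a blank.
table = {("s1", "0"): ("s1", "1", ("R", 1)),
         ("s1", "B"): ("s1", "B", "H")}
print(simulate_tm(table, ["0", "0", "0"], "s1", 0))
# ('s1', ['1', '1', '1', 'B'])
```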
There is one detail we have overlooked: the TM can write to the left of its nonblank characters. This would cause problems for the UTM we have designed, since to the left of the TM tape representation lie the workspace and the TM table. There are various ways to overcome this difficulty. One is to represent the TM tape by folding it upon itself and interleaving the characters. Starting from an arbitrary location on the TM tape, we write all characters on the UTM tape to the right of M5, so that odd characters are the TM tape to the right, and even ones are the TM tape to the left. Movements of the M6 marker are doubled, and it is reflected (bounces) when it encounters M5.
A TM is a dynamic system. We can reformulate Turing's model of computation in the form of a cellular automaton (Section 1.5) in a way that will shed some light on the dynamics that are being discussed. The most direct way to do this is to make an automaton with two adjacent tapes. The only information on the second tape is a single nonblank character at the location of the head that represents its internal state. The TM update is entirely contained within the update rule of the automaton. This update rule may be constructed so that it acts at every point in the space, but is enabled by the nonblank character in the adjacent square on the second tape. When the
dynamics reaches a steady state (it is enough that two successive states of the automaton are the same), the computation is completed. If desired, we could reduce this CA to one tape by placing each pair of squares in the two tapes adjacent to each other, interleaving the two tapes. While a TM can be represented as a CA, any CA with only a finite number of active cells can be updated by a Turing machine program (it is computable). There are many other CA that can be programmed by their initial state to perform computations. These can be much simpler than using the TM model as a starting point. One example is Conway's Game of Life, discussed in Section 1.5. Like a UTM, this CA is a universal computer—any computation can be performed by starting from some initial state and looking at the final steady state for the result.
When we consider the relationship of computation theory to dynamic systems, there are some intentional restrictions in the theory that should be recognized. The conventional theory of computation describes a single computational unit operating on a character string formed from a finite alphabet of characters. Thus, computation theory does not describe a continuum in space, an infinite array of processors, or real numbers. Computer operations only approximately mimic the formal definition of real numbers. Since an arbitrary real number requires infinitely many digits to specify, computations upon real numbers in finite time are impossible. The rejection by computation theory of operations upon real numbers is not a trivial one. It is rooted in fundamental results of computation theory regarding limits to what is inherently possible in any computation.
This model of computation as dynamics can be summarized by saying that a computation is the steady-state result of a deterministic CA with a finite alphabet (a finite number of characters at each site) and a finite-domain update rule. One of the characters (the blank or vacuum) must be such that it is unchanged when the system is filled with these characters. The space is infinite, but the conditions are such that all space except for a finite region must be filled with the blank character.
1.9.5 Computability and the halting problem
The construction of a UTM guarantees that if we know how to perform a particular operation on numbers, we can program a UTM to perform this computation. However, if someone gives you such a program, can you determine what it will compute? This seemingly simple question turns out to be at the core of a central problem of logic theory. It turns out that it is not only difficult to determine what it will compute; it is, in a formal sense that will be described below, impossible to figure out whether it will compute anything at all. The requirement that it compute something is that it eventually halt. By halting, it declares its computation completed and the answer given. Instead of halting, it might loop forever, or it might continue to write on ever larger regions of tape. To say that we can determine whether it will compute something is equivalent to saying that we can determine whether it will eventually halt. This is called the halting problem. How could we determine whether it will halt? We have seen above how to represent an arbitrary TM on the tape of a particular TM. Consistent with computation theory, the halting problem is to construct a special TM, T_H, whose input is a description of a TM and whose output is a single bit that specifies whether or not the
TM will halt. In order for this to make sense, the program T_H must itself halt. We can prove by contradiction that this is not possible in general, and therefore we say that the halting problem is not computable. The proof is based on constructing a paradoxical logical statement of the form "This statement is false."
A proof starts by assuming we have a TM called T_H that accepts as input a tape representing a TM Y and its tape y. The output, which can be represented in functional form as T_H(Y, y), is always well defined and is either 1 or 0, representing the statement that the TM Y halts on y or does not halt on y, respectively. We now construct a logical contradiction by constructing an additional TM based on T_H. First we consider T_H(Y, Y), which asks whether Y halts when acting on a tape representing itself. We design a new TM T_H1 that takes only Y as input, copies it, and then acts in the same way as T_H. So we have

T_H1(Y) = T_H(Y, Y)    (1.9.30)
We now define a TM T_H2 that is based on T_H1 but whenever T_H1 gives the answer 0 it gives the answer 1, and whenever T_H1 gives the answer 1 it enters a loop and computes forever. A moment's meditation shows that this is possible if we have T_H1. Applying T_H2 to itself then gives us the contradiction, since T_H2(T_H2) gives 1 if

T_H1(T_H2) = T_H(T_H2, T_H2) = 0    (1.9.31)

By definition of T_H this means that T_H2(T_H2) does not halt, which is a contradiction. Alternatively, T_H2(T_H2) computes forever if

T_H1(T_H2) = T_H(T_H2, T_H2) = 1

by definition of T_H this means that T_H2(T_H2) halts, which is a contradiction.
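The diagonal construction can be sketched in code. The decider T_H here is hypothetical; the sketch only shows that, given any candidate T_H, the program T_H2 built from it behaves opposite to T_H's verdict about T_H2 applied to itself:

```python
def make_T_H2(T_H):
    """Given any candidate halting decider T_H(prog, inp) -> 0 or 1,
    build the diagonal program T_H2 of Eqs. (1.9.30)-(1.9.31)."""
    def T_H2(Y):
        if T_H(Y, Y) == 1:   # T_H claims Y(Y) halts...
            while True:      # ...so do the opposite: loop forever
                pass
        return 1             # T_H claims Y(Y) loops, so halt

    return T_H2

# Whatever a candidate decider answers about T_H2 applied to itself,
# T_H2's actual behavior is the opposite; hence no correct T_H exists.
claims_loops = lambda Y, y: 0
T_H2 = make_T_H2(claims_loops)
print(T_H2(T_H2))  # the decider said "loops", yet T_H2 halts: prints 1
```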
The noncomputability of the halting problem is similar to Gödel's theorem and other results denying the completeness of logic, in the sense that we can ask a question about a logical construction that cannot be answered by it. Gödel's theorem may be paraphrased as: in any axiomatic formulation of number theory (i.e., integers), it is possible to write a statement that cannot be proven T or F. There has been a lot of discussion about the philosophical significance of these theorems. A basic conclusion that may be reached is that they describe something about the relationship of the finite and infinite. Turing machines can be represented, as we have seen, by a finite set of characters. This means that we can enumerate them, and they correspond one-to-one with the integers. Like the integers, there are (countably) infinitely many of them. Gödel's theorem is part of our understanding of how an infinite set of numbers must be described. It tells us that we cannot describe their properties using a finite set of statements. This is appealing from the point of view of information theory, since an arbitrary integer contains an arbitrarily large amount of information. The noncomputability of the halting problem tells us more specifically that we can ask a question about a system that is described by a finite amount of information whose answer (in the sense of computation) is not contained within it. We have thus made a vague connection between computation and information theory. We take this connection one step further in the following section.
1.9.6 Computation and information in brief
One of our objectives will be to relate computation and information. We therefore ask, Can a calculation produce information? Let us think about the result of a TM calculation, which is a string of characters—the nonblank characters on the output tape. How much information is necessary to describe it? We could describe it directly, or use a Markov model as in Section 1.8. However, we could also give the input of the TM and the TM description, and this would be enough information to enable us to obtain the output by computation. This description might contain more or fewer characters than the direct description of the output. We now return to the problem of defining the information content of a string of characters. Utilizing the full power of computation, we can define this as the length of the shortest possible input tape for a UTM that gives the desired character string as its output. This is called the algorithmic (or Kolmogorov) complexity of a character string. We have to be careful with the definition, since there are many different possible UTMs. We will discuss this in greater detail in Chapter 8. However, this discussion does imply that a calculation cannot produce information. The information present at the beginning is sufficient to obtain the result of the computation. It should be understood, however, that the information that seems to us to be present in a result may be larger than the original information unless we are able to reconstruct the starting point and the TM used for the computation.
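As a rough illustration of this definition (an upper bound only; a general-purpose compressor is not the shortest UTM input), a highly regular string admits a far shorter description than an incompressible one:

```python
import os
import zlib

# A highly regular 100,000-character string: a one-line description
# ("'01' repeated 50,000 times") captures it, and even a generic
# compressor shrinks it to a few hundred bytes. Random bytes do not
# compress: their shortest description is essentially themselves.
regular = "01" * 50_000
random_like = os.urandom(100_000)

print(len(zlib.compress(regular.encode())))  # a few hundred bytes
print(len(zlib.compress(random_like)))       # close to 100,000 bytes
```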
1.9.7 Logic, computation and human thought
Both logic and computation theory are designed to capture aspects of human thought. A fundamental question is whether they capture enough of this process—are human beings equivalent to glorified Turing machines? We will ask this question in several ways throughout the text and arrive at various conclusions, some of which support this identification and some of which oppose it. One way to understand the question is as one of progressive approximation. Logic was originally designed to model human thought. Computation theory, which generalizes logic, includes additional features not represented in logic. Computers as we have defined them are instruments of computation. They are given input (information) specifying both program and data and provide well-defined output an indefinite time later. One of the features that is missing from this kind of machine is the continuous input-output interaction with the world characteristic of a sensory-motor system. An appropriate generalization of the Turing machine would be a robot. As it is conceived and sometimes realized, a robot has both sensory and motor capabilities and an embedded computer. Thus it has more of the features characteristic of a human being. Is this sufficient, or have we missed additional features?

Logic and computation are often contrasted with the concept of creativity. One of the central questions about computers is whether they are able to simulate creativity. In Chapter 3 we will produce a model of creativity that appears to be possible to simulate on a computer. Hidden in this model, however, is a need to use random numbers. This might seem to be a minor problem, since we often use computers to
generate random numbers. However, computers do not actually generate randomness; they generate pseudo-random numbers. If we recall that randomness is the same as information, by the discussion in the previous section, a computer cannot generate true randomness. A Turing machine cannot generate a result that has more information than it is given in its initial data. Thus creativity appears to be tied at least in part to randomness, as has often been suggested, and this may be a problem for conventional computers. Conceptually, this problem can be readily resolved by adding to the description of the Turing machine an infinite random tape in addition to the infinite blank tape. This new system appears quite similar to the original TM specification. A reasonable question would ask whether it is really inherently different. The main difference that we can ascertain at this time is that the new system would be capable of generating results with arbitrarily large information content, while the original TM could not. This is not an unreasonable distinction to make between a creative and a logical system. There are still key problems with understanding the practical implications of this distinction.
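A small demonstration of the determinism of pseudo-random numbers: two generators given the same seed produce the same "random" sequence, so the output carries no more information than the seed and the algorithm.

```python
import random

# Identically seeded generators yield identical sequences: the
# "randomness" is entirely determined by the initial data.
g1, g2 = random.Random(42), random.Random(42)
seq1 = [g1.randint(0, 9) for _ in range(10)]
seq2 = [g2.randint(0, 9) for _ in range(10)]
print(seq1 == seq2)  # True
```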
The subtlety of this discussion increases when we consider that one branch of theoretical computer science is based on the commonly believed assumption that there exist functions that are inherently difficult to invert—they can only be inverted in a time that grows exponentially with the length of the nonblank part of the tape. For all practical purposes, they cannot be inverted, because the estimated lifetime of the universe is insufficient to invert such functions. While their existence is not proven, it has been proven that if they do exist, then such a function can be used to generate a string of characters that, while not random, cannot be distinguished from a random string in less than exponential time. This would suggest that there can be no practical difference between a TM with a random tape and one without. Thus, the possibility of the existence of noninvertible functions is intimately tied to questions about the relationship between TMs, randomness and human thought.
1.9.8 Using computation and information to describe the real world
In this section we review the fundamental relevance of the theories of computation and information in the real world. This relevance ultimately arises from the properties of observations and measurements.

In our observations of the world, we find that quantities we measure vary. Indeed, without variation there would be no such thing as an observation. There are variations over time as well as over space. Our intellectual effort is dedicated to classifying or understanding this variation. To concretize the discussion, we consider observations of a variable s, which could be given as a function of time s(t) or of space s(x). Even though x or t may appear continuous, our observations may often be described as a finite discrete set {s_i}. One of the central (meta)observations about the variation in value of {s_i} is that sometimes the value of the variable s_i can be inferred from, is correlated with, or is not independent from its value or values at some other time or position s_j.
These concepts have to do with the relatedness of s_i to s_j. Why is this important? The reason is that we would like to know the value of s_i without having to observe it.
We can understand this as a problem in prediction—to anticipate events that will occur. We would also like to know what is located at unobserved positions in space, e.g., around the corner. And even if we have observed something, we do not want to have to remember all the observations we make. We could argue more fundamentally that knowledge/information is important only if prediction is possible. There would be no reason to remember past observations if they were uncorrelated with anything in the future. If correlations enable prediction, then it is helpful to store information about the past. We want to store as little as possible in order to make the prediction. Why? Because storage is limited, or because accessing the right information requires a search that takes time. If a search takes more time than we have until the event we want to predict, then the information is not useful. As a corollary (from a simplified utilitarian point of view), we would like to retain only information that gives us the best, most rapid prediction, under the most circumstances, for the least storage.
Inference is the process of logic or computation. To be able to infer the state of a variable s_i means that we have a definite formula f(s_j) that will give us the value of s_i with complete certainty from a knowledge of s_j. The theory of computation describes what functions f are possible. If the index i corresponds to a later time than j, we say that we can predict its value. In addition to the value of s_j, we need to know the function f in order to predict the value of s_i. This relationship need not be from a single value s_j to a single value s_i. We might need to know a collection of values {s_j} in order to obtain the value of s_i from f({s_j}).
As part of our experience of the world, we have learned that observations at a particular time are more closely related to observations at a previous time than to observations at different nearby locations. This has been summarized by the principle of causality. Causality is the ability to determine what happens at one time from what happened at a previous time. This is more explicitly stated as microcausality—what happens at a particular time and place is related to what happened at a previous time in its immediate vicinity. Causality is the principle behind the notion of determinism, which suggests that what occurs is determined by prior conditions. One of the ways that we express the relationship between system observations over time is by conservation laws. Conservation laws are the simplest form of a causal relationship.
Correlation is a looser relationship than inference. The statement that values s_i and s_j are correlated implies that even if we cannot tell exactly what the value s_i is from a knowledge of s_j, we can describe it at least partially. This partial knowledge may also be inherently statistical in the context of an ensemble of values, as discussed below. Correlation often describes a condition where the values s_i and s_j are similar. If they are opposite, we might say they are anticorrelated. However, we sometimes use the term "correlated" more generally. In this case, to say that s_i and s_j are correlated would mean that we can construct a function f(s_j) which is close to the value of s_i but not exactly the same. The degree of correlation would tell us how close we expect them to be. While correlations in time appear to be more central than correlations in space, systems with interactions have correlations in both space and time.
Concepts of relatedness are inherently of an ensemble nature. This means that they do not refer to a particular value s_i or a pair of values (s_i, s_j) but rather to a
collection of such values or pairs. The ensemble nature of relationships is often more explicit for correlations, but it also applies to inference. This ensemble nature is hidden by functional terminology that describes a relationship between particular values. For example, when we say that the temperature at 1:00 P.M. is correlated with the temperature at 12:00 P.M., we are describing a relationship between two temperature values. Implicitly, we are describing the collection of all pairs of temperatures on different days or at different locations. The set of such pairs are analogs. The concept of inference also generally makes sense only in reference to an ensemble. Let us assume for the moment that we are discussing only a single value s_i. The statement of inference would imply that we can obtain s_i as the value f(s_j). For a single value, the easiest way (requiring the smallest amount of information) to specify f(s_j) would be to specify s_i. We do not gain by using inference in this single case. However, we can gain if we know that, for example, the velocity of an object will remain the same if there are no forces upon it. This describes the velocity v(t) in terms of v(t′) for any one object out of an ensemble of objects. We can also gain from inference if the function f({s_j}) gives a string of more than one s_i.
The notion of independence is the opposite of inference or correlation. Two values s_i and s_j are independent if there is no way that we can infer the value of one from the other, and if they are not correlated. Randomness is similar to independence. The word "independent" is used when there is no correlation between two observations. The word "random" is stronger, since it means that there is no correlation between an observed value and anything else. A random process, like a sequence of coin tosses, is a sequence where each value is independent of the others. We have seen in Section 1.8 that randomness is intimately related to information. Random processes are unpredictable; therefore it makes no sense for us to try to accumulate information that will help predict them. In this sense, a random process is simple to describe. However, once a random process has occurred, other events may depend upon it. For example, someone who wins a lottery will be significantly affected by an event presumed to be random. Thus we may want to remember the results of the random process after it occurs. In this case we must remember each value. We might ask, Once the random process has occurred, can we summarize it in some way? The answer is that we cannot. Indeed, this property has been used to define randomness.
We can abstract the problem of prediction and description of observations to the problem of data compression. Assume there is a set of observations {s_i} for which we would like to obtain the shortest possible description from which we can reconstruct the complete set of observations. If we can infer one value from another, then the set might be compressed by eliminating the inferable values. However, we must make sure that the added information necessary to describe how the inference is to be done is less than the information in the eliminated values. Correlations also enable compression. For example, let us assume that the values are biased ON with a probability P(1) = 0.999 and OFF with a probability P(−1) = 0.001. This means that one in a thousand values is OFF and the others are ON. In this case we can remember which ones are OFF rather than keeping a list of all of the values. We would say they are ON except for numbers 3, 2000, 2403, 5428, etc. This is one way of coding the information. This
method of encoding has a problem in that the numbers representing the locations of the OFF values may become large. They will be correlated, because the first few digits of successive locations will be the same (…, 431236, 432112, 434329, …). We can further reduce the list if we are willing to do some more processing, by giving the intervals between successive OFF values rather than the absolute numbers of their locations.
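The two encodings described above, positions of the rare OFF values and then the intervals between them, can be sketched directly (the 6000-value series below is an illustrative example of mine):

```python
# Store only the positions of the rare OFF (-1) values, then shrink
# the description further by keeping the intervals (deltas) between
# successive positions instead of the absolute positions.
values = [1] * 6000
for pos in (3, 2000, 2403, 5428):
    values[pos] = -1

positions = [i for i, v in enumerate(values) if v == -1]
deltas = [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]

print(positions)  # [3, 2000, 2403, 5428]
print(deltas)     # [3, 1997, 403, 3025]
```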
Ultimately, when we have reached the limits of our ability to infer one observation from another, the rest is information that we need. For example, differential equations are based on the presumption that boundary conditions (initial conditions in time, and boundary conditions in space) are sufficient to predict the behavior of a system. The values of the initial conditions and the boundary conditions are the information we need. This simple model of a system, where information is clearly and simply separated from the problem of computation, is not always applicable.
Let us assume that we have made extensive observations and have separated from these observations a minimal set that then can be used to infer all the rest. A minimal set of information would have the property that no one piece of information in it could be obtained from other pieces of information. Thus, as far as the set itself is concerned, the information appears to be random. Of course we would not be satisfied with any random set; it would have to be this one in particular, because we want to use this information to tell us about all of the actual observations.
One of the difficulties with random numbers is that it is inherently difficult to prove that numbers are random. We may simply not have thought of the right function f that can predict the value of the next number in a sequence from the previous numbers. This may be one of the reasons that gambling is so attractive to people: individuals use “lucky numbers” that they expect to have a better-than-random chance of success. Indeed, it is the success of science to have shown that apparently uncorrelated events may be related, as with the falling of a ball and the motion of the planets. At the same time, science provides a framework in which noncausal correlations, otherwise called superstitions, are rejected.
We have argued that the purpose of knowledge is to succinctly summarize information that can be used for prediction. Thus, in its most abstract form, the problem of deduction or prediction is a problem in data compression. It can thus be argued that science is an exercise in data compression. This is the essence of the principle of Occam’s razor and the importance of simplicity and universality in science. The more universal and the more general a law is, and the simpler it is, the more data compression has been achieved. Often this is considered to relate to how valuable the contribution of the law is to science. Of course, even if the equations are general and simple, if we cannot solve them then they are not particularly useful from a practical point of view. The concept of simplicity has always been poorly defined. While science seeks to discover correlations and simplifications in observations of the universe around us, ultimately the minimum description of a system (i.e., the universe) is given by the number of independent pieces of information required to describe it.
Our understanding of information and computation also enters into a discussion of the models of systems discussed in previous sections. In many of these models, we
assumed the existence of random variables, or random processes. This randomness represents either unknown or complex phenomena. It is important to recognize that this represents an assumption about the nature of correlations between different aspects of the problem that we are modeling. It assumes that the random process is independent of (uncorrelated with) the aspects of the system we are explicitly studying. When we model the random process on a computer by a pseudorandom number generator, we are assuming that the computations in the pseudorandom number generator are also uncorrelated with the system we are studying. These assumptions may or may not be valid, and tests of them are not generally easy to perform.
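One simple test of this kind, necessary but far from sufficient, checks a pseudorandom sequence for serial correlation. The sketch below uses a linear congruential generator with the common Numerical Recipes constants:

```python
# A minimal illustration of testing a pseudorandom number generator for
# correlations: here only the serial correlation of successive values.
# The LCG constants are the well-known Numerical Recipes choice.

def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
    """Return n pseudorandom floats in [0, 1) from a linear congruential generator."""
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x / m)
    return out

def serial_correlation(xs):
    """Sample correlation between successive values x[i] and x[i+1]."""
    n = len(xs) - 1
    mean = sum(xs) / len(xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n)) / n
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return cov / var

xs = lcg(seed=12345, n=10000)
r = serial_correlation(xs)
# For an acceptable generator r should be close to 0; passing this test
# does not prove the sequence is uncorrelated with anything else.
assert abs(r) < 0.05
```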
1.10 Fractals, Scaling and Renormalization
The physics of Newton and the related concepts of calculus, which have dominated scientific thinking for three hundred years, are based upon the understanding that at smaller and smaller scales, both in space and in time, physical systems become simple, smooth and without detail. A more careful articulation of these ideas would note that the fine scale structure of planets, materials and atoms is not without detail. However, for many problems, such detail becomes irrelevant at the larger scale. Since the details are irrelevant, formulating theories in a way that assumes that the detail does not exist yields the same results as a more exact description.
In the treatment of complex systems, including various physical and biological systems, there has been a recognition that the concept of progressive smoothness on finer scales is not always a useful mathematical starting point. This recognition is an important fundamental change in perspective whose consequences are still being explored.
We have already discussed in Section 1.1 the subject of chaos in iterative maps. In chaotic maps, the smoothness of dynamic behavior is violated. It is violated because fine scale details matter. In this section we describe fractals, mathematical models of the spatial structure of systems that have increasing detail on finer scales. Geometric fractals have a self-similar structure, so that the structure on the coarsest scale is repeated on finer length scales. A more general framework in which we can articulate questions about systems with behavior on all scales is that of scaling theory, introduced in Section 1.10.3. One of the most powerful analytic tools for studying systems that have scaling properties is the renormalization group. We apply it to the Ising model in Section 1.10.4, and then return full cycle by applying the renormalization group to chaos in Section 1.10.5. A computational technique, the multigrid method, that enables the description of problems on multiple scales is discussed in Section 1.10.6. Finally, we discuss briefly the relevance of these concepts to the study of complex systems in Section 1.10.7.
1.10.1 Fractals
Traditional geometry is the study of the properties of spaces or objects that have integral dimensions. This can be generalized to allow effective fractional dimensions of objects, called fractals, that are embedded in an integral dimension space. In recent
years the recognition that fractals can play an important role in modeling natural phenomena has fueled a whole area of research investigating the occurrence and properties of fractal objects in physical and biological systems.
Fractals are often defined as geometric objects whose spatial structure is self-similar. This means that by magnifying one part of the object, we find the same structure as of the original object. The object is characteristically formed out of a collection of elements: points, line segments, planar sections or volume elements. These elements exist in a space of the same or higher dimension to the elements themselves. For example, line segments are one-dimensional objects that can be found on a line, plane, volume or higher dimensional space. We might begin to describe a fractal by the objects of which it is formed. However, geometric fractals are often described by a procedure (algorithm) that creates them in an explicitly self-similar manner.
One of the simplest examples of a fractal object is the Cantor set (Fig. 1.10.1). This set is formed by a procedure that starts from a single line segment. We remove the middle third from the segment. There are then two line segments left. We then remove the middle third from both of these segments, leaving four line segments. Continuing iteratively, at the kth iteration there are 2^k segments. The Cantor set, which is the limiting set of points obtained from this process, has no line segments in it. It is self-similar by direct construction, since the left and right thirds of the original line segment can be expanded by a factor of three to appear as the original set.
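The construction can be sketched directly as an iteration on a list of intervals; after k steps there are 2^k segments, each of length 3^(−k):

```python
# The Cantor set construction as described in the text: start from one
# interval and iteratively remove the middle third of every interval.

def cantor_step(intervals):
    """One iteration: replace each interval by its two outer thirds."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

intervals = [(0.0, 1.0)]
for k in range(4):
    intervals = cantor_step(intervals)

# After k iterations there are 2**k segments, each of length 3**-k.
assert len(intervals) == 2**4
assert abs((intervals[0][1] - intervals[0][0]) - 3**-4) < 1e-12
```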
An analog of the Cantor set in two dimensions is the Sierpinski gasket (Fig. 1.10.2). It is constructed from an equilateral triangle by removing an internal triangle which is half of the size of the original triangle. This procedure is then iterated for all of the smaller triangles that result. We can see that no area is left in this shape in the limit. It is self-similar, since each of the three corner triangles can be expanded by a factor of two to appear as the original set.
For self-similar objects, we can obtain the effective fractal dimension directly by considering their composition from parts. We do this by analogy with conventional
Figure 1.10.1 Illustration of the construction of the Cantor set, one of the best-known fractals. The Cantor set is formed by iteratively removing the middle third from a line segment, then the middle third from the two remaining line segments, and so on. Four iterations of the procedure are shown, starting from the complete line segment at the top.
geometric objects which are also self-similar. For example, a line segment, a square, or a cube can be formed from smaller objects of the same type. In general, for a d-dimensional cube, we can form the cube out of smaller cubes. If the size of the smaller cubes is reduced from that of the large cube by a factor of ε, where ε is inversely proportional to their diameter, ε ∝ 1/R, then the number of smaller cubes necessary to form the original is N = ε^d. Thus we could obtain the dimension as:

d = ln(N) / ln(ε)   (1.10.1)

For self-similar fractals we can do the same, where N is the number of parts that make up the whole. Each of the parts is assumed to have the same shape, but reduced in size by a factor of ε from the original object.
We can generalize the definition of fractal dimension so that we can use it to characterize geometric objects that are not strictly self-similar. There is more than one way to generalize the definition. We will adopt an intuitive definition of fractal dimension which is closely related to Eq. (1.10.1). If the object is embedded in d dimensions, we cover the object with d-dimensional disks. This is illustrated in Fig. 1.10.3 for a line segment and a rectangle in a two-dimensional space. If we cover the object with two-dimensional disks of a fixed radius R, using the minimal number of disks possible, the number of these disks changes with the radius of the disks according to the power law:

N(R) ∝ R^(−d)   (1.10.2)

where d is defined as the fractal dimension. We note that the use of disks is only illustrative. We could use squares and the result can be proven to be equivalent.
Figure 1.10.2 The Sierpinski gasket is formed in a similar manner to the Cantor set. Starting from an equilateral triangle, a similar triangle one half the size is removed from the middle, leaving three triangles at the corners. The procedure is then iteratively applied to the remaining triangles. The figure shows the set that results after four iterations of the procedure.
We can use either Eq. (1.10.1) or Eq. (1.10.2) to calculate the dimension of the Cantor set and the Sierpinski gasket. We illustrate the use of Eq. (1.10.2). For the Cantor set, by construction, 2^k disks (or line segments) of radius 1/3^k will cover the set. Thus we can write:

N(R/3^k) = 2^k N(R)   (1.10.3)
Figure 1.10.3 In order to define the dimension of a fractal object, we consider the problem of covering a set with a minimal number of disks of radius R. (a) shows a line segment with three different coverings superimposed. (b) and (c) show a rectangle with two different coverings respectively. As the size of the disks decreases, the number of disks necessary to cover the shape grows as R^(−d). This behavior becomes exact only in the limit R → 0. The fractal dimension defined in this way is sometimes called the box-counting dimension, because d-dimensional boxes are often used rather than disks.
Using Eq. (1.10.2) this is:

(R/3^k)^(−d) = 2^k R^(−d)   (1.10.4)

or:

3^d = 2   (1.10.5)

which is:

d = ln(2) / ln(3) ≅ 0.631   (1.10.6)
We would arrive at the same result more directly from Eq. (1.10.1).
For the Sierpinski gasket, we similarly recognize that the set can be covered by three disks of radius 1/2, nine disks of radius 1/4, and more generally 3^k disks of radius 1/2^k. This gives a dimension of:

d = ln(3) / ln(2) ≅ 1.585   (1.10.7)
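Both dimensions follow from Eq. (1.10.1) with the appropriate counts; a minimal numerical check:

```python
import math

# Eq. (1.10.1): d = ln(N) / ln(scale factor), applied to the two examples
# from the text.

def self_similar_dimension(n_parts, scale_factor):
    return math.log(n_parts) / math.log(scale_factor)

d_cantor = self_similar_dimension(2, 3)      # two parts, each 1/3 the size
d_sierpinski = self_similar_dimension(3, 2)  # three parts, each 1/2 the size

assert abs(d_cantor - 0.631) < 1e-3
assert abs(d_sierpinski - 1.585) < 1e-3
```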
For these fractals there is a deterministic algorithm that is used to generate them. We can also consider a kind of stochastic fractal generated in a similar way, except that at each level the algorithm involves choices made from a probability distribution. The simplest modification of the sets is to assume that at each level a choice is made with equal probability from several possibilities. For example, in the Cantor set, rather than removing the middle third from each of the line segments, we could choose at random which of the three thirds to remove. Similarly for the Sierpinski gasket, we could choose which of the four triangles to remove at each stage. These would be stochastic fractals, since they are described not by a deterministic self-similarity but by a statistical self-similarity. Nevertheless, they would have the same fractal dimension as the deterministic fractals.
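A sketch of the stochastic Cantor construction, confirming that the number and lengths of the surviving segments, and hence the dimension, match the deterministic case:

```python
import random

# A stochastic Cantor set: at each level remove a randomly chosen third
# of every interval instead of always the middle one. The number (2**k)
# and lengths (3**-k) of the surviving segments are the same as for the
# deterministic set, so the fractal dimension is unchanged.

def stochastic_cantor_step(intervals, rng):
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        keep = [0, 1, 2]
        keep.remove(rng.randrange(3))    # drop one third at random
        for i in keep:
            out.append((a + i * third, a + (i + 1) * third))
    return out

rng = random.Random(0)
intervals = [(0.0, 1.0)]
for k in range(5):
    intervals = stochastic_cantor_step(intervals, rng)

assert len(intervals) == 2**5
assert all(abs((b - a) - 3**-5) < 1e-12 for a, b in intervals)
```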
Question 1.10.1 How does the dimension of a fractal, as defined by Eq. (1.10.2), depend on the dimension of the space in which it is embedded?

Solution 1.10.1 The dimension of a fractal is independent of the dimension of the space in which it is embedded. For example, we might start with a d-dimensional space and increase the dimension of the space to d + 1 dimensions. To show that Eq. (1.10.2) is not changed, we form a covering of the fractal by (d + 1)-dimensional spheres whose intersection with the d-dimensional space is the same as the covering we used for the analysis in d dimensions.
Question 1.10.2 Prove that the fractal dimension does not change if we use squares or circles for covering an object.

Solution 1.10.2 Assume that we have minimal coverings of a shape using N_1(R) = c_1 R^(−d_1) squares, and minimal coverings by N_2(R) = c_2 R^(−d_2) circles, with d_1 ≠ d_2. The squares are characterized using R as the length of their side, while the circles are characterized using R as their radius. If d_1 is greater than d_2, then for smaller and smaller R the number of disks becomes arbitrarily smaller than the number of squares. However, we can cover the same shape using squares that circumscribe the disks. The number of these squares is N′_1(R) = c_2 (R/2)^(−d_2). This is impossible, because for small enough R, N′_1(R) will be smaller than N_1(R), which violates the assumption that the latter is a minimal covering. Similarly, if d_1 is less than d_2, we use disks circumscribed around the squares to arrive at a contradiction.
Question 1.10.3 Calculate the fractal dimension of the Koch curve given in Fig. 1.10.4.

Solution 1.10.3 The Koch curve is composed out of four Koch curves reduced in size from the original by a factor of 3. Thus, the fractal dimension is d = ln(4) / ln(3) ≈ 1.2619.
Question 1.10.4 Show that the length of the Koch curve is infinite.

Solution 1.10.4 The Koch curve can be constructed by taking out the middle third of a line segment and inserting two segments equivalent to the one that was removed. They are inserted so as to make an equilateral triangle with the removed segment. Thus, at every iteration of the construction procedure, the length of the perimeter is multiplied by 4/3, which means that it diverges to infinity. It can be proven more generally that any fractal of dimension 2 > d > 1 must have an infinite length and zero area, since these measures of size are for one-dimensional and two-dimensional objects respectively.
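The divergence of the length is immediate from the factor of 4/3 per iteration:

```python
# Each iteration of the Koch construction multiplies the perimeter by 4/3,
# so the length grows without bound as the iterations continue.

def koch_length(k):
    """Length of the curve after k iterations, starting from a unit segment."""
    return (4 / 3) ** k

assert koch_length(0) == 1
assert koch_length(10) > 17      # (4/3)**10 is already about 17.8
assert koch_length(100) > 1e12   # diverges with k
```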
Eq. (1.10.2) neglects the jumps in N(R) that arise as we vary the radius R. Since N(R) can only have integral values, as we lower R and add additional disks there are discrete jumps in its value. It is conventional to define the fractal dimension by taking the limit of Eq. (1.10.2) as R → 0, where this problem disappears. This approach, however, is linked philosophically to the assumption that systems simplify in the limit of small length scales. The assumption here is not that the system becomes smooth and featureless, but rather that the fractal properties will continue to all finer scales and remain ideal. In a physical system, the fractal dimension cannot be taken in this limit. Thus, we should allow the definition to be applied over a limited domain of length scales, as is appropriate for the problem. As long as the domain of length scales is large, we can use this definition. We then solve the problem of discrete jumps by treating the leading behavior of the function N(R) over this domain.
The problem of treating distinct dimensions at different length scales is only one of the difficulties that we face in discussing fractal systems. Another problem is inhomogeneity. In the following section we discuss objects that are inherently inhomogeneous but for which an alternate natural definition of dimension can be devised to describe their structure on all scales.
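Extracting d from the leading behavior of N(R) over a finite range of scales amounts to fitting the slope of ln N against ln(1/R); here with the exact Cantor-set covering counts:

```python
import math

# Estimating a fractal dimension over a limited domain of scales, as the
# text suggests: fit the leading power-law behavior of N(R) instead of
# taking R -> 0. Here we use the exact Cantor-set covering counts
# N(3**-k) = 2**k over a finite range of k.

def fit_dimension(radii, counts):
    """Least-squares slope of ln N against ln(1/R)."""
    xs = [math.log(1 / r) for r in radii]
    ys = [math.log(n) for n in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

radii = [3.0**-k for k in range(2, 8)]
counts = [2**k for k in range(2, 8)]
d = fit_dimension(radii, counts)
assert abs(d - math.log(2) / math.log(3)) < 1e-9
```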
1.10.2 Trees
Iterative procedures like those used to make fractals can also be used to make geometric objects called trees. An example of a geometric tree, which bears vague resemblance to physical trees, is shown in Fig. 1.10.5. The tree is formed by starting with a single object (a line segment), scaling it by a factor of 1/2, duplicating it two times and attaching the parts to the original object at its boundary. This process is then iterated for each of the resulting parts. The iterations create structure on finer and finer scales.
Figure 1.10.4 Illustration of the starting line segment and four successive stages in the formation of the Koch curve. For further discussion see Questions 1.10.3 and 1.10.4.
We can generalize the definition of a tree to be a set formed by iteratively adding to an object copies of itself. At iteration t, the added objects are reduced in size by a factor ε_t and duplicated N_t times, the duplicated versions being rotated and then shifted by vectors whose lengths converge to zero as a function of t. A tree is different from a fractal because the smaller versions of the original object are not contained within the original object.
The fractal dimension of trees is not as straightforward as it is for self-similar fractals. The effective fractal dimension can be calculated; however, it gives results that are not intuitively related to the tree structure. We can see why this is a problem in Fig. 1.10.6. The dimension of the region of the tree which is above the size R is that of the embedded entity (line segments), while the fractal dimension of the region which is less than the size R is determined by the spatial structure of the tree. Because of the changing value of R in the scaling relation, an intermediate value for the fractal dimension would typically be found by a direct calculation (Question 1.10.5).
It is reasonable to avoid this problem by classifying trees in a different category than fractals. We can define the tree dimension by considering the self-similarity of the tree structure using the same formula as Eq. (1.10.1), but now applying the definition to the number N and scaling of the displaced parts of the generating structure, rather than the embedded parts as in the fractal. In Section 1.10.7 we will encounter a treelike structure; however, it will be more useful to describe it rather than to give a dimension that might characterize it.
Question 1.10.5 A simple version of a tree can be constructed as the set of points {1/k}, where k takes all positive integer values. The tree dimension
Figure 1.10.5 A geometric tree formed by an iterative algorithm similar to those used in forming fractals. This tree can be formed starting from a single line segment. Two copies of it are then reduced by a factor of 2, rotated by 45° left and right and attached at one end. The procedure is repeated for each of the resulting line segments. Unlike a fractal, a tree is not solely composed out of parts that are self-similar. It is formed out of self-similar parts, along with the original shape, its trunk.
of this set is zero because it can be formed from a point which is duplicated and then displaced by progressively smaller vectors. Calculate the fractal dimension of this set.
Solution 1.10.5 We construct a covering of scale R from line segments of this length. The covering that we construct will be formed out of two parts. One part is constructed from segments placed side by side. This part starts from zero and covers infinitely many points of the set. The other part is constructed from segments that are placed on individual points. The crossing point between the two sets can be calculated as the value of k where the difference between successive points is R. For k below this value, it is not possible to include more than one point in one line segment. For k above this value, there are two or more points per line segment. The critical value of k is found by setting:

1/k_c − 1/(k_c + 1) = 1/(k_c(k_c + 1)) ≈ 1/k_c^2 = R   (1.10.8)

or k_c = R^(−1/2). This means that the number of segments needed to cover individual points is given by this value. Also, the number of segments that are placed side by side must be enough to go up to this point, which has the value 1/k_c. This number of segments is given by:

1/(k_c R) = R^(−1/2) ≈ k_c   (1.10.9)
Figure 1.10.6 Illustration of the covering of a geometric tree by disks. The covering shows that the larger scale structures of the tree (the trunk and first branches in this case) have an effective dimension given by the dimension of their components. The smaller scale structures have a dimension that is determined by the algorithm used to make the tree. This inhomogeneity implies that the fractal dimension is not always the natural way to describe the tree.
Thus we must cover the line segment up to the point R^(1/2) with R^(−1/2) line segments, and use an additional R^(−1/2) line segments to cover the rest of the points. This gives a total number of line segments in a covering of 2R^(−1/2). The fractal dimension is thus d = 1/2.
We could have used fewer line segments in the covering by covering pairs of points and triples of points rather than covering the whole line segment below 1/k_c. However, each partial covering of the set that is concerned with pairs, triples and so on consists of a number of segments that grows as R^(−1/2). Thus our conclusion remains unchanged by this correction.
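The counting argument can be checked numerically with a greedy covering of the points {1/k}; local slope estimates of ln N(R) against ln(1/R) should approach d = 1/2:

```python
import math

# Numerical check of Solution 1.10.5: greedily cover the points {1/k}
# with segments of length R and verify that the number of segments
# grows as R**-0.5, i.e. fractal dimension 1/2. The set is truncated
# at k = 500000, which is harmless for the scales R used below.

def cover_count(points, R):
    """Number of length-R segments in a greedy covering of sorted points."""
    count, i = 0, 0
    while i < len(points):
        count += 1
        end = points[i] + R
        while i < len(points) and points[i] <= end:
            i += 1
    return count

points = sorted(1 / k for k in range(1, 500_001))
slopes = []
for R in (1e-4, 1e-5, 1e-6):
    n1 = cover_count(points, R)
    n2 = cover_count(points, R / 10)
    slopes.append(math.log(n2 / n1) / math.log(10))  # local estimate of d

assert all(abs(s - 0.5) < 0.1 for s in slopes)
```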
Trees illustrate only one example of how system properties may exist on many scales, but are not readily described as fractals in the conventional sense. In order to generalize our concepts to enable the discussion of such properties, we will introduce the concept of scaling.
1.10.3 Scaling
Geometric fractals suggest that systems may have a self-similar structure on all length scales. This is in contrast with the more typical approach of science, where there is a specific scale at which a phenomenon appears. We can think about the problem of describing the behavior of a system on multiple length scales in an abstract manner. A phenomenon (e.g., a measurable quantity) may be described by some function of scale, f(x). Here x represents the characteristic scale rather than the position. When there is a well-defined length scale at which a particular effect occurs, for longer length scales the function would typically decay exponentially:

f(x) ∼ e^(−x/λ)   (1.10.10)

This functional dependence implies that the characteristic scale at which this property disappears is given by λ.
In order for a system property to be relevant over a large range of length scales, it must vary more gradually than exponentially. In such cases, typically, the leading behavior is a power law:

f(x) ∼ x^α   (1.10.11)

A function that follows such power-law behavior can also be characterized by the scaling rule:

f(ax) = a^α f(x)   (1.10.12)

This means that if we characterize the system on one scale, then on a scale that is larger by the factor a it has a similar appearance, but scaled by the factor a^α. α is called the scaling exponent. In contrast to the behavior of an exponential, for a power law there is no particular length at which the property disappears. Thus, it may extend over a wide range of length scales. When the scaling exponent is not an integer, the function f(x) is nonanalytic. Non-analyticity is often indicative of a property that cannot be treated by assuming that it becomes smooth on small or large scales. However, fractional scaling exponents are not necessary in order for power-law scaling to be applicable.
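A quick check of the scaling rule, and of reading the exponent off from measurements at two scales (the constants below are arbitrary):

```python
import math

# A power law f(x) = c * x**alpha obeys the scaling rule of Eq. (1.10.12),
# f(a*x) = a**alpha * f(x). The constants c and alpha here are arbitrary
# illustrative choices.

def f(x, c=2.5, alpha=0.75):
    return c * x**alpha

a = 10.0
for x in (0.3, 1.0, 7.0):
    assert abs(f(a * x) - a**0.75 * f(x)) < 1e-9

# The exponent can be recovered from measurements at two scales:
alpha_est = math.log(f(a * 1.0) / f(1.0)) / math.log(a)
assert abs(alpha_est - 0.75) < 1e-12
```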
Even when a system property follows power-law scaling, the same behavior cannot continue over arbitrarily many length scales. The disappearance of a certain power law may occur because of the appearance of a new behavior on a longer scale. This change is characterized by a crossover in the scaling properties of f(x). An example of crossover occurs when we have a quantity whose scaling behavior is:

f(x) ∼ A_1 x^(α_1) + A_2 x^(α_2)   (1.10.13)

If A_1 > A_2 and α_1 < α_2, then the first term will dominate at smaller length scales, and the second at larger length scales. Alternatively, the power-law behavior may eventually succumb to exponential decay at some length scale.
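The crossover can be seen in the local logarithmic slope of f(x), which moves from the smaller exponent at small x to the larger one at large x (the coefficients below are arbitrary):

```python
import math

# Crossover as in Eq. (1.10.13): f(x) = A1*x**a1 + A2*x**a2 with A1 > A2
# and a1 < a2. The local logarithmic slope d(ln f)/d(ln x) moves from a1
# at small x to a2 at large x. The numbers here are illustrative.

A1, A2, a1, a2 = 10.0, 0.1, 0.5, 2.0

def f(x):
    return A1 * x**a1 + A2 * x**a2

def local_slope(x, eps=1e-3):
    """Finite-difference estimate of the logarithmic slope at x."""
    return (math.log(f(x * (1 + eps))) - math.log(f(x))) / math.log(1 + eps)

assert abs(local_slope(1e-6) - a1) < 0.01   # small-x regime: exponent a1
assert abs(local_slope(1e+6) - a2) < 0.01   # large-x regime: exponent a2
```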
There are three related approaches to applying the concept of scaling in model or physical systems. The first approach is to consider the scale x to be the physical size of the system, or the amount of matter it contains. The quantity f(x) is then a property of the system measured as the size of the system changes. The second approach is to keep the system the same, but vary the scale of our observation. We assume that our ability to observe the system has a limited degree of discernment of fine details, a finest scale of observation. Finer details are to be averaged over or disregarded. By moving toward or away from the system, we change the physical scale at which our observation can no longer discern details. x then represents the smallest scale at which we can observe variation in the system structure. Finally, in the third approach we consider the relationship between a property measured at one location in the system and the same property measured at another location separated by the distance x. The function f(x) is a correlation of the system measurements as a function of the distance between the regions that are being considered.
Examples of quantities that follow scaling relations as a function of system size are the extensive properties of thermodynamic systems (Section 1.3), such as the energy, entropy, free energy, volume, number of particles and magnetization:

U(ax) = a^d U(x)   (1.10.14)

These properties measure quantities of the whole system as a function of system size. All have the same scaling exponent, the dimension of space. Intrinsic thermodynamic quantities are independent of system size and therefore also follow a scaling behavior where the scaling exponent is zero.
Another example of scaling can be found in the random walk (Section 1.2). We can generalize the discussion in Section 1.2 to allow a walk in d dimensions by choosing steps which are ±1 in each dimension independently. A random walk of N steps in three dimensions can be thought of as a simple model of a molecule formed as a chain of molecular units, a polymer. If we measure the average distance between the ends of the chain as a function of the number of steps, R(N), we have the scaling relation:

R(aN) = a^(1/2) R(N)   (1.10.15)

This scaling of distance traveled in a random walk with the number of steps taken is independent of dimension. We will consider random walks and other models of polymers in Chapter 5.
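A simulation sketch of Eq. (1.10.15): quadrupling the number of steps should double the root-mean-square end-to-end distance:

```python
import math
import random

# Check Eq. (1.10.15) numerically: the RMS end-to-end distance of a random
# walk with +/-1 steps in each of three dimensions grows as N**(1/2).
# The walk counts here are small illustrative choices.

def rms_distance(n_steps, n_walks, rng):
    total = 0.0
    for _ in range(n_walks):
        x = y = z = 0
        for _ in range(n_steps):
            x += rng.choice((-1, 1))
            y += rng.choice((-1, 1))
            z += rng.choice((-1, 1))
        total += x * x + y * y + z * z
    return math.sqrt(total / n_walks)

rng = random.Random(0)
r1 = rms_distance(100, 400, rng)
r2 = rms_distance(400, 400, rng)
# Quadrupling N should roughly double the RMS distance: R(aN) = a**0.5 * R(N).
assert 1.7 < r2 / r1 < 2.3
```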
Often our interest is in knowing how different parts of the system affect each other. Direct interactions do not always reflect the degree of influence. In complex systems, in which many elements are interacting with each other, there are indirect means of interacting that transfer influence between one part of a system and another. The simplest example is the Ising model, where even short-range interactions can lead to longer-range correlations in the magnetization. The correlation function introduced in Section 1.6.5 measures the correlations between different locations. These correlations show the degree to which the interactions couple the behavior of different parts of the system. Correlations of behavior occur in both space and time. As we mentioned in Section 1.3.4, near a second-order phase transition there are correlations between different places and times on every length and time scale, because they follow a power-law behavior. This example will be discussed in greater detail in the following section.
Our discussion of scaling also finds application in the theory of computation
(Section 1.9) and the practical aspects of simulation (Section 1.7). In addition to the
question of computability discussed in Section 1.9, we can also ask how hard it is to
compute something. Such questions are generally formulated by describing a class of
problems that can be ordered by a parameter N that describes the size of the problem.
The objective of the theory of computational complexity is to determine how the
number of operations necessary to solve a problem grows with N. A scaling analysis
can also be used to compare different algorithms that may solve the same problem.
We are often primarily concerned with the scaling behavior (exponential, power law
and the value of the scaling exponent) rather than the coefficients of the scaling behavior,
because in the comparison of the difficulty of solving different problems or
different methodologies this is often, though not always, the most important issue.
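The scaling exponent of an algorithm can be estimated directly by counting operations at two problem sizes. The sketch below (illustrative Python; the algorithms and sizes are arbitrary choices, not from the text) compares insertion sort, whose operation count grows as N^2, with merge sort, which grows as N log N:

```python
import math
import random

def insertion_sort_ops(a):
    """Count the element moves insertion sort makes (grows as N**2)."""
    a = list(a)
    ops = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
            ops += 1
    return ops

def merge_sort_ops(a):
    """Count the comparisons merge sort makes (grows as N log N)."""
    def sort(x):
        if len(x) <= 1:
            return x, 0
        mid = len(x) // 2
        left, n_left = sort(x[:mid])
        right, n_right = sort(x[mid:])
        merged, i, j, ops = [], 0, 0, n_left + n_right
        while i < len(left) and j < len(right):
            ops += 1
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged += left[i:] + right[j:]
        return merged, ops
    return sort(list(a))[1]

rng = random.Random(7)
small = [rng.random() for _ in range(1000)]
large = [rng.random() for _ in range(2000)]
# estimate the exponent from the growth of the operation count when N doubles
exp_ins = math.log2(insertion_sort_ops(large) / insertion_sort_ops(small))
exp_merge = math.log2(merge_sort_ops(large) / merge_sort_ops(small))
print(exp_ins, exp_merge)  # near 2, and slightly above 1
```

The estimated exponents distinguish the two algorithms regardless of the constant coefficients in front of their operation counts, which is the point made in the text.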
1.10.4 Renormalization group

General method The renormalization group is a formalism for studying the scaling
properties of a system. It starts by assuming a set of equations that describe the
behavior of a system. We then change the length scale at which we are describing the
system. In effect, we assume that we have a finite ability to see details. By moving
away from a system, we lose some of the detail. At the new scale we assume that the
same set of equations can be applied, but possibly with different coefficients. The
objective is to relate the set of equations at one scale to the set of equations at the
other scale. Once this is achieved, the scale-dependent properties of the system can
be inferred.

Applications of the renormalization group method have been largely to the study
of equilibrium systems, particularly near second-order phase transitions where mean
field approaches break down (Section 1.6). The premise of the renormalization group
is that exactly at a second-order phase transition, the equations describing the system
are independent of scale. In recent years, dynamic renormalization theory has been
developed to describe systems that evolve in time. In this section we will describe the
more conventional renormalization group for thermodynamic systems.
We illustrate the concepts of renormalization using the Ising model. The Ising
model, discussed in Section 1.6, describes the interactions of spins on a lattice. It is a
first model of any system that exhibits simple cooperative behavior, such as a magnet.

In order to appreciate the concept of renormalization, it is useful to recognize that
the Ising model is not a true microscopic theory of the behavior of a magnet. It might
seem that there is a well-defined way to identify an individual spin with a single electron
at the atomic level. However, this is far from apparent when equations that describe
quantum mechanics at the atomic level are considered. Since the relationship between the
microscopic system and the spin model is not manifest, it is clear that our description of the
magnet using the Ising model relies upon the macroscopic properties of the model rather
than its microscopic nature. Statistical mechanics does not generally attempt to derive
macroscopic properties directly from microscopic reality. Instead, it attempts to describe
the macroscopic phenomena from simple models. We might not give up hope of identifying
a specific microscopic relationship between a particular material and the Ising
model; however, the use of the model does not rely upon this identification.

Essential to this approach is that many of the details of the atomic regime are
somehow irrelevant at longer length scales. We will return later to discuss the relevance
or irrelevance of microscopic details. However, our first question is: What is a
single spin variable? A spin variable represents the effective magnetic behavior of a region
of the material. There is no particular reason that we should imagine an individual
spin variable as representing a small or a large region of the material.
Sometimes it might be possible to consider the whole magnet as a single spin in an
external field. Identifying the spin with a region of the material of a particular size is
an assignment of the length scale at which the model is being applied.
What is the difference between an Ising model describing the system at one
length scale and the Ising model describing it on another? The essential point is that
the interactions between spins will be different depending on the length scale at which
we choose to model the system. The renormalization group takes this discussion one
step further by explicitly relating the models at different scales.

In Fig. 1.10.7 we illustrate an Ising model in two dimensions. There is a second
Ising model that is used to describe this same system but on a length scale that is twice
as big. The first Ising model is described by the energy function (Hamiltonian):

E[{s_i}] = −c Σ_i 1 − h Σ_i s_i − J Σ_{<ij>} s_i s_j    (1.10.16)

For convenience, in what follows we have included a constant energy term −cN = −c Σ_i 1.
This term does not affect the behavior of the system; however, its variation from scale
to scale should be included. The second Ising model is described by the Hamiltonian

E′[{s′_i}] = −c′ Σ_i 1 − h′ Σ_i s′_i − J′ Σ_{<ij>} s′_i s′_j    (1.10.17)

where both the variables and the coefficients have primes. While the first model has
N spins, the second model has N′ spins. Our objective is to relate these two models.
The general process is called renormalization. When we go from the fine scale to the
coarse scale by eliminating spins, the process is called decimation.
Figure 1.10.7 Schematic illustration of two Ising models in two dimensions. The spins are
indicated by arrows that can be UP or DOWN. These Ising models illustrate the modeling of a
system with different levels of detail. In the upper model there are one-fourth as many spins
as in the lower model. In a renormalization group treatment the parameters of the lower
model are related to the parameters of the upper model so that the same system can be described
by both. Each of the spins in the upper model, in effect, represents four spins in the
lower model. The interactions between adjacent spins in the upper model represent the net
effect of the interactions between groups of four spins in the lower model.
There are a variety of methods used for relating models at different scales. Each
of them provides a distinct conceptual and practical approach. While in principle they
should provide the same answer, they are typically approximated at some stage of the
calculation and therefore the answers need not be the same. All the approaches we describe
rely upon the partition function to enable direct connection from the microscopic
statistical treatment to the macroscopic thermodynamic quantities. For a particular
system, the partition function can be written so that it has the same value,
independent of which representation is used:

Z = Σ_{{s_i}} e^{−βE[{s_i}]} = Σ_{{s′_i}} e^{−βE′[{s′_i}]}    (1.10.18)

It is conventional and convenient when performing renormalization transformations
to set β = 1/kT = 1. Since β multiplies each of the parameters of the energy
function, it is a redundant parameter. It can be reinserted at the end of the calculations.
The different approaches to renormalization are useful for various models that
can be studied. We will describe three of them in the following paragraphs because of
the importance of the different conceptual treatments. The three approaches are (1)
summing over values of a subset of the spins, (2) averaging over a local combination
of the spins, and (3) summing over the short wavelength degrees of freedom in a
Fourier space representation.

1. Summing over values of a subset of the spins. In the first approach we consider
the spins on the larger scale to be a subset of the spins on the finer scale. To find
the energy of interaction between the spins on the larger scale we need to eliminate
(decimate) some of the spins and replace them by new interactions between
the spins that are left. Specifically, we identify the larger scale spins as corresponding
to a subset {s_i}_A of the smaller scale spins. The rest of the spins {s_i}_B
must be eliminated from the fine scale model to obtain the coarse scale model.
We can implement this directly by using the partition function:

e^{−E′[{s′_i}]} = Σ_{{s_i}_B} e^{−E[{s′_i}_A, {s_i}_B]} = Σ_{{s_i}} e^{−E[{s_i}]} Π_{i∈A} δ_{s′_i, s_i}    (1.10.19)

In this equation we have identified the spins on the larger scale as a subset of the
finer scale spins and have summed over the finer scale spins to obtain the effective
energy for the larger scale spins.
2. Averaging over a local combination of the spins. We need not identify a particular
spin of the finer scale with a particular spin of the coarser scale. We can choose
to identify some function of the finer scale spins with the coarse scale spin. For
example, we can identify the majority rule of a certain number of fine scale spins
with the coarse scale spins:

e^{−E′[{s′_i}]} = Σ_{{s_i}} e^{−E[{s_i}]} Π_{i∈A} δ_{s′_i, sign(Σ s_i)}    (1.10.20)
This is easier to think about when an odd number of spins are being renormalized
to become a single spin. Note that this is quite similar to the concept of defining a
collective coordinate that we used in Section 1.4 in discussing the two-state system.
The difference here is that we are defining a collective coordinate out of only a few
original coordinates, so that the reduction in the number of degrees of freedom is
comparatively small. Note also that by convention we continue to use the term
"energy," rather than "free energy," for the collective coordinates.
3. Summing over the short wavelength degrees of freedom in a Fourier space representation.
Rather than performing the elimination of spins directly, we may
recognize that our procedure is having the effect of removing the fine scale variation
in the problem. It is natural then to consider a Fourier space representation
where we can remove the rapid changes in the spin values by eliminating the
higher Fourier components. To do this we need to represent the energy function
in terms of the Fourier transform of the spin variables:

s_k = Σ_i e^{ikx_i} s_i    (1.10.21)

Writing the Hamiltonian in terms of the Fourier transformed variables, we then
sum over the values of the high frequency terms:

e^{−E′[{s_k}]} = Σ_{{s_k}, k>k_0} e^{−E[{s_k}]}    (1.10.22)

The remaining coordinates s_k have k < k_0.
All of the approaches described above typically require some approximation in
order to perform the analysis. In general there is a conservation of effort in that the
same difficulties tend to arise in each approach, but with different manifestation. Part
of the reason for the difficulties is that the Hamiltonian we use for the Ising model is
not really complete. This means that there can be other parameters that should be included
to describe the behavior of the system. We will see this by direct application in
the following examples.
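The second approach, the majority rule of Eq. (1.10.20), can be made concrete with a small sketch (illustrative Python; the 9×9 lattice and 3×3 blocks are arbitrary choices, not from the text) that replaces each block of fine scale spins by a single coarse scale spin equal to the sign of their sum:

```python
import random

def majority_coarse_grain(spins, b=3):
    """Majority rule: each b-by-b block of +/-1 spins becomes one coarse
    spin, the sign of the block's sum; odd b avoids ties."""
    m = len(spins) // b
    coarse = [[0] * m for _ in range(m)]
    for bi in range(m):
        for bj in range(m):
            block_sum = sum(spins[bi * b + di][bj * b + dj]
                            for di in range(b) for dj in range(b))
            coarse[bi][bj] = 1 if block_sum > 0 else -1
    return coarse

rng = random.Random(0)
fine = [[rng.choice((-1, 1)) for _ in range(9)] for _ in range(9)]
coarse = majority_coarse_grain(fine)
print(len(fine), "->", len(coarse))  # 9 -> 3: one coarse spin per block
```

A fully magnetized lattice stays fully magnetized under this map, which is what makes the majority rule a sensible collective coordinate for the coarser description.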
Ising model in one dimension We illustrate the basic concepts by applying the
renormalization group to a one-dimensional Ising model where the procedure can
be done exactly. It is convenient to use the first approach (number 1 above) of
identifying a subset of the fine scale spins with the larger scale model. We start with
the case where there is an interaction between neighboring spins, but no magnetic
field:

E[{s_i}] = −c Σ_i 1 − J Σ_{<ij>} s_i s_j    (1.10.23)

We sum the partition function over the odd spins to obtain

Z = Σ_{{s_i}_even} Σ_{{s_i}_odd} e^{c Σ_i 1 + J Σ_i s_i s_{i+1}} = Σ_{{s_i}_even} Π_{i even} 2cosh(J(s_i + s_{i+2})) e^{2c}    (1.10.24)
We equate this to the energy for the even spins by themselves, but with primed
quantities:

Z = Σ_{{s_i}_even} e^{c′ Σ_i 1 + J′ Σ_i s_i s_{i+2}} = Σ_{{s_i}_even} Π_{i even} 2cosh(J(s_i + s_{i+2})) e^{2c}    (1.10.25)

This gives:

e^{c′ + J′ s_i s_{i+2}} = 2cosh(J(s_i + s_{i+2})) e^{2c}    (1.10.26)
or
c′ + J′ s_i s_{i+2} = ln(2cosh(J(s_i + s_{i+2}))) + 2c    (1.10.27)
Inserting the two distinct combinations of values of s_i and s_{i+2} (s_i = s_{i+2} and s_i = −s_{i+2}),
we have:
c′ + J′ = ln(2cosh(2J)) + 2c
c′ − J′ = ln(2cosh(0)) + 2c = ln(2) + 2c    (1.10.28)
Solving these equations gives the primed quantities for the larger scale model as:

J′ = (1/2) ln(cosh(2J))
c′ = 2c + (1/2) ln(4cosh(2J))    (1.10.29)

This is the renormalization group relationship that we have been looking for. It relates
the values of the parameters in the two different energy functions at the different
scales.
While it may not be obvious by inspection, this iterative map always causes J to
decrease. We can see this more easily if we transform the relationship of J to J′ to the
equivalent form:

tanh(J′) = tanh(J)^2    (1.10.30)
This means that on longer and longer scales the effective interaction between neighboring
spins becomes smaller and smaller. Eventually the system on long scales behaves
as a string of decoupled spins.
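Both forms of the map are easy to check numerically. The sketch below (illustrative Python; the starting values of J are arbitrary choices) verifies that Eq. (1.10.29) and Eq. (1.10.30) agree, and iterates the map to show J flowing toward the decoupled limit:

```python
import math

def renormalize_J(J):
    """One decimation step of the 1-D Ising model, Eq. (1.10.29)."""
    return 0.5 * math.log(math.cosh(2 * J))

# Equivalent form, Eq. (1.10.30): tanh(J') = tanh(J)**2
for J in (0.3, 0.7, 1.3):
    assert abs(math.tanh(renormalize_J(J)) - math.tanh(J) ** 2) < 1e-12

# Iterating the map: any finite J flows toward the decoupled limit J = 0
J = 2.0
for step in range(10):
    J = renormalize_J(J)
print(J)  # very small after ten decimations
```

The identity follows from e^{2J′} = cosh(2J), so tanh(J′) = (cosh(2J) − 1)/(cosh(2J) + 1) = tanh(J)^2, and since |tanh(J)| < 1 the coupling shrinks at every step.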
The analysis of the one-dimensional Ising model can be extended to include a
magnetic field. The decimation step becomes:

Z = Σ_{{s_i}_even} Σ_{{s_i}_odd} e^{c Σ_i 1 + h Σ_i s_i + J Σ_i s_i s_{i+1}} = Σ_{{s_i}_even} Π_{i even} 2cosh(h + J(s_i + s_{i+2})) e^{2c}    (1.10.31)

We equate this to the coarse scale partition function:

Z = Σ_{{s′_i}} e^{c′ Σ_i 1 + h′ Σ_i s′_i + J′ Σ_i s′_i s′_{i+1}}    (1.10.32)

which requires that:
c′ + h′ + J′ = h + ln(2cosh(h + 2J)) + 2c
c′ − J′ = ln(2cosh(h)) + 2c    (1.10.33)
c′ − h′ + J′ = −h + ln(2cosh(h − 2J)) + 2c

We solve these equations to obtain:

c′ = 2c + (1/4) ln(16 cosh(h + 2J) cosh(h − 2J) cosh(h)^2)
J′ = (1/4) ln(cosh(h + 2J) cosh(h − 2J) / cosh(h)^2)    (1.10.34)
h′ = h + (1/2) ln(cosh(h + 2J) / cosh(h − 2J))
which is the desired renormalization group transformation. The renormalization
transformation is an iterative map in the parameter space (c, h, J).

We can show what happens in this iterative map using a plot of changes in the
values of J and h at a particular value of these parameters. Such a diagram of flows in
the parameter space is illustrated in Fig. 1.10.8. We can see from the figure or from Eq.
(1.10.34) that there is a line of fixed points of the iterative map at J = 0 with arbitrary
Figure 1.10.8 The renormalization transformation for the one-dimensional Ising model is illustrated
as an iterative flow diagram in the two-dimensional (h, J) parameter space. Each of
the arrows represents the effect of decimating half of the spins. We see that after a few iterations
the value of J becomes very small. This indicates that the spins become decoupled from
each other on a larger scale. The absence of any interaction on this scale means that there is
no phase transition in the one-dimensional Ising model.
value of h. This simply means that the spins are decoupled. For J = 0 on any scale, the
behavior of the spins is determined by the value of the external field.
The line of fixed points at J = 0 is a stable (attracting) set of fixed points. The
flow lines of the iterative map take us to these fixed points on the attractor line. In
addition, there is an unstable fixed point at J = ∞. This would correspond to a
strongly coupled line of spins, but since this fixed point is unstable it does not describe
the large scale behavior of the model. For any finite value of J, changing the
scale rapidly causes the value of J to become small. This means that the large scale
behavior is always that of a system with J = 0.
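This flow can be exhibited by iterating Eq. (1.10.34) directly. In the sketch below (illustrative Python; the starting point (c, h, J) = (0, 0.1, 1.0) is an arbitrary choice), the coupling J renormalizes to zero while the field h settles to a scale-independent value:

```python
import math

def rg_step(c, h, J):
    """One decimation step of the 1-D Ising model in a field, Eq. (1.10.34)."""
    cp = math.cosh(h + 2 * J)
    cm = math.cosh(h - 2 * J)
    c0 = math.cosh(h)
    c_new = 2 * c + 0.25 * math.log(16 * cp * cm * c0 ** 2)
    J_new = 0.25 * math.log(cp * cm / c0 ** 2)
    h_new = h + 0.5 * math.log(cp / cm)
    return c_new, h_new, J_new

c, h, J = 0.0, 0.1, 1.0
for step in range(12):
    c, h, J = rg_step(c, h, J)
print(h, J)  # J has flowed to essentially zero; h has settled at a fixed value
```

Setting h = 0 in this map reproduces the zero-field result J′ = (1/2) ln(cosh(2J)) of Eq. (1.10.29), which is a useful consistency check.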
Ising model in two dimensions In the one-dimensional case treated in the previous
section, the renormalization group works perfectly and is also, from the point
of view of studying phase transitions, uninteresting. We will now look at two dimensions,
where the renormalization group must be approximated and where there is also
a phase transition.

We can simplify our task in two dimensions by eliminating half of the spins (Fig.
1.10.9) instead of three out of four spins as illustrated previously in Fig. 1.10.7.
Eliminating half of the spins causes the square cell to be rotated by 45°, but this should
not cause any problems. Labeling the spins as in Fig. 1.10.9 we write the decimation
step for a Hamiltonian with h = 0:

Z = Σ_{{s_i}_A} Σ_{{s_i}_B} e^{c Σ_i 1 + J Σ_i s_0(s_1 + s_2 + s_3 + s_4)} = Σ_{{s_i}_A} Π_{i∈B} 2cosh(J(s_1 + s_2 + s_3 + s_4)) e^{c} = Σ_{{s_i}_A} Π_{i∈B} e^{c′ + (J′/2)(s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1)}    (1.10.35)
In the last expression we take into consideration that each bond of the form s_1 s_2 appears
in two squares and each spin appears in four squares.
In order to solve Eq. (1.10.35) for the values of c′ and J′ we must insert all possible
values of the spins (s_1, s_2, s_3, s_4). However, this leads to a serious problem.
There are four distinct equations that arise from the different values of the spins.
This is reduced from 2^4 = 16 because, by symmetry, inverting all of the spins gives
the same answer. The problem is that while there are four equations, there are
only two unknowns to solve for, c′ and J′. The problem can be illustrated by recognizing
that there are two distinct ways to have two spins UP and two spins
DOWN. One way is to have the spins that are the same be adjacent to each other,
and the other way is to have them be opposite each other across a diagonal. The
two ways give the same result for the value of (s_1 + s_2 + s_3 + s_4) but different results
for (s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1).
In order to solve this problem, we must introduce additional parameters which
correspond to other interactions in the Hamiltonian. To be explicit, we would make a
table of symmetry-related combinations of the four spins as follows:
(s_1, s_2, s_3, s_4)                        (1,1,1,1)   (1,1,1,−1)   (1,1,−1,−1)   (1,−1,1,−1)
1                                               1           1            1             1
(s_1 + s_2 + s_3 + s_4)                         4           2            0             0
(s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1)         4           0            0            −4        (1.10.36)
(s_1 s_3 + s_2 s_4)                             2           0           −2             2
s_1 s_2 s_3 s_4                                 1          −1            1             1
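The counting can be verified by direct enumeration. The sketch below (illustrative Python) evaluates these combinations for all 16 configurations of (s_1, s_2, s_3, s_4), identifies each configuration with its globally inverted partner, and recovers the four distinct columns of Eq. (1.10.36):

```python
from itertools import product

def invariants(s):
    """The symmetry-related spin combinations tabulated in Eq. (1.10.36)."""
    s1, s2, s3, s4 = s
    return (1,
            s1 + s2 + s3 + s4,
            s1 * s2 + s2 * s3 + s3 * s4 + s4 * s1,
            s1 * s3 + s2 * s4,
            s1 * s2 * s3 * s4)

classes = set()
for s in product((1, -1), repeat=4):
    inverted = tuple(-x for x in s)
    # inverting all spins gives the same equation, so keep one representative
    classes.add(max(invariants(s), invariants(inverted)))

for row in sorted(classes, reverse=True):
    print(row)  # four distinct classes, matching the columns of the table
```

The two zero-sum columns differ in the ring term (s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1), which is exactly why the two-parameter form of Eq. (1.10.35) cannot satisfy all four equations at once.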
In order to make use of these to resolve the problems with Eq. (1.10.35), we must introduce
new interactions in the Hamiltonian and new parameters that multiply them.
This leads to second-neighbor interactions (across a cell diagonal), and four spin interactions
around a square:

E[{s_i}] = −c Σ_i 1 − J Σ_{<ij>} s_i s_j − K Σ_{<<ij>>} s_i s_j − L Σ_{<ijkl>} s_i s_j s_k s_l    (1.10.37)

where the notation <<ij>> indicates second-neighbor spins across a square diagonal,
and <ijkl> indicates spins around a square. This might seem to solve our problem.
However, we started out from a Hamiltonian with only two parameters, and now we
are switching to a Hamiltonian with four parameters. To be self-consistent, we should
start from the same set of parameters we end up with. When we start with the additional
parameters K and L this will, however, lead to still further terms that should be
included.
Figure 1.10.9 In a renormalization treatment of the two-dimensional
Ising model it is possible to decimate one out of two spins as illustrated in
this figure. The decimated spin s_0 and its neighbors s_1, s_2, s_3, s_4 are
labeled as used in Eq. (1.10.35). The black dots represent spins that remain in
the larger-scale model, and the white dots represent spins
that are decimated. The nearest-neighbor interactions in
the larger-scale model are shown by dashed lines. As discussed
in the text, the process of decimation introduces new
interactions between spins across the diagonal, and four
spin interactions between spins around a square.
Relevant and irrelevant parameters In general, as we eliminate spins by renormalization,
we introduce interactions between spins that might not have been included
in the original model. We will have interactions between second or third neighbors or
between more than two spins at a time. In principle, by using a complete set of parameters
that describe the system we can perform the renormalization transformation
and obtain the flows in the parameter space. These flows tell us about the scale-dependent
properties of the system.

We can characterize the flows by focusing on the fixed points of the iterative map.
These fixed points may be stable or unstable. When a fixed point is unstable, renormalization
takes us away from the fixed point so that on a larger scale the properties
of the system are found to be different from the values at the unstable fixed point.
Significantly, it is the unstable fixed points that represent the second-order phase
transitions. This is because deviating from the fixed point in one direction causes the
parameters to flow in one direction, while deviating from the fixed point in another
direction causes the parameters to flow in a different direction. Thus, the macroscopic
properties of the system depend on the direction in which the microscopic parameters deviate
from the fixed point, a succinct characterization of the nature of a phase transition.
Using this characterization of fixed points, we can now distinguish between different
types of parameters in the model. This includes all of the additional parameters
that might be introduced in order to achieve a self-consistent renormalization
transformation. There are two major categories of parameters: relevant or irrelevant.
Starting near a particular fixed point, changes in a relevant parameter grow under
renormalization. Changes in an irrelevant parameter shrink. Because renormalization
indicates the values of system parameters on a larger scale, this tells us which microscopic
parameters are important to the macroscopic scale. When observed on the
macroscopic scale, relevant parameters change at the phase transition, while irrelevant
parameters do not. A relevant parameter should be included in the Hamiltonian
because its value affects the macroscopic behavior. An irrelevant parameter may often
be included in the model in a more approximate way. Marginal parameters are the
borderline cases that neither grow nor shrink at the fixed point.
Even when we are not solely interested in the behavior of a system at a phase transition,
but rather are concerned with its macroscopic properties in general, the definition
of "relevant" and "irrelevant" continues to make sense. If we start from a particular
microscopic description of the system, we can ask which parameters are relevant
for the macroscopic behavior. The relevant parameters are the ones that can affect the
macroscopic behavior of the system. Thus, a change in a relevant microscopic parameter
changes the macroscopic behavior. In terms of renormalization, changes in relevant
parameters do not disappear as a result of renormalization.
We see that the use of any model, such as the Ising model, to model a physical system
assumes that all of the parameters that are essential in describing the system have
been included. When this is true, the results are universal in the sense that all microscopic
Hamiltonians will give rise to the same behavior. Additional terms in the
Hamiltonian cannot affect the macroscopic behavior. We know that the microscopic
behavior of the physical system is not really described by the Ising model or any other
simple model. Thus, in creating models we always rely upon the concept, if not the
process, of renormalization to make many of the microscopic details disappear, enabling
our simple models to describe the macroscopic behavior of the physical system.
In the Ising model, in addition to longer range and multiple spin interactions,
there is another set of parameters that may be relevant. These parameters are related
to the use of binary variables to describe the magnetization of a region of the material.
It makes sense that the process of renormalization should cause the model to have
additional spin values that are intermediate between fully magnetized UP and fully
magnetized DOWN. In order to accommodate this, we might introduce a continuum
of possible magnetizations. Once we do this, the amplitude of the magnetization has
a probability distribution that will be controlled by additional parameters in the
Hamiltonian. These parameters may also be relevant or irrelevant. When they are irrelevant,
the Ising model can be used without them. However, when they are relevant,
a more complete model should be used.
The parameters that are relevant generally depend on the dimensionality of
space. From our analysis of the behavior of the one-dimensional Ising model, the parameter
J is irrelevant. It is clearly irrelevant because not only variations in J but J itself
disappears as the scale increases. However, in two dimensions this is not true.

For our purposes we will be satisfied by simplifying the renormalization treatment
of the two-dimensional Ising model so that no additional parameters are introduced.
This can be done by a fourth renormalization group technique which has some
conceptual as well as practical advantages over the others. However, it does hide the
importance of determining the relevant parameters.
Bond shifting We simplify our analysis of the two-dimensional Ising model by making
use of the Migdal-Kadanoff transformation. This renormalization group technique
is based on the recognition that the correlation between adjacent spins can enable
us to, in effect, substitute the role of one spin for another. As far as the coarser
scale model is concerned, the function of the finer scale spins is to mediate the interaction
between the coarser scale spins. Because one spin is correlated to the behavior
of its neighbor, we can shift the responsibility for this interaction to a neighbor, and
use this shift to simplify elimination of the spins.

To apply these ideas to the two-dimensional Ising model, we move some of the
interactions (bonds) between spins, as shown in Fig. 1.10.10. We note that the distance
over which the bonds act is preserved. The net result of the bond shifting is that
we form short linear chains that can be renormalized just like a one-dimensional
chain. The renormalization group transformation is thus done in two steps. First we
shift the bonds, then we decimate. Once the bonds are moved, we write the renormalization
of the partition function as:
Z = Σ_{{s_i}_A} Σ_{{s_i}_B} Σ_{{s_i}_C} e^{c Σ_i 1 + 2J Σ_i s_0(s_1 + s_2)} = Σ_{{s_i}_A} Π_{i∈A} 2cosh(2J(s_1 + s_2)) e^{4c} = Σ_{{s_i}_A} Π_{i∈A} e^{c′ + J′ s_1 s_2}    (1.10.38)
The spin labels s_0, s_1, s_2 are assigned along each doubled bond, as indicated in
Fig. 1.10.10. The three types of spin A, B and C correspond to the white, black and
gray dots in the figure. The resulting equation is the same as the one we found when
performing the one-dimensional renormalization group transformation with the exception
of factors of two. It gives the result:
Figure 1.10.10 Illustration of the Migdal-Kadanoff renormalization
transformation that enables us to bypass
the formation of additional interactions. In
this approach some of the interactions between
spins are moved to other spins. If all the
spins are aligned (at low temperature or high J),
then shifting bonds doesn't affect the spin
alignment. At high temperature, when the spins
are uncorrelated, the interactions are not important
anyway. Near the phase transition,
when the spins are highly correlated, shifting
bonds still makes sense. A pattern of bond
movement is illustrated in (a) that gives rise to
the pattern of doubled bonds in (b). Note that
we are illustrating only part of a periodic lattice,
so that bonds are moved into and out of the region
illustrated. Using the exact renormalization
of one-dimensional chains, the gray spins
and the black spins can be decimated to leave
only the white spins.
J' = (1/2) ln(cosh(4J))
c' = 4c + (1/2) ln(4cosh(4J))        (1.10.39)
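The transformation (1.10.39) can be checked numerically: summing over the decimated spin s_0 for each of the four configurations of s_1, s_2 = ±1 must reproduce the renormalized Boltzmann factor. A minimal sketch in Python (the values of J and c are arbitrary test inputs, not from the text):

```python
import math

def renormalized(J, c):
    # Eq. (1.10.39): renormalized coupling and constant
    Jp = 0.5 * math.log(math.cosh(4 * J))
    cp = 4 * c + 0.5 * math.log(4 * math.cosh(4 * J))
    return Jp, cp

def check_decimation(J=0.7, c=0.1):
    Jp, cp = renormalized(J, c)
    for s1 in (1, -1):
        for s2 in (1, -1):
            # sum over s0 = +/-1 of e^(2J s0 (s1+s2)), times the constant e^(4c)
            lhs = 2 * math.cosh(2 * J * (s1 + s2)) * math.exp(4 * c)
            rhs = math.exp(cp + Jp * s1 * s2)
            assert abs(lhs - rhs) < 1e-10 * rhs
    return True
```

The identity holds for any J and c, which is how Eq. (1.10.39) is derived in the first place.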
The renormalization of J in the two-dimensional Ising model turns out to behave qualitatively differently from the one-dimensional case. Its behavior is plotted in Fig. 1.10.11 using a flow diagram. There is an unstable fixed point of the iterative map at J ≈ .305. This nonzero and noninfinite fixed point indicates that we have a phase transition. Reinserting the temperature, we see that the phase transition occurs at J = .305, which is significantly larger than the mean field result zJ = 1, or J = .25, found in Section 1.6. The exact value of the phase transition for this lattice, J ≈ .441, which can be obtained analytically by other techniques, is even larger.
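The flow in Fig. 1.10.11 can be reproduced by iterating Eq. (1.10.39) directly. The sketch below (plain Python; the bracket values and step count are illustrative choices) locates the unstable fixed point by bisection and shows that couplings slightly below it flow toward zero while couplings slightly above it grow:

```python
import math

def rg_map(J):
    # Eq. (1.10.39): J' = (1/2) ln(cosh(4J))
    return 0.5 * math.log(math.cosh(4 * J))

def fixed_point(lo=0.2, hi=0.4, tol=1e-10):
    # Bisection on rg_map(J) - J, which changes sign across the unstable fixed point
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rg_map(mid) - mid < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def flow(J, steps=12):
    # Repeated renormalization: J below the fixed point flows to 0, above it grows
    for _ in range(steps):
        J = rg_map(J)
    return J
```

Running `fixed_point()` gives J ≈ 0.3047, the value quoted in the text.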
It turns out that there is a trick that can give us the exact transition point using a similar renormalization transformation. This trick begins by recognizing that we could have moved bonds in a larger square. For a square with b cells on a side, we would end up with each bond on the perimeter being replaced by a bond of strength b. Using Eq. (1.10.30) we can infer that a chain of b bonds of strength bJ gives rise to an effective interaction whose strength is

J'(b) = tanh^{-1}(tanh(bJ)^b)        (1.10.40)
Figure 1.10.11 The two-dimensional Ising model renormalization group transformation obtained from the Migdal-Kadanoff transformation is illustrated as a flow diagram in the one-dimensional parameter space (J). The arrows show the effect of successive iterations starting from the black dots. The white dot indicates the position of the unstable fixed point, J_c, which is the phase transition in this model. Starting from values of J slightly below J_c, iteration results in the model on a large scale becoming decoupled with no interactions between spins (J → 0). This is the high-temperature phase of the material. However, starting from values of J slightly above J_c, iteration results in the model on the large scale becoming strongly coupled (J → ∞) and spins are aligned. (a) shows only the range of values from 0 to 1, though the value of J can be arbitrarily large. (b) shows an enlargement of the region around the fixed point.
The trick is to take the limit b → 1, because in this limit we are left with the original Ising model. Extending b to nonintegral values by analytic continuation may seem a little strange, but it does make a kind of sense. We want to look at the incremental change in J as a result of renormalization, with b incrementally different from 1. This can be most easily found by taking the hyperbolic tangent of both sides of Eq. (1.10.40), and then taking the derivative with respect to b. The result is given in Eq. (1.10.41). Setting this equal to zero to find the fixed point of the transformation actually gives the exact result for the phase transition.
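Setting Eq. (1.10.41) to zero is easy to do numerically. A bisection sketch (the bracket [0.3, 0.6] is chosen by inspection of the sign change) recovers J_c = (1/2) ln(1 + √2) ≈ 0.4407, the exact critical coupling of the square-lattice Ising model quoted in the text:

```python
import math

def dJp_db(J):
    # Eq. (1.10.41): the derivative dJ'(b)/db evaluated at b = 1
    return J + math.sinh(J) * math.cosh(J) * math.log(math.tanh(J))

def critical_J(lo=0.3, hi=0.6, tol=1e-12):
    # dJp_db is negative below the root and positive above it, so bisection applies
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dJp_db(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The root coincides with (1/2) ln(1 + √2) because sinh(J)cosh(J) = 1/2 and ln(tanh(J)) = -2J both hold exactly at that point.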
The renormalization group gives us more information than just the location of the phase transition. Fig. 1.10.11 shows changes that occur in the parameters as the length scale is varied. We can use this picture to understand the behavior of the Ising model in some detail. It shows what happens on longer length scales by the direction of the arrows. If the flow goes toward a particular point, then we can tell that on the longest (thermodynamic) length scale the behavior will be characterized by the behavior of the model at that point. By knowing how close we are to the original phase transition, we can also learn from the renormalization group the length scale at which the behavior characteristic of the phase transition will disappear. This is the length scale at which the iterative map leaves the region of the repelling fixed point and moves to the attracting one.
We can also characterize the relationship between systems at different values of the parameters: temperatures or magnetic fields. Renormalization takes us from a system at one value of J to another. Thus, we can relate the behavior of a system at one temperature to another by performing the renormalization for both systems and stopping both at a particular value of J. At this point we can directly relate properties of the two systems, such as their free energies. Different numbers of renormalization steps in the two cases mean that we are relating the two systems at different scales. Such descriptions of relationships of the properties of one system at one scale with another system at a different scale are known as scaling functions, because they describe how the properties of the system change with scale.
The renormalization group was developed as an analytic tool for studying the scaling properties of systems with spatially arrayed interacting parts. We will study another use of renormalization in Section 1.10.5. Then in Section 1.10.6 we will introduce a computational approach, the multigrid method.
Question 1.10.6 In this section we displayed our iterative maps graphically as flow diagrams, because in renormalization group transformations we are often interested in maps that involve more than one variable. Make a diagram like Fig. 1.1.1 for the single variable J showing the iterative renormalization group transformation for the two-dimensional Ising model as given in Eq. (1.10.39).
dJ'(b)/db |_{b=1} = J + sinh(J) cosh(J) ln(tanh(J))        (1.10.41)
Solution 1.10.6 See Fig. 1.10.12. The fixed point and the iterative behavior are readily apparent.
1 . 1 0 . 5 Renormalization and chaos
Our final example of renormalization brings us back to Section 1.1, where we studied the properties of iterative maps and the bifurcation route to chaos. According to our discussion, cycles of length 2^k, k = 0, 1, 2, ..., appeared as the parameter a was varied from 0 to a_c = 3.56994567, at which point chaotic behavior appeared. Fig. 1.1.3 summarizes the bifurcation route to chaos. A schematic of the bifurcation part of this diagram is reproduced in Fig. 1.10.13. A brief review of Section 1.1 may be useful for the following discussion.
The process of bifurcation appears to be a self-similar process in the sense that the appearance of a 2-cycle for f(s) is repeated in the appearance of a 2-cycle for f^2(s), but over a smaller range of a. The idea of self-similarity seems manifest in Fig. 1.10.13, where we would only have to change the scale of magnification in the s and a directions in order to map one bifurcation point onto the next one. While this mapping might not work perfectly for smaller cycles, it becomes a better and better
Figure 1.10.12 The iterative map shown as a flow diagram in Fig. 1.10.11 is shown here in the same manner as the iterative maps in Section 1.1, with f(J) = 0.5 ln(cosh(4J)). On the left are shown the successive values of J as iteration proceeds. Each iteration should be understood as a loss of detail in the model and hence an observation of the system on a larger scale. Since in general our observations of the system are macroscopic, we typically observe the limiting behavior as the iterations go to ∞. This is similar to considering the limiting behavior of a standard iterative map. On the right is the graphical method of determining the iterations as discussed in Section 1.1. The fixed points are visible as intersections of the iterating function with the diagonal line.
Figure 1.10.13 Schematic reproduction of Fig. 1.1.4, which shows the bifurcation route to chaos. Successive branchings are approximately self-similar. The bottom figure shows the definition of the scaling factors that relate the successive branchings. The horizontal rescaling of the branches, δ, is given by the ratio of Δa_k to Δa_{k+1}. The vertical rescaling of the branches, α, is given by the ratio of Δs_k to Δs_{k+1}. The top figure shows the values from which we can obtain a first approximation to the values of α and δ, by taking the ratios from the zeroth, first and second bifurcations. The zeroth bifurcation point is actually the point a = 1. The first bifurcation point occurs at a = 3; the second occurs at a = 1 + √6. The values of s at the bifurcation points were obtained in Section 1.1, and formulas are indicated on the figure: s_1 = (a - 1)/a and s_2± = ((a + 1) ± √((a + 1)(a - 3)))/(2a). When the scaling behavior of the tree is analyzed using a renormalization group treatment, we focus on the tree branches that cross s = 1/2. These are indicated by bold lines in the top figure.
approximation as the number of cycles increases. The bifurcation diagram is thus a treelike object. This means that the sequence of bifurcation points forms a geometrically converging sequence, and the width of the branches is also geometrically converging. However, the distances in the s and a directions are scaled by different factors. The factors that govern the tree rescaling at each level are δ and α, as shown in Fig. 1.10.13(b):

δ = lim_{k→∞} Δa_k / Δa_{k+1}
α = lim_{k→∞} Δs_k / Δs_{k+1}        (1.10.42)

By this convention, the magnitude of both δ and α is greater than one. α is defined to be negative because the longer branch flips up to down at every branching. The values are to be obtained by taking the limit as k → ∞, where these scale factors have well-defined limits.

We can find a first approximation to these scaling factors by using the values at the first and second bifurcations that we calculated in Section 1.1. These values, given in Fig. 1.10.13, yield:

δ ≈ (3 - 1)/(1 + √6 - 3) = 4.449        (1.10.43)

α ≈ 2s_1 |_{a=3} / (s_2+ - s_2-) |_{a=1+√6} = (4/3) a/√((a+1)(a-3)) |_{a=1+√6} = 3.252        (1.10.44)

Numerically, the asymptotic value of δ for large k is found to be 4.6692016. This differs from our first estimate by only 5%. The numerical value of the magnitude of α is 2.50290787, which differs from our first estimate by a larger margin of 30%.

We can determine these constants with greater accuracy by studying directly the properties of the functions f, f^2, ..., f^{2^k}, ... that are involved in the formation of 2^k-cycles. In order to do this we modify our notation to explicitly include the dependence of the function on the parameter a: f(s,a), f^2(s,a), etc. Note that iteration of the function f only applies to the first argument.

The tree is formed out of curves s_{2^k}(a) that are obtained by solving the fixed-point equation:

s_{2^k}(a) = f^{2^k}(s_{2^k}(a), a)        (1.10.45)

We are interested in mapping a segment of this curve between the values of s where

df^{2^k}(s,a)/ds = 1        (1.10.46)

and

df^{2^k}(s,a)/ds = -1        (1.10.47)
onto the next function, where k is replaced everywhere by k + 1. This mapping is a kind of renormalization process similar to that we discussed in the previous section. In order to do this it makes sense to expand this function in a power series around an intermediate point, which is the point where these derivatives are zero. This is known as the superstable point of the iterative map. The superstable point is very convenient for study, because for any value of k there is a superstable point at s = 1/2. This follows because f(s,a) has its maximum at s = 1/2, and so its derivative is zero. By the chain rule, the derivative of f^{2^k}(s,a) is also zero. As illustrated in Fig. 1.10.13, the line at s = 1/2 intersects the bifurcation tree at every level of the hierarchy at an intermediate point between bifurcation points. These intersection points must be superstable.
It is convenient to displace the origin of s to be at s = 1/2, and the origin of a to be at the convergence point of the bifurcations. We thus define a function g which represents the structure of the tree. It is approximately given by:

g(s,a) ≈ f(s + 1/2, a + a_c) - 1/2        (1.10.48)

However, we would like to represent the idealized tree rather than the real tree. The idealized tree would satisfy the scaling relation exactly at all values of a. Thus g should be the analog of the function f which would give us an ideal tree. To find this function we need to expand the region near a = a_c by the appropriate scaling factors. Specifically we define:
Specifically we define:
(1.10.49)
The easiest way to think about the function g (s,a) is that it is quite similar to the qua
dratic function f( s,a) but it has the for m necessary to cause the bifurcation tr ee to have
the ideal scaling behavior at ever y br anching. We note that g(s,a) depends on the be
havior of f (s,a) only ver y near to the point s · 1/2. This is appar ent in Eq. (1.10.49)
since the region near s · 1/2 is expanded by a factor of
k
.
We note that g(s,a) has its maximum at s · 0. This is a consequence of the shift
in origin that we chose to make in defining it.
Our objective is to find the form of g(s,a) and, with this form,the values of and
. The trick is to recognize that what we need to know can be obtained directly from
its scaling p roper ties. To write the scaling proper ties we look at the relationship be
tween successive iterations of the map and wr ite:
g(s,a ) · g
2
(s/ ,a / ) (1.10.50)
This follows either from our discussion and definition of the scaling parameters and
or directly from Eq. (1.10.49).
For convenience, we analyze Eq. (1.10.50) first in the limit a → 0. This corr e
sponds to looking at the funct ion g ( s,a) as a function of s at the limit of the bifurca
tion sequence. This funct ion (Fig. 1.10.14) still looks quite similar to our original
funct ion f(s), but its specific form is different. It sat isfies the relationship:
g(s,0) · g(s) · g
2
(s/ ) (1.10.51)
g(s, a) · lim
k→∞
k
f
2
k
(s /
k
+1/ 2,a /
k
+a
c
) −1/ 2
286 I n t r o d u c t i o n a n d P r e l i m i n a r i e s
# 29412 Cust: AddisonWesley Au: BarYam Pg. No. 286
Title: Dynamics Complex Systems
Shor t / Normal / Long
01adBARYAM_29412 3/10/02 10:17 AM Page 286
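The convergence asserted in Eq. (1.10.49) at a = 0 can be checked numerically. The sketch below evaluates the k-th approximant α^k [f^{2^k}(s/α^k + 1/2, a_c) - 1/2], using the numerical values of α and a_c quoted in the text; as in Fig. 1.10.14, the k = 1 and k = 2 approximants nearly coincide:

```python
A_C = 3.56994567      # end of the bifurcation cascade (value quoted in the text)
ALPHA = -2.50290787   # asymptotic rescaling factor alpha (value quoted in the text)

def f_iter(s, a, n):
    # n-fold iterate of the quadratic map f(s) = a s (1 - s)
    for _ in range(n):
        s = a * s * (1 - s)
    return s

def g_k(s, k):
    # k-th approximant to g(s) = g(s, 0) in Eq. (1.10.49):
    # alpha^k [ f^(2^k)(s / alpha^k + 1/2, a_c) - 1/2 ]
    scale = ALPHA ** k
    return scale * (f_iter(s / scale + 0.5, A_C, 2 ** k) - 0.5)

# The k = 1 and k = 2 approximants nearly coincide (compare Fig. 1.10.14)
max_diff = max(abs(g_k(i / 10 - 0.4, 1) - g_k(i / 10 - 0.4, 2))
               for i in range(9))
```

On the interval -0.4 ≤ s ≤ 0.4 the two curves differ by less than about 0.01.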
We approximate this function by a quadratic with no linear term, because g(s) has its maximum at s = 0:

g(s) ≈ g_0 + cs^2        (1.10.52)

Inserting into Eq. (1.10.51) we obtain:

g_0 + cs^2 ≈ α(g_0 + c(g_0 + c(s/α)^2)^2)        (1.10.53)

Equating separately the coefficients of the first and second terms in the expansion gives the solution:

α = 1/(1 + cg_0)
α = 2cg_0        (1.10.54)

We see that c and g_0 only appear in the combination cg_0, which means that there is one parameter that is not determined by the scaling relationship. However, this does not prevent us from determining α. Eq. (1.10.54) can be solved to obtain:
Figure 1.10.14 Three functions are plotted that are successive approximations to g(s) = g(s, 0). This function is essentially the limiting behavior of the quadratic iterative map f(s) at the end of the bifurcation tree a_c. The functions plotted are the first three k values inserted in Eq. (1.10.49): f(s + 1/2, a_c) - 1/2, α f^2(s/α + 1/2, a_c) - 1/2, and α^2 f^4(s/α^2 + 1/2, a_c) - 1/2. The latter two are almost indistinguishable, indicating that the sequence of functions converges rapidly.
cg_0 = (-1 ± √3)/2 = -1.3660254
α = -1 ± √3 = -2.73205081        (1.10.55)

We have chosen the negative solutions, because the value of α and the value of cg_0 must be negative.
We return to consider the dependence of g(s,a) on a to obtain a new estimate for δ. Using a first-order linear dependence on a we have:

g(s,a) ≈ g_0 + cs^2 + ba        (1.10.56)

Inserting into Eq. (1.10.50) we have:

g_0 + cs^2 + ba ≈ α(g_0 + c(g_0 + c(s/α)^2 + ba/δ)^2 + ba/δ)        (1.10.57)

Taking only the first-order term in a from this equation we have:

δ = 2αcg_0 + α = 4.73205        (1.10.58)

Eq. (1.10.55) and Eq. (1.10.58) are a significant improvement over Eq. (1.10.44) and Eq. (1.10.43). The new value of α is less than 10% from the exact value. The new value of δ is less than 1.5% from the exact value. To improve the accuracy of the results, we need only add additional terms to the expansion of g(s,a) in s. The first-order term in a is always sufficient to obtain the corresponding value of δ.
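These relations are simple enough to evaluate directly. A short check (the exact asymptotic values 2.50290787 and 4.6692016 are taken from the text):

```python
import math

cg0 = (-1 - math.sqrt(3)) / 2      # negative root of Eq. (1.10.54): 2(cg0)^2 + 2 cg0 = 1
alpha = 2 * cg0                    # Eq. (1.10.54): alpha = -(1 + sqrt(3)) = -2.73205...
delta = 2 * alpha * cg0 + alpha    # Eq. (1.10.58): delta = 4.73205...

alpha_error = abs(alpha - (-2.50290787)) / 2.50290787   # about 9%
delta_error = abs(delta - 4.6692016) / 4.6692016        # about 1.3%
```

Both truncation errors fall within the bounds stated above (under 10% for α, under 1.5% for δ).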
It is important, and actually central to the argument in this section, that the explicit form of f(s,a) never entered into our discussion. The only assumption was that the functional behavior near the maximum is quadratic. The rest of the argument follows independent of the form of f(s,a), because we are looking at its properties after many iterations. These properties depend only on the region right in the vicinity of the maximum of the function. Thus only the first-order term in the expansion of the original function f(s,a) matters. This illustrates the notion of universality so essential to the concept of renormalization: the behavior is controlled by very few parameters. All other parameters are irrelevant; changing their values in the original iterative map is irrelevant to the behavior after many iterations (many renormalizations) of the iterative map. This is similar to the study of renormalization in models like the Ising model, where most of the details of the behavior at small scales no longer matter on the largest scales.
1 . 1 0 . 6 Multigrid
The multigrid technique is designed for the solution of computational problems that benefit from a description on multiple scales. Unlike renormalization, which is largely an analytic tool, the multigrid method is designed specifically as a computational tool. It works well when a problem can be approximated using a description on a coarse lattice, but becomes more and more accurate as the finer-scale details on finer-scale lattices are included. The idea of the method is not just to solve an equation on finer and finer levels of description, but also to correct the coarser-scale equations based on the finer-scale results. In this way the methodology creates an improved description of the problem on the coarser scale.
The multigrid approach relies upon iterative refinement of the solution. Solutions on coarser scales are used to approximate the solutions on finer scales. The finer-scale solutions are then iteratively refined. However, by correcting the coarser-scale equations, it is possible to perform most of the iterative refinement of the fine-scale solution on the coarser scales. Thus the iterative refinement of the solution is based both upon correction of the solution and correction of the equation. The idea of correcting the equation is similar in many ways to the renormalization group approach. However, in this case it is a particular solution, which may be spatially dependent, rather than an ensemble averaging process, which provides the correction.
We explain the multigrid approach using a conventional problem, which is the solution of a differential equation. To solve the differential equation we will find an approximate solution on a grid of points. Our ultimate objective is to find a solution on a fine enough grid so that the solution is within a prespecified accuracy of the exact answer. However, we will start with a much coarser grid solution and progressively refine it to obtain more accurate results. Typically the multigrid method is applied in two or three dimensions, where it has greater advantages than in one dimension. However, we will describe the concepts in one dimension and leave out many of the subtleties.
For concreteness we will assume a differential equation which is:

d^2 f(x)/dx^2 = g(x)        (1.10.59)

where g(x) is specified. The domain of the equation is specified, and boundary conditions are provided for f(x) and its derivative. On a grid of equally spaced points we might represent this equation as:

(1/d^2)( f(i+1) + f(i-1) - 2f(i) ) = g(i)        (1.10.60)

This can be written as a matrix equation:

Σ_j A(i,j) f(j) = g(i)        (1.10.61)

The matrix equation can be solved for the values of f(i) by matrix inversion (using matrix diagonalization). However, diagonalization is very costly when the matrix is large, i.e., when there are many points in the grid.

A multigrid approach to solving this equation starts by defining a set of lattices (grids), L_j, j ∈ {0,...,q}, where each successive lattice has twice as many points as the previous one (Fig. 1.10.15). To explain the procedure it is simplest to assume that we start with a good approximation for the solution on grid L_{j-1} and we are looking for a solution on the grid L_j. The steps taken are then:

1. Interpolate to find f_j^0(i), an approximate value of the function on the finer grid L_j.
2. Perform an iterative improvement (relaxation) of the solution on the finer grid. This iteration involves calculating the error

Σ_{i'} A(i,i') f_j^0(i') - g(i) = r_j(i)        (1.10.62)

where all indices refer to the grid L_j. This error is used to improve the solution on the finer grid, as in the minimization procedures discussed in Section 1.7.5:

f_j^1(i) = f_j^0(i) - c r_j(i)        (1.10.63)

The scalar c is generally replaced by an approximate inverse of the matrix A(i,j), as discussed in Section 1.7.5. This iteration captures much of the correction of the solution at the fine-scale level; however, there are resulting corrections at coarser levels that are not captured. Rather than continuing to iteratively improve the solution at this fine-scale level, we move the iteration to the next coarser level.

3. Recalculate the value of the function on the coarse grid L_{j-1} to obtain f_{j-1}^1(i). This might be just a restriction from the fine-grid points to the coarse-grid points. However, it often involves some more sophisticated smoothing. Ideally, it should be such that interpolation will invert this process to obtain the values that were found on the finer grid. The correction for the difference between the interpolated and exact fine-scale results is retained.

4. Correct the value of g(i) on the coarse grid using the values of r_j(i) restricted to the coarser grid. We do this so that the coarse-grid equation has an exact solution that is consistent with the fine-grid equation. From Eq. (1.10.62) this essentially means adding r_j(i) to g(i). The resulting corrected values we call g_{j-1}^1(i).
Figure 1.10.15 Illustration of four grids, L_0 through L_3, for a one-dimensional application of the multigrid technique to a differential equation by the procedure illustrated in Fig. 1.10.16.
5. Relax the solution f_{j-1}^1(i) on the coarse grid to obtain a new approximation to the function on the coarse grid, f_{j-1}^2(i). This is done using the same procedure for relaxation described in step 2; however, g(i) is replaced by g_{j-1}^1(i).

The procedure of going to coarser grids in steps 3 through 5 is repeated for all grids L_{j-2}, L_{j-3}, ..., until the coarsest grid, L_0. The values of the function g(i) are progressively corrected by the finer-scale errors. Step 5 on the coarsest grid is replaced by exact solution using matrix diagonalization. The subsequent steps are designed to bring all of the iterative refinements to the finest-scale solution.

6. Interpolate the coarse-grid solution on L_0 to the finer grid L_1.

7. Add the correction that was previously saved when going from the fine to the coarse grid.

Steps 6–7 are then repeated to take us to progressively finer-scale grids all the way back to L_j.
This procedure is called a V-cycle, since it appears as a V in a schematic that shows the progressive movement between levels. A V-cycle starts from a relaxed solution on grid L_{j-1} and results in a relaxed solution on the grid L_j. A full multigrid procedure involves starting with an exact solution at the coarsest scale L_0 and then performing V-cycles for progressively finer grids. Such a multigrid procedure is graphically illustrated in Fig. 1.10.16.
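The procedure above can be sketched in a few dozen lines. The code below is a minimal recursive V-cycle for Eq. (1.10.59) in one dimension, with hand-picked but standard choices that the text leaves open: weighted Jacobi relaxation as the smoother, full-weighting restriction, and linear interpolation. It solves the residual equation on the coarser grids, which is algebraically equivalent to the text's bookkeeping of correcting g(i). The test problem d^2 f/dx^2 = -π^2 sin(πx), with f = 0 at the boundaries, has solution sin(πx):

```python
import math

def residual(f, g, h):
    # r(i) = (f(i+1) + f(i-1) - 2 f(i))/h^2 - g(i), as in Eq. (1.10.62); f = 0 outside
    n = len(f)
    out = []
    for i in range(n):
        left = f[i - 1] if i > 0 else 0.0
        right = f[i + 1] if i < n - 1 else 0.0
        out.append((left + right - 2 * f[i]) / h ** 2 - g[i])
    return out

def relax(f, g, h, sweeps=3, w=2.0 / 3.0):
    # Weighted Jacobi relaxation: the iterative improvement of step 2
    n = len(f)
    for _ in range(sweeps):
        new = []
        for i in range(n):
            left = f[i - 1] if i > 0 else 0.0
            right = f[i + 1] if i < n - 1 else 0.0
            new.append((1 - w) * f[i] + w * (left + right - h ** 2 * g[i]) / 2)
        f[:] = new

def restrict(r):
    # Full weighting onto the coarse grid; fine point 2j+1 is coarse point j
    return [(r[2 * j] + 2 * r[2 * j + 1] + r[2 * j + 2]) / 4
            for j in range((len(r) - 1) // 2)]

def interpolate(e_c, n):
    # Linear interpolation of a coarse-grid correction to the fine grid
    e = [0.0] * n
    nc = len(e_c)
    for j in range(nc):
        e[2 * j + 1] = e_c[j]
    for j in range(nc + 1):
        left = e_c[j - 1] if j > 0 else 0.0
        right = e_c[j] if j < nc else 0.0
        e[2 * j] = (left + right) / 2
    return e

def v_cycle(f, g, h):
    if len(f) == 1:
        f[0] = -g[0] * h ** 2 / 2   # exact solution on the coarsest grid
        return
    relax(f, g, h)
    r_c = restrict(residual(f, g, h))
    e_c = [0.0] * len(r_c)
    v_cycle(e_c, r_c, 2 * h)        # solve A e = r on the coarser grid
    e = interpolate(e_c, len(f))
    for i in range(len(f)):
        f[i] -= e[i]                # since r = A f - g, the update is f <- f - e
    relax(f, g, h)

# Test problem on 63 interior points: grids of 63, 31, 15, 7, 3, 1 points
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
g = [-math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
f = [0.0] * n
for _ in range(10):
    v_cycle(f, g, h)
error = max(abs(f[i] - math.sin(math.pi * x[i])) for i in range(n))
```

After a few cycles the remaining error is set by the finite grid spacing, not by the iteration.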
Figure 1.10.16 The multigrid algorithm used to obtain the solution to a differential equation on the finest grid is described schematically by this sequence of operations. The operation sequence is to be read from left to right. The different grids that are being used are indicated by successive horizontal lines with the coarsest grid on the bottom and the finest grid on the top. The sequence of operations starts by solving a differential equation on the coarsest grid by exact matrix diagonalization (shaded circle). Then iterative refinement of the equations is performed on finer grids. When the finer-grid solutions are calculated, the coarse-grid equations are corrected so that the iterative refinement of the fine-scale solution can be performed on the coarse grids. This involves a V-cycle as indicated in the figure by the boxes. The horizontal curved arrows indicate the retention of the difference between coarse- and fine-scale solutions so that subsequent refinements can be performed.
There are several advantages of the multigrid methodology for the solution of differential equations over more traditional integration methods that use a single-grid representation. With careful implementation, the cost of the finer-scale grids grows slowly with the number of grid points, scaling as N ln(N). The solution of multiple problems of similar type can be even more efficient, since the corrections of the coarse-scale equations can often be carried over to similar problems, accelerating the iterative refinement. This is in the spirit of developing universal coarse-scale representations as discussed earlier. Finally, it is natural in this method to obtain estimates of the remaining error due to the limited grid density, which is important to achieving a controlled error in the solution.
1 . 1 0 . 7 Levels of description, emergence of simplicity and complexity
In our explorations of the world we have often discovered that the natural world may be described in terms of underlying simple objects, concepts, and laws of behavior (mechanics) and interactions. When we look still closer, we see that these simple objects are composite objects whose internal structure may be complex and have a wealth of possible behavior. Somehow, the wealth of behavior is not relevant at the larger scale. Similarly, when we look at longer length scales than those our senses are normally attuned to, we discover that the behavior at these length scales is not affected by objects and events that appear important to us.
Examples are found from the behavior of galaxies to elementary particles: galaxies are composed of suns and interstellar gases, suns are formed of complex plasmas and are orbited by planets, planets are formed from a diversity of materials and even life, materials and living organisms are formed of atoms, atoms are composed of nuclei and electrons, nuclei are composed of protons and neutrons (nucleons), and nucleons appear to be composed of quarks.
Each of these represents what we may call a level of description of the wor ld. A
level is an inter nally consistent picture of the behavior of interact ing elements that are
simple. When taken together, many such elements may or may not have a simple be
havior, but the rules that give rise to their collect ive behavior are simple. We note that
the interplay between levels is not always just a selfcontained descript ion of one level
by the level immediately below. At times we have to look at more than one level in or
der to descr ibe the behavior we are interested in.
The existence of these levels of description has led science to develop the notion of fundamental law and unified theories of matter and nature. Such theories are the self-consistent descriptions of the simple laws governing the behavior and interplay of the entities on a particular level. The laws at a particular level then give rise to the larger-scale behavior.
The existence of simplicity in the description of underlying fundamental laws is not the only way that simplicity arises in science. The existence of multiple levels implies that simplicity can also be an emergent property. This means that the collective behavior of many elementary parts can behave simply on a much larger scale.
The study of complex systems focuses on understanding the relationship between simplicity and complexity. This requires both an understanding of the emergence of complex behavior from simple elements and laws, and an understanding of the emergence of simplicity from simple or complex elements, which allows a simple larger-scale description to exist.
Much of our discussion in this section was based upon the understanding that macroscopic behavior of physical systems can be described or determined by only a few relevant parameters. These parameters arise from the underlying microscopic description. However, many of the aspects of the microscopic description are irrelevant. Different microscopic models can be used to describe the same macroscopic phenomenon. The approach of scaling and renormalization does not assume that all the details of the microscopic description become irrelevant; rather, it tries to determine self-consistently which of the microscopic parameters are relevant to the macroscopic behavior, in order to enable us to simplify our analysis and come to a better understanding.
Whenever we are describing a simple macroscopic behavior, it is natural that the number of microscopic parameters relevant to model this behavior must be small. This follows directly from the simplicity of the macroscopic behavior. On the other hand, if we describe a complex macroscopic behavior, the number of microscopic parameters that are relevant must be large.
Nevertheless, we know that the renormalization group approach has some validity even for complex systems. At long length scales, all of the details that occur on the smallest length scale are not relevant. The vibrations of an individual atom are not generally relevant to the behavior of a complex biological organism. Indeed, there is a pattern of levels of description in the structure of complex systems. For biological organisms, composed out of atoms, there are additional levels of description that are intermediate between atoms and the organism: molecules, cells, tissues, organs and systems. The existence of these levels implies that many of the details of the atomic behavior are not relevant at the macroscopic level. This should also be understood from the perspective of the multigrid approach. In this picture, when we are describing the behavior of a complex system, we have the possibility of describing it at a very coarse level or a finer and yet finer level. The number of levels that are necessary depends on the level of precision or level of detail we wish to achieve in our description. It is not always necessary to describe the behavior in terms of the finest scale. It is essential, however, to identify properly a model that can capture the essential underlying parameters in order to discuss the behavior of any system.
Like biological organisms, man-made constructs are also built from levels of structure. This method of organization is used to simplify the design and enable us to understand and work with our own creations. For example, we can consider the construction of a factory from machines and computers, machines constructed from individual moving parts, computers constructed from various components including computer chips, chips constructed from semiconductor devices, semiconductor devices composed out of regions of semiconductor and metal. Both biology and
engineering face problems of design for function or purpose. They both make use of interacting building blocks to engineer desired behavior and therefore construct the complex out of the simple. The existence of these building blocks is related to the existence of levels of description for both natural and artificial systems.
Our discussion thus brings us to recognize the importance of studying the properties of substructure and its relationship to function in complex systems. This relationship will be considered in Chapter 2 in the context of our study of neural networks.
2
Neural Networks I:
Subdivision and Hierarchy
Conceptual Outline
2.1 Motivated by the properties of biological neural networks, we introduce simple mathematical models whose properties may be explored and related to aspects of human information processing.
2.2 The attractor network embodies the properties of an associative content-addressable memory. Memories are imprinted and are accessed by presenting the network with part of their content. Properties of the network can be studied using a signal-to-noise analysis and simulations. The capacity of the attractor network for storage of memories is proportional to the number of neurons.
2.3 The feedforward network acts as an input-output system formed out of several layers of neurons. Using prototypes that indicate the desired outputs for a set of possible inputs, the feedforward network is trained by minimizing a cost function which measures the output error. The resulting training algorithm is called back-propagation of error.
2.4 In order to study the overall function of the brain, an understanding of substructure and the interactions between parts of the brain is necessary. Feedforward networks illustrate one way to build a network out of parts. A second model of interacting subnetworks is a subdivided attractor network. A subdivided attractor network stores more than just the imprinted patterns—it stores composite patterns formed out of parts of the imprinted patterns. If these are patterns that an organism might encounter, then this is an advantage. Features of human visual processing, language and motor control illustrate the relevance of composite patterns.
2.5 Analysis and simulations of subdivided attractor networks reveal that partial subdivision can balance a decline in the storage capacity of imprinted patterns with the potential advantages of composite patterns. However, this balance only allows direct control over composite pattern stability when the number of subdivisions is no more than approximately seven, suggesting a connection to the 7 ± 2 rule of short-term memory.
2.6 The limitation in the number of subdivisions in an effective architecture suggests that a hierarchy of functional subdivisions is best for complex pattern recognition tasks, consistent with the observed hierarchical brain structure.
2.7 More general arguments suggest the necessity of substructure, and applicability of the 7 ± 2 rule, in complex systems.
2.1 Neural Networks: Brain and Mind
The functioning of the brain as part of the nervous system is generally believed to account for the complexity of human (or animal) interaction with its environment. The brain is considered responsible for sensory processing, motor control, language, common sense, logic, creativity, planning, self-awareness and most other aspects of what might be called higher information processing. The elements believed responsible for brain function are the nerve cells—neurons—and the interactions between them. The interactions are mediated by a variety of chemicals transferred through synapses. The brain is also affected by diverse substances (e.g., adrenaline) produced by other parts of the body and transported through the bloodstream. Neurons should not be described as having only one form; their diverse forms vary between different parts of the brain and within particular brain sections (Fig. 2.1.1). Specifying the complete behavior of an individual neuron is a detailed and complex problem. However, it is reasonable to assume that many of the general principles upon which the nervous system is designed may be described through a much-simplified model that takes into account only a few features of each neuron and the interactions between them. This is expected, in part, because of the large number, on the order of 10^11, of neurons in the brain.
A variety of mathematical models have been described that attempt to capture particular features of the neurons and their interactions. All such models are incomplete. Some models are particularly well suited for theoretical investigations, others for pattern-recognition tasks. Much of the modern effort in the modeling of the nervous system is commercial in nature, seeking to implement pattern-recognition strategies for artificial intelligence tasks. Our approach will be to introduce two of the simpler models of neural networks, one of which has been used for extensive theoretical studies, the other for commercial applications. We will then take advantage of the simple analysis of the former to develop an understanding of subdivision in neural networks. Subdivision and substructure is a key theme that appears in many forms in the study of complex systems.
There have been many efforts to demonstrate the connection between mathematical models of neural networks and the biological brain. These are important in order to bridge the gap between the biological and mathematical models. The additional readings located at the end of this text may be consulted for detailed discussions. We do not review these efforts here; instead we motivate more loosely the artificial models and rely upon investigations of the properties of these models to establish the connection, or to suggest investigations of the biological system.
To motivate the artificial models of neural networks, we show in Fig. 2.1.2 a schematic of a biological neural network that consists of a few neurons. Each neuron has a cell body with multiple projections called dendrites, and a longer projection called an axon which branches into terminal fibers. The terminal fibers of the axon of one neuron generally end proximate to the dendrites of a different cell body. The cell walls of a neuron support the transmission of electrochemical pulses that travel along the axon from the cell body to the terminal fibers. A single electrochemical pulse is not usually considered to be the quantum of information. Instead it is the "activity"—the rate of pulsing—that is considered to be the relevant parameter describing the state of the neuron. Pulses that arrive at the end of a terminal fiber release various chemicals into the narrow intercellular region separating them from the dendrites of the adjacent cell. This region, known as a synapse, provides the medium of influence of one neuron on the next. The chemicals released across the
Figure 2.1.1 Several different types of neurons adapted from illustrations obtained by various staining techniques.
Figure 2.1.2 Schematic illustration of a biological neural network showing several nerve cells with branching axons. The axons end at synapses connecting to the dendrites of the next neuron that lead to its cell body. This schematic illustration is further simplified to obtain the artificial network models shown in Fig. 2.1.3.
gap may either stimulate (an excitatory synapse) or depress (an inhibitory synapse) the activity of the next neuron.
It is generally assumed, though not universally accepted, that the "state of the mind" at a particular time is described by the activities of all the neurons—the pattern of neural activity. This activity pattern evolves in time, because the activity of each neuron is determined by the activity of neurons at an earlier time and the excitatory or inhibitory synapses between them. The influence of the external world on the neurons occurs through the activity of sensory neurons that are affected by sensory receptors. Actions are effected by the influence of motor-neuron activity on the muscle cells. Synaptic connections are in part "hard-wired," performing functions that are prespecified by genetic programming. However, memory and experience are also believed to be encoded into the strength (or even the existence) of the synapses between neurons. It has been demonstrated that synaptic strengths are affected by the state of neuronal excitation. This influence, called imprinting, is considered to be the principal mechanism for adaptive learning. The most established and well-studied form of imprinting was originally proposed by Hebb in 1949. The plasticity of synapses should not be underestimated, because the development of even basic functions of vision is known to be influenced by sensory stimulation.
Hebbian imprinting suggests that when two neurons are both firing at a particular time, an excitatory synapse between them is strengthened and an inhibitory synapse is weakened. Conversely, when one is firing and the other is not, the inhibitory synapse is strengthened and the excitatory synapse is weakened. Intuitively, this results in the possibility of reconstructing the neural activity pattern from a part of it, because the synapses have been modified so as to reinforce the pattern. Thus, the imprinted pattern of neural activity becomes a memory. This will be demonstrated explicitly and explained more fully in the context of artificial networks that successfully reproduce this process and help explain its function.
The two types of artificial neural networks we will consider are illustrated in Fig. 2.1.3. The first kind is called an attractor network, and consists of mathematical neurons identified as variables s_i that represent the neuron activity; i is the neuron index. Neurons are connected by synapses consisting of variables J_ij that represent the strength of the synapse between two neurons i and j. The synapses are taken to be symmetric, so that J_ij = J_ji. A positive value of J_ij indicates an excitatory synapse; a negative value indicates an inhibitory synapse. A more precise definition follows in Section 2.2. The second kind of network, discussed in Section 2.3, is called a feedforward network. It consists of a set of two or more layers of mathematical neurons consisting of variables s_i^l that represent the neuron activity, where l is a layer index. Synapses represented by variables J_ij^l act only in one direction, sequentially from one layer to the next.
Our knowledge of biological neural networks indicates that it would be more realistic to represent synapses as unidirectional, as the feedforward network does, but to allow neurons to be connected in loops. Some of the effects of feedback in loops are represented in the attractor network by the symmetric synapses.
A second distinction between the two types of networks is in their choice of representation of the neural activity. The attractor network typically uses binary variables, while the feedforward network uses a real number in a limited range. These choices are related to the nonlinear response of neurons. The activity of a neuron at a particular time is thought to be a sigmoidal function of the influence of other neurons. This means that at moderate levels of excitation, the activity of the neuron is proportional to the excitation. However, for high levels of excitation, the activity saturates. The question arises whether the brain uses the linear regime or generally drives the neurons to saturation. The most reasonable answer is that it depends on the function of the neuron. This is quite analogous to the use of silicon transistors, which are used both for linear response and switching tasks. The neurons that are used in signal-processing functions in the early stages of the auditory or visual systems are likely to make use of the linear regime. However, a linear operation is greatly limited in its possible effects. For example, any number of linear operations are equivalent to a single linear operation. If only the linear regimes of neurons were utilized, the whole operation of the network would be reducible to application of a linear operator to the input information—multiplication by a matrix. Thus, while for initial signal processing the linear regime should play an important role, in other parts of the brain the saturation regime should be expected to be important. The feedforward network uses a model of nonlinear response that includes both linear and saturation regimes, while the attractor network typically represents only the saturation regime. Generalizing the attractor network to include a linear regime adds analytic difficulty, but does not significantly change the results. In contrast, both the linear and nonlinear regimes are necessary for the feedforward network to be a meaningful model.
Each of the two artificial network models represents drastic simplifications over more realistic network models. These simplifications enable intuitive mathematical treatments and capture behaviors that are likely to be an important part of more realistic models. The attractor network with symmetric synapses is the most convenient for analytic treatments because it can be described using the stochastic field formalism discussed in Section 1.6. The feedforward network is more easily used as an input-output system and has found more use in applications.
Figure 2.1.3 Schematic illustration of two types of artificial neural networks that are used in modeling biological networks either for formal studies or for application to pattern recognition. On the left is a schematic of an attractor network. The dots represent the neurons and the lines represent the synapses that mediate the influence between them. The synapses are symmetric, carrying equal influence in both directions. On the right is a feedforward network consisting of several layers (here four) of neurons that influence each other in a unidirectional fashion. The input arriving from the left sets the values of the first layer of neurons. These neurons influence the second layer of neurons through the synapses between layer one and two. After several stages, the output is read from the final layer of neurons.
2.2 Attractor Networks
2.2.1 Defining attractor networks
Attractor networks, also known as Hopfield networks, in their simplest form have three features:

a. Symmetric synapses:
   J_ij = J_ji    (2.2.1)

b. No self-action by a neuron:
   J_ii = 0    (2.2.2)

c. Binary variables for the neuron activity values:
   s_i = ±1    (2.2.3)

There are N neurons, so the neuron indices i, j take values in the range {1,...,N}. By Eq. (2.2.1) and Eq. (2.2.2), the synapses J_ij form a symmetric N × N matrix with all diagonal elements equal to zero.
The binary representation of neuron activity suggests that the activity has only two values: active or "firing," s_i = +1 (ON), and inactive or "quiescent," s_i = −1 (OFF). The activity of a particular neuron, updated at time t, is given by:
   s_i(t) = sign( ∑_j J_ij s_j(t − 1) )    (2.2.4)
where the values of all the other neurons at time t − 1 are polled through the synapses to determine the ith neuron activity at time t. Specifically, this expression states that a particular neuron fires or does not fire depending on the result of performing a sum of all of the messages it is receiving through synapses. This sum is formed from the activity of every neuron multiplied by the strength of the synapse between the two neurons. Thus, for example, a firing neuron j, s_j = +1, which has a positive (excitatory) synapse to the neuron i, J_ij > 0, will increase the likelihood of neuron i firing. If neuron j is not firing, s_j = −1, then the likelihood of neuron i firing is reduced. On the other hand, if the synapse is inhibitory, J_ij < 0, the opposite occurs: a firing neuron j, s_j = +1, will decrease the likelihood of neuron i firing, and a quiescent neuron j, s_j = −1, will increase the likelihood of neuron i firing. When necessary, it is understood that sign(0) takes the value ±1 with equal probability.
The activity of the whole network of neurons may be determined either synchronously (all neurons at once) or asynchronously (selecting one neuron at a time). Asynchronous updating is probably more realistic in models of the brain. However,
for many purposes the difference is not significant, and in such cases we can assume a synchronous update.
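As an illustrative sketch (not from the text), the synchronous update rule of Eq. (2.2.4) takes only a few lines of NumPy; the function names are our own, and sign(0) is resolved to ±1 at random as described above:

```python
import numpy as np

def sign_pm1(x, rng):
    """Sign function returning +1/-1, with sign(0) resolved to ±1 at random."""
    s = np.sign(x)
    zeros = (s == 0)
    s[zeros] = rng.choice([-1, 1], size=zeros.sum())
    return s.astype(int)

def update_synchronous(s, J, rng):
    """One synchronous step of Eq. (2.2.4): s_i(t) = sign(sum_j J_ij s_j(t-1))."""
    return sign_pm1(J @ s, rng)

# Tiny example: two neurons joined by an excitatory synapse reinforce each other.
rng = np.random.default_rng(1)
J = np.array([[0.0, 1.0], [1.0, 0.0]])   # symmetric, zero diagonal
s = np.array([1, 1])
print(update_synchronous(s, J, rng))      # → [1 1]
```

Both aligned states [1, 1] and [−1, −1] are fixed points of this two-neuron example, which is the simplest instance of an attractor.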
2.2.2 Operating and training attractor networks
Conceptually, the operation of an attractor network proceeds in the following steps. First a pattern of neural activities, the "input," is imposed on the network. Then the network is evolved by updating the neurons several times according to the neuron update rule, Eq. (2.2.4). The evolution continues until either a steady state is reached or a prespecified number of updates have been performed. Then the state of the network is read as the "output." The next pattern is then imposed on the network.
At the same time as the network is performing this process, the synapses themselves are modified by the state of the neurons according to a mathematical formulation of the Hebbian rule:

   J_ij(t) = J_ij(t − 1) + c s_i(t − 1) s_j(t − 1)    i ≠ j    (2.2.5)

where the rate of change of the synapses is controlled by the parameter c. This is a mathematical description of Hebbian imprinting, because the synapse between two neurons is changed in the direction of being excitatory if both neurons are either ON or OFF, and the synapse is changed in the direction of being inhibitory if one neuron is ON and the other is OFF.
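A single Hebbian imprint of Eq. (2.2.5) can be sketched as follows (our own illustration, assuming NumPy; the value of c and the pattern are arbitrary):

```python
import numpy as np

def hebbian_step(J, s, c=0.01):
    """One Hebbian imprint, Eq. (2.2.5): J_ij += c * s_i * s_j for i != j."""
    J = J + c * np.outer(s, s)
    np.fill_diagonal(J, 0.0)   # no self-action, Eq. (2.2.2)
    return J

N = 4
J = np.zeros((N, N))
s = np.array([1, -1, 1, 1])
J = hebbian_step(J, s)

# Neurons 0 and 2 were both ON, so their synapse moved in the excitatory
# direction; neuron 1 was OFF, so its synapses to firing neurons moved in
# the inhibitory direction.
print(J[0, 2] > 0, J[0, 1] < 0)  # → True True
```

Note that s_i s_j is +1 when the two neurons agree (both ON or both OFF) and −1 when they disagree, which is exactly the qualitative Hebbian rule stated earlier.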
The update of a neuron is considered to be a much faster process than the Hebbian changes in the synapses—the synaptic dynamics. Thus we assume that c is small compared to the magnitude of the synapse values, so that each imprint causes only an incremental change. Because the change in synapses occurs much more slowly than the neuron update, for modeling purposes it is convenient to separate it completely from the process of neuron update. We then describe the operation of the network in terms of a training period and an operating period.
The training of the network consists of imprinting a set of selected neuron firing patterns {ξ_i^α}, where i is the neuron index, i ∈ {1,...,N}, α is the pattern index, α ∈ {1,...,p}, and ξ_i^α is the value of a particular neuron s_i in the αth pattern. It is assumed that there are a fixed number p of patterns that are to be trained. The synapses are then set to:

   J_ij = (1/N) ∑_{α=1}^{p} ξ_i^α ξ_j^α    i ≠ j
   J_ij = 0    i = j
       (2.2.6)

The prefactor 1/N is a choice of normalization of the synapses that is often convenient, but it does not affect in an essential way any results described here.
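Training by Eq. (2.2.6) and recall by repeated application of the update rule can be sketched together (our own illustration, assuming NumPy; the network size, number of patterns, random seed, and the deterministic tie-breaking sign(0) → +1 are illustrative simplifications):

```python
import numpy as np

def train(patterns):
    """Set synapses by Eq. (2.2.6): J_ij = (1/N) sum_alpha xi_i xi_j, J_ii = 0."""
    p, N = patterns.shape
    J = patterns.T @ patterns / N
    np.fill_diagonal(J, 0.0)
    return J

def recall(s, J, steps=10):
    """Evolve a state synchronously until it stops changing (or steps run out)."""
    for _ in range(steps):
        s_new = np.where(J @ s >= 0, 1, -1)   # sign(0) -> +1: a simplification
        if np.array_equal(s_new, s):
            break
        s = s_new
    return s

rng = np.random.default_rng(2)
N, p = 100, 5
patterns = rng.choice([-1, 1], size=(p, N))
J = train(patterns)

# Corrupt 10 bits of the first imprinted pattern, then let the network relax.
probe = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)
probe[flip] *= -1
restored = recall(probe, J)
print((restored == patterns[0]).mean())   # fraction of correctly recovered bits
```

At this low loading (p/N = 0.05) the network almost always restores the imprinted pattern from the corrupted probe, which is the content-addressable memory behavior described above.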
2.2.3 Energy analog
The formulation of the attractor network can be recognized as a generalization of the Ising model discussed in Section 1.6. Neurons are analogous to spins, and the interaction between two spins s_i and s_j is the synapse J_ij. We can thus identify the effective energy of the system as:
   E[{s_i}] = −(1/2) ∑_{i≠j} J_ij s_i s_j    (2.2.7)
The update of a particular neuron, Eq. (2.2.4), consists of "aligning" it with the effective local field (known as the post-synaptic potential):

   h_i(t − 1) = ∑_j J_ij s_j(t − 1)    (2.2.8)
This is the same dynamics as the Glauber or Monte Carlo dynamics of an Ising model at zero temperature. At zero temperature the system evolves to a local minimum energy state. In this state each spin is aligned with the effective local field.
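The zero-temperature picture can be checked numerically. The sketch below (our own illustration, assuming NumPy) performs asynchronous single-neuron updates, each aligning a neuron with its local field, and verifies that the energy of the form E = −(1/2) ∑ J_ij s_i s_j never increases when the synapses are symmetric with zero diagonal:

```python
import numpy as np

def energy(s, J):
    """Effective energy: E = -(1/2) sum_ij J_ij s_i s_j (zero-diagonal J)."""
    return -0.5 * s @ J @ s

rng = np.random.default_rng(3)
N = 50
J = rng.standard_normal((N, N))
J = (J + J.T) / 2            # symmetric synapses, Eq. (2.2.1)
np.fill_diagonal(J, 0.0)     # no self-action, Eq. (2.2.2)

s = rng.choice([-1, 1], size=N)
energies = [energy(s, J)]
for _ in range(200):
    i = rng.integers(N)                  # asynchronous: one neuron at a time
    s[i] = 1 if J[i] @ s >= 0 else -1    # align with the local field
    energies.append(energy(s, J))

# With symmetric J and zero diagonal, each alignment step lowers (or keeps)
# the energy, so the sequence is monotone non-increasing.
print(all(e2 <= e1 + 1e-12 for e1, e2 in zip(energies, energies[1:])))  # → True
```

This monotone decrease is what guarantees that the dynamics flows to a local energy minimum, i.e., to an attractor.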
The analogy between a neural network and a model with a well-defined energy enables us to consider the operation of the network in a natural way. The pattern of neural activities evolves in time to decrease the energy of the pattern until it reaches a local energy minimum, where each neuron activity is consistent with the influences upon it as measured by the post-synaptic potential. Imprinting a pattern of neural activity lowers the energy of this pattern and, to a lesser degree, the energy of patterns that are similar. In lowering the energy of these patterns, imprinting creates a basin of attraction. The basin of attraction is the region