
The Emerging Mind of the Machine

John Cameron Glasgow II


A philosophy of emerging systems based on general systems theory concepts is propounded and
applied to the problem of creating intelligence in a machine. It is proposed that entropy, a
principle that provides a framework for understanding the organization of physical systems
throughout the universe, serves also as a basis for understanding the creation of an intelligence.
That position is defended by the argument that the universe is a statistically described place in
which the entropic processes of information-bearing physical systems (e.g. RNA, DNA, neural
networks, etc.) describe the same thing under interpretation as information or physical structures.
As a consequence the observations concerning hierarchical stratification and rule creation
described in general systems theory and so obvious in physical systems are assumed to apply to
information systems. The mind and its intelligence is identified as such an information system
that emerges into the human environment as a system of hierarchically structured mental objects
and associated rules. It is inferred that those rules/objects that emerge at a level in a mind are not
reducible to the functions and interactions of constituent parts at lower levels. All of this implies
that any implementation of human–like intelligence in a machine will have to include lower levels,
and that the mechanism of that implementation will have to be emergence rather than
construction. It is not concluded that this precludes the implementation of intelligence in a
machine. On the contrary, these ideas are developed in the context of an attempt to implement an
undeniably intelligent machine, that is, a machine that will be considered intelligent in the same
manner that one person normally and implicitly considers another person intelligent.


Emergence is a controversial philosophical viewpoint that maintains that the universe exhibits an
ongoing creativity. It claims that creativity is manifest in the orderly systems that ‘emerge’ at a
level from constituent parts at lower levels. More explicitly, such systems are seen to exhibit
behavior that is coherent and deterministic but not predictable and not reducible to the laws that
apply at the lower levels. That is, new levels exhibit new laws. Thus a person who accepts the
concept of emergence might claim that, while the laws of physics and chemistry can explain the
internal bonds and the physical attributes of any and all molecules, those laws cannot predict
which molecules will be produced or the subsequent behavior of a particular molecule or a group
of molecules, even though the nature of those activities can be codified. As a more concrete
example, while the structure and chemical proclivities of a molecule of deoxyribonucleic acid
(DNA) can be explained by the laws of chemistry and its behavior in the world as it encounters
other objects can be explained by laws of physics, the spectacular behavior exhibited over time
by groups of these molecules (as codified in synthetic theories of evolution) is not implicit in
those laws. As an example at a higher level, while demographers might determine that the world
human population will reach a particular level in fifty years, they cannot predict the nature of the
social, governmental, or economic institutions that will arise any more than they can predict that
some specific set of people will produce offspring who will in turn produce offspring who will
produce a great leader who will materially affect those institutions (although it is reasonable to
assert that such will be the case). The concept of a universe in a state of ongoing creative activity
lies somewhere between the two most popular world models: the idea that the universe is in
some manner controlled or directed from without and the idea that the universe consists of a set
of unchanging rules and objects to which all observed phenomena can be reduced.

The philosophical concept of reductionism maintains that the nature and activity of everything is
implicit in some basic finite set of rules. According to reductionism, given those rules and an
initial state of the universe all of the past and future states of the universe can be calculated (an
assertion made by Pierre Simon Laplace in the late eighteenth century, at a time when it seemed
that the ‘clockwork mechanism’ of the universe would soon be completely revealed). This
philosophy is easily accepted by scientists, for whom it can serve as both the justification of
continued search for underlying laws, and the assurance that such laws exist. Philosophically,
reductionism has come under attack from many quarters. The implied existence of a regress of
explanation of phenomena in terms of their constituent phenomena requires (for any practical
application of the concept) the hypothesis of some particular set of low level, basic and
fundamental objects and rules whose validity is assumed. In a macroscopic sense the job of
science can be taken to be the reduction of phenomena at high levels to some hierarchical
combination of the basic set, thus revealing why things are as they are. But the selection of a
basic set of rules and objects has often proven problematical. In practice, an assumed truth often
turns out to be just a particular case in a larger scheme. In geometry, Euclid's parallel lines axiom
is famous as an assumed ‘truth’ that is only true in flat or uncurved space. Newton's laws are
true only in the context of a restricted portion of an Einsteinian universe, albeit the portion of
which we happen to be most aware. At the quantum level of existence some of the laws of logic
that seem so evident at the human level of existence break down. Even some of Einstein's
assumptions break down at the quantum level (e.g. the assumption that the speed of light is a
fundamental limit). A reductionist would take this to mean that the basic set of fundamental
truths had not yet been discovered.

In the early part of the twentieth century the school of philosophical thought that exemplified the
reductionist attitude was logical positivism. Two famous precursors of that movement were
Alfred North Whitehead and Bertrand Russell. Their attempt to provide that philosophy
with a sound analytical basis (Principia Mathematica) was brought into question in 1931 by the
famous incompleteness theorem of Kurt Gödel. Gödel proved that in any formal system that is
based on a set of axioms and sufficiently complex to express arithmetic, either there will be
assertions that can be made within the framework of the system that are true but that cannot be
proven true (i.e. the system is incomplete), or there will be contradictory assertions that can be
proven true (i.e. the system is inconsistent). That is, there are inherent weaknesses in any such
system that tend to invalidate the system as a whole. Still, scientists can work within a
reductionist framework if their work is restricted to the phenomena at some particular level or
some small closely associated group of levels. In the normal conduct of their research they do not
come up against an infinite regress of explanations. Problems of incompleteness or inconsistency
are rarely encountered. When they are encountered, the necessary assumptions concerning the
effects of other levels' phenomena can usually be revised to remove the specific problem. Indeed,
the very process of revision of assumptions can be considered an integral part of scientific
process (see Kuhn 1962). Consequently scientists have not found it necessary to seriously alter
the methodological framework within which they have worked since the development and
adoption of that methodology in the eighteenth century. It is natural then, that scientifically
oriented researchers, when faced with the problem of creating intelligence in a machine, feel that
such intelligence can be achieved by an appropriate basic set of rules and objects installed in the
proper hierarchically structured control system. Assuming that reductionism holds across the
range of levels required for the implementation of a mind, and that humans have the perception,
tools, and time to determine the content of those levels together with the skill to emulate and
implement those contents in a machine, then certainly machines can be built to be intelligent.

Reductionism is unpopular with philosophers defending concepts of ethics and morality as
formulated in Judeo-Christian religions. Most of the doctrines of western religious institutions as
they apply to ethical and moral considerations turn upon the human intuition of free choice. If
the universe is predetermined as reductionism implies, then free choice is an illusion. In
particular a choice to commit or not commit an immoral act, no matter how real the choice seems
to the chooser, is a charade. Further, any supposed imposition of reward or punishment for such
behavior becomes a compounded charade. The inclusion of a Deity in a reductionist framework
results in a religious concept called Deism. It maintains that a Deity created and set the universe
in motion after which the universe did not require (and does not require) any further
intervention. Deism enjoyed a brief popularity among intellectuals around the time of the
American revolution, when the consequences of the Newtonian revolution dominated
intellectual life. But such a belief does not provide a sufficient base upon which
to build the requisite ethical and moral dogma of a western style religion. Persons worried about a
philosophical basis for ethical and moral behavior, or those who insist simply that the decisions
they make are meaningful in that the future course of events is not fixed and does respond to
those decisions, normally prefer some concept of a supervised, managed, or directed universe
(we’ll use the term spiritualism to refer to any such concept of a directed universe).

Spiritualists will almost certainly disagree with the reductionist conclusion concerning machine
intelligence. Inherent in spiritualism is the belief in the existence of a non-corporeal motivating
and controlling force or forces that in some (inexplicable) manner interact with corporeal objects
causing behavior that does not yield to reductionist explanation. This ‘force’ or ‘forces’ must be
an eternal feature of the universe for if man is the result of a lengthy metamorphosis from simple
molecular structures to his present state (a theory that has achieved the status of accepted fact
for all but a tiny minority of people) then the motivating force must have been there all along,
directing, maintaining, and imbuing each and every object or process with its unique non-
reducible characteristics. To maintain otherwise requires either that evolution be rejected in favor
of an instantaneous creation replete with all of the evidence that argues against it, or that it be
accepted that ‘unenlightened’ matter in some way can give rise to, or create, inexplicable forces
that subsequently vitalize it. Thus an eternal and all pervasive life force or elan vital is
hypothesized to account for the obvious but not easily described difference between living and
non-living matter, and the human essence, mind, or soul, is considered an entity, separate from,
but associated with the body, yet not subject to the body’s limitations. The believer is free to
assert that ‘the forces’ are responsible for any perceived phenomenon or state of nature. If the
human mind is perceived as a unique, non-reducible phenomenon, then that is the intent of ‘the
forces’ of which we can have no knowledge. Certainly we cannot reproduce in a machine that of
which we can have no knowledge.

Phenomena for which no reductionist explanations exist serve as the only support for belief in
spiritualism (e.g. inexplicable human mental activity such as intuition, a flash of insight, or
creativity may be cited as evidence of the non-reducible quality of the human mind). In fact the term
belief applies only to assertions about such phenomena. At the other end of the spectrum the
term fact applies only to those phenomena that are reducible (we use the term in its sense as an
objective measure of reality rather than its use in referring to data or in making assertions of
truth). The axioms to which facts may be reduced might be as informal as the (reproducible)
evidence of the senses or as formal as an abstract mathematical system. While adherents of facts
and beliefs both claim to have a corner on the market in truth, both are subject to challenge.
Spiritualists are stuck with the problem of trying to support their beliefs by citing inexplicable
phenomena that might at any moment be explained, while reductionists must fear (or they should
fear) that their systems might yield contradictory facts, or that facts will be discovered that are
not explainable in their system. It might appear that there are no alternatives to these two
opposed ways of conceptualizing the world. Such is not the case. The concept of emergence
provides another1 explanation of the way things are; one that allows for the existence of
non-reducible phenomena, and yet one that is not spiritualistic. We stress that concepts
associated with theories of ‘emergence’ should not be confused with spiritualism. Certainly
‘spiritual forces’ could be invoked to explain creative processes but that would simply be another
version of a directed universe. Many real, non-spiritual mechanisms can be proposed to explain
the process of ongoing creation (and some are detailed below). Nor are the concepts regarding
emergence philosophically compatible with reductionism. Theories concerned with emerging
systems hypothesize a hierarchically structured universe in which the levels of the hierarchy
exhibit a semi-permeability with respect to the effects of the rules and objects at one level on
adjacent levels. The further removed two levels are, the less the phenomena in the one can be reduced
to the phenomena in the other. Since the levels are assumed to extend throughout the universe,
laws can be at once both universal and restricted in scope. In effect the laws of the universe
never die, they just fade away (across levels). This concept has many implications for what it
means to exist and be intelligent at one level in such a universe. In the context of this paper it has
implications for the implementation of machine intelligence. To sum up, reductionism easily
admits of the possibility of building a machine intelligence while spiritualism argues against it.
Emergence lies somewhere between these two extremes. As might be expected then, it implies
that the creation of a machine intelligence lies somewhere between impossible and possible. First
we develop support for the concept of emerging systems by considering the concept of a
universe organized into hierarchical levels through the interaction of two seemingly opposed
facts of nature, chaos and order. We follow that by stating the ideas more clearly as a hypothesis.
Finally we make inferences from the hypothesis concerning the possibility of creating a machine
intelligence (which we deem possible) and the conditions necessary to accomplish that feat in the
context of an emergent universe.

1 But not necessarily the only other one.



The philosophical–religious climate of Europe in the eighteenth and nineteenth centuries was
favorable for investigations into the nature of biological organisms. This was due to two aspects
of the Judeo–Christian thought of that time. First, time was considered linear, with events
measured from creation to eternity and (for the Christians) memorable deterministic events
occurring at the advent and second coming of Christ. This differed from the cyclical conception of
time held by the Greek philosophers (and by most other cultures).

Second, it was assumed that God had created the world in a methodical, ordered manner. One perception
of the order of things was the "scale of being," the "ladder of perfection" or the "ladder of life."
Man was perceived to occupy the highest earthly rung of the ladder with all of the other various
life forms occupying lower positions depending upon their perfection or complexity. The ladder
did not stop at man but continued on with Angels, Saints and other heavenly creatures that
occupied successively higher rungs until finally God was reached. Man thus had a dual nature; a
spiritual nature that he shared with the Angels above him on the ladder and an animal nature that
he shared with the creatures below him on the ladder.

Scientific thought and religious thought often mixed during the Renaissance and even into the
nineteenth century. Investigations into biological complexity were encouraged so that the order of
succession of creatures on the ladder of life could be more accurately ascertained. That, coupled
with the perception that time moved inexorably from the past into the future set the stage for the
discovery of evolutionary processes. All that was needed was the concept of vast periods of
geologic2 time, the idea that the ladder of life might have a dynamic character, and a
non-supernatural mechanism by which movement on it could occur. These ideas, or various forms of
them, had already gained widespread acknowledgement when Charles Darwin wrote his Origin of
Species (published in 1859). Origin united the idea of the transmutation of species over geologic

2 The idea of geologic time was provided by James Hutton, the father of geology.

time with the mechanism of natural selection3 to yield the theory of biological evolution. The
theory required strong champions, for it made formal and respectable the heretical ideas that had
previously been only the object of speculation. Even in 1866 (seven years after publication of Origin of
Species) the ladder was still being defended (Eiseley, 1958). From a scientific point of view these
were the last serious challenges to evolution. The evidence from biology, geology, and
paleontology was too strong for that particular form of theologically generated order to survive. At
about the same time (1866) an Austrian monk named Gregor Mendel published a report on his
research into genes, the units of heredity. In it he outlined the rules that govern the passing of
biological form and attribute from one generation to the next. The report was to lie
unappreciated for the next thirty years until around the turn of the century when the rules were
independently rediscovered. The Mendelian theory provided the Darwinian mechanism of natural
selection with an explanation for the necessary diversification and mutation upon which it relied.
The Mendelian laws were incorporated into Darwinism, which became known as Neo–Darwinism.
In 1953 J. D. Watson and F. H. C. Crick proposed the double helix molecular structure of
deoxyribonucleic acid (DNA), from which genes are constructed and which contains the actual
coded information needed to produce an organism. This and various loosely related theories such
as population genetics, speciation, systematics, paleontology and developmental genetics
complement Neo–Darwinism and combine with it to form what is termed the synthetic theory of
evolution. The synthetic theory of evolution represents the state of the theory today (Salthe 1985).

This new wave of scientific discovery about the earth, and particularly about the biology of the
earth, worked a new revolution in philosophical thought in which man had to be perceived as
playing a smaller role in the scheme of things, if for no other reason than that the true grandeur of the
scheme of things was becoming apparent. The first philosopher to embrace the new biological
theories was Herbert Spencer. He was a philosopher of the latter half of the nineteenth century
who seized upon the theory of evolution as presented by Darwin and generalized it to a
philosophical principle that applied to everything, as Will Durant (Durant 1961) demonstrates

3 The result of the uneven reproduction of the genotype–phenotype in a group of members of a species that
can mate with each other. The mechanism of selection is that of the interaction of the phenotype with its
environment. The concept was popularized as 'the survival of the fittest' by Herbert Spencer.

by partial enumeration, to:

"The growth of the planets out of the nebulae; the formation of oceans and mountains on the earth; the metabolism
of elements by plants, and of animal tissues by men; the development of the heart in the embryo, and the fusion of
the bones after birth; the unification of sensations and memories into knowledge and thought, and of knowledge
into science and philosophy; the development of families into clans and gentes and cities and states and alliances
and the 'federation of the world'."

the last, of course, being a prediction. In everything Spencer saw the differentiation of the
homogeneous into parts and the integration of those parts into wholes. The process from
simplicity to complexity or evolution is balanced by the opposite process from complexity to
simplicity or dissolution. He attempted to show that this followed from mechanical principles by
hypothesizing a basic instability of the homogeneous, the similar parts being driven by external
forces to separated areas in which the different environments produce different results.
Equilibration follows; the parts form alliances in a differentiated whole. But all things run down
and equilibration turns into dissolution. Inevitably, the fate of everything is dissolution. This
philosophy was rather gloomy but it was in accord with the second law of thermodynamics4, or
the law of entropy, which states that all natural processes run down. Entropy was proposed by
Rudolf Clausius and William Thomson in the 1850's and was derived from the evidence of
experiments with mechanical systems, in particular heat engines (see the section on order and
entropy below). That scientific fact tended to support a philosophy that the universe and
everything in it was bound for inevitable dissolution. This was shocking and depressing to
philosophers and to the educated populace at the turn of the twentieth century. However, the
universal application of the Darwinian theory of evolution is not so simple as Spencer would
have it.
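The statistical content of the second law, to which we return in the section on order and entropy below, can be illustrated with a toy computation. The sketch below is our own illustration (not part of Spencer's argument); it uses Boltzmann's relation S = k ln W to show that, for N gas particles free to occupy either half of a box, the evenly mixed macrostate has vastly more microstates, and hence higher entropy, than any ordered one.

```python
# Toy illustration of the statistical reading of the second law.
# For N particles, the number of microstates with k particles in the
# left half of a box is W(k) = C(N, k); entropy is S = k_B ln W.
# The even split maximizes W, so a closed system drifts toward it.

from math import comb, log

def boltzmann_entropy(n_particles: int, k_left: int) -> float:
    """Entropy S = ln W, in units of Boltzmann's constant k_B."""
    return log(comb(n_particles, k_left))

N = 100
# The perfectly ordered state (all particles on one side) has one
# microstate, hence zero entropy...
s_ordered = boltzmann_entropy(N, 0)      # ln C(100, 0) = ln 1 = 0
# ...while the even split has about 1e29 microstates.
s_mixed = boltzmann_entropy(N, N // 2)

print(f"ordered: S/k_B = {s_ordered:.2f}")  # 0.00
print(f"mixed:   S/k_B = {s_mixed:.2f}")    # about 66.78
```

The peak of S at the even split sharpens rapidly as N grows, which is the statistical sense in which "all natural processes run down": not a positive force toward disorder, but an overwhelming count of disordered microstates.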

In response to the overwhelming wealth of detail, the biological sciences began to turn inward.
The detail allowed for tremendous diversification without leaving the confines of the life sciences.
The subject matter of biology was, and is, perceived as so different from the subject matter of

4 The first law of thermodynamics deals with the conservation of energy, stating that the sum of the flow of
heat and the rate of change of work done are equal to the rate of change of energy in a system. There is a
zeroth law (that, as you may suspect, was proposed after the first and second laws, but that logically precedes
them). It states that two objects that are in thermal equilibrium with a third object will be in thermal
equilibrium with each other.

other areas of science that it is considered unique and to some extent a closed discipline. But the
isolationist tendency is widespread in all of the sciences and, as has been shown on many
occasions, uniqueness is an illusion. In recent years there has been a move on the part of some
biologists to find a more general framework within which biology fits together with the other
disciplines of science. Such a move toward integration should, of course, be of interest to all of
the sciences. One such approach is espoused by Stanley N. Salthe5.

Salthe notes (Salthe 1985) that there are at least seven levels of biological activity that must be
considered for the investigation of what are considered common biological processes. These
include, but are not limited to, the molecular, cellular, organismic, population, local ecosystem,
biotic regional, and the surface of the earth levels. The systems at these levels are usually studied
as autonomous systems largely because of the difficulty in attributing cause to other levels. There
is, inherent in biological phenomena, a hierarchical structure differentiated mainly by spatial and
temporal scaling. The hierarchy is not limited to biological systems, but extends upward to the
cosmos and downward to quantum particles. Nor is it limited to any particular enumeration such
as the seven levels mentioned above. Interactions between levels are largely constrained to those
that occur between adjacent levels. At a particular level (focal level) and for a particular entity the
surrounding environment (or next higher level) and the material substance of the entity (next
lower level) are seen as providing initial conditions and constraints for the activities or evolution
of the entity. Salthe proposes that a triadic system of three levels (the focal level and the
immediately adjacent levels) is the appropriate context in which to investigate and describe
systemic, and in particular biologic, phenomena. So the theory of evolution that depends upon
natural selection applies at the population level and may or may not apply at other levels. Salthe
distinguishes several factors that provide differentiation into levels. These include 1) scale in both
size and time that prohibits any dynamic interaction between entities at the different levels, 2)
polarity, that causes the phenomena at higher and lower levels to be radically different in a
5 The specific approach presented by Salthe has its roots in general systems theory. That theory's inception was
in Norbert Wiener's cybernetics. It was given its strong foundations and extended far beyond the original
content by such researchers as Ludwig von Bertalanffy (1968), Ervin Laszlo (1972), Ilya Prigogine (1980),
Erich Jantsch (1980), David Layzer (1990) and many others (the cited works are included in the bibliography,
and are not intended to represent the principal work of the cited researcher). General systems science is now a
firmly established discipline. It provides the context in which the discussion concerning 'levels' on the
ensuing pages should be considered.

manner that prohibits recursion (that is, phenomena observed at a higher level are not seen to
recur at a lower level) and 3) complexity, that describes the level of multiple interrelatedness
between entities at a level.
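Salthe's triadic scheme lends itself to a simple structural sketch. The code below is our own illustrative rendering, not Salthe's formalism; the level names are the seven levels listed above, with the next lower level supplying material or initiating conditions and the next higher level supplying environmental or boundary conditions.

```python
# Illustrative sketch of Salthe's triadic scheme: a focal level is
# analyzed together with its two adjacent levels, which supply its
# initiating conditions (below) and boundary conditions (above).

from dataclasses import dataclass
from typing import Optional

LEVELS = ["molecular", "cellular", "organismic", "population",
          "local ecosystem", "biotic regional", "surface of the earth"]

@dataclass
class Triad:
    lower: Optional[str]   # material substrate / initiating conditions
    focal: str             # level at which the phenomenon is described
    upper: Optional[str]   # environment / boundary conditions

def triad_for(focal: str) -> Triad:
    """Return the focal level together with its two adjacent levels."""
    i = LEVELS.index(focal)
    lower = LEVELS[i - 1] if i > 0 else None
    upper = LEVELS[i + 1] if i < len(LEVELS) - 1 else None
    return Triad(lower, focal, upper)

# Natural selection is a population-level process, so its triad is
# organismic / population / local ecosystem.
print(triad_for("population"))
```

The point of the structure is what it omits: no level two or more steps away from the focal level appears in the triad at all, mirroring the claim that interactions are largely constrained to adjacent levels.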

Biologists tend to be divided into two camps, the vitalists and the mechanists, who subscribe to
the spiritualist and reductionist approaches, respectively. The mechanists see no reason why all
aspects of living systems cannot be described in physical terms. They believe it is possible to
reduce phenomena observed at one level to the operation of laws on objects at a lower level.
This mechanist/reductionist approach has succeeded in explaining many biological phenomena to
the great benefit of mankind. Most biologists would fall into this category (partly because of the
success and partly because the alternative feels unscientific). Vitalists, on the other hand, feel that
the immense complexity of physical systems precludes an explanation for life in terms of
physical laws. The distinction between living and non–living things, while perfectly obvious, is
virtually indefensible when formalized. Vitalists hypothesize an elan vital or life force to account
for the difference. The nature of that force has not been succinctly identified by vitalists.

In spite of its popularity, the mechanist approach has not yielded complete explanations;
it is one thing to recognize parts and interactions of parts of systems and another to explain the
activities of the whole in terms of the parts. Two problems stand in the way of the success of the
mechanists, 1) the process itself is doomed to an infinite regress of explanation of finer and finer
detail or the description of processes in contexts of ever greater and greater scope and 2) even
after the successful explanation of the nature of an object in terms of lower-level constituents, those
explanations provide no account of the activities of that object, as a whole, at its own level of
existence. So the attempts of the mechanists to reduce the events at one level to those at another,
while useful, are not complete; it may be possible to explain the structure of a cell in terms of
certain molecular building blocks and the laws that govern those constituents, but doing so does
not explain the activities of that cell. W. M. Elsasser (Elsasser, 1970) argues that the existence of
unique, distinguishable individuals at a level actually constitutes a feature of reality overlooked
by the standard physics:

"The primary laws are the laws of physics which can be studied quantitatively only in terms of their validity in
homogeneous classes. There is then a 'secondary' type of order or regularity which arises only through the
(usually cumulative) effect of individualities in inhomogeneous systems and classes. Note that such order need
not violate the laws of physics"

Salthe and others who attempt to distinguish levels as more than mere handy abstractions used to
locate or specify objects provide a means by which mechanists and vitalists may find some common
ground. The processes and objects that occupy a level are not expected to be completely
explainable in terms of any other level, in fact any complete explanation is precluded by the
increasing impermeability that characterizes the correspondence between increasingly distant
levels. Only the immediately adjacent levels have any appreciable impact on one another's activities.
This saves the mechanists from the infinite regress of explanations but leaves the explanation of
elan vital for further investigation. As Elsasser points out, these need not be mystic,
spiritualistic, or non–scientific. One such possible explanation for the observed propensity of
systems to self–organize is given below in the section on entropy and order. It is an unexpected
explanation, advocating that organization results not because of any positive motivating force
but rather because of the inability of some open systems to pursue their entropic imperative at a
rate commensurate with the expanding horizons of the local state space. It will be argued that the
nature of the system at a level in an environment, together with the existence of chaotic tendencies,
dictates the form that the resulting organization takes. These ideas have applicability to virtually
any system at any level, not just biological systems at the levels identified above. We shall
comment and expand upon these ideas in later sections, but first we will review the development
of knowledge about the nature of the physical world beyond (and below) the lowest levels
mentioned above. We skip over the molecular level, whose rules and objects are detailed in
chemistry, the atomic level described by atomic physics, and the nuclear level described by
nuclear physics, to the quantum level whose rules and objects are expressed in particle physics
by quantum mechanics. We do this because the quantum level vividly illustrates the increasing
inability to relate the actions and objects at different levels far removed from the human level of existence.

Quantum Theory

At the end of the nineteenth century physicists were perplexed by the fact that objects heated to
the glowing point gave off light of various colors (the spectra). Calculations based on the then
current model of the atom indicated that blue light should always be emitted from intensely hot
objects. To solve the puzzle, the German scientist Max Planck proposed that light was emitted
from atoms in discrete quanta or packets of energy according to the formula, E = hf, where f was
the frequency of the observed light in hertz and h was a constant unit of action (of dimensions
energy/frequency). Planck calculated h = 6.63 × 10⁻²⁷ ergs/hertz and the value became known as
Planck's constant. In 1905 Albert Einstein suggested that radiation might have a corpuscular
nature. Noting that mc² = E = hf, or more generally mv² = hf (where v stands for any velocity),
provides a relation between mass and frequency, he suggested that waves should have particle
characteristics. In particular, since the wavelength λ of any wave is related to the frequency by
f = v/λ, then λ = h/mv, or, setting the momentum mv = p, then p = h/λ. That is, the momentum of a
wave is Planck's constant divided by the wavelength. Years later (in 1923), Prince Louis
de Broglie, a French scientist, noted that the inverse relation should also hold, so
that λ = h/p. That is, particles have the characteristics of waves. This hypothesis was quickly
verified by observing the interference patterns formed when an electron beam from an electron
gun was projected through a diffraction grating onto a photo–sensitive screen. An interference
pattern forms from which the wavelength of the electrons can be calculated and the individual
electron impact points can be observed. This, in itself, is not proof of the wave nature of
particles, since waves of particles (for instance sound waves or water waves), when projected
through a grating will produce the same effect. However, even when the electrons are propagated
at widely separated intervals (say one a day), the same pattern is observed. This can only occur
if the particles themselves and not merely their collective interactions have the characteristics of
waves. In other words, all of the objects of physics (hence all of the objects in the universe), have
a wave–particle dual nature. This idea presented something of a problem to the physicists of the
early part of this century. It can be seen that the wave nature of objects on the human scale of
existence can be ignored (the wavelength of such objects will generally be less than 10^-27 meters
so, for instance, if you send a stream of billiard balls through an (appropriately large) grating the
interference pattern will be too small to measure, or at least, small enough to safely ignore).
Classical physics, for most practical purposes, remained valid, but a new physics was necessary
for investigating phenomena on a small scale. That physics became known as particle6 physics.
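Both formulas above, E = hf and λ = h/mv, are simple to evaluate. The sketch below (in Python, using CGS units to match the erg/hertz value quoted earlier; the sample masses and velocities are our own illustrative choices) shows why the wave nature of electrons is observable while that of billiard balls can be safely ignored:

```python
H = 6.63e-27  # Planck's constant in erg/hertz (erg*s), as quoted in the text

def photon_energy(frequency_hz):
    # Planck's relation: E = h * f
    return H * frequency_hz

def de_broglie_wavelength(mass_g, velocity_cm_s):
    # The wavelength associated with any moving particle: lambda = h / (m * v)
    return H / (mass_g * velocity_cm_s)

# An electron (about 9.11e-28 g) moving at 1e8 cm/s has a wavelength on
# the order of atomic dimensions, so diffraction is readily observable.
electron = de_broglie_wavelength(9.11e-28, 1e8)   # roughly 7.3e-8 cm

# A 170 g billiard ball rolling at 100 cm/s has a wavelength so small
# that any interference pattern is unmeasurable.
billiard = de_broglie_wavelength(170.0, 100.0)    # roughly 3.9e-31 cm
```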

The Austrian physicist Erwin Schrödinger applied the classical theory of waves (developed for
electromagnetism by James Clerk Maxwell) to derive an appropriate wave equation, with solutions
ψ, for calculating the probability of the occurrence of the attributes of quantum particles. For
example, solving ψ for a specified position of a particle (say the position of an electron near an
atom) yields a value which, when its magnitude is squared, gives the probability of finding the
particle at that position. Then, to investigate the trajectory of an electron in relation to its atom,
the equation can be solved for a multitude of positions. The locus of points of highest probability
can be thought of as a most likely trajectory for the electron. There are many attributes that a
quantum particle might have, among which are mass, position, charge, momentum, spin, and
polarization. The probability of the occurrence of particular values for all of these attributes can
be calculated from Schrödinger's wave function.
Some of these attributes (for example mass and charge), are considered static and may be thought
of as fixed and always belonging to the particle. They provide, in effect, immutable evidence for
the existence of the particle. But other attributes of a particle are complementary in that the
determination of one affects the ability to measure the other. These other attributes are termed
dynamic attributes and come in pairs. Foremost among the dynamic complementary attributes
are position, q, and momentum, p. In 1927 Werner Heisenberg introduced his famous
uncertainty principle, in which he asserted that the product of the uncertainty with which the
position is measured, Δq, and the uncertainty with which the momentum is measured, Δp, must
always be greater than Planck's constant. That is, Δq Δp ≥ h. In other words, if you measure one
attribute to great precision (e.g. Δq is very small) then the complementary attribute must
necessarily have a very large uncertainty (e.g. Δp ≥ h/Δq). A similar relation is true of any pair of
dynamic attributes.

6 All things at the quantum level (even forces) are represented by particles. Some of these particles are
familiar to most people (e.g. the electron and photon), others are less familiar (muons, gluons, neutrinos,
etc.). Basically, quantum particles consist of Fermions and Bosons. The most important Fermions are quarks
and leptons. Quarks are the components of protons and neutrons, while leptons include electrons. The
Fermions, then, contain all the particles necessary to make atoms. The Bosons carry the forces between the
Fermions (and their constructs). For example, the electromagnetic force is intermediated by photons, and the
strong and weak nuclear forces are intermediated by mesons and intermediate vector bosons respectively.
Quantum physicists have found it convenient to speak of all of the phenomena at the quantum level in terms
of particles. This helps avoid confusion that might arise because of the wave/particle duality of quantum
phenomena. It gives non-physicists a comfortable but false image of a quantum level populated by tiny
billiard balls.

In the early days of quantum mechanics it was believed that this fact could be
attributed to a disturbance, by the measuring process, of the attribute not being measured. This
view was necessitated by the desire to view quantum particles as objects that conform to the
normal human concept of objects with a real existence and tangible attributes (call such a concept
normal reality). Obviously, if Heisenberg's principle was true and the quantum world possessed
a normal reality, the determination of a particle's position must have altered the particle's
momentum in a direct manner. Unfortunately, as will be discussed below, the assumption of
normal reality at quantum levels leads to the necessity of assuming faster than light
communication among the parts of the measured particle and the measuring device. The idea that
quantum objects conform to a normal reality is now out of favor with physicists (though not
discarded). In any case, the principle was not intended as recognition of the clumsiness of
measuring devices. Heisenberg's uncertainty principle arose from considerations concerning
the wave nature of the Schrödinger equations and results directly from them. Richard P. Feynman,
in one of his famous lectures on quantum electrodynamics (QED) tried to put the principle in its
historical perspective (Feynman, 1985):

“When the revolutionary ideas of quantum physics were first coming out, people still tried to understand them
in terms of old-fashioned ideas (such as, light goes in straight lines). But at a certain point the old-fashioned
ideas would begin to fail, so a warning was devised that said, in effect, ‘Your old-fashioned ideas are no
damned good when...’ If you get rid of the old-fashioned ideas and instead use the ideas I’m explaining in
these lectures [...] there is no need for an uncertainty principle.”

The old-fashioned ideas were of course, the normal physics of normal reality, and as Feynman
readily admitted, the new ideas appeared non-sensical in that reality.
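To get a feel for the magnitudes involved, the relation Δq Δp ≥ h can be evaluated directly (a minimal sketch; the SI units and the electron example are our own choices, and the modern statement of the principle uses ħ/2 rather than h as the bound):

```python
H = 6.63e-34  # Planck's constant in joule-seconds

def min_momentum_uncertainty(dq_m):
    # From dq * dp >= h (the form used in the text): pinning the position
    # down to dq forces the momentum spread dp to be at least h / dq.
    return H / dq_m

# Confine an electron to atomic dimensions (dq ~ 1e-10 m):
dp = min_momentum_uncertainty(1e-10)   # kg*m/s
dv = dp / 9.11e-31                     # implied velocity uncertainty, m/s
# dv comes out in the millions of meters per second: at atomic scale
# the uncertainty principle dominates the electron's behavior.
```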

But the predictions made by quantum mechanics (which now consists of four different systems
equivalent to the Schrödinger equations)7 of the results of quantum particle experiments have proven
accurate for fifty years. They consistently and precisely predict the observations made by
physicists of the effects produced by the interactions of single particles (from perhaps cosmic
sources) and beams of particles produced in laboratories. The tools for the prediction of
experimental results are perfected and used daily by physicists.

7 Three other systems equivalent to the Schrödinger equations are Heisenberg's matrix mechanics, Dirac's
transformation theory and Feynman's sum-over-histories system.

So, under the test that a
successful theory can be measured by its ability to predict, quantum theory is very successful. In
spite of all of their accuracy and predictive ability the systems do not provide a picture of, or
feeling for the nature of quantum reality any more than the equation F = ma provides a picture of
normal reality. But the quantum mechanics does provide enough information to indicate that
quantum reality cannot be like normal reality. Unfortunately (or fortunately if you are a
philosopher) there are many different versions of quantum reality that are consistent with the
facts; all of which are difficult to imagine. For instance, the concept of attribute, so strong and
unequivocal in normal reality, is qualified at the quantum level as indicated above. Worse, if the
quantum mechanics equations are to be interpreted as a model of the quantum level of existence
then the inhabitants of quantum reality consist of waves of probability, not objects with
attributes (strange and quirky though they may be). As Werner Heisenberg said "Atoms are not
things." The best one can do is to conjure up some version of an Alice in Wonderland place in
which exist entities identifiable by their static attributes but in which all values of dynamic
attributes are possible but in which none actually exist. Occasionally the quantum peace is
disturbed by an act that assigns an attribute and forces a value assignment (e.g. a measurement or
observation occurs). Nick Herbert in his book Quantum Reality (Herbert, 1989) has identified
eight versions of quantum reality held by various physicists (the versions are not all mutually exclusive):

1) The Copenhagen interpretation (and the reigning favorite) originated by Niels Bohr, Heisenberg, and Max Born
holds that there is no underlying reality. Quantum entities possess no dynamic attributes. Attributes arise as a joint
product of the probability wave and the normal reality measuring device.

2) The Copenhagen interpretation part II maintains that the world is created by an act of observation made in the
normal world. This has the effect of choosing an attribute and forcing the assignment of values.

3) The world is a connected whole arising from the history of phase entanglements of the quantum particles that
make up the world. That is, when any two possibility waves (representative of some quantum particle) interact, their
phases become entangled. Forever afterward, no matter their separation in space, whatever actuality may manifest
itself in the particles, the particles share a common part. Originally the phase entanglement was thought of as just a
mathematical fiction required by the form of the equations. Recent developments (Bell's theorem, see below) lend
credence to the actuality of the entanglements in the sense that those entanglements can explain, and in fact are
needed to explain, experimental data. Phase entanglement was originally proposed by Erwin Schrödinger.

4) There is no normal reality hypothesis (normal reality is a function of the mind). John Von Neumann felt that the
problem with explaining the reality behind the experimental observation arose because the measurement instruments
were treated as normally real while the particles being measured were considered as waveforms in a quantum reality.
He undertook to treat the measurement devices as quantum waveforms too. The problem then became one of

determining when a quantum particle as represented by a wave of all possible attributes and values, collapsed (took
a quantum leap) to a particular set of attributes and values. He examined all of the possible steps along the path of
the process of measurement and determined that there was no distinguished point on that path and that the
waveform could collapse anywhere without violating the observed data. Since there was no compelling place to
assume the measurement took place, Von Neumann placed it in the one place along the path that remained
somewhat mysterious, the human consciousness.

5) There are an infinity of worlds hypothesis. In 1957, Hugh Everett, then a Princeton University graduate student
made the startling proposal that the wave function does not collapse to one possibility but that it collapses to all
possibilities. That is, upon a moment requiring the assignment of a value, every conceivable value is assigned, one
for each of a multitude of universes that split off at that point. We observe a wave function collapse only because we
are stuck in one branch of the split. Strange as it may seem this hypothesis explains the experimental observations.

6) Logic at the quantum level is different than in normal reality. In particular the distributive laws in logic do not
apply to quantum level entities, that is, A or (B and C) is not the same as (A or B) and (A or C). For example, if
club Swell admits people who are rich (A) or who come from a good family (B) and have good
connections (C), while club Upper Crust admits people who are rich or come from a good family (A or B) and who are rich or
have good connections (A or C), then in normal reality we would find that after one thousand people apply for membership
to both clubs the same group of people will have been accepted at both clubs. In the quantum world, not only will
the memberships be different but in club Upper Crust there will be members who are not rich and do not come from
a good family or are not well connected.

7) Neo-realism (the quantum world is populated with normal objects). Albert Einstein said that he did not believe
that God played dice with the universe. This response was prompted by his distaste for the idea that there was no
quantum reality expressible in terms of the normal reality. He didn't want to accept probability waves as in some
sense real. He and DeBroglie argued that atoms are indeed things and that the probabilistic nature of the quantum
mechanics is just the characteristic of ensembles of states of systems as commonly presented in statistical
mechanics. John Von Neumann proved a theorem that stated that objects that displayed reasonable characteristics
could not possibly explain the quantum facts as revealed by experimental data. This effectively destroyed the neo-
realist argument until it was rescued by David Bohm who developed a quantum reality populated by normal objects
consistent with the facts of quantum mechanics. The loophole that David Bohm found that allowed him to develop
his model was the assumption by Von Neumann of reasonable behavior by quantum entities. To Von Neumann,
reasonableness did not include faster than light phenomena. In order to explain the experimental data in a quantum
reality populated by normal objects, Bohm found it necessary to hypothesize a pilot wave associated with each
particle that was connected to distant objects and that was able to receive superluminal communications. The pilot
wave was then able to guide the particle to the correct disposition to explain the experimental data. Unfortunately
for the neo-realist argument, faster than light communications put physicists off. As discussed below, the
acceptance of a variation of that concept, at least for objects in the quantum world, may be required by recent experimental results.

8) Werner Heisenberg champions a combination of 1 and 2 above. He sees the quantum world as populated by
potentia or "tendencies for being." Measurement is the promotion of potentia to real status. The universe is observer
created (which is not the same as Von Neumann's consciousness created universe).
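Version 6 above is easy to make concrete: in classical (Boolean) logic the distributive law never fails, which a brute-force check confirms (a minimal sketch; the propositions A = rich, B = good family, C = good connections follow the club example):

```python
from itertools import product

# Check A or (B and C) == (A or B) and (A or C) for every possible
# assignment of truth values. In classical logic the two clubs' rules
# admit exactly the same people.
agree = all(
    (a or (b and c)) == ((a or b) and (a or c))
    for a, b, c in product([False, True], repeat=3)
)
# agree is True: the distributive law holds classically. The claim of
# quantum logic is that propositions about quantum-level attributes
# need not obey it.
```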

Albert Einstein did not like the concept of a particle consisting of a wave of probabilities. In
1935, in collaboration with Boris Podolsky and Nathan Rosen he proposed a thought experiment
that would show that, even if quantum mechanics could not be proven wrong, it was an
incomplete theory. The idea was to create a hypothetical situation in which it would have to be
concluded that there existed quantum attributes/features that were not predictable by quantum
mechanics. The thought experiment is known as the EPR experiment.

The EPR experiment consists of the emission in opposite directions from some source of two
momentum correlated quantum particles. Correlated particles (correlated in all attributes, not just
by the momentum attribute) may be produced, for example, by the simultaneous emission of two
particles from a given energy level of an atom. Call the paths that the first and second particles
take the right hand and left hand paths respectively. At some point along the right hand path
there is a measuring device that can measure the momentum of the first particle. On the left hand
path there is an identical measuring device at a point twice the distance from the source as the
first device. When the particles are simultaneously emitted along their respective paths, according
to quantum theory, they both consist of a wave of probability that will not be converted into an
actuality until they are measured. At the point of being measured their probability wave is
collapsed, or the momentum attribute is forced to take a value, or the consciousness of the
observer assigns a momentum value (e.g. a human is monitoring the measuring device), or the
universe splits, or some other such event takes place to fix the momentum. Now consider the
events of the experiment. The particles are emitted at the same time in opposite directions. Soon
the first particle (on the right hand path) is measured and its momentum is fixed to a particular
value. What is the status of the second particle at this point in time? According to quantum
mechanics it is still a wave of probability that won't become actualized until it encounters the
measuring device (still off in the distance). And yet it is known that the left hand measuring
device will produce the same momentum that the right hand device has already produced and
when the second particle finally gets to that measuring device it does show the expected
momentum. Two possibilities present themselves: either the result of the measurement of the
first particle is somehow communicated to the second particle in time for it to assign the correct
value to its momentum attribute, or the second particle already 'knows' in some sense, which
actual momentum to exhibit when it gets to the measuring device. The particles are moving at or
near the speed of light so the former possibility requires superluminal communication and must
be rejected. But then the quantum particle must contain knowledge that is not accounted for by
quantum theory. In fact it must contain a whole list of values to assign to attributes upon
measurement because it can't possibly know which attribute will be measured by the measuring
device it encounters. So, Einstein concludes, quantum theory is incomplete since it says nothing
about such a list.
Einstein’s rejection of superluminal communication is equivalent to an assumption of locality.
That is, any communication that takes place must happen through a chain of mediated
connections and may not exceed the speed of light. Electric, magnetic and gravitational fields may
provide such mediation but the communication must occur through distortions of those fields
that can only proceed at subluminal speeds. In 1964 John Stewart Bell, an Irish physicist
attached to CERN, devised a means of using a real version of the EPR thought experiment to test
the assumption of locality. He substituted two beams of polarity correlated (but randomly
polarized) photons for the momentum correlated particles of the EPR experiment.
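The content of Bell's test can be sketched numerically (a toy model, not a description of the actual apparatus; the four detector angles and the CHSH combination of correlations are the standard choices used in later formulations of the test). Quantum mechanics predicts a correlation E(a, b) = cos 2(a − b) for polarity correlated photons; any local hidden-variable model is bounded more tightly:

```python
import math, random

def qm_correlation(a, b):
    # Quantum prediction for polarization-correlated photon pairs.
    return math.cos(2 * (a - b))

def lhv_correlation(a, b, n=100_000, seed=1):
    # A toy local hidden-variable model: each photon pair carries a
    # predetermined polarization angle lam, and a detector at angle t
    # answers +1 when lam lies within 45 degrees of t (mod 180).
    rng = random.Random(seed)
    def outcome(lam, t):
        d = (lam - t) % math.pi
        return 1 if min(d, math.pi - d) < math.pi / 4 else -1
    total = 0
    for _ in range(n):
        lam = rng.uniform(0, math.pi)
        total += outcome(lam, a) * outcome(lam, b)
    return total / n

def chsh(E):
    # Standard combination of correlations at four detector angles.
    a1, a2, b1, b2 = 0.0, math.pi / 4, math.pi / 8, 3 * math.pi / 8
    return E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)

s_qm = chsh(qm_correlation)     # 2*sqrt(2), about 2.83
s_lhv = chsh(lhv_correlation)   # no local model can exceed 2
```

Locality caps the combination at 2; the quantum prediction of 2√2 is what the Clauser and Aspect experiments described below confirmed.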

By 1972 John Clauser at University of California at Berkeley, using a mercury source for the
beams obtained results indicating that the quantum mechanics predictions were correct. In 1982
Alain Aspect of the University of Paris performed a more rigorous version of the experiment.
The validity of quantum theory was upheld and the assumption of locality in quantum events
shown to be wrong. That quantum reality must be non–local is known as Bell's theorem. It is a
startling theorem that says that unmediated and superluminal or instantaneous communication
occurs between quantum entities no matter their separation in space. It is so startling that many
physicists do not accept it. It does little to change the eight pictures of quantum reality given
above except to extend the scope of what might be considered a measurement situation from the
normal reality description of components to include virtually everything in the universe. The
concept of phase entanglement is promoted from mathematical device to active feature of
quantum reality.

Of course, we see none of this in our comfortable, normal reality. Objects are still constrained to
travel below the speed of light. Normal objects communicate at subluminal rates. Attributes are
infinitely measurable and their permanence is assured even though they may go for long periods
of time unobserved and uncontemplated. We can count on our logic for solving our problems, it
will always give the same correct answer, completely in keeping with the context of our normal
reality. But we are constructed from quantum stuff. How can we not share the attributes of that
level of existence? And we are one of the constituents of the Milky Way, (although we never
think of ourselves as such); what role do we play in the nature of its existence? Does such a
question even make sense? It might help to look more closely at the nature of levels of existence.

Levels of Existence

Biologists, physicists, astronomers, and other scientists, when describing their work, often
qualify their statements by specifying the scale at which their work occurs. Thus it's common to
hear the 'quantum level', the 'atomic level', the 'molecular level', the 'cellular level', the 'planetary
level' or the 'galactic level' mentioned in their work. And they might further qualify things by
mentioning the time frame in which the events of their discipline occur, e.g. picoseconds,
microseconds, seconds, days, years, millions of years, or billions of years.

The fact that such specifications must be made is not often noted; it's just taken as the nature of
things that, for example, objects and rules at the quantum level don't have much to do with the
objects and rules at the cellular level, and the objects and rules at the cellular level don't have
much to do with objects and rules at the galactic level and so on. A bit of reflection however,
reveals that it's really quite an extraordinary fact! All of these people are studying exactly the
same universe, and yet the objects and rules being studied are so different at the different levels,
that an ignorant observer would guess that they belonged to different universes. Further, when
the level at which humankind exists is placed on a scale representing the levels that are easily
identified, it is very nearly in the center (see figure 1).

Three reasons why that might be come immediately to mind: 1) we're in the center by pure
chance, 2) the universe was created around us (and therefore, probably for us), or 3) it only looks
like we're in the center because we can 'see' about the same distance in both directions. Is our
perception of these levels, embedded in the scale of being, just an artifact of our failure to see the
underlying scheme (i.e. the universe can be explained by a few immutable object types together
with their rules of interaction, so that the different objects and rules that we perceive at different
levels are simply a complex combination of those primitives), or were these levels ‘set up’, in
some way for our support, or do these levels (and their objects and rules) have a real and, in
some sense, independent existence? The answer we give is part of the hypothesis that will be
made below: levels exist and contain rules not derived from surrounding levels of existence, and
the further removed any two levels, the less the one can be interpreted or explained in terms of the other.

If accepted, this answers the question of why we seem to be at the center of the scale of being;
the further removed a level from the human level of existence, the less that level has in common
with the human level and the less humans are able to interpret what occurs at that level in familiar
terms. For sufficiently remote levels, no interpretation can be made. We cannot 'see' beyond what
our minds can comprehend. So, for example, we don't understand quantum reality because it is on
the very edge of that to which we as humans can relate. For the same reason, at the other end of
the scale, we can't conceive of all that is as the universe, and at the same time, the universe as all
that is, nor do our minds comfortably deal with infinity and eternity.

    10^-15  radii of proton or neutron; atomic nuclei; range of the strong and weak nuclear forces
    10^-10  radii of the electron shells of atoms
    10^-5   macro-molecules (DNA etc.); blood corpuscles; snowflakes and animal cells
    10^0    man
    10^5    small town; large city
            radius of the Earth; radius of the Sun; radius of the Solar System; distance to the nearest star
    10^20   radius of the galaxy (Milky Way); radius of the largest structures (clusters of galaxies)

Figure 1: New scale of being (sizes in meters, powers of ten)

If we were a quantum entity, we would undoubtedly express the same wonder at the impossible nature of the strange subluminal
world populated by unconnected, virtually uncommunicative, objects with frozen attributes, that
exist at the very edge of the ability to comprehend. But we shall pull back from such tempting
speculations. Our purpose is not to investigate the concept of levels in general. We will leave
many questions unanswered. We are mainly interested in the fact of the existence of levels and
the effect of that on what we call intelligence. We take up that problem in subsequent sections of
this paper. It would be good, however, to have a firmer grasp of how and why levels occur.

Why should the universe be hierarchically structured, and what process or processes lead to the
semi-permeability of objects and rules across levels?

Mathematics and Chaos

Mathematics was invented as an aid to reasoning about objects and processes in the world. Its
main feature is abstraction and its main implementation mechanism is deduction from a set of
axioms. In the beginning the axioms were restricted to self evident, non-reducible truths about the
world. Tools of proof were restricted to real world artifacts such as the compass and divider.
Eventually it was discovered that the restriction to self-evident truths was not fundamental to the
process of deduction, and finite tools were not sufficient for the proof of all theorems that could
otherwise be shown to be true. It was found that different axiomatic systems could give rise to
diverse and interesting mathematical structures, some of which could be applied to real world
problems. Throughout all of its history, the axiomatic method remained the unquestioned
approach of mathematics. The cracks in that foundation evidenced by competing yet seemingly
valid axiomatic systems spurred the attempt to find an encompassing set of axioms upon which
to base all of mathematics. That search came to an abrupt end with the publication of Goedel's
theorem, which showed that no consistent axiomatic system rich enough to express arithmetic
could be complete; no such encompassing system could exist. Mathematics had moved from an aid to
reckoning, to acceptance as a main revealer of truth about the world; it was considered the
handmaiden or even the queen of the sciences. But like a skyrocket that rises from a bottle to the
sky and then explodes into a thousand beautiful whirligigs, mathematics, while a beautiful sight to
behold, no longer had, nor does it have, a discernible direction. The importance of Goedel's
theorem should not be overlooked; it does not just apply to abstract mathematics. It applies to
any codified (formal) system and so denies the validity of highly mathematical world models and
loosely argued cosmologies alike.

When a system is determined to be incomplete or inconsistent for some specific reason, then
those inconsistencies or the incompleteness can often be corrected by enlarging the system to
include appropriate axioms. This is reminiscent of the way in which one scientific theory
supplants another. Unfortunately, any such measure simply creates another system to which
Goedel's theorem applies as strongly as the first. To avoid an infinite regress it becomes
necessary at some point to accept a world system and the fact that, in it, there will be
unexplainable (non-reducible) or contradictory phenomena. We will argue below that this is not
simply a mathematical conundrum but a feature inherent in a creative hierarchically structured
universe. In preparation for that argument we first discuss an obvious manifestation of the
inexplicable in the world system that most humans find comfortable, that is the phenomena
subject to description (not explanation) in the mathematics of chaos theory.

Geometry, algebra, calculus, logic, computability theory, probability theory, and the various
other areas and subdivisions of mathematics that have been developed over the years serve to
describe and aid in the development of the various hard and soft sciences. As the sciences have
come to treat more complex systems (e.g. global weather systems, the oceans, economic systems
and biological systems to name just a few) the traditional mathematical tools have proved, in
practice, to be inadequate. With the advent of computers a new tactic emerged. The systems
could be broken down into manageable portions. A prediction would be calculated for a small
step ahead in time for each piece, taking into account the effects of the nearby pieces. In this
manner the whole system could be stepped ahead as far as desired. A shortcoming of the
approach is the propagation of error inherent in the immense amount of calculation, in the
approximation represented by the granularity of the pieces, and in the necessary approximation
of nonlocal effects. To circumvent these problems, elaborate precautions consisting of
redundancy and calculations of great accuracy have been employed. To some extent such tactics
are successful, but they always break down at some point in the projection. The hope has been
that with faster more accurate computers the projections could be extended as far as desired.
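The breakdown can be seen in miniature (an illustrative toy rule, not a real physical model): step a simple nonlinear recurrence forward from a 'true' initial state and from a measurement of it that is accurate to eleven decimal places, and watch the projection error grow.

```python
def step(x):
    # One step of a toy rule: multiply by ten, keep the fractional part.
    # Each step shifts the decimal digits left, so any initial error is
    # amplified roughly tenfold per step.
    return (10 * x) % 1.0

truth, measured = 0.123456789012, 0.123456789  # differ by about 1e-11
errors = []
for _ in range(9):
    truth, measured = step(truth), step(measured)
    errors.append(abs(truth - measured))
# errors[0] is about 1e-10; errors[-1] is about 1e-2. A projection that
# began accurate to eleven decimal places is useless within a dozen steps.
```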

In 1980 Mitchell J. Feigenbaum published a paper (Feigenbaum, 1980) in which he presented
simple functionally described systems whose stepwise iterations (i.e. the output of one iteration
serves as the input to the next iteration in a manner similar to the mechanism of the models
mentioned above) produced erratic, non-predictable trajectories. Beyond a certain point in the
iteration, the points take on a verifiably random nature; that is, the points produced satisfy the
condition for a truly random sequence of numbers8. The resulting
unpredictability of such systems arises not as an artifact of the accumulation of computational
inaccuracies but rather as a fundamental sensitivity of the process to the initial conditions of the
system.9 Typically these systems display behavior that depends upon the value of parameters in
that system. For example, the equation x_{i+1} = a·x_i(1 - x_i), where 0 < x_0 < 1, has a single
solution for each a < 3 (that is, the iteration settles down so that above some value of i, x_i
remains the same). At a = 3 there is an onset of doubling of solutions (i.e. beyond a certain i, x_i
alternates among a finite set of values; the number of values in that set doubles at incremental
values as a increases). Above a = 3.5699... the system quits doubling and begins to produce
random values. If you denote by λ_n the value of a at which the function doubles for the nth time,
then (λ_n - λ_{n-1})/(λ_{n+1} - λ_n) approaches 4.6692. The value 4.6692 and another value,
2.5029, are universal constants that have been shown to be associated with all such doubling
systems. The points at which doubling occurs are called points of bifurcation. They are of
considerable interest because, for real systems that behave according to such an iterative
procedure10, there is an implication that the system might move arbitrarily to one of two or more
solutions. The above example has one dimension, but the same activity has been observed for
many-dimensional systems and apparently occurs for systems of any dimension. When the points
generated by a two-dimensional system are plotted on a coordinate system (for various initial
conditions) they form constrained but unpredictable, and often beautiful, trajectories. The space
to which they are constrained often exhibits coordinates to which some trajectories are attracted, sometimes
8 For example, a sequence of values generated on an interval is random if the probability that the process will
generate a value in any given subinterval is the same as the probability of it generating a value in any other
subinterval of the same length.

9 Consider the following simple system: x0 = pi – 3, xi+1 = 10*xi – trunc(10*xi), where trunc means to
drop the decimal part of the argument. The sequence of numbers generated by the indicated iteration consists of
numbers between 0 and 1, where the nth number is the decimal portion of pi with the first n digits deleted.
This is a random sequence of numbers arising from a deterministic equation. Changing the value of x0,
however slightly, inevitably leads to a different sequence of numbers.
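The sensitivity claimed in footnote 9 can be demonstrated with exact rational arithmetic (a sketch, not from the original text; since pi is irrational and cannot be represented exactly, a rational seed such as 1/7 stands in for it, perturbed by one part in a billion):

```python
from fractions import Fraction

def digits(x0, n):
    """Run xi+1 = 10*xi - trunc(10*xi), recording the integer part stripped
    at each step; these are the successive decimal digits of x0."""
    out, x = [], x0
    for _ in range(n):
        x = 10 * x
        d = int(x)       # trunc(10*xi)
        out.append(d)
        x -= d
    return out

a = digits(Fraction(1, 7), 12)
b = digits(Fraction(1, 7) + Fraction(1, 10**9), 12)  # perturbed seed
mismatch = next(i for i, (p, q) in enumerate(zip(a, b)) if p != q)
print(a)         # [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7]
print(b)         # [1, 4, 2, 8, 5, 7, 1, 4, 3, 8, 5, 7]
print(mismatch)  # 8: the orbits part company at the ninth digit
```

The two orbits agree for eight digits and then diverge, exactly as the footnote describes: the perturbation marches one decimal place leftward with each iteration.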

10 For example the equation xi+1 = axi(1 – xi) is called the logistic function and can serve as a model for
the growth of populations.

providing a simple stable point to which the points gravitate or around which the points orbit. At
other times trajectories orbit around multiple coordinates in varying patterns.
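The behavior described above for xi+1 = axi(1 – xi) is easy to reproduce numerically; the following sketch (parameter values taken from the text, all other details assumed) counts the distinct values into which the orbit settles:

```python
def logistic_orbit(a, x0=0.5, warmup=1000, keep=64):
    """Iterate x -> a*x*(1 - x), discard the transient, and return the
    distinct values (rounded to 6 places) that the orbit settles into."""
    x = x0
    for _ in range(warmup):
        x = a * x * (1 - x)
    settled = set()
    for _ in range(keep):
        x = a * x * (1 - x)
        settled.add(round(x, 6))
    return sorted(settled)

print(len(logistic_orbit(2.9)))  # 1: a single solution below a = 3
print(len(logistic_orbit(3.2)))  # 2: the solution has doubled
print(len(logistic_orbit(3.5)))  # 4: doubled again
print(len(logistic_orbit(3.9)))  # many values: past 3.5699..., chaos
```

The warmup period discards the transient approach to the attractor, so what remains is the fixed point, the cycle, or, past the critical value of a, an erratic scatter of values.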

More spectacularly, some trajectories gravitate toward what have been called strange attractors.
Strange attractors are objects with fractional dimensionality. Such objects have been termed
fractals by Benoit B. Mandelbrot, who developed a geometry based on the concept of fractional
dimensionality (Mandelbrot, 1977). Fractals of dimension 0 < d < 1 consist of sets of points
that might best be described as a set of points trying to be a line. An example of such a set of
points is the Cantor set which is constructed by taking a line segment, removing the middle one
third and repeating that operation recursively on all the line segments generated (stipulating that
the remaining sections retain specifiable endpoints). In the limit an infinite set of points remains
that exhibits a fractal dimensionality of log 2 / log 3 = .6309. Objects of varying dimensionality in
the range [0,1] may be constructed by varying the size of the chunk removed from the middle of
each line. An object with dimensionality 1 < d < 2 can be described as a line trying to become a
plane. It consists of a line so contorted that it can have infinite length yet still be confined to a
finite area of a plane. An example of such an object is the Koch snowflake. The Koch
snowflake is generated by taking an equilateral triangle and deforming the middle one third of each
side outward into two sides of another equilateral triangle. After the first deformation a Star of
David is formed. Each of the twelve sides of the Star of David is deformed in the same manner,
and the sides of the resulting figure deformed, and so on. In the limit a snowflake shaped figure
with an infinitely long perimeter and a fractal dimensionality of log 4 / log 3 = 1.2618 results.
Beyond a few iterations the snowflake does not change much in appearance.
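The dimensionalities quoted above follow from a simple counting argument: a self–similar object that decomposes into m copies of itself, each scaled down by a factor s, has dimension log m / log s. A brief check (a sketch, not from the original text) of the Cantor set and Koch snowflake figures:

```python
from math import log

def similarity_dimension(copies, scale):
    """Dimension of a self-similar object made of `copies` pieces,
    each scaled down by a factor of `scale`."""
    return log(copies) / log(scale)

cantor = similarity_dimension(2, 3)  # middle-third removal leaves 2 pieces at 1/3 scale
koch = similarity_dimension(4, 3)    # each edge becomes 4 edges at 1/3 length

def koch_perimeter(side, n):
    """Perimeter after n deformations: every edge is replaced by 4 edges 1/3 as long."""
    return 3 * side * (4 / 3) ** n

print(cantor)                 # 0.6309...
print(koch)                   # 1.2618...
print(koch_perimeter(1.0, 5)) # the perimeter grows without bound
```

The perimeter calculation also shows why the snowflake's boundary becomes infinite in the limit: each deformation multiplies the perimeter by 4/3.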

Many processes in the real world display the kind of abrupt change from stability to chaos that
Feigenbaum's systems display: turbulence, weather, population growth and decline, material
phase changes (e.g. ice to water), the dispersal of chemicals in solution, and so forth. Many objects
of the real world display fractal–like dimensionality. Coastlines consist of convolutions, in
convolutions, in convolutions, and so on. Mountains consist of randomly shaped lumps, on
randomly shaped lumps, on randomly shaped lumps, and so on. Cartographers have long noted
that coastlines and mountains appear much the same no matter the scale. This happens because
at large scales detail is lost and context is gained while at small scales detail is gained but context is
lost. In either case, shorelines look convoluted and mountains bumpy. It might be expected,
given that the chaotic processes described above give rise to points, lines, orbits, and fractal
patterns that real world systems might be described in terms of their chaotic tendencies. And
indeed that is the case. The new area of research in mathematics and physics arising from these
discoveries has come to be known as chaos theory. Chaos theory holds both promise and peril
for model builders. Mathematical models whose iterations display the erratic behavior that
natural systems display, capture to some extent the nature of the bounds, constraints and
attractors to which the systems are subject. On the other hand, the fact that any change, however
slight (quite literally), in the initial conditions from which the system starts, results in totally
different trajectories implies that such models are of little use as predictors of a future state of the
system. Further, it would appear that it may be very difficult, perhaps even impossible, to
extract the equations of chaos (no matter that they may be very simple) from observation of the
patterns to which those equations give rise. Whatever the case, the nature of the activity of these
systems has profound implications for the sciences.

If Pierre–Simon Laplace turns out to have been correct in assuming that the nature of the
universe is deterministic (which the tenets of chaos theory do not preclude, but which they do
argue against11), he was wrong when he conjectured that he only needed enough calculating
ability and an initial state to determine all the past and future history of the universe. A major
error was in assuming that initial conditions could ever be known to the degree necessary to
provide for such calculations. As an example, imagine a frictionless game of billiards played on a
perfect billiard table. The player, who has absolutely perfect control over his shot and can

11 We reject it mainly because of the butterfly effect, which holds that the flapping of the wings of a butterfly
in Tokyo might affect the weather at a later date in New York City. This is because the weather systems at the
latitude of New York City and Tokyo travel west to east, so that the weather in New York can be thought of
as arising from the initial conditions of the atmosphere over Tokyo. As another example, consider the old
saying, "for want of a nail the shoe was lost, for want of a shoe the horse was lost, for want of a horse the
battle was lost, and for want of the battle the kingdom was lost." Consider the nail in the horse's shoe. For
some coefficient of friction between the nail and the hoof, the shoe would have been retained. The exact value
of that coefficient might have depended on the state of a few molecules at the nail/hoof interface. Suppose
those molecules were behaving in a manner that had brought them to a point of bifurcation that would result
in either an increase or decrease in the coefficient of friction. In the one case the shoe would have been lost and
in the other it would have been retained. The loss of the nail would then have been a true random event that
would have propagated its way to the level of events in human civilization. If the battle in question had been
the battle of Hastings, the strong French influence in the language you are reading might well have been
absent.
calculate all of the resulting collisions, sends all of the balls careening around the table. If he
neglects to take into account the effects of one electron on the edge of the galaxy, his calculations
will begin to diverge from reality within one minute (Crutchfield et al., 1986). Recognizing the
impossibility of prediction even in a theoretically deterministic universe decouples the concepts
of determinism and predictability, and of describability and randomness. Further, chaotic systems
have built into them an arrow of time. The observer of such a system at some given point in time
can not simply extrapolate backward to the initial conditions nor can he (so far as we know)
divine the underlying equations by the character and relationship of the events it generates. The
laws that underlie those events are obscured and can be determined only by simulated trial and
error, the judicious use of analogy, description in terms of probability, or blind luck. There is a
strong sense that something of Goedel's theorem is revealing itself here. Just as in any
mathematical system of sufficient complexity there are truths that cannot be proved, there are
real deterministic systems for which observations can be made that cannot be predicted. Perhaps
the most important result of the discoveries in chaotic processes, is the growing realization
among scientists that the linear, closed system, mathematics that have served so well for
describing the universe during the last two hundred and fifty years, and which gave rise to the
metaphor of a clockwork universe, only describe exceptionally well behaved and relatively rare
(but very visible) phenomena of that universe. It is becoming apparent that it may not be
possible to derive a neat linear, closed system model for everything that exhibits a pattern. It is
likely that new non–linear, open system approaches that rely heavily on computers will be
required to proceed beyond the current state of knowledge, and that in the explanation of
phenomena in terms of laws, those laws may have to be constrained to the levels of existence of
the systems they describe and not be reduced to more fundamental laws at more fundamental
levels.

It is not surprising that the investigation of chaotic activity waited on the development of fast
computers and high resolution graphics display devices. The amount of calculation required to get
a sense of chaos is staggering and the structures revealed can only be appreciated graphically with
the use of computers. But chaos theory has implications for computing as well. Computers have
been around for a long time. The first machine that can truly be said to have been a computer was
the Analytical Engine, designed in England in the mid nineteenth century by Charles Babbage. The
machine was never finished, but Ada Augusta Byron, daughter of Lord Byron and holder of the
title Countess of Lovelace, wrote programs for it, making her the first computer programmer.
The Analytical Engine, though incomplete, was a showpiece that impressed and intimidated
people. The countess is said to have stated (McCorduck, 1979):

"The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to
order it to perform."

Since then, those sentiments have often been repeated to reassure novice programmers and others
whose lack of understanding causes them to see the computer as an intimidating, intelligent, and
perhaps malevolent entity. More experienced programmers, however, find that the machine is
truly an ignorant brute, incapable of doing anything on its own. And so the myth is propagated
across generations. In chaos theory the myth is refuted. In pursuing chaotic processes the
machine can truly create, though in a mechanical manner that seems to lack purpose. We will
argue that the seeming lack of purpose is a human prejudice based on our concept of what
constitutes purpose, and that chaotic processes may very well be the mark of a creative universe.

The structure revealed in chaotic processes is statistically accessible. While the sequence and
position of the points generated within the constraints of the system are not predictable, if the
points (taking a two dimensional space as an example) are treated as forming a histogram on a
grid imposed upon the shape/form generated by the system, then the system can be treated
probabilistically. In terms of levels, the points may be seen as occupying one level and the form
defined by the points as occupying the next higher level. The theory then has potential for the
probabilistic prediction of system activities. This appeal to statistical description is not
frivolous, but in fact is essential to any description of the universe as David Layzer indicates in
the statement of his proposed cosmological principle.

According to Layzer (Layzer 1990) (and as indicated above) the world/universe at its
fundamental smallest scale (quantum level) consists of permanent and indistinguishable objects
that occupy discrete (but, in principle, unpredictable) states as opposed to the continuous,
theoretically predictable, distinguishable objects that populate the macro world. The
unpredictability permits truly random behavior as opposed to the deterministic behavior that is
assumed to exist in a (classical) continuous macro world system. In combination with the
discreteness this allows (in fact demands) a statistical description of the universe since the
microstates of finite systems (consisting of particular configurations of quantum level objects) are
finite in their description but statistical in their configuration. This has important ramifications in
an infinite universe populated by such systems. An example that illuminates the connection
between the discreteness of quantum level objects, the finiteness of systems composed of such
objects, and the possibility of an adequate and complete statistical description of the universe is
as follows:

Consider a sequence of ones and zeros each of which has been chosen by the flip of a coin. If the
sequence has a beginning or end then the sequence is unique and any other sequence (with a
beginning or end) can be compared to it on an element to element basis (i.e. just compare the
elements starting from the respective beginnings or ends). But if a sequence has no beginning or
end then the only criterion for comparison is to stipulate that any subsequence, no matter how
long, in the one sequence must have an identical counterpart in the second sequence (an
impossible task to actually carry out). But, since in any such infinite random sequence every finite
subsequence must occur somewhere (with probability 1), any two such infinite sequences must be identical. To
produce a sequence of the first kind (one with a beginning or end) requires an infinite amount of
information (the specific infinite sequence in question). But to produce an infinite sequence with
no beginning or end only requires that the probability of a one or a zero at any cell in the infinite
sequence be specified (in this example P(1) = P(0) = .5). A Laplacian, deterministic, universe
corresponds to a specification of the first kind, and contains an infinite amount of specific
information (the initial conditions). On the contrary, a statistically described universe contains
very little information, and even that is statistical in nature. Yet obviously it describes a universe
every bit as complex and varied as the Laplacian version.

That is, a complete statistical description of the universe in terms of probabilities and averages
can exist if the universe is

1. infinite
2. statistically uniform (average properties are the same everywhere)
3. finitely non-uniform (i.e. subsystems are finite)
4. discrete (in the microstates of finite subsystems)

All of this gives rise to Layzer’s Cosmological Principle:

A complete description of [an] initial state [of the universe] contains no information at all that
would serve to define a preferred position in space or a preferred direction at any point.

[That is, a statistical description that applies equally everywhere in the universe is preferred to an
elaboration of the specific conditions that maintain at each place in the universe. Indeed, the
statistical description is required in an infinite discrete universe.]

A valid statistical description of the universe is essential for the next
section, in which order and entropy are explored. Given that we could explain (if not predict) the
processes of nature using some combination of statistics and chaos theory, there still seems to be
a missing element that spiritualists would happily call animus or moving force. What drives such
systems in the first place? The traditional answer, given by thermodynamics, is that there is
potential for work (force through a distance) in a system anytime there is a heat source and a
heat sink. Any system embodying such a source and sink is considered to be far
from thermodynamic equilibrium until the temperatures of the source and sink have been
equalized. A measure of the difficulty with which that heat can be converted to work is termed
entropy. One of the most famous laws of science states that in a closed system in which work or
heat interactions occur the entropy must increase. In the last century, an Austrian, Ludwig
Boltzmann, recognized that this behavior is intimately connected with the concept of the
possible states that a physical system might occupy. He was able to state the law simply as
S = k loge P, in which S is the entropy, P is the thermodynamic probability of the state (the
number of microscopic configurations in which it can occur), and k is the ratio of the ideal gas
constant to Avogadro's number and is called Boltzmann's constant. The
equation is engraved on Boltzmann's tombstone in Vienna. The description of systems by the use
of statistics and probability theory as reflected in entropic processes is taken up next.

Order, entropy, and information (the emergence of levels)

The word entropy is commonly used to indicate the process by which ordered systems become
disordered12. It is also the name for the measure of disorder. The next few paragraphs deal with
the measure of entropy but the importance of entropy to this paper is as a process.

There are two distinct types of systems to which the mathematical equation that describes
entropy is commonly applied: the physical systems that are the object of study in physics, and
the communication systems discussed in information theory. Entropy, as it applies to physical
systems, is known as the second law of thermodynamics and is associated with time–irreversible
characteristics of large systems whose sheer number of components and inaccessible initial
conditions preclude deterministic analysis. Its form arises in the statistical consideration of the
possible configurations of the components of a system and is often described as a measure of the
randomness in motion and distribution of those components. The law states that, in any closed
system13, the entropy, or disorder, or inability to convert heat or energy to work can only
increase or remain the same over time. Among the laws of physics this irreversibility with
respect to time is peculiar to the second law of thermodynamics (with the possible exception that
might result from investigations in chaos theory). This view of entropy is intimately connected
to the micro unpredictable nature of large real physical systems with many components and
consequently does not often appear in classical physics where the behavior of small deterministic
kinetic systems is investigated.

This form of the law is expressed in a simple form that first appeared in formulas attributable to
DeMoivre and that he used to investigate games of chance (Wicken, 1988). When applied to
thermodynamic systems (perhaps a container full of a gas or a liquid), it involves the
12 A system is a structure of components that function together. This distinguishes a system from a structure,
which may be an arrangement of non-dynamic items, or an aggregate, which may be an arrangement of
non–related items. 'Function together' may be interpreted liberally to mean constrained to interact. We may,
for instance, consider a gas in a container a system. The terms system, structure and aggregate depend to some
extent upon context. The phrase "aggregate of stones" is meant to imply that we are speaking of a collection
of stones whose size, hardness and other stony characteristics are not related. We could speak of the same
collection as a structure if we had reason to refer to the spatial distribution of those stones.

13 A closed system is one that cannot exchange energy, space or matter with its surroundings.

probabilities with which the states14 of the system manifest themselves. The entropy of a
system is given by the natural logarithm of the probability 'Pi' of each state 'i', weighted by itself
and summed over all of the states; this yields the formula

S = –k * Σ Pi * loge Pi (i = 1 to W), (1)

in which 'S' is the entropy, 'W' is the number of possible or attainable states of the system, and
'k' is Boltzmann's constant15. When the probability of occurrence of any state is the same as the
probability of the occurrence of any other state, i.e. when all state probabilities are equal to 1/W,
the maximum disorder occurs and (1) collapses to

max(S) = k * loge W. (2)

Given this measure of the maximum entropy possible in a system and the actual entropy (1), and
allowing that the percentage disorder displayed by a system can be represented as the ratio of
actual entropy to the maximum entropy, then we can define

disorder ≡ S/max(S), (3)

so that it is natural to define

order ≡ 1 – disorder. (4)
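Formulas (1) through (4) can be exercised directly; a minimal sketch (not from the original text, with k treated as a simple scalar, as footnote 15 permits):

```python
from math import log

def entropy(probs, k=1.0):
    """Formula (1): S = -k * sum of Pi * ln(Pi) over the attainable states."""
    return -k * sum(p * log(p) for p in probs if p > 0)

def max_entropy(n_states, k=1.0):
    """Formula (2): with all W states equally probable, S collapses to k * ln(W)."""
    return k * log(n_states)

def order_and_disorder(probs, k=1.0):
    """Formulas (3) and (4): disorder = S/max(S), order = 1 - disorder."""
    disorder = entropy(probs, k) / max_entropy(len(probs), k)
    return 1 - disorder, disorder

# A four-state system heavily biased toward one state is mostly ordered;
# the same system with equiprobable states is maximally disordered.
print(order_and_disorder([0.85, 0.05, 0.05, 0.05]))
print(order_and_disorder([0.25, 0.25, 0.25, 0.25]))
```

Note that the scale factor k cancels in the ratio (3), so order and disorder are dimensionless measures between 0 and 1.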

14 For a physical system 'State' implies one specific configuration of components (probably but not
necessarily at the molecular level). A state is thought of as occupying a point in phase space. Phase space is
simply a multidimensional space whose dimensions represent the spatial coordinates and momenta of the
components of the system so that one point in phase space can represent one state of the system (Tolman
1979). A probability density function defined on this space can be used to describe the system.

15 k = 1.380 x 10–16 erg per kelvin. In the subsequent development in this section k can be
thought of as a simple scalar value. The association of k with Boltzmann's constant is required in the
investigations of entropy associated with an ideal gas. Note that the form of the entropy equation given here
is just the expected value (in a phase space) of entropy as defined by Boltzmann.

In the second commonly portrayed form of entropy, the one that is associated with
communication theory, the meanings change in several ways.

Some people consider it a mistake for Claude Shannon to have named his measure of the
uncertainty concerning the information content of a message, entropy. The obvious reason for
his choice of the word is that the formulas that define information entropy are identical to those
that define the entropy of physical systems, i.e. they are just (1) and (2) above except that 'H' is
usually used to denote entropy instead of 'S'. A second reason is that when information is
transmitted through a communications channel, that information content can only remain the
same or decrease in time, just as the order in a closed physical system can only remain the same
or decrease in time. But equivalence in formula is not equivalence in interpretation unless the
constants and variables in each formula refer to the same things. In the pure cases mentioned thus
far, they do not. The meaning of the various components as they apply to physical systems
were given above. In communication theory the variables, constants and probabilities have the
following meanings: W refers to the number of possible character strings of length L, where each
character in a string is selected from an alphabet of N characters; that is, there exist W = N^L
possible strings; the Pi refer to the probability of string 'i' occurring; and the 'k' is used for
scaling purposes. It has become traditional to use log2 instead of loge in the formula since the
alphabet used in information theory is that of electronic switching and consists of two characters,
i.e. the binary alphabet of '1' and '0'. The 'k' may be used to scale or normalize the equation. If,
for instance, we assign k = 1/log2W then the minimum entropy is scaled to 0 while the maximum

entropy is scaled16 to 1. The minimum entropy is associated with the certainty that the message
is one particular message out of the W possible messages while unitary entropy represents the
complete uncertainty as to which of the W possible messages is the correct one. An increase of
information entropy is then associated with the loss of certainty as to which string (and its
16 The minimum information entropy occurs when it is known with certainty which string of the W strings
comprises the message. In that case

S = (–1/log2W) * log2 1 = 0.

When every message is equally probable then

S = (–1/log2W) * Σ (1/W) * log2(1/W) = (–1/log2W) * (log2 1 – log2 W) = 1.

associated information) is being sent or is intended. Such a loss of certainty is usually attributed
to noise in the environment in which the transmission takes place. It's a theorem of information
theory that the increase in entropy can be made arbitrarily small by increasing the redundancy in
the message.
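The effect of that theorem can be illustrated (though not proved) with the simplest redundant code, a repetition code in which each bit is sent n times and decoded by majority vote; the ten percent flip rate below is an assumed channel model chosen only for illustration:

```python
from math import comb

def majority_error(p, n):
    """Probability that a bit sent as n copies (n odd) over a channel that
    flips each copy independently with probability p is decoded wrongly,
    i.e. that more than half of the copies arrive flipped."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n // 2 + 1, n + 1))

# At a 10% flip rate, each extra pair of repetitions shrinks the residual
# uncertainty about which bit was sent.
for n in (1, 3, 5, 9):
    print(n, majority_error(0.1, n))  # ≈ 0.1, 0.028, 0.00856, ...
```

As n grows the decoding error, and with it the receiver's uncertainty about the message, can be driven as close to zero as desired, at the price of the added redundancy.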

As noted, the formulas (1) and (2) normally refer to different concepts when used to describe
features of a physical system as opposed to an information system. However, both kinds of
entropy may be legitimately used to describe aspects of either kind of system (since information
can be associated with the structure of any physical system, and any information system is
ultimately expressed in a physical structure that will dissipate energy in its activities). The
distinction between an information system and a physical system is sometimes blurry.

Take for example DNA replication. DNA is a physical system consisting of strings of four kinds
of molecular bases, combined in long sequences and forming the famous double helix. Other
catalytic molecules can interact with DNA and free bases in the immediate environment, forming
and breaking chemical bonds, in a process that results in the construction of another molecule
(RNA). RNA is a copy of a portion of the original DNA. The change of entropy in this system
may be described in two ways. One description depicts it as a physical system that at one
point in time exhibits a certain degree of organization (as represented by the DNA, catalyzing
agents and free molecular bases) and that at a later point in time displays more organization (as
represented by the DNA, catalyzing agents and RNA). This change translates into a decrease in
physical entropy. A second equally valid description depicts the DNA as a string of characters
selected from an alphabet of four characters whose replication in the RNA (creating redundancy)
makes more certain the transmission of the information that they contain, i.e. decreases the
information entropy of the system. The process, viewed in either way, has the same result: it
provides the zygote with the organization it needs to construct an organism.

The similarity between thermodynamic entropy and information theoretic entropy in this case
can lead to speculation that the connection between the two forms of entropy is more than
coincidental. This suspicion is strengthened by the manner in which Maxwell's demon may be
dispatched. In 1871 James Clerk Maxwell, the great physicist, hypothesized a demon whose
actions would violate the second law of thermodynamics. He stationed his demon at a door
between two otherwise closed chambers of gas, say chambers A and B. When the demon saw a
particle of the gas approaching the door from the chamber A side, he would open the door and let
the particle enter into chamber B. He would be careful not to let any particle from chamber B get
into chamber A. In this way the order in the system comprised of the two chambers full of gas
would be increased, the entropy would be decreased and the second law would be violated. While
Maxwell's original description can be attacked on many situational grounds (e.g. the energy
dissipated in opening the door or "seeing" a molecule coming could make up for the decrease in
entropy) all such objections can be countered by modifying the situation. They are unessential
to the underlying problem which is the possibility of an entropy decrease caused by an
intelligent reordering of a physical system. The problem can be stated as: "can the entropy in a
system be decreased by friction free intelligent manipulation of its components?" In 1929 Leo
Szilard (Szilard 1983) proposed a different kind of system that came to be known as Szilard's
Engine. The system appears to convert heat from its surroundings into work, an effect equivalent
to a reduction in entropy. It relies upon being able to determine and remember in which half of a
cylinder a molecule of gas can be found. By knowing this fact, the system positions a piston to
obtain work. In his careful analysis of the system, Szilard determined that in measuring and
remembering the location of the molecule, the system squanders the energy it gains. This was
before the advent of computers but Szilard's solution just misses the current solution as given by
Charles H. Bennett (Bennett 1987). In his solution Bennett maintains that it is the clearing of
memory, in anticipation of storing the information needed to extract the work, that spends the
entropy decrease manifest in the work. Clearing memory is an irreversible, entropic event, as
shown by Rolf Landauer (Bennett 1985). Storing information is not necessarily an entropic
event. The significance of this for our purposes lies in the meeting of physical entropy and
information theoretic entropy. If the remembering device is a computer the information stored in
it (and subsequently cleared) can be described as information in the information theoretic sense.
This fact coupled with Bennett's solution appears to establish an exchange rate between
information theoretic entropy and thermodynamic entropy.
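That exchange rate can be given a number. By Landauer's result, clearing one bit of memory must dissipate at least kT loge 2 of energy; a back–of–the–envelope calculation (in modern SI units, not from the original text):

```python
from math import log

k_B = 1.380649e-23  # Boltzmann's constant in joules per kelvin (SI)

def landauer_limit(temperature, bits=1):
    """Minimum energy in joules dissipated by clearing `bits` bits of memory
    at the given absolute temperature: bits * k * T * ln(2)."""
    return bits * k_B * temperature * log(2)

# Clearing a single bit at room temperature (about 300 K):
print(landauer_limit(300))  # roughly 2.9e-21 joules
```

The quantity is minuscule by everyday standards, which is why the thermodynamic cost of information processing went unnoticed for so long, but it is strictly greater than zero, and that is what dispatches the demon.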

That a physical system can contain information in the information theoretic sense is hardly
startling; a computer is a machine dedicated to that purpose as is a traffic light, television set,
clock, abacus and a strand of DNA. In these systems information may be contained in the
physical structure of the system and the physical structure of the system may derive, to some
extent, from the information it contains. Structure in the world takes the form of, or may be
described in terms of hierarchical organization. Thus books are comprised of chapters, chapters
are comprised of paragraphs, paragraphs of words and words of letters. Animals consist of
organs, organs of tissues, tissues of cells and so forth. Even processes submit to hierarchical
description: a meal consists of preparation, consuming the meal, and cleaning up. Preparation
consists of selecting appropriate foods, cooking them and serving them. Consuming the meal
consists of several actions, such as cutting, scooping and spearing various foodstuffs,
transporting them to your mouth, chewing, swallowing and so on. The description of entropy
must then deal with the hierarchical structure of real systems, in particular, the possible
groupings, at the various levels of the hierarchy, of the physical objects that manifest
information. Pure information theoretic formulas such as (1) and (2) treat only the lowest level
(or terminal nodes) as information. This is because the information elements are abstract and
considered interchangeable. But physical structures that carry information are distinct. Grouping
them requires that the combinatorics17 of such a selection process be considered. The formulas
(1) and (2) are insufficient for the purpose; however, it is not difficult to modify them so that
they work (Brooks, Cumming and LeBlond 1988).

The basic idea is to identify the distinct subsystems, calculate their entropy and sum. For
example, assume that a subsystem may consist of an arrangement of L components that may be
selected from a basic alphabet of N elements (for instance letters or molecules arranged in strings

17 If it is desired to calculate the information containing potential of 3 binary digits we simply use formula (1),
with k = 1 (with 2^3 = 8 possible, equally probable states), to calculate S = 8*((1/8)*log2 8) = 3. But a system of
three toggle switches may be grouped in a number of ways (creating hierarchical systems) that may contain
additional information. For example, the three switches may be grouped separately, or as a combination of 2
switches and 1 switch, or as a group of three switches. When treated separately there are 3 ways to choose the first
switch, two ways to choose the second switch and one way to choose the last switch, for a total of 6
combinations. There are three combinations of 2 switches and 1 switch, and there is 1 combination of 3 switches.
Altogether 10 different structures may be composed, each of which may be associated with information in addition
to that which may be associated with the terminal nodes. As is shown further on, this can allow max(S) = 6.

or arrays). There are Ni possible distinct subsystems of size i. The entropy equations of an i
sized subsystem is from (1), (letting k be 1),

Si = – SPi,jlog2Pi,j (j = 1, Ni). (5),

where Pi,j is the probability of the jth distinct subsystem of size i occurring. The entropy of the

whole system is

S = SSi (i = 1, L), (6)

At maximum entropy each subsystem has an equal probability of occurring i.e. Pi,j = 1/Ni. In

that case (5) becomes

Si = i * log2N. (7)

The maximum entropy of the whole system (which may also be thought of as its information
carrying capacity ) is then

max(S )= S i * log2N (i = 1, L) = (L(L+1)/2)*log2N. (8)

Order and disorder may again be defined as in (3) and (4) above except that they apply to a
hierarchical structure. Note that the maximum entropy of a system that permits hierarchical
structure is greater than the maximum entropy of a system consisting of one level. This will be
significant in the development below.
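As a rough check on equation (8), the following Python sketch (our own illustration, not part of the original argument) compares the information capacity of a flat, single-level system with that of one permitting hierarchical groupings; the three-binary-switch case of footnote [17] gives 3 bits flat and 6 bits hierarchical.

```python
from math import log2

def max_entropy_flat(L, N):
    """Maximum entropy of a single-level system: L elements from an alphabet of N."""
    return L * log2(N)

def max_entropy_hierarchical(L, N):
    """Equation (8): sum of i*log2(N) over subsystem sizes i = 1..L,
    which equals (L*(L+1)/2)*log2(N)."""
    return sum(i * log2(N) for i in range(1, L + 1))

# Three binary switches: 3 bits as a flat system, 6 bits with hierarchy allowed.
print(max_entropy_flat(3, 2))          # 3.0
print(max_entropy_hierarchical(3, 2))  # 6.0
```

The hierarchical capacity grows quadratically in L rather than linearly, which is the sense in which a hierarchical system "absorbs more order with fewer parts".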

That the entropy equations are important for the investigations of physicists and
communications researchers is obvious, but entropy is also becoming important in the investigation
of many systems that become more ordered over time, that is, that organize themselves. This
means that entropy becomes important in the investigations of biologists, astronomers,
economists, sociologists, researchers into artificial intelligence and other scientists who
investigate emerging self–ordering systems. In these systems the distinction between the two
forms of entropy is definitely blurred.

As a descriptor of an important attribute of large systems, entropy has been around for a long
time. It has not, however, entered into the investigations of systems that self organize, because it
has not been clear how a description of the tendency of systems to dissipate could be applied to
such systems. In fact, entropy poses a conundrum: how can a system become more ordered
when the second law implies that just the opposite tendency is normal? The answer requires a
careful look at the nature of systems.

Three kinds of evolving systems can be identified (Olmstead p245 1988). Closed systems are
systems that cannot exchange matter or energy with their surroundings; in obeying the second law
of thermodynamics they evolve toward a disorganized equilibrium state (a state in which no
organization exists and no change in organization occurs). There are few if any examples of truly
closed systems in nature; a laboratory experiment that isolates some material from all outside
influences might approximate a closed system, or the universe itself might be used as an example.
A second kind of evolving system is an out–putting system. Out–putting systems are those that
can expel matter and/or energy and that can evolve toward an ordered equilibrium state at the
expense of an increase in the entropy of the surrounding system (which includes the out–putting
system). A star provides a good example of this kind of system. Stars can achieve the status of
neutron stars in which organization is present but no further change occurs. The last kind of
evolving system is a processing system, in which energy and matter are input to the system,
processed, degraded and retained or expelled, leading to an ordered, relatively steady state system
far from equilibrium. Continuance of the steady state depends upon maintenance of the flow of
energy and matter into and out of the system. The earth's surface/atmosphere and all biological
systems are good examples of processing systems. In both out–putting and processing systems, entropy
can decrease in the system at the expense of a corresponding increase in the entropy of the
encompassing environment. A processing system is intimately connected to and dependent over
time on its environment. This empirically established fact is developed formally by Ilya
Prigogine (Prigogine 1980), especially as it applies to thermodynamic systems.
A fundamental question is: how did order in the universe come to be in the first place? Most
theories of the formation of the universe hypothesize an early stage of homogeneous
disorganization. For the second law to have held from the beginning, it would be required that
order appear and entropy increase concurrently. Since an increase in order implies a decrease in
entropy this would seem to be a contradiction. The explanation is that the two processes are not
mutually exclusive in an expanding universe. As pointed out by Layzer (Layzer p26 1987) the
order exhibited by a system is the difference between the maximum possible entropy in the
system and the actual entropy (Os = max(S) – S, ... if we normalize this by multiplying through

by 1/max(S) we obtain O = 1–S/max(S), this is equivalent to setting k = 1/logW and provides the
motivation for defining disorder as S/max(S)). The increase in the physical size of the universe
would mean that the states that the physical substance of the universe might occupy increase
rapidly. If that rate of increase is greater than the rate at which the substance of the universe can
occupy those states, then the maximum entropy would increase faster than the actual entropy. In
general, the entropy and order of a system can both increase so long as the phase space
describing the possible states that the system can occupy grows at a rate sufficient to
accommodate the growth of both. Specifically, as regards the creation of the universe, a
rapid expansion led to a cooling and the subsequent coalescing of matter and energy. A currently
popular theoretical account of the big bang (up to 10^–30 seconds), called inflation (Guth and
Steinhardt 1984), depends on an immense expansion occurring during that early fraction of a
second of the existence of the universe. The expansion provides the driving force to produce all of
the matter and energy of the universe (hence its structure and order), possibly out of nothingness.
As a more prosaic example, imagine a container of gas as a closed system. Its entropy approaches
the maximum as the gas molecules are nearly randomly distributed throughout its volume. Then
in a very short period of time the container expands to millions of times its original volume. The
gas immediately tries to fill the new volume, but it cannot do this as fast as the container is
expanding. In the new system with its larger phase space the small gas cloud represents an
orderly system, though one in which entropy is increasing rapidly. With the increase in container
size (hence maximum entropy) comes an increase in order. More formally, if we view entropy
and maximum entropy as functions of time

dOs/dt = d(max(S))/dt – dS/dt, so that (9)

dOs/dt > 0 if d(max(S))/dt > dS/dt. (10)

That is, order must increase if the rate of growth of the maximum entropy is greater than the rate
of growth of the actual entropy. If a system is being driven toward order by such a process, a
hierarchically organized system will permit greater order than a single level system. Now consider
a planet. It collects energy and matter, degrades it and expels energy and matter into space. The
phase space of the planet is growing as the totality of all energy and matter that have impinged
upon it. Some planets simply heat up and reradiate the energy; others also display elaborate
atmospheric formations. To explain the particular means by which order is expressed on the earth
one has to investigate the particular composition and situation of the earth. Obviously the
composition of the earth permits a hierarchical structure and a rapid progression to greater order.
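Inequality (10) is easy to see numerically. In the toy calculation below (the growth rates are our illustrative assumptions, not taken from any physical model), the maximum entropy doubles at each step while the actual entropy grows only additively, so the order Os = max(S) – S increases even though S itself never stops increasing.

```python
def order(max_S, S):
    """Layzer's definition of order: Os = max(S) - S."""
    return max_S - S

max_S, S = 10.0, 9.0
orders = []
for step in range(5):
    max_S *= 2    # phase space, hence maximum entropy, expands rapidly
    S += 2.0      # actual entropy keeps increasing, but more slowly
    orders.append(order(max_S, S))

print(orders)  # [9.0, 27.0, 65.0, 143.0, 301.0] -- strictly increasing
```

Both S and Os grow at every step: entropy increases in accordance with the second law, yet the system becomes more ordered, exactly the situation described for the expanding container of gas.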

The observation that entropy and order can both increase in an expanding phase space has
implications for biological systems. Prigogine, while describing how processing systems can
exhibit a decrease in entropy, does not provide an explanation of why the amount and speed of
entropy decrease (or increase in order) should vary so much from system to system. In
particular, why is it that the production of order is so much greater in biological systems than in
non–biological systems? That biological order might be created and maintained at the expense of
thermodynamic order was a view first espoused by Erwin Schrödinger in the 1940s, but the
complexity of biological systems made it difficult to accept a principle that seemed to apply
more to simple chemical systems. Most biologists concentrated on natural selection as the main
mechanism driving evolutionary processes. Now, however, there are new theories, gaining some
acceptance, that depend upon the hierarchical description of entropy given above to explain
evolution (Brooks, Cumming, and LeBlond 1988). First, they note that Dollo's law and the
proposition that natural selection is the primary driving force of evolution are not compatible.
Dollo's law is the empirically supported hypothesis that biological evolution is characterized by
the monotonically increasing complexity of organisms. If natural selection were the only or even
the major driving force for evolution, the cyclical nature of climatic conditions over the eons

would cause a corresponding cyclical effect in the evolution of life. Such cycles are not observed,
instead, a continuous increase in the complexity of organisms is observed. Secondly, they point
out that the genome (gene bearing structures) of an organism represents a phase space for the
potential development of the organism. An increase in the phase space, as represented by
extended gene structures, can result in more order while entropy increases. They specify the
characteristics of a self–organizing system as follows (Brooks, Cumming, and LeBlond 1988, p216):

1) All information (entropy) and order capacities (maximum entropies), both total and at each
level, increase over time.
2) a) At low levels (small L), disorder is greater than order, and
b) At high levels (large L), order is greater than disorder.

The first requirement is a restatement, in a manner applicable to a hierarchical system, of the

above discussed possibility of a concurrent increase in order and entropy in a thermodynamic
system. The second requirement stems from the observed nature of self-organizing systems. If 2a
maintains but not 2b, the system is disordered. If 2b maintains but not 2a, then a limited ordered
system such as a snowflake results. If both 2a and 2b maintain, then a complex system results. Brooks et al.
checked DNA against this hypothesis, testing the order/disorder capacity of the DNA of various
life-forms. Their results confirmed that DNA conforms to 2. When 1 and 2 maintain, then an
increase in complexity is expected over time and Dollo's law is given theoretical underpinnings. It
is the self–organizing, energy processing systems that exhibit the strongest tendency to order and complexity.

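Requirement 2 can be illustrated on a toy "genome". The sketch below is our own construction (the string and function names are hypothetical, not from Brooks et al.): it estimates the level entropy S_i of length-i substrings of a repetitive string from their observed frequencies, and computes order as 1 – S_i/max(S_i) with max(S_i) = i * log2 N as in equation (7).

```python
from collections import Counter
from math import log2

def level_entropy(text, i):
    """Entropy of the observed distribution of length-i substrings, in the
    spirit of equation (5), with probabilities estimated from window counts."""
    windows = [text[k:k + i] for k in range(len(text) - i + 1)]
    total = len(windows)
    counts = Counter(windows)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def order_at_level(text, i):
    """Order = 1 - S_i/max(S_i), with max(S_i) = i * log2(N) per equation (7)."""
    N = len(set(text))  # alphabet size
    return 1 - level_entropy(text, i) / (i * log2(N))

genome = "abc" * 10  # a highly ordered, repetitive "sequence"
for i in (1, 2, 3):
    print(i, order_at_level(genome, i))
```

For this string, order is essentially zero at level 1 (all letters are equally frequent, so disorder dominates) and grows with the level, as requirement 2 describes for self–organizing systems.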
The equations (3), (4), (9), and (10) normally serve only to quantify the order in a structure
assumed to have been imposed upon a system. It is difficult to refrain from elevating the
equations to a teleological statement that equates the increased order that they permit with all the
growth and organization we observe around us. Accepting the equations as a description of such
a force provides a basis for explaining nature, in its hierarchical structure, at all levels, from the
emergence and evolution of cosmological structures to the emergence and evolution of life. There
is no logical justification for making such a leap of faith. In fact, in so doing one returns to a kind
of vitalist viewpoint once removed (it could easily be misconstrued as the advocacy of a
mystical organizing or guiding force). We will resist the temptation to equate the two and be satisfied
with inferring a strong relation between the conditions that cause the equations to indicate
increased order and the organization we observe in nature. This is in accord with the position of some
physicists. Paul Davies (Davies, 1989), while supporting the general concept of levels resistant
to reductionism, protests the identification of increased order with organization. He states that
the simple fact that order can be maintained or increased in open dynamic systems does not
explain the organization but only serves as a prerequisite for that organization. He argues that
rules that he terms software laws (because they often deal with information) emerge along with
and complementary to the emergence of levels. Software laws cannot be reduced to the standard
laws of physics and apply only to the systems at the levels of the observed phenomena. He
points to developments in chaos theory, fractal geometry, cellular automata theory, along with
the enhanced organizational tendencies of systems far from equilibrium, as possible generators of
such laws.[18] We are content to concede that such is the case and are satisfied with determining
that the entropic processes of dissipating systems are coincidental with self–organizing processes in a
hierarchy of semi–permeable levels.

We will accept the following description of how the observed increasingly complex organization
of the universe occurs: Entropy is a fundamental fact of nature that serves to describe or govern
(depending on your point of view) the distribution of order and disorder in the universe.
Processing systems are those that accept energy and matter from outside the system, process it
(or degrade it) and expel it. The possible states that such a system can occupy are always
increasing (all of the energy and matter ever processed through the system must be considered as
belonging to the system although only the extant structure is implied by the word system). Order,
structure or organization is observed to emerge and grow in processing systems. In terms of the
phase space in which such systems may be described, the space is growing at a rate that greatly
exceeds the ability of the system to occupy it. When the equations describing the process are
normalized by a factor dependent on the size of the phase space the process can be interpreted as
an increase in order at the expense of disorder. When the system is closed, disorder always
increases at the expense of order, reiterating the second law of thermodynamics.

[18] But we lean toward an explanation that relies on chaos theory. For example, since the objects at a level
can interact chaotically, and since the objects at the next higher level are built from the systems that result
from the interactions at that level, the objects (and thus the rules) that emerge at the higher level are
unpredictable and non-reducible, hence new, hence a creation.

Hierarchically
organized systems seem to be preferred, an observation that might be explained by the fact that
such systems can, in a sense, absorb more order with fewer parts than other organizations.
Hierarchical, self–organizing systems occur either as the result of an inherent, mathematically
revealed necessity (in which organization and order are considered interchangeable or at least,
intimately linked together) or because non–reducible laws, complementary and non–contradictory
to the laws of physics (or the laws at any level for that matter), emerge as a result of the
unpredictability of chaotic (or other) processes that apply to the occupants of a level. These
processes are assumed to be an ongoing and permanent source of novelty or creativity at all levels
throughout the universe. As an example, the biosphere of the earth is a processing system in
which order has increased over time, in which a hierarchical organization seems to be preferred,
and in which the process continues today.


Psychological Considerations

Psychology grew out of philosophical considerations of mind. The approach was, at first, more
introspective than scientific, owing to its origin in an introspectively oriented philosophy, the
ready availability of subjects, and the difficulty in devising experiments that would reveal the
inner workings of the human mind. The theories of Freud and Jung were based largely on case
histories of patients who were subjected to psychological examinations. Slowly, however,
scientific experiments were devised and applied to animals and people. Notable experiments
performed on dogs by I. P. Pavlov showed that animals could be conditioned to respond
automatically and involuntarily to various stimuli. Because of the difficulties in accessing the
unconscious, or lower levels of the mind, there arose the idea of treating the mind as a black box
in which the outputs engendered by given inputs would be studied. B. F. Skinner, in the 1930s,
carried this line of research to great lengths in the behaviorist school of psychology. The theory
was good at explaining some human behavior but failed in many cases. Psychologists returned to
investigating the functions of the lower levels of the mind and made significant advances,

especially through direct neurobiologic experiments on the brain. In recent years psychologists
have turned to the concept of the human brain as a machine upon which the functions of the mind
execute as software. This has occurred more perhaps because of the advent of computers and the
interesting potential for experimentation than because of the availability of compelling theories.
But it is the computer scientist, in particular the computer scientist engaged in artificial
intelligence work who should take to heart the models of cognition that psychologists have
developed over the years. The theories of Jean Piaget are a case in point.

Structure of the Mind

Piaget is best known as a psychologist whose interest is in the early development of children. His
interest goes beyond the study of childhood development to the investigation of the origins and
nature of knowledge. This would normally be the subject matter of philosophers, but Piaget
sees the development of knowledge in a child as an extension of the basic organizing principles of
biology that in turn are an extension of principles of universal order. Piaget determines to bring
the subject under the scrutiny of scientific investigation by using the methods of psychology.
Because of this he is often referred to as a genetic epistemologist (McNally 1973).

The world view that Piaget subscribes to is termed "structuralism" (Piaget 1968). It has three
defining attributes: wholeness, transformation and self regulation. The first attribute, wholeness,
implies that the elements of the structure are in some distinguishable manner subordinated to the
whole structure. The second attribute, transformation, is seen as the hallmark of structure. As
determined by the Bourbaki program, there are three basic frameworks into which all
mathematical systems can be sorted: algebraic (especially groups), networks, and topological
structures. All structures defined in this way display certain attributes among which are closure
and reversibility, i.e. ability to return to a previous state. Mathematical reversibility takes
different forms in each of the basic structures. Algebraic structures show reversibility through
inversion, networks through reciprocity and topological structures through continuity and
separation. Piaget notes that, in the intellectual development of the child these forms may be
observed as the recognition of number, serial correspondences and neighborhoods and boundaries.
In this view the order inherent in mathematics derives directly from the nature of things and is
not superimposed by the mathematician onto the world. The last attribute, self regulation
(psychological), results from the particular transformation rules that define the structure, and is
inevitable because of the nature of a transformational system. Rhythm, regulation and operation
are the hallmarks of self regulation.

The dependence of this theory upon the reversibility of the processes associated with structures
causes Piaget to gloss over entropy and the probabilistic approaches of physics, since they are
intimately associated with irreversible processes. In fact, these methods and theories elevate
chaotic processes to a basic framework status, essentially a network model but contrary to
Piaget's fears, quite reversible in the sense deemed important by him. Nevertheless, Piaget
suggests that a hierarchical organization of form and content is required because of the
incompleteness theorems of Kurt Gödel. And he recognizes that structures
(what we have referred to previously, especially in the section on entropy) must be dynamic and
grow or be static and superfluous to cognition (Piaget 1970). So he arrives by a different route at
many of the same conclusions that result from embracing the ideas put forth in the entropy
section above. Piaget is a constructivist; he believes that structures (we would call them
systems) are not pre–determined but constructed by interaction with the environment. He is
specific about the mechanism by which growth and learning take place, which he terms
adaptation. He specifies the functions of assimilation, by which new information is fitted into an
existing structure (for example a child recognizing a football as a kind of ball), and accommodation, by which new
structures are created (for example a child learning how to eat with a knife and fork). Piaget terms
these "knowledge" structures into which the child fits new information (or that are created from
scratch), "schemas." A student of computer science will recognize a similarity between schemas
and the Frames and Scripts of artificial intelligence although those structures, as advocated, have a
more static nature.

The development of mind (how parents create children)

If the human mind is mostly created by interaction with its environment, exactly what is the
process? Kenneth Kaye gives a creditable account (Kaye 1982), which we summarize in the
following section. Newborn babies are not intelligent, but they are endowed with the prerequisites
for intelligence. By the end of their first year babies do, to a degree, exhibit intelligence. Parents,
in particular the mother, would deny that the baby is not intelligent from the
start. They and other peripheral adults interpret everything that the newborn baby does as an
intelligent act. And in fact, previous theories (summarized in Kaye (1982)) have maintained that
the baby is organized in certain respects from birth; that the baby reacts to or bonds with the
mother and that the mother/ baby pair immediately form an interacting system. However, Kaye
indicates that this is an exaggeration. The baby acts more or less at random as regards the mother,
without indicating awareness of the purported system. The mother persists in interpreting the baby's
movements as intentional acts. She talks to it and encourages it to reply and in fact treats the
baby as though it were making replies. The mother tries to elicit a response from the baby by
making faces and noises, and by poking, prodding and jiggling the baby. When the baby does
begin to exhibit a reaction specifically to the mother (at several months of age) it is by
synchronizing its facial expressions with those of the mother (i.e. smiling) in a receive greeting,
initiate greeting, turn–taking manner. The mother has been acting all along as though the baby
were returning her greetings by interpreting as such anything from a burp to a wet diaper. In so
doing the mother sets up a (one–sided) framework in which the baby has a part to play even
though it is a passive part. When the baby finally does begin to become aware of the world it
finds itself already accepted as part of that framework and the mother/baby pair takes on the
aspects of a system.

The concept of a system entails the recognition of parts of the system that interact together in
ways that determine the action of the system as a whole. Social systems have parts that consist
of biological organisms and are always open systems. But not all relationships between
organisms comprise a social system. In order to be considered a social system the organisms that
comprise the parts of the system must also have a shared development and a shared purpose.

A mother and newborn baby are not yet a social system because all of the purpose is on the part
of the mother. Even so, she treats the baby as though it were cooperating and forms an apparent
social system. As Kaye says (p109)

"So instead of being born with rules, the infant is born with a few consistent patterns of behavior that will look
enough like the rules -necessarily, like just the universal features of interaction rules- of adult life so that adults
will treat him as a person. And, of course, infants are also born with learning mechanisms that enable them to
improve the fit between the way they are expected to behave and the way they do behave."

Some learning mechanisms include learning from the consequences of actions, learning by habitual
association and learning by imitation. Through these and other mechanisms the baby eventually
begins to behave, at 5 to 6 months of age (Kaye p152), as the mother has expected all along. The
mother and infant pair begin to become a social system.

In their social system adult humans follow rules in their interactions with one another. Among
the first rules that the baby learns is turn–taking. The earliest turn–taking is in greetings between
the mother and baby in which the mother fits all her turns into whatever space the baby leaves.
As time goes by the turn–taking becomes more shared. When the social system begins to
develop, the baby's imitative abilities are optimized under the turn-taking system. At about one
year of age the baby can begin to speak and the turn-taking becomes a valuable framework from
within which dialogue can occur and the child can be taught. At this point the child begins to learn
how to use symbols. Until now learning has occurred on a contingent basis, that is, as animals are
trained through incremental steps towards some goal. But with the acquisition of language the
baby can begin to learn in a more organized manner. (Kaye p115)

"The social matrix that provides for the human developmental processes is as important to cognition as it is to
language. Others have described the development of thought as an interiorization of symbols, but they have not
always included in their account the internalization of the social skills through which those symbols are
acquired and through which they work. Thought is not just internalized symbolic meaning, a construction of
propositions using an acquired code. It is an internalized discourse, a matter of anticipating what the response
of another might be to one's behavior and responding to those responses in advance. Thought is, in fact,
verbally or non-verbally, a dialogue with oneself."

So turn-taking (or the development of rules) is important since it sets the pattern by which the
individual will acquire knowledge and even think for the remainder of his life.

The use of symbols by the baby occurs at the same time as the acquisition of language. There are
two criteria that must be met before the use of a signal can be considered a symbol. The first is
that the signal is intended to signify something else and the second is that the signal must be a
convention and different from that which it represents. So words are symbols, while a brake light
or a frightened look, even when conveying a meaning, is not a symbol. The important thing about
symbols is that their meaning can be shared by individuals in a social system. When this occurs it
is termed intersubjectivity.

The pattern of discourse that arises from the injection of language into the turn-taking between
mother and baby is characterized by what Kaye calls turnarounds. Turnarounds are largely
produced by the mother and are in the nature of both a reply to a verbalization on the part of the
child and a demand for a further response from the child. The child, for its part, rarely produces
turnarounds. It responds to the mother's question or comments further on whatever topic is
being discussed. The structure of the dialogue does not require that the child share in the
responsibility for maintaining the discourse and yet allows him to practice language and
demonstrate acquired knowledge in an environment in which immediate feedback is
provided. Through the dialogues intersubjectivity occurs. The child comes to understand the
meanings the mother has for the symbols they are using and comes to be able to anticipate the
effects that his own use of the symbols will have in the social system. As Kaye puts it: (p149)

"The conventional symbols inserted by adults into their systematic exchanges with infants during the first year
are directly responsible for the dawning of human intelligence."

Nothing discussed sheds any light on what the actual mechanism might be that causes the baby to
begin to understand the meanings behind the symbols. But once this occurs, the transformation
from relying on learning by contingencies to learning by the assimilation and the use of symbols
is rapid. It happens in the baby at the same time as the acquisition of language. The acquisition
of a formal language with its capacity to describe the world prepares the way for the baby to
become intelligent. Imitation is Kaye's answer to the question "what is the actual mechanism by
which babies learn to attach meanings to symbols?" (p159)

"At the heart of imitation is assimilation or the classification of new events as equivalent to known objects or
events. We cannot imagine, let alone find in the real world, an organism that does not assimilate. Assimilation
is involved in all categorical knowledge. I assimilate the object in my hand to the category "pen": even without
labeling the category with a symbol, I assimilate the object to sensorimotor schemas that know what to do with
pens. When infants perceive modeled action X as an occasion for performing their own version of X, the
fundamental ability involved is no different from the fundamental ability of any organism to recognize novel
events as instances of known categories, despite their novelties. The process is hardly unique to man but is,
as Piaget (1952) argued, the biological essence of intelligence."

Kaye distinguishes four levels of assimilation undertaken by the baby, all under the general
designation of imitation. The first level is involuntary assimilation as the baby perceives the
world. The second level is "accommodation" that occurs when the baby actually responds in
some manner to that which he is assimilating. The third level is "signification" during which the
baby begins to make signals indicating that he is beginning to recognize his impact on the world.
Kaye argues here that the parents' actions greatly help the infant at this point as they tend to
carry out the apparent intentions of the baby, providing a model for the baby to imitate. First
words occur at this level. The final level Kaye terms "designation." It begins at the point at which
the baby starts to attach meanings to symbols. The acquisition of language rapidly follows.

The emergence of mind (a speculative account)

A system, no matter the type, can be described in terms of a phase space upon which a
probability density function is imposed. For example a physical system can be described by
means of a state space in which the momenta and position of each component of the system are
represented by dimensions in that space. A point in the space is then a state of the system and a
probability density function defined on that space will generate the probability of occurrence for
each state. Entropy is a mathematical definition on such a probability distribution. So entropy
equations may be applied to any sufficiently described system. Of course, the meaning assigned
to the values derived from those equations will vary according to the system to which they are
applied. In the case of physical systems the entropy is a measure of the disorder in the system.
The maximum entropy less the actual entropy is interpreted as the maximum possible disorder
less the actual disorder observed and so is interpreted as the order in the system. In the case of
communication systems the entropy represents the uncertainty as to the message being received.
In some cases the physical system and the information system are the same. In that case the
entropy equations substantially coincide. Note that, in the communication system, the maximum
entropy less the actual entropy is the maximum possible uncertainty about the message less the
actual uncertainty about the message and so may be interpreted as the information received from
the message. Suppose then that we regard the human mind as a physical/information system. It
consists of the brain which is a collection of neurons, connective tissue, chemicals and electrical
potentials occupying various states that are more or less probable according to some probability
density function. The states of the mind, in the physicalist's view of things, are associated with the
things we think. That does not mean that every state represents a thought but that potentially a
state may be associated with a thought much as a word may be associated with a sequence of
letters. In this system maximum entropy represents the same thought or no thought associated
with every brain state, or we might say, maximum ignorance. The amount of uncertainty as to the
message also depends upon the size of the message, that is, we are more uncertain about a garbled
message that is 10 characters long than one that is one character long. This is because there can be
many more messages of length ten than of length one. The amount of ignorance, like the amount
of uncertainty in the message system, depends upon the number of states that the mind might
occupy. So we would consider the maximum potential ignorance of a mind with a billion different
states as being greater than a mind with a million different states. The actual entropy represents
the actual ignorance so that maximum entropy less actual entropy is maximum ignorance less
actual ignorance. In the same manner that we assign order to be the difference between maximum
disorder and actual disorder, and information to be the difference between maximum uncertainty
about a message and the actual uncertainty, we could assign intelligence as the difference between
maximum ignorance and actual ignorance. But we won't succumb to this temptation to define
intelligence in a semi–mentalist manner, because it overlooks the interaction of the growing mind
with the environment and the effect of the material nature of the brain (however, if we wanted to,
we could supplement the argument by noting that, given the growth of total intelligence and the
dual physical/information nature of the brain/mind, the intelligence would manifest itself in
levels). In any case, knowledge might be a more appropriate term than intelligence in this context.
Intelligence is as much a dynamic multi–level process as a state of mind.
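The quantities discussed above can be made concrete with the discrete form of the entropy equations. A minimal sketch (the four-state system and its probabilities are invented for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy, H = -sum(p * log2(p)), of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# An invented system with four possible states.
uniform = [0.25, 0.25, 0.25, 0.25]   # all states equally likely
skewed = [0.7, 0.1, 0.1, 0.1]        # some states far more probable

h_max = entropy(uniform)             # maximum entropy: log2(4) = 2 bits
h_actual = entropy(skewed)           # actual entropy of the observed distribution

# "Order" (for a message, "information"; for a mind, in the sense above,
# knowledge) is maximum entropy less actual entropy.
order = h_max - h_actual
```

Note that a ten-character message has a far larger maximum entropy than a one-character message, which is the point above about uncertainty growing with message size.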

As regards the growth of knowledge in the human mind a couple of possibilities present
themselves: 1) The brain is created fully connected and totally available for acquisition of
knowledge. It represents a potentially large probability space of states of knowledge (a large
potential maximum entropy) but is created with low actual entropy and correspondingly high
order (i.e. it is blank, or has a few states occupied representing the necessary starter set of
human capabilities). Then the brain, as it interacts through the mediation of the senses of the
body to acquire knowledge, increases its entropy at a rate proportional to the rate at which it is
acquiring knowledge. In terms of order, the brain would then actually become less orderly over
time even though its knowledge increases. A knowledge acquisition mechanism might need to be
hypothesized. Or 2) The brain is not fully connected when it begins to learn. The normal growth
of the brain or stimulation of growth of the brain through interactions with the environment by
the mediation of the senses causes a rapid expansion of the capacity to learn (e.g. through the
connection of synapses). This expansion of the maximum potential entropy of the mind at a rate
greater than the increase in actual entropy would require an increase of order. In this hypothesis
an association of knowledge with an increased order in the brain is possible and the order is
mandated. Since we equate the order in the world with the order in our minds, this latter
hypothesis is pleasing. It would also help explain why learning becomes more difficult as one
gets older (there are fewer and fewer new connections that can be made). There is some evidence
that the brain is plastic (or most plastic) for a period early in life (Aoki and Siekevitz, 1988), at
least in cat brains as regards the visual system and in rat brains as regards the tactile senses
associated with the rat's whiskers. There is also evidence that production of neurons occurs in
adult songbirds. The new neurons may be associated with the birds' ability to learn new songs
(Nottebohm, 1989). So far there is no evidence for the production of new neurons in adult
humans. Certainly learning slows down as a person matures. However, the fact that memories
can be forgotten or modified, unless that process is completely wasteful of brain tissue, suggests
that the same portion of brain tissue can be rewritten. The truth, as are most truths, is probably
somewhere between the two extremes of wastefulness and complete flexibility.
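The difference between the two possibilities can be sketched numerically. The state counts and entropy values below are arbitrary illustrations, not measurements of any brain:

```python
import math

def order(n_states, h_actual):
    """Order = maximum possible entropy (log2 of the number of states)
    less the actual entropy, per the entropy discussion above."""
    return math.log2(n_states) - h_actual

# Hypothesis 1: a fixed, fully connected state space; actual entropy
# rises as knowledge is acquired, so order falls over time.
h1_young = order(2**20, 2.0)
h1_old = order(2**20, 12.0)

# Hypothesis 2: the state space itself expands (e.g. through new synaptic
# connections) faster than actual entropy rises, so order grows.
h2_young = order(2**10, 2.0)
h2_old = order(2**20, 6.0)
```

Under the first hypothesis order decreases with learning; under the second it increases, which is why the second associates knowledge with order.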

The hypothesis

Complexity comes into the universe through a process of system building within an existing level.
The interaction of systems in a coherent manner defines the new level hypothesized. Each level
as it is generated gives rise to a probability space. This ties directly to the concept of order as
described by the entropy equations. The entropy at a particular level in the hierarchy is
calculated by summing over the probabilities that derive from defining the universe as all of the
states possible at that level. In this way the effects of other levels upon that level are contained
indirectly in the measure or ignored. The environment of a system can provide both a source of
matter and energy with which the space of states may be expanded and can provide a sink for
waste products (i.e. dissipated and expelled portions of the system). In the hierarchy that the
universe as a whole represents, the original source is the quanta and the final sink is the universe
itself:

1) Levels may be recognized by the degree to which the objects and processes at that level participate in or
determine the objects and processes of another level. The less the participation, the further removed the levels.
Everything, with the possible exception of the universe itself or quantum particles, is embedded in, or encompasses,
many higher and lower levels.

2) The emergence of order is consistent with entropic processes that work on all systems at all levels at all times.
The process is coincidental with organization (preferably hierarchical) that manifests itself in systems and proceeds
so long as conditions merit. As a definition, the portion of a level with which a system interacts is the
environment of that system. A level, then, may consist of many environments. By 1) above, systems mainly affect
and are affected by their environment and level, and, to a monotonically decreasing extent, the levels at increasing
remove from their own.

3) The elements of a system may be either real or informational or of a dual nature (although all information
systems are ultimately grounded in real parts and are unreal only in the sense that they focus on the nature and
effects of communication, that is, relations between systems).

4) The rules that characterize organization emerge along with a level. New levels may be generated whenever
processes of organization proceed to a point that the system meets the criteria for the recognition of a level.

For the arguments made in this paper the italicized portion of 3) above is the most important
proposition. However, the hypothesis as a whole presents striking cosmological variances from
the Newtonian/Einsteinian cosmologies, and it treats phenomena usually overlooked in those
cosmologies:
1) Natural laws need not apply across all levels (although nothing proposed prohibits them from
doing so). There are some indications that could be interpreted as evidence for the
non–persistence of rules/laws across levels:

a) quantum level particle behavior simply does not conform to the rules that we take for granted at the
human level. The concept of attribute, so pervasive and concrete to us, is qualified at the quantum
level. While we can imagine instantaneous communication, the idea that something can exist as
something that can only be described as a probability wave is beyond human understanding. Even
the rules of logic are different at the quantum level. That level is so far removed from human existence
that it marks the boundary of phenomena that can be translated into terms comprehensible by the
human mind.

b) Euclidean geometry and Newtonian physics suffice to describe the behavior at the human level of
existence but are inadequate for the description of objects of immense size, moving at very high
speeds, through immense distances. Even within the realm of applicability of Euclidean space and
Newtonian physics, derivative laws are seen to dissolve across levels. A mistake often made by the
makers of motion pictures, who infer that a 50 foot spider will be 600 times more frightening than a
one inch spider, is to assume that such a creature could exist. The relationship of surface area to
volume precludes such monsters, who would promptly collapse under their own weight19.

c) it is equally impossible for the human mind to frame the concepts required for any explanation of levels
of existence at the opposite end of the scale of being from the quantum level. That is, any discussion
of that which frames the universe, involving as it must concepts such as infinity and eternity, is as
foreign to the human ability to imagine (convert into human level terms) as waves of probability.

d) chaotic processes have been detected at most levels from the molecular to the galactic; however,
there is an apparent absence of chaotic processes at the quantum level (Science, February 17, 1989).

e) application of the rules of physics as we know them predicts that fully 90 to 99 percent of the matter
necessary to explain observed galactic behavior is missing. It is hypothesized that this dark matter
exists but that we lack the tools to see it (or the imagination to determine where it might be). It is
possible, however, that the matter doesn't exist and the rules of galactic interaction are fundamentally
different from what is projected from the laws of physics derived from observations made from the
human level.

f) laws of chemistry apply only at the atomic and molecular levels; the gas laws (i.e. laws of constant
temperature, pressure, and volume) apply only at levels at which contained and isolated gasses exist
(e.g. planetary surfaces etc.). In interstellar space, gas clouds tend to form stars, a fact neither predicted
by chemistry and the ideal gas laws nor explained by current astrophysics.

g) laws addressing human development, civilization, and technology apply only at or near the human
level of existence (e.g. natural selection/mutation, supply and demand, combustion, temperature).

2) The universe is becoming more complex through the evolution of levels (i.e. creation is an
ongoing process). One need only be an observer of human civilization to form this conclusion,
but the ongoing creation of stars, and the creation of the elemental forms of matter in the death of
stars are more dramatic examples.
19 If volume is considered proportionate to weight, a spider with a spherical body of radius 0.5 inch would
weigh ρπ(0.5)³ = 0.39ρ, while a spider with a radius of 25 feet (i.e. 600 times larger in dimension, or 300
inches) would weigh ρπ(300)³ = 84,823,001ρ. If ρ converts to ounces then a spider 600 times larger than a
0.39 ounce version would weigh about 2,651 tons. Even though his legs would be the size of telephone poles
it is questionable that they would support such weight.
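The arithmetic of footnote 19 can be checked directly (assuming, as the footnote does, weight proportional to πr³ times a density ρ that converts to ounces):

```python
import math

def weight(radius_inches, density=1.0):
    """Weight taken, as in the footnote, as proportional to volume:
    density * pi * r**3."""
    return density * math.pi * radius_inches ** 3

small = weight(0.5)      # the half-inch spider: about 0.39 ounces
large = weight(300.0)    # 25 feet = 300 inches, 600x the linear size

# Weight grows with the cube of linear size; leg cross-section (and so
# strength) grows only with the square -- hence the collapse.
assert abs(large / small - 600 ** 3) < 1.0

tons = large / 16 / 2000  # ounces -> pounds -> tons
```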

3) The further removed two levels, the less effect they have on one another. The universal
constant c, relating space and time, forces a correlation of space and time scales and defines
absolute limits on interaction, at least in the levels in the immediate vicinity of the human level of
existence20, e.g.

a) galactic events have no effect on the course of human history; human history has no effect on the
course of galactic events.

b) an automobile accident immediately and permanently alters our existence; a malfunction of an
internal organ is cause for immediate concern and may alter our existence; the activity of a bacterium
in the gut may or may not, in time, be cause for alarm. The destruction or construction of a
particular amino acid molecule in our body is almost certainly no cause for alarm. The exchange of
electrons by two atoms in our body is of no consequence to us, etc.

Thus the general theory of relativity explains the relations that exist among the largest structures
of the universe (the level above which we lose the ability to translate into human terms
corresponding to events at the quantum level). Ecological principles are the mechanisms by
which the biosphere maintains itself and becomes more orderly. Natural selection coupled with
genetic mutation and genetic mixing are mechanisms by which the biological order of populations
is increased. Culture, economics, sociological principles and technology are mechanisms by which
the order in human society is increased. Adaptation, assimilation, accommodation and imitation
are mechanisms by which human mental systems grow, or are formed. Syntax and semantics are
the mechanisms of language. Chemistry describes the interaction of objects at the molecular level
and quantum mechanics describes the rules by which the smallest known particles of the universe
order themselves.

20 There is a natural, infinitely graduated, and absolute partitioning of events in space near the human level of
existence that occurs because of the limitations imposed by the universally constant speed that cannot be
exceeded (approximately 186,282 miles per second, a speed that only light ever reaches; as a consequence
the constant is called the speed of light). This makes it impossible for cause and effect relations to occur
between any two elements in space in time periods less than the time it takes for light to proceed from the one
point to the other. The result is a positive correlation between the spatial and time scales at which events
occur. For example, the smaller the spatial scale at which an integrated circuit is constructed, the fewer time
units it requires to perform a given operation. On a larger spatial scale, the collision of two galaxies requires
very many time units to occur. The correlation is not perfect since most interactions occur at speeds
considerably below that of light. Still the relation is noticeable and important for all levels of being. The
limitation on speed together with chaos and entropy seem to be commonly true laws at levels above the
quantum level and below the universal level.
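The correlation of spatial and time scales described in footnote 20 can be illustrated with minimum signal times; the distances below are rough, illustrative figures:

```python
C = 299_792_458.0  # speed of light in metres per second

def min_signal_time(distance_m):
    """Shortest possible time for a cause at one point to affect another."""
    return distance_m / C

chip = min_signal_time(0.01)             # across a 1 cm integrated circuit
planet = min_signal_time(1.28e7)         # Earth's diameter, ~12,800 km
galaxy = min_signal_time(9.46e15 * 1e5)  # ~100,000 light-years across the Milky Way

years = galaxy / (365.25 * 24 * 3600)    # galactic-scale events: ~100,000 years
```

Events confined to a chip can unfold in fractions of a nanosecond; events spanning a galaxy cannot complete in less than tens of millennia, which is the graduated partitioning the footnote describes.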

In this view, life on earth did not begin because of some chance combination of chemicals, but
because the conditions that lead to order were present on the early earth and the chemical
constituents in that environment were such that they led to an expression of order that we call
biological life. The observable patterns at one scale or level of a hierarchy of systems are
explained best in the context of their environment and the subsystems of which they are
comprised. Explaining one level in terms of the features of constituent systems many levels
removed can not be complete. For example, the grand unified theories of physics are not
complete because they do not include gravity in their quantum scheme. Gravity is a mechanism
for explaining the behavior of the universe on its largest scale and the forces of quantum
mechanics do the same for the smallest scale. Efforts to unify the two have been largely
unsuccessful except when the singularities of the general theory are explored (see Hawking 1988).
At those singularities matter is concentrated to the extent that quantum effects must be taken into
account.

This is not meant to imply that the hierarchy consists of some small number of readily discernible,
easily enumerated, regular, distinct levels. But it does maintain that levels may be determined.
They may be determined through either inclusional methods in which a combination of features
serve to identify a level by identifying the members of that level or by exclusional methods in
which elements at different levels are recognized as such by their inability to communicate with,
or affect one another. The theme that characterizes the features that may be seen to distinguish
levels, is the recognition of scales of space and time and frame of reference. The more completely
a level is distinguished the more sense it makes to treat the states it may assume as a probability
space. This applies even more strongly to environments. The relationship is not reversible
however; distinguishing a probability space does not imply the identification of a level. It is to
probability space that the concept of entropy applies.

From the point of view of a processing system, at its level, in its environment, the processing
system can behave in a way to decrease its entropy (i.e. become more orderly). It does this in a
manner consistent with its internal rules and the possibilities presented in that environment. Of
course, the constituent parts may be such that the system achieves a stable orderly state (e.g.
the earth could freeze over lowering its albedo to the point that it can reflect most of the energy
that arrives from the sun and thus maintain itself for a long period in an orderly but static state).
But obviously, it may also progress to states far from equilibrium in which the production of
order continues for long periods.

Entities that are part of a system may see that system as their environment. A level can have
many environments (e.g. the organs of animals may be considered all at a level but each individual
animal is an environment for the organs that comprise that animal). In a sense the various
environments are different levels since the rules that govern the systems that those environments
represent may prohibit any interaction of their subsystems (e.g. for the most part, organs of one
animal do not interact with the organs of another animal) and this in itself would act as a
partition. But we will consider levels to be defined in light of the prohibitions to communication
between entities at different levels as due to differences in spatial and time scales, though we do
not insist on attributing all partitioning into levels to those two reasons. In an environment the
entities that make up that environment are wholes. But with the possible exception of the quanta,
every entity is itself a system with subsystems. To those subsystems the entity is the
environment. The point of view of the entity is at the interface of its environment and the
subsystems that comprise its material being. Within a level, in an environment, because of the
finite number of entities making up that environment, and relatively well defined possible
relations between those entities, the possible states of that environment are calculable and
conform to the laws of entropy. The levels, so formed, are susceptible to interference from other
levels. Influences from remote levels are always possible (the earth may be struck by a comet,
destroying various organisms, or a cancer at the cellular level may destroy an organism).
Generally, the further apart two levels are, the less the entities in the one can affect the entities in
the other. The calculations of probability or possibility that an entity can bring to bear are largely
limited to those that occur from its habituation to its environment and knowledge of its internal
state. The calculations will always be statistical because of the inability to predict the course of
events of remote levels.

The mind exists as a system in the environment at the level at which humans live. It consists of
two main components, the brain and the body. It is a peculiar system because it is almost
entirely given over to dealing with the environment in which it exists. It is this fact that yields the
illusion, considered real by Descartes, that the mind exists independently of the body. All of a
person's knowledge deals with relationships that apply in his environment. It's almost impossible
to distinguish between the nature of mind and palpable entities at the level of human interaction
in the environment. Since the mind is intangible in that environment the inclination is to assign the
mind an existence similar to the other intangibles of that level, e.g. gasses, clouds, auras, or
objects of the imagination, e.g. ghosts, spirits or souls.

From the hypothesis many observations can be made and conclusions drawn. We make them
because they fit into the general scheme of systematization rather than because they can be
deduced from the acceptance of levels as a fact of nature. That is, we are making a coherentist
argument that avoids either a foundationalist or spiritualist explanation of mind.


Language, meaning, intelligence

The universe exhibits an ongoing hierarchical stratification that manifests itself in the emergence
of levels consisting of complex objects along with rules that govern their interaction. These rules
of interaction (or behavior, or organization) cannot be completely
explained in terms of the rules and constituent parts of the surrounding levels, but they in no way
contradict the rules extant in those levels. The further removed two levels the less effect the
events in the one have on the events in the other and the less the rules of organization in the one
can be explained in terms of the rules of organization in the other. Thus the organization of
galaxies cannot be explained in terms of the interaction of molecules, and the
behavior of biological organisms cannot be explained in terms of their cellular construction and
the affairs (or lives) of a cell. The order that manifests itself at levels can be seen to occur in
conjunction with factors that can cause a decrease in disorder as described by entropy equations.
Since the entropy equations apply equally to physical states and informational states and in
many situations the two kinds of entropy coincide (e.g. where information bearing physical
systems are concerned), it can be inferred that the thoughts in the minds of men cannot be fully
explained in terms of their constituent parts and rules at lower levels (e.g. symbols, sense data,
memory, logic, perception, intention, the information bearing connections of neurons in the brain,
etc.). Further, the emergence of every man as a thinking individual (in possession of a mind) is a
process of self–organization driven by these same entropic principles. The level into which
persons emerge and at which minds think, is not unique or special and does not represent the
end of the creative process of hierarchical stratification. Another higher level can be seen to be
forming in the emergence of human civilization. It is emerging along with the technology that
characterizes it and the rules for communication among its human parts. Language or grammar,
and meaning as well, can be seen as rules in this higher level. Human intelligence too, may be seen
as a particular kind or quality of thought that is meaningful only at this level.

In the same way that hunger is not a feature of the organs of an animal, but of the animal itself,
language is not a feature of a man but of a population of men. It results from the interactions of
individuals in a population and is important to individuals only so much as they are members of
that population. This suggests a method of determining to which level a rule of interaction
belongs; if the rule is of no use to a member, isolated from the rest of the system, then it belongs
to the system itself. It argues against any innate language facility for it would then be expected
that, like other innate characteristics of the individual, such as race or eye color, there should be
a strong predilection, or at least a tendency, for an individual to speak a particular language. But
an individual can learn to speak any language much as he can learn to drive any make or model of
automobile. That he can reason better with the use of language is no more remarkable than the
fact that he can haul more potatoes to market with the use of a truck. Further, the capacity to
communicate does not rely on the ability to hear or speak; the deaf and dumb learn sign language.
Even the deaf, dumb and blind learn signing through touch and braille. Large portions of the
human mind are devoted to handling communication with the other entities that share the
environment but it is not pre–programmed for a language. Still normal, formal, language is the
preferred means of communication between humans. Language is the mechanism by which
procedural knowledge is made declarative and declarative knowledge is made permanent and
available. Its chief effect is to facilitate the maintenance and growth of order in the population to
which it belongs. That civilization (at least the technologically oriented civilization of which we
are a part) is rapidly evolving beyond the comprehension or control of any of its members. This
suggests that civilization is interposing itself between man and the ecological system, becoming
the next higher level for the human organism. An average man would find it difficult to survive
outside the protection of that civilization. The dependence is not as pronounced as that of a cell
of a multicellular organism, the cell would find it impossible to live outside of its organism, but
certainly the dependence is greater today than it was thousands of years ago. So the effects of
language on the individual are secondary in the sense that first the environment within which the
individual develops is changed by the communication afforded by language and then the mind of
the individual that grows into that environment is different from that which it once would have
been.

So it is a modern myth that the technical achievements of mankind were produced because of the
superior intelligence of man. The civilization and the minds that fit into it emerged together.
Consider that the myth is not extended to other social animals such as ants, termites and bees.
While the termite engineering that keeps a termite colony cool, or the ant agriculture that provides
aphid herding and fungi gardening, or the bees' application of eugenics to maximize the success of
the hive may be marvelous inventions, the individual members of those communities are
nevertheless considered mindless automatons. Humans can be very objective where the human
ego is not concerned. Interestingly, humans are almost oblivious to their own composite nature.
Like an ant colony, the human body consists of an organization of cooperating entities. Biologists
are aware of that fact and occasionally remark on it. Lewis Thomas in Lives of a Cell muses:

"Mitochondria are stable and responsible lodgers, and I choose to trust them. But what of the other little
animals, similarly established in my cells, sorting and balancing me, clustering me together? My centrioles,
basal bodies, and probably a good many other more obscure tiny beings at work inside my cells, each with its
own special genome, are as foreign, and as essential, as aphids in anthills. My cells are no longer the pure line
entities I was raised with; they are ecosystems more complex than Jamaica Bay."

If it were true that the glories of the whole vest in its components then we should give full credit
to these little beasties who have so cleverly put us together. That is a tempting thought but one
with uncomfortable ramifications...Lewis Thomas continues:

"I like to think that they work in my interest, that each breath they draw for me, but perhaps it is they who
walk through the local park in the early morning, sensing my senses, listening to my music, thinking my
thoughts."

The interactions and characteristics of entities at the level of human existence should not be
imposed upon other levels of existence and the interactions and characteristics of other levels
should not be reckoned as originating at the level of human perceptions. Individuals in modern
society are in a curious position. Because of technology they have observational access to other
levels. Whenever possible the data from those levels is presented in a comprehensible form;
atoms become billiard balls, weather fronts become jagged blue lines on a map, bacteria are little
bugs, and the world is (or was until it was photographed from space) a multicolored globe on
which geologic and geopolitical distinctions are represented by bright pastel colors. At levels
further removed, the metaphors break down and the description of objects and relations become
purely symbolic; the planets execute trajectories that are the solutions of differential equations,
electrons move and occupy positions according to probability distributions, and the wealth of
nations fluctuates according to the interplay of variables that include interest rates, trade
balances, exchange rates, tariffs, and the relative costs of labor and capital. While the levels being
observed certainly exist, it must be kept in mind that the observations are always interpreted by
minds whose only direct experience is at the level of human existence.

It is a common mistake to assume that the processes observed at the human level of existence
pervade all of the universe. Spencer made that mistake when he projected the processes that
characterize earthly evolution as applying universally. As a further example of this mistake
consider economic systems.

In the eighteenth century, Adam Smith recognized what he called the invisible hand. By this he
meant the self–regulating, self–organizing nature of individual industries that leads to the growth
of wealth and order in an industrial economy. His theory was a rejection of the Mercantilist
theory which concentrated on the accumulation by trade of the representatives of wealth (gold,
silver, etc.). Smith investigated division of labor, and supply and demand, as mechanisms by
which to explain the observed production of goods. Supply and demand work in the market
place as the buyer (demander) and supplier fix prices for their goods and labor at a level that
satisfies both. When the suppliers of one good or service garner profits that exceed the profits of
other goods and services, those making smaller profits will divert resources to the more profitable
endeavors. This raises the supply of the high profit good or service and results in a lower price to
the consumer and less profit to the supplier. Eventually an equilibrium is reached at a point that
results in an allocation of resources to the production of goods according to the demand, and a
reward to labor according to its productivity. This natural system results in an optimal
production of wealth as measured by the positive gain of goods and services (as opposed to
accumulation of gold). The implication for political institutions is that they should exercise
laissez faire. A century later, upon witnessing the vast inequities in the distribution of wealth in
the industrial states (capitalist economies) that followed Adam Smith's dictate, Karl Marx
rejected the theory and maintained that the possessors of goods and land come by their property
through extortion and/or political means and not by productive contribution to society. Marx
borrowed the Hegelian dialectic and applied it to a materialistic concept of the universe. The
world was ever changing and progressing to a higher order. Thesis merged with antithesis in a
synthesis that represented progress. To Marx, labor was the elemental material from which the
wealth of nations derived. In particular Marx held that capitalism was a perversion of that natural
progression, in which the wealthy and powerful actively prevent the workers from taking
their labor to market. Workers cannot and will not get what a capitalist calls a fair market value
for their labor. They become, in effect, slaves who receive only a subsistence share of the wealth
of society. They are kept in that condition by a collusion of the state and the wealthy. The
successful oppression of the workers results in the further accumulation of wealth in the hands of
the few. Eventually, because of its antithetical nature the whole capitalist structure must
collapse, perhaps in a depression as the human and natural resources are exhausted, but most
probably in a revolt of the workers.
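The price-and-profit mechanism Smith described lends itself to a toy numerical sketch. Everything in it (the two goods, the linear demand curves, the adjustment rate) is invented for illustration and is not taken from Smith; the point is only that equilibrium arises from local adjustments, with no central direction.

```python
# Toy sketch of Smith's "invisible hand": producers migrate toward the more
# profitable of two goods, which raises that good's supply, lowers its price,
# and erodes the very profit that attracted them. All numbers are invented.

def price(demand_base, supply):
    """Linear demand curve: price falls as supply rises."""
    return max(demand_base - supply, 0.0)

def simulate(steps=200, producers=100.0):
    in_a = 50.0                        # producers currently making good A
    for _ in range(steps):
        in_b = producers - in_a
        profit_a = price(120.0, in_a)  # revenue per producer of A
        profit_b = price(100.0, in_b)  # revenue per producer of B
        # Resources flow toward the more profitable good.
        in_a += 0.1 * (profit_a - profit_b)
        in_a = min(max(in_a, 0.0), producers)
    return in_a, profit_a, profit_b
```

With the defaults, the producer counts settle at the split where the two profits are equal, the allocation Smith argued no central authority needs to compute.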

As an alternative system Marx proposed communism. Communists believe that the state must
step in and direct the distribution of the goods and the allocation of the resources according to the
maxim from each according to his abilities, to each according to his needs. In a fair economy the
workers will happily produce for the good of all and at a level exceeding what they would
produce for capitalist taskmasters. Eventually the natural equilibrium will be restored, the
synthesis achieved and a state of unlimited, equitably distributed wealth will result. The state
will have served its purpose and can wither away. Activist communists must encourage
revolution in nations that subscribe to capitalist notions in order to hasten the day that the
oppressed workers are freed, utopia achieved, and the state dissolved.

The true state of affairs is that a capitalist system is never imposed by government. To the
contrary, it arises when the state does not meddle in the affairs of producers and consumers. On
the other hand, communism seeks to impose, on existing systems, an artificial set of rules that it
perceives to be appropriate and for the greater good. It tries to achieve economic, political and
social goals by constructing a system from existing parts and new rules. This makes for great
slogans and high ideals but little else. The mistake is believing that by arranging the thesis and
antithesis, the synthesis (at a higher level) can be controlled. Unfortunately the imposition of the
good at one level will not necessarily result in a good system at another level. It is not
necessarily true that the concept of the good applies to economies. We have seen that the rules
and parts of a system emerge inseparably and together. Modifying the one or the other inevitably
modifies the system as a whole, and that change cannot be predicted.

Both capitalism and communism have been put into practice in the twentieth century to the
extent that significant evidence is available to evaluate the hypotheses that support those
systems. It seems evident now²¹ (1989) that capitalism works to allocate resources and goods
at the level of an industrial economy in a natural manner that achieves great wealth as compared
to state controlled systems. This would support the prediction that when that natural inclination
of the parts of an economic system is subverted by the imposition of arbitrary rules, that
economy works with an efficiency limited by the degree to which the imposed rules conflict
with the natural rules. The failure of the communist economies to be as productive as the
capitalist economies is really the result of a failure to see that a system is more than the sum of
its parts and rules, and cannot be fully explained in terms of those constituents. Goals cannot be
artificially imposed on such a system and be expected to be achieved through the simple
manipulation of the rules. This is not to say that a social/economic system with given goals
cannot exist nor that those goals cannot be desirable and noble. Marx was just born too late to
effect a successful transformation to his dream of utopia and too early to see that such a
transformation might not be necessary.

An appropriate view of civilization is as one facet of the continuation of a process, initiated four
billion years ago, toward order on the surface of the earth. That process has no interest in the
success or failure of any person or of man as a species. The process is not a thing and it is not
cognizant (even though it produces cognizant beings), and it did not produce order for some
purpose akin to amusement or so that it could know itself (however engaging that idea might be).
In fact it is highly unlikely that any idea based on analogy with human motives and emotions
(intentions) motivates anything anywhere except on earth or other earth-like environments and at
the human level of existence. Obviously, nothing prevents the process from giving rise to
systems that know themselves. In fact we can infer that all systems at all levels have an attribute
that may be called awareness. Given the common sense meaning of the word, the existence of
awareness in systems is implicit in the definition of a level or an environment as a system of
entities that interact and participate in the existence, maintenance, and disposition of that
environment. The coherent interaction of an entity with the other entities in its environment is
the awareness of that entity of its environment. The more complex the environment, the more
complex the interaction and the more acute the awareness. Awareness by one entity of the fact
that a second entity is also aware and has a similar view of the environment is a primary
requirement for the imputation by the first of intelligence in the second. Because a human, by
using language, can easily achieve this rapport with other humans, but must expend great effort to
communicate with other non–speaking entities, he considers his kind the more intelligent.
Intelligence defined in human terms is a subjective measure with man as the yardstick. But
intelligence can be an objective measure when described as a measure of the proper and
appropriate functioning of an entity in its environment.

²¹ Judging by the success of the economies of states that apply the capitalist dictum laissez faire (let them
be, referring to business).

And here we can see the origin of what are popularly termed intentional systems: those that seek
to explain the activities of the human in terms of the goals, purposes, hopes, beliefs, desires,
fears, hunches or, in general, intentions of that system. They are the result of the reification,
through the facilities of language, of observations made by a human of himself and of other
systems at the level of human activity. The name (hope, fear, purpose, etc.) comes to represent a
thing that, because of the inability of a system at a level to interact across levels, and in most
cases to even be aware of other levels, is assumed an object of universal import. There is little
harm in such deceptions except as they affect the attempts of men to create, in an entirely
different medium than a man, and at different levels from those at which men exist, that which they
have reified as intentions of one sort or another.

Intelligence and machines

The idea that a thinking, cognizant entity can be created, as it were, from whole cloth, is as
chancy a conjecture as the sometimes proffered hypothesis that the earth and universe were
created in totality a few thousand years ago in complete detail; light streaming through light years
of space, from stars that didn't exist just seconds before, down to an earth replete with fossil
records of animals that never existed, strewn through layers of sediments that were never
formed, by seas that never were. Certainly, it is easy to reject such a possibility, both for the
case of the creation of the universe and the creation of a mind at a level²². We are persuaded that
minds emerge as a system of mental objects and rules through the interaction of the human brain
and body with its environment, and that the emergence is a crucial aspect of its existence.
Further, there is nothing magical or special in the creation of a mind, it occurs as a result of the
same ordering process that leads to the creation of the objects and rules that characterize any of
the levels we observe in the universe. The machine equivalent of a mind, to be other than a
carefully constructed sham whose purpose is to win acceptance as intelligent, must emerge in a
similar manner. This point is reiterated in the conclusions below. A universe consisting of
hierarchically stratified semi–permeable levels has other implications for a machine intelligence:
there are problems of scale in space and time.

An acceptable (to humans) machine intelligence must, as a prerequisite, exhibit two features: it
must perceive spatial features on a scale close to the scale at which a man perceives spatial
features, and it must have thought processes that recognize sequential events on a time scale
approximating the time scale at which men perceive the passage of time. The Gaia hypothesis
(Lovelock), that the biosphere is a self–organizing, processing system, is unacceptable to most
people even if no intelligence is attributed to that organization. But it is obvious that, even if the
intelligence of the Mother Nature of myth (conscious, purposeful action) could be attributed to
the biosphere, the space/time scaling of such a sentient being would be orders of magnitude larger
than that of an individual human. Only the most liberal of men would accept as intelligent beings
that operate on time and spatial scales that are to them only an abstraction. Even the God that
rules man’s universe (or the Gods depending upon the religion) is perceived by believers to be an
anthropomorphic being operating at the human level of being (or a rung or two above that level).
The failure to recognize the scale of things as an important aspect of consciousness has
sometimes given rise to arguments against the possibility of machine intelligence. One such

²² We are using the word mind in its normal sense as the complex composition of mental objects and
thought processes that include reasoning, self–awareness, other–awareness, personality, emotional response,
etc. The word, by default, is associated with humans. Other animals (and even some humans), while
possessing many of the attributes of mind, may be excluded from the ranks of those who possess minds by
careful definition. The words ‘an intelligence’ are often used when the possibility of a non–human mind is
discussed. The principal qualification for the possession of a mind is the acceptance of that mind by other
minds. Nothing in this precludes animals or machines from possessing the equivalent of a mind, or an
intelligence.

argument (made by the philosopher John Searle) involves a model of a computing machine
running a program. The model essentially (we will produce a semblance of the argument that we
hope captures the essential points) consists of a man in a room with an instruction book, indexed
in Chinese, but with entries written in English, the man's native language. Also in the room are
large filing cabinets, indexed in English and full of data written in Chinese together with other
English instructions. Occasionally a secretary enters the room with a message written in Chinese.
The man consults the Chinese index in his instruction book and retrieves the English instructions.
They tell him how to access the file cabinets. From the file cabinets he gets Chinese characters
and/or further English instructions that he follows. Ultimately the instructions cause him to
construct a message in Chinese. The secretary comes back and retrieves the message and delivers
it back to the outside world. The man represents the computer's CPU, the filing cabinets full of
data and instructions represent the computer's memory, loaded with data and programs. The
secretary represents input and output channels. The claim is that this system can do anything
that a computer can; in particular, it can run an AI program. Inasmuch as no one would consider
the output from this system intelligent, no one should consider the machine running an AI
program intelligent.
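The ‘office’ can be reduced to an even starker caricature: a fixed table mapping input messages to output messages. The entries below are placeholders of our own invention, not anything from Searle's argument; the point the sketch makes is only that the table is complete and frozen before the first message ever arrives.

```python
# A caricature of Searle's room: the "instruction book" and "filing cabinets"
# collapse into one fixed table mapping Chinese input to Chinese output.
# The entries are invented placeholders; a real rule book would be vast.

RULE_BOOK = {
    "你好":    "你好！",          # a greeting answered with a greeting
    "你是谁？": "我是一个房间。",   # "who are you?" -> "I am a room."
}

def chinese_room(message):
    """Follow the instructions mechanically; no entry, no answer."""
    return RULE_BOOK.get(message, "？")
```

The system answers without the man understanding a word of Chinese, and, crucially, RULE_BOOK never changes, no matter what comes in.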

The argument is wrong because it involves two translations of scale and ignores some of the other
requirements of an AI system. By focusing on the CPU level (or the inner workings of the
"office"), the scale on which the overall system works is overlooked. If one described the
functioning of the human brain as a system of electrical discharges predicated upon the chemical
disposition of various cells enclosed in a bony carapace...initiated by and resulting in discharges
that enter and leave through one access channel into the carapace, then it would be difficult to
view the brain as possessing intelligence. The second change of scale is in time. The system could
not possibly react on a time scale that humans perceive as the scale at which intelligent
interactions occur. Further, and more importantly, the system is fixed; it has no chance to grow.
The man can't hire more secretaries, buy more filing systems, rearrange the ones he has in order to
improve access, develop techniques to make the office run more efficiently, such as associating
recurring inputs with outputs so that the busy work of accessing the files can be avoided, and
rewrite some of the instructions or even some of the data, when it becomes apparent that to do
so would improve the operation of the system. Finally, no entity can be considered intelligent in
a void. An environment in which to be intelligent is essential. Because of the scale differences, an
office system such as the one hypothesized above cannot interact with humans in a human
environment. But let us hypothesize similar systems with which the office system can
communicate and a universe of other objects that operate at the appropriate time and space scales
and that can in some way be sensed by the office systems, and an ability of the system to
reorganize itself in response to those inputs. In such an environment and to the extent that their
ability to communicate permits, the office systems would consider each other as intelligent
entities. Humans would not consider the office system intelligent, but they might well consider a
computer, operating at their scale, in their environment, with an ability to grow, learn and change
as an intelligent being.
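One of the growth mechanisms named above, associating recurring inputs with outputs so that the busywork of accessing the files can be avoided, can be sketched directly. The class and its cost counter are our own illustration, under the assumption that the slow file-cabinet procedure is supplied from outside.

```python
# Sketch of the growth the fixed office lacks: a system that notices recurring
# inputs and learns to associate them directly with outputs, bypassing the
# slow "filing cabinet" lookup. The slow_lookup callable and the visit counter
# are invented for the illustration.

class LearningOffice:
    def __init__(self, slow_lookup):
        self.slow_lookup = slow_lookup   # the laborious file-cabinet procedure
        self.shortcuts = {}              # learned input -> output associations
        self.cabinet_visits = 0          # busywork actually performed

    def respond(self, message):
        if message in self.shortcuts:        # recurring input: answer at once
            return self.shortcuts[message]
        self.cabinet_visits += 1             # otherwise do the slow work...
        answer = self.slow_lookup(message)
        self.shortcuts[message] = answer     # ...and rewrite the office rules
        return answer
```

A second arrival of the same message never touches the cabinets: the office has, in a small way, reorganized itself in response to its inputs.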

Implications for the implementation of machine intelligence

It must be emphasized that the problem is not in the level at which the implementation takes
place, or in the similarity of the model to the system being emulated. Rather the problem is in
the implementation philosophy. If, as is the usual case, an attempt is made to create a
functioning, fully capable intelligence, directly from human level rules of behavior, or even from
constituent parts, whether they be mind function equivalents, or neuromorphic mechanisms, the
resulting intelligence is bound to be deficient. The natural process of producing new human minds
does not attempt this fait accompli. That is, human minds aren't born full blown, they are grown
over a long period of interaction in an environment; they emerge. Only the most rudimentary,
necessary knowledge about the operation of the body is transferred directly from the molecular
level to the mind level. Any successful implementation of a human–like mind in a machine will
have to provide a program that can emulate this process of a human mind growing and learning in
an environment. This is not just because it is impossibly difficult to produce a program that
represents a full grown intelligence capable of dealing expertly with an environment that it has
never seen, but because human intelligence is truly an emergent property of the human brain and
body at the human level of existence as it grows into its environment.

The problem is further complicated by the fact that the system that is called the mind has a
body component. Rene Descartes tried to isolate those components as separate entities, but
they are not. The mind and body grow and emerge together as a system. There is a great deal of
confusion about this because thoughts (the product of the mind) and acts (the product of the
body) are different phenomena. And the mind is considered intangible while the body is very
tangible. But it is the brain and the body that are better described as separate systems. They are
subsystems of the mind, existing at a level below the mind. A brain without a body or a body
without a brain is useless, even in perfect health. When joined together and allowed to interact
with an environment they become more than the sum of brain and body. They become a
cognizant entity. If allowed to interact in human society they become a unique person. Person
and cognizant entity are kinds of minds.

The significance of all of this for the purpose of this paper is that all examples of systems that
are unconditionally considered intelligent (humans) belong to the class of processing,
self–organizing systems described above. Figure 2 part A provides a graphic representation of
such a system (in this case a person) showing the interaction between the person and the
environment. Computer programs, including AI programs, are processing systems only in a
limited sense and, at present, not self–organizing. The limited character of their material nature
and its interaction with an environment is portrayed in part B of the same figure. It is a corollary
of the hypothesis that the activities exhibited by systems that are interpreted as evidence of
intelligence can be attributed to that self–organizing, energy–processing and order–producing
nature, the specific form of which depends on the environment. AI systems must be constructed
in a manner that more closely approximates those systems. To be more specific, AI systems
should be constructed that can learn, grow, change, and interface with the surrounding
environment in manners analogous to their human counterparts.

These conclusions paint a gloomy picture for the possibility of creating intelligent machines. The
idea that all of the information necessary to deal with a real world can be programmed into a
machine is precluded. This problem has been anticipated by Terry Winograd and Fernando Flores
(Winograd, 1986), who believe that intelligence is a manifestation of the dynamic nature of the
structures of the mind and their reflection of the ever–changing environment into which an
organism is thrust. They elaborate on the hypothesis, asserting that intelligence is not only
affected by, but is actually a result of, our history of involvement and continuous participation in
the world, in particular in human society. In setting forth these ideas they draw upon the works
of the philosopher Martin Heidegger and the biologist Humberto Maturana.

"Heidegger argues that our being-in-the-world is not a detached reflection on the external world as present-at-hand,
but exists in the readiness-to-hand of the world as it is unconcealed in our actions. Maturana through his
examination of biological systems, arrives in a different way at a remarkably similar understanding. He states that
our ability to function as observers is generated from our functioning as structure-determined systems, shaped by
structural coupling. Every organism is engaged in a pattern of activity that is triggered by changes in its medium,
and that has the potential to change the structure of the organism (and hence to change its future activity). Both
authors recognize and analyze the phenomena that have generated our naive view of the connection between thinking
and acting, and both argue that we must go beyond this view if we want to understand the nature of cognition -
cognition viewed not as an activity in some mental realm, but as a pattern of behavior that is relevant to the
functioning of the person or organism in the world."

[Figure 2: Interaction between a person and adjacent levels vs. a computer and adjacent levels.
Each column shows focal level activities flanked by an immediate higher level (regulatory
constraints and initiating conditions) and an immediate lower level (material constraints and
initiating conditions), with lines of influence running between levels. For the person, the higher
level is the environment and the lower level is the person (material nature); for the computer, the
higher level is the laboratory (feedback only), the focal level is the activity of the program, and
the lower level is the computer (program and architecture).]

Kenneth Kaye, in analyzing the growth of human babies into intelligent persons (Kaye 1982),
concludes that the social system into which the baby is thrust is responsible for the growth of its
intelligence. Kaye's exposition bears on the growth of human intelligence and is consequently of
relevance to any researcher who aspires to see that kind of intelligence replicated in a machine.

If it is the case that human intelligence is a dynamic and changing condition dependent upon
interaction with the environment and the human social structure for its existence, then
researchers under the traditional paradigms in artificial intelligence are not likely to succeed. They
have been working under the assumption that they can build into a program representations of
knowledge that, when activated, will exhibit intelligence. They are likely to be frustrated in their
efforts for two reasons:

1) If intelligence is created through the interactions of the individual within human society, then machines
will not be intelligent. Machines cannot interact with humans as humans and, for the present, interact with
their environment hardly at all;

2) When researchers build programs to be intelligent they fix the means by which the machine can represent
the world. This cannot lead to cognition, for it is precisely the dynamic nature of interaction with the
constantly changing requirements of the (largely social) environment, and the ability of the environment to
change the organism, which gives rise to intelligence.

Winograd and Flores conclude that the situation precludes the development of an artificial
intelligence. Their conclusion is inescapable if what is meant by artificial intelligence is the
duplication of human intelligence; the first of the two reasons given above is not likely to be
overcome to the extent that a machine intelligence is ever accepted as a human intelligence. There
is, however, nothing inherent in the nature of a machine that precludes growth and learning, and
consequently, intelligence in the broader sense of an undeniably intelligent machine. Further, since
interaction in the environment is an essential part of intelligence a direction for research is
indicated. The interaction of a dynamic learning program with other intelligent systems (in
particular with humans) in an appropriate environment becomes a candidate system for
development. If an intelligence with which humans can identify is desired then the kind of
environment in which the proposed system should grow and learn should include features whose
meanings can be shared by man and machine and should be populated by men and machines.

Consider the development of a machine with sublevels sufficiently like those of humans that it
can learn and grow in the human environment. Further suppose the machine has physical and
mental capacities significantly similar to those of a human. Place it into the human environment
(or a human–like environment) where it can interact with humans and teach it. In light of the
hypothesis on the nature of the emergence of levels, we can expect a mind to emerge according to
the same principles that lead to the emergence of a human mind. It will not be a human mind, it
will be a machine mind, but that does not imply that it will be an inferior mind, even measured by
human standards. But far short of this elaborate experiment more realizable efforts can be
envisaged. The tools are at hand in symbolic programming languages, neural network algorithms,
genetic algorithms, and computer simulation techniques to create a simulation of the above idea.
Create an artificial environment in the computer with appropriate interfaces to both humans and
the machine systems to be ‘taught’. Perhaps some interface to an elaborate database of objects
with specific attributes and characteristics familiar to a human (say items encountered in
everyday life: rooms, walls, chairs, doors, boxes, etc.), and identifiable by the machine. The
system might feature a graphical interface with the human ‘teacher’ and more direct but
conceptually similar access by the ‘machine intelligence’. Within the simulated environment the
human teacher and the machine can interact to create a machine of limited intelligence (see
Glasgow 1989).
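One minimal reading of that proposal can be sketched in code. The object set, the attributes, and the trivially simple learner are all our own assumptions for illustration; a real system would be far richer and, as argued above, would have to reorganize itself rather than merely record.

```python
# Minimal sketch of the proposed teaching environment: a shared world of
# everyday objects, a human "teacher" who names their attributes, and a
# machine learner that accumulates the associations. Objects, attributes,
# and learner are invented placeholders.

WORLD = {   # objects with attributes both teacher and machine can identify
    "door":  {"movable": True,  "enterable": True},
    "wall":  {"movable": False, "enterable": False},
    "box":   {"movable": True,  "enterable": False},
    "chair": {"movable": True,  "enterable": False},
}

class Learner:
    def __init__(self):
        self.knowledge = {}     # object name -> attributes taught so far

    def teach(self, name, attributes):
        """The teacher points at an object and describes it."""
        self.knowledge.setdefault(name, {}).update(attributes)

    def query(self, name, attribute):
        """Ask the machine what it has learned; None means 'not yet taught'."""
        return self.knowledge.get(name, {}).get(attribute)

machine = Learner()
for name, attributes in WORLD.items():   # one teaching session over the world
    machine.teach(name, attributes)
```

After the session the machine can answer about what it was shown and is silent about what it was not, the barest beginning of the shared meanings the proposal calls for.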


To be explicit about what we hope to have accomplished in the above, we offer the following
summary of conclusions. The principle involved in the creation of an intelligence is the same
principle that explains the organization that manifests itself throughout our world and seems to
maintain throughout the universe. We are brought to this conclusion by the observation that the
universe is a statistically described place in which the entropic processes of physical systems
that bear information (e.g. RNA, DNA, neural networks, etc.) describe the same thing under
interpretation as information or physical structures. From this we feel justified in applying the
observations concerning hierarchical stratification and rule creation so obvious in physical
systems, to information systems. We recognize the mind as such a dual system. Then, that which
we recognize as mind and invest with the quality of intelligence emerges into the human
environment as a system of hierarchically structured mental objects and associated rules. As
noted, the rules/laws that emerge along with a level are not reducible to the functions and
interactions of constituent parts at lower levels. Obviously then, any implementation of
human–like intelligence in a machine will have to include lower levels, and the mechanism of that
implementation will have to be emergence rather than construction. Functionalists will argue that
the levels at which they perceive the workings of the human mind to occur are closer to that at
which the human entity as a whole operates and are therefore the most natural levels at which to
attempt an implementation (i.e. implementation via symbolic processes using the facilities of
programming languages). Those enamored of neural networks will argue that their model more
perfectly imitates the structure of lower levels of the human mind. And though it might be at a
level further removed than the functionalists' level, the similarities to actual human brain structure
make neural nets the preferred implementation mechanism. Whatever the implementation tools,
the philosophy must be one of creating a seed that can grow and be molded into the desired
intelligent being; not one of anticipating unpredictable future requirements and programming for
them. Whatever structures are built must be self-organizing and self-modifying through
interaction with an environment. Many such systems can be conceived, both symbolic (Glasgow
1989) and neuromorphic.


Akmajian Adrian and Demers Richard A. and Harnish Robert M. (1984). Linguistics.
Cambridge Massachusetts: MIT Press.

Aoki, Chiye and Siekevitz, Philip (1988). Plasticity in Brain Development. Scientific
American, Dec.
Bennett Charles H. (1987). Demons, Engines and the Second Law. Scientific American,

Bertalanffy, Ludwig Von (1968). General System Theory: Foundations, Development,
Applications. New York: Braziller.
Block, H. D. (1962). The perceptron: a model for brain functioning. Reviews of modern
physics 34: 123-135.
Brooks Daniel R. and Cumming David D. and LeBlanc Paul H. (1988). Dollo's Law and the
Second Law of Thermodynamics. In Bruce H. Weber and David J. Depew and Jonas D.
Smith, ed. Entropy, Information and Evolution. Cambridge Massachusetts: MIT Press.

Copleston, Frederic S. J. (1983). A History Of Philosophy. Garden City, New York:
Image Books, a Division of Doubleday & Company, Inc.

Crutchfield, James P. and Farmer, J. Doyne and Packard, Norman H. and Shaw, Robert S.
(1986). Chaos. Scientific American, Dec.

Cutland, N. J. (1980). Computability: An Introduction to Recursive Function Theory.
Cambridge, England: Cambridge University Press.

Davies, Paul (1989). The cosmic blueprint. New York: Touchstone.

Davis Martin (1958). Computability and Unsolvability. New York: Dover Publications Inc..

Dawkins Richard (1976). The Selfish Gene. New York: Oxford University Press.

Dennett, Daniel C. (1978). Brainstorms. Cambridge Massachusetts: MIT Press.

Devaney, Robert L. (1989). An Introduction to Chaotic Dynamical Systems (second
edition). New York: Addison-Wesley.
Dreyfus, Hubert and Dreyfus, Stuart (1986). Why Expert Systems Do Not Exhibit
Expertise. IEEE Expert 1(2): 86, Summer.

Eiseley, Loren (1958). Darwin's Century. New York: Anchor Books, Doubleday & Company.

Elsasser, W. M. (1970). Individuality in Biological Theory. In C. H. Waddington, ed.
Towards a Theoretical Biology 3: 153. Edinburgh University Press.
Feynman, Richard, P. (1985). QED, The Strange Theory of Light and Matter. Princeton
New Jersey: Princeton University Press.

Fodor, Jerry (1986). Meaning and Cognitive Structure. In Zenon W. Pylyshyn and William
Demopoulos, ed. Norwood New Jersey: Ablex publishing Corp.

Glasgow, John C. II (1989). Emerging Systems and Machine Intelligence. Tallahassee,
Florida: Dissertation, Florida State University.

Gould, Stephen J. (1981). The Mismeasure of Man. New York: W. W. Norton and Company.

Gould, Stephen J. (1987). Time's Arrow, Time's Cycle. Cambridge Massachusetts:
Harvard University Press.

Grossberg, Stephen (1980). How does a brain build a cognitive code? Psychological
Review 87: 1-51.
Guth Alan H. and Steinhardt Paul J. (1984). The Inflationary Universe. Scientific American,

Haugeland, John (1985). Artificial Intelligence: The Very Idea. Cambridge Massachusetts:
Bradford Books, a division of MIT Press.

Hawking Stephen W. (1988). A Brief History of Time. New York: Bantam Books.

Heisenberg, Werner (1949). The Physical Principles of the Quantum Theory. Toronto,
Canada: Dover Publications.

Herbert, Nick (1989). Quantum Reality: Beyond the New Physics. New York: Anchor
Books, Dell Publishing Group.

Hillis W. Daniel (1988). Intelligence as an Emergent Behavior; or, The Songs of Eden. In
Graubard Stephen R., ed. The Artificial Intelligence Debate. Cambridge, Massachusetts:
MIT Press.

Hofstadter, Douglas R. (1979). Goedel, Escher, Bach: An Eternal Golden Braid. New York:
Vintage Books, a Division of Random House.

Hopfield John. J. (1982). Neural networks and physical systems with emergent collective
computational abilities. Proceedings of the National Academy of Sciences 79: 2554-2558.

Jantsch, Erich (1980). The Self-Organizing Universe. New York: Pergamon.

Kaye, Kenneth (1982). The Mental and Social Life of Babies: How Parents Create Persons. Chicago: The University of Chicago Press.

Kayser, Daniel (1984). A Computer Scientist's View of Meaning. In S. B. Torrance, ed. The Mind and the Machine: 168-176. New York: Ellis Horwood Limited, distributed by John Wiley and Sons Limited.

Khinchin, A. I. (1957). Mathematical Foundations of Information Theory. New York: Dover Publications Inc.

Kline, Morris (1980). Mathematics: The Loss of Certainty. New York: Oxford University Press.

Kuhn, Thomas S. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Laszlo, Ervin (1972). Introduction to Systems Philosophy: Toward a New Paradigm of Contemporary Thought. New York: Gordon and Breach.

Layzer, David (1988). Growth of Order in the Universe. In Bruce H. Weber, David J. Depew, and James D. Smith, ed. Entropy, Information and Evolution. Cambridge, Massachusetts: MIT Press.

Layzer, David (1990). Cosmogenesis. New York: Oxford University Press.

Lovelock, J. E. (1979). Gaia. New York: Oxford University Press.

Mandelbrot, Benoit B. (1977). Fractals: Form, Chance, and Dimension. San Francisco: W. H. Freeman and Company.

Maturana, Humberto R. (1980). Biology of Cognition (1970). Reprinted in H. R. Maturana and F. J. Varela, Autopoiesis and Cognition: The Realization of the Living: 2-62. Dordrecht: Reidel.

McCorduck, Pamela (1979). Machines Who Think. San Francisco: W. H. Freeman and Company.

McNally, D. W. (1973). Piaget, Education and Teaching. Sussex, England: Harvester Press Limited.

Minsky, Marvin L. (1985). The Society of Mind. New York: Simon and Schuster.

Nagel, Ernest and Newman, James R. (1958). Goedel's Proof. New York: New York University Press.

Nottebohm, Fernando (1989). From Birdsong to Neurogenesis. Scientific American, Feb.

Olmstead, John III (1988). Observations on Evolution. In Bruce H. Weber, David J. Depew, and James D. Smith, ed. Entropy, Information and Evolution. Cambridge, Massachusetts: MIT Press.

Pagel, Mark D. and Harvey, Paul H. (1989). Taxonomic Differences in the Scaling of Brain on Body Weight Among Mammals. Science 244(4912).

Parker, D. B. (1982). Learning Logic. Invention Report S81-64, File 1. Office of Technology Licensing, Stanford University.

Piaget, Jean (1970). Structuralism. New York: Harper and Row.

Piaget, Jean (1975). The Equilibration of Cognitive Structures. Chicago: University of Chicago Press.

Piaget, Jean (1952). The Origins of Intelligence in Children. New York: International
Universities Press.

Prigogine, Ilya (1980). From Being to Becoming. San Francisco: W. H. Freeman and Company.

Prigogine, Ilya and Stengers, Isabelle (1984). Order out of Chaos. New York: Bantam Books.

Putnam, Hilary (1988). Much ado about not very much. In Stephen R. Graubard, ed. The Artificial Intelligence Debate. Cambridge, Massachusetts: Bradford Books, MIT Press.

Putnam, Hilary (1988). Representation and Reality. Cambridge, Massachusetts: Bradford Books, MIT Press.

Pylyshyn, Zenon W. (1985). Computation and Cognition. Cambridge, Massachusetts: Bradford Books, a division of MIT Press.

Rescher, Nicholas (1979). Cognitive Systematization. Totowa, New Jersey: Rowman and Littlefield.

Russell, Bertrand (1945). A History of Western Philosophy. New York: Simon and Schuster.

Salthe, Stanley N. (1985). Evolving Hierarchical Systems. New York: Columbia University Press.

Thomas, Lewis (1974). The Lives of a Cell. New York: Bantam Books.

Tolman, Richard (1979). The Principles of Statistical Mechanics. New York: Dover Publications.

Turing, Alan (1950). Computing Machinery and Intelligence. Mind 59: 433-460.

Wicken, Jeffrey S. (1988). Thermodynamics, Evolution and Emergence: Ingredients for a New Synthesis. In Bruce H. Weber, David J. Depew, and James D. Smith, ed. Entropy, Information and Evolution. Cambridge, Massachusetts: MIT Press.

Wiley, E. O. (1988). Entropy and Evolution. In Bruce H. Weber, David J. Depew, and James D. Smith, ed. Entropy, Information and Evolution. Cambridge, Massachusetts: MIT Press.

Williams, L. Pearce (1989). André-Marie Ampère. Scientific American, Jan.

Winograd, Terry and Flores, Fernando (1986). Understanding Computers and Cognition. Norwood, New Jersey: Ablex Publishing Corp.