
Computer Models of The Evolution of Language and Languages

DANIEL JACK LIVINGSTONE

A thesis submitted in partial fulfilment of the requirements of the University of Paisley for the degree of Doctor of Philosophy

September 2003

Abstract
The emergence and evolution of human language have been the focus of increasing amounts of research activity in recent years. This growing interest has coincided with the increased use of computer simulation, particularly using one or more of the methods and techniques of ‘Artificial Life’, to investigate a wide range of evolutionary problems and questions. There is now a significant body of work that uses such computer simulations to investigate the evolution of language. In this thesis a broad review of work on the evolution of language is presented, showing that language evolution occurs as two distinct evolutionary processes. The ability to use language is clearly the result of biological evolution. But the changes that occur over time to all spoken languages can also be viewed as being part of a process of cultural evolution. In this thesis, work using artificial life models to investigate each of these processes is reviewed. A review of the methods and techniques used in artificial life is also presented early in the work. A novel model is developed which is used to explore the conditions necessary for the evolution of language. Interesting results from initial tests of the model highlight the role of redundancy in language. From these initial tests, the model is further developed to explore the biological evolution of the human capacity for language. One significant outcome of this work is to highlight the limitations of the model for developing, and especially for ‘proving’, particular theories on how or why Homo sapiens alone evolved language. This is tied to a brief review showing that this weakness is not specific to this particular model, but may be shared by all artificial life models that try to explain the origins of language. With further minor modifications to the model, the focus is shifted to the evolution of languages and language diversity.
In comparison with some of the earlier conclusions, this work emphasises the positive contribution to ongoing scientific debate that is possible using computer simulations. In this case, experiments using the model focus on whether social and/or linguistic benefits are required in explanations of language change. Work that contradicts our findings is then reviewed and debated. Further corroboration of our conclusions is then gained by conducting a similar experiment using a different computer model. The key contributions of this interdisciplinary work are: first, in detailing some of the unique problems and issues inherent in using computer models specifically for modelling the evolution of language; second, in emphasising the importance of redundancy in language evolution; and finally, in adding to the current debate on whether the evolution of languages can be viewed as a form of adaptively neutral evolution.


Dedication
For dad.

Acknowledgements
That this thesis is now finally complete is thanks, in great part, to the help, comments, advice, and support provided by many people. I should like to take this opportunity to thank some of them. First, I must acknowledge the contribution of Professor Colin Fyfe. Without his many words of encouragement, regular cajoling and occasional inquisitions it is doubtful that I would ever have completed. Thank you Colin for your great patience in seeing this through. I should also like to thank the School of Computing, and its predecessor departments for paying my fees and providing financial support. Being the sole researcher at an institution working on a particular topic can be lonely at times, so I would like to thank my departmental colleagues for making it considerably less so – particularly Darryl Charles, Stephen McGlinchy, Donald McDonald, Bobby Geary and Douglas Wang. When I started out on my research I had little idea what I was getting myself into. A number of people I met at conferences and visits to other institutions helped me find out. To list them all would not be possible, but here I would like to offer thanks to Jason Noble, a friendly soul, through whom I met, and had a number of interesting chats with Inman Harvey, Ezequiel di Paolo and Seth Bullock. Simon Kirby is due thanks for numerous pointers, listening to my moaning, and those proceedings that I have yet to return. Mike Oliphant is thanked for giving me a second chance at explanation. Thanks are due too to those other friendly souls, Bart De Boer, Paul Vogt and Tim Taylor, and the members of the Edinburgh LEC group. David Hales, Jeremy Goslin and Chris Douce are all due an apology – I really should have returned more emails. Sorry. Dave also pointed me to some interesting papers, and told me lots about Artificial Life long before I ever thought I’d end up working with it myself. 
The final shape of this thesis is also thanks to the hard work of many anonymous reviewers whom I would also like to thank. The very un-anonymous Angelo Cangelosi and Jim Hurford gave this thesis its very final review and examination – and I should like to thank them both for their careful reading and positive comments.


Away from the world of academia, thanks are due to those great imbibers, with whom many an enlightening conversation was had: Arthur McGready and Alasdair Duncan. Thanks also to the Players of Games and the Writers of Books: Gary Devine, Pete McMinn, Dougie Gaylor, Tom McGlaughlin, Phil Raines and Gary Gibson. And finally, I am eternally indebted to Bronwen and Morgan for their love and support. You make it all worthwhile. Thank you.

Contents

Chapter 1 Introduction
1.1 Outline
1.2 Artificial Life and Artificial Societies
1.3 A Note on Previous Publications

Chapter 2 The Evolution of Communication and Language
2.1 Introduction
2.2 Language and Communication
2.2.1 Features of Language
2.3 The Evolution of Communication
2.4 The Biological Evolution of Language
2.4.1 Evidence for the Innateness of Language
2.4.2 Evolution of Linguistic Ability
2.4.3 Questioning the Innate Linguistic Ability
2.4.4 Why and How? Just So Stories and Grand Theories
2.5 The Cultural Evolution of Languages
2.5.1 Language Change
2.5.2 Explanations of Language Change
2.5.3 Language Change and Evolutionary Theory
2.5.4 Cultural Evolution
2.5.5 Language Ecology and Species
2.5.6 Neutral Evolution
2.5.7 Co-evolution of Languages and Man
2.6 Summary

Chapter 3 Artificial Life: Computational Modelling and a Methodological Approach
3.1 The Artificial Life Approach
3.1.1 Agents
3.1.2 Emergence
3.1.3 Life-Like
3.2 Methodologies for Applying Artificial Life
3.2.1 Science without method?
3.3 Improved Methodologies and Practices
3.3.1 A Methodology Emerges
3.3.2 Not Just Biology
3.3.3 ALife for Novel Models and Theories
3.4 ALife in Practice
3.4.1 ALife Research vs ‘Real’ Research
3.4.2 Principles of ALife Model Building
3.4.3 Adapting Methods
3.4.4 Implementing an ALife model
3.5 Experimentation
3.5.1 Validation
3.5.2 Verification
3.5.3 The (Un-)Importance of Surprise
3.6 Explanation
3.6.1 Model and Theory
3.6.2 Validation of Results
3.7 Limitations Of Artificial Life
3.7.1 ALife vs. Mathematical Models
3.7.2 ALife Models as Proof
3.7.3 ALife Models as Evidence
3.7.4 ALife and Quantitative Results
3.8 ALife and the Evolution of Language
3.8.1 ALife and the Biological Evolution of Language
3.8.2 ALife and the Cultural Evolution of Languages
3.8.3 Coevolution of Language Ability and Languages
3.9 Summary

Chapter 4 An Artificial Life Model for Investigating the Evolution of Language
4.1 Introduction
4.2 Computational Models of Language Learning Populations
4.2.1 Evolution of Language Ability
4.3 Modelling Human Language
4.4 An Artificial Neural-Network Based Language Agent
4.5 Evaluating Communication in Artificial Populations
4.6 Modelling Language-Physiology Coevolution
4.7 A Language Agent
4.7.1 Signal Production
4.7.2 Signal Interpretation
4.7.3 Learning
4.7.4 A Note on Terminology
4.7.5 Representational Capacity
4.7.6 A Bias for Language Learning
4.7.7 The Individual Language Agent – 1
4.7.8 The Individual Language Agent – 2
4.8 Experiment 1: Language Negotiation
4.8.1 Emergence of a Language Bias
4.8.2 Success of Language Negotiation
4.8.3 Interpreting the Results
4.9 Experiment 2: Spatially Arranged Populations
4.9.1 Emergence of Dialects
4.10 Conclusion

Chapter 5 Modelling the Biological Evolution of Language
5.1 The Evolution and Emergence of Language
5.2 Modelling the Biological Evolution of Language Ability
5.2.1 Heterogeneous Language Abilities
5.2.2 Comprehension Leads Production
5.2.3 Population Generations and Replacement
5.2.3.1 Genetic Representation and Reproduction
5.2.3.2 Selection, Mating and Replacement
5.2.3.3 Crossover and Mutation
5.3 Modelling Language-Physiology Coevolution, Part I
5.3.1 Crossover and Mutation
5.3.2 Parameters and Settings
5.3.3 Discontinuous Evolution of Signalling Ability
5.4 Modelling Language-Physiology Coevolution, Part II
5.4.1 Crossover and Mutation
5.4.2 Parameters and Settings
5.4.3 Continuous Evolution of Signalling Ability
5.5 The Effect of Neighbourhood Size
5.5.1 Parameters and Settings
5.5.2 The Effect of Large Neighbourhoods
5.6 Coevolution with a Costly Language Ability
5.6.1 Parameters and Settings
5.6.2 Spatial Selection and Costly Language Ability
5.7 Discussion
5.7.1 Continuous versus Discontinuous Evolution of Language
5.7.2 Spatial Selection
5.7.3 Redundancy and Linguistic Ability
5.7.4 Investigating The Adaptive Benefits of Language
5.7.5 Embodied Communication
5.7.6 Limitations and Shortcomings
5.8 Summary

Chapter 6 The Cultural Evolution of Language and The Emergence and Maintenance of Linguistic Diversity
6.1 Introduction
6.2 An Initial Test
6.3 Human Linguistic Diversity
6.3.1 Patterns of Diversity
6.3.2 Linguistic Boundaries
6.4 Analytical Models of Linguistic Diversity
6.4.1 The Niyogi-Berwick Model
6.4.2 Criticism of the Niyogi-Berwick Model
6.4.3 The Cavalli-Sforza and Feldman Models
6.4.4 Summary of Analytical Models
6.5 Modifying the Artificial Life Model for Exploring the Emergence of Linguistic Diversity
6.6 Experiment 1: Emergence and Maintenance of Dialects
6.6.1 Experimental Setup
6.6.2 Results
6.6.2.1 Visualisation
6.6.2.2 Experimental Results
6.6.2.3 Measuring Diversity
6.6.3 The Effect of Neighbourhood Size
6.6.4 Analysis
6.6.4.1 Maintaining Diversity
6.6.4.2 Human linguistic diversity: A comparison
6.7 Experiment 2: Diversity from Homogeneity
6.7.1 Experimental Setup
6.7.1.1 Noisy Learning
6.7.1.2 Innate Bias
6.7.2 Results
6.7.3 Discussion
6.8 Summary

Chapter 7 Cultural Evolution in an Agent Based Model of Emergent Phonology
7.1 Artificial Life and Micro-Simulation Models of Linguistic Diversity
7.1.1 Functional Requirements for Diversity
7.1.2 Other Models of the Evolution of Linguistic Diversity
7.1.3 Related Models
7.2 Dialect in an Agent Based Model of Emergent Phonology
7.2.1 de Boer’s Model of Emergent Phonology
7.2.2 Experimental Setup
7.2.3 Results
7.3 Discussion: Towards a Modified-Neutral Theory of Language Change and Diversity
7.3.1 Neutral Evolution Revisited
7.3.2 Adaptive, Maladaptive and Neutral Change
7.3.3 Universality and Uniformitarianism
7.3.3.1 Universality
7.3.3.2 Uniformitarianism
7.3.4 Relativity and Non-Uniformitarianism
7.3.4.1 Linguistic Relativity Hypothesis
7.3.4.2 Non-Uniformitarianism
7.3.4.3 Directed Change in Sound Systems
7.3.5 Neutral Change in a Relativistic, Non-Uniformitarian World
7.3.5.1 Neutral Change In Sound Systems
7.3.5.2 Neutral Networks in Language Evolution
7.3.5.3 Neutral and Adaptive Evolution
7.4 A Modified-Neutral Theory of Linguistic Evolution
7.5 Conclusions

Chapter 8 Conclusions
8.1 Artificial Life and the Evolution of Language
8.2 Methodological Approaches
8.3 Redundancy in Language Evolution
8.4 Language Change and Neutral Evolution
8.5 Future Directions

Appendix A
Appendix B Colour Plates
References

Chapter 1 Introduction

1.1 Outline

In this work I apply computational modelling methods to study some questions on the evolution of language. The computational approach taken is an ‘Artificial Life’ one, and just what this entails is necessarily reviewed early in this work, in Chapter 3, and is also outlined in brief below.

It is also necessary to review some of the vast array of work aimed at answering the questions, big and small, that exist about the evolution of language: why did the ability of humans to use language evolve, and why only for Homo sapiens, and how did the capacity for language evolve? Other, subtler and more fundamental, questions emerge: what is language? What is actually innate, and biologically evolved, in the human capacity for language and what is simply the result of cultural learning processes over many generations?

Thus, two distinct forms of language evolution become apparent – the biological evolution of some innate linguistic ability, and the cultural evolution of specific languages and language families. After the terminology of Hurford (1999), who distinguishes these as the evolution of language and the evolution of languages, these two forms are here termed the Evolution of Language (EoL) and the evolution of languages (eol) respectively. (By convention, in the remainder of this thesis these two types of evolution of language will be referred to by their abbreviated forms. The longhand phrase ‘evolution of language’ will be used when referring to the evolution more generally.)

A broad picture of current thought on these two forms of linguistic evolution is given in the literature review (Chapter 2). In Chapter 4, I describe the basic details of the computational model I will use in my own investigations. Some tests are carried out on the model to demonstrate its workings. Then, in Chapter 5, the model is used to demonstrate the continuous evolution of linguistic ability within a population of language users. I will argue that this demonstration is relevant to the continuity-discontinuity debate that exists on the EoL, and emphasises some of the conditions necessary for the EoL.

In contrast, in Chapter 6, the model is adapted to investigate the eol. Additionally, a model of emergent phonology is used to support the results gained from the existing computational model. As with EoL, there is current debate in studies of the eol on
how computational models may aid in the formulation of arguments and in the provision of additional evidence. The results of the experiments are analysed, and exactly what they say about the evolution of language explored, in the final chapter of the thesis.

An important goal of this research is to explore the suitability of Artificial Life techniques for study of the evolution of language and languages, and accordingly a critical eye is cast over the work. While the number of investigators now using Artificial Life methods to investigate the evolution of language seems to indicate that Artificial Life is a useful approach, some shortcomings in my own work and a few in the other works will, with this in mind, be highlighted in the conclusion.

As indicated above, a quick introduction to Artificial Life is in order. This is given below, and will be reviewed in more detail in Chapter 3.

1.2 Artificial Life and Artificial Societies

Artificial Life, hereafter ALife, is a comparatively recent offshoot of Artificial Intelligence and biology, with its roots in the unrelated fields of distributed Artificial Intelligence (DAI) and computer simulation. The term Artificial Life is itself quite new, coined by Chris Langton, organiser of the first workshop on Artificial Life (Langton, 1989). The workshop was organised to bring some unity to the fragmented literature which existed on biological modelling and simulation. Through the workshop, Langton reports that some consensus of the “essence” of Artificial Life emerged:

“Artificial Life involves the realization of life-like behaviour on the part of man-made systems consisting of populations of semiautonomous entities whose local interactions with one another are governed by a set of simple rules… high-level dynamics and structures observed are emergent properties, which develop over time from out of all of the local interactions…” (Langton, 1989, p. xxii, author’s emphasis)

This states that Artificial Life is simply an advanced modelling technique – one where the behaviour of whole populations can be observed, although the rules for how the population might behave are not explicitly detailed within the model. Instead, the behaviour of individuals within the population is detailed, this hopefully
being less reliant on possibly incorrect and simplistic assumptions. Through computational processes, individuals interact repeatedly and over time and the results at the population level are observed.

Although seeming new, the computational approach to micro-modelling had in fact been explored before by scientists in other communities, with some early groundbreaking work in social modelling in particular (Schelling, 1978). This idea has been remarkably successful, and ALife techniques are now applied to a wide range of problems beyond biological and social modelling. Societies and journals exist to support researchers using these new computational techniques in various fields – the Artificial Life journal, the International Society of Adaptive Behaviour, The Journal of Artificial Societies and Social Simulation, and more – and a number of international conferences are held annually with ALife as a central topic.

Also, as evidenced by this thesis, and a number that have gone before it, ALife is being increasingly applied to the complex issues surrounding the evolution of language (for further examples, also see (Hurford et al., 1998, Dessalles and Ghadakpour, 2000, Cangelosi and Parisi, 2002)).

1.3 A Note on Previous Publications

Much of the work presented in this thesis has been previously published in different form. The material of Chapters 2 and 3 is drawn from the literature review, which was presented as part of the transfer report on the 28th of January 1999. This has been substantially altered since, with a number of additional sections. Chapter 3 also includes much of the article (Livingstone, 1999a), as published in the C.I.S. departmental journal at the University of Paisley, again with additional material.

The description of the model presented in Chapter 4 is drawn from, but with considerably more detail and expansion, previously published descriptions in (Fyfe and Livingstone, 1997) and (Livingstone and Fyfe, 1998a, Livingstone and Fyfe, 1998b, Livingstone and Fyfe, 2000). The experiments of Chapter 5 are based on the same works, from initial exploratory experiments to later developments.

The bulk of the material of Chapter 6 was previously presented in different forms in (Livingstone, 1999b, Livingstone and Fyfe, 1999a, Livingstone and Fyfe, 1999b, Livingstone, 2000b) and (Livingstone, 2002). As a matter of course, many of these reproduce selected details of the model construction, as presented here in Chapter 4. Some of the material detailing work with an alternative phonological model, as presented in Chapter 6, was previously presented in (Livingstone, 2000a).

Chapter 2 The Evolution of Communication and Language

2.1 Introduction

The aim of this chapter is to provide a grounding in the domain of study – that of the evolution of language. Some definitions of terms that will be used throughout the thesis will be given, the problems which this thesis will later investigate will be detailed, and some of the many still open questions about the evolution of language will be identified.

The evolution of language and communication is an inherently inter-disciplinary field of research, of interest to researchers in anthropology, archaeology, biology, linguistics, psychology, physiology and neuro-physiology (e.g. Bickerton, 1984, Pinker, 1994, Dunbar, 1996, Mithen, 1996, Deacon, 1997, Hurford et al., 1998). Thus an area of research has been staked out that, while apparently focussed, inhabits an incredibly diverse space.

In this review, I draw on sources from many different fields, looking first in broad terms at language and communication – specifically at what differentiates human language from other systems of communication. Then I review some theoretical work on the evolution of communication. This concentrates on animal communication rather than human language, but provides a base for much work on the EoL. Here we see links to another fundamental problem, one which has also been the subject of much ALife investigation – the evolution of cooperation (Axelrod, 1984).

I then turn my attention to work on the biological evolution of language. First I introduce some of the evidence for a biological basis for language, and some of the major theories on innate linguistic abilities. I proceed to review some work questioning the precise nature of the innate language ability before looking at some attempts to explain why and how language evolved.

Finally, I move from work on the biological EoL to look at the cultural eol. Here the concern is with the historical development of particular languages, how languages change over time, rather than how and why humans have gained the ability to use language. However, as I will show when questioning the degree of innateness of language, there is some degree of crossover between these questions. In particular, some recent work
on culturally evolving linguistic systems provides a challenge to current ideas about the human biologically innate language ability.

This chapter provides the background knowledge required before meaningful, and hopefully useful, ALife experiments can be carried out. Before we can proceed to model the evolution of language, it is required that some understanding of the distinction between non-linguistic communication and language is gained. Then we can hope that the models which are developed are relevant to the evolution of language. This requirement for making a distinction is expanded upon in the next chapter.

As a last word in this introduction, note that, in this chapter, I will introduce a number of terms with special meaning within linguistics. This is unavoidable – linguists use a great deal of jargon, and have created a rich language for discussing language itself. I have deliberately kept the use of such jargon to a minimum, emphasising only terms of special importance or those which will reappear later in this thesis.

2.2 Language and Communication

A spectacular variety of means of communication between animals (of the same or different species) has evolved, human language being but one, albeit unique in a number of respects. Animal signalling can take many forms and signals can use sound, vision, touch and even scent or taste as a means of signal transmission/reception.

In this work I side-step a rather fundamental, and surprisingly intractable, problem for theoretical biologists – that of defining what communication actually is. To get some idea of this problem consider first a simple view of communication, commonly assumed:

A signal can be said to be communicative where the originator of the signal intentionally uses it to transmit information to the receiver.

There are significant problems with such a simple definition. First, if signalling is viewed simply as the transfer of (mis)information from one individual to another, then signals may be given voluntarily or involuntarily (Krebs and Dawkins, 1984). Bullfrogs croak to advertise their quality as mates. Can we say that any bullfrog intends to advertise itself as being of low quality? Instead the attempt of the female to
determine which will be a good mate by listening to the croaks is what Krebs and Dawkins term ‘mind-reading’. The mind-reading perspective views signals as an attempt by a signal receiver to determine some information about some target which is emitting the signal – but the sender may not intend for the signal to be observed, or for a particular inference to be made from it. Any attempt to cite purpose on behalf of the receiver to circumvent this problem is similarly limited – a complementary ‘manipulation’ perspective views signals as an attempt to exploit another individual on behalf of the signaller.

Attempts to define communication in terms of information transfer (e.g. Grice, 1969) have problems too. For example, how should deceptive signals be classified? Additionally, in a theoretical study, Di Paolo (1999b) notes that communication can be useful even where the signals have no informational content. (In linguistics such non-informational communication may be termed phatic where the intent of the communication is, e.g., to establish a common ‘mood’.) Thus neither intent nor informational content can be assumed when attempting to define communication.

Another approach, again from biology, developed by Millikan, is to declare signalling systems to exist only where selection has adapted the communicating agents to play their cooperative roles in signalling episodes (Bullock, 2000). This solves the problems noted above, but leaves many apparent examples of communicative behaviour outside the definition of signalling systems. Bullock further notes that in many cases in biology, communication involves situations in which the interests of the agents coincide and conflict in parts.

That a single, universal, definition of signalling has yet to be found that satisfies the whole breadth of studies into communication necessitates our selection of a more limited definition. Yet, however imperfect, limiting the definition of communication to intentional signalling which aims to benefit both sender and receiver of a signal – limiting it to deliberate, cooperative signalling – can provide a useful start point for studying the evolution of communication. This simplifying definition is assumed as a given in my later experiments on the evolution of language.

Further discussion of many of these issues can be found in theses and papers by Bullock and Noble (Bullock, 1997, Bullock, 2000, Noble, 2000). Relating to this, and other issues, there is more discussion on the functional benefits of language in the remainder of this and in the next chapter.

However, rather than struggle with definitions, my goal here is to distinguish between linguistic and non-linguistic communication; one way of doing this is to compare the characteristic features of language versus those of other communication systems.

2.2.1 Features of Language

Charles Hockett (1960) suggested that language evolved gradually over time, suggesting a continuity with non-human communication systems. Hockett broke language down into thirteen components, representing one of the first modern attempts to identify precisely what is unique about human language compared with other forms of communication. It is worthwhile listing these features. The features are chosen as characteristic of human language – exceptions may exist, and some of these will be mentioned. The features of language which Hockett identifies are:

1. Vocal-auditory channel: sounds are used for language transmission.
2. Broadcast transmission and directional reception: signals can be heard by anyone within range, and the signals can be localised.
3. Rapid-fading: speech is transitory in nature; signals exist for only a brief period of time.
4. Interchangeability: language users can produce any message that they are capable of understanding.
5. Total feedback: speakers can hear and are able to reflect upon everything they say.
6. Specialization: the sound waves’ only function is to signal meaning.
7. Semanticity: elements of a signal convey meaning through stable signal-meaning associations.
8. Arbitrariness: the signal-meaning associations are arbitrary.
9. Discreteness: language is based upon the use of a small set of discrete sound elements.
10. Displacement: it is possible to use language to refer to events displaced in both time and space from the speaker.
11. Productivity: it is possible to produce (and understand) an infinite range of meanings, using existing elements to produce new sentences.
12. Traditional transmission: language is transmitted by a cultural learning process rather than genetic transmission of innate calls.
13. Duality of patterning: the sounds used in language do not convey meaning, but combined in different ways to form words they do.

While many of these features can be found in various animal communication systems, only human language contains all thirteen of these features. It was considered that only human language featured traditional transmission or duality of patterning, but this is now known not to be the case. However, like displacement, which other than human language is only known to exist in the dance ‘language’ of bees, these features are very rare in natural communication systems.

And while the thirteen features are characteristic of human language, they are not definitive – sign-language being an obvious exception. Of these, the first three features relate to the auditory nature of language. Yet it has been observed that communities of deaf children will form their own natural sign-language (Kegl and Iwata, 1989). This challenges the significance of, at least, the first two features.

Hauser (1996) points out that no insight is provided into the functional significance of these features, and that they are not weighted at all by their relative importance. So we do not know which are the significant features of language simply by scanning this list. In building an ALife model to help investigate the EoL, it should not be necessary to include all of these features, but unless the model is able to include some significant features it may not be a useful model for its intended purpose.

To emphasise the features of greatest significance, it is helpful to compare the previous list with another from a more recent source. A textbook listing (from Sternberg, 1996) states that the characteristic features of language are that it is:

• Communicative: language is used to transfer and share information.
• Arbitrarily symbolic: the sounds and symbols used in language are primarily arbitrary, having no set relation to the underlying meanings. The signals used in human language for communication bear no relation to the concepts being communicated, other than a purely arbitrary pairing of sound and concept. This principle of the ‘arbitrariness of the sign’ is expounded in the work of Ferdinand de Saussure (1916), and the idea can be traced back to Aristotle and beyond.
The term ‘Saussurean communication’ refers to communication systems where concepts and signs are related to one another by such arbitrary relationships.
• Regularly structured and structured at multiple levels: words are constructed from phonemes, phrases from words and sentences from phrases. Structure exists in language at multiple levels. Morphology looks at the internal structure of words, syntax at how words are combined to form phrases and sentences. Regular structures occur at these different levels.
• Generative and productive: it is possible to use language to express effectively infinite numbers of new meanings. Learned language can be used to produce original statements. That language has hierarchical structures allows a finite number of language elements to be used in the generation of infinite numbers of sentences and meanings. Here there is a significant difference between human language and animal communication systems.
• Dynamic: language is not fixed, but develops and changes with time. Language changes and develops over time – both for individual language users as their own language skills develop, and for communities as languages change over time. A considerable body of work exists studying the development of language in children (e.g. Elliot, 1981).

More recently, Carstairs-McCarthy (1999) highlights three key features of language. Beside the previously noted duality of patterning, two features are proposed which are not normally considered characteristic features of language. The first of these is the number of words with distinct meanings that may exist in any language. The other feature proposed is the distinction between particular syntactic categories, introducing different types of meanings – this last feature seemingly peculiar to the author’s theory of the evolution of language.

2.3 The Evolution of Communication

Amongst the vast array of animal communication systems are such diverse systems as bee dances to convey information about the location of pollen-rich flowers, ant pheromone communication to lead other ants from the same colony to food and back to the nest, and vervet monkey alarm calls used to warn other vervets of the presence of specific types of predators – eagle, leopard or snake. Biologists have been concerned for some time with how such systems could evolve.

Lorenz (1966) and Tinbergen (1976) claim that many signals may have evolved from incidental movements or responses of actors which happen to pass information to reactors. This can succeed as selection favours those able to interpret such signals, e.g. to anticipate (and avoid) attack, and those who produce the signals, e.g. to scare off others with threat of attack and hence use less energy than in an actual attack. What was once an accidental pairing of some action with a response is repeated and reinforced over time until it becomes a conventionalised communicative signal and response – a process termed ritualization. Ritualization may also occur to reduce ambiguity in signal reception, increasing the distinctiveness of the signalling movement. How a signal is ritualised may vary with the purpose of the signal and the rewards for actor and reactor.

In some cases communication serves a clearly co-operative purpose – vervets warning each other of predators, or bees indicating where pollen may be found – and studies in the evolution of communication often focus on communication that is cooperative in nature (Hauser and Marler, 1999). If communication is cooperative, then it might be expected to emerge in similar circumstances to those that can support the evolution of cooperation.

Assuming that communication is cooperative leads naturally on to another problem – that of the evolution of cooperation. The fundamental problem for the evolution of cooperation is that in many cases cooperation does not appear to be an evolutionarily stable strategy. If cooperation emerges in a population, any defectors that exist will be able to exploit the cooperation without rewarding those they exploit. Thus defection rather than cooperation will succeed in the population.

A number of mechanisms have been proposed to overcome this problem. Kin selection (Hamilton, 1964) is one – where cooperation occurs between kin, individuals within groups of kin which have evolved to cooperate will be fitter than those which live in non-cooperating family groups. Over time this can lead to a population where cooperative behaviour is the norm.

Zahavi (1979) proposed a handicap principle for honest signalling in sexual selection. This presumes that reactors will try to avoid manipulation by attending to honest indicators of qualities of size, strength, desire to fight, etc., ignoring other signals which then fall into disuse. Thus, surviving signals will be honest and costly to produce (thus only those individuals of higher quality will be able to produce higher quality signals). There should also be a direct link between signal design and quality
being signalled. For example the pitch of the calls of frogs, used to attract mates, is linked to body size.

Other mechanisms that can support the evolution of cooperation include spatial selection and the ability of individuals to remember which of the others is cooperative, and to act differentially based on past experience. These can both play a significant role in the iterated prisoner’s dilemma (IPD) (Axelrod, 1984) where the individuals play not just once, but many times. Axelrod formalised a successful strategy for cooperative behaviour in the IPD – ‘tit-for-tat’ – where initially cooperative individuals either cooperate or defect depending on whether their partner previously cooperated or defected.

Spatial selection works as an alternative to memory, and can similarly favour the evolution of cooperation. Working with the assumption that individuals are limited in the amount of mobility they have, models have been built in which agents are placed in some spatial arrangement and where they are limited to interacting only with other agents nearby. Because of the spatial constraint, agents tend to repeatedly interact with the same few agents and this allows clusters of cooperation to form. In such circumstances, cooperation can succeed (Kirchkamp, 1996).

Some researchers are investigating how the evolution of non-cooperative communication (such as predator-prey signalling) can be understood and explained (e.g. Bullock, 2000). The concern here is that not all animal signalling is cooperative, and this is certainly something which can extend to human language.

A thorough and comprehensive review of the evolution of animal communication (including some material on the evolution of human language) can be found in Hauser (1996). Many other relevant topics are reviewed, including such aspects as how different environmental conditions shape the particular form of communication used – such as how the different acoustic properties of jungles versus those of open savannahs will affect the type and use of acoustic signals for communication.

This section has presented a brief review of two large topics of study – the evolution of communication and the evolution of cooperation. It is, I hope, enough to give the required base for the remainder of this thesis.
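
As an illustration of the ‘tit-for-tat’ strategy in the iterated prisoner’s dilemma described in this section, the following is a minimal sketch only. The payoff values (5, 3, 1, 0) are the conventional textbook ones and are an assumption here, as are all function names; none of this is taken from the model used later in this thesis.

```python
# Minimal iterated prisoner's dilemma (IPD) sketch.
# ASSUMPTION: payoffs use the conventional illustrative values
# (temptation 5, reward 3, punishment 1, sucker 0).
COOPERATE, DEFECT = "C", "D"

PAYOFF = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def tit_for_tat(partner_history):
    """Cooperate on the first round, then copy the partner's previous move."""
    return COOPERATE if not partner_history else partner_history[-1]


def always_defect(partner_history):
    """A pure defector, used here only for comparison."""
    return DEFECT


def play_ipd(strategy_a, strategy_b, rounds=10):
    """Play the two strategies against each other; return their total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each player sees only the other player's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_ipd(tit_for_tat, tit_for_tat))      # (30, 30)
    print(play_ipd(tit_for_tat, always_defect))    # (9, 14)
    print(play_ipd(always_defect, always_defect))  # (10, 10)
```

Under these assumed payoffs, two tit-for-tat players earn 30 points each over ten rounds, against 10 each for two defectors, while tit-for-tat is exploited by a defector only in the first round. This is the sense in which repeated play and memory of a partner's past behaviour can allow cooperation to succeed.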

2.4 The Biological Evolution of Language

2.4.1 Evidence for the Innateness of Language

It is an unavoidable fact that language is in some way an innate, and hence biologically determined, human trait as emphasised by the following evidence. Language is confined to only one species, amongst whom it is a universal trait, shared by all with only extraordinary cases providing exceptions. No monkey or ape, however well trained, is capable of acquiring language as well as a human child of around five years of age (although the precise capacity that apes do have for language is a current topic of debate (Savage-Rumbaugh, 2000)).

Of all human societies and communities throughout the world, there is no culture lacking language, or possessing only a limited language that can truly be considered ‘primitive’ in its linguistic development (Pinker, 1994). While the idea that primitive or ‘savage’ societies had equally primitive or savage languages was accepted in previous centuries (e.g. Rousseau, 1755, Burnett, 1787), this has proved to be a fallacy.

Perhaps the closest examples to ‘primitive languages’ to be found are pidgin languages. Pidgin contact languages are formed by migration of peoples with different languages into single communities, and a wide range of these have been studied including sign-language pidgins (Kegl and Iwata, 1989). Pidgins are grammatically weak, missing many aspects of grammar found in every other language. Yet within a small number of generations – at times only one – pidgins develop into creoles (Bickerton, 1981). A creole is a descendant language of a pidgin, but one that is rich in grammar. This rapid development of grammar where there was none is a rich source of evidence on the innateness of grammar abilities in humans.

But what is the biological basis for language, and in what ways does it determine the shape and form of human languages? The ability of children to quickly learn complex languages from fragmentary and incomplete evidence led Chomsky to argue that children are innately equipped with rules common to all languages – a Universal Grammar, UG. While different languages have different grammars, the existence of some kind of Universal Grammar (Chomsky, 1972) is now widely accepted (e.g. Slobin, 1979, Pinker, 1994). The UG is believed to contain a number of rules and meta-rules for grammar. For example, phrase head and role-player ordering (e.g. verb-object, preposition-noun phrase,
adjective-complement) differs between languages, but remains constant in any single language. Thus, the UG contains a rule used in phrase construction, which allows either order but the same order throughout any language. As well as the UG, Chomsky proposed that there was an innate device for learning language.

2.4.2 Evolution of Linguistic Ability

The innate device for language learning was termed by Chomsky the Language Acquisition Device, LAD. The LAD has been the focus for much research, and despite the fact that a LAD would surely be an evolved solution for language learning, the evolution of the LAD and language itself is the subject of much debate in linguistics.

Chomsky himself rarely considers the problem of how the LAD may have evolved, and does not see the study of its evolution as being of value:

“it seems rather pointless… to speculate about the evolution of human language from simpler systems…” Noam Chomsky, 1972, page 70

At times he appears to discount the idea that it evolved at all – rather he claims that it came about by chance, and only once it appeared fully formed did it become used for language.

Another view is offered by Bickerton (1990) who proposes that at some point in human evolution a basic form of language existed – one which he calls ‘protolanguage’. This protolanguage would have been lacking in a number of the features that characterise modern human language – principally it would have possessed a very limited grammar, consisting perhaps of only one or two-word utterances and incapable of complex sentence structure, or something akin to modern pidgins. A single ‘macro-mutation’ was then responsible for the evolution of fully developed language ability and the emergence of language.

Chomsky and Bickerton both present views of a sudden and discontinuous EoL. A number of other authors (e.g. Pinker and Bloom, 1992) argue instead that the EoL must have been gradual, occurring in smaller steps, and taken place over a longer time period. This is a continuous view of language evolution, and generally more in keeping with modern evolutionary theory. Yet the idea that somehow language did not ‘evolve’ but suddenly appeared fully-formed persists. Such arguments are often based on the principle of exaptation.

In evolutionary theory, adaptation is the process whereby organisms improve their fitness by gradual modification to better fit their environment. Gould and Vrba (1982) coined the term exaptation to refer to the process of utilising some structure for some purpose other than that for which it originally evolved to serve. A good example of this might be penguins’ use of wings for swimming.

Two common fallacies that support the idea of language occurring by accident are the ideas that the brain is a general-purpose mechanism or that brains vary greatly in design between species. These fallacies are refuted by Deacon (1992) who presents evidence that the brain is modular and that no major new structures have occurred to differentiate human and primate brains. Changes in brain organisation are similarly limited, despite fairly significant development and larger relative brain size in humans, when compared to other species (Clark et al., 2001), but the changes to the brain are not simply a matter of additional development – different areas of the human brain have been relatively more or less well developed than other areas.

Arguments against the idea that language is the end result of some sudden exaptation rely on the number of costly adaptations that have occurred to support language. The larger brain necessary to enable speech costs more energy to maintain and requires a longer infancy to allow brain growth to complete (see, for example, Dunbar, 1996). In order to support the extra energy cost of large brains, a reduction in energy costs elsewhere was necessary and the gut size was reduced necessitating a move to a more nutritious diet (Dunbar, 1996).

Additional indirect costs of language are the additional investments that parents must place in children to rear them. In many mammal species, infants have reasonable amounts of autonomy from birth. Not so with human infants who are totally dependent for a considerable period of time after birth, largely due to the need to complete mental development.

The dropped larynx allows greater clarity and distinctiveness in speech, but may increase the risk of choking and was taken to be clear evidence of the evolution of human physiology to support language (Lieberman, 1992) (although there are now arguments that the larynx dropped as a consequence of a move to a bipedal stance and that perhaps this was an additional factor enabling the evolution of spoken language, a useful exaptation (Aiello, 1996)). At the same time it has been questioned whether the risk of choking is evolutionarily significant (Aiello, 2002). In contrast to this, other work has pointed to different evolutionary changes that may have been selected for their improvements to the ability to speak clearly (Sanders, 2002).

Lieberman (1992) and Deacon (1992) describe many of the adaptations in brain and body physiology for speech and language. The number of changes required, including the dietary ones, could not conceivably have occurred unless over some considerable length of time. Thus, the argument goes, language could not be the result of some exaptation of a fully formed language organ.

Evidence from archaeological studies on the evolution of the human brain relating to particular language adaptations is somewhat contentious with alternative explanations or interpretations existing for much of the evidence (Buckley and Steele, 2002). Despite this, the many adaptations and changes required effectively rule out the possibility that human language somehow developed by a single surreptitious accident.

Further arguments against the gradual and continuous evolution of language are based on claims that it is not possible to have ‘half-a-grammar’, as the cost to individuals of some partially formed language ability would be too great to support were it not immediately useful, and that language evolution (perhaps from some proto-language) had to occur in an all-or-nothing discrete, single step (Bickerton, 1984) (an argument now modified to take in two or three such steps (Calvin and Bickerton, 2000)). Burling (2000) contests this and points to different stages of child language to demonstrate that partial grammars can – and do – exist. Despite their limitations compared to full human language, such partial grammars (partial only in being less capable and powerful than modern grammars) are still useful, and would confer an adaptive benefit to their users.

2.4.3 Questioning the Innate Linguistic Ability

While disagreement exists over how innate language ability came about, it is generally agreed that there is some innate ability. Which leads us to the next question: what exactly is it that is innate? Chomsky’s theory states that language is innate in that a number of language principles exist in the minds of children before they begin to learn language – and it is due to these that children are able to learn language and that a UG exists across all human languages.

An extreme innatist view is that the LAD/UG is itself innate, and every rule in the UG is somehow ‘hard-wired’ in the brain. Constraints covering aspects of grammar such as branching order, subjacency, etc. are all hard-wired and each evolved to

serve its particular linguistic purpose. This appears to be roughly what is proposed by Pinker (1994, Chapter 10). Some evidence to support the idea that specific rules are biologically determined comes from the discovery of hereditary language disorders (Gopnik and Crago, 1991).

However, this ‘strong’ view of an innate UG is under attack. The evidence of hereditary specific language impairment has been questioned (Bates and Goodman, 1999), as has the degree to which language is determined by mechanisms internal to the brain. For example, Deacon (1992) argues that language universals are shaped not only by the structure of the brain and its limitations, but also by pragmatic limitations and peculiarities of the medium used, and other factors in human social behaviour. Thus language and its use depends on a host of limitations, including many imposed by the processes of speech production and perception as well as internal physiology and external environment, and not all are necessarily innate. Such considerations are also mentioned by Dunbar (1996) when he studies size of informal conversation groups.

Kirby (1998) suggests that some language universals may be the result of historical evolution of language in the cultural domain rather than the result of evolution acting on genes. In being ‘innate’, language is somehow constrained by the brain, and coded in the human brain. In other words, all human languages exist within the space of innately possible languages, but they do not necessarily fill the space. As language is passed from one generation to the next, cultural processes may have significant influence. Additionally, the ease with which heard utterances can be parsed, and the ease of generating such utterances, may lead to some innately possible grammars not appearing in any human language. So, evolution in language form may be responsible for some features otherwise assumed to be innate in the Chomskyan LAD. This view is expanded upon in Kirby (1999). So not all features of the LAD can be said to have ‘evolved’. Care must be taken in assuming that any feature of language is due to genetic or phylogenic rather than glossogenic (cultural) evolution.

The innateness of language is also questioned by Elman (1999), who argues that the idea of what it means for something to be innate is itself under-specified. Representational innateness, where the precise pattern of neural connectivity is pre-specified, is ruled out as the genome is incapable of encoding sufficient information to do this. Elman concludes

that language is innate but emphasises that grammar is not encoded in genomes. Rather, the innate grammar is a consequence of many interactions of genetic expression – in e.g. determining general properties of neurons, local connectivity rules and brain structure – and of ontogenetic development – most importantly in neural development and learning. This subtle rethinking, from a biologically and genetically innate LAD to an emergent LAD, is the unifying theme of the papers in the book in which Elman’s paper appears (MacWhinney, 1999). These last arguments do not rule out strongly innate language, instead they question how language may be innate.

2.4.4 Why and How? Just So Stories and Grand Theories

Perhaps the most controversial aspect of studies in the evolution of language is the attempt to explain the reason for the evolution of language in Homo sapiens. Such attempts are generally contentious and hard, if not impossible, to prove or disprove. A common criticism of work which attempts to explain in detail the process of exactly how language evolved in humans is that the resulting theories are “just so stories” – stories which ultimately cannot be proven because of a lack of evidence, language leaving no direct physical trace in the fossil records. However, by studying the evidence that is there, it is possible to suggest what the evolutionary pressures were that led to language and, since Chomsky’s exposition of an innate LAD, attempts to explain the origin of language have again become popular.

While not particularly concerned with the question of the functional origins of language, Lieberman (1992) claims the function of language is obvious:

“The contribution to biological fitness is obvious. The close relatives of the hominids who could rapidly communicate Look out there are two lions behind the rock! were more likely to survive, as were hominids who could convey the principles of the core and flake toolmaking technique in comprehensible sentences” (Lieberman, 1992, p23)

However, this argument is flawed. Vervet monkeys can make do with simple warning vocalisations to identify and warn of a variety of predators, why not hominids? To convey the principles of a simple tool making technique, a lesson in the form of a demonstration only would be more effective than one that was given by speech alone. There must be more to the fitness benefits derived from language.

Armstrong et al. (1994) present an argument that syntax evolved from signed language in early hominids. One of the main claims for this is that primates are more able to learn to communicate with humans through signing than through vocalisations – that primates have more wilful control over signing than over vocalisation. However, there is a lack of clear evidence that such signing is natural to primates and there is some evidence that it is not (Pinker, 1994, p338-342). The other main claim is that manual signing, by its very nature, includes some elements of syntax – miming an action includes both action and object elements. Such signing is not necessarily at all complex, however, and there is no argument presented claiming that primates naturally use strings or combinations of signs in a way that indicates some elementary grammar. Further, apes and chimps do have some physical ability for varied sound production, and are presumably only lacking in mental structures to finely control speech production and, more importantly, the capacity for language be it some form of LAD or whatever (Pinker’s ‘Language Instinct’ perhaps).

Other theories, generally not currently favoured, include ones that propose language emerged from singing (for example, Skoyles (2000), and recent evidence linking one of the key regions of the brain for grammar and language with a role in music appreciation (Maess et al., 2001)). Such theories, positing a particular ‘original purpose’ or ‘original method’, are contentious and generally limited in evidence.

One alternative theory that is popular is that language evolved out of its usefulness as a social tool (c.f. Dunbar, 1996; Deacon, 1997). Dunbar (1996) suggests that language evolved primarily to allow the formation, and maintain the cohesion, of larger groups than is possible with physical grooming (although clear evidence of early hominid group size is lacking (Buckley and Steele, 2002)). Language may be used for grooming and gossip. Language can also perform a variety of other useful functions within such populations. This theory is quite broad, with many benefits, and it avoids picking a single, highly specialised ‘original function’ of language, instead suggesting a range of language functions. As Deacon puts it, “Looking for the adaptive benefits of language is like picking only one dessert in your favorite bakery: there are too many compelling options to choose from” (Deacon, 1997, p377). Some of the benefits of language suggested include:

• Grooming: maintaining relationships between individuals. Physical grooming is generally limited to two participants. Verbal grooming could be used to maintain a relationship between one speaker and around three listeners. This allows individuals to maintain relationships with a greater number of other group members, and ultimately allows the formation of larger groups. Larger groups benefit from mutual support against other groups and predators.

• Gossip: gossip is useful for social cohesion, and transfer of information about others in a social group – which is possibly the most important information that can be known in such a group. Gossip allows an individual to learn about the strength, honesty and reliability of potential rivals and allies without the need for direct observation. This also reduces the time required to form new relationships.

• Second hand information: more generally, a large variety of information about the world can be learned without the need for direct experience and repeated trial and error by every group (Pinker and Bloom, 1992). Children can learn what is dangerous and what is nutritious without having to taste every mushroom or walk up to every animal. This knowledge can be complex, extensive and detailed even across generations. (Also see Cangelosi and Harnad, 2000; Cangelosi et al., 2002)

• Mate selection: language can also be used to help advertise quality, in the search for mates. The need to keep a mate entertained could drive a mental arms-race between males of a species in the quest for females. Those who succeed will have more offspring, driving the evolution forward.

• Kin selection: the development of different languages and accents allows easy identification of members of different groups.

• Cheats: members of a group can gain at the expense of others if they can secure the co-operation of others, without contributing themselves. This can be thought of as a form of the Prisoner’s Dilemma. Language gives defectors the ability to convince others, and gives co-operators the ability to look for evidence that another individual might not be reliable. This could also lead to an evolutionary arms race, as evolutionary pressure exists to be better able to convince others as well as to be able to detect attempts at manipulation (Pinker and Bloom, 1992).

Worden (1998) goes further. Not only did language evolve to serve a variety of social functions but, he argues, it piggy-backed on brain adaptations for dealing with social

situations. He claims that language ability is based on similar mental structures used for social reasoning. Social situations are structured, complex and open ended, discrete-valued, extended in space and time and dependent on sense data of all modes. This list compares well with the lists provided above for the characteristic features of language. By citing the ‘speed limit’ on evolutionary change, it is argued that it was not possible for speech to emerge in humans given the time since evolutionary divergence from our common ancestor with chimpanzees, unless language was based on such pre-existing structures, an argument similar to that of Deacon (1997).

2.5 The Cultural Evolution of Languages

Having surveyed work on the evolution of communication and cooperation and the EoL, one important topic has so far only been mentioned as an aside – that of the eol. People have been long aware that languages change over time, and there is no shortage of quotes from literature about the benefit to be gained by stopping language change, or diatribes on the degeneration of language. For example:

“… nothing would be of greater Use towards the Improvement of Knowledge and Politeness, than some effectual Method for Correcting, Enlarging, and Ascertaining our Language.” (Swift, 1712)

Or:

“Standard English is the language of English culture at its highest levels as it has developed over the last centuries… This does not mean that speakers of non-standard English cannot be verbally agile within certain areas of discourse, nor that topics traditionally discussed in the standard language are entirely barred to them.” (Marenbon, 1987)

Away from such reactionary views, language change has been the subject of serious study for many years, from a number of different approaches. In this section we provide an overview of some of this work.

2.5.1 Language Change

Change in language is studied at many different levels, the individual, social and historical. Typically studies concentrate on how language change operates at only

one of these levels, and this has led to the emergence of a number of complementary (and potentially contradictory) linguistic disciplines.

Psycholinguistics is concerned with how language works at the individual level. Here language change may be viewed as the result of different functional pressures on language as speakers try to communicate with listeners (e.g. Slobin, 1979).

At the social and group level, studies of language change take place in the fields of socio-linguistics and dialectology. Studies in dialectology show how changes may be spread through populations – or may simply ‘map’ the different language forms that are in use in different population groups. Such studies are generally synchronic, attempting to describe the structure of language at a fixed point in time. Sociolinguistics concentrates instead on how individuals actually use and change their use of language according to social situation, and on providing explanations for use of particular linguistic variations according to social factors (class, sex, grouping, etc.). See, for example, (Tannen, 1994; Chambers, 1995; Trudgill, 1995) or for a classic study, Labov (1972).

In contrast to sociolinguistics, which concerns itself with language in society, work in historical linguistics is by its very nature diachronic, considering as it does changes in language over periods in time. In historical linguistics, much work is done to reconstruct languages no longer spoken, or to trace and explain the changes that may have occurred in the history of a particular language or family of languages (e.g. Trask, 1996).

As a consequence of studying the differences in dialect between different geographical and social groups, it became clear that each individual in a linguistic community uses a unique variant of their dialect. The term idiolect was coined to refer to the unique variant of language possessed by any individual in a population.

While these different approaches are to some degree isolated from one another, they are all interested in language change, how it happens and why. Yet, their different interests and perspectives can lead to very different opinions on what is actually important when describing language differences. For example, in socio-linguistics, an individual’s idiosyncratic variations in language use are not considered important (as opposed to systematic variations in language use observed amongst some group of language users):

“Discovering how various “personality factors” interact to make idiolects would probably not repay the effort because they carry almost no social significance” (Chambers, 1995, page 85)

While language change is a phenomenon which occurs at many levels – internal and external to speakers, and over many generations of speakers – the explanations given for language change are almost invariably given at the same level at which it is studied.

2.5.2 Explanations of Language Change

Aware that the type of study affects the conclusions, what explanations of language change exist in the different fields that study it? Attempting to characterise each of a number of distinct fields in only a few sentences condemns me to provide grossly simplistic views – but representative ones, I hope.

Sociolinguistics, psycholinguistics and historical linguistics all concern themselves, to a greater or lesser extent, with language change. The diverse methods used and distinct focus of these fields has led to different approaches to explaining language change.

Sociolinguistic studies of language change typically observe differences in language use by different social groups within a geographical area. The spread of change is observed as it moves both across and within different social groups. Additionally, studies also note how language use is modified according to social situation, and the social benefits of having the appropriate dialect in any social interaction (Chambers, 1995). A consequence of this is that sociolinguists may view language change as only occurring because of these socially functional factors. Why adopt a novel form, if it won’t provide some benefit? Regarding sound changes, James Milroy has this to say:

“It must be the case that human beings attach great importance to changes like this: if they did not, then there would be no reason why they should implement them at all” (Milroy, 1993, page 215)

This view, that language changes and differences have some reason, some functional motivation, is shared by others (Nettle and Dunbar, 1997; Nettle, 1999a). All agree that it is the ability of language differences between individuals to act as a social marker that motivates language change and linguistic diversity. Such a marker would

p169). (Clear. 1996. quick and easy. These arguments will be revisited later. Because of these competing functions. How might a change to one part of the system affect the rest of the system? This view of language as a system gained prominence with the publication of what became known as ‘Grimm’s Law’. rather than on the social factors surrounding their interactions. from the English of Chaucer to the English of Shakespeare and on – looks more closely at language itself than at individuals who speak it. effectively and ‘reasonably’ quickly. In his Deutsche Grammatik. generally being more interested in changes which happen over long periods of time – say. Slobin (1979) states four competing functions of language. Here the emphasis is on functions internal to the speaker or listener. is internal to each speaker and listener. Historical linguistics. efficiently. changes gradually occur in languages to optimise one function or another. For example. What is studied here is generally language as a system. easy to process) The competition. In contrast.Chapter 2 – Evolution of Communication 24 serve as a badge of group membership. This competition both fuels and maintains the continued evolution of language. expressive) The listener wishes language to allow them to quickly and efficiently retrieve a clear and informative message from speech. The language functions identified are that language signals should: • • • • Be clear Be processible Be quick and easy Be expressive The constant dynamic attempt to maintain equilibrium balances language simplification versus language elaboration. Greek and Sanskrit) . and the change it produces. (Clear. Pressures on language arise because in a conversational exchange the speaker and listener have different goals: • • The speaker wishes language to express meaning clearly. and help prevent outsiders taking advantage of the natural co-operativeness of others (Dunbar. and over some population will cause historical language change. 
Jakob Grimm (1822) explained that correspondences in consonant use across a number of languages were the result of systematic changes from older languages (Latin. quite different explanations of language change exist in psycholinguistics. the competition maintaining language in a dynamic equilibrium.

(Baugh and Cable, 1978). For example, the ‘p’ sound of Latin or Greek could change to the ‘f’ sound of modern Germanic languages. A change in the opposite direction – from an ‘f’ to a ‘p’ – would not be possible. Over time this has led to the development of many Sound laws which describe ways in which systems of sounds may change over time. Despite the name, however, these are not laws but more general rules of sound change, for which exceptions often exist (Adamska-Sałaciak, 1997).

These ideas of the sounds of language as a system – and of directional changes in phonology – led to the development of chain-shift theories of sound change. Consider the sounds of language as existing in some auditory space. They may not be evenly distributed through the space. One sound may gradually shift its position in this space, leaving behind a ‘gap’. As the first sound moves away, a second may be ‘pulled’ into this gap, and over the course of time a ‘chain’ of such sound moves forms (King, 1969). Further, explanations that view language as a system are not limited to phonology, but can be extended to other aspects of language (Anderson, 1973).

Although historical linguistics is well suited to documenting and detailing changes in language use, some of its methods are limited in their ability to explain why language changes occur. A limitation of the historical approach is that it often does not explain why languages change at all – rather it provides a framework for studying the histories of changes in languages, and for describing which changes are likely or indeed possible. Viewing language as a system which is itself subject to change loses sight of the mechanisms by which change occurs, as the rules and explanations put forward are those that apply to languages as a system – and not to the speakers of the languages (Milroy, 1993). Recently, however, a number of authors – linguists and others – have made serious attempts to use evolutionary theory to explain language change, and some of these are reviewed next.

2.5.3 Language Change and Evolutionary Theory

The re-application of biological evolutionary theory to explaining (or describing) how languages evolve has a history that can be traced at least as far back as Darwin:

“The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel.” Darwin, C.R., 1874

However, (neo-)Darwinian theory of evolution is based on many mechanisms that simply do not apply to eol. Languages ‘reproduce’ by transmission from speakers to learners and other speakers. Learners do not apply ‘re-combination’ and ‘mutation’ operators to utterances they hear to form their own language. Rather than obtaining genetic material from two parents, cultural ‘material’ is received from the community around an individual. So how can evolutionary theory be applied to something like language?

2.5.4 Cultural Evolution

This question was asked by Cavalli-Sforza and Feldman (1978), in their attempt to develop a general theory of cultural evolution. Community members may have from very strong to negligible influence on another individual. The amount of influence held by members of the community is unequally distributed. The socially acquired characteristics will be determined by a summation of the influences exerted on the individual. This is expressed mathematically as:

\[ X_{i,t+1} = \sum_{j=1}^{N} w_{ij} X_{j,t} + \varepsilon_i \tag{2-1} \]

This equation is used to determine the traits, $X_i$, at time $t+1$, of the $i$th individual in a population of $N$ individuals – based on the trait values of the population at the previous time step. The proportion of $X_{j,t}$ contributed to $X_{i,t+1}$ is determined by $w_{ij}$, indicative of the amount of influence or contact between the two, with $0 \le w_{ij} \le 1$ and $\sum_{j=1}^{N} w_{ij} = 1$. Finally, $\varepsilon_i$ is a random error term.

The equation given is more than strikingly similar to the description of a social network used by Milroy (1980). A social network is, quite simply, a network describing an individual’s contacts and bonds with others. A portion of a hypothetical network for an individual called ‘Ted’ might look like the network shown in Figure 2.1.
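As a rough illustration of Equation 2-1, the sketch below applies a single time step of the update to a small, made-up community. The function name `update_traits`, the three-person weight matrix and the choice of a zero-mean Gaussian for the error term are assumptions made for this example only; the text does not specify a distribution for the error term or any particular weights.

```python
import random


def update_traits(traits, weights, noise_sd=0.0, rng=None):
    """Apply one step of Equation 2-1 to every individual.

    traits[j]     -- trait value X_j at time t
    weights[i][j] -- influence w_ij of individual j on individual i
    noise_sd      -- standard deviation of the error term (Gaussian noise
                     is an assumption of this sketch, not of the text)
    """
    rng = rng or random.Random()
    n = len(traits)
    new_traits = []
    for i in range(n):
        row = weights[i]
        # Constraints stated with Equation 2-1: 0 <= w_ij <= 1, rows sum to 1.
        if any(w < 0.0 or w > 1.0 for w in row) or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of weights must lie in [0, 1] and sum to 1")
        influence = sum(row[j] * traits[j] for j in range(n))
        error = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        new_traits.append(influence + error)
    return new_traits


# Hypothetical community of three individuals with unequal influence.
traits = [0.0, 1.0, 0.5]
weights = [
    [0.8, 0.1, 0.1],    # individual 0 is influenced mostly by itself
    [0.1, 0.8, 0.1],    # so is individual 1
    [0.25, 0.25, 0.5],  # individual 2 is more open to the others
]
next_traits = update_traits(traits, weights)
```

With the error term switched off, each new trait value is simply the weighted average of the previous values, so repeated application pulls the individuals' traits towards one another at a rate set by the off-diagonal weights.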

Figure 2.1. A partial social network for an individual called ‘Ted’ (nodes: Mary, Tom, Sue, Ted, Jo, Bill, Bob). Excepting links between members of Ted’s social network, the social networks for the other individuals are not shown.

Each individual is linked to many others in the community around them – through family, friends and work. In some instances an individual might be a member of a close-knit community, with a relatively small number of strong bonds. Other individuals might have more diffuse networks, with many weak bonds. Thus, the social bonds linking individuals vary in their strength. To describe these bonds mathematically we can use Equation 2-1, above: every individual in the community has links with every other individual, varying in strength from zero (no contact) up to one.

While Milroy was not attempting to formulate an evolutionary theory of language change, it is interesting to note that social networks map very naturally onto a separately derived theory of cultural evolution. Cavalli-Sforza and Feldman (1981) present a reworked mathematical model and Niyogi (2002) compares his own computational model of language change with this later model. We will look at these in more detail in Chapter 6 where we study cultural evolution in more depth.

2.5.5 Language Ecology and Species

Two biologically inspired theories of language evolution with much in common are language as species and language ecology. In biology, ecology is concerned with the interrelationships between an organism and its environment – which includes other organisms. Language ecology (Haugen, 1971; Mühlhäusler, 1996; Mufwene, 1997) considers human populations as environments within which languages exist. As with its biological counterpart, language ecology considers the initial conditions of an ecology – the starting language and population distribution – and stochastic events to be of importance in the ensuing evolution and

development. Mufwene argues that variations in population sizes and limits on interactions between people (social structure) strongly influence the process of creolisation. Examples demonstrate how these influences can lead to radically divergent eol in situations with what might be considered similar starting conditions. It is postulated that similar processes are at work in the normal eol: each new generation learns language from scratch based on the existing language around – whether that language is pidgin or an established language.

Within the ecology, Mufwene regards language to be a species. Like animals of the one species, the language of an individual speaker – which will vary at least slightly from all others in the population – is considered to be an organism. This view makes explicit some aspects often overlooked in objections to language evolution theories that limit the roles of function or intent in eol. Mufwene considers that “the question for historical linguistics is to determine under what ecological conditions… small actions of speakers amount to… change in the communal system” (Mufwene, 1997, p328).

Lass develops a similar view, but – borrowing from the population biology of viruses – promotes the notion of language as a ‘quasi-species’ (Lass, 1997, p375). A model of evolution, neutral to what is being evolved, is presented with arguments to show that it applies equally well to language as it does to virus quasi-species. I will briefly describe below some of the main points specifically as they apply to language.

A quasispecies is a highly variable yet self-stabilising population, which may be more densely clustered round some point than elsewhere but which may have many outliers on the periphery. Again, the individual within a language quasi-species is a single idiolect. The average language of some population may be considered the norm – but this may not be represented by any existing idiolect. The norm is a weighted average of the population.

Over time the precise shape of the “cloud” of idiolect will change and the position of the norm will shift. Sometimes a particular variant may propagate through the population, changing it. At other times a population may remain stable over long periods. Even in stable populations, variation is always present. Clusters may form away from the norm and these may be favoured and become a new selection-gradient for the language quasi-species. A new norm would be formed inside the, formerly peripheral, cluster. Using traditional linguistic terminology, we could simply say that the language had changed.
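The remark that the norm is a weighted average which need not coincide with any existing idiolect can be made concrete with a toy calculation. The sketch below is only an illustration under strong assumptions: representing each idiolect as a short vector of numeric feature values, the particular weights, and the function name `weighted_norm` are all hypothetical and are not taken from Lass.

```python
def weighted_norm(idiolects, weights):
    """Return the weighted average of a population of idiolects.

    idiolects -- list of equal-length feature vectors (an assumed encoding)
    weights   -- non-negative weight for each idiolect (e.g. relative influence)
    """
    total = sum(weights)
    dims = len(idiolects[0])
    return [
        sum(w * idiolect[d] for idiolect, w in zip(idiolects, weights)) / total
        for d in range(dims)
    ]


# Three hypothetical idiolects described by two binary features each.
idiolects = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
weights = [1.0, 1.0, 2.0]

norm = weighted_norm(idiolects, weights)

# The norm sits inside the "cloud" but matches no individual idiolect.
norm_is_spoken = norm in idiolects
```

In this toy population the norm works out to a point between the idiolects, so no speaker actually uses it; shifting weight towards a peripheral cluster would move the norm towards that cluster.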

Lass coincidentally concludes his argument with the same point as Mufwene: “What remains then… is to differentiate the ecological conditions under which selection gradients arise” (Lass, 1997, page 380)

2.5.6 Neutral Evolution

With his work on neutral evolution, Kimura (1983) showed that it was possible for evolution to occur without any apparent selective forces at work. The possibility that language evolution could be evolutionarily neutral is discussed by Lass as well as by Nettle (1999a), who both refer to Kimura. However, they have very different views on why the neutral evolution of language should, or should not, be sufficient to cause change and diversity in human languages.

Lass’ proposal (Lass, 1997, p. 354) is based on the observation that languages are imperfectly replicating systems, within which elements of linguistic ‘junk’ and other ‘marginal’ features exist. This provides ample room for variation, and allows changes to occur without disrupting the success of communication. That replication is not, and can not be, perfect means that languages will change, regardless of functional benefits.

Nettle (1999a) argues against the neutral evolution of linguistic systems on three points:

1. Random changes would be non-directional and could be expected to cancel each other out, due to an averaging effect.
2. With a neutral model it is difficult to account for diversification without geographical isolation.
3. Structural correlations in many of the world’s languages represent parallel evolution, showing that the path of linguistic diversification is not random.

Thus, Nettle proposes that in order for linguistic evolution to occur without geographical isolation, additional mechanisms are required. As previously mentioned, Nettle argues that the social functions of language are required for the emergence of linguistic diversity (section 2.5.2).

However, Nettle’s first and second points both rely on the equal distribution of individuals, with a uniform likelihood of any one individual interacting with any other. As in the theory of cultural evolution (section 2.5.4), in any group the amount of influence exerted on any one individual by any one of the others will vary

according to a number of factors. This reduces the effect of averaging, and increases the potential for sub-populations to vary from the mean. The different social networks within groups reduce the need for geographical isolation to produce linguistic diversity. We will revisit the concept of neutral evolution, and arguments for or against it, with regard to language change, in Chapter 7.

2.5.7 Co-evolution of Languages and Man

Coevolution is a term which usually applies to the processes by which two species may evolve, with the evolution of each species influencing or inducing evolutionary change in the other (Stearns and Hoekstra, 2000). It is typically used when referring to the coevolution of predators and prey, or parasites and carriers. A more generalised view of coevolution is that it refers to any two interacting, evolving, systems that influence each other’s evolution – say, the different organs of animals of the one species. Like many other evolutionary ideas, coevolution can be invoked to try to improve understanding in areas apart from biology, including the evolution of language.

Viewing the LAD (in whatever form it takes) as the biological language organ, the languages used are obviously constrained by the LAD. The LAD shapes the evolution of languages. If it is a coevolutionary system, it is possible that the LAD has in turn evolved in response to the languages that it allows to exist. As successful use of language became important for survival, this would have in turn put pressure on the evolution of the LAD. As the ability to use language has evolved, so have the languages that are used (this cultural evolution of languages is discussed further in Chapter 7).

2.6 Summary

In this chapter I have provided a short review of a great many topics in linguistic and evolutionary research. In the next chapter I look at computational approaches to evolutionary modelling, and review some of the methodological ground rules that have been developed for conducting such work. These have given rise to ground rules that I have tried to follow in conducting my own work, described in the later chapters.

Chapter 3 Artificial Life: Computational Modelling and a Methodological Approach

Over the course of the last chapter, a broad introduction to language evolution and related problem domains was given. It is the aim of this chapter to provide some detail of the investigative approach used.

3.1 The Artificial Life Approach

The quote from Langton, reproduced in Chapter 1, identifies the essence of much of ALife as the construction of systems in which local interactions of many entities, or agents, give rise to an emergent and life-like behaviour. As well as forming a definition of ALife, Langton’s statement reveals that ALife was even then a diverse field, where the work of different researchers had seemingly little in common, other than the “essence” described by Langton. A review of the proceedings from a more recent ALife conference will similarly reveal a diverse range of work.

However, there is a commonality that can be found in much ALife work. Much of the research tries to improve understanding of phenomena in the real world - whether it is the emergence of life itself, the evolution of co-operation, signalling or language or the dynamics of pricewars in a free market (see, for example, Adami et al., 1998). So despite Langton’s assertion that ALife represents “life-as-it-could-be” (Langton, 1991), it can be seen that a major use is as a scientific tool for investigation of the real world.

3.1.1 Agents

The basic element in most ALife models is the agent. In this context, an agent is a single individual in some simulated population. The agents may be very simple and abstract, interacting with one another and/or with a simulated environment. The agents themselves may be as simple as simulated ‘billiard balls’ to more complex organisms that adapt and learn (Holland, 1998). Rules govern the behaviour of agents during interactions and also when, and which, agents interact. In theoretical biology, such an approach has been termed individual-based-modelling (Grafen, 1991), and a similar approach is also used in work in social science microsimulation (Gilbert and Troitzsch, 1999).
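The idea of a population of simple agents, with one rule for what happens during an interaction and another for which agents interact, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any model discussed in this thesis: the `Agent` class, its single `signal` attribute, the averaging rule in `interact`, and the uniform random pairing in `run` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """A deliberately minimal agent: its whole state is one number."""
    signal: float


def interact(speaker: Agent, hearer: Agent, rate: float = 0.5) -> None:
    """Interaction rule (hypothetical): the hearer moves part-way
    towards the speaker's signal."""
    hearer.signal += rate * (speaker.signal - hearer.signal)


def run(n_agents: int = 20, n_steps: int = 1000, seed: int = 0) -> list:
    """Create a population and iterate it for n_steps interactions."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    population = [Agent(signal=rng.random()) for _ in range(n_agents)]
    for _ in range(n_steps):
        # Scheduling rule (hypothetical): any two distinct agents are
        # equally likely to meet.
        speaker, hearer = rng.sample(population, 2)
        interact(speaker, hearer)
    return population


if __name__ == "__main__":
    signals = [agent.signal for agent in run()]
    print(f"spread of signals after run: {max(signals) - min(signals):.6f}")
```

Keeping the interaction rule and the scheduling rule in separate places makes it easier to change one without touching the other, for instance replacing uniform pairing with pairing restricted to neighbours.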

The simulated populations may consist of handfuls of agents, or many thousands of agents, interacting according to simple rules. This is in contrast to traditional mathematical approaches where entire populations may be modelled by means of a system of complex mathematical equations (such as in traditional dynamical systems, see Hofbauer and Sigmund (1988)).

It should be noted that, for some, the term agent has a different, and more strictly defined, meaning than is used here. Some definitions require that an agent is embodied in an environment with which it is able to interact and/or that an agent must possess its own (however modelled) beliefs, desires and intents (Wooldridge and Jennings, 1995).

3.1.2 Emergence

Where a phenomenon at one level arises as the result of processes that occur at another lower level, the phenomenon can be described as an emergent one. In physics, a classic example is temperature. Individual atoms and molecules do not possess temperature – it emerges from the interactions of many different atoms and molecules. In ALife, the classic example is ‘Boids’ (Reynolds, 1987), a model of flocking birds. In this model, individual boids follow simple rules to avoid collisions, to try to match velocity with other boids and to try to stay near the centre of the flock. From the interactions of many boids following these rules an emergent, and life-like, flocking behaviour emerges.

The emergent phenomenon itself may be subject to “macro-laws” – laws which describe the emergent behaviour at a higher level than that of the interactions which give rise to it (Holland, 1998, p 224) – temperature again providing a good example.

3.1.3 Life-Like

The purpose of ALife modelling is usually either to allow future events to be predicted or to provide better understanding of existing phenomena. In this thesis I will generally adopt the latter, as ALife models – due to their nature, where initial conditions and the stochastic events can radically affect quantitative results – are not always well suited to making predictions other than in the most general terms.

Whatever the purpose, the method of obtaining results is the same. Having initialised a population of agents and their environment, set parameter values, and iterated the

simulation for some period, the results of the simulation must then be compared to observations of the real world. While it is possible that numerical and statistical analysis of real and simulated data will allow a direct comparison, it may be the case that some amount of interpretation is required. In such a case, apparent or qualitative similarities between model and world are identified – as in the flocking patterns of Reynolds’ boids.

With comparable results between the simulation and the real world, it can be postulated that the processes which produced the phenomenon of interest in the model are essentially the same as those at work to produce the comparable real-world phenomenon. In many cases it is hoped that this provides some degree of explanation for the phenomenon. Note, however, as we shall later see, that a similar end result does not necessarily mean that the same processes are at work.

3.2 Methodologies for Applying Artificial Life

Reviewing ALife research which aims to model the real world, it is apparent that there has been a lack of discipline and rigour, and this has been noted by a number of authors – some of whose suggestions for improvement are reviewed in this chapter. This compares poorly to established sciences, where higher standards of rigour and use of appropriate scientific methods are expected.

In the social, physical, and life sciences criticism of the methods by which a finding has been reached will substantially undermine confidence in the findings. The requirement that suitable methods are applied to research before the research can be accepted generates interest in, and awareness of, methodology and many volumes have been published on suitable methodologies for research and experimentation. For example, a search of the University of Paisley library (a rather small university library) catalogue for texts with the words “method” or “methodology” in the title lists over 700 separate titles.

In contrast, a small, but growing, number of papers have been published providing guiding principles and heuristics to apply when conducting ALife research. Indeed, to some prominent researchers in the field of Evolutionary Computation, which focuses on the scientific applications and understanding of evolutionary algorithms, the related field of ALife is seen as an area characterised by poor quality research, where researchers simply play around with models and in which an “anything goes” attitude is prevalent (Muehlenbein 1998, personal comments).

Before using ALife methods to investigate the evolution of language in the following chapters, in this chapter I will review current ideas on ALife methodology and good practice. In the remainder of this thesis I will endeavour to satisfy these standards.

3.2.1 Science without method?

As we shall see, a key problem is that, as a new method for conducting scientific inquiry, ALife has been lacking in developed methodologies to guide its application. Part of the problem is that ALife is often seen as being a method in its own right, or as a completely new science to which established scientific practice need not apply. Noble (1997) questions such views, asking if ALife is a science, and, if it is, then what does it study and what standards should be applied to its practice. His conclusion is that ALife is more a method than a science, i.e. a scientific approach that can be applied to a target discipline rather than a standalone discipline itself (although, ultimately such distinctions are problematic – consider the case of statistics as a parallel).

The proceedings of the early Artificial Life workshops produced a lot of important work establishing the field, but very little groundwork on methodological issues. Instead the emphasis was on showing that ALife could be applied to a varied range of problems, helping to find answers to complex problems.

In the first issue of the Artificial Life journal, the paper “Artificial Life as a Tool for Biological Inquiry” (Taylor and Jefferson, 1994) was published. The potential of ALife for scientific inquiry in the biological sciences is discussed and some problems in biology where ALife may be usefully applied are presented. Yet, there is no mention of methodology or recommendations on how to conduct such research.

The advantages of ALife models over traditional mathematical modelling techniques were emphasised at the expense of cautionary notes on the limitations (e.g. Taylor and Jefferson, 1994). Where traditional mathematical models need to make more general assumptions about the activity of individuals within populations to make the maths tractable, ALife models can leverage modern computational power to allow models to be built which make fewer assumptions and simplifications. However, other problems are introduced by the use of computational models.

An example of the problems that can occur is presented in Noble and Cliff (1996), itself a critique of an earlier work on the evolution of communication, (MacLennan

and Burghardt, 1994). In the critique, Noble and Cliff are compelled to present some warnings regarding the use of ALife simulation. These are:
• to beware of counter-intuitive results due to the conditions used in the experiment
• that simulations with unnecessary complexity may show nothing and
• that if an arbitrary decision about the simulation implementation or design influences the results, then there will be problems relating the results to the real world.

These types of problems are noted by Gilbert and Troitzsch (1999) who discuss the use of simulation as a method in social science. The problem of determining whether results could be due to initial conditions or even some programming error in building the model is one of verification; that of ensuring that comparisons with the real-world hold true, validation. Di Paolo (1999b, chapter 4) identifies the core problem with the use of computational models in research as being ensuring that the conclusions drawn from experimentation with a synthetic model are realistic, apply to the real world and truly answer the questions asked of them.

In the next section we review one of the first papers to address this problem, before proceeding to look in more detail at the ALife research life-cycle and the impact of the verification and validation problems on experimental design, implementation and explanation.

3.3 Improved Methodologies and Practices

3.3.1 A Methodology Emerges

Perhaps the first attempt to review and criticise the scientific methods employed, or not employed, in ALife research is presented by Miller (1995). Miller takes a critical look at the use of ALife as a tool for theoretical biology, noting many pitfalls and hazards for the unwary.

Miller first reviews some failings of computer science in general when it attempts ‘real science’ (sciences based on pre-existing natural phenomena). Miller claims that whenever computer scientists develop a new field claiming to be real science the same problems recur: poor scholarship, with interdisciplinary blindness; poor research methods, with a lack of rigorous hypothesis testing, controlled experimentation and statistical analysis; poor analysis of results, refusing to recognise

failures and wildly over-generalising successes; and finally, poor follow through, leaping from research fad to fad, failing to replicate results and extend findings towards a conceptually integrated discipline.

As redress, Miller suggests six methodological heuristics for successful ALife, and these are listed below.

Miller’s Six Heuristics

1) Identify a known and unsolved problem that can be addressed using simulation. Miller notes that this may be more difficult than it sounds. Biology is a mature and successful science, and most problems a casual reader may encounter have large bodies of work behind them.
2) Collaborate with real biologists who have already worked on the problem. ALife work may not present much gain to biologists, and may present some risk. Sufficient biological knowledge is required to earn their support and confidence.
3) Do a thorough scholarly review of the current biology literature relevant to the problem. This can avoid effort being expended on solved problems, or just about to be presented.
4) Develop a well-targeted simulation that extends current biological models and yields directly comparable results. For example, mathematical models in biology may make strong and unrealistic assumptions to make the maths tractable. Such models are weak at coping with phenomena like complex phenotypes, flexible behaviour, and co-evolution. Taking such models and relaxing the assumptions one at a time is a powerful technique and the results can then be directly compared to those of the formal model.
5) Explore cause and effect in the simulation by running comparisons across different conditions. Vary independent variables and observe effects on dependent variables. Rather than constantly moving on to new and better simulations, perform strong and thorough analysis of the results from simulations under numerous conditions.
6) Publish the results in biology journals, subject to peer review by real biologists. If the work is unacceptable, then it is likely flawed. Miller views journals within ALife as playgrounds in which ideas and methods can grow before venturing outside.

Miller suggests an ideal ALife project lifecycle in which work is iterated over several years with publications moving from conferences to the dedicated journals and finally to mid and then high level journals within biology.
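The fifth heuristic, varying an independent variable and observing a dependent one across many runs, can be sketched as a small parameter sweep. This is only an illustration under stated assumptions, not Miller's procedure or any model from this thesis: the toy copying model in `simulate`, the `mutation_rate` parameter, and the choice of "number of distinct variants" as the dependent variable are all hypothetical.

```python
import random
import statistics


def simulate(mutation_rate: float, seed: int,
             n_agents: int = 30, n_steps: int = 2000) -> int:
    """Toy copying model (hypothetical): returns the number of distinct
    variants left in the population at the end of the run."""
    rng = random.Random(seed)
    variants = list(range(n_agents))  # every agent starts with its own variant
    next_variant = n_agents
    for _ in range(n_steps):
        learner, teacher = rng.sample(range(n_agents), 2)
        if rng.random() < mutation_rate:
            variants[learner] = next_variant  # innovation: a brand-new variant
            next_variant += 1
        else:
            variants[learner] = variants[teacher]  # faithful copying
    return len(set(variants))


def sweep(mutation_rates, seeds):
    """Run every condition with every seed and summarise each condition
    as (mean, standard deviation) of the dependent variable."""
    summary = {}
    for rate in mutation_rates:
        outcomes = [simulate(rate, seed) for seed in seeds]
        summary[rate] = (statistics.mean(outcomes), statistics.stdev(outcomes))
    return summary


if __name__ == "__main__":
    for rate, (mean, sd) in sweep([0.0, 0.01, 0.1], seeds=range(10)).items():
        print(f"mutation_rate={rate}: mean variants={mean:.1f} (sd {sd:.1f})")
```

Repeating each condition over several seeds matters because a single stochastic run says little about the effect of the parameter; reporting the spread alongside the mean shows how far runs under the same condition disagree.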

Miller concludes that ALife research will only be as good as ALife research methods, though the prospects are exciting. In particular, Miller warns ALife researchers not to believe that they can enter a well-established field of science and solve all of the open problems with a few simple simulations - or to act as if they believe it. The comments presented in this paper are a sobering reminder that ALife work purporting to illuminate some aspect of evolution must be presented within the context of the other work that already exists.

Overall, these heuristics are helpful, focussing on the particular failure of much of ALife research to properly apply itself to its supposed target problem and domain. The advice here is to select a particular problem, and to ensure that the model does indeed address the problem.

3.3.2 Not Just Biology

While many of Miller’s points are well observed, the heuristics for guiding research have themselves been reviewed with some criticisms. Some of these are reviewed here, together with other works which provide suggestions for good practice in ALife research.

One obvious point is to ask why should ALife be limited to just research in Theoretical Biology? That ALife has much to offer researchers in other fields of science is noted by both Noble (1997) and Di Paolo (1996), this view easily supported by the large body of work using ALife techniques in subject areas other than biology. Yet many of Miller’s statements could apply equally to those using ALife for linguistics or social sciences, etc., simply by substituting the name of the appropriate science where “biology” appears in Miller’s heuristics. This should allow the heuristics to be generalised to apply to a broader range of ALife work.

While acceptance by peers from the target domain is a worthwhile objective, there can be resistance to accepting for publication work based on computer modelling rather than the more traditional methods used in a particular academic field (for a particular example, see Axelrod, 1997a, chapter 7), and in some cases this may make Miller’s final heuristic a difficult one to successfully follow.

3.3.3 ALife for Novel Models and Theories

Miller’s fourth heuristic comes in for heavier criticism. In an unrelated paper Hurford (1996) states that the act of building a computational model can help

find gaps or hidden assumptions in theories. Computer models can be built to represent verbal models, and the running of such models can help improve the detail and check for internal consistency. This view is similar to Miller’s point 4, but only requires that a theory exist, not that a formal mathematical model already exists to support the theory. This makes the point that existing non-mathematical models may also be used as the basis for ALife experimentation. This is an important observation as in social sciences and humanities theories are much less likely to have formal mathematical expressions than they are in the physical and mathematical sciences.

Di Paolo (1996) finds the heuristics more limiting than necessary, and criticises the fourth in particular. He accepts that work following this rule may enrich current models, supporting the principle that direct modelling can be good practice. However, he questions whether this heuristic may present its own range of problems. The principal problem identified is that of inheriting implicit assumptions. An ALife model that extends an existing model will have many of the methodological and philosophical assumptions of the existing model. These may be implicit points simply accepted without question in the original domain. Given that in biology there exist a number of unresolved controversies and debates, an ALife model developed from an existing biological model may depend on factors that are not generally agreed upon. In contrast, work that is not developed directly from an existing model may have the potential to help resolve open debates.

Further, verbal arguments are often used where systems involve complex processes that are hard to formalise mathematically, and simulations used to help defend or attack such models. Di Paolo points to physicists who believe that some phenomena can be better modelled computationally than they can mathematically, providing new answers that are too hard, or even impossible, to obtain using traditional analytical methods.

Computer explorations where no prior formal model exists require first building such a model, a step avoided when using Miller’s guidelines. As we noted above, however, developing a new model can cause problems with the model’s ultimate acceptance. Further, restricting ALife researchers from developing new models and theories simply hands this work over to other researchers, without a good reason for preventing ALife researchers from participating in this work (Noble, 1997). There is, however, no point in advancing a novel ALife-inspired theory to explain some

phenomenon when a better theory already exists – a reminder of the importance of properly researching the problem domain (Miller’s point 3).

Di Paolo (1999b, chapter 4) promotes the development of a number of different formal models (such as mathematical and computational) of some phenomenon, and then comparing the results of the different models to improve understanding. This can help highlight which conclusions depend on particular models and which assumptions and features cause differences in the results. This helps to question and assess the assumptions themselves, providing greater insight into the problem than otherwise possible.

These arguments all hold that ALife presents a valid approach for both (re-)evaluating existing theories and for developing new theories, with the computational models themselves providing demonstrations of the new ideas. There is no compelling argument binding the ALife researcher to working with models derived from existing formal models – although such models are useful for comparison.

3.4 ALife in Practice

Accepting the development of ALife models as an alternative to statistical or mathematical modelling, are there significant differences in how research should be conducted? Or given the apparent gulf between the approaches, are there any significant similarities?

3.4.1 ALife Research vs ‘Real’ Research

Gilbert and Troitzsch (1999) show that the general approach when using simulation as a research method is the same as when using traditional modelling methods. Traditional modelling proceeds in a number of stages. Observations are made and some data collected. A theory is proposed and a mathematical model developed. For a number of initial states and parameter settings the model is used to derive future states. Comparisons between predicted and collected data can be used to support the validity of the model.

The same stages exist when developing a computational model. Differences will exist in the details of each stage, but the overall process is essentially unchanged. Instead of comparing predictions derived from a statistical model to collected data, some similarity is searched for between the data and the results of simulation runs.

Di Paolo (1999b, chapter 4) also considers how research should be conducted using computer models. He describes a possible way to integrate simulations into a scientific project, to help assure scientific integrity without limiting the potential of the simulations. Three distinct phases of research are identified, two for working with a simulation, and the third for relating the results to the real world:

1) An Exploratory Phase. During this stage, after the initial model has been constructed, different cases of interest are run, observables defined and patterns explored. Reformulation is performed as required (by unexpected results).
2) Experimental Phase. Hypotheses are generated, and further simulations run to test the hypotheses.
3) Explanatory Phase. The observations must be related to the theories and hypotheses about the natural phenomena in question.

Di Paolo also notes that this methodology has been used before, and is not unique to ALife. He also makes clear that this is only one possible way of attempting to use ALife models according to scientific principles.

As theories can follow on after the model has been constructed, it is also important to consider guidelines on how a model should be constructed.

3.4.2 Principles of ALife Model Building

A couple of principles for ALife model building are agreed upon by almost all of the authors cited in this chapter, and others beyond. These are the need for minimal models and the need to make any assumptions captured in a model explicit.

“Models should be as simple as possible but no simpler!” (Doran, 1996, p382)

A minimal model is one which models all of the mechanisms required by the theory and no more (or as little extra as possible). There are a number of reasons for preferring minimal models. First, given two explanations for some phenomenon the simpler explanation, requiring fewer assumptions, is often the more likely (the principle commonly known as Occam’s Razor). Second, attempts to capture more detail than strictly necessary, perhaps to increase ‘realism’ in the model, may detract from the simulation. It may make it harder to observe what is happening in the simulation or to determine which of the increased number of factors modelled are the key factors in giving rise to a

certain result. If that is not a problem, there might still be an increased difficulty in verifying results. Additionally, one undesirable result of using a non-minimal model might be an increased computational load – resulting in simulations taking longer to run.

Noble (1997) describes a minimal model as being one that captures all and only the intended assumptions. This highlights another point, that there is a need to ensure that any assumptions built into the model are made explicit. Such assumptions have to be included in the theory, or otherwise accounted for in any explanation of the model. The subsequent discovery of hidden, unacknowledged or unknown, assumptions requires additional effort in verifying and validating models (see 3.5 and 3.6).

3.4.3 Adapting Methods

Taylor (1998) discusses specific issues that relate to using ALife to model the evolution and origin of life itself. It is observed that in this specific field of ALife endeavour many existing models are lacking in explicit statements of what exactly is being modelled, that some definition of life is required, and that explicit lists of assumptions are often also missing.

Taylor has identified particular problems in using ALife to tackle a particular problem, and has suggested some particular solutions. This itself is an application of a more general methodological rule, which could be stated as:
• Appropriate methodology for using ALife to investigate some real-world phenomena will, to some degree, depend on the phenomena under investigation and the scientific discipline which studies it.

This conflicts with all of the attempts described above at deriving general principles for conducting ALife research, but does not strictly contradict them. Rather, it points out that the general guidelines are insufficient for any individual research program. An important step will always be to adapt the general methodologies for ALife based research to the particular problem domain as a single prescriptive methodology will not be applicable to all areas to which an ALife approach may be applied.

While it would be ideal for any researcher to have a fully developed methodology at the beginning of any program of research, this is perhaps an unrealistic expectation. More likely, particularly in the case of doctoral research, a methodology may coevolve with the research itself.

3.4.4 Implementing an ALife model

The discussion on methodology presented so far has ignored the details surrounding the actual implementation of an ALife model, which is now briefly considered.

Both ALife and traditional formal models require some amount of abstraction. To build a mathematical model requires the researcher to abstract the interactions and processes that occur in a given population into a single set of equations. Values may be ascribed to different variables and the result is expressed again in mathematical terms. In sharp contrast, the ALife modeller has to write a computer program, which contains within it a simulated population.

The first task of the modeller should be to develop a description of the model that they will implement – the agents or actors that comprise the population and the rules which govern their interaction. In ‘normal’ software development, such a description and the methods by which it is developed are of paramount importance, with detailed algorithms and/or system diagrams being required before programming begins.

This is a particular weakness of ALife – it depends ultimately on writing a computer program, and programming is often considered less of a science and more of an art (leading Donald Knuth to title his seminal work ‘The Art of Computer Programming’ (Knuth, 1969)). Many volumes are dedicated to methods that should be employed in order to produce computer programs that do what they are intended to do, and do so without error. Attempts have also been made to formalize the creation of computer programs, but with little success outside of safety-critical systems (Storey, 1996). And yet there are almost no guidelines – a few are to be found in an appendix of (Epstein and Axtell, 1996) – on this aspect of ALife research, perhaps because the programming exercise is not seen as part of the research itself, rather than as a tiresome task to be done before the research begins.

The software design and implementation paradigms most suited to ALife modelling are Object-Oriented Design and Programming, hereafter OOD and OOP. In an OOD, the design is based around identifying the objects that exist in the system, their properties, known as attributes, and the activities which they can perform, or methods (e.g. Bennet et al., 2001). With OOD and OOP the process of designing and building an ALife model is based around providing a clear description of what the agents and any other objects in the model are, what their properties are, and what actions they can take to interact with other objects or agents. While many ALife models, including

the ones described in this thesis, are developed using OOP, further work is required on the possible benefits OOD can bring to those wishing to implement ALife models.

3.5 Experimentation

Once a model has been built, the visible part of ALife research begins – running simulations under different conditions and observing what happens within the model as a result. Unsurprisingly there are a number of problems particular to ALife experimentation and a number of useful methodological rules to help. Here we concentrate on the considerations of whether a model and its results are valid and on some of the problems facing attempts to verify the results of ALife work. We briefly review some arguments over the place of, and responses to, surprise results in ALife.

3.5.1 Validation

For any abstract model of reality, an important question is whether it is a valid model. Are the principles on which the model is founded sound, and are the abstractions that have been made reasonable? Validation of a model is the key step of certifying that the model and results together provide a legitimate demonstration of corresponding processes and outcomes that occur in the real world. The principles and assumptions the model is based on must be correct for the model to be valid.

So, for example, consider Reynolds’ boids (Reynolds, 1987). The results of the model are convincing, but the model is based on an invalid representation of how birds act. While the flocking behaviours produced are very life-like, the rules themselves are less so – requiring that each boid is able to compute the position of the centre of the flock at any time (Noble, 1997). It is extremely unlikely that birds are capable of performing such calculations. Without ruling out the possibility that birds do something similar, perhaps based on a gross estimation, it is clear that the rules that govern the behaviour of individual boids and those that govern birds flying in flocks are not the same.

Ensuring that a model is valid is, like many of the guidelines in this chapter, not a problem only for ALife research (for example, the validity of the assumption of rational behaviour underpinning a great deal of research in economics has been questioned (Ormerod, 1994)). What makes ensuring that a model is valid more of a concern in ALife is that each model may be introducing its own sets of assumptions and abstractions, as each model may have its own unique agent design and

implementation. These will then be less subject to thorough peer review than shared assumptions developed over time and used collectively by different researchers working in, say, a statistical or traditional game theoretic framework.

However, ensuring that an ALife model is based on sound principles does not ensure a correctly working model – that the model actually implements the theory, and that individual agents do what the experimenter believes they do has to be verified.

3.5.2 Verification

There are a number of sources of error that can affect any simulation work, and the results gained from working with a simulation. Results observed, again, may not be a consequence of the supposed theory underlying the model, but of an error in its implementation.

One obvious source of possible error in a simulation is the presence of a programming error, or bug. With iterative simulations it is difficult to ensure the eradication of bugs, which may not affect the results in an obvious way, such as causing a program to crash, but may rather lead to results which would otherwise not arise. At worst these might significantly alter the results gained from the simulation. At best, they act to reduce confidence that the results gained are the correct ones and are not due to the errors present.

The intermediate design stage, linking the expounded theory and abstract model to the implemented model, can also introduce errors – or where the design stage is absent, in the direct coding of the abstract model. During this work practical details of the implementation have to be finalised and implemented. Some decisions here will be, in effect, additional, hidden, assumptions. Such arbitrary implementation features of a simulation model can easily affect the results (Noble and Cliff, 1996). Heralded results may not be a consequence of the intended model, but of unintended effects of some feature of the implementation.

One way to address this problem is through the replication of experiments (Noble, 1997). An implementation by some third-party of a model based only on a published description is unlikely to replicate the same bugs, arbitrary design decisions or hidden assumptions. A replicated result then increases confidence in the published model. Unfortunately, within a single programme of research, such as a thesis like this, replication cannot be relied upon as one author, or group, is performing the replication, in a search for results that are supportive.

An alternative is to compare the model against other related models. This may be possible where different aspects of

the same phenomenon are studied by different researchers. Models will vary, but will have a number of common features. These may be enough to allow comparisons to be made between different models. An example of this will be seen later in this thesis, where two quite different models are used, and their results compared qualitatively (Chapters 6 and 7).

Verification problems do exist in other scientific disciplines wherever measurements are made (e.g. Frankfort-Nachmias and Nachmias, 1992), but the additional problems of working with simulations are quite significant.

A simpler problem to deal with is the one caused by simulation artefacts that can result from particular combinations of parameter values. Some parameters used in a model may be derived from existing observations, and some from theory and many may be arbitrary experimental parameters used in the running of the model. Selected values for different parameters can interact in an unexpected manner, perhaps leading to unusual or exceptional results. The results of the experiment should reflect the theory not the arbitrary design decisions, and the influence of parameter settings should be minimised.

With simulation models it is easily possible to repeat an experiment with different parameter settings, however with a potentially infinite range of parameter values, exhaustive testing is not possible. The solution here is not to test that the same qualitative result occurs for all possible parameter values, but that the result gained is robust – i.e. the result is not a chance product of a particular combination of parameter values, but is seen over a reasonable range of parameter values.

ALife models generally aim to show how stochastic processes can lead to predictable results, and implementations of this will rely on having the computer provide the simulation with streams of random numbers. Unfortunately, the random number generators are only pseudo-random, aiming to give the appearance of randomness (Gilbert and Troitzsch, 1999, Appendix C). The sequences are in fact pre-determined. Using a particular ‘seed’ value, the exact same sequence of ‘random’ numbers can be generated over and over again. In short, the nature of random number generation in computer programs can be considered problematic. To compensate for this each simulation should be run several times with different seed values for every parameter combination used.

The key solution to this is that wherever possible researchers should seek some form of additional confirmation for existing results. Additional models created by the same researcher may also carry the

same hidden assumptions that exist in the researcher’s previous models, and so this should ideally come from the work of other researchers, possibly from attempts to reproduce the same model.

3.5.3 The (Un-)Importance of Surprise

Note has to be made at this point on the appropriate role and response to surprise results in ALife. Some authors place particular value on surprise results in emergent systems (for an example, see Epstein and Axtell, 1996). Some go so far as to include a requirement for a result to be surprising in their definition of emergence – and state that where a result is no longer surprising (after a model has been studied, and understanding gained of how particular results come about) the phenomenon is no longer truly emergent (Ronald et al., 1999). Such a ‘moving’ definition of what is, and what is not, emergent is controversial and unsatisfactory. Indeed, it would rule that classical examples of emergence, such as temperature (Section 3.1.2), be no longer considered emergent at all.

Bullock (1997) argues that without a prior hypothesis of the system and emergent results, surprise results are of little interest – some prior idea of what the results should be is required. Thus for researchers, progressing through Di Paolo’s three phases of research, a surprise result during the experimentation phase may signal a return to exploratory work, Figure 3.1. Getting the results we expect will, where some parallel between the artificial and the real results can be noted, confirm our theory. Obtaining unexpected results, far from being useless, remains useful and requires further investigation to turn an incidental observation into a scientific one (Di Paolo, 1999b). This might require reformulation of the underlying theory and additional work refining or redeveloping models.

[Figure 3.1: diagram of the three phases of research: Exploration, Experimentation, Explanation.]

Figure 3.1 In practice progress through the three stages of research identified by Di Paolo may not be straightforward. While the research can be split into three phases the research effort may not necessarily proceed automatically from one stage to the next.

3.6 Explanation

In ALife research the importance of explaining results is sometimes overshadowed by the experimentation. Models can be described, detailed and dissected with little reference to the external phenomenon that they are supposed to represent. That it is necessary to detail the model and results in sufficient depth to allow replication can increase pressure on researchers to concentrate on this aspect of research to the detriment of the explanation which is as important, and with poorly explained results an ALife model is meaningless even where care has been taken to verify the results and to replicate the model.

The explanation is required to tie the artificial model and the results obtained with it, back to the real world. In explaining the results, the goal is to provide convincing arguments that the model, the results and, most importantly, the theory are all relevant to the target phenomenon. Results of simulations will therefore require comparison with other scientific observations, whether precise numerical values or general patterns. If the results do not feed back to the real world then the experiments have not been ‘science’, and can have no bearing on evaluating the proposed theory (Noble, 1997).

While this has been a topic of some debate, for the purpose of this thesis, we side with those who attempt to use ALife as a scientific tool for understanding the real world. This is somewhat in opposition to Langton’s frequently quoted assertion that ALife represents “Life-as-it-could-be” – we are limiting ourselves to models that help us explain life-as-it-is instead. The future of ALife might see greater differentiation, and possible redefinition, but acknowledgement must yet be made of current efforts in using ALife methods in more creative and artistic endeavours (Dautenhahn and Nehaniv, 1999). However, it is generally the case that ALife research is more devoted to understanding, or at least replicating, real world phenomena than it is to more speculative work.

(Additionally, note that the field of Evolutionary Computation (Mitchell, 1999) is one in which similar methods to ALife are used but without any attempt to tie models to natural systems. Instead, it is the ability of artificial evolution to act as a means of optimisation or the general underlying theories of evolutionary systems that are of interest.)

For those working in the ‘Scientific ALife’ mould, there are a number of recommendations to be made about how to explain results, and how to evaluate the success of the model and theory.

3.6.1 Model and Theory

In ALife then, the results of simulation should not be viewed as significant research results in and of themselves, but as a means to evaluating the theory which the model has been built to demonstrate. As noted by Noble (1997) the use of appropriate experimental method and of statistics to provide analysis of results are recommended, but using these does not automatically make an ALife experiment ‘science’.

Statistical analysis of results may reveal many aspects of the model as a system, including key parameter values causing qualitative changes in system output. If the model has been built to demonstrate a theory, related to the natural process under investigation, such analysis is likely to be useful, but should not take precedence over explaining the relevance of the model.

The use of statistics to describe the model should be directed by the goals of the analysis, which may be one or more of the following:

• To describe the model output in sufficient detail for replication.
• To provide a rigorous description of the system under all possible conditions.
• To detail the output as required for a follow on argument that the results are valid and correlated with those observed in the target and so support the proposed theory.

In most ALife work, the first and third goals will be important, the second goal being important for more theoretical work on complex systems.

3.6.2 Validation of Results

Armed with a plausible theory and model, being confident that the simulation model is an accurate implementation of the theoretical model, it is also necessary to show that results obtained from an ALife model are valid. So, as described above, the results obtained are now the measure by which the theory can be tested, based on some comparison of the results from the model against what is observed in the real world.

In many cases it will be patterns that emerge in the model which are compared against patterns that occur in real life. The emergence of clusters of agents under differing conditions is a good example of this (e.g. Epstein and Axtell, 1996). Such comparisons are qualitative measures – similar types of results are observed in both

systems. While unable to provide precise numerical answers, qualitatively similar results may provide evidence that the types of processes that occur in the model are not inaccurate representations of the real world processes. Comparing patterns is a subjective measure – the researcher is a subjective observer, interpreting the results and deciding when similarity is close enough to support the theory. Due to this, there is an important need for explanatory argument in detailing the outcome of research.

In some cases it may be that numerical or statistical measures may allow a direct comparison of the model to data taken from the real world. Quantitative results will often be obtained where the model is intended not so much for explanation but for prediction, in the tradition of dynamic models like Forrester’s world population dynamics model (Forrester, 1971). From a set of quantitative results it should be possible to check predictions against real data as part of validation.

Indeed, any explanation of results will rely on some argument that the results of the computational model actually relate to the real world, and that the abstracted agents in the model interact and behave in a manner that represents the phenomenon being investigated. But are quantitative results possible, and how good are they as indicators of the soundness of the underlying theory? This is one of the aspects to be considered when thinking of the potential limitations of ALife.

3.7 Limitations Of Artificial Life

This chapter has largely concerned itself with how ALife might be applied to scientific investigation, and the appropriate methods for this. Some limitations of the ALife method also have to be recognised.

3.7.1 ALife vs. Mathematical Models

Di Paolo (1999b, Ch. 4) points out that mathematical models have some strengths that are not shared by computational models. Primarily, mathematics is both a means to explore a model and a means of communicating the details of the model. Mathematics is a language shared the world over, and a mathematical model can be read by anyone with the appropriate mathematical skills. Conversely, the inner workings of computational models are generally hidden, the code unpublished. This deficit is clearly one that has to be considered when building software models, and effort must be spent to make the implementation as transparent

as possible. Detailed algorithms should be published and Di Paolo argues that the code that implements a model should be publicly available, presumably at least on request. Even with publicly available code, the complexity of many programming languages can make source code nearly impenetrable to understand. Further, the separation between the description of a model and its implementation is unfortunate, but unavoidable. The importance of verifying models has been previously mentioned, but without a guarantee of accurate translation from theoretical to simulated model this problem will always remain.

Another weakness of a simulation is that it is not necessarily clear what is happening in any given simulation to produce different results (Di Paolo, 1999b, p.68). By means of systematic relaxation of assumptions (Miller's fifth heuristic) it is possible to make strong guesses, but some uncertainty may remain.

3.7.2 ALife Models as Proof

Now consider some limitations of ALife regarding its ability to prove theories. Were this thesis, or its author, of a more philosophical bent then the issues here could certainly have been given a much more thorough treatment – instead the goal here is simply to illustrate some shortcomings of the ALife approach.

Karl Popper’s principle of falsifiability (summarised in Thornton, 1997) holds that a theory can never be proven – but that theories may be disproven. Without an infinity of observations how can it be known that B is true for all cases of A? According to a Popperian view, the fact that it has not yet been possible to disprove something is as close to proof as it is possible to get, and this principle is applied in much of contemporary scientific practice.

How does ALife fit into this, and can ALife models be used as a form of ‘proof’? Obviously it would need to be shown that the model is valid, as already discussed, before the model could even be considered for its role in disproving a theory.

Muller (1996) uses the principle of falsifiability in his arguments for proof by parameter optimisation. If it can be shown that even for the worst possible parameters in some simulation that the predicted result is still obtained then this is some way towards showing that the theory cannot be disproven. This may entail performing the simulation a great number of times for many different parameter combinations – ‘good’ ones as well as ‘very bad’.
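The kind of sweep described above can be sketched as follows. This is only an illustration under stated assumptions: the toy imitation model, the function names `run_simulation` and `sweep`, the parameters `copy_rate` and `noise`, and the 0.8 threshold are all hypothetical, and are not taken from this thesis or from Muller (1996). Each parameter combination is run with several seeds, and the sketch reports how often the predicted result (one signal coming to dominate) is obtained.

```python
import itertools
import random


def run_simulation(copy_rate, noise, seed, agents=30, steps=2000):
    """Toy stochastic model (hypothetical): agents imitate each other's signal.

    Returns the final share of the population using the most common signal.
    """
    rng = random.Random(seed)  # private generator, so a given seed is repeatable
    signals = [rng.choice("AB") for _ in range(agents)]
    for _ in range(steps):
        learner = rng.randrange(agents)
        if rng.random() < noise:
            # random innovation: the learner picks a signal at random
            signals[learner] = rng.choice("AB")
        elif rng.random() < copy_rate:
            # imitation: the learner copies a randomly chosen agent
            signals[learner] = signals[rng.randrange(agents)]
    return max(signals.count("A"), signals.count("B")) / agents


def sweep(copy_rates, noises, seeds, threshold=0.8):
    """For every parameter combination, return the fraction of seeded runs
    in which the predicted result (a dominant signal) was obtained."""
    support = {}
    for copy_rate, noise in itertools.product(copy_rates, noises):
        hits = sum(run_simulation(copy_rate, noise, s) >= threshold for s in seeds)
        support[(copy_rate, noise)] = hits / len(seeds)
    return support


if __name__ == "__main__":
    table = sweep(copy_rates=[0.1, 0.5, 0.9], noises=[0.0, 0.05, 0.5], seeds=range(10))
    for params, share in sorted(table.items()):
        print(params, share)
```

Combinations for which the reported share is low are those where the prediction fails, and it is for these that the question arises of whether the parameter values are plausible for the real system at all.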

Practically, this method may not work. Again, the worst possible parameters may well lead to results that do not support the theory at all, as extreme parameters can actually represent quite different conditions and rules. For example, a parameter might represent the likelihood of some interaction occurring. Extreme values of such a parameter would clearly represent quite different conditions, and would lead to very different outcomes.

It has already been observed in the notes on obtaining quantitative results (Section 3.6.3) that ranges of parameters may give significantly different results. In some cases we may be justified in expecting reasonable values of parameters. Attention should still be paid to results for extreme cases, but it may be obvious that the extreme case is one that is not possible for the real world agents and system. Rather than test for the worst possible parameter values we should, in many cases, test across ranges of parameters to ensure that a particular result is robust.

For stochastic simulations, any set of results are due to random processes and hence only one set of observations out of an infinity, so introduces the question of how many simulations have to be run before an attempt at falsifying a theory is considered satisfactory. This could tie up infeasible amounts of time with possibly little gain.

However, it still may be that an ALife model is better as a test-bed for falsification rather than proof. Rather than ‘proof’, what is seen most often in ALife work are demonstrations that under certain conditions a certain result is observed. In some cases this may be useful in disproving theories: where a theory claims that some rule is required for a particular result, a model that obtains the same result but without the rule demonstrates that the rule is not required. Even in this case it does not prove that the agents in the real world don’t follow the rule – just that it is not required. Hence, the experiments detailed in Chapter 6 attempt to disprove existing theories.

3.7.3 ALife Models as Evidence

Alternatively, the results of ALife models may be considered not as proof but as contributory evidence in the construction of a scientific argument (Toulmin, 1958). Rather than present the results of a simulation with a flourish and claim that it conclusively proves a theory, the results may be used in conjunction with evidence and observations of the real world phenomena in building a convincing argument to support a theory.

The known difficulties in validating and verifying ALife work are less problematic when the ALife models are no longer the sole support for some theory – a particular model is just one of many pieces of evidence. This approach is particularly suitable where there are existing theories that are being explored with the ALife models. Arguments will have already been formed, with supporting evidence, for the different theories. The outcome of the ALife research will ideally be further evidence to support one theory as well as evidence to weaken the support for other theories (this also, obviously, relates to the use of ALife in disproving theories).

To use this approach successfully further emphasises Miller’s demands for thoroughly researching the target domain and/or for working with researchers in that domain.

3.7.4 ALife and Quantitative Results

As noted above, in many cases the results of simulation models are compared qualitatively to real world observations. The researcher is asking if, for a particular set of rules and parameters, the same sort of results are observed in the simulation as in the real world.

A different view is put forward by Muller (1996), who argues that quantitative results are necessary for computer simulation to be considered useful in theory construction – that the simulations must give concrete figures which can be compared against direct observations. This appears to claim that if concrete predictions cannot be made from it, then the simulation is not of much value. This is hard to accept, and in many cases there are strong reasons for looking only for qualitative results.

It must be recognised that models are abstractions of reality, with limited richness in possible interactions and accuracy. Further, most models will rely on single numerical parameters in place of factors which are not easy to accurately quantify – emotional state or the chance that someone might misinterpret some signal (see, e.g., Doran, 1996; Steels and Kaplan, 1998).

Most ALife also, at least implicitly, subsumes many ideas from work on complex systems – and accepting complex systems theory places limits on the predictability of such systems (e.g. Gleick, 1987). Complex systems often recognise that for different values of parameters, different classes of behaviour may exist. The qualitative outcome may be predictable in many cases where it is not possible to put precise figures on the predictions. Thus qualitative predictions can be made where precise quantitative predictions may not be possible.
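The point about classes of behaviour can be illustrated with a standard textbook system. The logistic map x' = r·x·(1 − x) is not one of the models in this thesis; it is used here purely as an assumption for illustration, and the function name `classify_logistic` and its tolerance settings are hypothetical. The sketch discards a transient, then checks whether the remaining orbit repeats with a short period. The class of behaviour is easy to state for each value of r, even where the individual values of x are not usefully predictable.

```python
def classify_logistic(r, x0=0.2, transient=2000, sample=64, tol=1e-6):
    """Give a qualitative label to the long-run behaviour of the logistic map
    x' = r * x * (1 - x) for a single value of the parameter r."""
    x = x0
    for _ in range(transient):  # discard the initial transient
        x = r * x * (1 - x)

    orbit = []
    for _ in range(sample):  # record the behaviour that remains
        x = r * x * (1 - x)
        orbit.append(x)

    # Try short periods in increasing order; the first one that fits wins.
    for period in (1, 2, 4, 8):
        repeats = all(
            abs(orbit[i] - orbit[i + period]) < tol
            for i in range(sample - period)
        )
        if repeats:
            return "fixed point" if period == 1 else f"cycle of period {period}"
    return "irregular"  # no short period found within the sampled orbit


if __name__ == "__main__":
    for r in (2.5, 3.2, 3.5, 3.9):
        print(r, classify_logistic(r))
```

The label "irregular" only means that no period of 8 or less was detected in the sample; the sketch does not attempt to establish chaos in any formal sense.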

So, quantitative results may not be feasible in many cases. Yet, where they can be obtained, they can certainly be compared more readily with collected data. Thorough statistical comparisons may be performed and the model tested against reality. Strongly correlated results increase confidence in the correctness of the model and theory; poorly correlated results might indicate that the theory is simply wrong.

3.8 ALife and the Evolution of Language

We previously highlighted the assertion of Taylor (1998) that the methodology used in any program of ALife research should be adapted according to the particular problem domain under investigation. Having reviewed methods and guidelines for the general application of ALife, it is now appropriate to add any special considerations required for using ALife in research into the evolution of language. Naturally, this depends on some understanding of the special problems, questions and methods that apply to the evolution of language.

Perhaps the single most important factor, as noted in Chapter 2, is that the evolution of language refers to two different processes: one biological and one cultural. Work may focus on the biological evolution of linguistic ability in Homo sapiens or on the historical evolution of languages, or may involve both. Any ALife model should be aware of this, and clear in which elements of these different processes relate to the particular theories being explored. As a result, we similarly divide this section on methodological considerations for computer modelling of the evolution of language, starting with concerns relating to the EoL.

3.8.1 ALife and the Biological Evolution of Language

Hurford (1992) presents six principles for evolutionary explanations of language. The application of these is not limited to computational modelling of the evolution of language, but could apply more generally to mathematical or other attempts at developing explanations for the evolution of language. These rules do, however, apply strictly to the biological EoL – thinking of language as an organ, with innovation by biological mutation and selection.

In attempting to give an evolutionary explanation for the existence of any feature of language, Hurford argues that an investigator should consider the following factors:

1. Universality. The linguistic feature being investigated should be common to language in general.

2. Innateness. The feature should be innate, and evidence provided of this.
3. Contingency. A range of hypothetical alternatives should be presented and tested. In some cases only one or two alternatives may be envisaged, and this should be stated. The more alternatives, the more impressive if the feature chosen is the only one that emerges.
4. Genetic Expression. Some relation of language feature to genes has to be made.
5. Adaptive Value. The possibility space of alternatives should be related to fitness and/or reproduction.
6. Demonstration. An argument should be presented that given language-gene mappings (4) and language-advantage mappings (5), the feature will either necessarily or probably emerge as the only survivor out of the possible alternatives (3). This can be presented as argument or simulation.

While these form useful guidelines, they are not without problems. The first point is clear and without problem – if the feature of language is not truly universal then it is unlikely to be innate (unless of course there is some evidence that possessing or lacking the feature is determined by heredity). But, as discussed in the previous chapter, determining the ‘innateness’ of a particular feature can be more difficult than establishing its universality. Further, the relation of an innate feature to genetic expression may be very hard to satisfactorily prove.

Depending on the features under investigation, the final principle may also prove difficult. Given that chance has had a role to play in the EoL, where there are several possible outcomes, it is possible that many features of language may not have been the features most likely to survive, but rather the ones that simply happened to survive (a point which Hurford acknowledges).

Despite these problems the principles are generally worthwhile, and emphasise the need to put effort in particular into showing that the feature under investigation is actually one that is a biological feature of language – an important prerequisite if we are to attempt to provide an evolutionary explanation for its presence. This is especially required to avoid proposing a biological solution to some culturally emergent aspect of language.

We also distinguished the evolution of language from the evolution of communication, and this is another distinction that we must ensure is embodied in ALife models of the EoL. Recognising that there are important differences between

language and other forms of communication, and that many special considerations apply to the evolution of language separate from more general questions on the evolution of communication, these must influence the design of ALife models.

A rhetorical example is in order here to illustrate this requirement. An attempt is made to demonstrate that language evolved due to a need to be able to warn others about possible attacks. A model is constructed, and results gathered which seem to support the hypothesis. However, such communication systems are also used by many species of animal (Hauser, 1996). Without additional work, or significant changes, there is no valid reason to assume that the results of the model are particularly relevant to the EoL.

One solution to help avoid this problem is to refer to the characteristic features of language (Section 2.2.1). Incorporating selected features, present in human language but absent in animal communication systems, can enforce the relevance of the model to the EoL. In doing so, it is not necessary to attempt to include all features. Instead, sets of features may be selected which together are only rarely, if at all, found together in non-linguistic forms of communication. The fact that the majority of the features exist in some form, even if not all together, in a variety of animal communication systems should not prevent this approach from being useful.

Perhaps, in the previous rhetorical example, the author might seek to model a signalling system in which the repertoire of warnings is not fixed, but learned. Additionally the repertoire may not be of a fixed size, but may change. Also extending the model, perhaps other ways of using signals could be incorporated, since deciding on a particular function that is served by language in the model is itself problematic.

Another example may be helpful. A model is built in which successfully using language allows agents to somehow cooperate with one another. Such cooperation is rewarded by the model. Sooner or later, some agents will evolve the ‘linguistic ability’, and will start to cooperate. This is rewarded, and language evolution succeeds. A simulation is run and results gathered which show how language may evolve to support cooperation.

While language is known to serve many functions in human society, the problem may be caused not by ignoring other uses of language, but because of the problems in interpreting the effect of the single use modelled. With one explicit use of language built into the model, it may be that the space of evolutionary possibilities is quite small. With such a constrained evolutionary space and

rewards built in for the single anticipated and possible use of language, the model is simply showing the successful evolution of cooperation, not the evolution of language. It may actually be better to not include any explicit use of language at all. This problem is revisited later, in Section 5.7.4.

Some questions on how the evolution of the biological LAD might be modelled are discussed shortly, after turning our attention briefly from biological evolution altogether.

3.8.2 ALife and the Cultural Evolution of Languages

A different set of guidelines will apply to attempts to model the historical eol. Hurford’s six principles will explicitly be irrelevant. What should also be inapplicable are the problems of confusing the evolution of language with the evolution of communication or cooperation – but this may not be the case.

Some ALife models may use hereditary signalling systems to represent language. The model developed in Arita and Koyama (1998), although not a memetic model, uses a hereditary signalling system although it has been developed to investigate the evolution of dialects in language. Unsurprisingly, the findings correspond more closely to the results that would be expected in investigations of the evolution of cooperation than the eol: common dialects allowing cooperation exist where resources are plentiful, non-compatible dialects (or even a complete lack of signalling) preventing cooperation exist where resources are scarce.

Such findings obviously show how different environmental conditions can affect the evolution of cooperation in species – but the model fails to allow for non-cooperation between agents that share a common dialect. And evidence from studies of language diversity show the opposite trend from that described by Arita – scarce resources lead to lower, not greater, linguistic diversity (Nettle, 1999a).

So, an ALife model of the historical evolution of languages should avoid the genetic transmission of languages, and instead incorporate more realistic means of cultural transmission, possibly using the ‘meme’ paradigm (Dawkins, 1976). Where the translation from a genetic to a memetic model takes place without acknowledging the differences in methods of transmission, questions arise about whether cultural evolution is indeed being modelled. Where the means of transmission incorporates learning from other agents there is no real objection to the use of memetic concepts. An explicit learning mechanism is a better alternative – agents can

2000b).8. The review given here highlights many of the problems.9 Summary There are a number of problems facing ALife researchers. Such models are quite complex.4.3 Coevolution of Language Ability and Languages In the previous chapter the idea of coevolution was introduced. a test should be carried out on the learning of signalling schemes learned with a fixed. Undoubtedly. Additional complexity also introduces additional uncertainty into the results and the understanding of how they arise. in Section 3. For example. the testing should also verify the outcome when evolution is only possible in one of the two coevolutionary systems. An example of this is found in (Briscoe. but is extended here. Some . as well as the innate linguistic ability. the better.4. it is recommended in such models to test individual elements in isolation to verify their behaviour. and gives a variety of solutions. Guidelines offer an easy way to promote awareness. 3. in a model like that described above. and to suggest possible solutions. as many different interactions that affect the outcome are taking place. interest in and awareness of methodological issues is growing. The idea of unit testing an ALife model was mentioned earlier. LAD. But what of including both in the one model? While it is recommended that ALife models should be minimal. whether biological or cultural. As well as the interactions between different agents as one part of the model evolves. ALife models of the evolution of language should differ according to the particular aspects of language that are to be investigated. While it is not possible to prescribe a single methodology for all ALife work. the sooner into a research project that a suitable method is adopted. are there cases in which a single model should use both forms of evolution? 3. To compensate for this.Chapter 3 – Artificial Life 57 learn language from a selection of other agents instead of inheriting it from two parents. 
and any attempt to investigate this using an ALife model will require that the signalling schemes used in the model can evolve. As well as the testing of individual agents. non-evolving. which considers the coevolution of language and the LAD. a high level of awareness of the problems facing ALife research is clearly desirable. there are the interactions between the coevolving elements of the model. While the current state of methodological awareness in the ALife community is quite poor.

problem areas still remain poorly covered in the literature – the academic pursuit of knowledge appears to be uncomfortable with software engineering. Problems theoretical and philosophical are more academically attractive than the mundane problems of ensuring code is correct. In future years as models grow larger and more detailed there may be increased need for more rigorous approaches to software development in ALife. But for now precedence is rightly given to attempting to develop an emerging ALife scientific method, and to trying to show how a collection of ones and zeroes can be relevant to processes taking place outside the computer.

In the next chapter, the base model, round which the work of this thesis is developed, is presented and documented – according to many of the steps and methods prescribed in this chapter. There are, however, no rigorous software engineering development documents. No software quality process has been followed. In the descriptions of experiments and models that follow, details are given of some of the steps taken in verifying and validating the models and the principles outlined in this chapter are generally followed.

Chapter 4 An Artificial Life Model for Investigating the Evolution of Language

4.1 Introduction

In this chapter I develop the model that will be used in my computational investigations on the evolution of language. I describe the model, detail how it works and justify the decisions taken in implementing it. Some preliminary experiments are also detailed. The model as detailed here provides a base for further experiments on modelling the biological and cultural evolution of language, described in the following chapters.

4.2 Computational Models of Language Learning Populations

A number of recent computational models demonstrate the evolution of innate communication schemes (for example, Di Paolo, 1997a; Oliphant, 1996; Cangelosi and Parisi, 1998). Other models demonstrate the self-organisation of lexicons, grammars and sound systems in populations of language agents without evolution acting on the language agents themselves (e.g. Steels, 1996a; Batali, 1998; Kirby, 2000; de Boer, 2000). These models typically demonstrate the self-organisation of language features through a process of repeated interaction and learning, once the mechanisms for language have developed.

Cangelosi et al (2002) describe a related model in which agents learn lexical terms for distinct visual inputs – square, ellipse, etc. This differs from the other models described here in that the lexicon is predetermined rather than emergent. This is a consequence of the model being used to examine the phenomenon of symbolic grounding (Harnad, 1990), not linguistic self-organisation.

Batali (1994) combines evolution and learning in an artificial neural-network, ANN, model in which recurrent neural networks attempt to learn context-free grammars in an investigation of innate language biases and critical periods. The language agents have a fixed structure, and a predetermined number of inputs and outputs. Evolution determines initial weight values for the networks, selecting appropriate values for the class of languages on which the population is trained. This model demonstrates how evolution can tune innate learning mechanisms towards certain grammars.

A second model using ANN agents learning to produce and interpret signals, with observational rather than reinforcement based learning, is presented in (Oliphant, 1997). In Oliphant’s model, ANN agents relate meanings to signals and vice-versa with winner-take-all competition on the produced vector, with meanings and signals represented by binary vectors with only one active value. Using a form of Hebbian learning (learning which increases the strength of a weight when the neurons which are connected by it both fire simultaneously), Oliphant shows successful negotiation to a common optimal language, with a different signal being used for each meaning.

Another model where a community of ANNs negotiate a shared lexicon is presented by Hutchins and Hazelhurst (1995). The agents within this model are similar to the ones presented in this thesis, although their agents are more complicated with an additional layer (used only in learning). The authors limit their investigation to the development of a shared lexicon.

Similar, but not ANN-based, work has been presented, with a number of variations and enhancements by (Steels, 1996a; Steels, 1996b; Steels, 1996c; Steels and Vogt, 1997; Steels, 1998; Steels and Kaplan, 1998). In these works, agents attempt to learn symbol-meaning pairs through a negotiation process termed the ‘naming game’ – in which it is assumed that extra-linguistic means, such as pointing, enable agents to know the correct meaning of an utterance. While this does not capture the structural or generative nature of human language, these works have shown the emergence of shared lexicons in a number of distinct implementations.

Building a model where agents use simple grammars is certainly possible (e.g. Elman, 1993) – but would introduce additional problems to be overcome. The first problem is simply that recurrent ANN, used for ANN grammar learning, have longer training periods than simple ANN, and experiments would take significantly longer to perform. Using populations of ANN, this would have a significant overhead. The networks themselves may consist of many layers and hundreds of nodes. Using a simple signal-meaning mapping also allows for much simpler modelling of the evolution of language ability. With a recurrent network, there is not a simple relationship between the number of nodes, or the connections between nodes, and the ability to learn a grammar.

How can the EoL be modelled with such networks? Batali (1994) achieves this, but in part only. By using a fixed ANN structure, initial weight values are evolved to find those best suited for language learning. However, given suitable training there is no reason

in principle why agents in the first generation should not be able to learn the languages eventually learned by those agents which have benefited from many generations of evolution. The agents in later generations are not intrinsically 'more able' to learn language than those in the first. There is no evolution in the structure of the agents’ ANN, only a tuning of their initial parameter settings.

In ignoring grammar, an important feature of language is lost but the details of the model are kept to a minimum and focussed only on properties and processes that are necessary for a minimal form of language.

4.2.1 Evolution of Language Ability

What the models described above do not show is the evolution of linguistic ability over time. A slight exception is the work of Batali, but here it is simply the initial weights which are modified – the network structure, the capacity for language learning, is not.

An interesting approach would be one in which the ANN structure – the number of nodes and their interconnections – evolved. A number of approaches have been developed for evolving ANN of some complexity (for example, see Miller et al., 1989; Kitano, 1990; or Belew et al., 1991). These approaches all evolve ANN, with evolving network structure, train them to solve a certain task to determine fitness before producing a new generation of networks and beginning again. A difficulty in trying to use such models in investigations of the EoL is that, rather than evolving ANN to solve a single problem, the ANN need to co-evolve to solve a problem that itself varies according to the current population makeup. The nature of this problem is itself more complex.

In contrast, in Fyfe and Livingstone (1997) a population was modelled in which the capacity for language learning does evolve. In this work, the individuals first learn to identify stochastic sources in an environment and then learn a common language to communicate about the sources present in any given environment. In this model, the language agents consist of three layers of nodes with two layers of interconnecting weights. The different layers of nodes store vectors representing the external environment, the signal generated in response and in the middle an internal representational state. This allows agents to use different internal representations from each other for the same meanings, without preventing them from reaching consensus on signals to represent particular environments.

Agents with fewer neurons in the hidden layer might be unable to accurately represent the different sources existing in the environment internally. Thus, they would be structurally less able to communicate effectively about their environment than would agents with larger hidden layers. Experiments were performed with communities of agents with differing representational capabilities. Communicative success was seen to improve in populations with a common representational capability, and evolution towards homogeneous representation capability was observed – but not necessarily towards better representation.

A weakness of this model is that the production of signals was compared to assess the success of communication, but the ability of agents to interpret signals was not tested. Thus, by producing the same word for the same environment two agents are assumed to be successfully communicating. No pressure to produce different signals for different environments, or to be able to decode signals, existed. The results also showed the emergence of synonyms. While this was presented as a positive result, it is a consequence of the lack of pressure on agents to be able to interpret the signals and could be considered an artefact particular to this implementation.

The original aim of my research was to develop a new model in which the language ability could evolve over time, to allow investigation of factors surrounding the EoL, yet which overcame the problems of Fyfe and Livingstone (1997). Thus, the model was to be one of language-physiology co-evolution. In such a model, as the language ability of agents changed, so would the languages that they used.

4.3 Modelling Human Language

The model should capture some of the characteristic features of language (Section 2.2.1) if we are to claim that the results of running the computational model are relevant to an investigation of the evolution of language. Ideally the model would include all relevant features of human language. Such a complicated model might, however, be very difficult to build and even more difficult to understand. As we saw in Chapter 3, models are abstractions of real world systems and the power they have in allowing us to improve our understanding of such systems comes from the ability to simplify, to remove complicated and possibly extraneous features, and to focus on selected aspects of a system. This applies equally to mathematical and to ALife models.

Thus, starting with a list of relevant features, we can select the minimum set that we feel will enable us to model the EoL. We saw in Section 2.2.1 that six key, text-book, features of human language are that language is:

• Communicative
• Arbitrarily symbolic
• Regularly structured and structured at multiple levels
• Generative and productive
• Dynamic
• Transmitted by learning rather than hereditary means

Of these features the first, that the signals need to be communicative, is clearly important. From an Information Theory perspective (Shannon and Weaver, 1949), a signal is communicative if it reduces uncertainty about the environment or some signified event. (Although, see Di Paolo (1997b) for an account of the role of communication for coordination where there is no hidden information.) Thus, any system that generates signals from inputs will be communicative, except where the same signal is produced for all possible sources, or where there is no causal relationship between source and signal. However, an ideal communicative system removes all uncertainty. If the agents in our model are able to correctly interpret the majority of signals received then we will have established that the signals are highly communicative.

Additionally, the agents in our model should acquire their arbitrary meaning-signal pairs through a learning process, ensuring that the signal scheme is arbitrarily symbolic and features ‘traditional transmission’. Such a system is extremely rare in natural communication systems other than human language. Few animals use signalling systems with discrete meanings and signals, and those that do appear to have innate rather than learned mappings (Hauser, 1996).

While it is possible to develop computer based models in which agents learn signals which are regularly structured (e.g. Batali, 1994), such models are more complex. As we will be attempting to develop a model in which the linguistic ability of agents evolves, such extra complexity could cause difficulties, and so the decision was made to focus on the already identified core features.

Without at least these properties, a model may be more truly one of innate communication systems rather than of human language. As mentioned above, these features

are rarely found together outside of human language. Within the bounds of a model in which populations of signallers exist, they provide a reasonable set of features for building a valid model of the EoL. The fact that there is no pressure or, indeed, ability to make the signals model the meanings captures the arbitrary symbolic nature of language.

Due to the abstractions and limitations inherent in any computer program, developing a model in which the language is generative and productive, allowing original and infinite use of signals and combinations of signals, is much more difficult. For the purpose of investigating the EoL, developing the models of the individual agents in order to allow the production and interpretation of a near-infinite set of signals is clearly not practicable.

The only remaining characteristic feature of language is that it is dynamic, changing over time. We investigate the dynamic nature of the signals used in our model in Chapter 6. As will be later seen, the model also features evolution in the communication schemes being used, akin to historical language change over many generations (Chapter 6).

4.4 An Artificial Neural-Network Based Language Agent

The individual language agents are implemented as simple ANN. The principal advantage of an ANN implementation is that it is relatively easy to generate individuals with differing network structures, representing differing innate linguistic abilities. A suitable learning rule will allow the development of language by individuals. It is easy for an ANN to learn uni-directional mappings, e.g. from some nominal ‘meaning’ to produce a ‘signal’. A learning rule was chosen which would allow ‘signals’ to be fed backwards to produce meanings, in a way similar to a Bidirectional Associative Memory (see Haykin, 1998). Finally, we will use the established ‘naming game’ (Steels, 1996a) interactions between agents for the purposes of language learning and testing.

Accordingly, the previous three-layer model is not required; a simpler two-layer (internal state and signal) model can be used instead. While the agents no longer have differing internal representations, the function performed by the weights is the same – learning a mapping to generate common signals for common input states. Explicitly, in Fyfe and Livingstone (1997) the agents map an environment, e, to an internal state, i, to a signal, s:

$$ s = f_2(i) = f_2(f_1(e)) = f(e) $$

Simplifying this, it is possible to use a single layer of weights to perform a similar learning task. In Section 3.4.2 we saw that minimal models are to be preferred over more complex ones. The extra layer in the Fyfe and Livingstone (1997) model is more a distraction than an essential element of the model. In the following we assume that the function mapping from external environment to internal state is a given, and that agents have a common representation for different meanings – although as has been shown (Fyfe and Livingstone, 1997), this is not required for ANN to negotiate a shared signalling scheme.

4.5 Evaluating Communication in Artificial Populations

In the model I will develop, communication allows an agent possessing some internal state, or meaning, to send a signal to another agent. The receiver then maps the signal into an interpreted meaning. If the interpreted meaning is the same as the initial meaning, then communication can be said to have been successful.

Having decided on the type of communication to be modelled in the simulated system, the next question is how can the success (or otherwise) of the evolution and emergence of ‘language’ be measured and evaluated. One obvious measure of communicative success is to measure over a number of communicative episodes what proportion of signals sent are correctly interpreted. If all signals are correctly interpreted, then communication can be said to be perfect.

(Oliphant, 1997; Steels and Vogt, 1997) define three requirements which must be fulfilled for optimal communication, where every signal sent is correctly received. These are:

• Coordination. An individual’s meaning-to-signal generation must be coordinated with the inverse signal-to-meaning mapping.
• Distinctiveness. The signals used by an individual to represent each meaning must be distinct from one another.
• Consensus. All individuals in the population must have the same communication system.

A caveat to this requirement is the assumption, made explicit by Oliphant, of equal numbers of signals and meanings. The degree to which these requirements are met can be measured to evaluate the optimality of the communication system. Alternatively the simpler measure,

mentioned above, of success of interpretation can be used. To do this all that is required is to form pairs of agents where one generates a signal and the other interprets it. The percentage success can be measured over many pairings.

Information theory (Shannon and Weaver, 1949) provides further means for mathematical analysis of the communication schemes. The usefulness of the signals for transferring information between two agents is evaluated on the basis of mutual information in the meanings held by the two agents after signalling. Information theory also provides means to measure the information bearing capacity of the channel, as well as its quality (in Chapter 6 information theory is also used as a measure of linguistic consensus and diversity).

4.6 Modelling Language-Physiology Coevolution

In the model, agents will learn to map messages that are sent by other agents to internal states or ‘meanings’. The same ANN is used for the production of messages from each arbitrary meaning as for the reverse mapping. By learning from each other a co-ordinated communication system is developed by a community of agents.

In the remainder of this chapter I will detail the basic model, without evolution, and present a number of experimental tests. This will form the basis for investigating the evolution of language ability using the model. In the next chapter I will add to the model by making the expressive capability of the language dependent on hereditary genes. The genes determine the number of language nodes possessed by agents, which determines the range of signals that can be produced. We can think of this as being a model of the coevolving languages used by agents, and the neural physiology that must exist to support the languages in use.

4.7 A Language Agent

With a requirement only that agents are able to learn signal-meaning mappings, a simple two-layer ANN architecture and learning algorithm is sufficient. If a standard linear ANN training algorithm (Haykin, 1998) were used for training individual agents, meaning-to-signal mappings could be successfully negotiated. But with no pressure on an agent being able to interpret the correct meaning of a presented signal the language could use a single word for more than one meaning. It would even be possible for a language to be negotiated that used only one word for all meanings (see

for example Fyfe and Livingstone (1997), a similar problem is also noted in Noble and Cliff (1996)). Using the criteria set out above, such a system would be far from optimal, fulfilling only the consensus criterion.

A learning algorithm which ensures the emergence of an optimal communication scheme - with a different signal shared by the whole population for each meaning state - is desired. The results of Oliphant suggest that optimal learning is performed by an algorithm using the transmission behaviour of the population to train language reception and the reception behaviour to train language production. Thus some kind of inverted learning algorithm is required.

An inverse learning approach is presented in a number of generative models for ANN learning by Hinton and Ghahramani (1997). In these models, networks possess feedback generative weights and feedforward recognition weights, and the problem of recognition is posed as which hidden, or output, units could be responsible for generating the input pattern. We apply this principle to ANN with only a single set of weights, using the same weights for both the recognition and generative tasks. Using this approach, generative weights map meanings to patterns at the language layer. Recognition weights map language layer patterns to meanings. The learning algorithm should try to adapt the weights such that, for any given meaning-signal pair, the signal should produce the correct meaning when fed back through the ANN.

The operation of the learning algorithm is then as follows. A learner is presented with a meaning-signal pair. The signal is presented at the output layer of the learner and fed back to produce a generated meaning. This is compared to the original meaning, and any error is used to update the current weight values of the network (Figure 4.1).

Figure 4.1. A language agent neural network. (Diagram labels: Language Layer; Signal Interpretation; Signal Generation; Meaning Layer.)

The following sections present the equations for signal production and interpretation and for the learning algorithm.

4.7.1 Signal Production

An individual agent within the population is a single two-layer fully connected ANN with layers containing M and N nodes, respectively. ‘Meaning’ is modelled as a bipolar (±1) vector of length M, which is presented at the inputs of a language agent, representing the agent’s internal state. In all cases described, a sparse coding of the meaning is used with only one bit in the vector having the value +1, all others being –1 (However, the signal may be any arbitrary bipolar vector of length N). Signals are similarly represented at the language layer as arbitrary bipolar vectors of length N. Thus, for N language neurons, there are $2^N$ possible signals or words in the language, and for M meaning neurons there are M possible meanings.

A meaning vector can be fed forward through the ANN to determine an agent’s signal for that meaning, each output being thresholded to a bipolar value (±1), as given below

$$ y_j = \sum_{i=1}^{M} x_i w_{ij} \qquad \text{(4-1)} $$

then

$$ y'_j = 1 \text{ if } y_j > 0, \qquad y'_j = -1 \text{ if } y_j < 0, \qquad y'_j = \mathrm{rnd}[1, -1] \text{ if } y_j = 0 $$

where the vector $y'$, $(y'_1, \ldots, y'_N)$, is the word generated for meaning vector $x$.

4.7.2 Signal Interpretation

To interpret a signal vector, the signal can be fed back to generate a meaning vector. Competition exists between neurons in the meaning layer, such that any signal fed back from the language layer only has one corresponding meaning. When a signal is fed-back during interpretation, it is likely that several meaning neurons will fire at different strengths. Competition can then be applied to set one, and only one, bit of the vector to +1, the remaining bits to -1. For each neuron of the meaning (input) layer, equation 4-2 is applied to determine its activation value. Due to competition, the single neuron with the greatest activation value is set to +1, the remainder to –1, the competition allowing a single meaning to be chosen unambiguously.
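Signal production (Equation 4-1) and winner-take-all interpretation can be sketched in a few lines. This is an illustrative sketch only, not the thesis implementation: the function names, the use of NumPy, and the (M, N) weight layout are assumptions made for the example.

```python
import numpy as np

def produce_signal(weights, meaning, rng):
    """Feed a bipolar meaning vector forward and threshold each output to +/-1.

    weights has shape (M, N): M meaning neurons, N language neurons (assumed layout).
    """
    activation = meaning @ weights            # y_j = sum_i x_i * w_ij  (Equation 4-1)
    signal = np.sign(activation)
    ties = activation == 0                    # zero activation: pick +1 or -1 at random
    signal[ties] = rng.choice([-1.0, 1.0], size=int(ties.sum()))
    return signal

def interpret_signal(weights, signal):
    """Feed a signal back and let the most active meaning neuron win."""
    activation = weights @ signal             # one activation per meaning neuron
    meaning = -np.ones(weights.shape[0])
    meaning[np.argmax(activation)] = 1.0      # competition: exactly one bit set to +1
    return meaning

# Hypothetical weights for 4 meanings and 2-bit signals (each row equals one signal).
W = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
x = np.array([-1.0, -1.0, 1.0, -1.0])         # sparse coding of the third meaning
s = produce_signal(W, x, np.random.default_rng(0))
print(s, interpret_signal(W, s))
```

With these hand-picked weights the third meaning produces the signal (+1, -1), and feeding that signal back recovers the third meaning. Note that `np.argmax` breaks exact ties by taking the lowest index, which is a choice of this sketch rather than something the text specifies.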

$$ x'_i = \sum_{j=1}^{N} y_j w_{ij} \qquad \text{(4-2)} $$

$$ x'_i = 1 \text{ for } i = \arg\max_k x'_k, \qquad x'_j = -1 \text{ for } j \neq i $$

4.7.3 Learning

During learning, an agent will be presented with a meaning-signal pair. The signal will be presented at the output layer and fed-back to produce a generated meaning vector as described above. The error between the actual meaning, x, and the generated meaning, x’, is used for learning by the receiver agent. This error is multiplied by a learning rate, α, to determine the correction to be applied to the weights, w, connecting the layers, (4-3).

$$ \Delta w_{ij} = \alpha \, (x_i - x'_i) \, y_j \qquad \text{(4-3)} $$

This learning algorithm only updates weights when a word is misclassified. When a word is correctly classified the receiving agent performs no learning, as the error vector is composed of zeroes.

In the initial conditions, all weights have a zero value. Thus, signals will initially be random bipolar vectors as each output bit will be set to a random value (Equation 4-1).

4.7.4 A Note on Terminology

It is typical in ANN literature to describe an NN by the number of layers of weights it contains. Accordingly, a network with inputs fed through a layer of weights to neurons, then a second layer of weights to output neurons would be a 2-layer network. In this work, the ‘inputs’ can be applied to either side of the network, and are fed through a single layer of weights to neurons at the other side. Although there is only one layer of weights, we consider the network to have two layers of neurons, the two layers being the ‘meaning’ and ‘signal’ layers of neurons.

4.7.5 Representational Capacity

As noted, meaning vectors are sparse. Signal vectors are arbitrary. So an ANN with M nodes in the meaning layer can learn M possible distinct ‘meanings’, and for N signal layer nodes, $2^N$ possible signals can be learnt.
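The learning rule of Equation 4-3 can be illustrated with a single update step. The sketch below is a hedged illustration, assuming NumPy, an (M, N) weight matrix, and a hypothetical function name `learn_step`; it shows that a misclassified pair changes the weights while a correctly classified pair leaves them untouched.

```python
import numpy as np

def learn_step(weights, meaning, signal, alpha):
    """Apply one update of Equation 4-3 in place and return the generated meaning."""
    activation = weights @ signal                    # feed the signal back
    generated = -np.ones_like(meaning)
    generated[np.argmax(activation)] = 1.0           # winner-take-all competition
    error = meaning - generated                      # all zeroes when classified correctly
    weights += alpha * np.outer(error, signal)       # delta w_ij = alpha * (x_i - x'_i) * y_j
    return generated

# Hypothetical example: 4 meanings, 2-bit signals, zero initial weights.
W = np.zeros((4, 2))
x = np.array([-1.0, -1.0, 1.0, -1.0])                # actual meaning (third neuron active)
y = np.array([1.0, -1.0])                            # signal paired with that meaning

learn_step(W, x, y, alpha=0.1)                       # misclassified, so weights change
snapshot = W.copy()
learn_step(W, x, y, alpha=0.1)                       # now classified correctly
print(np.array_equal(W, snapshot))                   # weights are unchanged by the second step
```

Only the rows of the wrongly winning neuron and of the correct neuron are adjusted, which is why a correctly interpreted signal produces no learning.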

So a network should be able to learn eight meaning-signal pairs if it has eight meaning-layer nodes and just three language-layer nodes. This is not possible, however, as an additional bias node is required in the output layer, to be used in interpreting signals, as detailed below.

4.7.6 A Bias for Language Learning

This requirement for a bias bit can be explained. To simplify the explanation, consider an ANN with four input (meaning) and two output (signal) neurons. Such a network should be capable of learning four meaning-signal pairs. For this case there are four possible two-bit signals, and these can be mapped onto two dimensions (Figure 4.2). After feeding back any signal, one and only one bit of the meaning vector should be set to +1, the remaining three bits to –1.

Two weights – the number of weights which feed back to each meaning neuron – can code for a discrimination line of equation ax+by = 0. Such a line, through the origin, would be unable to separate off any one signal uniquely. By adding a bias bit to the signal, a discrimination line of equation ax+by+c = 0 can be derived. Such a line can discriminate one signal (Figure 4.2).

Figure 4.2. The dashed line through the origin fails to separate out one point from the other three. The solid line discriminates the top right point.

The above explanation can be extended to three dimensions to explain why three bits plus one bias bit are required in the output, signal, layer to allow the agents to learn eight pairs of meaning-signal mappings. In summary, for M meaning nodes, the minimum number of signal neurons, N, including any bias, to be able to learn the bi-directional mapping is

$$ N \geq \log_2 M + 1 \qquad \text{(4-4)} $$

where M, N are integer values.

4.7.7 The Individual Language Agent – 1

The need for a bias bit can be confirmed by training an individual language agent on an arbitrary communication scheme. Each of eight possible internal states is mapped to a different three-bit signal vector. This is shown below in Table 4.1. The signals have been arbitrarily selected as the binary numbers 0 through to 7. Each meaning vector has a –1 in all but one position, and it is the index of this position that is listed in the meaning column of the table.

Meaning   Signal
1         - - -
2         - - +
3         - + -
4         - + +
5         + - -
6         + - +
7         + + -
8         + + +

Table 4.1. An arbitrary communication scheme. (The bipolar values are represented as + and -).

The language agent is trained for four hundred rounds. In each round a randomly selected meaning-signal pair is presented once, and the error signal used to update the weights. The learning rate used is α = 0.2, decreasing over the learning period by dα = 0.0005 after each learning example. Over a large number of runs, the agent continually fails to learn the mapping. Typical results are shown in Table 4.2.

[Table 4.2. Results of four attempts to learn the communication scheme of Table 4.1. All attempts to learn the scheme fail. Columns: Meaning (1 to 8); Learned Scheme (attempts 1 to 4).]

4.7.8 The Individual Language Agent – 2

By adding a bias unit to the language layer, active during signal interpretation, it is possible for a language agent to learn any arbitrary mapping. The above experiment was repeated with the same parameter values, but with the addition of a bias node. Over a large number of runs, the language agent repeatedly learned the signal-meaning mapping, with no failures.
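The single-agent experiments above, together with the capacity bound of Equation 4-4, can be sketched as follows. This is a minimal sketch under stated assumptions, not the original code: the function names, the seed, the NumPy implementation, and the choice to model the bias as a constant +1 bit appended to the signal during interpretation are all illustrative. Only the schedule (400 rounds, α = 0.2, dα = 0.0005) and the scheme of Table 4.1 come from the text.

```python
import numpy as np

def min_signal_neurons(num_meanings):
    """Smallest integer N with N >= log2(M) + 1 (Equation 4-4), bias included."""
    return (num_meanings - 1).bit_length() + 1    # (M - 1).bit_length() == ceil(log2 M)

def table_4_1_scheme():
    """Meanings 1..8 mapped to the binary numbers 0..7 as bipolar 3-bit signals."""
    bits = [[(k >> shift) & 1 for shift in (2, 1, 0)] for k in range(8)]
    return 2.0 * np.array(bits) - 1.0             # 0 -> -1, 1 -> +1

def sparse_meaning(index, size):
    vec = -np.ones(size)
    vec[index] = 1.0
    return vec

def interpret(weights, signal, use_bias):
    """Feed the signal back; with use_bias a constant +1 bit is appended (assumption)."""
    y = np.append(signal, 1.0) if use_bias else signal
    return int(np.argmax(weights @ y)), y

def train_agent(use_bias, rounds=400, alpha=0.2, d_alpha=0.0005, seed=0):
    rng = np.random.default_rng(seed)
    signals = table_4_1_scheme()
    n_meanings, n_bits = signals.shape
    weights = np.zeros((n_meanings, n_bits + int(use_bias)))
    for _ in range(rounds):
        m = int(rng.integers(n_meanings))         # one random meaning-signal pair per round
        winner, y = interpret(weights, signals[m], use_bias)
        error = sparse_meaning(m, n_meanings) - sparse_meaning(winner, n_meanings)
        weights += alpha * np.outer(error, y)     # Equation 4-3
        alpha = max(alpha - d_alpha, 0.0)         # learning rate decays to zero
    correct = sum(interpret(weights, signals[m], use_bias)[0] == m
                  for m in range(n_meanings))
    return weights, correct / n_meanings

print(min_signal_neurons(8))                      # three signal bits plus one bias bit
for use_bias in (False, True):
    _, accuracy = train_agent(use_bias)
    print(use_bias, accuracy)
```

The sketch measures only interpretation accuracy after training. It makes no claim to reproduce the failure and success rates reported above, which depend on details of the original implementation (tie-breaking, how production is tested) that are not given in the text.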

4.8 Experiment 1: Language Negotiation

The first proper test of the model is to determine whether homogenous populations of agents can successfully negotiate a communication scheme which they can use to share information about their internal states. To do so, an array is created and populated with language agents. Weights are initialised randomly. Over a number of rounds pairs of agents are selected, and communication and learning takes place. The algorithm is shown in Figure 4.3.

1. For t training rounds
2.   pick random meaning
3.   for each agent (picked in random order)
4.     pick another agent to be signaller
5.     generate training signal from signaller
6.     train both agents on signal

Figure 4.3. Population training algorithm

As detailed previously, with 8 possible meanings, a minimum of 3 language neurons, plus a bias, are required for a language to be able to convey all possible meanings. E.g., for three possible signals three nodes are required at the signal layer.

4.8.1 Emergence of a Language Bias

Without setting a bias, it is still possible for populations to negotiate a language which allows the successful communication of all meanings, but only where the number of neurons in the language layer satisfies equation (4-4). Thus, successful language learning is possible for a system with eight possible meanings where the agents' language layers have no bias but at least 4 neurons. However, in all successful experiments with 4 language neurons, after negotiation, one of the signal bits becomes set to the same value (+1 or -1) in all signals. In effect, although there is no explicit bias, one emerges from the interactions of the agents as they attempt to learn to communicate with one another.

This is shown in the negotiated languages of Table 4.3. In Table 4.3(a), the signalling scheme shown is for one agent in a population of agents which all use the second bit of their signals as a bias: in all signals, this bit is set to -1. A similar result occurred in the population from which the signalling scheme in Table 4.3(b) is drawn, only this time the fourth signal bit is always set to +1.
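The emergent bias can be checked mechanically against the two negotiated languages of Table 4.3. The sketch below is an illustration only: the signal values are those of Table 4.3 as read column by column from the extracted text, and the helper names are hypothetical.

```python
# Signals for Word 1 to Word 8, read column-wise from Table 4.3 (a) and (b).
LANGUAGE_A = [
    (1, -1, -1, -1), (-1, -1, -1, -1), (1, -1, 1, -1), (1, -1, -1, 1),
    (-1, -1, 1, -1), (-1, -1, 1, 1), (-1, -1, -1, 1), (1, -1, 1, 1),
]
LANGUAGE_B = [
    (-1, 1, -1, 1), (1, -1, -1, 1), (-1, 1, 1, 1), (1, 1, -1, 1),
    (1, -1, 1, 1), (-1, -1, -1, 1), (-1, -1, 1, 1), (1, 1, 1, 1),
]


def constant_bits(language):
    """Map each 1-based bit position that is identical in every signal to its shared value."""
    return {
        position + 1: column[0]
        for position, column in enumerate(zip(*language))
        if len(set(column)) == 1
    }


def is_unambiguous(language):
    """True if every word has a distinct signal, so every meaning can be conveyed."""
    return len(set(language)) == len(language)


for name, language in (("a", LANGUAGE_A), ("b", LANGUAGE_B)):
    print(name, constant_bits(language), is_unambiguous(language))
```

In both schemes exactly one bit is constant, and the remaining three bits still give eight distinct signals, which is what a three-bit code plus bias would provide.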

(a)                        (b)
Word 1:  1 -1 -1 -1        Word 1: -1  1 -1  1
Word 2: -1 -1 -1 -1        Word 2:  1 -1 -1  1
Word 3:  1 -1  1 -1        Word 3: -1  1  1  1
Word 4:  1 -1 -1  1        Word 4:  1  1 -1  1
Word 5: -1 -1  1 -1        Word 5:  1 -1  1  1
Word 6: -1 -1  1  1        Word 6: -1 -1 -1  1
Word 7: -1 -1 -1  1        Word 7: -1 -1  1  1
Word 8:  1 -1  1  1        Word 8:  1  1  1  1

Table 4.3. Two negotiated languages. In (a) the second bit is uniformly set to -1, in (b) the final bit is set to 1.

Further evidence of the emergent bias comes from examining the internal weights of a language agent drawn from a population in which an optimal communication scheme has been learnt. The set of converged weights for one language agent are presented in Table 4.5. The language of this particular agent can be seen in Table 4.3(a). The weights corresponding to the bias bit are large, on the order of three times the size of other weights. All other weights are the same order of magnitude as each other.

[Table 4.5 data: weights Wi1 to Wi4 for meanings M = 1 to 8. The individual cells are not recoverable from the extracted text; the legible values for the bias bit, Wi2, lie between about 0.72 and 0.79, and the legible values for the other weights have magnitudes between about 0.22 and 0.30.]

Table 4.5. The weights for a NN in a population which has negotiated that the second language bit act as a bias, with value -1.

4.8.2 Success of Language Negotiation

To evaluate language negotiation, populations were initialised and test runs completed for varying values of N and varying numbers of training rounds. All other parameters were kept constant, except for dα, which is derived from the values of α and t, such that α decrements to zero over the training period. Additionally, all agent weights are initially set to zero. Accordingly, the initial set of signals produced will be random. The populations of agents are then given either 100 or 250 rounds of language negotiation to form signalling schemes. The experimental parameters are shown in Table 4.4.

Fixed Parameters                Variable Parameters
Learning Rate, α    0.2         N                     0 to 6
Population          120         Training rounds, t    100 or 250
M                   7           dα (α / t)            0.0005 or 0.0002

Table 4.4. Experimental parameters

For each parameter set, 10 runs were performed. Testing was performed, after training had completed, using the algorithm presented in Figure 4.4. During this testing no learning is performed by the agents. Within each run success was measured as the percentage of signals correctly interpreted by the agents.

1. For each agent, A, in population: %success = 0
2.   For test = 1 to 100
3.     Select random agent B, such that A <> B
4.     Pick random meaning, M
5.     Present M to B to produce signal, S
6.     Present S to A to produce generated meaning, MG
7.     If MG = M, %success = %success + 1
     Next test
8. Average success scores of all agents

Figure 4.4. Algorithm for evaluating language negotiation success

The percentage successes for different combinations of parameters are shown in Figure 4.5. The results show the standard deviations across the ten simulation runs, as well as the means.

[Figure 4.5: two charts of % success of communication (0 to 100) against No. of Units in Language Layer (0 to 6).]

Figure 4.5. Success of language negotiation, with variable numbers of units in the language layer. Left: 100 rounds of language negotiation. Right: 250 rounds. Average success (with standard deviation) in interpretation shown.

The result obtained for success of communication for populations where the agents have no units in the language layer reflects the chance of randomly guessing the meaning.
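The evaluation procedure of Figure 4.4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original implementation: `LookupAgent` is a hypothetical stand-in for a trained neural agent, with `produce` and `interpret` methods invented here, and meanings are indexed from 0.

```python
import random


class LookupAgent:
    """Hypothetical stand-in for a trained agent: fixed meaning-to-signal tables."""

    def __init__(self, scheme):
        self.scheme = dict(scheme)
        self.inverse = {signal: meaning for meaning, signal in self.scheme.items()}

    def produce(self, meaning):
        return self.scheme[meaning]

    def interpret(self, signal):
        # Unknown signals are interpreted as None, which never matches a meaning.
        return self.inverse.get(signal)


def evaluate_population(agents, num_meanings, tests=100, rng=None):
    """Average percentage of signals correctly interpreted, following Figure 4.4."""
    rng = rng or random.Random()
    scores = []
    for a_index, listener in enumerate(agents):
        successes = 0
        others = [i for i in range(len(agents)) if i != a_index]
        for _ in range(tests):
            signaller = agents[rng.choice(others)]   # random agent B, with A <> B
            meaning = rng.randrange(num_meanings)    # random meaning M
            signal = signaller.produce(meaning)      # present M to B to produce S
            if listener.interpret(signal) == meaning:  # present S to A, compare MG to M
                successes += 1
        scores.append(100.0 * successes / tests)
    return sum(scores) / len(scores)


# Usage: a population sharing one scheme interprets every signal correctly.
shared = {m: tuple(1 if (m >> bit) & 1 else -1 for bit in (2, 1, 0)) for m in range(7)}
population = [LookupAgent(shared) for _ in range(5)]
score = evaluate_population(population, num_meanings=7, rng=random.Random(0))
print(score)
```

No learning takes place inside `evaluate_population`, matching the statement that agents do not learn during testing.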

4.8.3 Interpreting the Results

With no language neurons, the performance of the agents is at chance level: on average agents guess the correct meaning one in seven times. But success at interpreting signals from other agents in the population is seen to improve as the number of neurons in the language layer increases. As language capability increases, average success increases, and with two or more language neurons increased training time increases fitness.

Without at least as many language nodes as required by equation (4-4), four nodes in the experimental setup described above, no population was able to negotiate a perfect signalling scheme. Given the finite number of training rounds, not all populations in which the agents satisfied equation (4-4) negotiated optimal communication schemes. If training time is extended sufficiently, populations with four or more neurons will consistently negotiate a language capable of allowing information to be shared, with a very high communicative success rate.

With four or more language units, the expressive power of the language exceeds the communication requirements of the environment. This has the potential for allowing synonymy, where multiple signals for one meaning may be recognised correctly.

The results show that using our learning algorithm homogenous populations are able to negotiate a useful language. As the language capability increases, so does fitness, showing that agents are successfully sharing information about their internal state. With more neurons, the success rate of communication increases, even where the capability for an optimal language does not exist.

External confirmation that the agents and learning rule used in this model are able to negotiate an optimal communication system is made in a recent work, which categorises the different possible learning rules (Smith, 2002). According to this, the learning rule used here has biases for allowing agents to learn and maintain existing optimal communication systems, and to create such systems where they do not already exist.

4.9 Experiment 2: Spatially Arranged Populations

In Chapter 2, it was noted that the EoL and the evolution of communication are related to the evolution of co-operation itself and this is explored more in the

following chapter, where we investigate the possible effect of spatial selection on the EoL. Rather than add spatial selection and enable evolution of agents simultaneously, it is worthwhile repeating the previous experiment to see if spatial constraints on agent communication have any impact on the results we obtain. This will allow us to observe any differences before making any further changes to the model.

Here the same experiment is repeated, but with a spatial organisation imposed on the population of agents. All parameters have the same values as previously, and the language is still negotiated over a number of training rounds in a single homogenous population. In the previous experiment, when two agents were to be selected to communicate for learning or testing, any two agents could be selected. In any and every pairing, every agent had a uniform chance of selection.

A simple spatial arrangement is created by organizing the agents in a ring, and limiting all communication to within a neighbourhood of the currently chosen agent. A ring arrangement is not necessary (a linear arrangement could be used instead) but means that all agents have the same number of neighbours and have the same chance of selection.

The area of the neighbourhood is defined by a normal distribution centred on the learner. So, for example, a teacher will be picked randomly according to the location of the current learner. An individual cannot learn or otherwise communicate with itself, and so the centre of the curve is 'zeroed'. In all experiments described the standard deviation used is 0.6, placing a strong preference on immediate and very close neighbours.

Figure 4.6. A normal distribution curve determines the likelihood of a neighbour being selected as a partner for learning or evaluating communicative success. The standard deviation used is 0.6, effectively limiting communication to close neighbours.

With neighbourhood limited communication the results in Figure 4.7 were produced for 100 and 250 training rounds for populations of 120 agents, again each result averaged over 10 runs. Success is measured as detailed in Figure 4.4. The same

neighbourhoods used in language negotiation are used for communication during fitness evaluation.

[Figure 4.7: two charts of success (0 to 100) against number of language neurons (0 to 6).]

Figure 4.7. Average success of signal interpretation, and standard deviation, with homogenous populations of 0-6 language neurons measured over ten experiment runs each case. With 100 (left) and 250 (right) negotiation rounds.

Comparing both sets of results, there appear to be a few small differences, which warrant some further investigation. Using spatial constraints on agent interaction appears to increase the variance in the success of language negotiation: the standard deviations shown in Figure 4.7 appear to be consistently larger than those in Figure 4.5. Additionally, the average success rates show some improvement when spatial constraints are present, particularly where the number of training rounds is low.

Why should the variance in success be greater, and why should the success rate of communication have improved? If the communication is limited to nearby agents, it may be the case that language negotiation is more successful amongst some agents than others in a single population. This could occur if different signalling schemes are negotiated amongst different subgroups of a population, a dialect-like effect. Communication between adjacent agents would have a higher success rate than communication between more distant agents. This could also explain an increase in success, as agents only have to negotiate a common signalling scheme with a subset of the population, rather than across the whole population. This could cause both the higher success rate, and the increased variance. Below we describe a further run with spatial limits enabled, and explore the signalling schemes of individual agents to determine if dialects are indeed formed.

4.9.1 Emergence of Dialects

It was noted previously that where agents possess a greater number of language layer neurons than required, it may be possible for them to interpret more than one signal

as having the same meaning. The additional language layer nodes provide some redundancy in the representation capacity of the agents. If, in the last experiment with a spatially arranged population, dialects did emerge, this can be put to the test by examining the signals used by different agents, and measuring the success of agents at interpreting different signals.

A homogenous language population with four language units, plus bias, is trained for 5,000 rounds. This allows a very high degree of coordination amongst the agent languages but, due to the local communication, does not negotiate a common language over the whole population. It is observed that large neighbourhoods negotiate a common signal for a given meaning, but distant agents may have significant differences in communication schemes used. At the boundaries between neighbourhoods, agents may exist which interpret signals from different schemes correctly. This is shown in Table 4.6.

[Table 4.6 data: the signal vectors used by agents 4, 5 and 6 for meanings 0 to 6; the cell layout is not recoverable from the extracted text.]

Table 4.6. The signals used by three adjacent agents for seven environmental states. Each meaning is represented by a bipolar (+1 and -1 values) signal vector, e.g. agent 5 would send the signal (-1, …, 1) to indicate meaning 0. Bias signal is not shown. All three scored maximum fitness, interpreting all signals correctly, despite differing communication schemes.

The three agents included each attained a maximal fitness score, interpreting all signals correctly. This demonstrates that a degree of 'bi-lingualism' or even 'multi-lingualism' is possible in the agent communication schemes. Communication may still be otherwise optimal, potentially with complete success in signal interpretation.

4.10 Conclusion

In this chapter I have developed a basic model for exploring the evolution of language and described some preliminary experiments. These have shown that the

agents in the model are able to negotiate a shared communication scheme, a simple 'language'. This replicates work done on demonstrating self-organising languages and signalling schemes, as demonstrated by (MacLennan, 1991; Hutchins and Hazelhurst, 1995; Steels, 1996c) amongst others.

From Figures 4.6 and 4.8, however, another result is apparent. As the language capability increases, so does the communicative success rate of the agents. This is to be expected. But, as the language capability increases beyond what is required to express all of the available meanings, communicative success continues to improve. There would appear to be some benefit in having a linguistic ability somewhat greater than strictly necessary. To rephrase, a linguistic ability with some degree of redundancy has an advantage over one that is merely sufficient.

This result, and some of its implications, are reviewed as part of the next chapter. The emergence of dialects, noted in the second experiment, is investigated more thoroughly in Chapter 6.

Chapter 5 Modelling the Biological Evolution of Language

In the previous chapter, we described the basic model which will be used as the basis of our experiments, and saw how the language agents are able to negotiate shared communication schemes. In this chapter we adapt the model in order to conduct an experiment investigating the biological evolution of language ability. As well as demonstrating a model of the evolution of language, we will use our model to investigate some aspects of the evolution of language, particularly the influence that spatial constraints on the population have on the evolution of linguistic ability. We also see how some design decisions when implementing the model can have a significant impact on the observed results.

Before proceeding with the experiment, we begin by highlighting some of the key points in the debate, and show that previous work on modelling the emergence and evolution of language has not been sufficiently targeted at answering some of these points. Changes to the model presented in the previous chapter are detailed, and experiments run and results presented. We conclude the chapter by critically reviewing the contribution that the experimental results and analysis make to the existing debate on the Evolution of Language.

5.1 The Evolution and Emergence of Language

Many of the relevant aspects pertaining to the EoL have been reviewed in Section 2.4, and Chapter 4 contains brief descriptions of various attempts to build models of the emergence and evolution of language. It is worthwhile reiterating several points, to emphasise the goals of the following work.

It is clear that human physiology has been adapted in a number of ways which has enabled the learning and use of language, and that there are costs associated with a number of these changes. It is this evolution of physiology, which supports language, that we wish to model. As has been noted, few of the current models of the evolution of language incorporate this, demonstrating instead how languages may evolve in populations of capable language learners.

Aside from the already noted exception of Batali (1994, 1998), those models in which individuals evolve as well as their communication schemes tend to demonstrate the evolution of innate, rather than learned, communication schemes. These are more akin to animal communication systems than to language. Such models commonly relate the evolution of signalling or language to a single specific task or adaptive function.

While this makes modelling easier, it is acknowledged that this is not necessarily true in the real world. Without modelling the coevolution of physiology and language, it is not necessary to resolve the problems of how to represent evolving language ability, and of how to ensure that heterogeneous agents can negotiate communication schemes. Here they must be tackled and explicit solutions to these problems are required.

5.2 Modelling the Biological Evolution of Language Ability

One problem to be explicitly addressed in our model, but not in the other models already cited, is that communication and learning of language has to be possible with heterogeneous populations. This requirement arises as a result of the inevitable heterogeneity which must occur in a population as a particular trait evolves. Once some mutation occurs to initially provide the trait in a few individuals, many generations may pass before it is shared by all. In the case of language, if individuals with more developed linguistic abilities are not able to form a common language with their less well developed neighbours then their abilities will not confer the expected fitness benefits. Two agents of differing innate abilities for language learning and use must be able to learn to communicate with each other despite their differing abilities. For a synthetic example, Steels and Vogt (1997) have also shown that shared internal representations are not required for successful communication about external objects common to multiple observers.

5.2.1 Heterogeneous Language Abilities

This can be captured using ANN based agents derived from those presented in the last chapter. The simple change required is to make the number of neurons in the language layer a hereditary trait, subject to change and evolution. Assuming all agents in the population have at least one neuron in the language layer, subsequent generations may feature heterogeneous mixes of individuals with one, two or more neurons (Figure 5.1). All agents have the same, fixed, number of neurons in their meaning layers.

[Figure 5.1: three networks, (A), (B) and (C), each with a Language Layer above a Meaning Layer.]

Figure 5.1. Over a number of generations the number of neurons in the language layer evolves. Even within a single generation, different agents may have different structures, and will have to be able to communicate despite the differences.

The following examples demonstrate how communication is possible between the three agents of Figure 5.1, assuming them to be members of a single population of agents. With three nodes in the meaning layer, there are three possible meanings. With one, two and three nodes in the language layer they can produce two, four and eight distinct signals, respectively (4.7.2). For the three meanings, Table 5.1 shows signals that might be produced by the three agents.

Meaning      Signal Produced
             (A)    (B)    (C)
1 (+--)      [signals not recoverable from the extracted text]
2 (-+-)      +      + +    + + +
3 (--+)      [signals not recoverable from the extracted text]

Table 5.1. Signals which could be produced by the three language agents shown in Figure 5.1. The bipolar vectors are shown here simply as strings of + and -. A bias node is present in the language layer in all three cases but is not shown.

If a signal were sent from agent C to agent A, A would receive only the first bit of the three bit signal as it only has one node in the language layer. So, according to Table 5.1, the communicative episode might unfold as shown in example (5-1), below:

    Meaning 2 randomly selected
    Agent C generates signal vector ( + + + )
    Agent A receives signal ( + )
    Signal interpreted as meaning 1 or 2 (depending on activation values)    (5-1)

Conversely, if the signal received is of a shorter length than the agent is capable of, then it is padded with zeroes. A zero value presented at a language layer node will have no effect on signal interpretation when fed back, as can be seen from equation (4-2). Thus, a zero bit in the signal has no influence on the interpreted meaning, only non-zero (+ or –

1) bits in the signal can have an effect on the interpreted meaning. Example (5-2) demonstrates this:

    Meaning 2 randomly selected
    Agent A generates signal vector ( + )
    Agent C receives signal ( + 0 0 )
    Signal probably interpreted as meaning 1 or 2 (depending on activation values)    (5-2)

As demonstrated above, using the agent architecture as described allows communication in heterogeneous populations. Such communication might not always be successful, but could improve the likelihood of successful communication significantly above chance. A further modification further improves the chances of successful communication.

5.2.2 Comprehension Leads Production

It has been argued (Burling, 1998) that comprehension leads production: that the ability to understand or interpret signals leads the ability to produce them. This is seen in the ability of many animals to understand commands given to them, in ape language learning and in human speech acquisition.

Accepting Burling's argument, it is desirable to also capture this feature in the model. This is also done quite simply. This is implemented by limiting the number of neurons in the language layer which can be active during language production, any inactive neurons producing a zero value. In the comprehension-leads-production model, all language agents have the same structure, but only a limited number of nodes may be active for signal production (Figure 5.2).

[Figure 5.2: a single network with a Language Layer above a Meaning Layer.]

Figure 5.2. All agents have identical structures, but hereditarily determined language production ability. All neurons are active for signal interpretation.

By implementing the changes described above, all agents are potentially able to learn to interpret all signals, including those which they are unable to produce. Examples (5-3) and (5-4) demonstrate this.
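The signal handling described in Sections 5.2.1 and 5.2.2 amounts to truncating, zero-padding or masking a bipolar vector. A minimal Python sketch follows; the function names `receive` and `produce` are hypothetical, the neural interpretation step itself is omitted, and the full signal given to agent A in the last case is an arbitrary illustrative value.

```python
def receive(signal, language_nodes):
    """Signal as seen by a receiver with `language_nodes` nodes.

    Longer signals are truncated to the receiver's nodes; shorter signals are
    padded with zeroes, which have no effect on interpretation.
    """
    padded = tuple(signal) + (0,) * max(0, language_nodes - len(signal))
    return padded[:language_nodes]


def produce(full_signal, active_nodes):
    """Signal emitted when only the first `active_nodes` nodes are production-active.

    Inactive nodes output zero (the comprehension-leads-production design).
    """
    return tuple(
        value if index < active_nodes else 0
        for index, value in enumerate(full_signal)
    )


# Heterogeneous structures: agent C (three nodes) signals to agent A (one node).
seen_by_a = receive((1, 1, 1), language_nodes=1)

# Heterogeneous structures: agent A (one node) signals to agent C (three nodes).
seen_by_c = receive((1,), language_nodes=3)

# Identical structures, limited production: A has three nodes but one active.
sent_by_a = produce((1, -1, 1), active_nodes=1)

print(seen_by_a, seen_by_c, sent_by_a)
```

Under the limited-production design every receiver has the full set of nodes, so `receive` leaves a full-length signal unchanged and only `produce` removes information.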

If a signal were sent from agent C to agent A, A would receive all of the three bit signal as all nodes are active for interpretation. This communicative episode might unfold as shown in example (5-3):

    Meaning 2 randomly selected
    Agent C generates signal vector ( + + + )
    Agent A receives signal ( + + + )
    Signal interpreted as meaning 2    (5-3)

During production, the signal is padded out with zeroes. Again, zero values presented at a language layer node will have no effect on signal interpretation when fed back, and so only the bipolar values have an effect on the interpreted meaning. Example (5-4) demonstrates this:

    Meaning 2 randomly selected
    Agent A generates signal vector ( + 0 0 )
    Agent C receives signal ( + 0 0 )
    Signal interpreted as meaning 1 or 2 (depending on activation values)    (5-4)

In this case the result is similar to the previous version (example 5-2), but where comprehension leads production poor signallers may benefit more from good signals produced by others than they would otherwise (comparing examples 5-3 and 5-1). The experiments which follow are performed using this full interpretation / limited production design.

5.2.3 Population Generations and Replacement

The population training algorithm shown in Figure 4.3 will again be used. In these evolutionary models, training and fitness evaluation is performed for one generation. A 'child' generation is created, then this replaces the parent population. Training and fitness evaluation are again carried out, and a further child generation is produced. This process is repeated many times over, during which time the evolution of the language capability of the agents can be viewed. The next few sections detail the design decisions, and parameter settings relevant to the genetic representation and agent selection and reproduction.

5.2.3.1 Genetic Representation and Reproduction

The standard genetic algorithm, GA, (Holland, 1975; Mitchell, 1996) is perhaps the most common technique used in modelling evolution. We use a GA in this model to allow the language capability to evolve over a number of generations.

In the standard GA, hereditary traits are represented by binary strings. During reproduction, two parents' bit strings are selected from which two child bit strings will be produced. The standard operators for producing the child gene strings are crossover and mutation. As a result of crossover, each child string is part produced from a portion of one parent string and the remainder is taken from the other parent. Mutation may cause individual bits of the gene string to 'flip', from 0 to 1 or vice-versa.

The only hereditary trait to be modelled is the number of language neurons active during signal production. In the experiments performed here, the number of meanings is arbitrarily set to eight. As an agent with M language neurons, plus bias, can learn unique signals for $2^M$ meanings, the representation used should allow for at least zero to three language neurons. Using three bit gene strings as the chosen representation, agents may have from zero to seven production-active language layer nodes. The standard binary representation of these numbers is as shown in Table 5.2.

0: 000   1: 001   2: 010   3: 011   4: 100   5: 101   6: 110   7: 111

Table 5.2. The standard binary representation of the numbers zero to seven.

The use of the standard binary representation in the GA has been criticised, however. For example, using the standard binary representation in the GA, it is not always possible for a value to be incremented or decremented by one with a single bit-flip. To increment from the value 3 to 4, all three binary bits must change. Thus, many genetic changes may be required to produce a small change in the phenotypic form.

One alternative is to use a Gray coded (Gray, 1953) binary representation to store values in the genome, and this has been recommended for use with the GA by a number of authors (for example, (Belew et al., 1991) and (Caruna and Schaffer, 1988)). The Gray codes for the values 0 to 7 are shown in Table 5.3.

0: 000   1: 001   2: 011   3: 010   4: 110   5: 111   6: 101   7: 100

Table 5.3. Gray coded binary representation of the numbers zero to seven.

Using Gray codes, as shown above, only one bit change is ever required to increment or decrement a value by one. With reference to the GA, this confers the possible advantage of having similar phenotypic forms possess similar genetic representations. For modelling evolution, as opposed to the more general use of GAs in optimisation, are incremental changes particularly to be sought after?
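The contrast between Tables 5.2 and 5.3 can be reproduced with the usual reflected binary Gray code construction. The Python sketch below is illustrative only; the function names are not from the original text.

```python
def to_gray(value: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def from_gray(code: int) -> int:
    """Inverse of to_gray: recover the integer from its Gray code."""
    value = 0
    while code:
        value ^= code
        code >>= 1
    return value


def bit_flips(x: int, y: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(x ^ y).count("1")


# Three-bit codes for the values 0 to 7.
binary_table = [format(v, "03b") for v in range(8)]
gray_table = [format(to_gray(v), "03b") for v in range(8)]

# Bit flips needed to step from v to v + 1 under each representation.
binary_steps = [bit_flips(v, v + 1) for v in range(7)]
gray_steps = [bit_flips(to_gray(v), to_gray(v + 1)) for v in range(7)]

print(gray_table)
print(binary_steps, gray_steps)
```

Every step between neighbouring values costs one bit flip in the Gray code, while the standard binary step from 3 to 4 costs three.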

In nature, relatively minor genetic changes can result in significantly different phenotypic forms. In general, however, individuals with similar genes may be expected to have more similar phenotypes than those with dissimilar genes. If we wish our model to capture this feature of the natural world then a genetic representation which captures the adjacency feature provided by Gray codes is to be preferred over the standard genetic representation.

Although it possesses the benefit of adjacency, Gray codes may not be the best alternative. Using the standard representation, the largest change a single mutation could cause would be an increase of 4, for example from 0 to 4 (Table 5.2). Only one mutation is required to jump from the smallest possible value to the largest in a Gray coded gene string, from 0 to 7 in this case. It is therefore possible for even more dramatic discontinuities in evolution than are possible with the standard representation.

The best option may be to dispense with a binary representation altogether. For example, Mühlenbein and Schlierkamp-Voosen (1993) detail various crossover and mutation operators for gene strings composed of real rather than binary values. This use of real values, rather than binary bits, in a genetic string has been used for many years in attempting to solve numerical optimisation problems under the name Evolution Strategies, ESs, an approach developed in the early seventies by Rechenberg (Mitchell, 1999).

However, our model uses discrete rather than continuous values to determine the evolved linguistic ability, with integer numbers representing the number of neurons active in signal production. Accordingly we can use integer valued rather than real valued gene strings to represent the innate linguistic ability of each of the agents. This will provide the adjacency benefit of Gray codes but without the possibility of single mutations causing such large changes.

Thus we have two different possible genetic representations for language ability: the standard binary representation, and an integer representation. In our first experiments we will see the effect of choosing one of these representations over the other, but first the mechanisms of selection, mating and population replacement common to both versions will be detailed.

5.2.3.2 Selection, Mating and Replacement

Other model details requiring elaboration surround the processes of selection and mating. The populations in all experiments described in this chapter are spatially arranged as described in Section 4.9, with the standard deviations used noted in each case. The

replacement and mating algorithm has been selected such that the spatial relationships between agents are maintained not just for learning and fitness evaluation, but also for mating and the placement of child agents into the succeeding population. The mating and replacement algorithm is as shown in Table 5.4.

For population size P, repeat P/2 times:
    Select parent, p1, randomly according to fitness
    Select parent, p2, randomly according to neighbourhood around p1
    Mate p1 and p2, producing children c1 and c2
    Place c1 and c2 in child population
Replace parent population with child population

Table 5.4. The mating and population replacement algorithm.

Every agent has a fitness value calculated. The fitness measure used is the percentage communicative success measure described in Section 4.8. Agents are then selected for reproduction, with each agent having a chance of selection equal to the agent's fitness score over the total fitness score for the whole population. An agent may be selected as the first partner for mating more than once. The distance measure used for selecting a second mating partner is in all cases the same function as used for selecting a partner for signal learning or fitness measurement.

The replacement algorithm, for adding child agents to the new population, works a little differently. This replacement algorithm serves to reinforce and maintain the spatial relationships within the model. Recalling the linear arrangement, if the first parent is 23rd in the line in its generation, the replacement algorithm will attempt to place the children in the 23rd and 24th positions in the child population. If either or both of these slots are already occupied, then the replacement algorithm iterates incrementally along the line until empty slots are found for both children.

Note that there is no other interaction between generations in this model. In particular, the language negotiation process starts afresh in each generation, with no input from past generations. Over time, it is this ability to negotiate language that is being evolved.

5.2.3.3 Crossover and Mutation

Reproduction in GAs utilises two key operations: crossover and mutation. The workings of these operators depends on the method of representation, as well as on design decisions (for example, see Mühlenbein and Schlierkamp-Voosen, 1993).

This is sufficient to store any integer value in the range [0. During mutation additional random changes to the child strings may occur.7) . the fitness of a group of agents revealing how well the learned communication schemes have been negotiated for the sharing of information. as here. 5. with a portion of the gene string of each parent going to each of the offspring. Over the course of many generations the language capability of the agents will evolve. (5. a3} .3 Modelling Language-Physiology Coevolution.2. b3 } To produce C. The crossover and mutation operators used are quite standard implementations for use with the GA. The extent to which agents are ‘physiologically’ adapted for communication is revealed by the number of active language production neurons each possesses. crossover is performed such that. Part I The first experiments are to be simple demonstrations of the co-evolution of language and physiology.Chapter 5 – The Biological Evolution of Language 88 During crossover. 5.7]. Where there will only be 8 possible meanings. for each bit in the child string: (5.6) ci = {a i } or {bi } Where ai is chosen if i <= N. the two parent gene strings are copied. and with 7 language neurons an agent can potentially learn 27 (128) different signal-meaning pairs. is a random integer chosen uniformly from the set [1. b2 . these are expanded upon later. and with it the signal repertoires available to them. The second child is a complement of the first in drawing each bit from the other parent: d i = {bi } or {a i } Where bi is chosen if i <= N.3. else ai is chosen. else bi is chosen. the crossover point. For crossover. These will mate to produce two child strings C and D.5) A = {a1 . a2 . initially using standard binary codes in the GA.1 Crossover and Mutation Using the standard binary representation. assume two parent strings A and B. N. the gene string of each agent will be just three bits long. 
As the precise details of implementation depend on the chosen genetic representation. B = {b1 .3]. The ability to successfully negotiate useful communication schemes in each generation being the driving force for the evolution of the ANN. (5. this should be quite ample.
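The mating and replacement algorithm of Table 5.4, combined with the single-point crossover of Equations 5.5 to 5.7, might be sketched as follows. This is a minimal illustration rather than the thesis implementation. Several details are assumptions: the neighbour offset is taken to be a rounded draw from a normal distribution, out-of-range partners are redrawn, the search for an empty child slot wraps around the end of the line, and the population size is even. All function names are hypothetical.

```python
import random


def select_by_fitness(fitness, rng):
    """Roulette-wheel selection: chance of selection is fitness / total fitness."""
    return rng.choices(range(len(fitness)), weights=fitness, k=1)[0]


def select_neighbour(index, size, sd, rng):
    """Pick a different agent at a normally distributed offset (assumed scheme)."""
    while True:
        offset = round(rng.gauss(0.0, sd))
        partner = index + offset
        if offset != 0 and 0 <= partner < size:
            return partner


def crossover(a, b, rng):
    """Single-point crossover: child C takes a_i for i <= N, else b_i; D is the complement."""
    n = rng.randint(1, len(a))  # crossover point N, uniform on [1, len(a)]
    return a[:n] + b[n:], b[:n] + a[n:]


def place(slots, start, child):
    """Put `child` in the first empty slot at or after `start` (wrapping is assumed)."""
    size = len(slots)
    for step in range(size):
        pos = (start + step) % size
        if slots[pos] is None:
            slots[pos] = child
            return pos
    raise ValueError("no empty slot left in child population")


def next_generation(genes, fitness, sd, rng):
    """One pass of Table 5.4: P/2 matings, children placed near the first parent."""
    size = len(genes)
    slots = [None] * size
    for _ in range(size // 2):
        p1 = select_by_fitness(fitness, rng)
        p2 = select_neighbour(p1, size, sd, rng)
        c1, c2 = crossover(genes[p1], genes[p2], rng)
        pos = place(slots, p1, c1)   # aim for the first parent's position
        place(slots, pos + 1, c2)    # then the next free slot along the line
    return slots


rng = random.Random(0)
genes = [(0, 0, 1)] * 10      # ten agents, each with one active node (binary 001)
fitness = [50.0] * 10         # placeholder fitness scores, must not all be zero
children = next_generation(genes, fitness, sd=1.0, rng=rng)
```

Placing each pair of children at or near the first parent's position is what keeps related agents close together in the succeeding population.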

Mutation is applied after crossover has generated the two child strings. The mutation rate gives the chance of a mutation occurring at each position in the string. Each mutation that occurs inverts the binary value at that position.

5.3.2 Parameters and Settings

Table 5.5 lists the parameters and settings for the first coevolution test.

    Parameters for coevolution (using standard binary representation)
    Mutation rate            0.005
    Learning rate, α         0.2
    dα (α / t)               0.001
    Population               100
    Training rounds, t       200
    M                        8
    Standard deviation       1.0

Table 5.5 Parameters and settings

The initial population of agents all have N = 1 (only one language layer node active for signal production). All agents have M = 8 nodes in the meaning layer. The standard deviation determines an agent's neighbours, as previously described. This gives a chance greater than 80% of selecting an immediate neighbour, with over 98% of selections being within a small neighbourhood of two neighbours on either side.

The mutation rate was selected so as to produce one or two mutations per generation. This might be considered a low rate of mutation were the GA being used for some optimisation problem, but here we wish to limit the number of significant mutations which occur in each generation of what is quite a small population.

Although an agent does not need active language production neurons to score highly on fitness, it does require that its neighbours have them, fitness being evaluated on the basis of how well received signals are understood.

5.3.3 Discontinuous Evolution of Signalling Ability

With the model as described, a typical run using the parameters shown in Table 5.5 produced the results shown in Figure 5.3. Displayed in Figure 5.3 are the averages of agents' fitness scores, and of agents' number of language neurons active in signal production. The first demonstrates the effect of the second. In this figure the gradual evolution of language ability in the population is apparent. In a test run, over the course of 150 generations, the average number of active signal production neurons rises from 1 to 6, and the average success at interpreting signals from 20 to 80 percent (Figure 5.3).
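The bit-flip mutation operator and the arithmetic behind the parameter choices can be sketched as below. The mutation function follows the description in the text. The neighbourhood calculation is only one possible reading, under which the quoted figures of over 80% and over 98% come out: it assumes that a partner's offset along the line is a rounded draw from a normal distribution with the stated standard deviation. That mechanism, and all names used, are assumptions rather than details confirmed by the source.

```python
import math
import random


def mutate_bits(bits, rate, rng):
    """Invert each bit independently with probability `rate`."""
    return tuple(1 - b if rng.random() < rate else b for b in bits)


def expected_mutations(rate, genes_per_agent, population):
    """Expected number of mutations in one generation."""
    return rate * genes_per_agent * population


def prob_within(k, sd):
    """P(|rounded normal offset| <= k), i.e. P(|x| < k + 0.5) for x ~ N(0, sd)."""
    return math.erf((k + 0.5) / (sd * math.sqrt(2.0)))


rng = random.Random(0)
child = mutate_bits((0, 0, 1), rate=0.005, rng=rng)

# 0.005 per bit x 3 bits x 100 agents gives about 1.5 mutations per generation.
per_generation = expected_mutations(0.005, 3, 100)

# With standard deviation 1.0: roughly 0.87 within one position, 0.99 within two.
within_one = prob_within(1, 1.0)
within_two = prob_within(2, 1.0)
```

The expected count of about 1.5 agrees with the stated aim of one or two mutations per generation.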

[Figure 5.3 Coevolution of language and physiology. Average fitness and average number of nodes (x 10) plotted against generation, 0 to 140.]

The graph shows a sudden jump in the linguistic ability and in the fitness of agents quite early on. As the communication capability increases (number of nodes), so does the success rate (fitness). This appears to be a significant discontinuity. The graph is re-plotted for the first 40 generations only, in Figure 5.4, to allow a closer inspection.

[Figure 5.4 Detail from beginning of Figure 5.3. Average fitness and average number of nodes (x 10) plotted against generation, 0 to 35.]

In just ten generations (the 15th to the 25th) the average number of active language layer nodes rises from just over one to almost 5. With only one or two mutations to be expected per generation, the impression that relatively few mutations have produced rapid and major evolutionary development is confirmed. Rather than a slow gradual increase in linguistic ability, it seems that a very small number of mutations have led to major changes to the fitness of those affected individuals, and/or their partners in language negotiations and evaluation. This has resulted in a small number of individuals having much higher fitness, allowing the better adapted genes to rapidly take over the population.

However this is only the result of one experimental run. The runs themselves are highly random with many stochastic elements: which individuals get picked, which meanings they attempt to communicate and what signals they initially send before training.

Consequently, a large number of tests are required before the results can be considered at all robust or typical. Figure 5.5 shows the results of a further nine runs using the same experimental setup as for Figure 5.3.

[Figure 5.5 A further 9 test runs with same parameters as Figure 5.3. Nine panels, each plotting Fitness / Nodes against Generation.]

The graphs show some variation. However, despite the individual differences, the overall result of significant improvement in communicative ability is consistent. In all cases the improvement is relatively sudden, with the average number of language nodes in a population increasing from one to five over the course of 30 generations or less. This consistent sharp increase implies that this model supports the discontinuous view of language evolution (see discussion in Chapter 2).

It is apparent that, due to the genetic representation used, only one mutation is required to allow a jump from one language node to three or five nodes (Table 5.2), allowing for some improvement in language negotiation. Once such a mutation, favoured due to the implementation details, has occurred the new gene succeeds quickly in the population. A second mutation might increase the number of nodes to seven. After the initial rise in the number of nodes, however, there is less significant selective advantage to be gained from further increases, and some of the results show this distinct stepping (Figure 5.5 top right, middle centre and, more clearly, bottom left).

5.4 Modelling Language-Physiology Coevolution, Part II

The model has demonstrated the coevolution of the agent physiology with the agent communication schemes. To prove that the observed discontinuity is indeed a consequence of the model's implementation, the experiment must be repeated. Here integers are used directly in the genetic representation, rather than binary numbers. Thus, the gene string for each agent consists of only a single integer. Different implementations are required for the crossover and mutation operators than those already presented, and these new operators are described next.

5.4.1 Crossover and Mutation

Inheritance works quite differently in the integer evolution strategy used from that in the previous model. As there is only one hereditary trait, there is no crossover of genetic material at all. Given two parents A and B, and two children C and D, each child will inherit its single integer gene string from a different parent:

\[ C = A \text{ and } D = B \quad \text{or} \quad C = B \text{ and } D = A \tag{5.8} \]

There is a 50% chance of either outcome.

As before, the mutation rate determines the likelihood of a mutation occurring, for each integer in a gene string. When a mutation occurs, the integer value is modified in some way, either incremented or decremented by 1, with an equal chance of either occurring. Unlike the outcome of inverting a bit in a binary representation, this form of mutation is potentially open-ended, with very large or even negative values possible. The results of mutation can be limited by the operator, which here forces any negative result to 0, and any value greater than 7 back to 7.

5.4.2 Parameters and Settings

Table 5.6 lists the parameters and settings for the second coevolution test.

    Parameters for coevolution (using integer gene strings)
    Mutation rate            0.015
    Learning rate, α         0.2
    dα (α / t)               0.001
    Population               100
    Training rounds, t       200
    M                        8
    Standard deviation       1.0

Table 5.6 Parameters and settings

Again, the initial population of agents all have N = 1. The mutation rate is three times the previous value, but as the gene strings are now one-third the previous length, the number of expected mutations per generation has been kept constant. The nature of the mutations has changed, as described above.
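The integer inheritance rule of Equation 5.8 and the clamped increment/decrement mutation can be sketched as below. The function names and the use of Python's random module are illustrative choices, not details of the thesis implementation; the limits 0 and 7 and the rate 0.015 are taken from the text.

```python
import random

MIN_NODES = 0
MAX_NODES = 7  # upper limit on active language nodes, as stated in the text


def inherit(a, b, rng):
    """Each child copies one parent's integer; the assignment is a fair coin flip."""
    return (a, b) if rng.random() < 0.5 else (b, a)


def mutate_int(value, rate, rng):
    """With probability `rate`, step the value by +1 or -1, clamped to [0, 7]."""
    if rng.random() >= rate:
        return value
    step = 1 if rng.random() < 0.5 else -1
    return max(MIN_NODES, min(MAX_NODES, value + step))


rng = random.Random(0)
c, d = inherit(1, 4, rng)
c = mutate_int(c, rate=0.015, rng=rng)
d = mutate_int(d, rate=0.015, rng=rng)

# 0.015 per gene x 1 gene x 100 agents is about 1.5 expected mutations per
# generation, matching the binary model (0.005 x 3 x 100).
expected_per_generation = 0.015 * 1 * 100
```

Because a mutation can move the value by at most one, reaching seven active nodes from one requires at least six separate mutations, in contrast to the one or two bit flips that suffice under the binary representation.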

5.4.3 Continuous Evolution of Signalling Ability

A typical run using the parameters shown in Table 5.6 produced the results shown in Figure 5.6.

[Figure 5.6 Coevolution of language and physiology. Average fitness and average number of nodes (x 10) plotted against generation, 0 to 500.]

In this figure, the gradual evolution of language ability in the population is apparent. As before, as the communication capability increases (number of nodes), so does the success rate (fitness). In this test run, over the course of 500 generations, the average number of active signal production neurons rises from 1 to 6, and the average success at interpreting signals from around 20 to over 80 percent.

A further nine sets of results from runs of 250 generations using the same parameters and setup are shown in Figure 5.7. Despite individual differences the overall result of significant, yet gradual, improvement in communicative ability is consistent. Where large physiological changes are not possible due to single major mutations, it seems that a succession of minor adaptations spread through the population.

The coevolution of language and physiology has now been demonstrated using two different versions of the model. The models can be used to support either the discontinuous or the continuous EoL, and firm conclusions are hard to draw from these results. Other settings and parameters can be varied however, to try to give some idea of the effects different natural conditions may have had on the evolution of language in nature. In particular, by varying the neighbourhood size and repeating the experiment, we can determine if spatial selection (our chosen approximation to kin selection) has a significant effect. This is done for the following test.

[Figure 5.7 A further 9 test runs with same parameters as in Figure 5.6. Nine panels, each plotting Fitness / Nodes against Generation.]

5.5 The Effect of Neighbourhood Size

Spatial selection is one mechanism by which the likelihood of the evolution of cooperation may be improved. Like kin selection, such a mechanism can be responsible for the emergence of cooperative behaviour in situations where individuals may gain more by defecting than by cooperating.

In the model we have described, we can consider those agents which provide good and distinct signals (requiring a larger number of language nodes) to be cooperators, and those which provide poor signals (requiring fewer or even no language nodes) to be defectors. Currently, defection does not benefit the agents in any way and so, even in the absence of spatial selection, cooperation should succeed in the population. If there is no benefit in defecting but some (even indirect) benefit of cooperation we should expect cooperation, as evidenced by success in communication, to succeed, even with a weakened spatial selection.

5.5.1 Parameters and Settings

This experiment is a repeat of the previous one (with integer representation), with only one change. The parameters are those listed for the previous experiment, except for the standard deviation which is set to 6.0.

5.5.2 The Effect of Large Neighbourhoods

We display the results of nine runs with large neighbourhoods in Figure 5.8.

[Figure 5.8: Coevolution with weakened spatial selection strength. Nine panels, each plotting Fitness / Nodes against Generation.]

There is greater variation in these results than previously observed. But with no advantage in defection, it is seen that in most cases the communication abilities and success rates of the populations do eventually improve and progress towards the values seen in the previous two experiments. What appears to happen is that the evolution of language is not prevented, but it is slowed down. This will be due to a decrease in the selective advantage of being a good signaller.

As the model only rewards agents directly for successfully interpreting signals, the benefit a good signaller might receive is indirect. Such a signaller might improve the fitness of a neighbour, and in turn be selected as a mate for that neighbour if the neighbour has been selected. That neighbour, being a successful receiver, will have a greater chance of selection. With larger neighbourhoods, an agent will receive few signals from a lone good-signaller within its neighbourhood. The signaller also has a lower chance of being selected as a mate by any of the agents it has provided with good signals. Thus, the advantages of cooperation are lessened, and the evolution of language inhibited.

The next step is to see the effect spatial selection has where there is a cost associated with being a good signaller.

5.6 Coevolution with a Costly Language Ability

As reviewed in Chapter two, there are costs associated with the ability to use language. Such costs can be added to the model by applying a fitness penalty to each agent dependent on the number of active signal production neurons, N, it possesses. After evaluating an agent's fitness (Figure 4.6), the fitness penalty is applied. For the experiments performed here, the fitness penalty, f(N), is:

\[ f(N) = N^2 \tag{5.9} \]

This penalty has been arbitrarily chosen to penalise agents with larger numbers of active signal production neurons, to reflect the supposed adaptive costs of larger brain size. The resultant fitness score is used when selecting parent agents to form the succeeding generation. Other than this simple change, no other modifications are required.

5.6.1 Parameters and Settings

This experiment repeats the experiment of Section 5.4, with the addition of a fitness penalty as described above. An integer representation is again used in the gene strings. To test the effect of spatial selection, the experiment is repeated for the following different values of neighbourhood standard deviations: 0.2, 1, 3, 6 and 12.

5.6.2 Spatial Selection and Costly Language Ability

Results from using a standard deviation of 1 or 0.2 are shown in Figures 5.9 and 5.10 respectively. The fitness scores shown are plotted after the fitness penalty has been applied, and are consequently lower than the actual communication success rate achieved. These results are clearly poorer than shown in Figures 5.6 and 5.7.
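The fitness penalty of Equation 5.9 can be tabulated with a few lines of code. The sketch assumes that the penalty is subtracted from the percentage communicative success score and that the result is floored at zero so that it remains usable as a selection weight; neither detail is stated explicitly in the text, and the function names are hypothetical.

```python
def fitness_penalty(active_nodes: int) -> int:
    """Equation 5.9: the penalty is the square of the number of active nodes."""
    return active_nodes ** 2


def penalised_fitness(success_percent: float, active_nodes: int) -> float:
    """Success score minus penalty, floored at zero (subtraction and floor are assumptions)."""
    return max(0.0, success_percent - fitness_penalty(active_nodes))


# Penalty for each possible number of active signal production neurons.
penalties = {n: fitness_penalty(n) for n in range(8)}
print(penalties)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49}

# An agent with 6 active nodes and 80% communicative success scores 44 under this reading.
print(penalised_fitness(80.0, 6))  # 44.0
```

The quadratic form means the first node costs a single point of fitness while the seventh costs thirteen more than the sixth, so the penalty weighs most heavily on the redundant nodes.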

[Figure 5.9: Coevolution with a costly language ability. Neighbourhood defined by a normal distribution of standard deviation 1. Showing average fitness and nodes x 10 over 250 generations. Nine panels.]

[Figure 5.10: Coevolution with a costly language ability. Neighbourhood defined by a normal distribution of standard deviation 0.2. Showing average fitness and nodes x 10 over 250 generations. Nine panels.]

A slow and fairly steady improvement is observed in all eighteen sets of results, and there is little difference between the results for the two different values of standard deviation. The average success rate (not fitness) over all nine runs for each value of standard deviation is shown in Figure 5.11. The minor difference between the results is shown more clearly.

[Figure 5.11: Average success rates for costly communication with neighbourhoods of standard deviation 0.2 (top line) and 1.0 (bottom line), plotted against generation, 0 to 200.]

With the smaller neighbourhood size, the force of spatial selection is stronger and the evolution of linguistic ability enhanced slightly. This seems to show that spatial selection does have an effect. This result may not be particularly significant however, as it is taken from just nine runs under each set of conditions, and may be influenced unduly by stochastic effects.

For further evidence, we then repeat the experiment with increasing values of standard deviation. Figures 5.12, 5.13 and 5.14 show the results for standard deviations 3, 6 and 12 respectively.

[Figure 5.12: Coevolution with a costly language ability. Neighbourhood defined by a normal distribution of standard deviation 3. Showing average fitness and nodes x 10 over 250 generations. Nine panels.]

[Figure 5.13: Coevolution with a costly language ability. Neighbourhood defined by a normal distribution of standard deviation 6. Showing average fitness and nodes x 10 over 250 generations. Nine panels.]

[Figure 5.14: Coevolution with a costly language ability. Neighbourhood defined by a normal distribution of standard deviation 12. Showing average fitness and nodes x 10 over 250 generations. Nine panels.]

Reviewing the results presented in Figures 5.9 to 5.14, it can be seen that as the neighbourhood size increases the average linguistic ability that evolves gets progressively poorer.

With small neighbourhoods a significant improvement is observed in the number of active language neurons and in the average communicative success rate. With very large neighbourhoods, the population is unable to even maintain the initial level of linguistic ability. This is despite a very small fitness penalty for having only one active language production neuron: only one point of fitness. Under these conditions the average communicative success approaches a pure chance level of success of 1 in 8 or 12.5% (Figure 5.14).

These results show clearly the strong influence that spatial selection could have on the evolution of linguistic ability. The cost of such an ability is borne by the individual speakers, while the benefits only arise from the interactions of multiple speakers.

Spatial selection affects the model in a number of ways.

• During language negotiation, agents within the neighbourhood of good signallers will receive better signals, and can learn to interpret them successfully.

• During fitness evaluation, signals are received from agents within the neighbourhood of the agent under test. Thus the signalling ability or signals used by agents distant from the agent being evaluated do not influence the evaluation.

• During mate selection and mating, once an agent has been selected for reproduction by fitness, its partner will be selected randomly from those within the same neighbourhood.

The result of these effects is that, where neighbourhoods are smaller, the benefits a good signaller provides to its neighbours are more likely to be reciprocated, in that the neighbours will be more likely to use a compatible communication scheme, and will be more likely to pick the good signaller as a mate. Thus the spatial selection can help overcome the fitness penalties suffered by altruistic individuals.

It is also not required that all the agents in the population use the same signals, and it appears that during negotiation local clusters of signal dialects emerge. This is investigated more thoroughly in the following chapter.

The simulation runs described in this section all used an integer representation for the agent genes (Section 5.4). Using a binary representation (Section 5.3) does not alter the outcome in any significant way (other than appearance of discontinuities, discussed further below). While not repeated here, the results of simulation runs using the binary representation are discussed in (Livingstone and Fyfe, 2000).

5.7 Discussion

5.7.1 Continuous versus Discontinuous Evolution of Language

As noted in Chapter 2, there has been recent debate over whether the EoL has been the result of a single 'macro-mutation' or the result of a continuous process of gradual change and adaptation. The experiments presented seem to support each of these positions in turn. The model in Section 5.3 demonstrates the discontinuous evolution of linguistic ability. Due to the implementation details of the model, the evolution of linguistic ability proceeds more by large jumps than by incremental steps. Making a small change to the model, Section 5.4, the evolution of linguistic ability proceeds in small incremental steps.

Rather than provide evidence to support one side of the continuity-discontinuity debate, the model appears to give some evidence against arguments that either is not possible. However, the continuity-discontinuity debate appears to be settling down and largely resolved in favour of a position that accepts a large degree of continuity (Aitchison, 1998). Even some of the most notable opponents of the continuous EoL have in recent years modified their arguments considerably, accepting elements of continuous evolutionary theory into their thoughts (compare (Bickerton, 1984) and (Calvin and Bickerton, 2000)). The settling of this debate has also occurred with little regard to input from computational modelling based research.

Whether the EoL was continuous or not, a time where individuals possessed different degrees of linguistic ability may have existed. Our model demonstrates the successful use of communication in populations of heterogeneous language ability.

5.7.2 Spatial Selection

The model is perhaps more successful at demonstrating the need for additional selection mechanisms beyond natural selection to account for the EoL. That natural selection alone is insufficient in many cases to account for the evolution of altruistic behaviour is well known. Many studies have used theoretical or computational models to argue for the effectiveness of additional selection mechanisms in enabling the evolution of cooperation (several have already been cited, see Chapter 2). With the assumptions that language can be costly to the individuals equipped with the ability to produce language, as previously argued, it is perhaps to be expected that additional selection mechanisms are required.

We have demonstrated the EoL with spatial selection, but the scientific value of this alone is questionable. It is also possible that other mechanisms influenced the natural EoL, such as kin selection (Hamilton, 1964). See Di Paolo (1999a) for a detailed exploration of the application of kin and spatial selection in artificial life models. Despite the existence of other models that show how spatial or kin selection can lead to the evolution of cooperation (as mentioned in chapter 2), we believe that this is the first demonstration of the effect of spatial selection on innate traits which have an acquired expression: an innate ability which can lead to cooperation only as the result of a learned behaviour.

5.7.3 Redundancy and Linguistic Ability

A further observation, which can be made when reviewing the results of the various experiments presented in this chapter, is that, in many cases, the populations evolve a redundant language capability. That is, in those experiments where language negotiation/evolution succeeds, the average number of language production neurons exceeds the number required to produce a different distinct signal for each of the possible meanings. The minimum required number of language nodes for successful communication is just three. In contrast, the average number of nodes exceeds this, as shown in Table 5.7, even when there is a cost applied to the signalling ability.

    Experiment                                  Average number of nodes
    Without cost, binary representation         6.2
    Without cost II, integer representation     6.3
    With cost, sdev = 1                         3.8
    With cost, sdev = 0.2                       4.2

Table 5.7 The average number of nodes (over all runs shown) in the language layer in agents from the final generation of experiments shown in sections 5.4 and 5.6.

That redundancy has an important role to play in language is recognised in linguistics. Without redundancy, signals would be very susceptible to disruption from noise in the environment, and too easily distorted (e.g. Crystal, 1987 p.146). Pinker (1994, p181) refers to Quine (1987) on this aspect of redundancy, who also argues that it provides a fallback or failsafe against disruption. This appears to be the main role of redundancy in existing explanations.

There is no noise present in our model however, and for redundant signalling capability to emerge repeatedly, it would appear to be serving some additional purpose. There are in fact two such possible benefits of redundancy in the signalling capabilities of the agents.

First, redundancy may allow the agents to successfully interpret different signals which are used to represent the same meaning by different neighbouring agents. We look at dialects in more detail in the following chapter, where we explore the effect that introducing neighbourhood limited communication has on the model. 'Linguistic junk', as Lass terms it, allows speakers of differing dialects (or different variants of a common dialect) to communicate more successfully than would otherwise be possible as it may provide supplementary clues as to what is being said. Lass (1997, Ch. 6) discusses this use of redundancy in language.

The second, related, explanation is that redundancy makes the signalling scheme/language learning task easier and more likely to lead to success (evidence of this is found in the results in Chapter 4 where it can be seen that the addition of additional, redundant, language nodes improves the agents' success at learning a common signalling scheme). That redundancy in representational capability can improve learning is known to researchers in ANN (e.g. Barlow, 2001), but does not appear to be widely recognised as an important benefit of redundancy in human language learning, and is not mentioned by any of the works cited in this section. This perhaps warrants further, future, investigation.

5.7.4 Investigating The Adaptive Benefits of Language

Throughout the experiments presented in this chapter we have made the, not insignificant, assumption that language is used cooperatively, ignoring the selfish applications of language. In doing so we have also abstracted the actual use of language considerably away from any specific application, simply assuming that it is beneficial for agents who are able to comprehend the signals produced by others, and reward listeners for correctly interpreting the signals they receive. This simplification may appear questionable, but relaxing the assumption of cooperative communication is problematic, not just for computer modelling but also in more theoretical approaches to the evolution of communication (Bullock, 2000).

To justify this assumption necessitates a further review of some relevant literature. This concentrates on other ALife work, in attempting to demonstrate the particular problems that may occur when attempting to relax this assumption in agent based simulations.

First, the assumption is explicitly included by MacLennan (1991) in his innovative work on modelling the evolution of communication, where he limits his definition of communication to signals produced by individuals where the signal is of benefit (or, presumably, intended benefit) either to the signaller or to others of its group. While this is not part of the definition of language used in this thesis, the models here use a similar assumption.

In Chapter 2 we reviewed the work of Krebs and Dawkins (1984) that illustrates some situations that challenge the assumption that communication is necessarily cooperative in nature, such as manipulative signals that benefit the signaller but not the receiver. Including the possibility that communication may be used for both deceptive and cooperative purposes in a model of the evolution of language will lead to a complicated model, difficult to develop and to interpret. Some means of determining the honesty of different agents and the degree by which agents trust other agents would be required. This is far from the minimal models that we seek to build, and beyond the scope of this thesis.

A possible resolution of this is to assume a distinction between cooperating in the development of a shared language versus actual cooperative behaviour (Grice, 1975; Lee, 2000). To be able to communicate at all, to cooperate or to compete, requires cooperation in the distinct process of language negotiation. Whether agents are cooperating or competing, their ability to understand linguistic communication is indicative of their ability to use language for their own benefit. To some extent, this avoids rather than solves the problem that real or simulated agents should be capable of using language for non-cooperative ends.

5.7.5 Embodied Communication

Alternatively, we might accept that the assumption of cooperative behaviour is required to make the construction of our models tractable, but might not wish to reward agents for simply being able to understand the signals generated by each other. It may seem more realistic, and somehow better, to have models in which the real world benefits of language use are represented more explicitly than in the model which has been presented in this chapter. This requires 'embodying' the agents in some form of artificial environment to some extent (some examples follow).

Rewarding the successful replication of internal states (an abstraction of the ability to understand what message has been sent) has to be replaced by somehow rewarding the external behaviour of agents, this behaviour being affected by attempted communication. Success at tasks within the artificial world, possibly finding mates or gathering resources, leads to improved reproductive chances. Where the use of communication can lead to more successful behaviours, it should be possible for the communication strategies or abilities to evolve. This approach has its own limitations as the following examples help illustrate.

Werner and Dyer (1991) evolved communication in an artificial ecology in which a communication protocol allowed immobile females to emit signals to guide blind males to them for mating. Mating and reproduction occurs when the task is successfully solved, rather than relying on external measures of fitness. The authors claim that the XGA – a genetic algorithm with mating the result of success in the artificial environment – is more realistic than arbitrary fitness measures. The lack of realism in the design of the model might challenge this idea.

The environment is such that matching signal/response pairs will lead to greater success at the mating task. Poor communication will not necessarily lead to failure however as a lucky male might still find a female anyway. But overall, this is the same as what is approximated by an external fitness function with roulette wheel selection. In both, successful communication leads to higher chance of reproduction, and in both it is possible for poor communicators to succeed nevertheless, to the extent that it is not clear that any significant difference would occur by replacing the XGA with such a selection scheme.

Additionally, signalling can serve only one purpose in this model – and this is predetermined. There is no room for alternative strategies to evolve which might lead to greater success other than to communicate. This model successfully demonstrates the evolution of signalling in an embodied model, with tightly constrained sets of possible signals and responses.

A work which combines the embodied evolution of communication with a non-embodied selection mechanism – in this case one based on elitism – is presented by Cangelosi and Parisi (1998), who evolve language in a population of ANN. The ANN based agents move round an environment in which there are mushrooms of poisonous and nutritious varieties. The task selected for language learning is for agents to be able to inform each other whether mushrooms are edible or are poisonous (as the ANN agents consume mushrooms they gain fitness rewards or penalties accordingly). Once all have had a period of testing in the artificial world some will be selected for reproduction according to their fitness. The learning task, however, is achieved by having the agents move around an environment. Yet it is unclear what effect, if any, this has on the results, and the decision to eat or reject mushrooms could equally well be undertaken without this extra level of detail, leaving an equivalent minimal model.

Despite the assertion of Werner and Dyer that embodied models are required for modelling the evolution of communication, there is a significant limitation, one that is present in these models. In these the communication is tied to a particular behaviour,
chosen by the experimenter. If we are attempting to explain the evolution of language, then this could be as serious a problem as not having any specific adaptive benefit from the use of signals. Demonstrating how language might have evolved to serve a particular function borders on presenting a ‘just-so story’, proclaiming the original reason for the evolution of language, even though this may not be the intent of the author. Such embodied models provide a host of alternative possible explanations for the evolution of language with little way of choosing between them.

Accepting the great complexity of arguments over the origin of language ability in humans (as reviewed in Chapter 2) and the very many functions it serves, embodying language such that it provides a single particular benefit is no better than making the sweeping assumption that language is of benefit to those who use it (to transmit or to receive information).

The concept of building a model which allows agents to evolve communication to fulfil a wide variety of uses, which individually and combined provide adaptive benefits but are not predetermined, is not inconceivable. It is, however, considerably more complex than any model which has been implemented to date. Attempts to embody communication in an environment without a predefined role or purpose have to date had little success (e.g. Werner and Dyer, 1993).

Further, the constraints in such models are such that by providing only one possible benefit of successful communication, and by enabling the emergence of language, the result is largely pre-determined and predictable: that language/communication can evolve to provide the stated benefit. Negative results provide a useful contrast and, as noted by Hurford (1992), are required to highlight what the requirements are for positive results, but such failures to evolve communication or language are missing from many of these models. One reason for the lack of negative results is that there is no cost associated with communication or cooperation. If communicating is free, and cooperation brings no loss of fitness but successful cooperation brings benefits then the evolution of communication or language is to be expected – particularly given the constraints placed on the range of possible evolutionary changes.

This relates to problems in using ALife as a means to understanding the evolution of self-replicating systems as noted, at some length, by Taylor (1998). The environment and the rules governing interaction and replication provided by the computer program constrain the possible evolution that can occur, even where the results surprise the experimenter
who built the model.

As argued, a lot of what occurs in some embodied models is simply ‘window dressing’, which, stripped away, may leave an equivalent, simpler, model without embodiment or the attached overheads of such modelling, which can be significant in time spent developing and running simulations. The lack of clear benefits of embedding models compares poorly with the advantages of not embodying the communication.

So, while it may be attractive to remove as many assumptions as possible, and to embody a simulation of evolution within a ‘realistic’ model that includes an environment modelled on reality, doing so does not necessarily improve a model. With the additional complications that arise it is quite reasonable to fall back on some common assumptions. But such simplifications can be built into models without necessarily compromising the models. In modelling the evolution of language it is possible to reward successful use of language without holding that all language is cooperative or believing that merely understanding language confers a benefit.

5.7.6 Limitations and Shortcomings
As should be clear, the successful application of computer models is far from being a panacea for the difficult problems surrounding the understanding of the EoL. To be best able to use such models constructively it is important to appreciate their limitations.

To evolve language in ALife, a model must first be built in which language can evolve. As noted above, computer models are good for demonstrations of how different uses of language provide benefit – but they are not much good for proving why and how language evolved. In the EoL in Homo sapiens, historical accident may have played a large role, and many distinct adaptations to different environmental pressures appear to have had some role in pre-disposing our hominid ancestors to language (Deacon, 1997). Over the course of time a large part has been played by historical accident and serendipity – what makes the evolution of language interesting to scholars is that it occurred in the evolution of Homo sapiens alone amongst all the species that exist in the world. With such a large part played by chance, it is unlikely that a computer model will ever conclusively demonstrate the precise conditions that led to the emergence of language. A better understanding is more likely to come from the rare finds of fossils of our hominid ancestors, from which much of the current knowledge about the evolution of language, and the required physiology to support it, is derived.

A particular shortcoming with the model used in this chapter for modelling the EoL is that only very limited evolution is possible, with a small and closed set of possibilities. A consequence of this is that it is easy for language to emerge in this model when the costs favour its emergence, but it is much harder in nature where the search space is open ended.

5.8 Summary
In this chapter the experimental model described in Chapter 4 was used in a series of experiments investigating the co-evolution of language and physiology. The first two experiments show populations of agents evolving to be capable of using better signalling schemes. As argued, this is the first known demonstration of the evolution of communication in populations where the capability to use learned signals evolves, rather than simply the signals themselves.

Then, comparing the models it is clear that depending on the details of implementation, models may be constructed which support arguments on both sides of the (now somewhat settled) continuity-discontinuity debate. In this case, different ways of representing genes themselves, or sets of initial weights, lead to contradictory outcomes. This highlights some of the problems discussed in the previous chapter – and expanded in this one – of how the implementation of an ALife model can impact on the results in undesired ways, and of how different implementations of the same model might give quite different results.

More satisfying is the subsequent demonstration of the requirement for some additional means of selective force – such as kin or spatial selection – for the successful emergence of the ability to use language. Where a number of authors point to the social functions that language serves as being responsible for its emergence (Chapter 2), this work dovetails with the complementary notion that without suitable social groups existing the fitness costs could easily prevent the emergence of language. It is known that language both incurs many additional costs as well as serves many functions – and this work supports theories that see the social nature of hominids as being key to subsequent evolution of language.

Another observation was that where language evolution was successful, the agents evolved such that they had redundant language capabilities. The spontaneous emergence of redundant linguistic capabilities in our model, and the effect it has on agent fitness, emphasises the importance of redundancy. While some authors have noted the different roles of redundancy in language, it is generally ignored in most work on the EoL. This significant observation warrants further, future, investigation.

Finally, it was also noted that different dialects of signalling schemes emerged in populations where neighbourhoods limited the communication interactions. This spontaneous emergence of dialects is investigated more thoroughly in the next chapter, where attention is turned to the cultural evolution of languages.


Chapter 6 The Cultural Evolution of Language and The Emergence and Maintenance of Linguistic Diversity

6.1 Introduction
Despite the somewhat negative conclusions from the previous chapter, there is one interesting result that can be observed. A listing of the signals produced by the different agents in the population shows that not all agents are using the same signals, as shown in Table 6.1, which displays some of the signals generated by a sub-section of the population from one experimental run. The full table, showing all signals produced by all agents, is included in Appendix A.

[Table 6.1. Columns: Agent, Nodes, Meaning 1, Meaning 2, Meaning 3, Meaning 4; rows for agents 0 to 4 and 50 to 54. Each entry is a string of + and – signal values.]
Table 6.1: Reviewing the signals from one set of results from one of the experiments shown in section 5.5.4

While adjacent agents use very similar signals, there appear to be greater differences between the signals used by non-neighbouring agents. It appears as if signal ‘dialects’ have emerged (also see Section 4.9.1), prompting a number of experiments to examine this phenomenon. A reformulation of the previous model to allow the evolution of signal diversity to be studied in isolation, without any possible influence that the evolution of the network structure may exert, is required to examine this more closely. This is presented in this chapter, and we find that it supports the conclusions drawn from the initial observation.

There is no benefit for using a different signal than a neighbour’s in this model. On the contrary, a situation where all agents use the exact same signals would intuitively provide the maximal benefit to the agents. This is contrary to the previously reported results of Nettle (1999a) and arguments of Milroy (1993) (see section 2.5.2), which hold that only where diversity provides an adaptive benefit will it emerge.

This provides some evidence for the possibility that much of human linguistic evolution is adaptively neutral, and arguments regarding this are also presented and reviewed in more detail.

6.2 An Initial Test
The first reformulation forms a single homogeneous population of agents, and allows them to negotiate a communication scheme over a number of training rounds. Only one generation is used, and the training algorithm for language learning is unaltered from the previous chapters.

A homogeneous language population with 4 language units is trained for 5,000 rounds, starting from random signals. The long training time allows a very high degree of coordination amongst the agents but, due to the local communication, does not necessitate a common language over the whole population.

It is observed that large neighbourhoods negotiate a common signal for a given meaning, but distant agents may have significant differences in communication schemes used. At the boundaries between neighbourhoods, agents may exist which interpret signals from different schemes correctly. This is shown in Table 6.2. The three agents included each attained a maximal fitness score, interpreting all signals correctly, despite differing communication schemes.

Meaning   Agent   Signal values
0         11       1  -1   1  -1  -1
0         12       1  -1   1  -1  -1
0         13       1  -1   1  -1  -1
1         11      -1   1  -1   1   1
1         12      -1   1  -1   1   1
1         13      -1   1  -1   1  -1
2         11      -1  -1   1  -1  -1
2         12      -1  -1   1   1  -1
2         13      -1  -1   1   1  -1
3         11      -1   1   1   1   1
3         12      -1   1   1   1   1
3         13       1   1   1   1   1
4         11       1   1  -1  -1  -1
4         12       1   1   1  -1  -1
4         13       1   1   1  -1  -1
5         11       1   1  -1  -1   1
5         12       1   1   1  -1   1
5         13       1   1   1  -1   1
6         11       1  -1   1   1   1
6         12       1  -1  -1   1   1
6         13       1  -1  -1   1   1
Table 6.2: The signals (including bias) used by three adjacent agents for seven environmental states. All three scored maximum fitness, interpreting all signals correctly.

This appears to show the emergence of different signal dialects within the population. If this can be compared to human dialects, then it is possible that the model can be used to make inferences about the causes and formation of human dialect diversity. However there are no obvious reasons why dialects should form in this model, and there is no motivating factor to encourage the formation of dialects. This contradicts some of the views reviewed
in section 2.5 and directly challenges the results of Nettle (Nettle and Dunbar, 1997; Nettle, 1999a; Nettle, 1999b) that demonstrate the need for a social function for the successful emergence of dialect diversity.

It is possible to draw on these results to refute Nettle’s claims and with further study, a stronger case can be made, but the model as applied in the previous chapter is not entirely suitable for several reasons. Firstly, there may be some interference between the evolution of the agents’ linguistic ability and languages learned – a few individuals with more limited language layers than their neighbours may effectively hamper communication and learning between more capable surrounding agents, skewing the experimental results. Also, it is difficult to visualise the results – large tables of data are required to review the dialects formed in one generation, and it would not be easily possible to compare the dialects over a number of generations.

Therefore some changes are necessary to remove any possible unwanted effects of having structurally non-homogeneous populations and to improve the transparency of the model. The core details of the model remain the same.

6.3 Human Linguistic Diversity
6.3.1 Patterns of Diversity
Linguistic diversity has already been discussed to some extent in Chapter 2, but a further overview is called for before proceeding. Many of the issues relating to the EoL do not apply, however, and language diversity is itself a facet of human language that has been studied extensively, both with and without relation to the EoL. Only the briefest of summaries is presented here; more extensive descriptions can be found in the literature (e.g. Crystal, 1987; Trudgill, 1995; or Chambers and Trudgill, 1998).

Crystal (1987) presents an overview of some key characteristics of human language diversity. One important finding of dialectology is that the boundaries between different dialects or languages are not always easy to define. Geographically close dialects may have sufficiently similar grammars or lexicons to allow speakers from each dialect to understand each other quite well, despite differences in the dialects. Geographically distant dialects may not be at all mutually intelligible, despite being at either end of a chain of dialects where every dialect is intelligible to speakers of the neighbouring dialects. Linguistic boundaries may be so formed, Figure 6.1. Several such dialect continua exist in Europe, blurring the boundaries
between the different languages within the Germanic, Scandinavian, Romance and Slavic language groups.

Figure 6.1: A schematic dialect continuum from dialect A to dialect E, showing some degree of mutual intelligibility between adjacent dialects (After Crystal, 1987, p25)

Figure 6.2: Dialect boundaries (After Crystal, 1987, p28)

6.3.2 Linguistic Boundaries
The geographical boundaries between dialects may not be easy to determine, as individuals near a boundary may use differing mixtures of lexical and grammatical items from the surrounding major dialects. Dialects may differ in their lexical, semantic, morphological or phonological features. Sampling the language use of individuals in some area results in a map, on which boundaries may be drawn to show where language use is distinct on either side according to a particular linguistic feature. These boundaries are termed isoglosses. It may be expected that these lines will be largely coincident, forming clear dialect boundaries. Often, however, the boundaries are not even nearly coincident, Figure 6.2. Only when viewed at a more distant scale is it possible to determine distinct dialect areas – with a poorly demarcated boundary between them, and these boundaries themselves influence the future change and diversification of languages.

As if this picture were not complicated enough, it is also widely recognized that no two individuals use language in the exact same way – in a sense every individual speaks their own particular dialect, or idiolect. What is viewed as a dialect is merely some norm
derived by sampling many idiolects.

Sharp linguistic boundaries that do exist often coincide with significant geographical boundaries, such as mountain regions, or with strong cultural boundaries (c.f. Chambers and Trudgill, 1980, p111), limiting the interactions of individuals across the divide. Where such boundaries exist, there may be many coincident isoglosses splitting the sides and a resultant lower degree of mutual intelligibility. As well as linguistic and other cultural evidence that some cultural or physical barrier really does inhibit and limit interactions, genetic evidence can corroborate this. For example, studies in Africa have shown that certain genetic markers act as good predictors of the language group of a subject’s spoken tongue (Renfrew, 1998). Where the boundaries limiting interaction are maintained by linguistic rather than cultural distinctiveness, language is itself the boundary working to maintain linguistic differences. This, as we shall see, is highly significant when examining or attempting to explain linguistic diversity.

6.4 Analytical Models of Linguistic Diversity
While the features and characteristics of language diversity described in the previous section are well known, it is the nature of the mechanisms which give rise to them that we hope to illuminate and explore with the use of artificial life based models. Before we proceed, we will consider the contribution of analytical techniques using mathematical models, reviewing a few which have been developed for this purpose.

Like simulation models, mathematical models require some amount of abstraction and simplification (Chapter 3). The models that are possible are more constrained. A mathematical model of the evolution of language diversity in a large, spatially distributed, population with a large number of linguistic variables would not be tractable, and has not been attempted in any of the models which are reviewed here.

For example, Pagel (2000) presents two models of interest. The first models the growth of linguistic diversity over time. The unit of this model is ‘a language’. No notion or representation of linguistic features, population size or growth or geography exists in the model. While the model appears to describe the growth in the number of languages over time, no information about how or why languages evolve may be gained.

The second part explores the different number of languages that exist within closely related language groups. Maths is used to determine whether the difference is statistically significant. But for an explanation, Pagel resorts to means outside of mathematics. The analytical approach has here been used to describe language evolution, and a
demonstration made of how appropriate maths might be used to highlight a particular spread of language as being unusual and requiring further, non-mathematical, explanation.

6.4.1 The Niyogi-Berwick Model
Perhaps more ambitious is the model developed by Niyogi and Berwick (1995), henceforth the NB model, and generalised in Niyogi (2002) – which additionally relates the NB model to the Cavalli-Sforza and Feldman (1981) model of cultural evolution – to include spatial organisation of the language users. In order to understand the latter, a fairly detailed review of the former is first required. As well as being highly relevant, this model has been widely reviewed and accepted (e.g. Lightfoot, 1999, p102), justifying the extended treatment it is given here.

In Niyogi and Berwick (1995) a dynamical systems model of grammatical change is presented. This model works on the basis that children are exposed to sample sentences produced by their parents, and that the children then acquire their own language grammars according to the sample presented and the learning algorithm used. It is then possible to derive mathematically the progress of competition between two grammars.

In this model, two grammatical variants compete in a population. The actual competing grammars are taken from human languages, and may vary in different factors, such as head-first or head-last or verb position. For each variant, the set of possible grammatically correct sentence structures is generated. For example, the set of possible sentence types of the -V2 grammar (non-verb second structure of modern English, as used as an example in Niyogi (2002)) is shown below.

L1 = { S V, S V O, S V O1 O2, S Aux V, S Aux V O, S Aux V O1 O2, Adv S V, Adv S V O, Adv S V O1 O2, Adv S Aux V, Adv S Aux V O, Adv S Aux V O1 O2 }

Figure 6.3. The –V2 grammar corresponding to modern English as represented in the NB model, which uses only degree-0 sentences (without recursion or additional sub-clauses). S = subject, V = verb, O1 = direct object, O2 = indirect object, Aux = auxiliary, Adv = Adverb

For any two competing grammars, a number of the sentences may be common to both, while others will only be parsable in one or the other. The common sentences are considered ambiguous, in that speakers of either grammar may produce them.

The Trigger Learning Algorithm (Gibson and Wexler, 1994), TLA, is the learning algorithm used. In this, individuals start with a randomly selected grammar, which is used unless a sentence which cannot be parsed in it is encountered. When this happens, an attempt is made to
parse the sentence with a different random grammar, and if this succeeds the new grammar is retained.

In a competition between two grammars L1 and L2, if L1 has a high proportion of ambiguous sentences, a, which are also valid L2 sentences, while L2 has few ambiguous sentences, b, then L2 will be favoured, and over time the proportion of the population using L2 may rise. In all cases, the competition evolves to a single fixed point, and this is derived for the situation where language learners learn from exactly two randomly selected examples, and in which there are an infinite number of examples. If a = b, over time the competition will reach the fixed point of exactly half of the population using each variant. If a < b, the fixed point will be one in which L1 is used by over half of the population or used by the entire population where an infinite number of learning examples are presented to learners.

The NB model is successful in reproducing the logistic, s-shaped curve of language change, often observed in linguistics (see Aitchison, 1991).

Niyogi (2002) then generalises the model so as to include spatially distributed populations. It is this that is of particular interest here, where we are considering the evolution of dialect diversity. However, to make the basic maths tractable the spatial model assumes that all the speakers of L1 are in one grouping adjacent to the L2 grouping, and in every generation this separation is enforced. The outcome of the model is similar to that of the previous model, with the competition between the two dialects evolving to a stable fixed point. Due to the neighbourhood model used, languages are never completely eliminated however, as at the extremes there are always children who are not exposed to the other competing language. An alternative neighbourhood model, which does not arrange the speakers into the two homogeneous neighbouring groups, is postulated but the equations for this are not derived.

6.4.2 Criticism of the Niyogi-Berwick Model
Some weaknesses of this model are highlighted in Niyogi (2002) itself. As the ambiguous sentences are those that are valid in both grammars, the proportion of ambiguous sentences only differs where the grammars have different sized sets of possible sentence types – and the grammar with the larger set will have the smaller proportion of ambiguous sentences. This relies on the assumption that all sentence types have an equal likelihood of occurrence. In particular the unlikely possibility that the more complex and unusual sentences in L1 occur with the
same frequency as the simple ones is central to the derivations and progress of competition between two grammars in this model. Attempts to apply the model to real instances of grammatical change seem problematical if this assumption proves to be inaccurate.

One of the claims in Niyogi and Berwick (1995) is that the NB model can be used to evaluate theories of grammatical acquisition (such as the TLA). It is unlikely that the NB model can satisfy this claim, requiring as many assumptions as it does, being limited to competition between two, and only two, distinct grammars, and lacking much of the richness of human interaction. Additionally, Clark (1996) shows that the logistic curve found in the NB work appears only where the selection is between exactly two grammars.

That the NB model always evolves to a fixed point, from which no further change or innovation will occur, is obviously unrealistic, a failing noted in Briscoe (2000c).

The generalised spatial model has some of its own problems. A poor grammar will never be eliminated completely – even though almost all of the population has converged on a different grammar. Such persistence is unlikely in general in real languages. Equally unlikely in real languages, a ‘better’ grammar with one speaker in a population of millions will eventually take over the population – leaving the previous common language a tiny minority tongue. This emphasises that although the model has been generalised to include a spatial dimension it completely fails to represent a language ecology (see Section 2.5.5) in any meaningful way.

Further problems are described by Briscoe (2000c), who highlights some of the problems of using an analytical macro-model of a population rather than a stochastic, computer based micro-model. The key weakness of the NB model is in abstracting away the sampling issues – and is particularly bad for small groups. Using a stochastic simulation model, Briscoe argues that we should expect different results from the NB model – at least until the population becomes very large. In a micro-model stasis, such as represented by the fixed point end result in the NB model, is very improbable. Briscoe argues that, as yet, no clear advantage to macro modelling for realistic G (class of grammars), A (learning algorithms) and P (probability distributions) is demonstrated by the NB model. Additional important factors, such as movement, birthrate, proportions of speakers and the resultant linguistic mix of population, are either over-simplified or not represented at all in the model, and these can be represented more easily in stochastic than in analytical models. Later in the chapter we will review the alternative stochastic models presented in Briscoe (2000a).
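
The Trigger Learning Algorithm, as it is described in Section 6.4.1, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the NB implementation: L1 is the set listed in Figure 6.3, while the second grammar L2, the dictionary GRAMMARS and the function name tla_learn are hypothetical and introduced only for this example. The sketch follows the description given in this chapter (try a different random grammar when a sentence cannot be parsed, and keep it only if it parses that sentence); the parameter-setting detail of Gibson and Wexler's formulation is not modelled.

```python
import random

# Sentence types of the -V2 grammar L1, as listed in Figure 6.3.
L1 = frozenset({
    "S V", "S V O", "S V O1 O2",
    "S Aux V", "S Aux V O", "S Aux V O1 O2",
    "Adv S V", "Adv S V O", "Adv S V O1 O2",
    "Adv S Aux V", "Adv S Aux V O", "Adv S Aux V O1 O2",
})

# ASSUMPTION: a made-up competing grammar, not taken from the NB model.
# It shares three (ambiguous) sentence types with L1 and adds three of its own.
L2 = frozenset({"S V", "S V O", "S Aux V", "Adv V S", "Adv V S O", "Adv Aux S V"})

GRAMMARS = {"L1": L1, "L2": L2}


def tla_learn(sentences, rng):
    """Return the name of the grammar held after hearing `sentences` in order."""
    # The learner starts with a randomly selected grammar.
    current = rng.choice(sorted(GRAMMARS))
    for sentence in sentences:
        if sentence in GRAMMARS[current]:
            continue  # parsable with the current grammar: no change
        # Unparsable: try a different random grammar on this sentence.
        others = [name for name in sorted(GRAMMARS) if name != current]
        candidate = rng.choice(others)
        if sentence in GRAMMARS[candidate]:
            current = candidate  # the new grammar is retained only on success
    return current


if __name__ == "__main__":
    rng = random.Random(1)
    # A learner hearing only L1 speakers, with all sentence types equally likely.
    sample = [rng.choice(sorted(L1)) for _ in range(20)]
    print(tla_learn(sample, rng))
```

Ambiguous sentences such as "S V" never trigger a change, so a learner who hears only ambiguous input keeps whichever grammar it started with. This is why the proportions of ambiguous sentences, a and b, drive the competition described above.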

6.4.3 The Cavalli-Sforza and Feldman Models
As noted in the previous section, Niyogi (2002) relates the NB model to the Cavalli-Sforza and Feldman (CS-F) model of 1981. Their analysis shows that both models have directly comparable outcomes. Despite minor variations, the results of both are qualitatively the same. Here we present our own analysis using a simpler, earlier CS-F model (Cavalli-Sforza and Feldman, 1978), to illustrate how the use of analytical models can lead to suspect conclusions.

In this earlier model, the cultural traits acquired by an individual are determined by a weighted sum of the traits existing in the previous parent generation. The cultural trait, X, acquired by the ith agent in a population at time t+1 is determined by equation (2-1) (Chapter 2). So the cultural traits acquired by any member of a population are determined by the cultural traits of those around them – weighted according to the amount of contact between individuals.

Cavalli-Sforza and Feldman (1978) present some further analysis showing the qualitative effects of different transmission models and assumptions (one or two parents, and varying the amount of influence from other members of the parent generation) on the degree of variation that survives in the model. In a one parent model where each child learns only from their parent, it is noted that the variance will grow linearly over time and that eventually there would be no cultural homogeneity. That influence from other members of the population may be the mechanism which prevents this is postulated, but not shown analytically (other than for the case of two parents influencing the traits learned by children). No further analysis of the model is presented.

In applying this model to the EoL we are interested in the case where the population around a learner, not just the learner’s parents, exerts influence over the acquisition of cultural traits – in this case language. We can say that the grammar acquired by the ith member of some population will be:

\[ g_{i,t+1} = \sum_{j=1}^{N} w_{ij}\, g_{j,t} + \varepsilon_i \tag{6-1} \]

where \(0 \le w_{ij} \le 1\) and \(\sum_{j=1}^{N} w_{ij} = 1\).

This is analytically intractable if the weights \(w_{ij}\) vary for every pair of individuals within the population. If \(0 \le w_{ij} \le 1\), then two agents i and j might exert a very strong influence on one another or may be entirely disconnected. Arbitrary values of \(w_{ij}\) could include
t +1 = 1 N ∑ g j . Spatial organisation is lost in most of the work. In attempting to provide a mathematical model of complex phenomena.3) presents an equation for how an individual trait possessed by a single individual within a population may be determined according to cultural influences – where the traits themselves exhibit no selective advantages.2). we can lose the w term from the equations altogether. but possess underlying similarities in their treatment of the eol. This simplifies the equation considerably: g i . it would be possible to calculate which particular values succeed in the population we find that this results in a model in which the population always converges on the average trait value. While this example is somewhat extreme. introduced in learning. Simplification is required if these equations are to be solved mathematically. for any initial distribution of trait values. This implies that selectively neutral changes cannot ultimately succeed in a population.4 Summary of Analytical Models The analytical models discussed have some important differences. variation from the average grammar would be small and would only survive at all as a result of the small errors. The model also limits the possible eol as the population can only choose between two grammars – the grammars themselves are unable to change or evolve.t + ε i N j =1 (6-2) It should be clear that unless εi is large. and other less well connected agents (similar to the social networks presented by Milroy (1980)). A very strong averaging force is in effect. It also shows evolution favouring grammars with a grammatically selective advantage. many real world issues are simplified or removed altogether from the models. If we assume uniform contact amongst the whole population.Chapter 6 – The Cultural Evolution of Languages 120 situations where a single population actually contains two distinct and unconnected populations.4. despite the implausibility of such uniformity.4. 
In simplifying the model such that. 6. The NB model is used to examine what happens when there are two competing grammars. the acquired grammars for any two agents could be only trivially different.4. . treating the populations as inhabiting the one common space. The presented CS-F model (6. and as population replacement proceeds. This allows each member of a population to possess a large number of different traits. and comes to some questionable conclusions (6. εi. a random allocation of weights would more likely result in some highly connected clusters of agents.

Chapter 6 – The Cultural Evolution of Languages

This result showing that diversity would be expected to disappear from the population is not a necessary consequence of the model itself – but is due to the additional assumptions introduced to make the maths tractable. And indeed, the results of the work on the NB model may be the result not of the model itself, but consequences of the many additional assumptions and simplifications introduced to make the model solvable (Niyogi, 2002).
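The averaging effect of the uniform-contact simplification in equation (6-2) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes traits are plain real numbers, that the learning error is Gaussian with a hypothetical standard deviation `noise_sd`, and the function names are illustrative only.

```python
import random
import statistics


def next_generation(traits, noise_sd, rng):
    """Equation (6-2): each learner acquires the population mean plus a small error."""
    mean_trait = statistics.fmean(traits)
    # Assumption: the error term is Gaussian with standard deviation noise_sd.
    return [mean_trait + rng.gauss(0.0, noise_sd) for _ in traits]


def simulate(initial_traits, generations, noise_sd, seed=0):
    """Return the population variance of the trait at generation 0, 1, 2, ..."""
    rng = random.Random(seed)
    traits = list(initial_traits)
    variances = [statistics.pvariance(traits)]
    for _ in range(generations):
        traits = next_generation(traits, noise_sd, rng)
        variances.append(statistics.pvariance(traits))
    return variances


if __name__ == "__main__":
    # Hypothetical starting point: 120 agents with widely spread trait values.
    start = [float(i) for i in range(120)]
    for generation, variance in enumerate(simulate(start, 5, noise_sd=0.01)):
        print(generation, round(variance, 6))
```

Under these assumptions the initial spread is removed after a single generation, and whatever variance remains is set by the error term alone, which is the strong averaging force described above.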

6.5 Modifying the Artificial Life Model for Exploring the Emergence of Linguistic Diversity
Having reviewed some of the attempts to mathematically model linguistic change and diversity, and highlighted some of the weaknesses of the approach – principally centred round the absence of any richness or randomness in the interactions of individuals – we proceed to extend the Artificial Life model presented in the last two chapters. Several modifications are required to allow an exploration of change and diversity, and these are explained next.

The most important change made is to remove the evolution of linguistic ability from the model. This simplifies the model, allowing more focus on the evolution of linguistic diversity without any interference from evolving or heterogeneous language abilities. These could otherwise distort or confuse the results by providing an additional potential source of linguistic diversity. Any linguistic diversity which emerges in the altered model can then not be due in any part to the influence of the EoL. This also results in a more realistic study of the effects of cultural evolution, as the eol works on a much faster time scale than the EoL. Removing EoL from the model is achieved simply by fixing the number of active language production neurons, N, for all agents to a single arbitrary value.

The selection of this value can help overcome another problem with using the model as previously described. It should be apparent, considering Figure 6.1, that comparison of the different signals used by a whole generation of agents will not be simple. Even less simple would be an attempt to compare signals used by agents over a great many generations. Some means of easy visualisation of results is required. Setting the value of N to 3 can enable visualisation. Each signal will be exactly three bits long, allowing a repertoire of eight distinct signals in total. Using the individual bits, which compose a signal, to set red, green and blue colour values, each signal generated by an agent can be represented by a distinct colour.
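The mapping from three-bit signals to colours can be sketched as follows. This is an illustrative sketch only: it assumes signals are stored as bipolar (−1/+1) or binary (0/1) triples, that each bit switches one colour channel fully on or off, and the name "magenta" for (1, 0, 1) is an assumption, since the text does not name that colour.

```python
# Colour names for the eight possible three-bit signals (red, green, blue order).
COLOUR_NAMES = {
    (0, 0, 0): "black",
    (1, 0, 0): "red",
    (0, 1, 0): "green",
    (0, 0, 1): "blue",
    (1, 1, 0): "yellow",
    (0, 1, 1): "turquoise",
    (1, 0, 1): "magenta",  # assumed name, not taken from the text
    (1, 1, 1): "white",
}


def bipolar_to_binary(signal):
    """Map every -1 (or 0) to 0 and every +1 to 1."""
    return tuple(1 if bit > 0 else 0 for bit in signal)


def signal_to_rgb(signal):
    """Use the three bits as fully-on or fully-off red, green and blue channels."""
    red, green, blue = bipolar_to_binary(signal)
    return (255 * red, 255 * green, 255 * blue)


if __name__ == "__main__":
    for signal in [(1, 1, 1), (-1, -1, -1), (1, -1, -1)]:
        print(signal, signal_to_rgb(signal), COLOUR_NAMES[bipolar_to_binary(signal)])
```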

Colour Plate 6.1 lists the colours that represent each of the eight possible signals. To obtain these colour values, the bipolar signal vectors are transformed to binary vectors – with all –1 values being reset to 0. For the remainder of this chapter, we treat the signal vectors as binary vectors, although this is solely for the convenience of mapping to colour values.

Allowing for redundancy in the signal ⇔ meaning mappings, the number of possible meanings, M, and signal bits, N, have to be set such that M ≤ 2^N. As previously explained, for purposes of visualisation, N = 3, thus we require that M ≤ 8. M was arbitrarily set to 3. The model parameters and settings are summarised in Table 6.3.

Table 6.3 Parameters and settings
Parameters for coevolution (using standard binary representation)

  Training rounds, t      See below
  Learning rate, α        0.2
  dα (α / t)              0.2 / t
  Population              120
  Standard deviation      1.5
  M                       3
  N                       3
  Mutation rate           0

With no mutation, the population maintains a homogeneous signalling ability, but otherwise the implementation of individual agents is unaltered. With no evolution of the signalling ability, there is also no selection of individuals for reproduction. Success of communication can still be measured, but is no longer used as a fitness measure for mating. It is no longer the biological evolution that is of concern, but a cultural one.

Within the model, child generations are created automatically, but signals are no longer learned simply from peers. Communication and learning across generations is required to investigate the eol. The cross-generational learning of signals is discussed in more detail in the next section.

6.6 Experiment 1: Emergence and Maintenance of Dialects
The first experiment is simply to verify the observation previously made – that dialects emerge in the model as a result of the learning interactions between the agents.
6.6.1 Experimental Setup

Within this and the following experiments, it is required that, during learning, training examples are provided by a parent generation to allow the signal repertoires to be learned over time. Agents are selected from the parent generation according to spatial distance from the learner. An agent occupying the same position as the learner, but in the parent generation, is at distance 0. Again, selection is based on a normal distribution centred at this point.

In human society language is learned from peers as well as elders, and this can be included in the model, by providing a set number of training examples from the parent generation, and a number from the current peer generation. While this makes the model more ‘human’ it does not necessarily follow that it will have any significant impact on the results.

This experiment was run with two model configurations: one with learning from the parent generation only, and one with learning from the immediate parent generation and the current (peer) generation. Table 6.4 details the number of training rounds applied in each case. All signals are initially random before training.

Table 6.4 Parameters for parent and peer learning and parent only learning.
Additional Parameters

  Parent and peer learning      Parent generation training, tp     40
                                Current generation training, tc    20
  Learning from parents only    Training rounds, t                 40
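The spatial selection of teachers described above can be sketched as follows. This is a hedged illustration rather than the thesis implementation: it assumes agents sit at integer positions in a one-dimensional array, that the normal draw is rounded to the nearest position, and that draws falling off either end are clamped to the edge (the text does not say how boundaries are handled). The function names are hypothetical; the round counts are those of Table 6.4.

```python
import random

PARENT_ROUNDS = 40  # tp in Table 6.4
PEER_ROUNDS = 20    # tc in Table 6.4


def pick_teacher_position(learner_pos, population_size, sd, rng):
    """Draw a teacher position from a normal distribution centred on the learner."""
    offset = round(rng.gauss(0.0, sd))
    # Assumption: positions beyond either end of the array are clamped to the edge.
    return max(0, min(population_size - 1, learner_pos + offset))


def training_schedule(learner_pos, population_size, sd, rng, with_peers=True):
    """List (generation, position) pairs giving the source of each training example."""
    schedule = [
        ("parent", pick_teacher_position(learner_pos, population_size, sd, rng))
        for _ in range(PARENT_ROUNDS)
    ]
    if with_peers:
        schedule += [
            ("peer", pick_teacher_position(learner_pos, population_size, sd, rng))
            for _ in range(PEER_ROUNDS)
        ]
    return schedule


if __name__ == "__main__":
    rng = random.Random(1)
    print(training_schedule(60, 120, 1.5, rng)[:5])
```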
6.6.2 Results

6.6.2.1 Visualisation

For each experimental run, three coloured columns are produced. A single row of one column shows the signal used (according to the colour scheme shown on Colour Plate 6.1) by each agent in one generation to signal a particular meaning. Successive generations are shown below each other to form the columns. Thus, column 1 shows the signal used to represent ‘meaning’ 1 by each and every agent over some number of generations. Column 2 shows the signals for meaning 2 and column 3 the signals for meaning 3. The position of an agent in the spatial array is the position used to plot the signals generated by that agent, as demonstrated in Figure 6.3. With plots such as that shown, the existence of local dialects can be observed across the population, the lifetimes of which can be viewed down the columns.

[Figure 6.3. Plotting the signal used by each agent in a generation for one of the ‘meanings’ results in a line. Showing successive generations reveals the changing use of signals for the meaning over time. This diagram shows the plots for each of the three meanings, showing the evolution of the communication schemes. Axes: agent position, generations; panels: meanings 1, 2 and 3. Also see Colour Plate 6.2.]

6.6.2.2 Experimental Results

Experiments were run under both sets of conditions described in section 6.6.1. Exemplar resultant plots are shown on Colour Plates 6.2(a) and 6.2(b). The results shown in Colour Plate 6.2 each show one thousand generations of language evolution – but the generations shown in every case are the 99,000th to the 100,000th generations. The diversity obtained is maintained for many thousands of generations – more than can be viewed in a single plot.

For the results shown the neighbourhood size is determined by a normal curve of standard deviation 1.5. Other values were used, with similar results – larger standard deviations resulting in dialects covering more of the population. The effect of neighbourhood size is investigated more closely in a following section.

Diversity is clearly maintained in the system for long periods, but not infinitely. The model had been tested a number of times for 1000, 10,000 and then 100,000 generations without convergence to a single global dialect being observed. Running ten test cases for one million generations each, three resulted in communication schemes which had converged across the whole population: for any one of the three meanings, every agent in the population used the same signal as every other agent. Without any signal mutation, or errors in learning, once the signal schemes have converged, they remain converged. While diversity may be maintained in the model for a long period, it may not survive over many generations, and cannot re-appear once it has been lost (although later we will look at ways in which signal diversity can emerge in populations with converged signal schemes).

6.6.2.3 Measuring Diversity

The results appear to show that local dialects emerge within a population that does not share a global signalling scheme. This can be investigated using Information Theory. An entropy (average uncertainty) value, H, can be calculated for each of the meanings for a population. That is, for a given internal state we can determine how much uncertainty exists in the range of signals produced by the agents to signal that meaning. If all the agents produce the same signal then there is no uncertainty. If the signal production is diverse, there is high uncertainty. For a system with eight possible signals the maximum possible uncertainty is 3 bits.

If many local dialects have been formed then the values calculated for H for the signals produced by the whole population for a given meaning should approach the maximum possible value. The more dialects that have formed, the closer to 3 bits the value of H might be expected to be. Taking a localised subgroup of agents from the population, the values for H should be significantly smaller. If the subgroup speaks a common dialect, then there might be very low uncertainty.

Calculating H for each of three internal states, for the final generation, in each of 10 experimental runs of 100,000 generations gives an average of H = 2.2153 bits (with a standard deviation of 0.3744). Thus, due to the variation in signal usage over the whole population there is a high uncertainty.

Repeating the calculations, this time for a localised sub-group of the population, gives a different picture. The calculations here are based on the signal use of a continuous group of sixteen agents from the centre of the spatial distribution. This time the average uncertainty, H, is 1.1278 bits (with a standard deviation of 0.3731). These sixteen agents have been selected in each case without regard to any possible dialect boundaries – so this lower uncertainty exists even though there may be differences in signal use within these subgroups.

Uncertainty values can also be used to help determine if the diversity is reduced over time. We compare the average value of H after 100,000 generations with that determined after one million. If there is a significant change, then it is likely that diversity has increased or decreased over time. Statistical t-Tests are performed on the sets of H values found under each set of conditions to determine if there is a significant difference between the sets of results.
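The entropy measure used in this section can be sketched as below. This is a minimal sketch assuming H is the Shannon entropy, in bits, of the observed signal frequencies for one meaning; the function name and the example populations are hypothetical.

```python
import math
from collections import Counter


def signal_entropy(signals):
    """Shannon entropy (bits) of the signals a group of agents uses for one meaning."""
    counts = Counter(signals)
    total = len(signals)
    # Sum p * log2(1 / p) over the signals actually observed.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical populations; each signal is a tuple of three bits.
    converged = [(1, 1, 1)] * 16
    two_dialects = [(1, 1, 1)] * 8 + [(1, 1, 0)] * 8
    print(signal_entropy(converged), signal_entropy(two_dialects))
```

With eight equally frequent signals the value reaches the 3-bit maximum quoted above, and a group whose members all use one signal scores zero.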

These results are summarised in Table 6.5. In obtaining the results after one million generations, we use only the data from runs where convergence has not occurred. Including the converged runs shows a significant difference in the level of uncertainty in signal use, but without them there is no significant difference apparent in the two sets of results. Excluding converged results, it is not possible to determine from the uncertainty values whether a data set has been drawn from a population after 100,000 or after one million generations. Except in cases where convergence occurs, the uncertainty values (measuring diversity) after one million generations do not significantly deviate from those after 100,000 generations. Where convergence has not occurred, there is no appreciable reduction in the signal diversity over time.

Table 6.5 t-Test results determine confidence that two data sets may be drawn from the same distribution – whether there is a significant difference between the data sets.

  100K vs 1M Generations    Including Converged Data Sets    Excluding Converged Data Sets
  One-tailed t-Test         0.002086                         0.343076
  Two-tailed t-Test         0.004172                         0.686152
  Paired t-Test             0.002283                         0.41212

6.6.3 The Effect of Neighbourhood Size

The results gathered so far are all for a single neighbourhood size. Intuitively, as the neighbourhood size changes so will the pattern of signal distribution. With smaller neighbourhood size, communication will be more localised and there may be a greater amount of dialect diversity. With larger neighbourhoods, communication will take place over larger areas, and diversity may be reduced.

In the results described so far, diversity is maintained for extended periods of time, and tests with smaller neighbourhood sizes do not cause any change to this result. Of more interest is what happens with the larger neighbourhoods: we investigate whether the population will converge on a single language at a faster rate. Extending neighbourhood size to infinity results in a single neighbourhood encompassing the entire population, where a uniform random distribution is used to select partners for signal learning and testing. A uniform distribution is used here to ensure that partner agents are selected without regard to distance. Colour Plate 6.3 shows some typical results of signal negotiation under such circumstances (other parameters are unaltered from those given above).

Under these conditions a single global dialect rapidly emerges. The patterns that emerge are similar to patterns of white noise. But for each of the meanings only a subset of the eight possible signals is used. For example, in Plate 6.3(i), only green (0, 1, 0) and turquoise (0, 1, 1) are used to represent meaning 1. Only white and yellow are used to represent meaning 3, and the remaining four colours for meaning 2. This plate shows 1000 generations from the 1000th to the 1999th generation of a sample model run.

In effect, a single global dialect has emerged, yet one that may have some diversity within it: one that is tolerant of minor variations in signal use between different agents. Due to the redundancy in the signalling capability, this allows the agents to correctly interpret the signals that they receive.

Eventually the communication schemes converge, as shown in Plate 6.3(ii). Plate 6.3(ii) and (iii) show the typical appearance of the resultant communication schemes. Convergence happens more rapidly than previously – 40% converge inside 10,000, and all test runs converged by 100,000 generations.

6.6.4 Analysis

These results show that, given spatial limitations on agent interactions during language acquisition, dialects may form. The resultant diversity can be preserved for many generations. Without a source of innovation, the diversity may eventually be lost. So diversity does not last infinitely, but does survive for a very extended duration.

6.6.4.1 Maintaining Diversity

The question to answer is then, how is diversity maintained for so long? If we consider language to be a cultural system, then the result can be compared to work on cultural diversity, such as (Axelrod, 1997b), who considers how different ‘dialects’ of culture may interact. In Axelrod’s model cultures are defined by a number of features, and the more features cultures have in common then the more likely they are to interact and share their culture – leading to increasing numbers of shared cultural traits. Strong differences result in cultural barriers, which prevent the dissemination of culture. This allows some cultural diversity to be maintained despite increasing homogenisation, and ends in polarization, with few cultures, which share few if any features.

However, in our model agents attempt to learn from surrounding agents regardless of the degree of difference in their signalling schemes. Some other mechanism must be responsible for the maintenance of diversity.

Each learner in the population forms their own idiolect after learning from a surrounding mix of idiolects and training examples that is potentially unique for every learner. Redundancy in the signal layer allows multiple signals to be mapped to one meaning (but only one reverse mapping). This allows learners to form different signal schemes from those used by their neighbours, and is one factor in the evolution of linguistic diversity.

Another important factor is the interaction of agents who may use similar signals for different meanings – which may result in a learned signalling system being different from both of the original systems. Each agent learns three bi-directional signal-meaning mappings. The different mappings mean that each agent has a system of mappings – and changing one value may affect the others. Conflicting training signals may result in agents learning different meaning-signal pairs than presented.

There are two ways in which this can cause ‘novel’ forms to appear. First, as weights are updated repeatedly as an agent learns, the learning for one mapping may perturb the weights that encode another signal-meaning mapping, as the learner attempts to resolve the conflict. This might result in a learner using a different signal for a particular meaning than used by any of the agents in the parent generation. Alternatively, a learner may only be able to resolve the conflict between two different dialects surrounding it, by learning some mappings not present in either. This may occur when the dialects use the same signal for different meanings. In most cases the conflicts are not solved by one dialect succeeding at the expense of the others. An example is shown in Figure 6.4.

[Figure 6.4. In the highlighted region, a signal is learned for a meaning despite not being present in the original training set.]

While the precise workings of this are peculiar to this particular model, this artefact does have parallels to real language diversification, where linguistic innovation may introduce new language features previously lacking in a language community, or where contact between different language communities might have quite unexpected results.

Thus, localised learning is able to maintain diversity, even if it alone is not capable of producing diversity. We have determined that local dialects do indeed exist within the emergent signalling schemes, and that attempting to learn from a mixture of dialects can result in continued diversity. This finding is apparent from the many plots produced, and was measured by an information theoretic analysis of the data. At this point we can also see if the results can be compared more directly to those from linguistic studies in dialectology, and this is done in the following section. Following that, we will review how diversity may emerge from a converged communication system.

6.6.4.2 Human linguistic diversity: A comparison

There exist some qualitative similarities between the spatial organisation of dialects within our model and the geographical organisation of human dialects (as described in Section 6.3). The dialects held by agents within our model form a dialect continuum connecting (at the extremes) dialects that are not mutually intelligible. This is confirmed by comparing the average success of interpreting a local signal, versus the success of interpreting signals that may be generated by any other agent – regardless of distance.

Over ten runs, the average success at interpreting signals that originate in the same neighbourhood was found to be high at 98.2%. The average success at interpreting signals which may have originated from anywhere in the population (which will include some number of signals from the same neighbourhood) was much lower at 38.7% (standard deviations of 0.347% and 2.745% respectively).

Figure 6.5 shows the results of testing the success rate for correct signal interpretation by agents, where another agent within the same neighbourhood generates the signal. Almost all agents correctly interpret all signals presented to them. Average success is over 95%.

[Figure 6.5: The percentage of communicative successes over the spatially distributed population.]

It is also possible to map a correlate of isoglosses, using Hamming distances. The Hamming distance between two binary signals is simply the number of bits different between them – for example (0, 1, 1) has two bits different from (1, 1, 0) and hence has a Hamming distance of two.

It is possible to chart the Hamming distances for the signals for each of the three meanings. Figure 6.6 charts the maximum Hamming distance across the three signal pairs when comparing the signals of pairs of adjacent agents across the population – adjacent pairs of agents are selected, and both produce a signal for each of the meanings. Also shown is the result of adding the Hamming distance values for the three signal pairs. Only at one point does the total Hamming distance reach 4 (out of a maximum possible of 9, if no signal pairs had any bits in common). This corresponds to the only measured communication success rate score below 90% in Figure 6.5. The lines marking the change in use of the items do not generally fall together, so boundaries between dialects are generally not distinct.

[Figure 6.6: The maximum (for any one signal) and total (over three signals) Hamming distances between signals used by adjacent agents in a spatially distributed population.]

A more direct qualitative comparison is possible by plotting the accumulated Hamming distance across the population, Figure 6.7. This graph, which shows both areas of linguistic change and regions of dialect stability, is similar in appearance to the schematic shown in Figure 6.1.

These comparisons are useful in showing that the model leads to qualitatively similar results to those observed across the globe. If the model is a reasonable approximation of linguistic transmission in human society, then the processes at work in the model may correlate well with those at work in the evolution of human languages.
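The Hamming-distance comparison of adjacent agents can be sketched as follows. This is an illustrative sketch: it assumes each agent is represented simply as a list of three binary signals, one per meaning, and the function names and the sample population are hypothetical.

```python
def hamming(signal_a, signal_b):
    """Number of bit positions in which two equal-length signals differ."""
    return sum(x != y for x, y in zip(signal_a, signal_b))


def adjacent_distances(population):
    """For each adjacent pair of agents, return (maximum, total) Hamming distance.

    Each agent is a list of three signals, one per meaning.
    """
    results = []
    for left, right in zip(population, population[1:]):
        per_meaning = [hamming(s, t) for s, t in zip(left, right)]
        results.append((max(per_meaning), sum(per_meaning)))
    return results


if __name__ == "__main__":
    # Hypothetical three-agent population.
    population = [
        [(1, 1, 1), (0, 0, 0), (1, 0, 0)],
        [(1, 1, 1), (0, 0, 1), (1, 0, 0)],
        [(0, 1, 1), (0, 0, 1), (1, 1, 0)],
    ]
    print(adjacent_distances(population))
```

Summing the totals along the array would give the accumulated distance used for the comparison with isogloss maps.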

[Figure 6.7: The cumulative Hamming distances between the signals used by adjacent agents over the spatially distributed population. Series: Meaning 1, Meaning 2, Meaning 3.]

6.7 Experiment 2: Diversity from Homogeneity

In the previous experiment, once the communication scheme has converged it remains fixed. Once diversity is lost it is never regained. This compares poorly with human languages, where change and innovation appear to be irrepressible (e.g. Aitchison, 1991; Lass, 1997).

However, there are many potential sources of change and innovation in human language that have no counterpart in our model. The different pressures on language users (see Section 2.5.2) to modify their language use to maximise comprehension and convenience do not exist. Nor does the possibility of errors, which may help drive some language changes, exist. For example, Steels & Kaplan (1998) demonstrate an ALife model in which errors in signal production, perception and interpretation can cause change in agent lexicons.

Further, grammars are large systems. It is widely recognised that the evidence available to children during grammar acquisition contains contradictions and is insufficiently detailed for error-free acquisition. Language can then be viewed as a system with imperfect replication – in any attempt to replicate an existing language, a number of changes will occur, and the learner will actually acquire a ‘new’ language.

One way to represent this is to add noise to the signals that are presented to learners during training. It is not simple to quantify the incidence of errors and misunderstandings in language use, or to quantify the effect of ambiguous evidence. However, using only small noise values increases confidence that the amount of noise used is not unreasonably inflated.

Another potential source of diversity is individual bias in language learning. As well as unique and individual accents, that allow friends to identify one another by their speech, there may also be subtle variations in individuals’ grammars and lexicons. If different language learners have innate biases that influence their language acquisition, there is a greater likelihood of learners faced with similar evidence acquiring different languages. Individual bias can be modelled by initialising the weights of the language agents to small random values. This may influence learning, preventing agents from converging to the same signal scheme.

With either option it is not necessary to show whether this noise is sufficient to prevent convergence. Rather, it is simply required to demonstrate that noise and/or innate bias can disturb a converged signalling system, returning it to a diverse one. In doing so, it will be demonstrated that even under unusual circumstances, should convergence occur it will not last. This we attempt to do now.

6.7.1 Experimental Setup

The model setup is initially the same as described in Section 6.6.1, with a few differences. Learning examples are provided only by the parent generation – there is no learning from peers. The effect of this is to help to preserve existing systems slightly, and slow down change. We also implement either innate biases or signal noise.

The populations are started with an initially converged language – the first generation is set such that all agents use the same signals. In the results reported here, all agents in the first generation use the signal vector (1, 1, 1), or white, for the first meaning, (0, 0, 0), or black, for the second and (1, 0, 0), or red, for the third. The experiments detailed below were also repeated using populations with initial signal repertoires of red, green and blue (where there is an equal Hamming distance between each signal), with the same qualitative results.

6.7.1.1 Noisy Learning

A variable amount of signal noise is used, set at the beginning of each experimental run. The noise rate is the chance per signal bit of flipping the bit. For a one percent noise rate each bit of a signal has an independent chance of one in a hundred of being flipped. This may cause changes in individual signals, and this may in turn affect the learned communication schemes of the language agents. Two sets of simulations are performed, both with small noise parameter settings: 0.1% and 0.01%.
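The per-bit noise described above can be sketched as follows. This is a minimal sketch assuming binary (0/1) signal vectors and an independent flip per bit; the function name is hypothetical, and the noise rate is written as a probability, so 0.1% becomes 0.001.

```python
import random


def add_noise(signal, noise_rate, rng):
    """Flip each bit of a binary signal independently with probability noise_rate."""
    return tuple(1 - bit if rng.random() < noise_rate else bit for bit in signal)


if __name__ == "__main__":
    rng = random.Random(42)
    white = (1, 1, 1)
    # Present 100,000 noisy copies of one signal and count how many were altered.
    altered = sum(add_noise(white, 0.001, rng) != white for _ in range(100_000))
    print("altered training signals:", altered)
```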

6.7.1.2 Innate Bias

All agent weights are initialised to small random values. Again, two sets of runs are performed, with initial weight values set to a random value in the range 0.05 to –0.05 in one set and in the range 0.005 to –0.005 in the other. Here, the initial weight value is drawn from a uniform distribution. With the finite, limited, learning period, these weights may prevent the agents from acquiring the same signals with which they are provided as training examples.

6.7.2 Results

The results for both sets of conditions can be considered together, and are qualitatively the same, and some results are plotted in Colour Plate 6.4. In all cases the innate biases or noise is sufficient to cause divergent dialect evolution, although the rate with which diversity is introduced seems greater when noise is present than when innate biases cause the emergence of diversity.

Inspecting the various graphs plotted on Colour Plate 6.4 it is evident that, given sufficient time, the original signal scheme cannot be recovered from the current one. From Plate 6.4(c), the signals first introduced in place of the initial three (white, black and red) have only one bit difference, and are introduced without disturbing the chance of successful signal interpretation.

6.7.3 Discussion

Both innate biases and noisy communication are capable of introducing diversity to the signal schemes used by the agents. In both cases, imperfect learning initially leads agents to learn schemes that, while different, are not incompatible with those of the agents in the parent generation. The redundancy of signals allows these changes to take hold, and introduce diversity, by allowing agents to learn different, but compatible signals. The imperfect transmission and acquisition of language is thus the vehicle by which change occurs and diversity is reintroduced in this model.

Lass (1997) argues that ‘linguistic junk’, a consequence of redundant signalling, is a key feature of human language for language change. Our results here agree with Lass. While human language is much more complex than any simple signalling scheme, let alone one which uses just eight signals to communicate just three meanings, it also contains much redundancy.

6.8 Summary

We opened this chapter with observations of what appears to be signal diversity in the negotiated signal schemes created by agents in the previous chapter. We noted that this spontaneous emergence of diversity seems to contradict some existing arguments on the causes of linguistic diversity, and conducted an additional experiment to investigate the findings.

Having done so, we presented a brief review of patterns of linguistic diversity as seen in human society. This was done in order that we might have some basis for comparing the results found in our model against those found in the real world. We then proceeded to review some work on mathematical models of linguistic diversity. We saw how some of this work leads to conclusions at odds with the real-world evidence. Drawing our own conclusions from a simple model developed by (Cavalli-Sforza and Feldman, 1978), we similarly found an outcome that would be unexpected in the real world.

Then a number of experiments were conducted, to study the emergence and cultural evolution of dialect diversity in the artificial life model previously developed in this thesis. The structure of the agents was fixed for these experiments. This removes from the model those features that attempt to model biological evolution. Instead, the cultural evolution of language is modelled, by allowing agents to learn their signalling schemes from agents in previous generations.

It was found that the patterns of signal diversity found in the model had some significant similarities to the patterns of linguistic diversity found across human languages. Specific results also included observations that the continued existence of diversity could not be guaranteed, although under certain conditions diversity will emerge in populations with initially homogenous signalling schemes. These conditions (noisy learning, or the existence of innate bias) are not unrealistic, and appear to exist in human language learning.

In the next chapter we will present further work that corroborates the conclusions of this chapter, and develop a theoretical framework for these conclusions.
under different conditions. and in order that we might further investigate the emergence of linguistic diversity.

Chapter 7 Cultural Evolution in an Agent Based Model of Emergent Phonology

In the last chapter, we showed how some minor modifications to our model allow us to model the cultural evolution of language, and to obtain results with clear qualitative similarities to the real-world cultural evolution of language. In this chapter we show that there is a need for additional corroboration, and then develop this corroboration by making use of a different, quite distinct, model. We finish by setting out the significance of our results, and putting forward an argument for the cultural evolution of language without linguistic selection.

7.1 Artificial Life and Micro-Simulation Models of Linguistic Diversity
The work described in the previous chapter is not the only work that has been carried out using computational models to investigate the evolution of linguistic diversity. Some of this other work is worth reviewing, starting with work which disagrees with our own.

7.1.1 Functional Requirements for Diversity
In Chapter 2, Sections 2.5.2 and 2.5.6, we reviewed some of Daniel Nettle’s arguments on the causes of linguistic diversity. He has also provided backing for his arguments using a variety of simulation-based models. First, in Nettle and Dunbar (1997), a model is developed which shows how dialects may be used to indicate group membership, and how such a marker may be used in the evolution of cooperation. By using these markers, groups of cooperative agents are able to resist invasion from non-cooperative individuals. This is used as the basis for an argument that dialects emerged for this reason. This does not necessarily follow however, and could be another example of exaptation (Section 2.4.2) – applying a novel use (group marking) to a feature (dialect diversity) which has already emerged in the population.

Nettle has presented two further models that support his arguments that social status and social functions of dialect differences are pre-requisites for the emergence of dialect diversity (Nettle, 1999a; Nettle, 1999b). The model presented in (Nettle, 1999a) arranges language learners into a series of small groups, the language used consisting of a model of a vowel sound system. Learners pass through five life-cycle stages, and all language acquisition occurs during the first stage, where the new language agents learn from the other agents in the same village. Each group
contains four individuals at each of the life-cycle stages (twenty in total at any time). The infants each learn a sound system according to the set of sound systems in use by the existing group members: for any vowel, the sound learnt is the average formant frequency values used by all of the adults in the population for that vowel, plus a slight perturbation due to noise. No learning occurs after the first stage. After this, all the individuals are ‘aged’ one stage. After the fifth life stage, the elderly are replaced by a new set of infants. It is shown that, unless the groups are completely isolated from one another, diversity does not emerge.

Adding in social status changes the findings significantly. Each individual has a 25% chance of gaining high social status after the first life stage. Learners only learn language from those individuals with high status within the village. With social status included small differences between groups may become magnified over time and it is found that contact between groups no longer eliminates diversity.

The model presented in (Nettle, 1999b) has many major differences, but retains the same agent life-cycle, where agents pass through five life-stages, before being replaced by new learners. Again, learning only occurs during the first life stage. Apart from this there are few similarities. Instead of learning vowels, agents acquire one of two grammars, p or q. There are no sub-groups within this model, all agents existing on a single spatial array. Inspired by social impact theory (Latané, 1981), in determining which grammar a learner acquires, the impact of all the surrounding grammars is calculated. This uses a sum of all the surrounding grammars, weighted by distance, plus a small amount of noise. Then, if the result is in favor of one grammar, that is the grammar acquired; otherwise, the other grammar is acquired. Several factors may be varied in this model, but the general finding is that sustained diversity requires that social status exerts a very large influence upon the acquisition of grammar.

These last two models each have design features which lead directly to these results. In the former, vowels are learnt by an explicit averaging of the vowels in use already in the local group. In the latter, the impact measurement and forced selection of a grammar from one of two distinct grammars – without the possibility of acquisition of elements of different grammars – is a form of thresholding. This thresholding forces the grammar learners to acquire the more commonly used grammar variant within their neighbourhood, except where a social-status weighting is introduced. Nettle argues in his work that the effects of averaging and thresholding would work to stifle diversity, were it not for the effect of social status, and uses these models as demonstrations. He then uses models in which these are enforced by the language
acquisition rules he has built in. It is not proven that, under more realistic learning conditions, where language is acquired as the result of many interactions, or where there is a possibility of learning grammars that are different but compatible with surrounding grammars, averaging or thresholding will prove to be the barrier to diversity that Nettle argues they are. Accordingly these last two models give few insights into the causes of, and influences on, human linguistic diversity.

7.1.2 Other Models of the Evolution of Linguistic Diversity
Arita and Taylor (1996) present what is possibly the first attempt to explain the origin of linguistic diversity using a micro-simulation model. They hold that it is the spatial distribution of individuals that is the key factor in the emergence of dialects. While this is a plausible position, it is not strongly supported by their model, which relies on genetic mutation for the emergence of linguistic diversity. Language is inherited, with mutation producing diversity and learning leading to increased convergence. If the spatial distribution of speakers is indeed a factor in the emergence of dialect diversity, then it must be able to work when the only means of language transmission is through learning – as it is for human language.

Innate language is again used by Arita and Koyama (1998), in their investigation into the evolutionary dynamics of vocabulary sharing. Mutation rate is again identified as being an important factor in the emergence of diversity in the vocabularies. The degree of vocabulary sharing is also related to the availability of resources, but without an identified linguistic equivalent. Rather than vocabularies, it is cooperative strategies that are being evolved here, as evidenced by cases where the evolved communication strategy is not to communicate at all.

7.1.3 Related Models
In some cases, it may be that models that have been developed to study other systems may, in some way, be relevant to the study of linguistic diversity. For example, Axelrod (1997b) presents a model to investigate the dissemination of culture through a spatially arranged population. Viewing language as a cultural trait, this is obviously relevant to the evolution of dialects. In the model, neighbouring sites may interact if they have at least one cultural trait in common, and as sites interact they share traits and slowly converge. Eventually a stable distribution emerges where a limited number of groups survive, within each group all sites having identical sets of traits, and no traits being shared with sites belonging to neighbouring groups. The results are at odds with
observed phenomena in human language, however (see Figure 6.2, and related discussion).

In other cases the relevance of the results to the question of the origins of linguistic diversity is clearer. For example, the previously mentioned work of Kirby (1998) (section 2.4.3) builds a simulation model to show that Universal Grammar constraints may not be innate constraints at all, but merely the outcome of learning over time leading to a reduced set of surviving grammars. In this work Kirby shows results indicating the existence of geographically distributed dialects of grammars. Following on from (Kirby, 2000), which demonstrates the emergence – through cultural transmission – of compositional language in a population of language learners, (Brighton and Kirby, 2001) present a study of the eol exploring the conditions under which compositional languages are maintained.

Other related work has looked at the process of language change, apart from the question of dialect. Steels and Kaplan (1998) demonstrate how various linguistic and extralinguistic errors can lead to continued language change. While natural language errors are somewhat more systematic than the random errors introduced in this model, the model successfully demonstrates the large influence such errors may have on language innovation. A similar model, based on artificial neural networks, is presented by Dircks and Stoness (1999). In this model, it is found that noise is not required to maintain competition between forms. This appears to be due to the networks learning one of two similar signals for particular internal meanings – an ability not present in the Steels and Kaplan model, where similarities in the lexical forms are ignored by the agents.

Maeda et al. (1997) examine the effect of language contact. The results show language reorganization after contact is made between populations using different languages. But after the reorganization, dialect diversity is completely absent from the population – again, not results that compare well with those observed in the real world.

7.2 Dialect in an Agent Based Model of Emergent Phonology
The preceding review has shown that the work of the previous chapter is not the only work using computational models claiming to illustrate the processes which give rise to dialect diversity. Some of these models produce results which appear to directly contradict our own conclusions. Nettle has produced a variety of different models, all of which support his theory that social motivation is required to produce and maintain linguistic diversity (Nettle, 1999a;
Nettle, 1999b). We have argued that the details of these models are such that they explicitly implement the averaging and/or threshold problems. Human language learning is not based on assessing what the average signal is, or on calculations to determine what signal is used by the majority of people, and so explicitly including such rules in a model may invalidate the results of the model. Yet part of the strength of Nettle’s argument is that he has demonstrated the same requirement for social motivation in a number of unrelated models. The model presented in the previous chapter does provide contrary evidence to Nettle’s, but additional supporting evidence from an unrelated model would provide useful support to our argument.

With such a goal in mind, we re-implemented Bart de Boer’s model of an emergent phonology (de Boer, 1997), extending it slightly in order to investigate the emergence of dialect diversity. De Boer’s model has been thoroughly documented, enabling an independent reimplementation to be attempted (de Boer, 1997; de Boer and Vogt, 1999; de Boer, 2000). This was done, and testing determined that it performed qualitatively the same as de Boer’s own implementation.

7.2.1 de Boer’s Model of Emergent Phonology
De Boer’s model is one in which populations of agents form an emergent vowel sound system, through repeated interactions and learning. An earlier model of emergent vowel sound systems has also been presented by (Lindblom, 1998), but this uses more traditional techniques to find optimal distributions of vowels within a vowel sound space without modelling the language users or their interactions in any way. De Boer’s results compare well with these earlier results, with the advantage of more closely modelling the processes involved when populations of speakers try to match the sounds that they hear being spoken.

The agents produce and attempt to learn vowel sounds. Vowel production is based on a mathematical model of the human articulatory system, where tongue position determines the formant frequencies of the generated vowel. Agents use a model auditory system to try to perceive the vowel, and can then try to mimic the sound. Each agent maintains a list of vowel prototypes. Upon hearing a vowel, a learner agent determines which vowel in the list is the most likely one that they may have heard. This
sound is then produced. The original signaller listens and determines whether the vowel it hears back is sufficiently similar to the original vowel. Based on the feedback, the learner either updates the prototype for an existing vowel, to shift it closer to the perceived one, or creates a completely new vowel prototype. These interactions are repeated, with learners and tutors randomly selected from the population, and with a yes/no feedback signal. Over time, vowel prototypes are added and occasionally merged or pruned until each agent has a set of vowels. With varying amounts of noise, vowel systems with few or many vowels may emerge. The results of this emergent system have been compared extensively with human vowel systems, and this has confirmed that the vowel systems so produced are realistic and highly possible. More details of the inner workings of de Boer’s model are included in Table 7.1.

The vowel systems may be viewed by having all of the agents produce the sounds for each of their vowel prototypes and charting the vowels. The vowels generated are represented by four formant frequencies, F1, F2, F3 and F4, but for purposes of display the last three are combined (see de Boer, 2000, p.182, equation 5) into a single value, F2’. Such charts of ‘vowel space’, plotting vowel sounds according to first and second formant values, are commonly used in phonetic studies (e.g. Johnson, 1997, p.105). Figure 7.1 shows a typical emergent vowel system, from a simulation with a population of 20 agents, and 5000 learning interactions. Added to the chart are approximate positions for some of the major vowel sounds of the English language.

Figure 7.1. An emergent phonology. Clusters appear in areas of the phoneme-space where multiple agents have learned shared vowel sounds. The approximate positions of some of the major vowels of English have been superimposed on the graph.

Initiator (teacher)
    If V = Ø then
        Add random vowel to V
    Pick random v from V
    Increment count of uses of v
    Produce signal A1 from v
Imitator (learner)
    Receive A1
    If V = Ø then
        vnew = Find phoneme(A1)
        V = V ∪ vnew
    Calculate vreceived
    Produce signal A2 from vreceived
Initiator (teacher)
    Receive A2
    Calculate vreceived
    If v = vreceived then
        Send non-verbal feedback: success
        Increment count of successful use of v
    Else
        Send non-verbal feedback: failure
    Do other updates of V (see below)
Imitator (learner)
    Receive non-verbal feedback
    Update V according to feedback signal
    Do other updates of V (see below)

Table 7.1. Extract from the basic rules for agent interaction in de Boer’s model of emergent phonology (from de Boer, 2000). V is the set of vowels possessed by an agent, and v is some vowel. Signals A1 and A2 are the articulations of the selected vowels.
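The exchange in Table 7.1 can be sketched in a few lines of Python. This is a minimal illustration and not de Boer’s implementation. Vowels are reduced to points in a normalised two-dimensional space (standing in for F1 and F2’), and the articulatory and perceptual models are replaced by additive uniform noise and nearest-prototype matching. The use and success counts, merging and pruning are omitted. The function names, the `shift` rate and the rule of adding a new prototype after every failure are assumptions made for the example.

```python
import random

def clamp(x):
    """Keep a coordinate inside the normalised vowel space [0, 1]."""
    return min(1.0, max(0.0, x))

def articulate(vowel, noise, rng):
    """Produce a noisy signal from a vowel prototype (assumed additive noise)."""
    return tuple(clamp(c + rng.uniform(-noise, noise)) for c in vowel)

def nearest(prototypes, signal):
    """Index of the prototype closest to the signal (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - s) ** 2
                                 for p, s in zip(prototypes[i], signal)))

def imitation_game(initiator, imitator, rng, noise=0.15, shift=0.5):
    """Play one game between two vowel lists; return True on success."""
    if not initiator:                        # If V is empty, add a random vowel
        initiator.append((rng.random(), rng.random()))
    i = rng.randrange(len(initiator))        # Pick random v from V
    a1 = articulate(initiator[i], noise, rng)
    if not imitator:                         # Empty imitator adopts what it heard
        imitator.append(a1)
    j = nearest(imitator, a1)                # Imitator calculates v_received
    a2 = articulate(imitator[j], noise, rng)
    success = nearest(initiator, a2) == i    # Initiator checks v == v_received
    if success:
        # Shift the imitator's prototype towards the signal it perceived.
        imitator[j] = tuple(p + shift * (s - p)
                            for p, s in zip(imitator[j], a1))
    else:
        imitator.append(a1)                  # Assumed rule: add a new prototype
    return success
```

Repeating `imitation_game` over randomly chosen pairs of agents gives the basic loop described above. A fuller version would also need the periodic merging, pruning and random addition of vowels.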

During the ‘other updates of V’ step of de Boer’s model, bad vowels (which do not correspond well to vowels heard) are removed, vowels which are close to one another are merged and, with a small probability, new vowels are randomly added to V.

7.2.2 Experimental Setup
The model was enhanced to include a larger population of agents, spread across a spatial array. The algorithm for agent learning is not altered, other than to enable the neighbourhood-based selection of partners. 100 agents were arranged in a single, non-toroidal, line. Once an agent has been selected a partner is required for learning. The partner is selected from a position along the line on either side of the original, within a limit of ten agents distance. The selection of partners is from a uniform distribution. Agents near the ends of the line have as a consequence fewer other agents with which to communicate.

As before, a number of simulations were performed for a variety of parameter settings; some of these are discussed below, the results of these being supportive of the assertions following. Table 7.2 sets out the parameter settings for the simulation detailed in the following results.

Parameters
    Population                        100
    Training rounds                   25,000
    Neighbourhood size, d (+/- d)     10
    Noise (%)                         15

Table 7.2. Parameter settings for emergent phonology model

The large number of training rounds ensures that each and every agent will receive of the order of 200 training examples (the number of training rounds being determined by the population x 250). Varying the number of training rounds thus allowed tests with larger and smaller populations to be performed. Neighbourhood size was determined by the size of the population divided by ten. The actual neighbourhood size is double this, as agents can communicate with others on either side up to distance d. This figure ensures that there is reasonable distance between the far ends of the population.

In de Boer’s thesis the effect of varying noise is well documented. Smaller noise values allow the emergence of vowel systems with more vowels than occur with larger noise values. The value used in these experiments is in the mid-range of values used by de Boer. This would lead us to expect vowel systems of around four to six distinct vowels to emerge.
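The partner selection and parameter arithmetic described above can be sketched as follows. This is an illustration only. The function names are hypothetical, and the choice of the first agent uniformly from the whole population is an assumption, since the text specifies only how the partner is drawn.

```python
import random

def neighbours(index, population, d):
    """Agents within distance d of `index` on a non-toroidal line."""
    low = max(0, index - d)
    high = min(population - 1, index + d)
    return [j for j in range(low, high + 1) if j != index]

def pick_pair(population, d, rng):
    """Choose an agent, then a partner drawn uniformly from its neighbourhood."""
    agent = rng.randrange(population)   # assumed: first agent chosen uniformly
    partner = rng.choice(neighbours(agent, population, d))
    return agent, partner

def training_rounds(population, rounds_per_agent=250):
    """Number of training rounds, scaled with population size."""
    return population * rounds_per_agent

population = 100
d = population // 10                          # neighbourhood size d = 10
print(training_rounds(population))            # 25000
print(len(neighbours(50, population, d)))     # 20 (ten agents on either side)
print(len(neighbours(0, population, d)))      # 10 (an end of the line)
```

The last two lines show the edge effect noted in the text: an agent at the end of the line has half as many potential partners as one in the middle.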

7.2.3 Results
Running the model produces the output shown in Figure 7.2.

Figure 7.2. The emergent vowel system of the population.

It is not clear in Figure 7.2 how many distinct vowels have emerged – the precise answer depending on how the individual sounds are categorized. Depending on how the vowels are categorized, there appear to be around six distinct vowels in use. Possibly the clearest categorization would be to count two vowels at the top right of the chart, two at the top left, one top centre, and one more at the bottom. This gives a total of six vowels, within the expected range. At the bottom of the chart, there are some breakaway vowels.

A more detailed examination reveals what is really happening. The population has been split into five arbitrary contiguous groups. The first twenty agents of the population form the first group, the next twenty are placed in group two, and so on. The diagrams in Figure 7.3 show the same emergent vowel system. The groups themselves have not been chosen with regard to how close the vowel systems of the individuals within the group are – rather, the groups are a completely arbitrary division of the population according to spatial position. As such it should not be expected that the vowel systems will be extremely close within these groups, with distinct differences only occurring between adjacent groups. Even within a single group, there is some spatial distance, and it is possible that different agents within a group have learned slightly different sets of vowels. Some of the individual diagrams remain a little unclear – it is not always obvious whether one or other of the clusters represents one or two vowels.

However, it would appear that most groups have developed a four or five vowel system, which is largely shared amongst the agents within a group.

Figure 7.3. The emergent vowel system of the population. Each diagram shows the phonemes used by a different contiguous sub-group of the population. Top row shows groups 1-3, bottom row groups 4 and 5.

Figure 7.4 emphasises the differences that exist across the population. In this figure, the phonemes of the first and the last groups are shown together (as white and black dots respectively).

Figure 7.4. Emergent vowel systems of the first and last sub-populations.

As with the previous work (sections 6.6 and 6.7), a dialect continuum has emerged in the population. Minor changes exist within neighbourhoods, allowing successful communication therein. Across the population, more major shifts and differences exist.

Although the model is entirely unrelated, the qualitative result is the same: the negotiation of a communication scheme/phonology within a population, where neighbourhoods limit interactions, gives rise to emergent dialects without any need or motivation to create the different dialects. The de Boer model has some advantages over our own for studies of language change and diversity. There is no fixed number of signals, or of ‘meanings’, leading to systems which are more open, more like those found in human language. The differences in vowel use by different sub-populations, or over time (de Boer, 2000, p.190-192), generate what appear to be chains of changes. The close parallel of the emergent sound systems to those of human language makes comparison of results to observed changes in human languages possible. As such, there is potential to use this model to further study the development of push/drag chains in vowel sound systems (systems of change where movements in clusters create spaces for other clusters to move into, creating a ‘drag’ effect, or where the movement of one sound forces another to move apart, a ‘push’ effect (King, 1969)).

7.3 Discussion: Towards a Modified-Neutral Theory of Language Change and Diversity
The experiments in this chapter and the last both demonstrate a form of cultural evolution without any cultural selective pressure. Thus these experiments demonstrate one form of neutral evolution of language – the evolution of language without selective pressure on adopted forms being exerted by social bonds or factors.
7.3.1 Neutral Evolution Revisited

Neutral evolution, and its application to linguistic evolution, was discussed previously, in section 2.5.6. In light of these most recent results, we look again at neutral evolution, and how it may apply to the eol. Neutral evolution is evolution in which selective processes do not operate (Kimura, 1983). Neutral evolution, also known as genetic drift, can occur wherever there is variation in a population but none of the variants has any specific selective advantage. By chance, rather than by adaptive selection, one variant may become a new norm. The resultant species or languages will be distinguishable from their ancestors. For biological entities the definition of ‘neutral’ change is quite clear, as being any genetic mutation which does not affect the reproductive success of the creature. In the case of language there are a number of possible interpretations.
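The drift process described above, in which one variant becomes the norm by chance alone, can be illustrated with a short sketch. This is a generic random-copying model written for illustration, not a model from this thesis. The population size, the two variants ‘a’ and ‘b’ and the function name are assumptions.

```python
import random
from collections import Counter

def drift(population, generations, rng):
    """Each generation, every learner copies the variant used by a randomly
    chosen member of the previous generation; no variant is favoured."""
    for _ in range(generations):
        population = [rng.choice(population) for _ in population]
    return Counter(population)

rng = random.Random(1)
start = ["a"] * 10 + ["b"] * 10    # two equally common, equally 'fit' variants
print(drift(start, 200, rng))      # counts of each surviving variant
```

In a small population such as this, a long run will typically end with only one of the two variants remaining, even though neither is ever preferred. Which variant survives depends on the random seed.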

The first is an analogue of the original meaning. A neutral change in language may refer to some linguistic change which does not affect the ease, or difficulty, of acquiring that language. Christiansen (2000) and Kirby (1999) have looked at how the ease with which a particular language variant can be learned can influence its survival, showing that hard-to-learn variants are selected against and may not survive. What is left may then be a large number of variations that are all – possibly approximately – as easy to learn as each other. Selection amongst these variants may then proceed in the manner of neutral evolution. This will henceforth be referred to as linguistically neutral evolution. A second interpretation, closer to the original biological meaning, is that an evolutionarily neutral linguistic change is one which does not affect the reproductive success of the language users. There are a number of ways in which language influences evolutionary fitness, and determining how speaking a particular language sways the speakers’ reproductive chances is not entirely straightforward. There are two distinct ways in which the fitness of language users could be affected by changes in their speech, according to whether the changes affect the communicative success or effectiveness of dialogues or whether they affect the speakers’ social standing within their group. Changes that do not affect the success of communication could again be considered as being linguistically neutral. Changes that do not affect the fitness of language users according to social position will be referred to as socially neutral evolution.
7.3.2 Adaptive, Maladaptive and Neutral Change

Where changes affect the communicative success, whether the benefits are to be gained by communicating ideas, observations or by gossiping (Dunbar, 1996), adaptive changes will be those which make communication, both signalling and interpreting, easier. In arguing that language change cannot be socially neutral, Milroy (1992) argues that from the perspective of communicative fitness, all change is dysfunctional – by not speaking precisely the same language as others an increased chance of misinterpretation is introduced (also see section 2.5.2). This is used to justify arguments that change must be socially adaptive. Such a view is extensively denied by Lass (1997). Lass notes that language features a high degree of redundancy and as a result it is possible to make quite significant changes before communication is adversely affected, a point also made by Pinker (1994). While some changes may affect successful communication, there is much room for changes which do not degrade communication.

However, the social benefits of speaking the correct dialect are well-documented (see, for example, Chambers, 1995), and there have been numerous studies of how dialect markers are used in determining membership of all manner of groups, from street gangs to business elites. Social marking may also influence reproductive fitness of language users, as a consequence of increasing cooperation amongst those with similar dialect (Nettle and Dunbar, 1997). Nettle (1999a) further claims that, with a neutral model, it is difficult to account for diversification without there being complete geographical isolation between groups. Thus, Nettle proposes that in order for linguistic evolution to occur without geographical isolation additional mechanisms are required. Nettle argues that the social functions of language are required for the emergence of linguistic diversity, a view shared by Dunbar: “… dialects arose as an attempt to control the depredations of those who would exploit peoples’ natural cooperativeness” (Dunbar, 1996, p169). However, Nettle’s arguments rely on a particular learning model – in which the learners sample the speech of the population and learn an ‘average’ of the language around them. This further relies on the equal distribution of individuals, with a uniform likelihood of any one individual interacting with any other. As recognised by Cavalli-Sforza and Feldman (1978), in any group the amount of influence exerted on any one individual by any one of the others will vary according to a number of factors. This reduces the effect of averaging, and increases the potential for sub-populations to vary from the mean. The different social networks within groups reduce the need for geographical isolation to produce linguistic diversity. Nettle does not consider the effects of sub-groups within communities, which I suggest is an important feature in the development of linguistic diversity.
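The contrast drawn here, between averaging over a uniformly mixed population and averaging over local contacts only, can be explored with a small sketch. It is an illustration of the argument rather than a reproduction of Nettle’s models or of the experiments in this thesis. Each agent holds a single number (standing in for, say, a formant value), learners acquire the mean of their teachers plus Gaussian noise, and the teachers are either the whole population or the agents within distance d on a line. All names and parameter values are assumptions.

```python
import random

def next_generation(values, d, noise, rng):
    """Each learner acquires the mean of its teachers' values plus noise.
    d=None means every learner averages over the whole population;
    otherwise teachers are the agents within distance d on a line."""
    learned = []
    for i in range(len(values)):
        if d is None:
            teachers = values
        else:
            teachers = values[max(0, i - d):i + d + 1]
        mean = sum(teachers) / len(teachers)
        learned.append(mean + rng.gauss(0.0, noise))
    return learned

def spread(values):
    """Difference between the largest and smallest value in the population."""
    return max(values) - min(values)

def run(d, generations=200, n=100, noise=0.01, seed=0):
    rng = random.Random(seed)
    values = [0.5] * n               # start from a homogeneous population
    for _ in range(generations):
        values = next_generation(values, d, noise, rng)
    return spread(values)

print("uniform mixing:    ", run(None))
print("neighbourhood d=10:", run(10))
```

Under uniform mixing every learner is pulled back to the same population mean each generation, so only one generation of noise separates any two agents. With local averaging, differences between distant parts of the line are not corrected in this way. One would therefore expect the second figure to be the larger, although the exact values depend on the seed and parameters chosen.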
Dunbar (1996) puts forward 150 members as being a natural group size for human communities, and points to “sympathy group” sizes of between 10 and 15 people. Any change which does occur must be propagated over a series of interactions between individuals. With the existence of groups-within-groups, there is no need for isolation before linguistic diversity can emerge. Further, the averaging effect itself is questionable. For example, for random variation in the formant frequencies of phonemes it may not be the case that such variation will ‘cancel out’, or that the average values will be learned. Phonemic and articulatory constraints (see Lindblom, 1998; de Boer and Vogt, 1999) may prevent the ‘cancelling

out’, and the learned phonemes may be only tolerably close to those heard. Given two distinct forms of a linguistic feature, a learner need not choose just one to learn, or learn some composite, but may learn both. One form may be preferred, but both may be used in varying amounts (such variations in language use often being the focus of studies in sociolinguistics). As well as applying to the lexicon, it has been proposed that language learners learn multiple grammars, so as to cope with the variation in grammars in use around them (Kroch, 1989). This objection about averaging extends to Nettle’s computational model, in which averaging is explicitly performed. In contrast, the results presented in this and the previous chapter successfully show the emergence of dialects as a consequence of localised variations and interactions without any social significance or utility of particular dialect forms. While particular linguistic forms may mark group membership, other linguistic features and forms might be free of any information that marks the speaker as belonging to a particular group. Where they exist, changes to non-marking features – be they phonological, lexical or grammatical – might be possible without any socially adaptive consequences for the speakers. Indeed, in Milroy and Milroy (1993) such changes are noted amongst the men and women of two Belfast communities, where the authors state, “it is the group for whom the vowel has less significance as a network marker which seems to be leading the linguistic change” (Milroy and Milroy, 1993; reprinted in Trudgill and Cheshire, 1998, p192). By the arguments of Milroy (1992), change must be socially adaptive to overcome the linguistic pressures against change, yet here we have evidence of change without social advantage. So we have seen that there is evidence and argument to support the existence of both socially and linguistically neutral change in language.
Naturally, a change which is socially neutral may or may not be linguistically neutral, and vice versa. Indeed, it is possible that should the benefit in one domain (social or linguistic) of a particular change be strong enough, change may occur even though it is somewhat maladaptive in the other domain. In the next section we look to other theories in linguistics and assess their impact on a neutral theory of language change.

7.3.3 Universality and Uniformitarianism

Two popular ideas about language also lend some support to the idea of language change as being the result of neutral evolution – Linguistic Universality and Uniformitarianism.

7.3.3.1 Universality

A prevalent belief about language in modern linguistics is that no language is superior to any other. Languages are not all the same, but they are all broadly equivalent in their overall expressive and communicative powers. While particular concepts may be somewhat easier or harder to express in one language than in another, on balance it is not possible to say that any one language is ‘better’, and many examples which were previously held to show such differences have since been reassessed and shown to be no more than myth (e.g. Pinker, 1994).

If this is the case, then it would appear that language change must be linguistically neutral. If it were not, then we should expect to be able to find languages that are better than others. Linguistic evolution occurs at a significantly faster rate than biological evolution, and over the millennia, as different languages evolved at slightly different rates, some examples of more primitive languages would be left. But as languages round the world have been studied it has been noted that while there are many primitive cultures, there are no primitive languages. Universality, however, says nothing on whether language change is socially neutral.

7.3.3.2 Uniformitarianism

The Uniformitarian Principle as applied to linguistics implies that all languages are subject to a common and unchanging set of rules (e.g. Lass, 1997, pp. 26-32). Over the millennia in which languages have been evolving, the same processes and the same rules have always applied, and the same rules and processes apply to all languages in the world. Such a principle is seemingly supportive of – or at least, not opposed to – the idea of neutral evolution. This does not imply that all language change is socially neutral, but we could suppose that, were all change socially driven, changes in society would have a more marked effect on the rules and processes of language change. Thus, the uniformitarian principle gives some support to the idea of language change being the result of socially neutral processes.

7.3.4 Relativity and Non-Uniformitarianism

So, from the above it seems that some of the prevalent views on language change are either supportive of, or not opposed to, a theory of language change through neutral evolution. However, there are those who continue to argue against, and to present new arguments against, linguistic universality and the uniformitarian principle. If our theory of neutral evolution is to be robust, then it should be shown how it is not totally incompatible with the counter positions to universality and uniformitarianism. In this section we highlight some of the possible challenges, and address them in the following.

7.3.4.1 Linguistic Relativity Hypothesis

The idea of linguistic universality is often defined in opposition to the notion of linguistic relativity. The linguistic relativity hypothesis holds, first, that speakers of particular languages have their thoughts shaped by the languages they speak, and second, that different languages may be significantly better or worse than others at expressing different concepts (Lucy, 1999). This is also commonly known as the Sapir-Whorf hypothesis, after the works of Edward Sapir (Sapir, 1949) and Benjamin Whorf (Whorf, 1956), and also as Linguistic Determinism. While highly debated, and currently out of favour (Pinker, 1994), the case for dismissing the linguistic relativity hypothesis is not proven, and it still has some prominent supporters (for example, Lucy, 1999; Slobin, 2003).

We can suppose that where significant differences exist in how two different languages represent concepts, one of the languages may be better adapted to a particular environment, society or task in general. The principles of the linguistic relativity hypothesis would therefore hold that some of the changes that have occurred in language histories have been linguistically adaptive, bestowing adaptive benefits on the speakers of the new variants. If languages can be adapted to particular cultures, then must the uniformitarian principle also be reassessed? This is discussed in the following paragraphs.

7.3.4.2 Non-Uniformitarianism

While uniformitarian principles are held by many in linguistics, at least broadly so, Newmeyer (2000) reminds us that this uniformitarian assumption is not a safe one. Newmeyer argues for two possible types of non-uniformitarianism, ‘non-U’. Strong-non-U supposes that there are functional forces acting on language that are somehow culturally
determined, leading to “non-accidental correlation between ‘purely’ grammatical features and aspects of culture, climate, and so on” (Newmeyer, 2000, p. 166). Weak-non-U supposes that there are constant functional forces acting on language, resulting in particular direction to linguistic evolution, such as a general tendency to change from OV order grammars to VO grammars. Thus, in Weak-non-U, the rules and processes are constant but they effect continued directed change on languages over time. In the strong case, the pressures applied to languages are themselves the product of the societies using them. There is evidence that some languages are adapted towards the functional needs of particular societies, and so this is possible. In both cases it is implied that over time a selective pressure is being applied to languages. Other arguments that the evolution of languages is in fact directed, and hence change is the result of adaptive and not neutral change, have been presented by Bichakjian (1999).

7.3.4.3 Directed Change in Sound Systems

The discovery of directed changes in language evolution is not new. Grimm’s and other sound laws are based on observations of such change (Adamska-Sałaciak, 1997). As noted in Chapter 2, much of historical linguistics takes a structuralist view where language is considered a system, and where changes in part of that system may lead to changes in another part. These sound laws represent other examples of directed change – examples which are replicated in different languages across the world, representing parallel evolution. Again it would appear that directed change in sound systems must be linguistically adaptive, and not neutral. This is the basis for one of the arguments against neutral change proposed by Nettle (1999a), who argues that a neutral model should result in a random pattern of linguistic diversification, whereas observed patterns of change have structural correlations.

7.3.5 Neutral Change in a Relativistic, Non-Uniformitarian World

Linguistic relativity and non-uniformitarian ideas, as well as directed change in sound systems, appear to present strong challenges to the idea that language change may be an evolutionarily neutral process. They are, however, challenges that can be answered.

7.3.5.1 Neutral Change In Sound Systems

There is currently no evidence, and few arguments, that particular sound changes have benefited particular languages. Based on current knowledge, it is not possible to say that
the English language was in any way better after the Great Vowel Shift (Baugh and Cable, 1978) than it was before – despite the many changes to the vowel sound system that occurred. If the result of many individual changes is a sound system with no obvious linguistic benefits over the sound system that existed before the changes, then the result of the sequence of changes must be adaptively neutral.

The directed changes that appear are a consequence of pressures on each sound in a system from the others. In a sense it is the individual sounds which are co-evolving over time, causing changes to the sound systems they form. While the changes are adaptive from the point of view of the survival and evolution of individual sounds, they are not adaptive changes for the speakers. This thought experiment shows an adaptive change in one domain (sound) as being neutral in another (human survival).

But beyond this, could language change – directed or direction-free – be an example of neutral evolution in its own domain? The pressure on language users to use vowels which are distinctive enough to be readily distinguishable limits the space of possible changes, selecting against ‘bad’ changes. In any sound system, there may be a set of possible changes which do not affect the linguistic ‘fitness’. This notion that there may be a set of variants within which selection can occur without adaptive benefit is known in biology as a neutral network.

7.3.5.2 Neutral Networks in Language Evolution

A neutral network (Huynen, 1995) is the term given to a set of variants of a form which are of equal fitness. If a change occurs selecting an adaptively equivalent variant, that change is adaptively neutral. The idea of neutral networks can be applied to linguistics at the phonemic, lexical and grammatical levels. In each case, there may only be a small number of such possible changes. Selection can occur freely within these limited sets. The existence of many other forms that are selected against does not mean that the selection that does occur is not evolutionarily neutral. Accordingly, neutral does not equate to a purely random process. By having a constrained choice of selectively neutral changes, we see that directed changes can still be evolutionarily neutral. If all selections were between adaptively neutral variants then, over time, it would most likely be noted that from similar start points, similar changes would occur in different sound systems.
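
As an illustration only, the idea of constrained neutral change can be sketched in a few lines of Python. The sketch is not one of the models used in this thesis: the one-dimensional vowel space, the MIN_GAP threshold and the function names are all assumptions made for the example. A vowel system counts as viable while its vowels remain far enough apart to be distinguished. Every viable system is treated as equally fit, so any accepted change is a move within a neutral network.

```python
import random

MIN_GAP = 0.15  # hypothetical distinctiveness threshold (an assumption)


def is_viable(vowels, min_gap=MIN_GAP):
    """True if all vowels lie in [0, 1] and neighbours stay distinguishable."""
    ordered = sorted(vowels)
    in_range = all(0.0 <= v <= 1.0 for v in ordered)
    distinct = all(b - a >= min_gap for a, b in zip(ordered, ordered[1:]))
    return in_range and distinct


def neutral_step(vowels, rng, step=0.05):
    """Shift one vowel slightly; keep the change only if the system stays viable."""
    i = rng.randrange(len(vowels))
    candidate = list(vowels)
    candidate[i] += rng.uniform(-step, step)
    # 'Bad' changes are selected against; all viable systems are equally fit.
    return candidate if is_viable(candidate) else list(vowels)


def drift(vowels, generations, seed=0):
    """Return the sequence of vowel systems produced by repeated neutral steps."""
    rng = random.Random(seed)
    history = [list(vowels)]
    for _ in range(generations):
        history.append(neutral_step(history[-1], rng))
    return history


history = drift([0.1, 0.5, 0.9], generations=2000, seed=1)
print("start:", history[0])
print("end:  ", [round(v, 3) for v in history[-1]])
```

Every system in history satisfies the same distinctiveness constraint, yet the final system will generally differ from the initial one: the sounds have changed without any change in 'fitness', and the constraint alone restricts which changes are available.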

7.3.5.3 Neutral and Adaptive Evolution

Our final argument contends that it does not even matter if it is shown that many examples of language change are adaptive. There is still good reason to believe that neutral change occurs, and is a significant process in the EoL. As noted above, the concept of neutral evolution comes from biology where there is certainly no lack of examples of adaptive change. Neutral evolution encompasses all of the changes that do not affect fitness. It has even been claimed that the majority of mutations are selectively neutral (Nimwegen et al., 1999). With a high degree of redundancy present in language, there is a great deal of room for such changes, a point also appreciated by Lass (1997, page 354). For neutral evolution to be considered a factor in linguistic evolution it only has to be shown that some changes are not functional. To comprehensively deny neutral evolution, it must be shown that all changes are functional. We have reviewed some of the objections to a neutral theory and shown them to be unconvincing.

7.4 A Modified-Neutral Theory of Linguistic Evolution

To summarise, we have argued that no functional or adaptive benefits are required to create linguistic diversity and that diversity should arise naturally from the imperfect transmission of language from users to learners. This represents a neutral theory of linguistic evolution and we have shown that this could well be responsible for diversity in language dialects. Accordingly, social or linguistic functions are seen to be unnecessary for the emergence of diversity.

What then is the role of social and personal motivation in language change? To say that adaptive benefits are not required for the evolution of diversity is not to say that such benefits do not exist, or that they do not influence the evolution of languages. Indeed, classic studies such as that of language change in Martha’s Vineyard (Labov, 1972) show that social factors do exert a strong influence.

Accepting that language changes are influenced by social pressures on language users, we can question why language users adapt their language according to such pressures. Is there something remarkable in the human ability to determine significant social information simply from accent and dialect, without regard to the content of the speech? Rather than claim that it is the usefulness of dialect as a social marker that led to the evolution of linguistic diversity, we would argue that the reverse is more likely – that the
unavoidable linguistic diversity has led to increased awareness, and utilisation, of such differences. While neither social nor linguistic function is required to create linguistic diversity – geographical spread and imperfect transmission alone are sufficient – both remain as important factors in the evolution of languages. We conclude that the neutral evolution of languages is unavoidable, but that it is not the only cause of change – hence, our use of the term a modified-neutral theory of linguistic evolution.

7.5 Conclusions

In this chapter we performed a further, short, literature review – this time of work detailing the results of, and conclusions from, a variety of computational models of linguistic diversity or related systems. Seeing that there was some need for additional corroboration of our previous results, such corroboration was provided by performing additional experiments using the emergent phonology model of de Boer. This model generated results which are qualitatively comparable to observations of human language diversity, as did the results of our previous model (Chapter 6).

From the results of this work, and from our reading of relevant literature, we derived our modified-neutral theory of linguistic evolution. This theory holds that particular language changes could not only be socially adaptive or mal-adaptive for speakers, or linguistically adaptive or mal-adaptive, but that they can be adaptively neutral. Language change and dialect diversity can emerge without any adaptive benefit: such change and diversity being a consequence of the repeated learning of language by different individuals distributed by spatial or social constraints.

In the final chapter, we review the importance of this work – and of the other work detailed in this thesis – and point to ways in which this work could be extended in the future.
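
The claim that geographical spread and imperfect transmission alone are sufficient for diversity can be illustrated with a deliberately minimal sketch. It is not the model of Chapter 6, nor de Boer's model: the single real-valued 'signal' per agent, the Gaussian copying error and all parameter values are assumptions chosen for the example. Each new learner copies the signal of one randomly chosen nearby agent, with a small error, and no variant confers any advantage on its user.

```python
import random


def run(num_agents=200, generations=500, radius=2, noise=0.05, seed=0):
    """Neutral, spatially local transmission along a line of agents."""
    rng = random.Random(seed)
    population = [0.0] * num_agents  # homogeneous start: everyone uses the same form
    for _ in range(generations):
        next_population = []
        for i in range(num_agents):
            lo = max(0, i - radius)
            hi = min(num_agents - 1, i + radius)
            teacher = population[rng.randint(lo, hi)]  # one nearby teacher
            # Imperfect transmission: the learner's form is only close to the teacher's.
            next_population.append(teacher + rng.gauss(0.0, noise))
        population = next_population
    return population


def mean_abs_difference(population, distance):
    """Average difference between agents that are `distance` positions apart."""
    count = len(population) - distance
    total = sum(abs(population[i] - population[i + distance]) for i in range(count))
    return total / count


final = run()
near = mean_abs_difference(final, 1)
far = mean_abs_difference(final, 100)
print(f"neighbours: {near:.3f}   distant agents: {far:.3f}")
```

Under these assumptions neighbouring agents end up with similar forms while distant agents diverge, although no fitness function appears anywhere in the code. The sketch says nothing about social marking; it only shows that such marking is not needed for differences to arise.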

Chapter 8 Conclusions

In this thesis, we have reviewed some of the growing body of work which uses artificial life methods to investigate the evolution of language and languages, and presented some of our own artificial life based investigations. In this, we have encountered some limitations of the artificial life method for building explanations of the EoL, but have also found results which we believe should be of interest to the wider community interested in the evolution of language.

8.1 Artificial Life and the Evolution of Language

In this thesis we have seen how computational models can illustrate and illuminate processes that are otherwise hard or impossible to observe, and how they can be useful in improving our understanding of such processes. Thus artificial models can be useful where it is not possible to make direct observations of the phenomenon of interest, such as is the case with the evolution of language.

Despite this, we have argued that such models have limitations in their power in adding to the considerable base of knowledge that exists on the evolution of human language ability. The principal theories that already exist have been developed without the use of models, and to date computational models have had limited worth in advancing any one theory of the origin of language over any other.

There is one key reason for this limitation. The problem domain is very open ended while the models that have been developed have, of necessity, been much more limited. The problem is similar to that discussed by Tim Taylor (Taylor, 1999) on the use of models of self-replicating systems to investigate the emergence and evolution of life and the development of species and organisms of increasing complexity. For the models to work very significant constraints are imposed on the representation of individuals or on the environmental and reproductive processes. By providing set rules and mechanisms for reproduction, or by providing a grossly limited set of possible interactions with the environment, the model will in effect have greatly limited the search space and set of possible outcomes.

This is most evident in some artificial life work on the EoL where the investigator sets out to determine if some application of language might provide sufficient adaptive benefit to explain the emergence of language. A model is then built in which agents
are able to evolve and interact, and suitable fitness rewards for agents that are able to use the model language. Ideally agents should not possess the appropriate mental structures and/or rules for language use, as early hominids presumably did not, but these should evolve. Instead, the rules for language use, and the representational capacity for language use are usually explicitly present from the beginning. Typically, the experimenter then runs the simulation and waits for random mutation to find the correct solution. With only one application of language possible, both the evolutionary changes available and the interactions possible are tightly bound, and very limited. From an open-ended problem domain, limited models are built that can only support a single theory. This has led to a range of works that each support one of a variety of uses of language as being the prime reason for its emergence (see reviews in Chapters 2, 3 and 5).

Against this background of numerous positive results is the knowledge that, in nature, only one species has evolved language, and only a very few species have naturally acquired learned systems of communication at all. If each of the many posited reasons for language’s genesis was, by itself, sufficient cause, then we might expect it to be somewhat more common than it is.

As methods and models improve it may be possible to relax these constraints. But it is clear that the complexity of the EoL is such that it will be some time before simulation models of the evolution of language will be able to significantly add to our knowledge of why and how language evolved. The best we can currently achieve with such models is to illustrate particular points or ideas.

8.2 Methodological Approaches

However, using some of the reviewed methodological principles for conducting artificial life experimentation (reviewed in Chapter 3) can help researchers in finding productive topics to pursue. In particular, work that tries to resolve current questions by building models, where competing theories can be tested against one another, can provide useful insight and useful additional evidence for resolving debate. An example of this approach is in the more detailed findings and arguments that have been drawn from the work of Chapters 6 and 7 compared to that of Chapter 5. Further examples are provided in some of the other work that has been cited in this thesis.

In contrast to work that attempts to explain the origins of language, however, modelling the cultural evolution of languages appears to suffer less from the problems mentioned
above. The evolution of languages occurs in a system with a predetermined biological substrate. The structure of individual agents, and the basic rules that they follow, do not vary over time, but are predetermined. This is an important difference, and one that bypasses the problems noted by Taylor (1999) about the use of artificial life for modelling evolution.

In our own work, we have tried as much as possible to follow the principles of good practice discussed in Chapter 3. The model developed in Chapter 4 and used in experiments in the following two chapters was a simplified version of an earlier model. In simplifying the earlier model, we were attempting to build a minimal model that was as simple as possible, yet not so simple that it was no longer relevant to the problem being modelled. We have presented arguments validating the design decisions taken in building the models – such as when deciding what characteristic features of language should be implemented in our minimal model (Chapter 4), or when choosing to build a model of the EoL in which all agents are equally well able to learn how to interpret signals, but not how to produce them (Chapter 5).

We have been equally attentive to the task of verification. In our work on the EoL, in Chapters 4 and 5, this verification resulted in our conclusion that our model was unable to provide evidence to favour either the continuous or the discontinuous EoL over the other, and forced us to consider the limitations of the model. In Chapter 6 we were able to verify the results of our model of the cultural evolution of language by a direct comparison of the qualitative results obtained against observations of human linguistic diversity. In the following chapter, we took this a step further by conducting a further experiment to verify our findings, using a different model that had already been extensively documented. That we are able to reproduce some of our key results in a second, quite different, model gives us significantly more confidence in our findings. These findings are now summarised in the next two sections.

8.3 Redundancy in Language Evolution

Perhaps the first notable finding of this work, in Chapter 5, was that a redundant language capacity (i.e. one capable of using more signals, or representing more meanings than strictly required) improved the fitness of agents possessing it.

The benefit of a redundant language capacity could also overcome moderate fitness penalties awarded according to capacity. This was interesting because the benefit occurred despite the absence of noise – the most commonly cited benefit of linguistic redundancy. We concluded that, in this case, the benefit was in allowing greater flexibility for agents attempting to learn conflicting lexicons. It was noted that along a language continuum, agents were able to understand neighbouring agents with a high degree of success despite large differences in dialect along the continuum.

While a similar idea has been previously presented (Lass (1997) pointed to redundancy as allowing changes to occur in language without affecting the success of communication), this work has brought into focus the likelihood that redundancy itself provides a significant benefit to language learners, improving their ability to learn language from conflicting and contradictory evidence provided by different speakers. However, the role and importance of redundancy requires further investigation, and is left as work for the future.

8.4 Language Change and Neutral Evolution

What is probably the most significant contribution of this thesis, however, has been the investigation of the evolution of linguistic diversity. This is a topic of current and ongoing debate. One view holds that language diversity and/or change must be adaptive, or otherwise it would not occur for a variety of reasons. Countering this is the view that change and diversity are unavoidable, and entirely to be expected due to the way language is learned. In this debate, our experiments, and the subsequent discussion, supported this latter case.

Previous work on language change as a form of neutral evolution was presented by Lass (1997) and Nettle (1999a), who presented arguments for and against, respectively. We further added to this existing work, and briefly outlined further arguments forwarding language change as being adaptively neutral. In doing this we also took time to look at possible challenges introduced by related ongoing debates in linguistics, on linguistic relativity and non-uniformitarianism.

With regard to simulation work, we argued that improbable assumptions coded into models supporting the former position gave rise to their results. To counter those results, we presented results from two distinct models supporting the latter argument.

As yet, however, this debate is not thoroughly resolved, and the work presented can be greatly extended.

8.5 Future Directions

As noted at the beginning, this thesis is the product of interdisciplinary research. Such work is often troubled, as has been noted by Bullock (1997) in the conclusions of his PhD thesis. In looking at the evolution of languages, some of the greatest conceptual and intellectual struggles have not been in attempting to bridge the gap between computer and linguistic sciences, but between different branches of linguistics itself – where principles and ideas in fields such as historical linguistics and sociolinguistics seem at times to be in opposition. It is to be hoped that, over time, work on the evolution of language might help to draw some of these distinct approaches together, providing a bigger picture that frames the sometimes opposing views that currently exist in, say, socio-linguistics and historical linguistics.

Regarding the work of this thesis, there are a variety of ways in which future research might expand upon it. As noted above, both the role of redundancy in language evolution and the effect of different linguistic ecologies present further questions. For example, linguistic ecology (Section 2.5.5) would appear to be an area where computational models could be productively applied – investigating more closely the social structures supportive of language stability, of creole genesis, or takeover by one of a number of competing languages.

Other interesting, and related, directions for future research include examining how the evolution of languages differs for different aspects of language (e.g. grammar versus phonology), or examining the role played by different mechanisms of language change. For example, Labov (1994, Chapter 18) examines the role played by two different mechanisms of language change, and proposes that both have some part to play. ALife work may be able to explore the workings of such mechanisms more closely, allowing insight not possible from currently available linguistic evidence.

Appendix A – Table of Signals Learned

The table below lists the full set of signals learned by a population in one of the experiments detailed in Section 5.4. This shows that considerable diversity exists, with neighbours tending to use the same – or similar – signals. This demonstrates the spontaneous emergence of dialects.

[Table A.1. Columns: Agent No. (0 to 99); Lang. Nodes (5 to 7 per agent); Signal 1 to Signal 8 (each a string of + and - values). The individual rows are not recoverable from the extracted text.]

Table A.1: This table shows the full set of the signals from one set of results from one of the experiments shown in Section 5.4.

Appendix B – Colour Plates


Colour Plate 6.1

Signals and colour values

Colour Plate 6.2

Evolution of Communication Schemes

(a) i-ix: Set of results after agents learn only from agents in the parent generation. Shown in each are the last 1000 generations of 100,000. See section 6.6.2 for more detail on interpreting the results.

[Panels i) to ix)]

(b) i-ix: Set of results after agents learn from other agents in their own generation as well as from agents in the parent generation. Shown in each are the last 1000 generations of 100,000.

[Panels i) to ix)]

Colour Plate 6.3

Emergence of global dialects

i) Convergence. ii) and iii) Global dialects, with internal variation.

[Panels i) to iii)]

Colour Plate 6.4

Diversity from Homogeneity

(a) From homogenous signalling schemes to diversity: three examples of the effect of noisy signals at 0.1% noise (i-iii) and three with 0.01% noise (iv-vi). Shown in each are the first 1000 generations of a run.

[Panels i) to vi)]

(b) Six examples of the effect of bias from initial weights on signal evolution. Initial weight values range from –0.05 to 0.05 (i-iii) or from –0.005 to 0.005 (iv-vi). Shown in each are the first 1000 generations of a run.

[Panels i) to vi)]

(c) A further example of the effect of innate bias. i – From homogeneity to diversity, the first thousand generations. ii – No evidence of original homogeneity, from the 9000th to the 10,000th generation.

[Panels i) and ii)]

