An Introduction to Formal Languages and Automata


Published by Madalina Pop


Published by: Madalina Pop on Nov 30, 2010
Copyright:Attribution Non-commercial








An Introduction to Formal Languages and Automata
Third Edition

University of California at Davis



World Headquarters
Jones and Bartlett Publishers
40 Tall Pine Drive
Sudbury, MA 01776
978-443-5000
info@jbpub.com
www.jbpub.com

Jones and Bartlett Publishers Canada
2406 Nikanna Road
Mississauga, ON L5C 2W6
CANADA

Jones and Bartlett Publishers International
Barb House, Barb Mews
London W6 7PA
UK

Copyright © 2001 by Jones and Bartlett Publishers, Inc.

All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without written permission from the copyright owner.

Library of Congress Cataloging-in-Publication Data
Linz, Peter.
An introduction to formal languages and automata / Peter Linz.--3rd ed.
p. cm.
Includes bibliographical references and index.


1. Formal languages. 2. Machine theory. I. Title.
QA267.3.L56 2000
511.3--dc21

Chief Executive Officer: Clayton Jones
Chief Operating Officer: Don W. Jones, Jr.
Executive Vice President and Publisher: Tom Manning
V.P., Managing Editor: Judith H. Hauck
V.P., College Editorial Director: Brian L. McKean
V.P., Design and Production: Anne Spencer
V.P., Sales and Marketing: Paul Shepardson
V.P., Manufacturing and Inventory Control: Therese Bräuer
Senior Acquisitions Editor: Michael Stranz
Development and Product Manager: Amy Rose
Marketing Director: Jennifer Jacobson
Production Coordination: Trillium Project Management
Cover Design: Night & Day Design
Composition: Northeast Compositors
Printing and Binding: Courier Westford
Cover printing: John Pow Company, Inc.


Cover Image © Jim Wehtje

This book was typeset in Textures 2.1 on a Macintosh G4. The font families used were Computer Modern, Optima, and Futura. The first printing was printed on 50 lb. Decision 94 Opaque.

Printed in the United States of America
04 03 02 01 10 9 8 7 6 5 4 3 2 1




Preface

This book is designed for an introductory course on formal languages, automata, computability, and related matters. These topics form a major part of what is known as the theory of computation. A course on this subject matter is now standard in the computer science curriculum and is often taught fairly early in the program. Hence, the prospective audience for this book consists primarily of sophomores and juniors majoring in computer science or computer engineering.

Prerequisites for the material in this book are a knowledge of some higher-level programming language (commonly C, C++, or Java) and familiarity with the fundamentals of data structures and algorithms. A course in discrete mathematics that includes set theory, functions, relations, logic, and elements of mathematical reasoning is essential. Such a course is part of the standard introductory computer science curriculum.

The study of the theory of computation has several purposes, most importantly (1) to familiarize students with the foundations and principles of computer science, (2) to teach material that is useful in subsequent courses, and (3) to strengthen students' ability to carry out formal and rigorous mathematical arguments. The presentation I have chosen for this text favors the first two purposes, although I would argue that it also serves the third. To present ideas clearly and to give students insight into the material, the text stresses intuitive motivation and illustration of ideas through examples. When there is a choice, I prefer arguments that are easily grasped to those that are concise and elegant but difficult in concept. I state definitions and theorems precisely and give the motivation for proofs, but often leave out the routine and tedious details. I believe that this is desirable for pedagogical reasons. Many proofs are unexciting applications of induction or contradiction, with differences that are specific to particular problems. Presenting such arguments in full detail is not only unnecessary, but interferes with the flow of the story. Therefore, quite a few of the proofs are sketchy and someone who insists on completeness may consider them lacking in detail. I do not see this as a drawback. Mathematical skills are not the byproduct of reading someone else's arguments, but come from thinking about the essence of a problem, discovering ideas suitable to make the point, then carrying them out in precise detail. The latter skill certainly has to be learned, and I think that the proof sketches in this text provide very appropriate starting points for such a practice.

Students in computer science sometimes view a course in the theory of computation as unnecessarily abstract and of little practical consequence. To convince them otherwise, one needs to appeal to their specific interests and strengths, such as tenacity and inventiveness in dealing with hard-to-solve problems. Because of this, my approach emphasizes learning through problem solving.
I By a problem-solvitrg approa,ch, rrteattthat students learn the material prirnarily througlt problem-type illustrative examplcs that show the motiwell as their conncction to the theorcrns attd vation bohirrd the concepts, a^s tirne, the examples rrriryinvolve a nontrivial aspect, clefinitiotrs. At the sa,me a for whir:h students must dist:ovc:r solution. In such an approach, htlrnework contribute to ir,rrrajor part of the leartting procefJs.The exercises exrrrc:ises :rt the end of each sectiorr are designed to illutrftrate and ilhrstrate the malevels. Some tr:rial and call orr sttrdents' problem-solving ability a,t vtr,riotrs of the exerci$csare fairly sirnple, pickirrg up where the discussiotrin the text Ieaves ofl and asking students to carry ou for antlther step or two. Other extrrcisesare very difficult, challenging evtrrrthe best ntinds. A good rnix of such exercisest:ilrr be a very eff'ectivt:teaching tool. Ttr help instructors, I have provitled separately an instructor's guide thrr.t outlines the sohrtitlrrs of the exercise$irrrd suggeststheir pcdagogical value. Students need not trrr asked to solvc all problems bqt should be assigned those which support tlte goals of the course and the viewpoint of the instnrt:tor. Computer sr:ience currir:ulir tliffer from institrrtiorr to iilstitutiorr; while a few emphasize the theoretir:nl side, others are alrnost entirely orientt:d toward practiclnl application. I believe that this tt:xt can serve eitlNlr of these extremes, llrclvided that the exercises a,re stllected carefully witli the students' btr,c:kgroundatld intertlsts in mind. At ttle same time, the irrstructor needs to irrform tlle


students aborrt the level of abstraction that is expected of tlxrm. This is particularly tnre of the proof-orietrttxl exercises. When I say "llrove that" or "show that," I hrr,ve mind that the student should think about how a, in proof rnight be cclnstrur:ted ancl then produr:e a, clear argurnent. How fbrrrtal srrch a, proof should bc needs to be deterrnined by the instructor, ancl stutltlnts should be given guitlrllines on this early irr the txlrrse. Tltc content of the text, is allllropriate for a one-sernestcr txrurse. Most of the nraterial can be covered, although some choice of errrpha.sis will have to be rnirde. In my classes,I gencrirlly gloss over proofs, skilr4rv as they are itr tlte tcxt. I usually give just enough coverageto make the rcsult plausible, asking strrdents to read the rest orr their own. Overall, though, little can be skippexl entirely witltout potential difficulties later on. A few uections, which are rnrlrked with an asterisk, c:rr,n omitted without loss to later be material. Most of tht: m:r,teria,l,however. is esscrrtial ancl must be covered. The first edition of this book wrr,u published in 1990,thc:stxxrnda,ppeared in 1906. The need for yet another cdition is gratifying and irrtlic;atesthat tny a1l1lrorr,ch, languagesrathcr than computations, is still viable. The via charrgcs ftrr the second edition wercl t)volutionary rather than rcvolrrtionary and addressedthe inevitable itrirct:rrra,c:ies obscurities of thtl Iirst edition. and It seertrs, however, that the second r:dition had reached a point of strrbility that requires f'ew changes,so thc tlrlk of the third editiorr is idcntical to the previous one. The major new featurtl of the third edition is the irrc:hrsion of a set of solved exercises. Initially, I felt that giving solutions to exerciseswas undesirable hecause it lirrritcd the number of problerrts thir.t r:a,nbe a,ssigned hourework. 
However, over the years I have received so many requests for assistance from students everywhere that I concluded that it is time to relent. In this edition I have included solutions to a small number of exercises. I have also added some new exercises to keep from reducing the unsolved problems too much. In selecting exercises for solution, I have favored those that have significant instructional value. For this reason, I give not only the answers, but show the reasoning that is the basis for the final result. Many exercises have the same theme; often I choose a representative case to solve, hoping that a student who can follow the reasoning will be able to transfer it to a set of similar instances. I believe that solutions to a carefully selected set of exercises can help students increase their problem-solving skills and still leave instructors a good set of unsolved exercises. In the text, exercises for which a solution or a hint is given are identified with a special symbol.

Also in response to suggestions, I have identified some of the harder exercises. This is not always easy, since the exercises span a spectrum of difficulty and because a problem that seems easy to one student may give considerable trouble to another. But there are some exercises that have posed a challenge for a majority of my students. These are marked with a single star (*). There are also a few exercises that are different from most in that they have no clear-cut answer. They may call for speculation, suggest additional reading, or require some computer programming. While they are not suitable for routine homework assignment, they can serve as entry points for further study. Such exercises are marked with a double star (**).

Over the last ten years I have received helpful suggestions from numerous reviewers, instructors, and students. While there are too many individuals to mention by name, I am grateful to all of them. Their feedback has been invaluable in my attempts to improve the text.

Peter Linz




Contents

Chapter 1 Introduction to the Theory of Computation
1.1 Mathematical Preliminaries and Notation
  Sets
  Functions and Relations
  Graphs and Trees
  Proof Techniques
1.2 Three Basic Concepts
  Languages
  Grammars
  Automata
*1.3 Some Applications

Chapter 2 Finite Automata
2.1 Deterministic Finite Accepters
  Deterministic Accepters and Transition Graphs
  Languages and Dfa's
  Regular Languages
2.2 Nondeterministic Finite Accepters
  Definition of a Nondeterministic Accepter
  Why Nondeterminism?
2.3 Equivalence of Deterministic and Nondeterministic Finite Accepters
*2.4 Reduction of the Number of States in Finite Automata

Chapter 3 Regular Languages and Regular Grammars
3.1 Regular Expressions
  Formal Definition of a Regular Expression
  Languages Associated with Regular Expressions
3.2 Connection Between Regular Expressions and Regular Languages
  Regular Expressions Denote Regular Languages
  Regular Expressions for Regular Languages
  Regular Expressions for Describing Simple Patterns
3.3 Regular Grammars
  Right- and Left-Linear Grammars
  Right-Linear Grammars Generate Regular Languages
  Right-Linear Grammars for Regular Languages
  Equivalence Between Regular Languages and Regular Grammars

Chapter 4 Properties of Regular Languages
4.1 Closure Properties of Regular Languages
  Closure under Simple Set Operations
  Closure under Other Operations
4.2 Elementary Questions about Regular Languages
4.3 Identifying Nonregular Languages
  Using the Pigeonhole Principle
  A Pumping Lemma

Chapter 5 Context-Free Languages
5.1 Context-Free Grammars
  Examples of Context-Free Languages
  Leftmost and Rightmost Derivations
  Derivation Trees
  Relation Between Sentential Forms and Derivation Trees
5.2 Parsing and Ambiguity
  Parsing and Membership
  Ambiguity in Grammars and Languages
5.3 Context-Free Grammars and Programming Languages

Chapter 6 Simplification of Context-Free Grammars
6.1 Methods for Transforming Grammars
  A Useful Substitution Rule
  Removing Useless Productions
  Removing λ-Productions
  Removing Unit-Productions
6.2 Two Important Normal Forms
  Chomsky Normal Form
  Greibach Normal Form
*6.3 A Membership Algorithm for Context-Free Grammars

Chapter 7 Pushdown Automata
7.1 Nondeterministic Pushdown Automata
  Definition of a Pushdown Automaton
  A Language Accepted by a Pushdown Automaton
7.2 Pushdown Automata and Context-Free Languages
  Pushdown Automata for Context-Free Languages
  Context-Free Grammars for Pushdown Automata
7.3 Deterministic Pushdown Automata and Deterministic Context-Free Languages
*7.4 Grammars for Deterministic Context-Free Languages

Chapter 8 Properties of Context-Free Languages
8.1 Two Pumping Lemmas
  A Pumping Lemma for Context-Free Languages
  A Pumping Lemma for Linear Languages
8.2 Closure Properties and Decision Algorithms for Context-Free Languages
  Closure of Context-Free Languages
  Some Decidable Properties of Context-Free Languages

Chapter 9 Turing Machines
9.1 The Standard Turing Machine
  Definition of a Turing Machine
  Turing Machines as Language Accepters
  Turing Machines as Transducers
9.2 Combining Turing Machines for Complicated Tasks
9.3 Turing's Thesis

Chapter 10 Other Models of Turing Machines
10.1 Minor Variations on the Turing Machine Theme
  Equivalence of Classes of Automata
  Turing Machines with a Stay-Option
  Turing Machines with Semi-Infinite Tape
  The Off-Line Turing Machine
10.2 Turing Machines with More Complex Storage
  Multitape Turing Machines
  Multidimensional Turing Machines
10.3 Nondeterministic Turing Machines
10.4 A Universal Turing Machine
10.5 Linear Bounded Automata

Chapter 11 A Hierarchy of Formal Languages and Automata
11.1 Recursive and Recursively Enumerable Languages
  Languages That Are Not Recursively Enumerable
  A Language That Is Not Recursively Enumerable
  A Language That Is Recursively Enumerable But Not Recursive
11.2 Unrestricted Grammars
11.3 Context-Sensitive Grammars and Languages
  Context-Sensitive Languages and Linear Bounded Automata
  Relation Between Recursive and Context-Sensitive Languages
11.4 The Chomsky Hierarchy

Chapter 12 Limits of Algorithmic Computation
12.1 Some Problems That Cannot Be Solved By Turing Machines
  The Turing Machine Halting Problem
  Reducing One Undecidable Problem to Another
12.2 Undecidable Problems for Recursively Enumerable Languages
12.3 The Post Correspondence Problem
12.4 Undecidable Problems for Context-Free Languages

Chapter 13 Other Models of Computation
13.1 Recursive Functions
  Primitive Recursive Functions
  Ackermann's Function
13.2 Post Systems
13.3 Rewriting Systems
  Markov Algorithms
  L-Systems

Chapter 14 An Introduction to Computational Complexity
14.1 Efficiency of Computation
14.2 Turing Machines and Complexity
14.3 Language Families and Complexity Classes
14.4 The Complexity Classes P and NP

Answers to Selected Exercises
References
Index


Chapter 1 Introduction to the Theory of Computation

Computer science is a practical discipline. Those who work in it often have a marked preference for useful and tangible problems over theoretical speculation. This is certainly true of computer science students who are interested mainly in working on difficult applications from the real world. Theoretical questions are interesting to them only if they help in finding good solutions. This attitude is appropriate, since without applications there would be little interest in computers. But given this practical orientation, one might well ask "why study theory?"

The first answer is that theory provides concepts and principles that help us understand the general nature of the discipline. The field of computer science includes a wide range of special topics, from machine design to programming. The use of computers in the real world involves a wealth of specific detail that must be learned for a successful application. This makes computer science a very diverse and broad discipline. But in spite of this diversity, there are some common underlying principles. To study these basic principles, we construct abstract models of computers and computation. These models embody the important features that are common to both hardware and software, and that are essential to many of the special and complex constructs we encounter while working with computers. Even when such models are too simple to be applicable immediately to real-world situations, the insights we gain from studying them provide the foundations on which specific development is based. This approach is of course not unique to computer science. The construction of models is one of the essentials of any scientific discipline, and the usefulness of a discipline is often dependent on the existence of simple, yet powerful, theories and laws.

A second, and perhaps not so obvious, answer is that the ideas we will discuss have some immediate and important applications. The fields of digital design, programming languages, and compilers are the most obvious examples, but there are many others. The concepts we study here run like a thread through much of computer science, from operating systems to pattern recognition.

The third answer is one of which we hope to convince the reader. The subject matter is intellectually stimulating and fun. It provides many challenging, puzzle-like problems that can lead to some sleepless nights. This is problem-solving in its pure essence.

In this book, we will look at models that represent features at the core of all computers and their applications. To model the hardware of a computer, we introduce the notion of an automaton (plural, automata). An automaton is a construct that possesses all the indispensable features of a digital computer. It accepts input, produces output, may have some temporary storage, and can make decisions in transforming the input into the output. A formal language is an abstraction of the general characteristics of programming languages. A formal language consists of a set of symbols and some rules of formation by which these symbols can be combined into entities called sentences. A formal language is the set of all strings permitted by the rules of formation. Although some of the formal languages we study here are simpler than programming languages, they have many of the same essential features. We can learn a great deal about programming languages from formal languages. Finally, we will formalize the concept of a mechanical computation by giving a precise definition of the term algorithm and study the kinds of problems that are (and are not) suitable for solution by such mechanical means. In the course of our study, we will show the close connection between these abstractions and investigate the conclusions we can derive from them.

In the first chapter, we look at these basic ideas in a very broad way to set the stage for later work. In Section 1.1, we review the main ideas from mathematics that will be required. While intuition will frequently be our guide in exploring ideas, the conclusions we draw will be based on rigorous arguments. This will involve some mathematical machinery, although these requirements are not extensive. The reader will need a reasonably good grasp of the terminology and of the elementary results of set theory, functions, and relations. Trees and graph structures will be used frequently, although little is needed beyond the definition of a labeled, directed graph.

Perhaps the most stringent requirement is the ability to follow proofs and an understanding of what constitutes proper mathematical reasoning. This includes familiarity with the basic proof techniques of deduction, induction, and proof by contradiction. We will assume that the reader has this necessary background. Section 1.1 is included to review some of the main results that will be used and to establish a notational common ground for subsequent discussion.

In Section 1.2, we take a first look at the central concepts of languages, grammars, and automata. These concepts occur in many specific forms throughout the book. In Section 1.3, we give some simple applications of these general ideas to illustrate that these concepts have widespread uses in computer science. The discussion in these two sections will be intuitive rather than rigorous. Later, we will make all of this much more precise; but for the moment, the goal is to get a clear picture of the concepts with which we are dealing.

1.1 Mathematical Preliminaries and Notation

Sets

A set is a collection of elements, without any structure other than membership. To indicate that $x$ is an element of the set $S$, we write $x \in S$. The statement that $x$ is not in $S$ is written $x \notin S$. A set is specified by enclosing some description of its elements in curly braces; for example, the set of integers 0, 1, 2 is shown as

$$S = \{0, 1, 2\}.$$

Ellipses are used whenever the meaning is clear. Thus, $\{a, b, \ldots, z\}$ stands for all the lower-case letters of the English alphabet, while $\{2, 4, 6, \ldots\}$ denotes the set of all positive even integers. When the need arises, we use more explicit notation, in which we write

$$S = \{i : i > 0,\ i \text{ is even}\} \qquad (1.1)$$

for the last example. We read this as "$S$ is the set of all $i$, such that $i$ is greater than zero, and $i$ is even," implying of course that $i$ is an integer.

The usual set operations are union ($\cup$), intersection ($\cap$), and difference ($-$), defined as

$$S_1 \cup S_2 = \{x : x \in S_1 \text{ or } x \in S_2\},$$
$$S_1 \cap S_2 = \{x : x \in S_1 \text{ and } x \in S_2\},$$
$$S_1 - S_2 = \{x : x \in S_1 \text{ and } x \notin S_2\}.$$

Another basic operation is complementation.

The complement of a set $S$, denoted by $\overline{S}$, consists of all elements not in $S$. To make this meaningful, we need to know what the universal set $U$ of all possible elements is. If $U$ is specified, then

$$\overline{S} = \{x : x \in U,\ x \notin S\}.$$

The set with no elements, called the empty set or the null set, is denoted by $\emptyset$. From the definition of a set, it is obvious that

$$S \cup \emptyset = S - \emptyset = S,$$
$$S \cap \emptyset = \emptyset,$$
$$\overline{\emptyset} = U,$$
$$\overline{\overline{S}} = S.$$

The following useful identities, known as DeMorgan's laws,

$$\overline{S_1 \cup S_2} = \overline{S_1} \cap \overline{S_2}, \qquad (1.2)$$
$$\overline{S_1 \cap S_2} = \overline{S_1} \cup \overline{S_2}, \qquad (1.3)$$

are needed on several occasions.

A set $S_1$ is said to be a subset of $S$ if every element of $S_1$ is also an element of $S$. We write this as

$$S_1 \subseteq S.$$

If $S_1 \subseteq S$, but $S$ contains an element not in $S_1$, we say that $S_1$ is a proper subset of $S$; we write this as

$$S_1 \subset S.$$

If $S_1$ and $S_2$ have no common element, that is, $S_1 \cap S_2 = \emptyset$, then the sets are said to be disjoint.

A set is said to be finite if it contains a finite number of elements; otherwise it is infinite. The size of a finite set is the number of elements in it; this is denoted by $|S|$.

A given set normally has many subsets. The set of all subsets of a set $S$ is called the powerset of $S$ and is denoted by $2^S$. Observe that $2^S$ is a set of sets.

Example 1.1

If $S$ is the set $\{a, b, c\}$, then its powerset is

$$2^S = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}.$$

Here $|S| = 3$ and $|2^S| = 8$. This is an instance of a general result; if $S$ is finite, then

$$|2^S| = 2^{|S|}.$$
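The set operations, DeMorgan's laws, and the powerset count of Example 1.1 can be tried out directly. The following is only a minimal sketch in Python: the universal set `U`, the sets `S1` and `S2`, and the helper names `powerset` and `complement` are assumptions chosen for illustration and do not come from the text.

```python
from itertools import combinations


def powerset(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


# Hypothetical universal set and sample sets, chosen only for this sketch.
U = set(range(10))
S1 = {0, 1, 2, 3}
S2 = {2, 3, 4, 5}


def complement(s, universe=U):
    """Complement of s relative to the given universal set."""
    return universe - s


union = S1 | S2          # elements in S1 or S2
intersection = S1 & S2   # elements in both S1 and S2
difference = S1 - S2     # elements in S1 but not in S2

# DeMorgan's laws (1.2) and (1.3), checked on the sample sets.
law_1_2 = complement(S1 | S2) == complement(S1) & complement(S2)
law_1_3 = complement(S1 & S2) == complement(S1) | complement(S2)

# Example 1.1: the powerset of {a, b, c} has 2**3 = 8 members.
S = {"a", "b", "c"}
P = powerset(S)
print(len(S), len(P))
```

Subsets are stored as `frozenset` values because Python's mutable `set` cannot itself be a member of a set; this is a property of the language, not of the mathematics.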

In many of our examples, the elements of a set are ordered sequences of elements from other sets. Such sets are said to be the Cartesian product of other sets. For the Cartesian product of two sets, which itself is a set of ordered pairs, we write

$$S = S_1 \times S_2 = \{(x, y) : x \in S_1,\ y \in S_2\}.$$

Example 1.2

Let $S_1 = \{2, 4\}$ and $S_2 = \{2, 3, 5, 6\}$. Then

$$S_1 \times S_2 = \{(2, 2), (2, 3), (2, 5), (2, 6), (4, 2), (4, 3), (4, 5), (4, 6)\}.$$

Note that the order in which the elements of a pair are written matters. The pair $(4, 2)$ is in $S_1 \times S_2$, but $(2, 4)$ is not.

The notation is extended in an obvious fashion to the Cartesian product of more than two sets; generally

$$S_1 \times S_2 \times \cdots \times S_n = \{(x_1, x_2, \ldots, x_n) : x_i \in S_i\}.$$

Functions and Relations

A function is a rule that assigns to elements of one set a unique element of another set. If $f$ denotes a function, then the first set is called the domain of $f$, and the second set is its range. We write

$$f : S_1 \to S_2$$

to indicate that the domain of $f$ is a subset of $S_1$ and that the range of $f$ is a subset of $S_2$. If the domain of $f$ is all of $S_1$, we say that $f$ is a total function on $S_1$; otherwise $f$ is said to be a partial function.
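Example 1.2 and the distinction between total and partial functions can be reproduced in a few lines. This is a sketch under stated assumptions: representing a function as a Python dictionary, the helper `is_total`, and the sample functions `f_total` and `f_partial` are all hypothetical choices made for illustration.

```python
from itertools import product

S1 = {2, 4}
S2 = {2, 3, 5, 6}

# Cartesian product S1 x S2 as a set of ordered pairs (Example 1.2).
cartesian = set(product(S1, S2))


def is_total(f, s1):
    """True if the dictionary f assigns a value to every element of s1."""
    return set(f) == set(s1)


# Hypothetical functions from S1 to S2, stored as dictionaries.
f_total = {2: 3, 4: 6}   # defined on all of S1
f_partial = {2: 5}       # defined on a proper subset of S1

print(sorted(cartesian))
print(is_total(f_total, S1), is_total(f_partial, S1))
```

Because tuples are ordered, `(4, 2)` and `(2, 4)` are different objects, which mirrors the remark that the order within a pair matters.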

In many applications, the domain and range of the functions involved are in the set of positive integers. Furthermore, we are often interested only in the behavior of these functions as their arguments become very large. In such cases an understanding of the growth rates is often sufficient and a common order of magnitude notation can be used. Let $f(n)$ and $g(n)$ be functions whose domain is a subset of the positive integers. If there exists a positive constant $c$ such that for all $n$

$$f(n) \le c\, g(n),$$

we say that $f$ has order at most $g$. We write this as

$$f(n) = O(g(n)).$$

If

$$|f(n)| \ge c\, |g(n)|,$$

then $f$ has order at least $g$, for which we use

$$f(n) = \Omega(g(n)).$$

Finally, if there exist constants $c_1$ and $c_2$ such that

$$c_1 |g(n)| \le |f(n)| \le c_2 |g(n)|,$$

$f$ and $g$ have the same order of magnitude, expressed as

$$f(n) = \Theta(g(n)).$$

In this order of magnitude notation, we ignore multiplicative constants and lower order terms that become negligible as $n$ increases.

Example 1.3

Let

$$f(n) = 2n^2 + 3n,$$
$$g(n) = n^3,$$
$$h(n) = 10n^2 + 100.$$

Then

$$f(n) = O(g(n)),$$
$$g(n) = \Omega(h(n)),$$
$$f(n) = \Theta(h(n)).$$

In order of magnitude notation, the symbol $=$ should not be interpreted as equality, and order of magnitude expressions cannot be treated like ordinary expressions. Manipulations such as

$$O(n) + O(n) = 2\,O(n)$$

are not sensible and can lead to incorrect conclusions. Still, if used properly, the order of magnitude arguments can be effective, as we will see in later chapters on the analysis of algorithms.
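The three claims of Example 1.3 can be made concrete by exhibiting constants that satisfy the definitions. The sketch below uses constants picked by hand for illustration ($c = 5$ for the $O$ bound, $c = 1/110$ for the $\Omega$ bound, and $c_1 = 1/25$, $c_2 = 1$ for the $\Theta$ bound); they are assumptions of this sketch, not values given in the text. Checking a finite range of $n$ is only supporting evidence, not a proof, although each inequality here is also easy to verify algebraically for all $n \ge 1$.

```python
def f(n):
    return 2 * n * n + 3 * n


def g(n):
    return n ** 3


def h(n):
    return 10 * n * n + 100


# Finite range of positive integers used for the numerical check.
N = range(1, 10_001)

# f(n) = O(g(n)): f(n) <= c * g(n) with the hand-picked constant c = 5.
big_o_holds = all(f(n) <= 5 * g(n) for n in N)

# g(n) = Omega(h(n)): g(n) >= c * h(n) with c = 1/110,
# written as 110 * g(n) >= h(n) to stay in exact integer arithmetic.
big_omega_holds = all(110 * g(n) >= h(n) for n in N)

# f(n) = Theta(h(n)): c1 * h(n) <= f(n) <= c2 * h(n) with c1 = 1/25, c2 = 1,
# again rewritten with integers only.
big_theta_holds = all(h(n) <= 25 * f(n) and f(n) <= h(n) for n in N)

print(big_o_holds, big_omega_holds, big_theta_holds)
```

The inequalities are multiplied through by the denominators so that no floating-point rounding can affect the comparison.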

Some functions can be represented by a set of pairs

$$\{(x_1, y_1), (x_2, y_2), \ldots\},$$

where $x_i$ is an element in the domain of the function, and $y_i$ is the corresponding value in its range. For such a set to define a function, each $x_i$ can occur at most once as the first element of a pair. If this is not satisfied, the set is called a relation. Relations are more general than functions: in a function each element of the domain has exactly one associated element in the range; in a relation there may be several such elements in the range.

One kind of relation is that of equivalence, a generalization of the concept of equality (identity). To indicate that a pair $(x, y)$ belongs to an equivalence relation, we write

$$x \equiv y.$$

A relation denoted by $\equiv$ is considered an equivalence if it satisfies three rules: the reflexivity rule

$$x \equiv x \text{ for all } x,$$

the symmetry rule

$$\text{if } x \equiv y \text{ then } y \equiv x,$$

and the transitivity rule

$$\text{if } x \equiv y \text{ and } y \equiv z, \text{ then } x \equiv z.$$

Example 1.4

Consider the relation on the set of nonnegative integers defined by

$$x \equiv y$$

if and only if

$$x \bmod 3 = y \bmod 3.$$

Then $2 \equiv 5$, $12 \equiv 0$, and $0 \equiv 36$. Clearly this is an equivalence relation, as it satisfies reflexivity, symmetry, and transitivity.
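Both the "at most once as the first element" condition and the three equivalence rules of Example 1.4 lend themselves to a brute-force check. The following sketch is illustrative only: the helper names `is_function` and `equivalent`, the sample pair sets, and the finite range `sample` are assumptions introduced here. Checking a finite sample illustrates the rules but does not replace the general argument.

```python
def is_function(pairs):
    """True if no first element occurs in more than one pair."""
    firsts = [x for x, _ in pairs]
    return len(firsts) == len(set(firsts))


def equivalent(x, y):
    """The relation of Example 1.4: x and y leave the same remainder mod 3."""
    return x % 3 == y % 3


# Hypothetical sets of pairs: the first defines a function, the second does not.
pairs_function = {(1, 2), (2, 4), (3, 6)}
pairs_relation = {(1, 2), (1, 3), (2, 4)}

# Finite sample of nonnegative integers on which the three rules are checked.
sample = range(30)

reflexive = all(equivalent(x, x) for x in sample)
symmetric = all(
    equivalent(y, x) for x in sample for y in sample if equivalent(x, y)
)
transitive = all(
    equivalent(x, z)
    for x in sample
    for y in sample
    for z in sample
    if equivalent(x, y) and equivalent(y, z)
)

# Group the sample by remainder; each group collects mutually equivalent numbers.
classes = {}
for x in sample:
    classes.setdefault(x % 3, []).append(x)

print(reflexive, symmetric, transitive, len(classes))
```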

Graphs and Trees

A graph is a construct consisting of two finite sets, the set $V = \{v_1, v_2, \ldots, v_n\}$ of vertices and the set $E = \{e_1, e_2, \ldots, e_m\}$ of edges. Each edge is a pair of vertices from $V$; for instance

$$e_i = (v_j, v_k)$$

is an edge from $v_j$ to $v_k$. We say that the edge $e_i$ is an outgoing edge for $v_j$ and an incoming edge for $v_k$. Such a construct is actually a directed graph (digraph), since we associate a direction (from $v_j$ to $v_k$) with each edge. Graphs may be labeled, a label being a name or other information associated with parts of the graph. Both vertices and edges may be labeled.

Graphs are conveniently visualized by diagrams in which the vertices are represented as circles and the edges as lines with arrows connecting the vertices. The graph with vertices $\{v_1, v_2, v_3\}$ and edges $\{(v_1, v_3), (v_3, v_1), (v_3, v_2), (v_3, v_3)\}$ is depicted in Figure 1.1.

[Figure 1.1]

A sequence of edges $(v_i, v_j), (v_j, v_k), \ldots, (v_m, v_n)$ is said to be a walk from $v_i$ to $v_n$. The length of a walk is the total number of edges traversed in going from the initial vertex to the final one. A walk in which no edge is repeated is said to be a path; a path is simple if no vertex is repeated. A walk from $v_i$ to itself with no repeated edges is called a cycle with base $v_i$. If no vertices other than the base are repeated in a cycle, then it is said to be simple. In Figure 1.1, $(v_1, v_3), (v_3, v_2)$ is a simple path from $v_1$ to $v_2$. The sequence of edges $(v_1, v_3), (v_3, v_3), (v_3, v_1)$ is a cycle, but not a simple one. If the edges of a graph are labeled, we can talk about the label of a walk. This label is the sequence of edge labels encountered when the path is traversed. Finally, an edge from a vertex to itself is called a loop. In Figure 1.1 there is a loop on vertex $v_3$.

On several occasions, we will refer to an algorithm for finding all simple paths between two given vertices (or all simple cycles based on a vertex). If we do not concern ourselves with efficiency, we can use the following obvious method. Starting from the given vertex, say $v_i$, list all outgoing edges $(v_i, v_k), (v_i, v_l), \ldots$. At this point, we have all paths of length one starting at $v_i$. For all vertices $v_k, v_l, \ldots$ so reached, we list all outgoing edges as long as they do not lead to any vertex already used in the path we are constructing. After we do this, we will have all simple paths of length two originating at $v_i$. We continue this until all possibilities are accounted for. Since there are only a finite number of vertices, we will eventually list all simple paths beginning at $v_i$. From these we select those ending at the desired vertex.
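The path-listing method just described can be sketched as follows. This is one possible rendering under stated assumptions, not the author's code: the function name `simple_paths`, the choice to represent a path by its sequence of vertices, and the string vertex names are all introduced here for illustration. The sketch covers only simple paths between two distinct vertices, not simple cycles.

```python
def simple_paths(edges, start, goal):
    """List all simple paths from start to goal as vertex sequences.

    Paths are grown one edge at a time; an edge is followed only if it
    does not lead to a vertex already used in the path being built.
    """
    outgoing = {}
    for u, v in edges:
        outgoing.setdefault(u, []).append(v)

    found = []
    frontier = [[start]]  # simple paths of the current length
    while frontier:
        extended = []
        for path in frontier:
            for v in outgoing.get(path[-1], []):
                if v not in path:  # skip vertices already on this path
                    extended.append(path + [v])
        found.extend(extended)
        frontier = extended  # the next round extends these by one more edge

    # Keep only the simple paths that end at the desired vertex.
    return [path for path in found if path[-1] == goal]


# The graph of Figure 1.1.
EDGES = [("v1", "v3"), ("v3", "v1"), ("v3", "v2"), ("v3", "v3")]

print(simple_paths(EDGES, "v1", "v2"))
```

The loop terminates because a simple path can contain each vertex at most once, so no path can be extended more times than there are vertices.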
A wrrlk in which no eclge is repeated is said to be a pathl rr path is simple if no vertex is repeated. These terms are illustrated in Figure 1.onuc'l'roN'l'o tsr: Tttr:rtHv ou Cotr.1.u'r'n'r'tow Figurc 1. If no vertices other thatt tlte base are rrlllc:itttxliri ir r:yr:le. In Figure 1. For a. We r:ontinue this until all possibilities are accounted for. ( t r . Fina.sirnple one. This definition implies that the root ha^srro irrcoming edges and that there &re some vertices without outgoing edges. ( u i . of llrt rxit ir. r r " i.ur). After we do this. called the root.s s a i d t o h e t r ..rr).'s. The height of the tree is the Iargest level number of any vertex.u3} and edges {(u1. At this point. ( u 3 .1 f \ \'otl ---t------------- Graphs are conveniently visualized by diagrarns in which the vertirx:s are represented as circles and the edges as lines with arrows corrrrt'cting tho verti<x. This label is thc scqucrrr:c r:dgo ler.u1) .y do not lead to arry vcrtclx alrtlirdy rrsed in the pa.
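The path-listing method described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simple_paths`, the vertex names, and the edge list (taken to be the graph of Figure 1.1) are assumptions made for the example, and the sketch does not handle the variant that looks for simple cycles based on a vertex.

```python
def simple_paths(edges, start, goal):
    """List all simple paths from start to goal, each as a list of edges."""
    # Group the outgoing edges of every vertex.
    outgoing = {}
    for u, v in edges:
        outgoing.setdefault(u, []).append((u, v))

    found = []
    # Each partial path is (edges so far, vertices already used).
    frontier = [([], [start])]
    while frontier:
        next_frontier = []
        for path, used in frontier:
            for (u, v) in outgoing.get(used[-1], []):
                if v in used:
                    continue  # would repeat a vertex, so the path is not simple
                extended = (path + [(u, v)], used + [v])
                if v == goal:
                    found.append(extended[0])
                next_frontier.append(extended)
        # After k rounds, next_frontier holds all simple paths of length k.
        frontier = next_frontier
    return found


# Edge set assumed for the graph of Figure 1.1 (hypothetical vertex names).
EDGES = [("v1", "v3"), ("v3", "v1"), ("v3", "v2"), ("v3", "v3")]
print(simple_paths(EDGES, "v1", "v2"))
```

Because no vertex may be reused, every partial path has at most as many edges as there are vertices, which is why the loop terminates.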

Figure 1.2 (tree diagram; labels include the root and Height = 3)

At times, we want to associate an ordering with the nodes at each level; in such cases we talk about ordered trees.

More details on graphs and trees can be found in most books on discrete mathematics.

Proof Techniques

An important requirement for reading this text is the ability to follow proofs. In mathematical arguments, we employ the accepted rules of deductive reasoning, and many proofs are simply a sequence of such steps. Two special proof techniques are used so frequently that it is appropriate to review them briefly. These are proof by induction and proof by contradiction.

Induction is a technique by which the truth of a number of statements can be inferred from the truth of a few specific instances. Suppose we have a sequence of statements $P_1, P_2, \ldots$ we want to prove to be true. Furthermore, suppose also that the following holds:

1. For some $k \geq 1$, we know that $P_1, P_2, \ldots, P_k$ are true.
2. The problem is such that for any $n \geq k$, the truths of $P_1, P_2, \ldots, P_n$ imply the truth of $P_{n+1}$.

We can then use induction to show that every statement in this sequence is true.

In a proof by induction, we argue as follows: From Condition 1 we know that the first $k$ statements are true. Then Condition 2 tells us that $P_{k+1}$ also must be true. But now that we know that the first $k + 1$ statements are true, we can apply Condition 2 again to claim that $P_{k+2}$ must be true, and so on. We need not explicitly continue this argument because the pattern is clear. The chain of reasoning can be extended to any statement. Therefore, every statement is true.

FINITE AUTOMATA

Our introduction in the first chapter to the basic concepts of computation, particularly the discussion of automata, was brief and informal. At this point, we have only a general understanding of what an automaton is and how it can be represented by a graph. To progress, we must be more precise, provide formal definitions, and start to develop rigorous results. We begin with finite accepters, which are a simple, special case of the general scheme introduced in the last chapter. This type of automaton is characterized by having no temporary storage. Since an input file cannot be rewritten, a finite automaton is severely limited in its capacity to "remember" things during the computation. A finite amount of information can be retained in the control unit by placing the unit into a specific state. But since the number of such states is finite, a finite automaton can only deal with situations in which the information to be stored at any time is strictly bounded. The automaton in Example 1.16 is an instance of a finite accepter.

Deterministic Finite Accepters

The first type of automaton we study in detail are finite accepters that are deterministic in their operation. We start with a precise formal definition of deterministic accepters.

Deterministic Accepters and Transition Graphs

Definition 2.1

A deterministic finite accepter or dfa is defined by the quintuple

$M = (Q, \Sigma, \delta, q_0, F)$,

where

$Q$ is a finite set of internal states,
$\Sigma$ is a finite set of symbols called the input alphabet,
$\delta : Q \times \Sigma \to Q$ is a total function called the transition function,
$q_0 \in Q$ is the initial state,
$F \subseteq Q$ is a set of final states.

A deterministic finite accepter operates in the following manner. At the initial time, it is assumed to be in the initial state $q_0$, with its input mechanism on the leftmost symbol of the input string. During each move of the automaton, the input mechanism advances one position to the right, so each move consumes one input symbol. When the end of the string is reached, the string is accepted if the automaton is in one of its final states. Otherwise the string is rejected. The input mechanism can move only from left to right and reads exactly one symbol on each step. The transitions from one internal state to another are governed by the transition function $\delta$. For example, if

$\delta(q_0, a) = q_1$,

then if the dfa is in state $q_0$ and the current input symbol is $a$, the dfa will go into state $q_1$.

In discussing automata, it is essential to have a clear and intuitive picture to work with. To visualize and represent finite automata, we use transition graphs, in which the vertices represent states and the edges represent transitions. The labels on the vertices are the names of the states, while the labels on the edges are the current values of the input symbol. For example, if $q_0$ and $q_1$ are internal states of some dfa $M$, then the graph associated with $M$ will have one vertex labeled $q_0$ and another labeled $q_1$. An edge $(q_0, q_1)$ labeled $a$ represents the transition $\delta(q_0, a) = q_1$.

The initial state will be identified by an incoming unlabeled arrow not originating at any vertex. Final states are drawn with a double circle.

More formally, if $M = (Q, \Sigma, \delta, q_0, F)$ is a deterministic finite accepter, then its associated transition graph $G_M$ has exactly $|Q|$ vertices, each one labeled with a different $q_i \in Q$. For every transition rule $\delta(q_i, a) = q_j$, the graph has an edge $(q_i, q_j)$ labeled $a$. The vertex associated with $q_0$ is called the initial vertex, while those labeled with $q_f \in F$ are the final vertices. It is a trivial matter to convert from the $(Q, \Sigma, \delta, q_0, F)$ definition of a dfa to its transition graph representation and vice versa.

Example 2.1

The graph in Figure 2.1 represents the dfa

$M = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta, q_0, \{q_1\})$,

where $\delta$ is given by

$\delta(q_0, 0) = q_0$, $\delta(q_0, 1) = q_1$,
$\delta(q_1, 0) = q_0$, $\delta(q_1, 1) = q_2$,
$\delta(q_2, 0) = q_2$, $\delta(q_2, 1) = q_1$.

Figure 2.1

This dfa accepts the string 01. Starting in state $q_0$, the symbol 0 is read first. Looking at the edges of the graph, we see that the automaton remains in state $q_0$. Next, the 1 is read and the automaton goes into state $q_1$. We are now at the end of the string and, at the same time, in a final state $q_1$. Therefore, the string 01 is accepted. The dfa does not accept the string 00, since after reading two consecutive 0's, it will be in state $q_0$. By similar reasoning, we see that the automaton will accept the strings 101, 0111, and 11001, but not 100 or 1100.

It is convenient to introduce the extended transition function $\delta^* : Q \times \Sigma^* \to Q$. The second argument of $\delta^*$ is a string, rather than a single symbol, and its value gives the state the automaton will be in after reading that string.

For example, if

$\delta(q_0, a) = q_1$

and

$\delta(q_1, b) = q_2$,

then

$\delta^*(q_0, ab) = q_2$.

Formally, we can define $\delta^*$ recursively by

$\delta^*(q, \lambda) = q$, (2.1)
$\delta^*(q, wa) = \delta(\delta^*(q, w), a)$, (2.2)

for all $q \in Q$, $w \in \Sigma^*$, $a \in \Sigma$. To see why this is appropriate, let us apply these definitions to the simple case above. First, we use (2.2) to get

$\delta^*(q_0, ab) = \delta(\delta^*(q_0, a), b)$. (2.3)

But

$\delta^*(q_0, a) = \delta(\delta^*(q_0, \lambda), a) = \delta(q_0, a) = q_1$.

Substituting this into (2.3), we get

$\delta^*(q_0, ab) = \delta(q_1, b) = q_2$,

as expected.

Languages and Dfa's

Having made a precise definition of an accepter, we are now ready to define formally what we mean by an associated language. The association is obvious: the language is the set of all the strings accepted by the automaton.

Definition 2.2

The language accepted by a dfa $M = (Q, \Sigma, \delta, q_0, F)$ is the set of all strings on $\Sigma$ accepted by $M$. In formal notation,

$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \in F\}$.
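The recursive definition (2.1) and (2.2) translates almost literally into code. The following Python sketch is illustrative only: the dictionary `DELTA` is an assumed transition table for the dfa of Example 2.1, chosen to agree with the strings the text reports as accepted and rejected, and the names `delta_star` and `accepts` are hypothetical.

```python
# Assumed transition function for the dfa of Example 2.1.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
INITIAL = "q0"
FINAL = {"q1"}


def delta_star(q, w):
    """Extended transition function, following (2.1) and (2.2)."""
    if w == "":
        return q  # (2.1): reading the empty string changes nothing
    # (2.2): process the prefix w[:-1] first, then the last symbol
    return DELTA[(delta_star(q, w[:-1]), w[-1])]


def accepts(w):
    """w is in L(M) exactly when delta_star(q0, w) is a final state."""
    return delta_star(INITIAL, w) in FINAL


for s in ["01", "00", "101", "0111", "11001", "100", "1100"]:
    print(s, accepts(s))
```

An iterative loop over the symbols would compute the same state; the recursion is kept here only because it mirrors the definition.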

Note that we require that $\delta$, and consequently $\delta^*$, be total functions. At each step, a unique move is defined, so that we are justified in calling such an automaton deterministic. A dfa will process every string in $\Sigma^*$ and either accept it or not accept it. Nonacceptance means that the dfa stops in a nonfinal state, so that

$\overline{L(M)} = \{w \in \Sigma^* : \delta^*(q_0, w) \notin F\}$.

Example 2.2

Consider the dfa in Figure 2.2.

Figure 2.2

In drawing Figure 2.2 we allowed the use of two labels on a single edge. Such multiply labeled edges are shorthand for two or more distinct transitions: the transition is taken whenever the input symbol matches any of the edge labels.

The automaton in Figure 2.2 remains in its initial state $q_0$ until the first $b$ is encountered. If this is also the last symbol of the input, then the string is accepted. If not, the dfa goes into state $q_2$, from which it can never escape. The state $q_2$ is a trap state. We see clearly from the graph that the automaton accepts all strings consisting of an arbitrary number of $a$'s, followed by a single $b$. All other input strings are rejected. In set notation, the language accepted by the automaton is

$L = \{a^n b : n \geq 0\}$.

These examples show how convenient transition graphs are for working with finite automata. While it is possible to base all arguments strictly on the properties of the transition function and its extension through (2.1) and (2.2), the results are hard to follow. In our discussion, we use graphs, which are more intuitive, as far as possible. To do so, we must of course have some assurance that we are not misled by the representation and that arguments based on graphs are as valid as those that use the formal properties of $\delta$. The following preliminary result gives us this assurance.

Theorem 2.1

Let $M = (Q, \Sigma, \delta, q_0, F)$ be a deterministic finite accepter, and let $G_M$ be its associated transition graph. Then for every $q_i, q_j \in Q$, and $w \in \Sigma^+$, $\delta^*(q_i, w) = q_j$ if and only if there is in $G_M$ a walk with label $w$ from $q_i$ to $q_j$.

Proof: This claim is fairly obvious from an examination of such simple cases as Example 2.1. It can be proved rigorously using an induction on the length of $w$. Assume that the claim is true for all strings $v$ with $|v| \leq n$. Consider then any $w$ of length $n + 1$ and write it as

$w = va$.

Suppose now that $\delta^*(q_i, v) = q_k$. Since $|v| = n$, there must be a walk in $G_M$ labeled $v$ from $q_i$ to $q_k$. But if $\delta^*(q_i, w) = q_j$, then $M$ must have a transition $\delta(q_k, a) = q_j$, so that by construction $G_M$ has an edge $(q_k, q_j)$ with label $a$. Thus there is a walk in $G_M$ labeled $va = w$ between $q_i$ and $q_j$. Since the result is obviously true for $n = 1$, we can claim by induction that, for every $w \in \Sigma^+$,

$\delta^*(q_i, w) = q_j$ (2.4)

implies that there is a walk in $G_M$ from $q_i$ to $q_j$ labeled $w$.

The argument can be turned around in a straightforward way to show that the existence of such a path implies (2.4), thus completing the proof.

Again, the result of the theorem is so intuitively obvious that a formal proof seems unnecessary. We went through the details for two reasons. The first is that it is a simple, yet typical example of an inductive proof in connection with automata. The second is that the result will be used over and over, so stating and proving it as a theorem lets us argue quite confidently using graphs. This makes our examples and proofs more transparent than they would be if we used the properties of $\delta^*$.

While graphs are convenient for visualizing automata, other representations are also useful. For example, we can represent the function $\delta$ as a table. The table in Figure 2.3 is equivalent to Figure 2.2. Here the row label is the current state, while the column label represents the current input symbol. The entry in the table defines the next state.

It is apparent from this example that a dfa can easily be implemented as a computer program; for example, as a simple table-lookup or as a sequence of "if" statements. The best implementation or representation depends on the specific application. Transition graphs are very convenient for the kinds of arguments we want to make here, so we use them in most of our discussions.

In constructing automata for languages defined informally, we employ reasoning similar to that for programming in higher-level languages.

But the programming of a dfa is tedious and sometimes conceptually complicated by the fact that such an automaton has few powerful features.

Figure 2.3

        a     b
q0      q0    q1
q1      q2    q2
q2      q2    q2

Example 2.3

Find a deterministic finite accepter that recognizes the set of all strings on $\Sigma = \{a, b\}$ starting with the prefix $ab$.

The only issue here is the first two symbols in the string; after they have been read, no further decisions need to be made. We can therefore solve the problem with an automaton that has four states: an initial state, two states for recognizing $ab$ ending in a final trap state, and one nonfinal trap state. If the first symbol is an $a$, and the second is a $b$, the automaton goes to the final trap state, where it will stay since the rest of the input does not matter. On the other hand, if the first symbol is not an $a$ or the second one is not a $b$, the automaton enters the nonfinal trap state. The simple solution is shown in Figure 2.4.

Figure 2.4
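As a hedged illustration of the "sequence of if statements" style of implementation, the four-state accepter of Example 2.3 might be sketched in Python as follows. The state names and function names are hypothetical and are not the labels used in Figure 2.4.

```python
from itertools import product


def accepts_prefix_ab(w):
    """Simulate a four-state dfa for strings on {a, b} that start with ab."""
    state = "start"
    for symbol in w:
        if state == "start":
            state = "seen_a" if symbol == "a" else "reject_trap"
        elif state == "seen_a":
            state = "accept_trap" if symbol == "b" else "reject_trap"
        # In either trap state the remaining input does not matter.
    return state == "accept_trap"


def agrees_with_startswith(max_len):
    """Compare the dfa with a direct prefix test on all short strings."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if accepts_prefix_ab(w) != w.startswith("ab"):
                return False
    return True


print(agrees_with_startswith(7))
```

The brute-force comparison is not a proof, but it is a quick way to trace many strings at once.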

Example 2.4

Find a dfa that accepts all the strings on $\{0, 1\}$, except those containing the substring 001.

In deciding whether the substring 001 has occurred, we need to know not only the current input symbol, but we also need to remember whether or not it has been preceded by one or two 0s. We can keep track of this by putting the automaton into specific states and labeling them accordingly. Like variable names in a programming language, state names are arbitrary and can be chosen for mnemonic reasons. For example, the state in which two 0s were the immediately preceding symbols can be labeled simply 00.

If the string starts with 001, then it must be rejected. This implies that there must be a path labeled 001 from the initial state to a nonfinal state. For convenience, this nonfinal state is labeled 001. This state must be a trap state, because later symbols do not matter. All other states are accepting states.

This gives us the basic structure of the solution, but we still must add provisions for the substring 001 occurring in the middle of the input. We must define $Q$ and $\delta$ so that whatever we need to make the correct decision is remembered by the automaton. In this case, when a symbol is read, we need to know some part of the string to the left, for example, whether or not the two previous symbols were 00. If we label the states with the relevant symbols, it is very easy to see what the transitions must be. For example,

$\delta(00, 0) = 00$,

because this situation arises only if there are three consecutive 0s. We are only interested in the last two, a fact we remember by keeping the dfa in the state 00. A complete solution is shown in Figure 2.5. We see from this example how useful mnemonic labels on the states are for keeping track of things. Trace a few strings, such as 100100 and 1010100, to see that the solution is indeed correct.

Figure 2.5

Regular Languages

Every finite automaton accepts some language. If we consider all possible finite automata, we get a set of languages associated with them. We will call such a set of languages a family. The family of languages that is accepted by deterministic finite accepters is quite limited.

The structure and properties of the languages in this family will become clearer as our study proceeds; for the moment we will simply attach a name to this family.

Definition 2.3

A language $L$ is called regular if and only if there exists some deterministic finite accepter $M$ such that

$L = L(M)$.

Example 2.5

Show that the language

$L = \{awa : w \in \{a, b\}^*\}$

is regular.

To show that this or any other language is regular, all we have to do is find a dfa for it. The construction of a dfa for this language is similar to Example 2.3, but a little more complicated. What this dfa must do is check whether a string begins and ends with an $a$; what is between is immaterial. The solution is complicated by the fact that there is no explicit way of testing the end of the string. This difficulty is overcome by simply putting the dfa into a final state whenever the second $a$ is encountered. If this is not the end of the string, and another $b$ is found, it will take the dfa out of the final state. Scanning continues in this way, each $a$ taking the automaton back to its final state. The complete solution is shown in Figure 2.6. Again, trace a few examples to see why this works. After one or two tests, it will be obvious that the dfa accepts a string if and only if it begins and ends with an $a$. Since we have constructed a dfa for the language, we can claim that, by definition, the language is regular.

Example 2.6

Let $L$ be the language in Example 2.5. Show that $L^2$ is regular.

Again we show that the language is regular by constructing a dfa for it. We can write an explicit expression for $L^2$, namely,

$L^2 = \{a w_1 a a w_2 a : w_1, w_2 \in \{a, b\}^*\}$.

Therefore, we need a dfa that recognizes two consecutive strings of essentially the same form (but not necessarily identical in value).



Figure 2.6




The diagram in Figure 2.6 can be used as a starting point, but the vertex $q_3$ has to be modified. This state can no longer be final since, at this point, we must start to look for a second substring of the form $awa$. To recognize the second substring, we replicate the states of the first part (with new names), with $q_3$ as the beginning of the second part. Since the complete string can be broken into its constituent parts wherever $aa$ occurs, we let the first occurrence of two consecutive $a$'s be the trigger that gets the automaton into its second part. We can do this by making $\delta(q_3, a) = q_4$. The complete solution is in Figure 2.7. This dfa accepts $L^2$, which is therefore regular.
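The explicit expression for $L^2$ used in this example can be checked by brute force on short strings. The Python sketch below is only an illustration: it compares the concatenation definition of $L^2$ with the pattern $a w_1 a a w_2 a$, and all function names are hypothetical.

```python
from itertools import product


def in_L(s):
    """L = {awa : w in {a,b}*}: begins and ends with a, length at least 2."""
    return len(s) >= 2 and s[0] == "a" and s[-1] == "a"


def in_L2_by_definition(s):
    """s is in L^2 if it splits into two pieces that are both in L."""
    return any(in_L(s[:i]) and in_L(s[i:]) for i in range(len(s) + 1))


def in_L2_by_formula(s):
    """s has the form a w1 a a w2 a: an inner 'aa' at positions i, i+1."""
    if len(s) < 4 or s[0] != "a" or s[-1] != "a":
        return False
    # The first a of the pair must not be s[0]; the second must not be s[-1].
    return any(s[i] == "a" and s[i + 1] == "a" for i in range(1, len(s) - 2))


def formulas_agree(max_len):
    """Compare both membership tests on every string up to max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if in_L2_by_definition(s) != in_L2_by_formula(s):
                return False
    return True


print(formulas_agree(8))
```

This kind of exhaustive comparison supports the expression but does not replace the argument in the text.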





Figure 2.7




The last example suggests the conjecture that if a language $L$ is regular, so are $L^2, L^3, \ldots$. We will see later that this is indeed correct.

1. Which of the strings 0001, 01001, 0000110 are accepted by the dfa in Figure

2. For $\Sigma = \{a, b\}$, construct dfa's that accept the sets consisting of
(a) all strings with exactly one $a$,
(b) all strings with at least one $a$,
(c) all strings with no more than three $a$'s,
(d) all strings with at least one $a$ and exactly two $b$'s,
(e) all the strings with exactly two $a$'s and more than two $b$'s.

3. Show that if we change Figure 2.6, making $q_3$ a nonfinal state and making $q_0, q_1, q_2$ final states, the resulting dfa accepts $\overline{L}$.

4. Generalize the observation in the previous exercise. Specifically, show that if $M = (Q, \Sigma, \delta, q_0, F)$ and $\widehat{M} = (Q, \Sigma, \delta, q_0, Q - F)$ are two dfa's, then $\overline{L(M)} = L(\widehat{M})$.

5. Give dfa's for the languages
(a) $L = \{ab^5wb^4 : w \in \{a, b\}^*\}$
(b) $L = \{w_1abw_2 : w_1 \in \{a, b\}^*, w_2 \in \{a, b\}^*\}$

6. Give a set notation description of the language accepted by the automaton depicted in the following diagram. Can you think of a simple verbal characterization of the language?




7. Find dfa's for the following languages on $\Sigma = \{a, b\}$.
(a) $L = \{w : |w| \bmod 3 = 0\}$
(b) $L = \{w : |w| \bmod 5 \neq 0\}$
(c) $L = \{w : n_a(w) \bmod 3 > 1\}$
(d) $L = \{w : n_a(w) \bmod 3 > n_b(w) \bmod 3\}$



(e) $L = \{w : (n_a(w) - n_b(w)) \bmod 3 > 0\}$
(f) $L = \{w : |n_a(w) - n_b(w)| \bmod 3 < 2\}$





8. A run in a string is a substring of length at least two, as long as possible and consisting entirely of the same symbol. For instance, the string $abbbaab$ contains a run of $b$'s of length three and a run of $a$'s of length two. Find dfa's for the following languages on $\{a, b\}$.
(a) $L = \{w : w$ contains no runs of length less than four$\}$
(b) $L = \{w :$ every run of $a$'s has length either two or three$\}$
(c) $L = \{w :$ there are at most two runs of $a$'s of length three$\}$
(d) $L = \{w :$ there are exactly two runs of $a$'s of length 3$\}$


9. Consider the set of strings on $\{0, 1\}$ defined by the requirements below. For each, construct an accepting dfa.
(a) Every 00 is followed immediately by a 1. For example, the strings 101, 0010, 0010011001 are in the language, but 0001 and 00100 are not.
(b) All strings containing 00 but not 000.
(c) The leftmost symbol differs from the rightmost one.
(d) Every substring of four symbols has at most two 0's. For example, 001110 and 011001 are in the language, but 10010 is not since one of its substrings, 0010, contains three zeros.
(e) All strings of length five or more in which the fourth symbol from the right end is different from the leftmost symbol.
(f) All strings in which the leftmost two symbols and the rightmost two symbols are identical.


10. Construct a dfa that accepts strings on $\{0, 1\}$ if and only if the value of the string, interpreted as a binary representation of an integer, is zero modulo five. For example, 0101 and 1111, representing the integers 5 and 15, respectively, are to be accepted.

11. Show that the language $L = \{vwv : v, w \in \{a, b\}^*, |v| = 2\}$ is regular.

12. Show that $L = \{a^n : n \geq 4\}$ is regular.

13. Show that the language $L = \{a^n : n \geq 0, n \neq 4\}$ is regular.

14. Show that the language $L = \{a^n : n = i + jk,\ i, k \text{ fixed},\ j = 0, 1, 2, \ldots\}$ is regular.

15. Show that the set of all real numbers in C is a regular language.

16. Show that if $L$ is regular, so is $L - \{\lambda\}$.

17. Use (2.1) and (2.2) to show that

$\delta^*(q, wv) = \delta^*(\delta^*(q, w), v)$

for all $w, v \in \Sigma^*$.



18. Let $L$ be the language accepted by the automaton in Figure 2.2. Find a dfa that accepts $L^2$.

19. Let $L$ be the language accepted by the automaton in Figure 2.2. Find a dfa for the language $L^2 - L$.

20. Let $L$ be the language in Example 2.5. Show that $L^*$ is regular.

21. Let $G_M$ be the transition graph for some dfa $M$. Prove the following.
(a) If $L(M)$ is infinite, then $G_M$ must have at least one cycle for which there is a path from the initial vertex to some vertex in the cycle and a path from some vertex in the cycle to some final vertex.
(b) If $L(M)$ is finite, then no such cycle exists.

22. Let us define an operation truncate, which removes the rightmost symbol from any string. For example, truncate($aaaba$) is $aaab$. The operation can be extended to languages by

truncate($L$) $= \{$truncate($w$) $: w \in L\}$.

Show how, given a dfa for any regular language $L$, one can construct a dfa for truncate($L$). From this, prove that if $L$ is a regular language not containing $\lambda$, then truncate($L$) is also regular.

23. Let $x = a_0a_1 \cdots a_n$, $y = b_0b_1 \cdots b_n$, $z = c_0c_1 \cdots c_n$ be binary numbers as defined in Example 1.17. Show that the set of strings of triplets

$(a_0, b_0, c_0)(a_1, b_1, c_1) \cdots (a_n, b_n, c_n)$,

where the $a_i, b_i, c_i$ are such that $x + y = z$, is a regular language.

24. While the language accepted by a given dfa is unique, there are normally many dfa's that accept a language. Find a dfa with exactly six states that accepts the same language as the dfa in Figure 2.4.

Nondeterministic Finite Accepters

Finite accepters are more complicated if we allow them to act nondeterministically. Nondeterminism is a powerful, but at first sight unusual idea. We normally think of computers as completely deterministic, and the element of choice seems out of place. Nevertheless, nondeterminism is a useful notion, as we shall see as we proceed.



Definition of a Nondeterministic Accepter

Nondeterminism means a choice of moves for an automaton. Rather than prescribing a unique move in each situation, we allow a set of possible moves. Formally, we achieve this by defining the transition function so that its range is a set of possible states.

Definition 2.4

A nondeterministic finite accepter or nfa is defined by the quintuple

$M = (Q, \Sigma, \delta, q_0, F)$,

where $Q$, $\Sigma$, $q_0$, $F$ are defined as for deterministic finite accepters, but

$\delta : Q \times (\Sigma \cup \{\lambda\}) \to 2^Q$.

Note that there are three major differences between this definition and the definition of a dfa. In a nondeterministic accepter, the range of $\delta$ is in the powerset $2^Q$, so that its value is not a single element of $Q$, but a subset of it. This subset defines the set of all possible states that can be reached by the transition. If, for instance, the current state is $q_1$, the symbol $a$ is read, and

$\delta(q_1, a) = \{q_0, q_2\}$,

then either $q_0$ or $q_2$ could be the next state of the nfa. Also, we allow $\lambda$ as the second argument of $\delta$. This means that the nfa can make a transition without consuming an input symbol. Although we still assume that the input mechanism can only travel to the right, it is possible that it is stationary on some moves. Finally, in an nfa, the set $\delta(q_i, a)$ may be empty, meaning that there is no transition defined for this specific situation.

Like dfa's, nondeterministic accepters can be represented by transition graphs. The vertices are determined by $Q$, while an edge $(q_i, q_j)$ with label $a$ is in the graph if and only if $\delta(q_i, a)$ contains $q_j$. Note that since $a$ may be the empty string, there can be some edges labeled $\lambda$.

A string is accepted by an nfa if there is some sequence of possible moves that will put the machine in a final state at the end of the string. A string is rejected (that is, not accepted) only if there is no possible sequence of moves by which a final state can be reached. Nondeterminism can therefore be viewed as involving "intuitive" insight by which the best move can be chosen at every state (assuming that the nfa wants to accept every string).



Figure 2.8

Example 2.7

Consider the transition graph in Figure 2.8. It describes a nondeterministic accepter since there are two transitions labeled $a$ out of $q_0$.

Example 2.8

A nondeterministic automaton is shown in Figure 2.9. It is nondeterministic not only because several edges with the same label originate from one vertex, but also because it has a $\lambda$-transition. Some transitions, such as $\delta(q_2, 0)$, are unspecified in the graph. This is to be interpreted as a transition to the empty set, that is, $\delta(q_2, 0) = \varnothing$. The automaton accepts the strings $\lambda$, 1010, and 101010, but not 110 and 10100. Note that for 10 there are two alternative walks, one leading to $q_0$, the other to $q_2$. Even though $q_2$ is not a final state, the string is accepted because one walk leads to a final state.

Again, the transition function can be extended so its second argument is a string. We require of the extended transition function $\delta^*$ that if

$\delta^*(q_i, w) = Q_j$,

then $Q_j$ is the set of all possible states the automaton may be in, having started in state $q_i$ and having read $w$. A recursive definition of $\delta^*$, analogous to (2.1) and (2.2), is possible, but not particularly enlightening. A more easily appreciated definition can be made through transition graphs.

Figure 2.9










Definition 2.5

For an nfa, the extended transition function is defined so that $\delta^*(q_i, w)$ contains $q_j$ if and only if there is a walk in the transition graph from $q_i$ to $q_j$ labeled $w$. This holds for all $q_i, q_j \in Q$ and $w \in \Sigma^*$.

Example 2.9

Figure 2.10 represents an nfa. It has several $\lambda$-transitions and some undefined transitions such as $\delta(q_2, a)$.

Suppose we want to find $\delta^*(q_1, a)$ and $\delta^*(q_2, \lambda)$. There is a walk labeled $a$ involving two $\lambda$-transitions from $q_1$ to itself. By using some of the $\lambda$-edges twice, we see that there are also walks involving $\lambda$-transitions to $q_0$ and $q_2$. Thus

$\delta^*(q_1, a) = \{q_0, q_1, q_2\}$.

Since there is a $\lambda$-edge between $q_2$ and $q_0$, we have immediately that $\delta^*(q_2, \lambda)$ contains $q_0$. Also, since any state can be reached from itself by making no move, and consequently using no input symbol, $\delta^*(q_2, \lambda)$ also contains $q_2$. Therefore

$\delta^*(q_2, \lambda) = \{q_0, q_2\}$.

Using as many $\lambda$-transitions as needed, you can also check that

$\delta^*(q_2, aa) = \{q_0, q_1, q_2\}$.

The definition of $\delta^*$ through labeled walks is somewhat informal, so it is useful to look at it a little more closely. Definition 2.5 is proper, since between any vertices $v_i$ and $v_j$ there is either a walk labeled $w$ or there is not, indicating that $\delta^*$ is completely defined. What is perhaps a little harder to see is that this definition can always be used to find $\delta^*(q_i, w)$.

In Section 1.1, we described an algorithm for finding all simple paths between two vertices. We cannot use this algorithm directly since, as Example 2.9 shows, a labeled walk is not always a simple path. We can modify the simple path algorithm, removing the restriction that no vertex or edge

Figure 2.10



can be repeated. The new algorithm will now generate successively all walks of length one, length two, length three, and so on.

There is still a difficulty. Given a $w$, how long can a walk labeled $w$ be? This is not immediately obvious. In Example 2.9, the walk labeled $a$ between $q_1$ and $q_2$ has length four. The problem is caused by the $\lambda$-transitions, which lengthen the walk but do not contribute to the label. The situation is saved by this observation: If between two vertices $v_i$ and $v_j$ there is any walk labeled $w$, then there must be some walk labeled $w$ of length no more than $\Lambda + (1 + \Lambda)|w|$, where $\Lambda$ is the number of $\lambda$-edges in the graph. The argument for this is: While $\lambda$-edges may be repeated, there is always a walk in which every repeated $\lambda$-edge is separated by an edge labeled with a nonempty symbol. Otherwise, the walk contains a cycle labeled $\lambda$, which can be replaced by a simple path without changing the label of the walk. We leave a formal proof of this claim as an exercise.

With this observation, we have a method for computing $\delta^*(q_i, w)$. We evaluate all walks of length at most $\Lambda + (1 + \Lambda)|w|$ originating at $v_i$. We select from them those that are labeled $w$. The terminating vertices of the selected walks are the elements of the set $\delta^*(q_i, w)$.

As we have remarked, it is possible to define $\delta^*$ in a recursive fashion as was done for the deterministic case. The result is unfortunately not very transparent, and arguments with the extended transition function defined this way are hard to follow. We prefer to use the more intuitive and more manageable alternative in Definition 2.5.

As for dfa's, the language accepted by an nfa is defined formally by the extended transition function.

Definition 2.6

The language $L$ accepted by an nfa $M = (Q, \Sigma, \delta, q_0, F)$ is defined as the set of all strings accepted in the above sense. Formally,

$$L(M) = \{w \in \Sigma^* : \delta^*(q_0, w) \cap F \neq \emptyset\}.$$

In words, the language consists of all strings $w$ for which there is a walk labeled $w$ from the initial vertex of the transition graph to some final vertex.
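The bounded-walk method for $\delta^*$ and the acceptance condition of Definition 2.6 can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the dictionary encoding of $\delta$, the use of the empty string as the label of a $\lambda$-edge, the names `delta_star` and `accepts`, and the three-state nfa in the usage lines are hypothetical choices made for illustration and are not taken from any figure. Rather than listing every walk separately, the sketch records which pairs (vertex, number of symbols matched) are reachable by some walk within the length bound, which yields the same set of terminating vertices.

```python
LAMBDA = ""  # assumed encoding of the label of a lambda-edge


def delta_star(delta, q, w):
    """Return the set of vertices reachable from q by some walk labeled w."""
    # Lambda in the text: the number of lambda-edges in the transition graph.
    lam_edges = sum(len(targets) for (_, a), targets in delta.items() if a == LAMBDA)
    bound = lam_edges + (1 + lam_edges) * len(w)

    # A configuration is (vertex, number of symbols of w already matched).
    frontier = {(q, 0)}
    seen = set(frontier)
    for _ in range(bound):  # each pass extends the walks by one more edge
        reached = set()
        for state, i in frontier:
            for target in delta.get((state, LAMBDA), ()):
                reached.add((target, i))
            if i < len(w):
                for target in delta.get((state, w[i]), ()):
                    reached.add((target, i + 1))
        frontier = reached - seen
        seen |= reached
    return {state for state, i in seen if i == len(w)}


def accepts(delta, q0, finals, w):
    """Definition 2.6: w is accepted if delta*(q0, w) meets the final states."""
    return bool(delta_star(delta, q0, w) & finals)


# Hypothetical nfa: q0 -a-> q1, q1 -lambda-> q2, q2 -lambda-> q0.
delta = {("q0", "a"): {"q1"}, ("q1", LAMBDA): {"q2"}, ("q2", LAMBDA): {"q0"}}
print(delta_star(delta, "q1", "a"))
print(accepts(delta, "q0", {"q2"}, "a"))
```

In this hypothetical graph $\Lambda = 2$, so for $|w| = 1$ the bound is 5. The walk labeled $a$ from $q_1$ back to $q_0$ uses exactly five edges, which shows that shorter cutoffs would miss part of $\delta^*(q_1, a)$.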





Example 2.10

What is the language accepted by the automaton in Figure 2.9? It is easy to see from the graph that the only way the nfa can stop in a final state is if the input is either a repetition of the string 10 or the empty string. Therefore the automaton accepts the language $L = \{(10)^n : n \geq 0\}$.



What happens when this automaton is presented with the string $w = 110$? After reading the prefix 11, the automaton finds itself in state $q_2$, with the transition $\delta(q_2, 0)$ undefined. We call such a situation a dead configuration, and we can visualize it as the automaton simply stopping without further action. But we must always keep in mind that such visualizations are imprecise and carry with them some danger of misinterpretation. What we can say precisely is that

$$\delta^*(q_0, 110) = \emptyset.$$

Thus, no final state can be reached by processing $w = 110$, and hence the string is not accepted.
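The precise reading of a dead configuration, namely that the set of possible states becomes empty, can be illustrated with a short Python trace. This is a sketch under assumptions: the function name `trace` and the dictionary encoding are hypothetical, and the transition table is a stand-in chosen to behave as described in the example (it accepts repetitions of 10 and gets stuck on 110); it is not a transcription of Figure 2.9 and it deliberately contains no $\lambda$-transitions.

```python
def trace(delta, start, w):
    """Return the set of possible states after each prefix of w (no lambda-edges)."""
    current = {start}
    history = [set(current)]
    for symbol in w:
        # Follow every edge labeled `symbol` out of every currently possible state.
        current = {t for s in current for t in delta.get((s, symbol), ())}
        history.append(set(current))
    return history


# Hypothetical stand-in nfa with final state q0.
delta = {("q0", "1"): {"q1"}, ("q1", "0"): {"q0"}, ("q1", "1"): {"q2"}}
finals = {"q0"}

for w in ["", "10", "1010", "110"]:
    steps = trace(delta, "q0", w)
    print(repr(w), steps, "accepted" if steps[-1] & finals else "rejected")
```

For the input 110 the last entry of the trace is the empty set, so no final state can be in it, which is exactly the statement $\delta^*(q_0, 110) = \emptyset$.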


Why Nondeterminism?
In reasoning about nondeterministic machines, we should be quite cautious in using intuitive notions. Intuition can easily lead us astray, and we must be able to give precise arguments to substantiate our conclusions. Nondeterminism is a difficult concept. Digital computers are completely deterministic; their state at any time is uniquely predictable from the input and the initial state. Thus it is natural to ask why we study nondeterministic machines at all. We are trying to model real systems, so why include such nonmechanical features as choice? We can answer this question in various ways.

Many deterministic algorithms require that one make a choice at some stage. A typical example is a game-playing program. Frequently, the best move is not known, but can be found using an exhaustive search with backtracking. When several alternatives are possible, we choose one and follow it until it becomes clear whether or not it was best. If not, we retreat to the last decision point and explore the other choices. A nondeterministic algorithm that can make the best choice would be able to solve the problem without backtracking, but a deterministic one can simulate nondeterminism with some extra work. For this reason, nondeterministic machines can serve as models of search-and-backtrack algorithms.

Nondeterminism is sometimes helpful in solving problems easily. Look at the nfa in Figure 2.8. It is clear that there is a choice to be made. The first alternative leads to the acceptance of the string $a^3$, while the second accepts all strings with an even number of a's. The language accepted by the nfa is $\{a^3\} \cup \{a^{2n} : n \geq 1\}$. While it is possible to find a dfa for this language, the nondeterminism is quite natural. The language is the union of two quite different sets, and the nondeterminism lets us decide at the outset which case we want. The deterministic solution is not as obviously



related to the definition. As we go on, we will see other and more convincing examples of the usefulness of nondeterminism.

In the same vein, nondeterminism is an effective mechanism for describing some complicated languages concisely. Notice that the definition of a grammar involves a nondeterministic element. In

$$S \rightarrow aSb \mid \lambda$$

we can at any point choose either the first or the second production. This lets us specify many different strings using only two rules.

Finally, there is a technical reason for introducing nondeterminism. As we will see, certain results are more easily established for nfa's than for dfa's. Our next major result indicates that there is no essential difference between these two types of automata. Consequently, allowing nondeterminism often simplifies formal arguments without affecting the generality of the conclusion.
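The view of an nfa as a model of search with backtracking can be made concrete with a small Python sketch. It is illustrative only: the function name `accepts`, the state names, and the transition table are hypothetical, chosen so that the machine accepts $\{a^3\} \cup \{a^{2n} : n \geq 1\}$ in the way the discussion of Figure 2.8 describes; the edges are an assumption, not a transcription of that figure. The sketch assumes there are no $\lambda$-transitions, so every recursive call consumes one symbol and the search terminates.

```python
def accepts(delta, finals, state, w):
    """Deterministic simulation of nondeterministic choice by backtracking."""
    if not w:
        return state in finals
    # Try each alternative in turn; a failed branch returns here,
    # which is the retreat to the last decision point.
    for nxt in sorted(delta.get((state, w[0]), ())):
        if accepts(delta, finals, nxt, w[1:]):
            return True
    return False


# Hypothetical nfa: from q0, one branch counts exactly three a's (q1, q2, q3),
# the other counts a's modulo two (q4, q5).
delta = {
    ("q0", "a"): {"q1", "q4"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q3"},
    ("q4", "a"): {"q5"},
    ("q5", "a"): {"q4"},
}
finals = {"q3", "q5"}

for n in range(7):
    print(n, accepts(delta, finals, "q0", "a" * n))
```

The only real choice is made on the first symbol. A machine that could always guess the right branch would never need the loop over alternatives; the deterministic simulation pays for that guess by trying both.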


1. Prove in detail the claim made in the previous section that if in a transition graph there is a walk labeled $w$, there must be some walk labeled $w$ of length no more than $\Lambda + (1 + \Lambda)|w|$.

2. Find a dfa that accepts the language defined by the nfa in Figure 2.8.

3. In Figure 2.9, find $\delta^*(q_0, 1011)$ and $\delta^*(q_1, 01)$.

4. In Figure 2.10, find $\delta^*(q_0, a)$ and $\delta^*(q_1, \lambda)$.

5. For the nfa in Figure 2.9, find $\delta^*(q_0, 1010)$ and $\delta^*(q_1, 00)$.

6. Design an nfa with no more than five states for the set $\{abab^n : n \geq 0\} \cup \{aba^n : n \geq 0\}$.

7. Construct an nfa with three states that accepts the language $\{ab, abc\}^*$.

8. Do you think Exercise 7 can be solved with fewer than three states?

9. (a) Find an nfa with three states that accepts the language $L = \{a^n : n \geq 1\} \cup \{b^m a^k : m \geq 0, k \geq 0\}$.
   (b) Do you think the language in part (a) can be accepted by an nfa with fewer than three states?

10. Find an nfa with four states for $L = \{a^n : n \geq 0\} \cup \{b^n a : n \geq 1\}$.

11. Which of the strings 00, 01001, 10010, 000, 0000 are accepted by the following nfa?



[Figure: transition graph of the nfa for Exercise 11]


12. What is the complement of the language accepted by the nfa in Figure 2.10?

13. Let $L$ be the language accepted by the nfa in Figure 2.8. Find an nfa that accepts $L \cup \{a^5\}$.

14. Give a simple description of the language in Exercise 12.

15. Find an nfa that accepts $\{a\}^*$ and is such that if in its transition graph a single edge is removed (without any other changes), the resulting automaton accepts $\{a\}$.

16. Can Exercise 15 be solved using a dfa? If so, give the solution; if not, give convincing arguments for your conclusion.

17. Consider the following modification of Definition 2.6. An nfa with multiple initial states is defined by the quintuple

$$M = (Q, \Sigma, \delta, Q_0, F),$$

where $Q_0 \subseteq Q$ is a set of possible initial states. The language accepted by such an automaton is defined as

$$L(M) = \{w : \delta^*(q_0, w) \text{ contains } q_f, \text{ for any } q_0 \in Q_0,\ q_f \in F\}.$$

Show that for every nfa with multiple initial states there exists an nfa with a single initial state that accepts the same language.

18. Suppose that in Exercise 17 we made the restriction $Q_0 \cap F = \emptyset$. Would this affect the conclusion?

19. Use Definition 2.5 to show that for any nfa

$$\delta^*(q, wv) = \bigcup_{p \in \delta^*(q, w)} \delta^*(p, v),$$

for all $q \in Q$ and all $w, v \in \Sigma^*$.

20. An nfa in which (a) there are no $\lambda$-transitions, and (b) for all $q \in Q$ and all $a \in \Sigma$, $\delta(q, a)$ contains at most one element, is sometimes called an incomplete dfa. This is reasonable since the conditions make it such that there is never any choice of moves.



For $\Sigma = \{a, b\}$, convert the incomplete dfa below into a standard dfa.


2.3 Equivalence of Deterministic and Nondeterministic Finite Accepters

We now come to a fundamental question. In what sense are dfa's and nfa's different? Obviously, there is a difference in their definition, but this does not imply that there is any essential distinction between them. To explore this question, we introduce the notion of equivalence between automata.

Two finite accepters, $M_1$ and $M_2$, are said to be equivalent if $L(M_1) = L(M_2)$, that is, if they both accept the same language.

As mentioned, there are generally many accepters for a given language, so any dfa or nfa has many equivalent accepters.

Example 2.11

The dfa shown in Figure 2.11 is equivalent to the nfa in Figure 2.9 since they both accept the language $\{(10)^n : n \geq 0\}$.

Figure 2.11

When we compare different classes of automata, the question invariably arises whether one class is more powerful than the other. By more powerful we mean that an automaton of one kind can achieve something that cannot be done by any automaton of the other kind. Let us look at this question for finite accepters. Since a dfa is in essence a restricted kind of nfa, it is clear that any language that is accepted by a dfa is also accepted by some nfa. But the converse is not so obvious. We have added nondeterminism, so it is at least conceivable that there is a language accepted by some nfa for which we cannot find a dfa. But it turns out that this is not so. The classes of dfa's and nfa's are equally powerful: For every language accepted by some nfa there is a dfa that accepts the same language.

This result is not obvious and certainly has to be demonstrated. The argument, like most arguments in this book, will be constructive. This means that we can actually give a way of converting any nfa into an equivalent dfa. The construction is not hard to understand; once the principle is clear it becomes the starting point for a rigorous argument. The rationale for the construction is the following. After an nfa has read a string $w$, we may not know exactly what state it will be in, but we can say that it must be in one state of a set of possible states, say $\{q_i, q_j, \ldots, q_k\}$. An equivalent dfa after reading the same string must be in some definite state. How can we make these two situations correspond? The answer is a nice trick: label the states of the dfa with a set of states in such a way that, after reading $w$, the equivalent dfa will be in a single state labeled $\{q_i, q_j, \ldots, q_k\}$. Since for a set of $|Q|$ states there are exactly $2^{|Q|}$ subsets, the corresponding dfa will have a finite number of states.

Most of the work in this suggested construction lies in the analysis of the nfa to get the correspondence between possible states and inputs. Before getting to the formal description of this, let us illustrate it with a simple example.

Example 2.12

Convert the nfa in Figure 2.12 to an equivalent dfa. The nfa starts in state $q_0$, so the initial state of the dfa will be labeled $\{q_0\}$. After reading an $a$, the nfa can be in state $q_1$ or, by making a $\lambda$-transition, in state $q_2$. Therefore the corresponding dfa must have a state labeled $\{q_1, q_2\}$ and a transition

$$\delta(\{q_0\}, a) = \{q_1, q_2\}.$$

In state $q_0$, the nfa has no specified transition when the input is $b$, therefore

$$\delta(\{q_0\}, b) = \emptyset.$$

A state labeled $\emptyset$ represents an impossible move for the nfa and, therefore, means nonacceptance of the string. Consequently, this state in the dfa must be a nonfinal trap state.

Figure 2.12

We have now introduced into the dfa the state $\{q_1, q_2\}$, so we need to find the transitions out of this state. Remember that this state of the dfa corresponds to two possible states of the nfa, so we must refer back to the nfa. If the nfa is in state $q_1$ and reads an $a$, it can go to $q_1$. Furthermore, from $q_1$ the nfa can make a $\lambda$-transition to $q_2$. If, for the same input, the nfa is in state $q_2$, then there is no specified transition. Therefore

$$\delta(\{q_1, q_2\}, a) = \{q_1, q_2\}.$$

Similarly,

$$\delta(\{q_1, q_2\}, b) = \{q_0\}.$$

At this point, every state has all transitions defined. The result, shown in Figure 2.13, is a dfa, equivalent to the nfa with which we started. The nfa in Figure 2.12 accepts any string for which $\delta^*(q_0, w)$ contains $q_1$. For the corresponding dfa to accept every such $w$, any state whose label includes $q_1$ must be made a final state.

Figure 2.13

Theorem 2.2

Let $L$ be the language accepted by a nondeterministic finite accepter $M_N = (Q_N, \Sigma, \delta_N, q_0, F_N)$. Then there exists a deterministic finite accepter $M_D = (Q_D, \Sigma, \delta_D, \{q_0\}, F_D)$ such that

$$L = L(M_D).$$

Proof: Given $M_N$, we use the procedure nfa-to-dfa below to construct the transition graph $G_D$ for $M_D$. To understand the construction, remember that $G_D$ has to have certain properties. Every vertex must have exactly $|\Sigma|$ outgoing edges, each labeled with a different element of $\Sigma$. During the construction, some of the edges may be missing, but the procedure continues until they are all there.

procedure: nfa-to-dfa

1. Create a graph $G_D$ with vertex $\{q_0\}$. Identify this vertex as the initial vertex.
2. Repeat the following steps until no more edges are missing.
   Take any vertex $\{q_i, q_j, \ldots, q_k\}$ of $G_D$ that has no outgoing edge for some $a \in \Sigma$.
   Compute $\delta^*(q_i, a), \delta^*(q_j, a), \ldots, \delta^*(q_k, a)$.
   Then form the union of all these $\delta^*$, yielding the set $\{q_l, q_m, \ldots, q_n\}$.
   Create a vertex for $G_D$ labeled $\{q_l, q_m, \ldots, q_n\}$ if it does not already exist.
   Add to $G_D$ an edge from $\{q_i, q_j, \ldots, q_k\}$ to $\{q_l, q_m, \ldots, q_n\}$ and label it with $a$.
3. Every state of $G_D$ whose label contains any $q_f \in F_N$ is identified as a final vertex.
4. If $M_N$ accepts $\lambda$, the vertex $\{q_0\}$ in $G_D$ is also made a final vertex.

It is clear that this procedure always terminates. Each pass through the loop in Step 2 adds an edge to $G_D$. But $G_D$ has at most $2^{|Q_N|}|\Sigma|$ edges, so that the loop eventually stops. To show that the construction also gives the correct answer, we argue by induction on the length of the input string. Assume that for every $v$ of length less than or equal to $n$, the presence in $G_N$ of a walk labeled $v$ from $q_0$ to $q_i$ implies that in $G_D$ there is a walk labeled $v$ from $\{q_0\}$ to a state $Q_i = \{\ldots, q_i, \ldots\}$. Consider now any $w = va$ and look at a walk in $G_N$ labeled $w$ from $q_0$ to $q_l$. There must then be a walk labeled $v$ from $q_0$ to $q_i$ and an edge (or a sequence of edges) labeled $a$ from $q_i$ to $q_l$. By the inductive assumption, in $G_D$ there will be a walk labeled $v$ from $\{q_0\}$ to $Q_i$. But by construction, there will be an edge from $Q_i$ to some state whose label contains $q_l$. Thus the inductive assumption holds for all strings of length $n + 1$. As it is obviously true for $n = 1$, it is true for all $n$. The result then is that whenever $\delta_N^*(q_0, w)$ contains a final state $q_f$, so does the label of $\delta_D^*(q_0, w)$. To complete the proof, we reverse the argument to show that if the label of $\delta_D^*(q_0, w)$ contains $q_f$, so must $\delta_N^*(q_0, w)$.

The arguments in this proof, although correct, are admittedly somewhat terse, showing only the major steps. We will follow this practice in the rest of the book, emphasizing the basic ideas in a proof and omitting minor details, which you may want to fill in yourself.

The construction in the above proof is tedious but important. Let us do another example to make sure we understand all the steps.

Example 2.13

Convert the nfa in Figure 2.14 into an equivalent deterministic machine. Since $\delta_N(q_0, 0) = \{q_0, q_1\}$, we introduce the state $\{q_0, q_1\}$ in $G_D$ and add an edge labeled 0 between $\{q_0\}$ and $\{q_0, q_1\}$. In the same way, considering $\delta_N(q_0, 1) = \{q_1\}$ gives us the new state $\{q_1\}$ and an edge labeled 1 between it and $\{q_0\}$.

There are now a number of missing edges, so we continue, using the construction of Theorem 2.2. With $a = 0$, $i = 0$, $j = 1$, we compute

$$\delta_N^*(q_0, 0) \cup \delta_N^*(q_1, 0) = \{q_0, q_1, q_2\}.$$

This gives us the new state $\{q_0, q_1, q_2\}$ and the transition

$$\delta_D(\{q_0, q_1\}, 0) = \{q_0, q_1, q_2\}.$$

Then, using $a = 1$, $i = 0$, $j = 1$, $k = 2$,

$$\delta_N^*(q_0, 1) \cup \delta_N^*(q_1, 1) \cup \delta_N^*(q_2, 1) = \{q_1, q_2\}$$

makes it necessary to introduce yet another state $\{q_1, q_2\}$. At this point, we have the partially constructed automaton shown in Figure 2.15. Since there are still some missing edges, we continue until we obtain the complete solution in Figure 2.16.

Figure 2.14

Figure 2.15

Figure 2.16
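The procedure nfa-to-dfa can be sketched in Python along the following lines. This is a hedged illustration rather than the book's own program: the names `closure`, `delta_star_symbol`, and `nfa_to_dfa`, the dictionary encoding of $\delta_N$, and the use of the empty string for $\lambda$ are assumptions. The sketch computes $\delta_N^*(q, a)$ for a single symbol by taking $\lambda$-reachable states before and after the move, which is a standard shortcut and not the walk enumeration described earlier. The nfa in the usage lines is chosen to agree with the transitions quoted in the example that converts the nfa of Figure 2.12; its exact edges are an assumption.

```python
from collections import deque

LAMBDA = ""  # assumed encoding of the label of a lambda-edge


def closure(delta, states):
    """All states reachable from `states` using lambda-edges only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, LAMBDA), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def delta_star_symbol(delta, q, a):
    """delta*(q, a) for one symbol a: lambda-moves, one a-edge, lambda-moves."""
    before = closure(delta, {q})
    moved = {t for s in before for t in delta.get((s, a), ())}
    return closure(delta, moved)


def nfa_to_dfa(delta, alphabet, q0, finals):
    start = frozenset({q0})                      # step 1
    vertices, dfa_delta = {start}, {}
    queue = deque([start])
    while queue:                                 # step 2: fill in missing edges
        vertex = queue.popleft()
        for a in alphabet:
            target = frozenset().union(
                *(delta_star_symbol(delta, q, a) for q in vertex))
            dfa_delta[(vertex, a)] = target
            if target not in vertices:
                vertices.add(target)
                queue.append(target)
    dfa_finals = {v for v in vertices if v & finals}   # step 3
    if closure(delta, {q0}) & finals:                  # step 4: nfa accepts lambda
        dfa_finals.add(start)
    return vertices, dfa_delta, start, dfa_finals


# Assumed nfa: q0 -a-> q1, q1 -a-> q1, q1 -lambda-> q2, q2 -b-> q0; final state q1.
nfa = {("q0", "a"): {"q1"}, ("q1", "a"): {"q1"},
       ("q1", LAMBDA): {"q2"}, ("q2", "b"): {"q0"}}
vertices, dfa_delta, start, dfa_finals = nfa_to_dfa(nfa, "ab", "q0", {"q1"})
for (vertex, a), target in sorted(dfa_delta.items(), key=str):
    print(sorted(vertex), a, "->", sorted(target))
```

The empty frozenset plays the role of the nonfinal trap state: it has no members, so every symbol leads from it back to itself. Only subsets that are actually reached are created, so the result usually has far fewer than $2^{|Q_N|}$ vertices.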

One important conclusion we can draw from Theorem 2.2 is that every language accepted by an nfa is regular.

1. Use the construction of Theorem 2.2 to convert the nfa in Figure 2.10 to a dfa. Can you see a simpler answer more directly?

2. Convert the nfa in Exercise 11, Section 2.2 into an equivalent dfa.

3. Convert the following nfa into an equivalent dfa.

4. Carefully complete the arguments in the proof of Theorem 2.2. Show in detail that if the label of $\delta_D^*(q_0, w)$ contains $q_f$, then $\delta_N^*(q_0, w)$ also contains $q_f$.

5. Is it true that for any nfa $M = (Q, \Sigma, \delta, q_0, F)$ the complement of $L(M)$ is equal to the set $\{w \in \Sigma^* : \delta^*(q_0, w) \cap F = \emptyset\}$? If so, prove it. If not, give a counterexample.

6. Is it true that for every nfa $M = (Q, \Sigma, \delta, q_0, F)$ the complement of $L(M)$ is equal to the set $\{w \in \Sigma^* : \delta^*(q_0, w) \cap (Q - F) \neq \emptyset\}$? If so, prove it; if not, give a counterexample.

7. Prove that for every nfa with an arbitrary number of final states there is an equivalent nfa with only one final state. Can we make a similar claim for dfa's?

8. Find an nfa without $\lambda$-transitions and with a single final state that accepts the set $\{a\} \cup \{b^n : n \geq 1\}$.

9. Let $L$ be a regular language that does not contain $\lambda$. Show that there exists an nfa without $\lambda$-transitions and with a single final state that accepts $L$.

10. Define a dfa with multiple initial states in an analogous way to the corresponding nfa in Exercise 17, Section 2.2. Does there always exist an equivalent dfa with a single initial state?

11. Prove that all finite languages are regular.

12. Show that if $L$ is regular, so is $L^R$.

13. Give a simple verbal description of the language accepted by the dfa in Figure 2.16. Use this to find another dfa, equivalent to the given one, but with fewer states.

14. Let $L$ be any language. Define $even(w)$ as the string obtained by extracting from $w$ the letters in even-numbered positions; that is, if

$$w = a_1 a_2 a_3 a_4 \ldots,$$

then

$$even(w) = a_2 a_4 \ldots.$$

Corresponding to this, we can define a language

$$even(L) = \{even(w) : w \in L\}.$$

Prove that if $L$ is regular, so is $even(L)$.

15. From a language $L$ we create a new language $chop2(L)$ by removing the two leftmost symbols of every string in $L$. Show that if $L$ is regular then $chop2(L)$ is also regular.

2.4 Reduction of the Number of States in Finite Automata*

Any dfa defines a unique language, but the converse is not true. For a given language, there are many dfa's that accept it. There may be a considerable difference in the number of states of such equivalent automata. In terms of the questions we have considered so far, all solutions are equally satisfactory, but if the results are to be applied in a practical setting, there may be reasons for preferring one over another.

Example 2.14

The two dfa's depicted in Figure 2.17(a) and 2.17(b) are equivalent, as a few test strings will quickly reveal. We notice some obviously unnecessary features of Figure 2.17(a). The state $q_5$ plays absolutely no role in the automaton since it can never be reached from the initial state $q_0$. Such a state is inaccessible, and it can be removed (along with all transitions relating to it) without affecting the language accepted by the automaton. But even after the removal of $q_5$, the first automaton has some redundant parts. The states reachable subsequent to the first move $\delta(q_0, 0)$ mirror those reachable from a first move $\delta(q_0, 1)$. The second automaton combines these two options.

Figure 2.17

From a strictly theoretical point of view, there is little reason for preferring the automaton in Figure 2.17(b) over that in Figure 2.17(a). However, in terms of simplicity, the second alternative is clearly preferable. Representation of an automaton for the purpose of computation requires space proportional to the number of states. For storage efficiency, it is desirable to reduce the number of states as far as possible. We now describe an algorithm that accomplishes this.

Definition 2.8

Two states $p$ and $q$ of a dfa are called indistinguishable if

$$\delta^*(p, w) \in F \text{ implies } \delta^*(q, w) \in F,$$

and

$$\delta^*(p, w) \notin F \text{ implies } \delta^*(q, w) \notin F,$$

for all $w \in \Sigma^*$. If, on the other hand, there exists some string $w \in \Sigma^*$ such that

$$\delta^*(p, w) \in F \text{ and } \delta^*(q, w) \notin F,$$

or vice versa, then the states $p$ and $q$ are said to be distinguishable by a string $w$.

Clearly, two states are either indistinguishable or distinguishable. Indistinguishability has the properties of an equivalence relation: if $p$ and $q$ are indistinguishable and if $q$ and $r$ are also indistinguishable, then so are $p$ and $r$, and all three states are indistinguishable.

One method for reducing the states of a dfa is based on finding and combining indistinguishable states. We first describe a method for finding pairs of distinguishable states.

procedure: mark

1. Remove all inaccessible states. This can be done by enumerating all simple paths of the graph of the dfa starting at the initial state. Any state not part of some path is inaccessible.
2. Consider all pairs of states $(p, q)$. If $p \in F$ and $q \notin F$ or vice versa, mark the pair $(p, q)$ as distinguishable.
3. Repeat the following step until no previously unmarked pairs are marked.
   For all pairs $(p, q)$ and all $a \in \Sigma$, compute $\delta(p, a) = p_a$ and $\delta(q, a) = q_a$. If the pair $(p_a, q_a)$ is marked as distinguishable, mark $(p, q)$ as distinguishable.

We claim that this procedure constitutes an algorithm for marking all distinguishable pairs.

Theorem 2.3

The procedure mark, applied to any dfa $M = (Q, \Sigma, \delta, q_0, F)$, terminates and determines all pairs of distinguishable states.

Proof: Obviously, the procedure terminates, since there are only a finite number of pairs that can be marked. It is also easy to see that the states of any pair so marked are distinguishable. The only claim that requires elaboration is that the procedure finds all distinguishable pairs.

Note first that states $q_i$ and $q_j$ are distinguishable with a string of length $n$, if and only if there are transitions

$$\delta(q_i, a) = q_k \qquad (2.5)$$

and

$$\delta(q_j, a) = q_l \qquad (2.6)$$

for some $a \in \Sigma$, with $q_k$ and $q_l$ distinguishable by a string of length $n - 1$. We use this first to show that at the completion of the $n$th pass through the loop in step 3, all states distinguishable by strings of length $n$ or less have been marked. In step 2, we mark all pairs distinguishable by $\lambda$, so we have a basis with $n = 0$ for an induction. We now assume that the claim is true

for all $i = 0, 1, \ldots, n - 1$. By this inductive assumption, at the beginning of the $n$th pass through the loop, all states distinguishable by strings of length up to $n - 1$ have been marked. Because of (2.5) and (2.6) above, at the end of this pass, all states distinguishable by strings of length up to $n$ will be marked. By induction then, we can claim that, for any $n$, at the completion of the $n$th pass, all pairs distinguishable by strings of length $n$ or less have been marked.

To show that this procedure marks all distinguishable states, assume that the loop terminates after $n$ passes. This means that during the $n$th pass no new states were marked. From (2.5) and (2.6), it then follows that there cannot be any states distinguishable by a string of length $n$, but not distinguishable by any shorter string. But if there are no states distinguishable only by strings of length $n$, there cannot be any states distinguishable only by strings of length $n + 1$, and so on. As a consequence, when the loop terminates, all distinguishable pairs have been marked.

After the marking algorithm has been executed, we use the results to partition the state set $Q$ of the dfa into disjoint subsets $\{q_i, q_j, \ldots, q_k\}$, $\{q_l, q_m, \ldots, q_n\}$, ..., such that any $q \in Q$ occurs in exactly one of these subsets, that elements in each subset are indistinguishable, and that any two elements from different subsets are distinguishable. Using the results sketched in Exercise 11 at the end of this section, it can be shown that such a partitioning can always be found. From these subsets we construct the minimal automaton by the next procedure.

procedure: reduce

Given $M = (Q, \Sigma, \delta, q_0, F)$, we construct a reduced dfa $\widehat{M} = (\widehat{Q}, \Sigma, \widehat{\delta}, \widehat{q}_0, \widehat{F})$ as follows.

1. Use procedure mark to find all pairs of distinguishable states. Then from this, find the sets of all indistinguishable states, say $\{q_i, q_j, \ldots, q_k\}$, $\{q_l, q_m, \ldots, q_n\}$, etc., as described above.
2. For each set $\{q_i, q_j, \ldots, q_k\}$ of such indistinguishable states, create a state labeled $ij \cdots k$ for $\widehat{M}$.
3. For each transition rule of $M$ of the form $\delta(q_r, a) = q_p$, find the sets to which $q_r$ and $q_p$ belong. If $q_r \in \{q_i, q_j, \ldots, q_k\}$ and $q_p \in \{q_l, q_m, \ldots, q_n\}$, add to $\widehat{\delta}$ a rule $\widehat{\delta}(ij \cdots k, a) = lm \cdots n$.
4. The initial state $\widehat{q}_0$ is that state of $\widehat{M}$ whose label includes the 0.
5. $\widehat{F}$ is the set of all the states whose label contains $i$ such that $q_i \in F$.

Example 2.15

Consider the automaton depicted in Figure 2.18. In step 2, the procedure mark will identify distinguishable pairs $(q_0, q_4)$, $(q_1, q_4)$, $(q_2, q_4)$, and $(q_3, q_4)$. In some pass through the step 3 loop, the procedure computes

$$\delta(q_1, 1) = q_4$$

and

$$\delta(q_0, 1) = q_3.$$

Since $(q_3, q_4)$ is a distinguishable pair, the pair $(q_0, q_1)$ is also marked. Continuing this way, the marking algorithm eventually marks the pairs $(q_0, q_1)$, $(q_0, q_2)$, $(q_0, q_3)$, $(q_0, q_4)$, $(q_1, q_4)$, $(q_2, q_4)$ and $(q_3, q_4)$ as distinguishable, leaving the indistinguishable pairs $(q_1, q_2)$, $(q_1, q_3)$ and $(q_2, q_3)$. Therefore, the states $q_1, q_2, q_3$ are all indistinguishable, and all of the states have been partitioned into the sets $\{q_0\}$, $\{q_1, q_2, q_3\}$ and $\{q_4\}$. Applying steps 2 and 3 of the procedure reduce then yields the dfa in Figure 2.19.

Figure 2.18

Figure 2.19
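The procedures mark and reduce can be sketched together in Python. This is a minimal sketch, not a definitive implementation: the function names, the dictionary encoding of a complete dfa, and the use of frozensets as labels for the merged states are assumptions. Inaccessible states are found here by a depth-first search, which is a swapped-in alternative to enumerating simple paths. The dfa in the usage lines was chosen so that it reproduces the partition reported in the example based on Figure 2.18; its edges are an assumption and not a transcription of that figure.

```python
from itertools import combinations


def accessible(alphabet, delta, q0):
    """States reachable from q0 (depth-first search instead of simple paths)."""
    seen, stack = {q0}, [q0]
    while stack:
        s = stack.pop()
        for a in alphabet:
            t = delta[(s, a)]
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def mark(states, alphabet, delta, finals):
    """Return the set of distinguishable pairs, each stored as a frozenset."""
    # Step 2: pairs distinguished by the empty string.
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in finals) != (p[1] in finals)}
    changed = True
    while changed:                      # step 3: repeat until nothing new is marked
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            if any(frozenset((delta[(p, a)], delta[(q, a)])) in marked
                   for a in alphabet):
                marked.add(frozenset((p, q)))
                changed = True
    return marked


def reduce_dfa(alphabet, delta, q0, finals):
    states = sorted(accessible(alphabet, delta, q0))
    marked = mark(states, alphabet, delta, finals)
    blocks = []                         # sets of mutually indistinguishable states
    for s in states:
        for block in blocks:
            if frozenset((s, block[0])) not in marked:
                block.append(s)
                break
        else:
            blocks.append([s])
    block_of = {s: frozenset(b) for b in blocks for s in b}
    new_delta = {(block_of[s], a): block_of[delta[(s, a)]]
                 for s in states for a in alphabet}
    new_finals = {block_of[s] for s in states if s in finals}
    return set(block_of.values()), new_delta, block_of[q0], new_finals


# Assumed five-state dfa over {0, 1} with final state q4.
dfa = {("q0", "0"): "q1", ("q0", "1"): "q3", ("q1", "0"): "q2", ("q1", "1"): "q4",
       ("q2", "0"): "q1", ("q2", "1"): "q4", ("q3", "0"): "q2", ("q3", "1"): "q4",
       ("q4", "0"): "q4", ("q4", "1"): "q4"}
new_states, new_delta, new_start, new_finals = reduce_dfa("01", dfa, "q0", {"q4"})
print(sorted(sorted(b) for b in new_states))
```

Because indistinguishability is an equivalence relation, comparing a state with one representative of each block is enough to place it; the sketch relies on that property when building the partition.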

Theorem 2.4

Given any dfa $M$, application of the procedure reduce yields another dfa $\widehat{M}$ such that

$$L(M) = L(\widehat{M}).$$

Furthermore, $\widehat{M}$ is minimal in the sense that there is no other dfa with a smaller number of states which also accepts $L(M)$.

Proof: There are two parts. The first is to show that the dfa created by reduce is equivalent to the original dfa. This is relatively easy and we can use inductive arguments similar to those used in establishing the equivalence of dfa's and nfa's. All we have to do is to show that $\delta^*(q_i, w) = q_j$ if and only if the label of $\widehat{\delta}^*(\widehat{q}_i, w)$ is of the form $\ldots j \ldots$. We will leave this as an exercise.

The second part, to show that $\widehat{M}$ is minimal, is harder. Suppose $\widehat{M}$ has states $\{p_0, p_1, p_2, \ldots, p_m\}$, with $p_0$ the initial state. Assume that there is an equivalent dfa $M_1$, with transition function $\delta_1$ and initial state $q_0$, equivalent to $\widehat{M}$, but with fewer states. Since there are no inaccessible states in $\widehat{M}$, there must be distinct strings $w_1, w_2, \ldots, w_m$ such that

$$\widehat{\delta}^*(p_0, w_i) = p_i, \quad i = 1, 2, \ldots, m.$$

But since $M_1$ has fewer states than $\widehat{M}$, there must be at least two of these strings, say $w_k$ and $w_l$, such that

$$\delta_1^*(q_0, w_k) = \delta_1^*(q_0, w_l).$$

Since $p_k$ and $p_l$ are distinguishable, there must be some string $x$ such that $\widehat{\delta}^*(p_0, w_k x) = \widehat{\delta}^*(p_k, x)$ is a final state, and $\widehat{\delta}^*(p_0, w_l x) = \widehat{\delta}^*(p_l, x)$ is a nonfinal state (or vice versa). In other words, $w_k x$ is accepted by $\widehat{M}$ and $w_l x$ is not. But note that

$$\delta_1^*(q_0, w_k x) = \delta_1^*(\delta_1^*(q_0, w_k), x) = \delta_1^*(\delta_1^*(q_0, w_l), x) = \delta_1^*(q_0, w_l x).$$

Thus, $M_1$ either accepts both $w_k x$ and $w_l x$ or rejects both, contradicting the assumption that $\widehat{M}$ and $M_1$ are equivalent. This contradiction proves that $M_1$ cannot exist.

1. Minimize the number of states in the dfa in Figure 2.16.

2. Find minimal dfa's for the languages below. In each case prove that the result is minimal.
   (a) $L = \{a^n b^m : n \geq 2, m \geq 1\}$
   (b) $L = \{a^n b : n \geq 0\} \cup \{b^n a : n \geq 1\}$
   (c) $L = \{a^n : n \geq 0, n \neq 3\}$
   (d) $L = \{a^n : n \neq 2 \text{ and } n \neq 4\}$

3. Show that the automaton generated by procedure reduce is deterministic.

4. Minimize the states in the dfa depicted in the following diagram.

5. Show that if $L$ is a nonempty language such that any $w$ in $L$ has length at least $n$, then any dfa accepting $L$ must have at least $n + 1$ states.

6. Prove or disprove the following conjecture. If $M = (Q, \Sigma, \delta, q_0, F)$ is a minimal dfa for a regular language $L$, then $\widehat{M} = (Q, \Sigma, \delta, q_0, Q - F)$ is a minimal dfa for $\overline{L}$.

7. Show that indistinguishability is an equivalence relation but that distinguishability is not.

8. Show the explicit steps of the suggested proof of the first part of Theorem 2.4, namely, that $\widehat{M}$ is equivalent to the original dfa.

9. Write a computer program that produces a minimal dfa for any given dfa.

10. Prove the following: If the states $q_a$ and $q_b$ are indistinguishable, and if $q_a$ and $q_c$ are distinguishable, then $q_b$ and $q_c$ must be distinguishable.

11. Consider the following process, to be done after the completion of the procedure mark. Start with some state, say, $q_0$. Put all states not marked distinguishable from $q_0$ into an equivalence set with $q_0$. Then take another state, not in the preceding equivalence set, and do the same thing. Repeat until there are no more states available. Then formalize this suggestion to make it an algorithm, and prove that this algorithm does indeed partition the original state set into equivalence sets.


Chapter 3
Regular Languages and Regular Grammars

According to our definition, a language is regular if there exists a finite accepter for it. Therefore, every regular language can be described by some dfa or some nfa. Such a description can be very useful, for example, if we want to show the logic by which we decide if a given string is in a certain language. But in many instances, we need more concise ways of describing regular languages. In this chapter, we look at other ways of representing regular languages. These representations have important practical applications, a matter that is touched on in some of the examples and exercises.

3.1 Regular Expressions

One way of describing regular languages is via the notation of regular expressions. This notation involves a combination of strings of symbols from some alphabet $\Sigma$, parentheses, and the operators $+$, $\cdot$, and $*$. The simplest case is the language $\{a\}$, which will be denoted by the regular expression $a$. Slightly more complicated is the language $\{a, b, c\}$, for which, using the $+$ to denote union, we have the regular expression $a + b + c$. We use $\cdot$ for concatenation and $*$ for star-closure in a similar way. The expression $(a + b \cdot c)^*$ stands for the star-closure of $\{a\} \cup \{bc\}$, that is, the language $\{\lambda, a, bc, aa, abc, bca, bcbc, aaa, aabc, \ldots\}$.

Formal Definition of a Regular Expression

We construct regular expressions from primitive constituents by repeatedly applying certain recursive rules. This is similar to the way we construct familiar arithmetic expressions.

Definition 3.1

Let $\Sigma$ be a given alphabet. Then

1. $\emptyset$, $\lambda$, and $a \in \Sigma$ are all regular expressions. These are called primitive regular expressions.
2. If $r_1$ and $r_2$ are regular expressions, so are $r_1 + r_2$, $r_1 \cdot r_2$, $r_1^*$, and $(r_1)$.
3. A string is a regular expression if and only if it can be derived from the primitive regular expressions by a finite number of applications of the rules in (2).

Example 3.1

For $\Sigma = \{a, b, c\}$, the string

$$(a + b \cdot c)^* \cdot (c + \emptyset)$$

is a regular expression, since it is constructed by application of the above rules. For example, if we take $r_1 = c$ and $r_2 = \emptyset$, we find that $c + \emptyset$ and $(c + \emptyset)$ are also regular expressions. Repeating this, we eventually generate the whole string. On the other hand, $(a + b\, +)$ is not a regular expression, since there is no way it can be constructed from the primitive regular expressions.

Languages Associated with Regular Expressions

Regular expressions can be used to describe some simple languages. If $r$ is a regular expression, we will let $L(r)$ denote the language associated with $r$. This language is defined as follows:

Definition 3.2

The language $L(r)$ denoted by any regular expression $r$ is defined by the following rules.

1. $\emptyset$ is a regular expression denoting the empty set,
2. $\lambda$ is a regular expression denoting $\{\lambda\}$,
3. for every $a \in \Sigma$, $a$ is a regular expression denoting $\{a\}$.

If $r_1$ and $r_2$ are regular expressions, then

4. $L(r_1 + r_2) = L(r_1) \cup L(r_2)$,
5. $L(r_1 \cdot r_2) = L(r_1) L(r_2)$,
6. $L((r_1)) = L(r_1)$,
7. $L(r_1^*) = (L(r_1))^*$.

The last four rules of this definition are used to reduce $L(r)$ to simpler components recursively; the first three are the termination conditions for this recursion. To see what language a given expression denotes, we apply these rules repeatedly.

Example 3.2

Exhibit the language $L(a^* \cdot (a + b))$ in set notation.

$$L(a^* \cdot (a + b)) = L(a^*) L(a + b) = (L(a))^* (L(a) \cup L(b)) = \{\lambda, a, aa, aaa, \ldots\}\{a, b\} = \{a, aa, aaa, \ldots, b, ab, aab, \ldots\}.$$
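The recursive rules for $L(r)$ translate almost directly into a program, provided the result is cut off at a maximum string length so that the sets stay finite. The following Python sketch illustrates this. It is hypothetical in every detail of its encoding: regular expressions are represented as nested tuples with the tags `"empty"`, `"lambda"`, `"sym"`, `"union"`, `"cat"`, and `"star"`, the function is called `lang`, and the cutoff `n` is an added parameter that the definition itself does not have. Rule 6 (parentheses) needs no case because the tuple nesting already records the grouping.

```python
def lang(r, n):
    """Return all strings of length at most n in L(r), following rules 1 to 7."""
    kind = r[0]
    if kind == "empty":                       # rule 1
        return set()
    if kind == "lambda":                      # rule 2
        return {""}
    if kind == "sym":                         # rule 3
        return {r[1]} if n >= 1 else set()
    if kind == "union":                       # rule 4
        return lang(r[1], n) | lang(r[2], n)
    if kind == "cat":                         # rule 5
        return {u + v for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    if kind == "star":                        # rule 7
        base = lang(r[1], n) - {""}           # nonempty pieces only, so lengths grow
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression tag: {kind}")


a, b = ("sym", "a"), ("sym", "b")
r = ("cat", ("star", a), ("union", a, b))    # the expression a* . (a + b)
print(sorted(lang(r, 3), key=lambda s: (len(s), s)))
```

With the cutoff at length 3 the result contains the strings a, b, aa, ab, aaa, and aab, which are the shortest members of the set obtained in the example for $L(a^* \cdot (a + b))$.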

There is one problem with rules (4) to (7) in Definition 3.2. They define a language precisely if r1 and r2 are given, but there may be some ambiguity in breaking a complicated expression into parts. Consider, for example, the regular expression a · b + c. We can consider this as being made up of r1 = a · b and r2 = c. In this case, we find L(a · b + c) = {ab, c}. But there is nothing in Definition 3.2 to stop us from taking r1 = a and r2 = b + c. We now get a different result, L(a · b + c) = {ab, ac}. To overcome this, we could require that all expressions be fully parenthesized, but this gives cumbersome results. Instead, we use a convention familiar from mathematics and programming languages. We establish a set of precedence rules for evaluation in which star-closure precedes concatenation and concatenation precedes union. Also, the symbol for concatenation may be omitted, so we can write r1r2 for r1 · r2.

With a little practice, we can see quickly what language a particular regular expression denotes.

Example 3.3
For Σ = {a, b}, the expression
r = (a + b)* (a + bb)
is regular. It denotes the language
L(r) = {a, bb, aa, abb, ba, bbb, ...}.
We can see this by considering the various parts of r. The first part, (a + b)*, stands for any string of a's and b's. The second part, (a + bb), represents either an a or a double b. Consequently, L(r) is the set of all strings on {a, b}, terminated by either an a or a bb.

Example 3.4
The expression
r = (aa)* (bb)* b
denotes the set of all strings with an even number of a's followed by an odd number of b's; that is,
L(r) = {a^(2n) b^(2m+1) : n ≥ 0, m ≥ 0}.

Going from an informal description or set notation to a regular expression tends to be a little harder.
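Readings such as those in Examples 3.3 and 3.4 can be spot-checked by brute force. The sketch below is an illustration only: it uses Python's re module, whose notation writes union as | rather than +, and it compares each expression with the verbal description on all strings over {a, b} up to length 8. The helper names are hypothetical, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

ex_3_3 = re.compile(r"(a|b)*(a|bb)")   # (a + b)*(a + bb)
ex_3_4 = re.compile(r"(aa)*(bb)*b")    # (aa)*(bb)*b

def even_as_then_odd_bs(w):
    """True if w is an even number of a's followed by an odd number of b's."""
    n = len(w) - len(w.lstrip("a"))    # length of the leading run of a's
    rest = w[n:]
    return n % 2 == 0 and set(rest) <= {"b"} and len(rest) % 2 == 1

mismatches = []
for w in all_strings("ab", 8):
    if bool(ex_3_3.fullmatch(w)) != (w.endswith("a") or w.endswith("bb")):
        mismatches.append(("3.3", w))
    if bool(ex_3_4.fullmatch(w)) != even_as_then_odd_bs(w):
        mismatches.append(("3.4", w))
print(mismatches)
```

An empty list means that, up to the chosen length, each expression and its description pick out the same strings.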

Example 3.5
For Σ = {0, 1}, give a regular expression r such that
L(r) = {w ∈ Σ* : w has at least one pair of consecutive zeros}.
One can arrive at an answer by reasoning something like this: Every string in L(r) must contain 00 somewhere, but what comes before and what goes after is completely arbitrary. An arbitrary string on {0, 1} can be denoted by (0 + 1)*. Putting these observations together, we arrive at the solution
r = (0 + 1)* 00 (0 + 1)*.

Example 3.6
Find a regular expression for the language
L = {w ∈ {0, 1}* : w has no pair of consecutive zeros}.
Even though this looks similar to Example 3.5, the answer is harder to construct. One helpful observation is that whenever a 0 occurs, it must be followed immediately by a 1. Such a substring may be preceded and followed by an arbitrary number of 1's. This suggests that the answer involves the repetition of strings of the form 1···101···1, that is, the language denoted by the regular expression (1*011*)*. However, the answer is still incomplete, since the strings ending in 0 or consisting of all 1's are unaccounted for. After taking care of these special cases we arrive at the answer
r = (1*011*)* (0 + λ) + 1* (0 + λ).
If we reason slightly differently, we might come up with another answer. If we see L as the repetition of the strings 1 and 01, the shorter expression
r = (1 + 01)* (0 + λ)
might be reached. Although the two expressions look different, both answers are correct, as they denote the same language. Generally, there are an unlimited number of regular expressions for any given language.

Note that this language is the complement of the language in Example 3.5. However, the regular expressions are not very similar and do not suggest clearly the close relationship between the languages.

The last example introduces the notion of equivalence of regular expressions. We say the two regular expressions are equivalent if they denote the same language. One can derive a variety of rules for simplifying regular

expressions (see Exercise 18 in the following exercise section), but since we have little need for such manipulations we will not pursue this.

EXERCISES

1. Find all strings in L((a + b)* b (a + ab)*) of length less than four.
2. Does the expression ((0 + 1)(0 + 1)*)* 00 (0 + 1)* denote the language in Example 3.5?
3. Show that r = (1 + 01)* (0 + 1*) also denotes the language in Example 3.6. Find two other equivalent expressions.
4. Find a regular expression for the set {a^n b^m : (n + m) is even}.
5. Give regular expressions for the following languages.
(a) L1 = {a^n b^m : n ≥ 4, m ≤ 3},
(b) L2 = {a^n b^m : n < 4, m ≤ 3},
(c) The complement of L1,
(d) The complement of L2.
6. What languages do the expressions (∅*)* and a∅ denote?
7. Give a simple verbal description of the language L((aa)* b (aa)* + a(aa)* ba(aa)*).
8. Give a regular expression for L^R, where L is the language in Exercise 1.
9. Give a regular expression for L = {a^n b^m : n ≥ 1, m ≥ 1, nm ≥ 3}.
10. Find a regular expression for L = {ab^n w : n ≥ 3, w ∈ {a, b}+}.
11. Find a regular expression for the complement of the language in Example 3.4.
12. Find a regular expression for L = {vwv : v, w ∈ {a, b}*, |v| = 2}.
13. Find a regular expression for
L = {w ∈ {0, 1}* : w has exactly one pair of consecutive zeros}.
14. Give regular expressions for the following languages on Σ = {a, b, c}.
(a) all strings containing exactly one a,
(b) all strings containing no more than three a's,
(c) all strings that contain at least one occurrence of each symbol in Σ,
(d) all strings that contain no run of a's of length greater than two,
(e) all strings in which all runs of a's have lengths that are multiples of three.

15. Write regular expressions for the following languages on {0, 1}.
(a) all strings ending in 01,
(b) all strings not ending in 01,
(c) all strings containing an even number of 0's,
(d) all strings having at least two occurrences of the substring 00 (note that with the usual interpretation of a substring, 000 contains two such occurrences),
(e) all strings with at most two occurrences of the substring 00,
(f) all strings not containing the substring 101.
16. Find regular expressions for the following languages on {a, b}.
(a) L = {w : |w| mod 3 = 0},
(b) L = {w : n_a(w) mod 3 = 0},
(c) L = {w : n_a(w) mod 5 > 0}.
17. Repeat parts (a), (b), and (c) of Exercise 16, with Σ = {a, b, c}.
18. Determine whether or not the following claims are true for all regular expressions r1 and r2. The symbol ≡ stands for equivalence of regular expressions in the sense that both expressions denote the same language.
(a) (r1*)* ≡ r1*,
(b) r1* (r1 + r2)* ≡ (r1 + r2)*,
(c) (r1 + r2)* ≡ (r1* r2*)*,
(d) (r1 r2)* ≡ r1* r2*.
19. Give a general method by which any regular expression r can be changed into r̂ such that (L(r))^R = L(r̂).
20. Prove rigorously that the expressions in Example 3.6 do indeed denote the specified language.
21. For the case of a regular expression r that does not involve λ or ∅, give a set of necessary and sufficient conditions that r must satisfy if L(r) is to be infinite.
22. Formal languages can be used to describe a variety of two-dimensional figures. Chain-code languages are defined on the alphabet Σ = {u, d, r, l}, where these symbols stand for unit-length straight lines in the directions up, down, right, and left, respectively. An example of this notation is urdl, which stands for the square with sides of unit length. Draw pictures of the figures denoted by the expressions (rd)*, (urddru)*, and (ruldr)*.
23. In Exercise 22, what are sufficient conditions on the expression so that the picture is a closed contour in the sense that the beginning and ending point are the same? Are these conditions also necessary?

24. Find an nfa that accepts the language L(aa* (a + b)).
25. Find a regular expression that denotes all bit strings whose value, when interpreted as a binary integer, is greater than or equal to 40.
26. Find a regular expression for all bit strings, with leading bit 1, interpreted as a binary integer, with values not between 10 and 30.

3.2 Connection Between Regular Expressions and Regular Languages

As the terminology suggests, the connection between regular languages and regular expressions is a close one. The two concepts are essentially the same; for every regular language there is a regular expression, and for every regular expression there is a regular language. We will show this in two parts.

Regular Expressions Denote Regular Languages

We first show that if r is a regular expression, then L(r) is a regular language. Our definition says that a language is regular if it is accepted by some dfa. Because of the equivalence of nfa's and dfa's, a language is also regular if it is accepted by some nfa. We now show that if we have any regular expression r, we can construct an nfa that accepts L(r). The construction for this relies on the recursive definition for L(r). We first construct simple automata for parts (1), (2), and (3) of Definition 3.2, then show how they can be combined to implement the more complicated parts (4), (5), and (7).

Theorem 3.1
Let r be a regular expression. Then there exists some nondeterministic finite accepter that accepts L(r). Consequently, L(r) is a regular language.

Proof: We begin with automata that accept the languages for the simple regular expressions ∅, λ, and a ∈ Σ. These are shown in Figure 3.1(a), (b), and (c), respectively. Assume now that we have automata M(r1) and M(r2) that accept languages denoted by regular expressions r1 and r2, respectively. We need not explicitly construct these automata, but may represent them schematically, as in Figure 3.2. In this schema, the graph vertex at the left represents the initial state, the one on the right the final state. In Exercise 7, Section 2.3 we claimed that for every nfa there is an equivalent one with a single final state, so we lose nothing in assuming that there is only one final state. With M(r1) and M(r2) represented in this way, we then construct automata for the regular expressions r1 + r2, r1r2, and r1*. The constructions are shown in Figures 3.3 to 3.5. As indicated

in the drawings, the initial and final states of the constituent machines lose their status and are replaced by new initial and final states. By stringing together several such steps, we can build automata for arbitrary complex regular expressions.

It should be clear from the interpretation of the graphs in Figures 3.3 to 3.5 that this construction works. To argue more rigorously, we can give a formal method for constructing the states and transitions of the combined machine from the states and transitions of the parts, then prove by induction on the number of operators that the construction yields an automaton that accepts the language denoted by any particular regular expression. We will not belabor this point, as it is reasonably obvious that the results are always correct.

Figure 3.1 (a) nfa accepts ∅. (b) nfa accepts {λ}. (c) nfa accepts {a}.
Figure 3.2 Schematic representation of an nfa accepting L(r).
Figure 3.3 Automaton for L(r1 + r2).

Example 3.7
Find an nfa that accepts L(r), where
r = (a + bb)* (ba* + λ).

Figure 3.4 Automaton for L(r1r2).
Figure 3.5 Automaton for L(r1*).

Automata for (a + bb) and (ba* + λ), constructed directly from first principles, are given in Figure 3.6. Putting these together using the construction in Theorem 3.1, we get the solution in Figure 3.7.

Figure 3.6 (a) M1 accepts L(a + bb). (b) M2 accepts L(ba* + λ).
Figure 3.7 Automaton accepts L((a + bb)* (ba* + λ)).
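The construction behind Theorem 3.1 lends itself to a short program. What follows is a sketch under stated assumptions rather than the book's own algorithm: the class NFA and the helper names are hypothetical, and the wiring of the λ-transitions (a new initial and a new final state for every combined machine) is one plausible reading of Figures 3.3 to 3.5, whose drawings are not reproduced here. The machine for Example 3.7 is then compared, on all strings up to length 6, with the same expression written in the notation of Python's re module.

```python
import re
from itertools import count, product

_ids = count()  # supplies fresh state names

class NFA:
    """An nfa with one initial and one final state.
    Edges are (source, label, target); label None marks a lambda-transition."""
    def __init__(self, edges=()):
        self.start, self.final = next(_ids), next(_ids)
        self.edges = list(edges)

    def link(self, src, dst, label=None):
        self.edges.append((src, label, dst))
        return self

def empty():                       # no path to the final state: accepts nothing
    return NFA()

def lam():                         # accepts only the empty string
    m = NFA()
    return m.link(m.start, m.final)

def sym(a):                        # accepts the one-symbol string a
    m = NFA()
    return m.link(m.start, m.final, a)

def union(m1, m2):                 # r1 + r2
    m = NFA(m1.edges + m2.edges)
    for part in (m1, m2):
        m.link(m.start, part.start).link(part.final, m.final)
    return m

def cat(m1, m2):                   # r1 r2
    m = NFA(m1.edges + m2.edges)
    return (m.link(m.start, m1.start)
             .link(m1.final, m2.start)
             .link(m2.final, m.final))

def star(m1):                      # r1*
    m = NFA(m1.edges)
    return (m.link(m.start, m1.start).link(m1.final, m.final)
             .link(m.start, m.final).link(m.final, m.start))

def closure(m, states):
    """All states reachable from the given ones by lambda-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for src, label, dst in m.edges:
            if src == s and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, w):
    current = closure(m, {m.start})
    for ch in w:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == ch}
        current = closure(m, moved)
    return m.final in current

# Example 3.7: r = (a + bb)* (ba* + lambda)
example_3_7 = cat(
    star(union(sym("a"), cat(sym("b"), sym("b")))),
    union(cat(sym("b"), star(sym("a"))), lam()),
)
reference = re.compile(r"(a|bb)*(ba*|)")
disagreements = [
    w
    for n in range(7)
    for w in map("".join, product("ab", repeat=n))
    if accepts(example_3_7, w) != bool(reference.fullmatch(w))
]
print(disagreements)
```

Each sub-machine is used exactly once when it is combined, which mirrors the remark that the constituent machines lose their initial and final status inside the larger one.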

Regular Expressions for Regular Languages

It is intuitively reasonable that the converse of Theorem 3.1 should hold, and that for every regular language, there should exist a corresponding regular expression. Since any regular language has an associated nfa and hence a transition graph, all we need to do is to find a regular expression capable of generating the labels of all the walks from q0 to any final state. This does not look too difficult but it is complicated by the existence of cycles that can often be traversed arbitrarily, in any order. This creates a bookkeeping problem that must be handled carefully. There are several ways to do this; one of the more intuitive approaches requires a side trip into what are called generalized transition graphs. Since this idea is used here in a limited way and plays no role in our further discussion, we will deal with it informally.

A generalized transition graph is a transition graph whose edges are labeled with regular expressions; otherwise it is the same as the usual transition graph. The label of any walk from the initial state to a final state is the concatenation of several regular expressions, and hence itself a regular expression. The strings denoted by such regular expressions are a subset of the language accepted by the generalized transition graph, with the full language being the union of all such generated subsets.

Example 3.8
Figure 3.8 represents a generalized transition graph. The language accepted by it is L(a* + a* (a + b) c*), as should be clear from an inspection of the graph. The edge (q0, q0) labeled a is a cycle that can generate any number of a's, that is, it represents L(a*). We could have labeled this edge a* without changing the language accepted by the graph.

Figure 3.8

The graph of any nondeterministic finite accepter can be considered a generalized transition graph if the edge labels are interpreted properly. An edge labeled with a single symbol a is interpreted as an edge labeled with the expression a, while an edge labeled with multiple symbols a, b, ... is interpreted as an edge labeled with the expression a + b + .... From this observation, it follows that for every regular language, there exists a

generalized transition graph that accepts it. Conversely, every language accepted by a generalized transition graph is regular. Since the label of every walk in a generalized transition graph is a regular expression, this appears to be an immediate consequence of Theorem 3.1. However, there are some subtleties in the argument; we will not pursue them here, but refer the reader instead to Exercise 16, Section 4.3 for details.

Equivalence for generalized transition graphs is defined in terms of the language accepted. Consider a generalized transition graph with states {q, qi, qj, ...}, where q is neither a final nor an initial state, and for which we want to create an equivalent generalized transition graph with one less state by removing q. We can do this if we do not change the language denoted by the set of labels that can be generated as we go from q0 to qf. The construction that achieves this is illustrated in Figure 3.9, where the state q is to be removed and the edge labels a, b, ... stand for general expressions. The case depicted is the most general in the sense that q has outgoing edges to all three vertices qi, qj, q. In cases where an edge is missing in (a), we omit the corresponding edge in (b).

Figure 3.9

The construction in Figure 3.9 shows which edges have to be introduced so that the language of the generalized transition graph does not change when we remove q and all its incoming and outgoing edges. The complete process requires that this be done for all pairs (qi, qj) in Q − {q} before removing q. Although we will not formally prove this, it can be shown that the construction yields an equivalent generalized transition graph. Accepting this, we are ready to show how any nfa can be associated with a regular expression.

Theorem 3.2
Let L be a regular language. Then there exists a regular expression r such that L = L(r).

Proof: Let M be an nfa that accepts L. We can assume without any loss of generality that M has only one final state and that q0 ∉ F. We interpret the graph of M as a generalized transition graph and apply the above construction to it. To remove a vertex labeled q, we use the scheme in Figure 3.9 for all pairs (qi, qj). After all the new edges have been added,

q with all its incident edges can be removed. We continue this process, removing one vertex after the other, until we reach the situation shown in Figure 3.10. A regular expression that denotes the language accepted by this graph is

r = r1* r2 (r4 + r3 r1* r2)*.   (3.1)

Since the sequence of generalized transition graphs are all equivalent to the initial one, we can prove by an induction on the number of states in the generalized transition graph that the regular expression in (3.1) denotes L.

Figure 3.10

Example 3.9
Consider the nfa in Figure 3.11(a). The corresponding generalized transition graph after removal of state q1 is shown in Figure 3.11(b). Making the identification r1 = b + ab*a, r2 = ab*b, r3 = ∅, r4 = a + b, we arrive at the regular expression
r = (b + ab*a)* ab*b (a + b)*
for the original automaton. The construction involved in Theorem 3.2 is tedious and tends to give very lengthy answers, but it is completely routine and always works.

Figure 3.11

Example 3.10
Find a regular expression for the language
L = {w ∈ {a, b}* : n_a(w) is even and n_b(w) is odd}.
An attempt to construct a regular expression directly from this description leads to all kinds of difficulties. On the other hand, finding an nfa for it is easy as long as we use vertex labeling effectively. We label the vertices with EE to denote an even number of a's and b's, with OE to denote an odd number of a's and an even number of b's, and so on. With this we easily get the solution in Figure 3.12.

Figure 3.12

We can now apply the conversion to a regular expression in a mechanical way. First, we remove the state labeled OE, giving the generalized transition graph in Figure 3.13. Next, we remove the vertex labeled OO. This gives Figure 3.14. Finally, we apply (3.1) with
r1 = aa + ab(bb)*ba,
r2 = b + ab(bb)*a,
r3 = b + a(bb)*ba,
r4 = a(bb)*a.

Figure 3.13
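Because the expression produced by (3.1) is long, it is reassuring to test it against the original description of L. The sketch below is an illustration only: it substitutes the four pieces r1 to r4 into (3.1), rewrites union as | for Python's re module, and checks every string over {a, b} of length at most 8. The variable names are hypothetical, and a finite check of this kind supports the construction but does not replace the proof.

```python
import re
from itertools import product

# The four labels of the two-state graph in Example 3.10, in re notation.
r1 = "(aa|ab(bb)*ba)"
r2 = "(b|ab(bb)*a)"
r3 = "(b|a(bb)*ba)"
r4 = "(a(bb)*a)"

# Formula (3.1): r = r1* r2 (r4 + r3 r1* r2)*
r = f"{r1}*{r2}({r4}|{r3}{r1}*{r2})*"
pattern = re.compile(r)

def in_language(w):
    """n_a(w) is even and n_b(w) is odd."""
    return w.count("a") % 2 == 0 and w.count("b") % 2 == 1

counterexamples = [
    w
    for n in range(9)
    for w in map("".join, product("ab", repeat=n))
    if bool(pattern.fullmatch(w)) != in_language(w)
]
print(counterexamples)
```

An empty list of counterexamples indicates that, on all strings examined, the mechanically derived expression and the parity condition agree.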

Figure 3.14

The final expression is long and complicated, but the way to get it is relatively straightforward.

Regular Expressions for Describing Simple Patterns

In Example 1.15 and in Exercise 15, Section 2.1, we explored the connection between finite accepters and some of the simpler constituents of programming languages, such as identifiers, or integers and real numbers. The relation between finite automata and regular expressions means that we can also use regular expressions as a way of describing these features. This is easy to see; for example, the set of all acceptable Pascal integers is defined by the regular expression
sdd*,
where s stands for the sign, with possible values from {+, −, λ}, and d stands for the digits 0 to 9.

Pascal integers are a simple case of what is sometimes called a "pattern," a term that refers to a set of objects having some common properties. Pattern matching refers to assigning a given object to one of several categories. Often, the key to successful pattern matching is finding an effective way to describe the patterns. This is a complicated and extensive area of computer science to which we can only briefly allude. The example below is a simplified, but nevertheless instructive, demonstration of how the ideas we have talked about so far have been found useful in pattern matching.

Example 3.11
An application of pattern matching occurs in text editing. All text editors allow files to be scanned for the occurrence of a given string; most editors extend this to permit searching for patterns. For example, the editor ed in the UNIX operating system recognizes the command
/aba*c/
as an instruction to search

the file for the first occurrence of the string ab, followed by an arbitrary number of a's, followed by a c. We see from this example that the UNIX editor can recognize regular expressions (although it uses a somewhat different convention for specifying regular expressions than the one used here).

A challenging task in such an application is to write an efficient program for recognizing string patterns. Searching a file for occurrences of a given string is a very simple programming exercise, but here the situation is more complicated. We have to deal with an unlimited number of arbitrarily complicated patterns; furthermore, the patterns are not fixed beforehand, but created at run time. The pattern description is part of the input, so the recognition process must be flexible. To solve this problem, ideas from automata theory are often used.

If the pattern is specified by a regular expression, the pattern recognition program can take this description and convert it into an equivalent nfa, using the construction in Theorem 3.1. Theorem 2.2 may then be used to reduce this to a dfa. This dfa, in the form of a transition table, is effectively the pattern-matching algorithm. All the programmer has to do is to provide a driver that gives the general framework for using the table. In this way we can automatically handle a large number of patterns that are defined at run time.

The efficiency of the program must be considered also. The construction of finite automata from regular expressions using Theorems 2.1 and 3.1 tends to yield automata with many states. If memory space is a problem, the state reduction method described in Section 2.4 is helpful.
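The division of labor between table and driver can be made concrete. In the sketch below the transition table for the pattern aba*c of Example 3.11 is written out by hand, which is an assumption made for brevity; in the setting described above it would be produced automatically from the regular expression. The function find_first is a hypothetical driver: it knows nothing about the pattern and only follows the table, treating a missing entry as a trap state.

```python
# Transition table of a dfa for the pattern aba*c (hand-built for illustration).
# State 0: start; 1: read "a"; 2: read "ab" followed by any number of a's;
# 3: read the closing "c" (accepting).
TABLE = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "a"): 2,
    (2, "c"): 3,
}
START, ACCEPTING = 0, {3}

def find_first(text, table=TABLE, start=START, accepting=ACCEPTING):
    """Return (begin, end) of the first substring the dfa accepts, or None."""
    for begin in range(len(text)):
        state = start
        for end in range(begin, len(text)):
            state = table.get((state, text[end]))
            if state is None:          # no table entry: the match attempt fails
                break
            if state in accepting:
                return begin, end + 1
    return None

hit = find_first("xxabaaacyy")
print(hit)
```

Swapping in a different table changes the pattern without touching the driver, which is the point of the table-driven design. This simple driver restarts the dfa at every position of the text; it is meant to show the framework, not an efficient search.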
EXERCISES

1. Use the construction in Theorem 3.1 to find an nfa that accepts the language L(ab*aa + bba*ab).
2. Find an nfa that accepts the complement of the language in Exercise 1.
3. Give an nfa that accepts the language L((a + b)* b (a + bb)*).
4. Find dfa's that accept the following languages.
(a) L(aa* + aba*b*),
(b) L(ab (a + ab)* (a + aa)),
(c) L((abab)* + (aaa* + b)*),
(d) L(((aa*)* b)*).

5. Find dfa's that accept the following languages.
(a) L = L(ab*a*) ∪ L((ab)*ba),
(b) L = L(ab*a*) ∩ L((ab)*ba).
6. Find an nfa for Exercise 15(f), Section 3.1. Use this to derive a regular expression for that language.
7. Give explicit rules for the construction suggested in Figure 3.9 when various edges in 3.9(a) are missing.
8. Consider the following generalized transition graph.
[Figure: generalized transition graph]
(a) Find an equivalent generalized transition graph with only two states.
(b) What is the language accepted by this graph?
9. What language is accepted by the following generalized transition graph?
[Figure: generalized transition graph]
10. Find regular expressions for the languages accepted by the following automata.
[Figure: automata (a), (b), (c)]

11. Rework Example 3.10, this time eliminating the state OO first.
12. Find a regular expression for the following languages on {a, b}.
(a) L = {w : n_a(w) and n_b(w) are both even},
(b) L = {w : (n_a(w) − n_b(w)) mod 3 = 1},
(c) L = {w : (n_a(w) − n_b(w)) mod 3 ≠ 0},
(d) L = {w : 2n_a(w) + 3n_b(w) is even}.
13. Find a regular expression that generates the set of all strings of triplets defining correct binary addition as in Exercise 23, Section 2.1.
14. Prove that the constructions suggested by Figure 3.9 generate equivalent generalized transition graphs.
15. Write a regular expression for the set of all Pascal real numbers.
16. Find a regular expression for Pascal sets whose elements are integer numbers.
17. In some applications, such as programs that check spelling, we may not need an exact match of the pattern, only an approximate one. Once the notion

of an approximate match has been made precise, automata theory can be applied to construct approximate pattern matchers. As an illustration of this, consider patterns derived from the original ones by insertion of one symbol. Let L be a regular language on Σ and define
insert(L) = {uav : a ∈ Σ, uv ∈ L}.
In effect, insert(L) contains all the words created from L by inserting a spurious symbol anywhere in a word.
(a) Given an nfa for L, show how one can construct an nfa for insert(L).
(b) Discuss how you might use this to write a pattern-recognition program for insert(L), using as input a regular expression for L.
18. Analogous to the previous exercise, consider all words that can be formed from L by dropping a single symbol of the string. Formally define this operation drop for languages. Construct an nfa for drop(L), given an nfa for L.
19. Use the construction in Theorem 3.1 to find nfa's for L(a∅) and L(∅*). Is the result consistent with the definition of these languages?

3.3 Regular Grammars

A third way of describing regular languages is by means of certain simple grammars. Grammars are often an alternative way of specifying languages. Whenever we define a language family through an automaton or in some other way, we are interested in knowing what kind of grammar we can associate with the family. First, we look at grammars that generate regular languages.

Right- and Left-Linear Grammars

Definition 3.3
A grammar G = (V, T, S, P) is said to be right-linear if all productions are of the form
A → xB,
A → x,
where A, B ∈ V, and x ∈ T*. A grammar is said to be left-linear if all productions are of the form
A → Bx,
or
A → x.
A regular grammar is one that is either right-linear or left-linear.

Note that in a regular grammar, at most one variable appears on the right side of any production. Furthermore, that variable must consistently be either the rightmost or leftmost symbol of the right side of any production.

Example 3.12
The grammar G1 = ({S}, {a, b}, S, P1), with P1 given as
S → abS | a
is right-linear. The grammar G2 = ({S, S1, S2}, {a, b}, S, P2), with productions
S → S1ab,
S1 → S1ab | S2,
S2 → a,
is left-linear. Both G1 and G2 are regular grammars.

The sequence
S ⇒ abS ⇒ ababS ⇒ ababa
is a derivation with G1. From this single instance it is easy to conjecture that L(G1) is the language denoted by the regular expression r = (ab)*a. In a similar way, we can see that L(G2) is the regular language L(aab(ab)*).

Example 3.13
The grammar G = ({S, A, B}, {a, b}, S, P) with productions
S → A,
A → aB | λ,
B → Ab,
is not regular. Although every production is either in right-linear or left-linear form, the grammar itself is neither right-linear nor left-linear, and

therefore is not regular. The grammar is an example of a linear grammar. A linear grammar is a grammar in which at most one variable can occur on the right side of any production, without restriction on the position of this variable. Clearly, a regular grammar is always linear, but not all linear grammars are regular.

Our next goal will be to show that regular grammars are associated with regular languages and that for every regular language there is a regular grammar. Thus, regular grammars are another way of talking about regular languages.

Right-Linear Grammars Generate Regular Languages

First, we show that a language generated by a right-linear grammar is always regular. To do so, we construct an nfa that mimics the derivations of a right-linear grammar. Note that the sentential forms of a right-linear grammar have the special form in which there is exactly one variable and it occurs as the rightmost symbol. Suppose now that we have a step in a derivation
ab···cD ⇒ ab···cdE,
arrived at by using a production D → dE. The corresponding nfa can imitate this step by going from state D to state E when a symbol d is encountered. In this scheme, the state of the automaton corresponds to the variable in the sentential form, while the part of the string already processed is identical to the terminal prefix of the sentential form. This simple idea is the basis for the following theorem.

Theorem 3.3
Let G = (V, T, S, P) be a right-linear grammar. Then L(G) is a regular language.

Proof: We assume that V = {V0, V1, ...}, that S = V0, and that we have productions of the form V0 → v1Vi, Vi → v2Vj, ... or Vn → vl, .... If w is a string in L(G), then because of the form of the productions in G, the derivation must have the form

V0 ⇒ v1Vi ⇒ v1v2Vj ⇒* v1v2···vkVn ⇒ v1v2···vkvl = w.   (3.2)

The automaton to be constructed will reproduce the derivation by "consuming" each of these v's in turn. The initial state of the automaton will

be labeled V0, and for each variable Vi there will be a nonfinal state labeled Vi. For each production
Vi → a1a2···amVj,
the automaton will have transitions to connect Vi and Vj, that is, δ will be defined so that
δ*(Vi, a1a2···am) = Vj.
For each production
Vi → a1a2···am,
the corresponding transition of the automaton will be
δ*(Vi, a1a2···am) = Vf,
where Vf is a final state. The intermediate states that are needed to do this are of no concern and can be given arbitrary labels. The general scheme is shown in Figure 3.15. The complete automaton is assembled from such individual parts.

Figure 3.15 The upper graph represents Vi → a1a2···amVj; the lower graph represents Vi → a1a2···am.

Suppose now that w ∈ L(G) so that (3.2) is satisfied. In the nfa there is, by construction, a path from V0 to Vi labeled v1, a path from Vi to Vj labeled v2, and so on, so that clearly
Vf ∈ δ*(V0, w),
and w is accepted by M.

Conversely, assume that w is accepted by M. Because of the way in which M was constructed, to accept w the automaton has to pass through a sequence of states V0, Vi, ... to Vf, using paths labeled v1, v2, .... Therefore, w must have the form
w = v1v2···vkvl

and the derivation
V0 ⇒ v1Vi ⇒ v1v2Vj ⇒* v1v2···vkVn ⇒ v1v2···vkvl
is possible. Hence w is in L(G), and the theorem is proved.

Example 3.14
Construct a finite automaton that accepts the language generated by the grammar
V0 → aV1,
V1 → abV0 | b.
We start the transition graph with vertices V0, V1, and Vf. The first production rule creates an edge labeled a between V0 and V1. For the second rule, we need to introduce an additional vertex so that there is a path labeled ab between V1 and V0. Finally, we need to add an edge labeled b between V1 and Vf, giving the automaton shown in Figure 3.16. The language generated by the grammar and accepted by the automaton is the regular language L((aab)*ab).

Figure 3.16

Right-Linear Grammars for Regular Languages

To show that every regular language can be generated by some right-linear grammar, we start from the dfa for the language and reverse the construction shown in Theorem 3.3. The states of the dfa now become the variables of the grammar, and the symbols causing the transitions become the terminals in the productions.

Theorem 3.4
If L is a regular language on the alphabet Σ, then there exists a right-linear grammar G = (V, Σ, S, P) such that L = L(G).

Proof: Let $M = (Q, \Sigma, \delta, q_0, F)$ be a dfa that accepts $L$. We assume that $Q = \{q_0, q_1, \ldots, q_n\}$ and $\Sigma = \{a_1, a_2, \ldots, a_m\}$. Construct the right-linear grammar $G = (V, \Sigma, S, P)$ with

$V = \{q_0, q_1, \ldots, q_n\}$

and $S = q_0$. For each transition

$\delta(q_i, a_j) = q_k$

of $M$, we put in $P$ the production

$q_i \to a_j q_k$.    (3.3)

In addition, if $q_k$ is in $F$, we add to $P$ the production

$q_k \to \lambda$.    (3.4)

We first show that $G$ defined in this way can generate every string in $L$. Consider $w \in L$ of the form

$w = a_i a_j \cdots a_k a_l$.

For $M$ to accept this string it must make moves via

$\delta(q_0, a_i) = q_p$,
$\delta(q_p, a_j) = q_r$,
$\vdots$
$\delta(q_s, a_k) = q_t$,
$\delta(q_t, a_l) = q_f \in F$.

By construction, the grammar will have one production for each of these $\delta$'s. Therefore we can make the derivation

$q_0 \Rightarrow a_i q_p \Rightarrow a_i a_j q_r \overset{*}{\Rightarrow} a_i a_j \cdots a_k q_t \Rightarrow a_i a_j \cdots a_k a_l q_f \Rightarrow a_i a_j \cdots a_k a_l$,    (3.5)

with the grammar $G$, and $w \in L(G)$.

Conversely, if $w \in L(G)$, then its derivation must have the form (3.5). But this implies that

$\delta^*(q_0, a_i a_j \cdots a_k a_l) = q_f$,

completing the proof.
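The construction in this proof is mechanical enough to sketch in code. The following is a minimal illustration, assuming a dfa given as Python sets and a dictionary for $\delta$; the function names `dfa_to_grammar` and `derive`, the tuple encoding of productions, and the sample dfa are all hypothetical choices made here, not part of the text.

```python
def dfa_to_grammar(states, alphabet, delta, start, finals):
    """Return (start_variable, productions) following (3.3) and (3.4)."""
    productions = []
    for q in sorted(states):
        for a in sorted(alphabet):
            # delta(q_i, a_j) = q_k  gives the production  q_i -> a_j q_k
            productions.append((q, (a, delta[(q, a)])))
    for q in sorted(finals):
        # q_k in F gives  q_k -> lambda, encoded here as an empty body
        productions.append((q, ()))
    return start, productions


def derive(start, productions, w):
    """Return the sentential forms of the derivation of w, or None."""
    rules, erasable = {}, set()
    for head, body in productions:
        if body:
            rules[(head, body[0])] = body[1]
        else:
            erasable.add(head)
    var, forms = start, [start]
    for i, a in enumerate(w):
        if (var, a) not in rules:
            return None
        var = rules[(var, a)]
        forms.append(w[:i + 1] + var)
    if var not in erasable:
        return None          # the last variable cannot be erased
    forms.append(w)
    return forms


# Hypothetical dfa: strings over {a, b} that end in b.
STATES = {"q0", "q1"}
ALPHABET = {"a", "b"}
DELTA = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
START, PRODUCTIONS = dfa_to_grammar(STATES, ALPHABET, DELTA, "q0", {"q1"})
print(derive(START, PRODUCTIONS, "ab"))
```

Each step of `derive` mirrors one move of the dfa, which is the correspondence the proof relies on in (3.5).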

For the purpose of constructing a grammar, it is useful to note that the restriction that $M$ be a dfa is not essential to the proof of Theorem 3.4. With minor modification, the same construction can be used if $M$ is an nfa.

Example 3.14. Construct a right-linear grammar for $L(aab^*a)$. The transition function for an nfa, together with the corresponding grammar productions, is given in Figure 3.17. The result was obtained by simply following the construction in Theorem 3.4. The string $aaba$ can be derived with the constructed grammar by

$q_0 \Rightarrow a q_1 \Rightarrow aa q_2 \Rightarrow aab q_2 \Rightarrow aaba q_f \Rightarrow aaba$.

Figure 3.17
$\delta(q_0, a) = \{q_1\}$        $q_0 \to a q_1$
$\delta(q_1, a) = \{q_2\}$        $q_1 \to a q_2$
$\delta(q_2, b) = \{q_2\}$        $q_2 \to b q_2$
$\delta(q_2, a) = \{q_f\}$        $q_2 \to a q_f$
$q_f \in F$                       $q_f \to \lambda$

Equivalence Between Regular Languages and Regular Grammars

The previous two theorems establish the connection between regular languages and right-linear grammars. One can make a similar connection between regular languages and left-linear grammars, thereby showing the complete equivalence of regular grammars and regular languages.

Theorem 3.5. A language $L$ is regular if and only if there exists a left-linear grammar $G$ such that $L = L(G)$.

Proof: We only outline the main idea. Given any left-linear grammar with productions of the form

$A \to Bv$,

or

$A \to v$,

we construct from it a right-linear grammar $\hat{G}$ by replacing every such production of $G$ with

$A \to v^R B$,

or

$A \to v^R$,

respectively. A few examples will make it clear quickly that $L(G) = (L(\hat{G}))^R$. Next, we use Exercise 12, Section 2.3, which tells us that the reverse of any regular language is also regular. Since $\hat{G}$ is right-linear, $L(\hat{G})$ is regular. But then so are $(L(\hat{G}))^R$ and $L(G)$.

Putting Theorems 3.4 and 3.5 together, we arrive at the equivalence of regular languages and regular grammars.

Theorem 3.6. A language $L$ is regular if and only if there exists a regular grammar $G$ such that $L = L(G)$.

We now have several ways of describing regular languages: dfa's, nfa's, regular expressions, and regular grammars. While in some instances one or the other of these may be most suitable, they are all equally powerful. They all give a complete and unambiguous definition of a regular language. The connection between all these concepts is established by the four theorems in this chapter, as shown in Figure 3.18.

Figure 3.18. Regular expressions to dfa or nfa (Theorem 3.1); dfa or nfa to regular expressions (Theorem 3.2); regular grammars to dfa or nfa (Theorem 3.3); dfa or nfa to regular grammars (Theorem 3.4).
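The grammar-to-automaton direction (Theorem 3.3) can be sketched in the same spirit. This is an illustrative sketch only: the triple encoding of productions, the state names `V_f` and `_m1, _m2, ...` for the final and intermediate states, and the use of lambda-moves for productions without terminals are assumptions made here. The sample grammar is the one the text gives as $V_0 \to aV_1$, $V_1 \to abV_0 \mid b$.

```python
from collections import defaultdict


def grammar_to_nfa(productions, start):
    """Build an nfa from right-linear productions (head, terminals, tail).

    tail is a variable name, or None for a production with no variable.
    """
    delta = defaultdict(set)   # (state, symbol) -> set of next states
    eps = defaultdict(set)     # state -> states reachable by a lambda-move
    final = "V_f"              # assumed not to clash with a grammar variable
    fresh = 0
    for head, terminals, tail in productions:
        target = tail if tail is not None else final
        if not terminals:
            eps[head].add(target)
            continue
        state = head
        for k, a in enumerate(terminals):
            if k == len(terminals) - 1:
                nxt = target
            else:
                fresh += 1
                nxt = f"_m{fresh}"   # intermediate state, arbitrary label
            delta[(state, a)].add(nxt)
            state = nxt
    return delta, eps, start, {final}


def closure(states, eps):
    """All states reachable from `states` by lambda-moves alone."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(nfa, w):
    delta, eps, start, finals = nfa
    current = closure({start}, eps)
    for a in w:
        step = set()
        for q in current:
            step |= delta.get((q, a), set())
        current = closure(step, eps)
    return bool(current & finals)


PRODUCTIONS = [("V0", "a", "V1"), ("V1", "ab", "V0"), ("V1", "b", None)]
NFA = grammar_to_nfa(PRODUCTIONS, "V0")
print([w for w in ["ab", "aab", "aabab", "abab"] if accepts(NFA, w)])
```

The accepted strings should be exactly those in $L((aab)^*ab)$, which gives a quick way to sanity-check a hand-drawn transition graph.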

Exercises

1. Construct a dfa that accepts the language generated by the grammar

$S \to abA$,
$A \to baB$,
$B \to aA \mid bb$.

2. Find a regular grammar that generates the language $L(aa^*(ab + a)^*)$.

3. Construct a left-linear grammar for the language in Exercise 1.

4. Construct right- and left-linear grammars for the language $L = \{a^n b^m : n \geq 2, m \geq 3\}$.

5. Construct a right-linear grammar for the language $L((aab^*ab)^*)$.

6. Find a regular grammar that generates the language on $\Sigma = \{a, b\}$ consisting of all strings with no more than three $a$'s.

7. In Theorem 3.5, prove that $L(\hat{G}) = (L(G))^R$.

8. Suggest a construction by which a left-linear grammar can be obtained from an nfa directly.

9. Find a left-linear grammar for the language in Exercise 5.

10. Find a regular grammar for the language $L = \{a^n b^m : n + m \text{ is even}\}$.

11. Find a regular grammar that generates the language $L = \{w \in \{a, b\}^* : n_a(w) + 3n_b(w) \text{ is even}\}$.

12. Find regular grammars for the following languages on $\{a, b\}$.
(a) $L = \{w : n_a(w) \text{ and } n_b(w) \text{ are both even}\}$
(b) $L = \{w : (n_a(w) - n_b(w)) \bmod 3 = 1\}$
(c) $L = \{w : (n_a(w) - n_b(w)) \bmod 3 \neq 0\}$
(d) $L = \{w : |n_a(w) - n_b(w)| \text{ is odd}\}$

13. Show that for every regular language not containing $\lambda$ there exists a right-linear grammar whose productions are restricted to the forms

$A \to aB$,
$A \to a$,

where $A, B \in V$, and $a \in T$.

14. Show that any regular grammar $G$ for which $L(G) \neq \varnothing$ must have at least one production of the form

$A \to x$,

where $A \in V$ and $x \in T^*$.

15. Find a regular grammar that generates the set of all Pascal real numbers.

16. Let $G_1 = (V_1, \Sigma, S_1, P_1)$ be right-linear and $G_2 = (V_2, \Sigma, S_2, P_2)$ be a left-linear grammar, and assume that $V_1$ and $V_2$ are disjoint. Consider the linear grammar $G = (\{S\} \cup V_1 \cup V_2, \Sigma, S, P)$, where $S$ is not in $V_1 \cup V_2$ and $P = \{S \to S_1 \mid S_2\} \cup P_1 \cup P_2$. Show that $L(G)$ is regular.

Chapter 4: Properties of Regular Languages

We have defined regular languages, studied some ways in which they can be represented, and have seen a few examples of their usefulness. We now raise the question of how general regular languages are. Could it be that every formal language is regular? Perhaps any set we can specify can be accepted by some, albeit very complex, finite automaton. As we will see shortly, the answer to this conjecture is definitely no. But to understand why this is so, we must inquire more deeply into the nature of regular languages and see what properties the whole family has.

The first question we raise is what happens when we perform operations on regular languages. The operations we consider are simple set operations, such as concatenation, as well as operations in which each string of a language is changed, as for instance in Exercise 22, Section 2.1. Is the resulting language still regular? We refer to this as a closure question. Closure properties, although mostly of theoretical interest, help us in discriminating between the various language families we will encounter.

A second set of questions about language families deals with our ability to decide on certain properties. For example, can we tell whether a language

is finite or not? As we will see, such questions are easily answered for regular languages, but are not as easily answered for other language families.

Finally we consider the important question: How can we tell whether a given language is regular or not? If the language is in fact regular, we can always show it by giving some dfa, regular expression, or regular grammar for it. But if it is not, we need another line of attack. One way to show a language is not regular is to study the general properties of regular languages, that is, characteristics that are shared by all regular languages. If we know of some such property, and if we can show that the candidate language does not have it, then we can tell that the language is not regular.

In this chapter, we look at a variety of properties of regular languages. These properties tell us a great deal about what regular languages can and cannot do. Later, when we look at the same questions for other language families, similarities and differences in these properties will allow us to contrast the various language families.

4.1 Closure Properties of Regular Languages

Consider the following question: Given two regular languages $L_1$ and $L_2$, is their union also regular? In specific instances, the answer may be obvious, but here we want to address the problem in general. Is it true for all regular $L_1$ and $L_2$? It turns out that the answer is yes, a fact we express by saying that the family of regular languages is closed under union. We can ask similar questions about other types of operations on languages; this leads us to the study of the closure properties of languages in general.

Closure properties of various language families under different operations are of considerable theoretical interest. At first sight, it may not be clear what practical significance these properties have. Admittedly, some of them have very little, but many results are useful. By giving us insight into the general nature of language families, closure properties help us answer other, more practical questions. We will see instances of this (Theorem 4.7 and Example 4.13) later in this chapter.

Closure under Simple Set Operations

We begin by looking at the closure of regular languages under the common set operations, such as union and intersection.

Theorem 4.1. If $L_1$ and $L_2$ are regular languages, then so are $L_1 \cup L_2$, $L_1 \cap L_2$, $L_1 L_2$, $\overline{L_1}$, and $L_1^*$. We say that the family of regular languages is closed under union, intersection, concatenation, complementation, and star-closure.

Proof: If $L_1$ and $L_2$ are regular, then there exist regular expressions $r_1$ and $r_2$ such that $L_1 = L(r_1)$ and $L_2 = L(r_2)$. By definition, $r_1 + r_2$, $r_1 r_2$, and $r_1^*$

are regular expressions denoting the languages $L_1 \cup L_2$, $L_1 L_2$, and $L_1^*$, respectively. Thus, closure under union, concatenation, and star-closure is immediate.

To show closure under complementation, let $M = (Q, \Sigma, \delta, q_0, F)$ be a dfa that accepts $L_1$. Then the dfa

$\hat{M} = (Q, \Sigma, \delta, q_0, Q - F)$

accepts $\overline{L_1}$. This is rather straightforward; we have already suggested the result in Exercise 4 in Section 2.1. Note that in the definition of a dfa, we assumed $\delta^*$ to be a total function, so that $\delta^*(q_0, w)$ is defined for all $w \in \Sigma^*$. Consequently either $\delta^*(q_0, w)$ is a final state, in which case $w \in L_1$, or $\delta^*(q_0, w) \in Q - F$ and $w \in \overline{L_1}$.

Demonstrating closure under intersection takes a little more work. Let $L_1 = L(M_1)$ and $L_2 = L(M_2)$, where $M_1 = (Q, \Sigma, \delta_1, q_0, F_1)$ and $M_2 = (P, \Sigma, \delta_2, p_0, F_2)$ are dfa's. We construct from $M_1$ and $M_2$ a combined automaton $\hat{M} = (\hat{Q}, \Sigma, \hat{\delta}, (q_0, p_0), \hat{F})$, whose state set $\hat{Q} = Q \times P$ consists of pairs $(q_i, p_j)$, and whose transition function $\hat{\delta}$ is such that $\hat{M}$ is in state $(q_i, p_j)$ whenever $M_1$ is in state $q_i$ and $M_2$ is in state $p_j$. This is achieved by taking

$\hat{\delta}((q_i, p_j), a) = (q_k, p_l)$,

whenever

$\delta_1(q_i, a) = q_k$

and

$\delta_2(p_j, a) = p_l$.

$\hat{F}$ is defined as the set of all $(q_i, p_j)$ such that $q_i \in F_1$ and $p_j \in F_2$. Then it is a simple matter to show that $w \in L_1 \cap L_2$ if and only if it is accepted by $\hat{M}$. Consequently, $L_1 \cap L_2$ is regular.

The proof of closure under intersection is a good example of a constructive proof. Not only does it establish the desired result, but it also shows explicitly how to construct a finite accepter for the intersection of two regular languages. Constructive proofs occur throughout this book; they are important because they give us insight into the results and often serve as the starting point for practical algorithms. Here, as in many cases, there are shorter but nonconstructive (or at least not so obviously constructive) arguments.

For closure under intersection, we start with DeMorgan's law, Equation (1.3), taking the complement of both sides. Then

$L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$

for any languages $L_1$ and $L_2$. Now, if $L_1$ and $L_2$ are regular, then by closure under complementation, so are $\overline{L_1}$ and $\overline{L_2}$. Using closure under union, we next get that $\overline{L_1} \cup \overline{L_2}$ is regular. Using closure under complementation once more, we see that

$\overline{\overline{L_1} \cup \overline{L_2}} = L_1 \cap L_2$

is regular.

The following example is a variation on the same idea.

Example 4.1. Show that the family of regular languages is closed under difference. In other words, we want to show that if $L_1$ and $L_2$ are regular, then $L_1 - L_2$ is necessarily regular also.

The needed set identity is immediately obvious from the definition of a set difference, namely

$L_1 - L_2 = L_1 \cap \overline{L_2}$.

The fact that $L_2$ is regular implies that $\overline{L_2}$ is also regular. Then, because of the closure of regular languages under intersection, we know that $L_1 \cap \overline{L_2}$ is regular, and the argument is complete.

A variety of other closure properties can be derived directly by elementary arguments.

Theorem 4.2. The family of regular languages is closed under reversal.

Proof: The proof of this theorem was suggested as an exercise in Section 2.3. Here are the details. Suppose that $L$ is a regular language. We then construct an nfa with a single final state for it. By Exercise 7, Section 2.3, this is always possible. In the transition graph for this nfa we make the initial vertex a final vertex, the final vertex the initial vertex, and reverse the direction on all the edges. It is a fairly straightforward matter to show that the modified nfa accepts $w^R$ if and only if the original nfa accepts $w$. Therefore, the modified nfa accepts $L^R$, proving closure under reversal.
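The complement and product constructions used for Theorem 4.1, and the difference identity of Example 4.1, can be tried out directly. The sketch below assumes a dfa is stored as a 5-tuple of Python sets and a dictionary; that encoding and the two sample automata (even number of $a$'s, and strings ending in $b$) are illustrative assumptions, not taken from the text.

```python
from itertools import product


def accepts(dfa, w):
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals


def complement(dfa):
    """Swap final and nonfinal states; relies on delta being total."""
    states, sigma, delta, q0, finals = dfa
    return (states, sigma, delta, q0, states - finals)


def intersect(d1, d2):
    """Product automaton: runs both dfa's in lockstep on pairs of states."""
    Q, sigma, delta1, q0, F1 = d1
    P, _, delta2, p0, F2 = d2
    states = set(product(Q, P))
    delta = {((q, p), a): (delta1[(q, a)], delta2[(p, a)])
             for (q, p) in states for a in sigma}
    finals = {(q, p) for (q, p) in states if q in F1 and p in F2}
    return (states, sigma, delta, (q0, p0), finals)


def difference(d1, d2):
    """L1 - L2 = L1 intersected with the complement of L2."""
    return intersect(d1, complement(d2))


SIGMA = {"a", "b"}
EVEN_A = ({"e", "o"}, SIGMA,
          {("e", "a"): "o", ("e", "b"): "e",
           ("o", "a"): "e", ("o", "b"): "o"},
          "e", {"e"})
ENDS_B = ({"n", "y"}, SIGMA,
          {("n", "a"): "n", ("n", "b"): "y",
           ("y", "a"): "n", ("y", "b"): "y"},
          "n", {"y"})

BOTH = intersect(EVEN_A, ENDS_B)
ONLY_EVEN = difference(EVEN_A, ENDS_B)
print(accepts(BOTH, "aab"), accepts(ONLY_EVEN, "aab"))
```

Building every pair in $Q \times P$ follows the proof literally; a practical implementation would usually generate only the pairs reachable from $(q_0, p_0)$.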

Closure under Other Operations

In addition to the standard operations on languages, one can define other operations and investigate closure properties for them. There are many such results; we select only two typical ones. Others are explored in the exercises at the end of this section.

Definition 4.1. Suppose $\Sigma$ and $\Gamma$ are alphabets. Then a function

$h : \Sigma \to \Gamma^*$

is called a homomorphism. In words, a homomorphism is a substitution in which a single letter is replaced with a string. The domain of the function $h$ is extended to strings in an obvious fashion; if

$w = a_1 a_2 \cdots a_n$,

then

$h(w) = h(a_1) h(a_2) \cdots h(a_n)$.

If $L$ is a language on $\Sigma$, then its homomorphic image is defined as

$h(L) = \{h(w) : w \in L\}$.

Example 4.2. Let $\Sigma = \{a, b\}$ and $\Gamma = \{a, b, c\}$ and define $h$ by

$h(a) = ab$,
$h(b) = bbc$.

Then $h(aba) = abbbcab$. The homomorphic image of $L = \{aa, aba\}$ is the language $h(L) = \{abab, abbbcab\}$.

If we have a regular expression $r$ for a language $L$, then a regular expression for $h(L)$ can be obtained by simply applying the homomorphism to each $\Sigma$ symbol of $r$.
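The extension of $h$ from letters to strings and languages is short enough to write out. In this sketch the homomorphism is stored as a dictionary from letters to strings; the names `apply_h` and `image` are hypothetical, and the values are those of Example 4.2.

```python
def apply_h(h, w):
    """Extend h from single letters to a string, letter by letter."""
    return "".join(h[a] for a in w)


def image(h, language):
    """Homomorphic image of a finite language given as a set of strings."""
    return {apply_h(h, w) for w in language}


H = {"a": "ab", "b": "bbc"}
print(apply_h(H, "aba"))
print(sorted(image(H, {"aa", "aba"})))
```

This works only on finite sets of strings; for an infinite regular language one applies $h$ to a regular expression instead, as the text describes.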

Example 4.3. Take $\Sigma = \{a, b\}$ and $\Gamma = \{b, c, d\}$. Define $h$ by

$h(a) = dbcc$,
$h(b) = bdc$.

If $L$ is the regular language denoted by

$r = (a + b^*)(aa)^*$,

then

$r_1 = (dbcc + (bdc)^*)(dbccdbcc)^*$

denotes the regular language $h(L)$.

The general result on the closure of regular languages under any homomorphism follows from this example in an obvious manner.

Theorem 4.3. Let $h$ be a homomorphism. If $L$ is a regular language, then its homomorphic image $h(L)$ is also regular. The family of regular languages is therefore closed under arbitrary homomorphisms.

Proof: Let $L$ be a regular language denoted by some regular expression $r$. We find $h(r)$ by substituting $h(a)$ for each symbol $a \in \Sigma$ of $r$. It can be shown directly by an appeal to the definition of a regular expression that the result is a regular expression. It is equally easy to see that the resulting expression denotes $h(L)$. All we need to do is to show that for every $w \in L(r)$, the corresponding $h(w)$ is in $L(h(r))$ and conversely that for every $v$ in $L(h(r))$ there is a $w$ in $L$, such that $v = h(w)$. Leaving the details as an exercise, we claim that $h(L)$ is regular.

Definition 4.2. Let $L_1$ and $L_2$ be languages on the same alphabet. Then the right quotient of $L_1$ with $L_2$ is defined as

$L_1 / L_2 = \{x : xy \in L_1 \text{ for some } y \in L_2\}$.    (4.1)

To form the right quotient of $L_1$ with $L_2$, we take all the strings in $L_1$ that have a suffix belonging to $L_2$. Every such string, after removal of this suffix, belongs to $L_1 / L_2$.
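For finite languages, definition (4.1) can be evaluated by brute force, which is a convenient way to get a feel for the operation. The function name `right_quotient` and the two small sample languages below are assumptions chosen for illustration.

```python
def right_quotient(l1, l2):
    """All x such that xy is in l1 for some y in l2 (finite sets only)."""
    return {x[:len(x) - len(y)] for x in l1 for y in l2 if x.endswith(y)}


# Finite sample languages, chosen here only for illustration.
L1 = {"a", "ab", "abb", "aa", "aab", "aabb", "ba"}
L2 = {"b", "bb"}
print(sorted(right_quotient(L1, L2)))
```

Strings of `L1` with no suffix in `L2`, such as `ba`, contribute nothing to the quotient.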

Note tha. lllhrlrt:fore.opnRTrns ol' R.1 .4. Lp.ular. The strings itt .t tr:rrrrinate with at least one l) as a sulfix.t hcre L1. Figure 4.pplies to this exer.1). dfir for l1i sa/ thr: automaton Mt : (Q. constnr:tion that takes the dfa's for L1 ir.nguagesis also regular.L1. We will prove this itr lhe next theorem by a. } 1} . This suggestothat the right quotient of any two regular la. rz } I. tlfa for Lt/Lz. ?n.L1 tha.t I L2 : {att6"' . we arrive at the answer try rcrnoving one or rtrore b's from those strings in .qu.1 Clostlns Fn. Since iln inrtotttaton for L1f L2 urust accept ir.L2 consist of orre or rrlore b's.l:cjulan LANCUAGFT$ 105 - s'4 Hr+r{rtrplo If L1 : {a"b"' : n } 1. We start with a.> 0} .rnple.Bt:ftrre we describc the r:onstnrction in firll.nyprefix of strings in.rrrl.X. we will trv to rrtodify M1 s{r that it accepts z if there is any y satisfying (4. lct us see how it a. and Lt/Lz ale all reE.F) irr l'igure 4. {l|!I : m.6.} and L2 : thcln l.rn > 0} U {ba.1.L2 altd constnrcts from them a.

The difficulty comes in finding whether there is some $y$ such that $xy \in L_1$ and $y \in L_2$. To solve it, we determine, for each $q \in Q$, whether there is a walk to a final state labeled $v$ such that $v \in L_2$. If this is so, any $x$ such that $\delta^*(q_0, x) = q$ will be in $L_1 / L_2$. We modify the automaton accordingly to make $q$ a final state.

To apply this to our present case, we check each state $q_0, q_1, q_2, q_3, q_4, q_5$ to see whether there is a walk labeled $bb^*$ to any of the $q_1$, $q_2$, or $q_4$. We see that only $q_1$ and $q_2$ qualify; $q_0$, $q_3$, $q_4$ do not. The resulting automaton for $L_1 / L_2$ is shown in Figure 4.2. Check it to see that the construction works. The idea is generalized in the next theorem.

Figure 4.2

Theorem 4.4. If $L_1$ and $L_2$ are regular languages, then $L_1 / L_2$ is also regular. We say that the family of regular languages is closed under right quotient with a regular language.

Proof: Let $L_1 = L(M)$, where $M = (Q, \Sigma, \delta, q_0, F)$ is a dfa. We construct another dfa $\hat{M} = (Q, \Sigma, \delta, q_0, \hat{F})$ as follows. For each $q_i \in Q$, determine if there exists a $y \in L_2$ such that

$\delta^*(q_i, y) = q_f \in F$.

This can be done by looking at dfa's $M_i = (Q, \Sigma, \delta, q_i, F)$. The automaton $M_i$ is $M$ with the initial state $q_0$ replaced by $q_i$. We now determine whether

there exists a $y$ in $L(M_i)$ that is also in $L_2$. For this, we can use the construction for the intersection of two regular languages given in Theorem 4.1, finding the transition graph for $L_2 \cap L(M_i)$. If there is any path between its initial vertex and any final vertex, then $L_2 \cap L(M_i)$ is not empty. In that case, add $q_i$ to $\hat{F}$. Repeating this for every $q_i \in Q$, we determine $\hat{F}$ and thereby construct $\hat{M}$.

To prove that $L(\hat{M}) = L_1 / L_2$, let $x$ be any element of $L_1 / L_2$. Then there must be a $y \in L_2$ such that $xy \in L_1$. This implies that

$\delta^*(q_0, xy) \in F$,

so that there must be some $q \in Q$ such that

$\delta^*(q_0, x) = q$

and

$\delta^*(q, y) \in F$.

Therefore, by construction, $q \in \hat{F}$, and $\hat{M}$ accepts $x$ because $\delta^*(q_0, x)$ is in $\hat{F}$.

Conversely, for any $x$ accepted by $\hat{M}$, we have

$\delta^*(q_0, x) = q \in \hat{F}$.

But again by construction, this implies that there exists a $y \in L_2$ such that $\delta^*(q, y) \in F$. Therefore $xy$ is in $L_1$, and $x$ is in $L_1 / L_2$. We therefore conclude that

$L(\hat{M}) = L_1 / L_2$,

and from this that $L_1 / L_2$ is regular.

Example 4.5. Find $L_1 / L_2$ for

$L_1 = L(a^* b a a^*)$,
$L_2 = L(ab^*)$.

We first find a dfa that accepts $L_1$. This is easy, and a solution is given in Figure 4.3. The example is simple enough so that we can skip the formalities of the construction. From the graph in Figure 4.3 it is quite evident that

$L(M_0) \cap L_2 = \varnothing$,
$L(M_1) \cap L_2 = \{a\} \neq \varnothing$,
$L(M_2) \cap L_2 = \{a\} \neq \varnothing$,
$L(M_3) \cap L_2 = \varnothing$.

Therefore, the automaton accepting $L_1 / L_2$ is determined. The result is shown in Figure 4.4. It accepts the language denoted by the regular expression $a^*b + a^*baa^*$, which can be simplified to $a^*ba^*$. Thus $L_1 / L_2 = L(a^*ba^*)$.

Figure 4.3

Figure 4.4
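The construction of Theorem 4.4 can be checked on Example 4.5 with a short program. This is a sketch under stated assumptions: the four-state dfa for $L(a^*baa^*)$ is one plausible reading of Figure 4.3 (the figure itself is not reproduced here), the dfa for $L(ab^*)$ and all state names are chosen for illustration, and emptiness of $L_2 \cap L(M_i)$ is tested by searching the product automaton from $(q_i, p_0)$.

```python
def accepts(dfa, w):
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals


def right_quotient_dfa(d1, d2):
    """Return a dfa for L(d1)/L(d2): same graph as d1, new final states."""
    Q, sigma, delta1, q0, F1 = d1
    _, _, delta2, p0, F2 = d2
    new_finals = set()
    for q in Q:
        # Is L(M_q) intersected with L2 nonempty? Search pairs from (q, p0).
        seen, stack = {(q, p0)}, [(q, p0)]
        while stack:
            s, p = stack.pop()
            if s in F1 and p in F2:
                new_finals.add(q)
                break
            for a in sigma:
                nxt = (delta1[(s, a)], delta2[(p, a)])
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return (Q, sigma, delta1, q0, new_finals)


SIGMA = {"a", "b"}
# Assumed dfa for L(a*baa*); q3 is a trap state.
L1_DFA = ({"q0", "q1", "q2", "q3"}, SIGMA,
          {("q0", "a"): "q0", ("q0", "b"): "q1",
           ("q1", "a"): "q2", ("q1", "b"): "q3",
           ("q2", "a"): "q2", ("q2", "b"): "q3",
           ("q3", "a"): "q3", ("q3", "b"): "q3"},
          "q0", {"q2"})
# Assumed dfa for L(ab*); pt is a trap state.
L2_DFA = ({"p0", "p1", "pt"}, SIGMA,
          {("p0", "a"): "p1", ("p0", "b"): "pt",
           ("p1", "a"): "pt", ("p1", "b"): "p1",
           ("pt", "a"): "pt", ("pt", "b"): "pt"},
          "p0", {"p1"})

QUOTIENT = right_quotient_dfa(L1_DFA, L2_DFA)
print(sorted(QUOTIENT[4]))
```

Under these assumptions the new final states are `q1` and `q2`, which matches the nonempty intersections listed in the example.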

Exercises

1. Fill in the details of the constructive proof of closure under intersection in Theorem 4.1.

2. Use the construction in Theorem 4.1 to find nfa's that accept
(a) $L((a + b)a^*) \cap L(baa^*)$,
(b) $L(ab^*a^*) \cap L(a^*b^*a)$.

3. In Example 4.1 we showed closure under difference for regular languages, but the proof was nonconstructive. Provide a constructive argument for this result.

4. In the proof of Theorem 4.3, show that $h(r)$ is a regular expression. Then show that $h(r)$ denotes $h(L)$.

5. Show that the family of regular languages is closed under finite union and intersection, that is, if $L_1, L_2, \ldots, L_n$ are regular, then

$L_U = \bigcup_{i = 1, 2, \ldots, n} L_i$

and

$L_I = \bigcap_{i = 1, 2, \ldots, n} L_i$

are also regular.

6. The symmetric difference of two sets $S_1$ and $S_2$ is defined as

$S_1 \ominus S_2 = \{x : x \in S_1 \text{ or } x \in S_2, \text{ but } x \text{ is not in both } S_1 \text{ and } S_2\}$.

Show that the family of regular languages is closed under symmetric difference.

7. The nor of two languages is

$nor(L_1, L_2) = \{w : w \notin L_1 \text{ and } w \notin L_2\}$.

Show that the family of regular languages is closed under the nor operation.

8. Define the complementary or (cor) of two languages by

$cor(L_1, L_2) = \{w : w \in \overline{L_1} \text{ or } w \in \overline{L_2}\}$.

Show that the family of regular languages is closed under the cor operation.

9. Which of the following are true for all regular languages and all homomorphisms?
(a) $h(L_1 \cup L_2) = h(L_1) \cup h(L_2)$
(b) $h(L_1 \cap L_2) = h(L_1) \cap h(L_2)$
(c) $h(L_1 L_2) = h(L_1) h(L_2)$

10. Let $L_1 = L(a^*baa^*)$ and $L_2 = L(aba^*)$. Find $L_1 / L_2$.

11. Show that $L_1 = L_1 L_2 / L_2$ is not true for all languages $L_1$ and $L_2$.

12. Suppose we know that $L_1 \cup L_2$ is regular and that $L_1$ is finite. Can we conclude from this that $L_2$ is regular?

13. If $L$ is a regular language, prove that $L_1 = \{uv : u \in L, |v| = 2\}$ is also regular.

14. If $L$ is a regular language, prove that the language $\{uv : u \in L, v \in L^R\}$ is also regular.

15. The left quotient of a language $L_1$ with respect to $L_2$ is defined as

$L_2 / L_1 = \{y : x \in L_2, xy \in L_1\}$.

Show that the family of regular languages is closed under the left quotient with a regular language.

16. Show that, if the statement "If $L_1$ is regular and $L_1 \cup L_2$ is also regular, then $L_2$ must be regular" were true for all $L_1$ and $L_2$, then all languages would be regular.

17. The tail of a language is defined as the set of all suffixes of its strings, that is,

$tail(L) = \{y : xy \in L \text{ for some } x \in \Sigma^*\}$.

Show that if $L$ is regular, so is $tail(L)$.

18. The head of a language is the set of all prefixes of its strings, that is,

$head(L) = \{x : xy \in L \text{ for some } y \in \Sigma^*\}$.

Show that the family of regular languages is closed under this operation.

19. Define an operation third on strings and languages as

$third(a_1 a_2 a_3 a_4 a_5 a_6 \cdots) = a_3 a_6 \cdots$

with the appropriate extension of this definition to languages. Prove the closure of the family of regular languages under this operation.

20. For a string $a_1 a_2 \cdots a_n$ define the operation shift as

$shift(a_1 a_2 \cdots a_n) = a_2 \cdots a_n a_1$.

From this, we can define the operation on a language as

$shift(L) = \{v : v = shift(w) \text{ for some } w \in L\}$.

Show that regularity is preserved under the shift operation.

21. Define

$exchange(a_1 a_2 \cdots a_{n-1} a_n) = a_n a_2 \cdots a_{n-1} a_1$,

and

$exchange(L) = \{v : v = exchange(w) \text{ for some } w \in L\}$.

Show that the family of regular languages is closed under exchange.

*22. The shuffle of two languages $L_1$ and $L_2$ is defined as

$shuffle(L_1, L_2) = \{w_1 v_1 w_2 v_2 \cdots w_m v_m : w_1 w_2 \cdots w_m \in L_1, v_1 v_2 \cdots v_m \in L_2, \text{ for all } w_i, v_i \in \Sigma^*\}$.

Show that the family of regular languages is closed under the shuffle operation.

*23. Define an operation minus5 on a language $L$ as the set of all strings of $L$ with the fifth symbol from the left removed (strings of length less than five are left unchanged). Show that the family of regular languages is closed under the minus5 operation.

*24. Define the operation leftside on $L$ by

$leftside(L) = \{w : w w^R \in L\}$.

Is the family of regular languages closed under this operation?

25. The min of a language $L$ is defined as

$min(L) = \{w \in L : \text{there is no } u \in L, v \in \Sigma^+, \text{ such that } w = uv\}$.

Show that the family of regular languages is closed under the min operation.

26. Let $G_1$ and $G_2$ be two regular grammars. Show how one can derive regular grammars for the languages
(a) $L(G_1) \cup L(G_2)$,
(b) $L(G_1) L(G_2)$,
(c) $L(G_1)^*$.

4.2 Elementary Questions about Regular Languages

We now come to a very fundamental issue: Given a language $L$ and a string $w$, can we determine whether or not $w$ is an element of $L$? This is the membership question and a method for answering it is called a membership algorithm. Very little can be done with languages for which we cannot find efficient membership algorithms. The question of the existence and nature of membership algorithms will be of great concern in later discussions; it is an issue that is often difficult. For regular languages, though, it is an easy matter.

We first consider what exactly we mean when we say "given a language...." In many arguments, it is important that this be unambiguous. We have used several ways of describing regular languages: informal verbal

descriptions, set notation, finite automata, regular expressions, and regular grammars. Only the last three are sufficiently well defined for use in theorems. We therefore say that a regular language is given in a standard representation if and only if it is described by a finite automaton, a regular expression, or a regular grammar.

Theorem 4.5. Given a standard representation of any regular language $L$ on $\Sigma$ and any $w \in \Sigma^*$, there exists an algorithm for determining whether or not $w$ is in $L$.

Proof: We represent the language by some dfa, then test $w$ to see if it is accepted by this automaton.

Other important questions are whether a language is finite or infinite, whether two languages are the same, and whether one language is a subset of another. For regular languages at least, these questions are easily answered.

Theorem 4.6. There exists an algorithm for determining whether a regular language, given in standard representation, is empty, finite, or infinite.

Proof: The answer is apparent if we represent the language as a transition graph of a dfa. If there is a simple path from the initial vertex to any final vertex, then the language is not empty.

To determine whether or not a language is infinite, find all the vertices that are the base of some cycle. If any of these are on a path from an initial to a final vertex, the language is infinite. Otherwise, it is finite.

The question of the equality of two languages is also an important practical issue. Often several definitions of a programming language exist, and we need to know whether, in spite of their different appearances, they specify the same language. This is generally a difficult problem; even for regular languages the argument is not obvious. It is not possible to argue on a sentence-by-sentence comparison, since this works only for finite languages. Nor is it easy to see the answer by looking at the regular expressions, grammars, or dfa's. An elegant solution uses the already established closure properties.

Theorem 4.7. Given standard representations of two regular languages $L_1$ and $L_2$, there exists an algorithm to determine whether or not $L_1 = L_2$.

Proof: Using $L_1$ and $L_2$ we define the language

$L_3 = (L_1 \cap \overline{L_2}) \cup (\overline{L_1} \cap L_2)$.

By closure, $L_3$ is regular, and we can find a dfa $M$ that accepts $L_3$. Once we have $M$ we can then use the algorithm in Theorem 4.6 to determine if $L_3$ is empty. But from Exercise 8, Section 1.1 we see that $L_3 = \varnothing$ if and only if $L_1 = L_2$.

These results are fundamental, in spite of being obvious and unsurprising. For regular languages, the questions raised by Theorems 4.5 to 4.7 can be answered easily, but this is not always the case when we deal with larger families of languages. We will encounter questions like these on several occasions later on. Anticipating a little, we will see that the answers become increasingly more difficult, and eventually impossible to find.
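The algorithms behind Theorems 4.5 to 4.7 can be sketched for languages given as dfa's. The code below is an illustration under assumptions made here: a dfa is a 5-tuple of Python sets and a dictionary, the sample automata are hypothetical, and the equality test searches the product automaton for a pair of states on which the two dfa's disagree, which amounts to testing whether $L_3$ is empty.

```python
def accepts(dfa, w):                       # membership, Theorem 4.5
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals


def successors(dfa, q):
    _, sigma, delta, _, _ = dfa
    return {delta[(q, a)] for a in sigma}


def reachable_from(dfa, sources):
    seen, stack = set(sources), list(sources)
    while stack:
        q = stack.pop()
        for r in successors(dfa, q):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def useful_states(dfa):
    """States on some path from the initial state to a final state."""
    _, _, _, q0, finals = dfa
    forward = reachable_from(dfa, {q0})
    return {q for q in forward if reachable_from(dfa, {q}) & finals}


def is_empty(dfa):                         # Theorem 4.6, emptiness
    return not useful_states(dfa)


def is_infinite(dfa):                      # Theorem 4.6, cycle among useful states
    useful = useful_states(dfa)
    edges = {q: successors(dfa, q) & useful for q in useful}
    indegree = {q: 0 for q in useful}
    for q in useful:
        for r in edges[q]:
            indegree[r] += 1
    queue = [q for q in useful if indegree[q] == 0]
    removed = 0
    while queue:                           # peel off vertices not on any cycle
        q = queue.pop()
        removed += 1
        for r in edges[q]:
            indegree[r] -= 1
            if indegree[r] == 0:
                queue.append(r)
    return removed < len(useful)


def equivalent(d1, d2):                    # Theorem 4.7, L3 empty?
    _, sigma, delta1, q0, F1 = d1
    _, _, delta2, p0, F2 = d2
    seen, stack = {(q0, p0)}, [(q0, p0)]
    while stack:
        q, p = stack.pop()
        if (q in F1) != (p in F2):
            return False                   # some string lies in exactly one language
        for a in sigma:
            nxt = (delta1[(q, a)], delta2[(p, a)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True


SIGMA = {"a", "b"}
EVEN_A = ({"e", "o"}, SIGMA,
          {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"},
          "e", {"e"})
EVEN_A_3 = ({"x", "y", "z"}, SIGMA,
            {("x", "a"): "y", ("x", "b"): "x", ("y", "a"): "z",
             ("y", "b"): "y", ("z", "a"): "y", ("z", "b"): "z"},
            "x", {"x", "z"})
ONLY_A = ({0, 1, 2}, SIGMA,
          {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2,
           (1, "b"): 2, (2, "a"): 2, (2, "b"): 2},
          0, {1})
NOTHING = ONLY_A[:4] + (set(),)
print(is_infinite(EVEN_A), is_infinite(ONLY_A), equivalent(EVEN_A, EVEN_A_3))
```

A regular expression or regular grammar would first be converted to a dfa by the constructions of Chapter 3 before these tests are applied.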
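Exercises

For all the exercises in this section, assume that regular languages are given in standard representation.

1. Show that there exists an algorithm to determine whether or not $w \in L_1 - L_2$, for any given $w$ and any regular languages $L_1$ and $L_2$.

2. Show that there exists an algorithm for determining if $L_1 \subseteq L_2$, for any regular languages $L_1$ and $L_2$.

3. Show that there exists an algorithm for determining if $\lambda \in L$, for any regular language $L$.

4. Show that for any regular $L_1$ and $L_2$, there is an algorithm to determine whether or not $L_1 = L_1 / L_2$.

5. A language is said to be a palindrome language if $L = L^R$. Find an algorithm for determining if a given regular language is a palindrome language.

6. Exhibit an algorithm for determining whether or not a regular language $L$ contains any string $w$ such that $w^R \in L$.

7. Exhibit an algorithm that, given any three regular languages, $L$, $L_1$, $L_2$, determines whether or not $L = L_1 L_2$.

8. Exhibit an algorithm that, given any regular language $L$, determines whether or not $L = L^*$.

9. Let $L$ be a regular language on $\Sigma$ and $\hat{w}$ be any string in $\Sigma^*$. Find an algorithm to determine if $L$ contains any $w$ such that $\hat{w}$ is a substring of it, that is, such that $w = u\hat{w}v$, with $u, v \in \Sigma^*$.

10. Show that there is an algorithm to determine if $L = shuffle(L, L)$ for any regular $L$.

11. The operation $tail(L)$ is defined as

$tail(L) = \{v : uv \in L, u, v \in \Sigma^*\}$.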

Show that there is an algorithm for determining whether or not $L = tail(L)$ for any regular $L$.

12. Let $L$ be any regular language on $\Sigma = \{a, b\}$. Show that an algorithm exists for determining if $L$ contains any strings of even length.

13. Find an algorithm for determining whether a regular language $L$ contains an infinite number of even-length strings.

14. Describe an algorithm which, when given a regular grammar $G$, can tell us whether or not $L(G) = \Sigma^*$.

4.3 Identifying Nonregular Languages

Regular languages can be infinite, as most of our examples have demonstrated. The fact that regular languages are associated with automata that have finite memory, however, imposes some limits on the structure of a regular language. Some narrow restrictions must be obeyed if regularity is to hold. Intuition tells us that a language is regular only if, in processing any string, the information that has to be remembered at any stage is strictly limited. This is true, but has to be shown precisely to be used in any meaningful way. There are several ways in which this precision can be achieved.

Using the Pigeonhole Principle

The term "pigeonhole principle" is used by mathematicians to refer to the following simple observation. If we put $n$ objects into $m$ boxes (pigeonholes), and if $n > m$, then at least one box must have more than one item in it. This is such an obvious fact that it is surprising how many deep results can be obtained from it.

Example 4.6. Is the language $L = \{a^n b^n : n \geq 0\}$ regular? The answer is no, as we show using a proof by contradiction.

Suppose $L$ is regular. Then some dfa $M = (Q, \{a, b\}, \delta, q_0, F)$ exists for it. Now look at $\delta^*(q_0, a^i)$ for $i = 1, 2, 3, \ldots$. Since there are an unlimited number of $i$'s, but only a finite number of states in $M$, the pigeonhole principle tells us that there must be some state, say $q$, such that

$\delta^*(q_0, a^n) = q$

and

$\delta^*(q_0, a^m) = q$,

F. an'). and l s l> 1 . A Pumping Lemmo The following result. In order to use this type of argurnent in a variety of situations. 1 . that is.a*bn): d. This contradicts the original assumption that M accepts a^bn only if n : rn. the one we give here is perhaps the most famous one. L f o r a l l i : 0 . (qo. (d. There are several ways to do ttris. wi * . Then there exists some positive integer ?rl such that any w e L with l. an automaton would have to differentiate between all prefixes a* and a-. I In this argument.ul ) m. Let L be an infinite regular language.b") : 6 * ( q . ' ' z* &g e) (4 2) i s a l s oi n . and leads us to conclude that tr cannot be regular. the pigeonhole principle is just a way of stating precisely what we mean when we say that a finite automaton has a limited merlory. . can be decomposed as w egk.b') : q7 €. there are some rr and m fbr which the distinction cannot be made. with l r u l1 m . . b "t : Af. such that ^'. it is convenient to codify it as a general theorem. . But since there are only a finite number of internal states with which to do this. contain a cycle. But sirrce M accepts a"bn we must have d. (g. .4.3 InururrnyrNcNor-rRpcuLAR Lnlrcuacos lfE with n t' m. . 2 . The proof is based on the observation that in a transition graph with n vertices. any walk of length z or longer must repeat some vertex. . To accept all anbn. Flom this we can conclude that d* (go. uses the pigeonhole principle in another form. known as the pumping lemma for regular larrguages.

To paraphrase this, every sufficiently long string in $L$ can be broken into three parts in such a way that an arbitrary number of repetitions of the middle part yields another string in $L$. We say that the middle string is "pumped," hence the term pumping lemma for this result.

Proof: If $L$ is regular, there exists a dfa that recognizes it. Let such a dfa have states labeled $q_0, q_1, q_2, \ldots, q_n$. Now take a string $w$ in $L$ such that $|w| \ge m = n + 1$. Since $L$ is assumed to be infinite, this can always be done. Consider the set of states the automaton goes through as it processes $w$, say

$q_0, q_i, q_j, \ldots, q_f$.

Since this sequence has exactly $|w| + 1$ entries, at least one state must be repeated, and such a repetition must start no later than the $n$th move. Thus the sequence must look like

$q_0, q_i, q_j, \ldots, q_r, \ldots, q_r, \ldots, q_f$,

indicating there must be substrings $x$, $y$, $z$ of $w$ such that

$\delta^*(q_0, x) = q_r$,
$\delta^*(q_r, y) = q_r$,
$\delta^*(q_r, z) = q_f$,

with $|xy| \le n + 1 = m$ and $|y| \ge 1$. From this it immediately follows that

$\delta^*(q_0, xz) = q_f$,

as well as

$\delta^*(q_0, x y^2 z) = q_f$,
$\delta^*(q_0, x y^3 z) = q_f$,

and so on, completing the proof of the theorem.

We have given the pumping lemma only for infinite languages. Finite languages, although always regular, cannot be pumped since pumping automatically creates an infinite set. The theorem does hold for finite languages, but it is vacuous. The $m$ in the pumping lemma is to be taken larger than the longest string, so that no string can be pumped.

The pumping lemma, like the pigeonhole argument in Example 4.6, is used to show that certain languages are not regular. The demonstration is always by contradiction. There is nothing in the pumping lemma, as we have stated here, which can be used for proving that a language is regular.
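The construction in the proof can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the dfa is stored as a dictionary `DELTA` mapping (state, symbol) pairs to states, and the example automaton (strings with an even number of a's) and the function names are hypothetical, not taken from the text. The function follows the run of the dfa and cuts the string at the first repeated state, which gives a decomposition $xyz$ of the kind the theorem promises.

```python
def pumping_decomposition(delta, start, w):
    """Split w into (x, y, z) at the first repeated state of the dfa run."""
    seen = {start: 0}          # state -> position at which it was first visited
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in seen:      # pigeonhole: a state has been revisited
            first = seen[state]
            return w[:first], w[first:pos], w[pos:]
        seen[state] = pos
    return None                # no repeat: |w| is less than the number of states


def accepts(delta, start, finals, w):
    """Run the dfa on w and report whether it ends in a final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical example dfa: strings over {a, b} with an even number of a's.
DELTA = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}

x, y, z = pumping_decomposition(DELTA, "even", "abab")
pumped = [x + y * i + z for i in range(4)]   # w_0, w_1, w_2, w_3
```

Every string in `pumped` is again accepted by the dfa, because the substring `y` takes the automaton around a cycle from the repeated state back to itself.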

Even if we could show (and this is normally quite difficult) that any pumped string must be in the original language, there is nothing in the statement of Theorem 4.8 that allows us to conclude from this that the language is regular.

Example 4.7

Using the pumping lemma to show that $L = \{a^n b^n : n \ge 0\}$ is not regular. Assume that $L$ is regular, so that the pumping lemma must hold. We do not know the value of $m$, but whatever it is, we can always choose $n = m$. Therefore, the substring $y$ must consist entirely of a's. Suppose $|y| = k$. Then the string obtained by using $i = 0$ in Equation (4.2) is

$w_0 = a^{m-k} b^m$

and is clearly not in $L$. This contradicts the pumping lemma and thereby indicates that the assumption that $L$ is regular must be false.

In applying the pumping lemma, we must keep in mind what the theorem says. We are guaranteed the existence of an $m$ as well as the decomposition $xyz$, but we do not know what they are. We cannot claim that we have reached a contradiction just because the pumping lemma is violated for some specific values of $m$ or $xyz$. On the other hand, the pumping lemma holds for every $w \in L$ and every $i$. Therefore, if the pumping lemma is violated even for one $w$ or $i$, then the language cannot be regular.

The correct argument can be visualized as a game we play against an opponent. Our goal is to win the game by establishing a contradiction of the pumping lemma, while the opponent tries to foil us. There are four moves in the game.

1. The opponent picks $m$.
2. Given $m$, we pick a string $w$ in $L$ of length equal or greater than $m$. We are free to choose any $w$, subject to $w \in L$ and $|w| \ge m$.
3. The opponent chooses the decomposition $xyz$, subject to $|xy| \le m$, $|y| \ge 1$. We have to assume that the opponent makes the choice that will make it hardest for us to win the game.
4. We try to pick $i$ in such a way that the pumped string $w_i$, defined in Equation (4.2), is not in $L$. If we can do so, we win the game.

A strategy that allows us to win whatever the opponent's choices is tantamount to a proof that the language is not regular. In this, Step 2 is crucial. While we cannot force the opponent to pick a particular decomposition of $w$, we may be able to choose $w$ so that the opponent is very restricted in Step 3, forcing a choice of $x$, $y$, and $z$ that allows us to produce a violation of the pumping lemma on our next move.

Example 4.8

Let $\Sigma = \{a, b\}$. Show that

$L = \{w w^R : w \in \Sigma^*\}$

is not regular.

Whatever $m$ the opponent picks on Step 1, we can always choose a $w$ as shown in Figure 4.5. Because of this choice, and the requirement that $|xy| \le m$, the opponent is restricted in Step 3 to choosing a $y$ that consists entirely of a's. In Step 4, we use $i = 0$. The string obtained in this fashion has fewer a's on the left than on the right and so cannot be of the form $w w^R$. Therefore $L$ is not regular.

[Figure 4.5: the string $w = a^m b^m b^m a^m$ with the decomposition $xyz$ marked.]

Note that if we had chosen $w$ too short, then the opponent could have chosen a $y$ with an even number of b's. In that case, we could not have reached a violation of the pumping lemma on the last step. We would also fail if we were to choose a string consisting of all a's, say,

$w = a^{2m}$,

which is in $L$. To defeat us, the opponent need only pick

$y = aa$.

Now $w_i$ is in $L$ for all $i$, and we lose.

To apply the pumping lemma we cannot assume that the opponent will make a wrong move. If, in the case where we pick $w = a^{2m}$, the opponent were to pick

$y = a$,

then $w_0$ is a string of odd length and therefore not in $L$. But any argument that assumes that the opponent is so accommodating is automatically incorrect.
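The four-move game can be played out by brute force for small values of $m$. The sketch below does this for $L = \{a^n b^n : n \ge 0\}$ with our choice $w = a^m b^m$; the function names are our own, and checking finitely many $m$ is of course only an illustration of the argument, not a substitute for the proof.

```python
def in_L(s):
    """Membership test for L = {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def decompositions(w, m):
    """All splits w = xyz with |xy| <= m and |y| >= 1 (the opponent's options)."""
    for xy_len in range(1, min(m, len(w)) + 1):
        for x_len in range(xy_len):
            yield w[:x_len], w[x_len:xy_len], w[xy_len:]


def we_win(m, i=0):
    """True if pumping with exponent i leaves L for every opponent choice."""
    w = "a" * m + "b" * m                      # our move in Step 2
    return all(not in_L(x + y * i + z) for x, y, z in decompositions(w, m))


results = {m: we_win(m) for m in range(1, 8)}
```

For each $m$ tried, every decomposition the opponent can choose puts $y$ inside the leading block of a's, so removing $y$ (the case $i = 0$) always produces a string outside $L$.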

Example 4.9

Let $\Sigma = \{a, b\}$. The language

$L = \{w \in \Sigma^* : n_a(w) < n_b(w)\}$

is not regular.

Suppose we are given $m$. Since we have complete freedom in choosing $w$, we pick $w = a^m b^{m+1}$. Now, because $|xy|$ cannot be greater than $m$, the opponent cannot do anything but pick a $y$ with all a's, that is,

$y = a^k$, with $1 \le k \le m$.

We now pump up, using $i = 2$. The resulting string

$w_2 = a^{m+k} b^{m+1}$

is not in $L$. Therefore, the pumping lemma is violated, and $L$ is not regular.

Example 4.10

The language

$L = \{(ab)^n a^k : n > k, k \ge 0\}$

is not regular.

Given $m$, we pick as our string

$w = (ab)^{m+1} a^m$,

which is in $L$. Because of the constraint $|xy| \le m$, both $x$ and $y$ must be in the part of the string made up of ab's. The choice of $x$ does not affect the argument, so let us see what can be done with $y$. If our opponent picks $y = a$, we choose $i = 0$ and get a string not in $L((ab)^* a^*)$. If the opponent picks $y = ab$, we can choose $i = 0$ again. Now we get the string $(ab)^m a^m$, which is not in $L$. In the same way, we can deal with any possible choice by the opponent, thereby proving our claim.

Example 4.11

Show that

$L = \{a^{n!} : n \ge 0\}$

is not regular.

Given the opponent's choice for $m$, we pick as $w$ the string $a^{m!}$ (unless the opponent picks $m < 3$, in which case we can use $a^{3!}$ as $w$). The various decompositions of $w$ obviously differ only in the lengths of the substrings. Suppose the opponent picks $y$ such that

$|y| = k \le m$.

We then look at $xz$, which has length $m! - k$. This string is in $L$ only if there exists a $j$ such that

$m! - k = j!$.

But this is impossible, since for $m > 2$ and $k \le m$ we have

$m! - k > (m - 1)!$.

Therefore, the language is not regular.

In some cases, closure properties can be used to relate a given problem to one we have already classified. This may be much simpler than a direct application of the pumping lemma.

Example 4.12

Show that the language

$L = \{a^n b^k c^{n+k} : n \ge 0, k \ge 0\}$

is not regular.

It is not difficult to apply the pumping lemma directly, but it is even easier to use closure under homomorphism. Take

$h(a) = a$, $h(b) = a$, $h(c) = c$,

then

$h(L) = \{a^{n+k} c^{n+k} : n + k \ge 0\} = \{a^i c^i : i \ge 0\}$,

but we know this language is not regular; therefore $L$ cannot be regular either.

Example 4.13

Show that the language

$L = \{a^n b^l : n \ne l\}$

is not regular.

Here we need a bit of ingenuity to apply the pumping lemma directly. Choosing a string with $n = l + 1$ or $n = l + 2$ will not do, since our opponent can always choose a decomposition that will make it impossible to pump the string out of the language (that is, pump it so that it has an equal number of a's and b's). We must be more inventive. Let us take $n = m!$ and $l = (m + 1)!$. If the opponent now chooses a $y$ (by necessity consisting of all a's) of length $k \le m$, we pump $i$ times to generate a string with $m! + (i - 1)k$ a's. We can get a contradiction of the pumping lemma if we can pick $i$ such that

$m! + (i - 1)k = (m + 1)!$.

This is always possible since

$i = 1 + \dfrac{m \, m!}{k}$

and $k \le m$. The right side is therefore an integer, and we have succeeded in violating the conditions of the pumping lemma.

However, there is a much more elegant way of solving this problem. Suppose $L$ were regular. Then, by Theorem 4.1, $\overline{L}$ and the language

$L_1 = \overline{L} \cap L(a^* b^*)$

would also be regular. But $L_1 = \{a^n b^n : n \ge 0\}$, which we have already classified as nonregular. Consequently, $L$ cannot be regular.

The pumping lemma is difficult for several reasons. Its statement is complicated, and it is easy to go astray in applying it. But even if we master the technique, it may still be hard to see exactly how to use it. The pumping lemma is like a game with complicated rules. Knowledge of the rules is essential, but that alone is not enough to play a good game. You also need a good strategy to win. If you can apply the pumping lemma correctly to some of the more difficult cases in this book, you are to be congratulated.

EXERCISES

1. Prove the following version of the pumping lemma. If $L$ is regular, then there is an $m$ such that, every $w \in L$ of length greater than $m$ can be decomposed as $w = xyz$,
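The arithmetic in the direct argument for Example 4.13 is easy to check numerically. The short sketch below (the function name is our own) computes the exponent $i = 1 + m \, m!/k$ and confirms, for small $m$ and every admissible $k$, that pumping $i$ times gives exactly $(m + 1)!$ a's.

```python
from math import factorial


def pumping_exponent(m, k):
    """Exponent i with m! + (i - 1) * k == (m + 1)!, for 1 <= k <= m."""
    extra = factorial(m + 1) - factorial(m)   # equals m * m!
    assert extra % k == 0                     # k <= m, so k divides m!
    return 1 + extra // k


# One check per pair (m, k) with 1 <= k <= m.
checks = [
    factorial(m) + (pumping_exponent(m, k) - 1) * k == factorial(m + 1)
    for m in range(1, 9)
    for k in range(1, m + 1)
]
```

The division is exact because any $k \le m$ is one of the factors of $m!$, which is the reason the text can assert that $i$ is an integer.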

. n' is a prirne nutrber} . " . S l r o w t h a t t h e l a r r g u a g et ' : (i)e.lul ) tn.. l_ D n"t*. ffi (u) f : {a.g x n L:{*n'-h = n t5*'i \ n> lL- \ (\( * \ .'grl rn."L. therr there exists an m.as special cases. r . r t _ .12. such that the following holds for j every sufficiently lurg ur € .b}-} (g) L: t { t u r r l ..{""r. \_/ Lrro. = {an : n is either prirne or the product of two or more prirne numbcrs} 6. il'he middle string u can bc written asu:'rAz1 < with le.r22 Chopler 4 Pnopnnrtns oH ltr:ctrlan.t f . lgl ) 1.. j L-'\fl nr la t- (r] n>i1-r\.*inc ou if the following larrguages X : {a} are regular ffi 1 -oqK .hat.L is regular.. ' i . t . .gt*.L. is the product of two pritre numbers} (f) .L and cvery one of its decornpositions llr : I.-.L:{o. o l " .L fbr all i. therr -Lr U -Lr is also nonregulat. regular? 3 .t.o"* that the following larrguages are not regular. 1...""If trr arrtl . m {w:n*(ut):"r6(tr')} is trot rcgular. "nl_ _ .l I ) : I or l f k} {anblak : n (.. (c) l..{a" : n.t) . r}_1. ": .tltl. t ' .r. Is. ffi .t)l \J''.1).: -(. Apply the pumping lcmrna directly to show the result in Example 4. If .{unn :'ur€ {4. : {ut : n.Lz are nonregular lan@erov* guages. rno* that thefollowinglarrguageis trot regular.."Itlttk I+} n+l} : (b) . b } " } (fl *-.^ l ' (a) I : {an : n } 2. :2k (e) I : {an : n.(tu) t' nt (.ru)} ffi -(f) t . 2."bt:n{11 (c) Z.L ./t} ..ulrgi z'uz E L tor all i : 0.z€ X*..such t. l s l> 1 ' such that nyoz it irr . Fa. 2 Prove the frrlkrwirrg generalization of the prttrrping lemma. Rl: e { a . which irrclutles Thcorcm 4..(i) ordisprove the ftrllowing statement.tz with u.} " . .I: (c) I : { o " 6 l o a: A t I n .*+-(b) tv tL : {a"' : tt' is rxrt a prime number} {ttn : n: A 2 f o r s o m eh > 0 } for some h > 0} 4 .8 as well us Exercisc 1. r r . Lar'{cuacns with lyzl < rn..r. '-..rr.

I'he r+et walks is genera. ffi * 16.n infinite Lrut courrtable set. i ' r ' i * (b) /. it will be c.ulcrs 123 Consider the latrguagesbelow. the followirrg larrguagesregular? {u'*rr.p.utr e LJ} {tu : Apply the pigeonhole arguuent rliret:tly to the larrguage irr Exarnple 4. Show by cxample that tirc fatrily of regular larrguages is rrot r:losed urrtler irrfirrite urriorr.h languagc .4.ure of P. Consider the argurnerrt irr Set:tiorr iJ.rrz {o. * t ' r . of . : { a n b t u . b}+ } e W ( 1_t:)A* (a) I : | ".L2bc regular lt-r.L.wE la. .geregular? [. lrl } lul} dS G4 tr thc following languagcrcgular'l ' "'l l . | r-.An.t in light of of Exerr:ise 15.nguages.}5. 15 1 0 0 } ( e )r : { a " b r : l n .ncuI.)fet f be a.1r) e {a..lenotedby Upep1.. arrd associate with ear.your conjecture.bl L J / $ rl+(L5. : | J L \/. ( a ) Z . so tha.3 IuuvrrnyrxcNoun. Show that in this case.b}. u. : jrLt {tnuwRui u.''r i'rtrj'Lt1'u) {a. I : { a " b t : n > 1 0 0 . : {w1c'u2 i Lurl.L -.llyinfinite. r . the languags 7 : Is necessarily rcgular'l ffi . : {a*b{ : rz "1.is a prirne nurnber} I (e) {a.tNc.r the following langua. For each.r.ut€ L1. Thc smallcst sct containing every . r ' p ) . beca. < t. : { r ."ht : n.8.Lo. The larrguage assor:iaterlwith sut:h a graph is '/5-. it tloes rrot irnrnetliately follow that -L is regular.l l : 2 } \fOJ l. l"t -Lr and .use the special nat.1>3. P E P A '--. ju)r + :iuz} € ( il{ -J |2. 6}+ .h Sl} (c) . Then prove . kr r l l l k : > 5} m ffi (b) fI : {a"blak:rr.2 that the langrrage associated with any getreralized trartsition graph is regular. make a corriecture whether or not it is rcgular.{a"bl : nf I is an integer} (d) /. < 2nI ( I ) . L DeP whcrc P is thc sct of all walks through the graph and ru is the expression associated with a walk p. the infinite uniorr is regula.Lu is the union over the infinite set P.

17. Is the family of regular languages closed under infinite intersection?

18. Suppose that we know that $L_1 \cup L_2$ and $L_1$ are regular. Can we conclude from this that $L_2$ is regular?

19. In the chain code language in Exercise 22, Section 3.1, let $L$ be the set of all $w \in \{u, r, l, d\}^*$ that describe rectangles. Show that $L$ is not a regular language.

Context-Free Languages

In the last chapter, we discovered that not all languages are regular. While regular languages are effective in describing certain simple patterns, one does not need to look very far for examples of nonregular languages. The relevance of these limitations to programming languages becomes evident if we reinterpret some of the examples. If in $L = \{a^n b^n : n \ge 0\}$ we substitute a left parenthesis for a and a right parenthesis for b, then parentheses strings such as (()) and ((())) are in $L$, but (() is not. The language therefore describes a simple kind of nested structure found in programming languages, indicating that some properties of programming languages require something beyond regular languages. In order to cover this and other more complicated features we must enlarge the family of languages. This leads us to consider context-free languages and grammars.

We begin this chapter by defining context-free grammars and languages, illustrating the definitions with some simple examples. Next, we consider the important membership problem; in particular we ask how we can tell if a given string is derivable from a given grammar. Explaining a sentence through its grammatical derivation is familiar to most of us from a study of natural languages and is called parsing. Parsing is a way of describing sentence structure. It is important whenever we need to understand the meaning of a sentence, as we do for instance in translating from one language to another. In computer science, this is relevant in interpreters, compilers, and other translating programs.

The topic of context-free languages is perhaps the most important aspect of formal language theory as it applies to programming languages. Actual programming languages have many features that can be described elegantly by means of context-free languages. What formal language theory tells us about context-free languages has important applications in the design of programming languages as well as in the construction of efficient compilers. We touch upon this briefly in Section 5.3.

5.1 Context-Free Grammars

The productions in a regular grammar are restricted in two ways: the left side must be a single variable, while the right side has a special form. To create grammars that are more powerful, we must relax some of these restrictions. By retaining the restriction on the left side, but permitting anything on the right, we get context-free grammars.

Definition 5.1

A grammar $G = (V, T, S, P)$ is said to be context-free if all productions in $P$ have the form

$A \to x$,

where $A \in V$ and $x \in (V \cup T)^*$.

A language $L$ is said to be context-free if and only if there is a context-free grammar $G$ such that $L = L(G)$.

Every regular grammar is context-free, so a regular language is also a context-free one. But, as we know from simple examples such as $\{a^n b^n\}$, there are nonregular languages. We have already shown in Example 1.11 that this language can be generated by a context-free grammar, so we see that the family of regular languages is a proper subset of the family of context-free languages.

Context-free grammars derive their name from the fact that the substitution of the variable on the left of a production can be made any time such a variable appears in a sentential form. It does not depend on the

symbols in the rest of the sentential form (the context). This feature is the consequence of allowing only a single variable on the left side of the production.

Examples of Context-Free Languages

Example 5.1

The grammar $G = (\{S\}, \{a, b\}, S, P)$, with productions

$S \to aSa$,
$S \to bSb$,
$S \to \lambda$,

is context-free. A typical derivation in this grammar is

$S \Rightarrow aSa \Rightarrow aaSaa \Rightarrow aabSbaa \Rightarrow aabbaa$.

This makes it clear that

$L(G) = \{w w^R : w \in \{a, b\}^*\}$.

The language is context-free, but as shown in Example 4.8, it is not regular.

Example 5.2

The grammar $G$, with productions

$S \to abB$,
$A \to aaBb$,
$B \to bbAa$,
$A \to \lambda$,

is context-free. We leave it to the reader to show that

$L(G) = \{ab(bbaa)^n bba(ba)^n : n \ge 0\}$.

Both of the above examples involve grammars that are not only context-free, but linear. Regular and linear grammars are clearly context-free, but a context-free grammar is not necessarily linear.
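To see the grammar of Example 5.1 at work, one can enumerate its short sentences mechanically. The sketch below is only an illustration: the function name and the representation of sentential forms as Python strings (with the empty string standing for $\lambda$) are our own choices.

```python
def sentences(max_len):
    """Sentences of S -> aSa | bSb | lambda with length <= max_len."""
    found = set()
    frontier = ["S"]                      # sentential forms still to expand
    while frontier:
        form = frontier.pop()
        if "S" not in form:               # no variable left: a sentence
            found.add(form)
            continue
        for rhs in ("aSa", "bSb", ""):   # "" stands for lambda
            new = form.replace("S", rhs, 1)
            # keep only forms whose terminal part is still short enough
            if len(new.replace("S", "")) <= max_len:
                frontier.append(new)
    return found


words = sentences(4)
```

Every string produced this way reads the same forwards and backwards and has even length, in agreement with $L(G) = \{w w^R : w \in \{a, b\}^*\}$.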

Example 5.3

The language

$L = \{a^n b^m : n \ne m\}$

is context-free.

To show this, we need to produce a context-free grammar for the language. The case of $n = m$ was solved in Example 1.11 and we can build on that solution. Take the case $n > m$. We first generate a string with an equal number of a's and b's, then add extra a's on the left. This is done with

$S \to A S_1$,
$S_1 \to a S_1 b \mid \lambda$,
$A \to aA \mid a$.

We can use similar reasoning for the case $n < m$, and we get the answer

$S \to A S_1 \mid S_1 B$,
$S_1 \to a S_1 b \mid \lambda$,
$A \to aA \mid a$,
$B \to bB \mid b$.

The resulting grammar is context-free, hence $L$ is a context-free language. However, the grammar is not linear.

The particular form of the grammar given here was chosen for the purpose of illustration; there are many other equivalent context-free grammars. In fact, there are some simple linear ones for this language. In Exercise 25 at the end of this section you are asked to find one of them.

Example 5.4

Consider the grammar with productions

$S \to aSb \mid SS \mid \lambda$.

This is another grammar that is context-free, but not linear. Some strings in $L(G)$ are abaabb, aababb, and ababab. It is not difficult to conjecture and prove that

$L = \{w \in \{a, b\}^* : n_a(w) = n_b(w) \text{ and } n_a(v) \ge n_b(v), \text{ where } v \text{ is any prefix of } w\}$.    (5.1)

We can see the connection with programming languages clearly if we replace a and b with left and right parentheses, respectively. The language $L$ includes such strings as (()) and ()()() and is in fact the set of all properly nested parenthesis structures for the common programming languages.

Here again there are many other equivalent grammars. But, in contrast to Example 5.3, it is not so easy to see if there are any linear ones. We will have to wait until Chapter 8 before we can answer this question.

Leftmost and Rightmost Derivations

In context-free grammars that are not linear, a derivation may involve sentential forms with more than one variable. In such cases, we have a choice in the order in which variables are replaced. Take for example the grammar $G = (\{A, B, S\}, \{a, b\}, S, P)$ with productions

1. $S \to AB$,
2. $A \to aaA$,
3. $A \to \lambda$,
4. $B \to Bb$,
5. $B \to \lambda$.

It is easy to see that this grammar generates the language $L(G) = \{a^{2n} b^m : n \ge 0, m \ge 0\}$. Consider now the two derivations

$S \overset{1}{\Rightarrow} AB \overset{2}{\Rightarrow} aaAB \overset{3}{\Rightarrow} aaB \overset{4}{\Rightarrow} aaBb \overset{5}{\Rightarrow} aab$

and

$S \overset{1}{\Rightarrow} AB \overset{4}{\Rightarrow} ABb \overset{2}{\Rightarrow} aaABb \overset{5}{\Rightarrow} aaAb \overset{3}{\Rightarrow} aab$.

In order to show which production is applied, we have numbered the productions and written the appropriate number on the $\Rightarrow$ symbol. From this we see that the two derivations not only yield the same sentence but use exactly the same productions. The difference is entirely in the order in which the productions are applied. To remove such irrelevant factors, we often require that the variables be replaced in a specific order.

Definition 5.2

A derivation is said to be leftmost if in each step the leftmost variable in the sentential form is replaced. If in each step the rightmost variable is replaced, we call the derivation rightmost.
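The two numbered derivations for the grammar with productions 1 to 5 can be replayed mechanically, which makes the point about order concrete. In this sketch the dictionary `PRODUCTIONS` and the function `derive` are our own illustrative names; the code relies on the fact that, in this particular grammar, each variable occurs at most once in any sentential form.

```python
PRODUCTIONS = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""),
               4: ("B", "Bb"), 5: ("B", "")}   # "" stands for lambda


def derive(sequence, start="S"):
    """Apply numbered productions in order; return all sentential forms."""
    forms = [start]
    for number in sequence:
        left, right = PRODUCTIONS[number]
        current = forms[-1]
        position = current.index(left)   # each variable occurs at most once here
        forms.append(current[:position] + right + current[position + 1:])
    return forms


first = derive([1, 2, 3, 4, 5])    # always replaces the leftmost variable
second = derive([1, 4, 2, 5, 3])   # same productions, different order
```

Both sequences use each of the five productions exactly once and end in the same sentence, aab; only the intermediate sentential forms differ.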

Example 5.5

Consider the grammar with productions

$S \to aAB$,
$A \to bBb$,
$B \to A \mid \lambda$.

Then

$S \Rightarrow aAB \Rightarrow abBbB \Rightarrow abAbB \Rightarrow abbBbbB \Rightarrow abbbbB \Rightarrow abbbb$

is a leftmost derivation of the string abbbb. A rightmost derivation of the same string is

$S \Rightarrow aAB \Rightarrow aA \Rightarrow abBb \Rightarrow abAb \Rightarrow abbBbb \Rightarrow abbbb$.

Derivation Trees

A second way of showing derivations, independent of the order in which productions are used, is by a derivation tree. A derivation tree is an ordered tree in which nodes are labeled with the left sides of productions and in which the children of a node represent its corresponding right sides. For example, Figure 5.1 shows part of a derivation tree representing the production

$A \to abABc$.

[Figure 5.1: a node labeled $A$ with children a, b, $A$, $B$, c.]

In a derivation tree, a node labeled with a variable occurring on the left side of a production has children consisting of the symbols on the right side of that production. Beginning with the root, labeled with the start symbol and ending in leaves that are terminals, a derivation tree shows how each variable is replaced in the derivation. The following definition makes this notion precise.

Definition 5.3

Let $G = (V, T, S, P)$ be a context-free grammar. An ordered tree is a derivation tree for $G$ if and only if it has the following properties.

1. The root is labeled $S$.
2. Every leaf has a label from $T \cup \{\lambda\}$.
3. Every interior vertex (a vertex which is not a leaf) has a label from $V$.
4. If a vertex has label $A \in V$, and its children are labeled (from left to right) $a_1, a_2, \ldots, a_n$, then $P$ must contain a production of the form $A \to a_1 a_2 \cdots a_n$.
5. A leaf labeled $\lambda$ has no siblings, that is, a vertex with a child labeled $\lambda$ can have no other children.

A tree that has properties 3, 4, and 5, but in which 1 does not necessarily hold and in which property 2 is replaced by:

2a. Every leaf has a label from $V \cup T \cup \{\lambda\}$

is said to be a partial derivation tree.

The string of symbols obtained by reading the leaves of the tree from left to right, omitting any $\lambda$'s encountered, is said to be the yield of the tree. The descriptive term left to right can be given a precise meaning. The yield is the string of terminals in the order they are encountered when the tree is traversed in a depth-first manner, always taking the leftmost unexplored branch.

Example 5.6

Consider the grammar $G$, with productions

$S \to aAB$,
$A \to bBb$,
$B \to A \mid \lambda$.

The tree in Figure 5.2 is a partial derivation tree for $G$, while the tree in Figure 5.3 is a derivation tree. The string abBbB, which is the yield of the first tree, is a sentential form of $G$. The yield of the second tree, abbbb, is a sentence of $L(G)$.

[Figure 5.2: a partial derivation tree for $G$ with yield abBbB.]

[Figure 5.3: a derivation tree for $G$ with yield abbbb.]
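The definition of yield translates directly into a depth-first traversal. The sketch below uses a small `Node` class of our own design, and the two trees are built by hand for the grammar with productions $S \to aAB$, $A \to bBb$, $B \to A \mid \lambda$; they are assumed to have the same yields as the trees of Figures 5.2 and 5.3, which are not reproduced here.

```python
from dataclasses import dataclass, field

LAMBDA = ""   # label used here for the empty string


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def tree_yield(node):
    """Concatenate leaf labels left to right; lambda leaves contribute nothing."""
    if not node.children:
        return node.label
    return "".join(tree_yield(child) for child in node.children)


def leaf(label):
    return Node(label)


# Partial derivation tree for S => aAB => abBbB.
partial = Node("S", [leaf("a"),
                     Node("A", [leaf("b"), leaf("B"), leaf("b")]),
                     leaf("B")])

# Derivation tree using B -> A, A -> bBb, and B -> lambda twice.
full = Node("S", [leaf("a"),
                  Node("A", [leaf("b"),
                             Node("B", [Node("A", [leaf("b"),
                                                   Node("B", [leaf(LAMBDA)]),
                                                   leaf("b")])]),
                             leaf("b")]),
                  Node("B", [leaf(LAMBDA)])])
```

Calling `tree_yield(partial)` gives the sentential form abBbB, while `tree_yield(full)` gives the sentence abbbb, since the recursion visits children in left-to-right order exactly as the depth-first description requires.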

Relation Between Sentential Forms and Derivation Trees

Derivation trees give a very explicit and easily comprehended description of a derivation. Like transition graphs for finite automata, this explicitness is a great help in making arguments. First, though, we must establish the connection between derivations and derivation trees.

Theorem 5.1

Let $G = (V, T, S, P)$ be a context-free grammar. Then for every $w \in L(G)$, there exists a derivation tree of $G$ whose yield is $w$. Conversely, the yield of any derivation tree is in $L(G)$. Also, if $t_G$ is any partial derivation tree for $G$ whose root is labeled $S$, then the yield of $t_G$ is a sentential form of $G$.

Proof: First we show that for every sentential form of $L(G)$ there is a corresponding partial derivation tree. We do this by induction on the number of steps in the derivation. As a basis, we note that the claimed result is true for every sentential form derivable in one step. Since $S \Rightarrow u$ implies that there is a production $S \to u$, this follows immediately from Definition 5.3. Assume that for every sentential form derivable in $n$ steps, there is a corresponding partial derivation tree. Now any $w$ derivable in $n + 1$ steps must be such that

$S \overset{*}{\Rightarrow} xAy$, $x, y \in (V \cup T)^*$, $A \in V$,

in $n$ steps, and

$xAy \Rightarrow x a_1 a_2 \cdots a_m y = w$, $a_i \in V \cup T$.

Since by the inductive assumption there is a partial derivation tree with yield $xAy$, and since the grammar must have production $A \to a_1 a_2 \cdots a_m$, we see that by expanding the leaf labeled $A$, we get a partial derivation tree with yield $x a_1 a_2 \cdots a_m y = w$. By induction, we therefore claim that the result is true for all sentential forms.

In a similar vein, we can show that every partial derivation tree represents some sentential form. We will leave this as an exercise.

Since a derivation tree is also a partial derivation tree whose leaves are terminals, it follows that every sentence in $L(G)$ is the yield of some derivation tree of $G$ and that the yield of every derivation tree is in $L(G)$.

Derivation trees show which productions are used in obtaining a sentence, but do not give the order of their application. Derivation trees are able to represent any derivation, reflecting the fact that this order is irrelevant, an observation which allows us to close a gap in the preceding discussion. By definition, any $w \in L(G)$ has a derivation, but we have not claimed that it also had a leftmost or rightmost derivation. However, once we have a derivation tree, we can always get a leftmost derivation by thinking of the tree as having been built in such a way that the leftmost variable in the tree was always expanded first. Filling in a few details, we are led to the not surprising result that any $w \in L(G)$ has a leftmost and a rightmost derivation (for details, see Exercise 24 at the end of this section).

EXERCISES

1. Complete the arguments in Example 5.2, showing that the language given is generated by the grammar.
2. Draw the derivation tree corresponding to the derivation in Example 5.1.
3. Give a derivation tree for $w = abbbaabbaba$ for the grammar in Example 5.2. Use the derivation tree to find a leftmost derivation.
4. Show that the grammar in Example 5.4 does in fact generate the language described in Equation 5.1.
5. Is the language in Example 5.2 regular?
6. Complete the proof in Theorem 5.1 by showing that the yield of every partial derivation tree with root $S$ is a sentential form of $G$.

. rn ) 0) (u) I : {anb* : rt.ul: z} show that the complement of the language in Example 8. Find a context-free grammar for head(tr).. b}* : n^ (w) t' nu (w)} ( f ) . W .1.n > 1]. Let .nInz-L} (c)I: {a^b*:nl2rn} ( d ) . where . Show that the following. .Lh is corrtext-frec for a"trygiven A ) 1.h} ffi {a"6*o"n: n : rn or m I k} {anb*tk:h:nIm) {a'ob*ck:n+2m:t+) {a"b*ck : k : ln *l} W (f) f : {w e [a."h"'. (ut) I n. L . L : *r5' {uorur.rr. rn ) {a^b ch :rl:m or rn !. Tndacontext-freegrammarforE: \-/ x-.I : (c) tr: ( d ) .h I n + rn) {a"b*ch:k > 3}.134 Chopter 5 Conrrnxr-FRnr L. w h e r e . *--.b} forthelanguage p= {anu^u?b : w E *11. Given a context-free grammax G for a language tr.{ T r e { a .. Section 4. \ . 1 r r " < 3 n } S (*) I : {tl e {a. For the definition of head. b}+ .mcuacps b f! Find corrtext-freegrammars for the following languages(with rr. o. language is context-free.\L).h>o). (a) I: (b).I:{u.Lr be the language in Exercise B(a) antt . {o.> 0.. (w)] (e) I : (h) f: 9' /\\ (10) {a"h"'ck.(ur)+ t}. 'i s a n y p r e f i x o fu r } r r (e) I : {w e {o. show how one can create from G a grammar I so that l.Lz the Ianguage in Exercise g(d).c]* : nn (tu) + nr.b. ) 0. 13.l Lz is context-free. i rt.q. L: { a ' b * : 2 n . L: (e) L: Find context-free grarnrrrarsfor the following languages(with n.1 is context-free.. 14./ \J (a) Show tl::.tt) {a. (c) Show that Z and -L* are context-free..bl" : no (.see Exercise 18.w): 2nt. b \ * : n o ( u ) > n 6 ( u ) . Show that Lt J Lz is a context-free language.17n + 3} ffi (b). fe) = head. lul : e l. S (b) Show that . /'\ | ( L2/ Lct L: {a"hn: n > 0}.L is the language in Exercise 7(a) above.

\. with }j : {n. tr} . A + ttB.B. Consider the tlerivation tree below." the grarnrnar with prorluctions S * aaB. then every u E L (G) has a lcftmost V ancl riglrtnost clerivation. say 0 and []... fi+Aa. (tt ll) t()l' but not (Dl o' ((ll Using vour clefirrition. properly nestetl strings irr this situatiort are ([ ]). Show that the cor[plefirent of t]re latrguage in Exercisc 8(h) is cotrtcxt*free.'inrl a simplc graurrrrar G for which this is the clerivation tree of the strirtg of Thcn find two ntore serrtettccs I (G)' -rr:1ntrb' what one rrright mean bv properly rrested parenthesis stnrctures in/ Uu)n"ntte volving two kincls of pareflthescs. Civc a verhal tlescription of the language gerreratcd by this gralrlmar' 4ti9)orrsicter \_--.* that thc langrrage : {wicll)z i'u. Show a tlerivation tree for the string aabbbbwit'h the grarntnar g +. ' L fu-f) Sir. Fincl a rrrrrtext-free glalnlnar alphabet {a.5.ivcly. give a t:ontcxt-free gramrnar for ge[erating all properly nested parelrtnescs. .C}' ( 24. show that thc striilg gau. Give arr algorithm for finding sudr derivations from . c}. w aabbabba is rrot in the larrguaE. ffi for the set of all regular exJrressions on the Find a context-frec grallllllar that carr generate all thc production rules for context-freegrammars with T: {a.481.rrrrrt"a. is context-free. I.. hrouo that if G is a context-fi'ee grammar.fa*"""^" a derivatiotr tree. A * bBiiltr.b. 18'.1 Cot'lrnxr-FRer:ClR.nnrlunRs 135 16.c generatecl by this 2O. B-Sb. t urit}.tt)2e {o.*.b} and V: {A. Irrtuil.l.b}..

25. Let $G = (V, T, S, P)$ be a context-free grammar such that every one of its productions is of the form $A \to v$, with $|v| = k > 1$. Show that the derivation tree for any $w \in L(G)$ has a height $h$ such that

$$\log_k |w| \le h \le \frac{|w| - 1}{k - 1}.$$

26. Find a linear grammar for the language in Example 5.

5.2 Parsing and Ambiguity

We have so far concentrated on the generative aspects of grammars. Given a grammar $G$, we studied the set of strings that can be derived using $G$. In cases of practical applications, we are also concerned with the analytical side of the grammar: given a string $w$ of terminals, we want to know whether or not $w$ is in $L(G)$. If so, we may want to find a derivation of $w$. An algorithm that can tell us whether $w$ is in $L(G)$ is a membership algorithm. The term parsing describes finding a sequence of productions by which a $w \in L(G)$ is derived.

Parsing and Membership

Given a string $w$ in $L(G)$, we can parse it in a rather obvious fashion: we systematically construct all possible (say, leftmost) derivations and see whether any of them match $w$. Specifically, we start at round one by looking at all productions of the form

$$S \to x,$$

finding all $x$ that can be derived from $S$ in one step. If none of these result in a match with $w$, we go to the next round, in which we apply all applicable productions to the leftmost variable of every $x$. This gives us a set of sentential forms, some of them possibly leading to $w$. On each subsequent round, we again take all leftmost variables and apply all possible productions. It may be that some of these sentential forms can be rejected on the grounds that $w$ can never be derived from them, but in general, we will have on each round a set of possible sentential forms. After the first round, we have sentential forms that can be derived by applying a single production, after the second round we have the sentential forms that can be derived in two steps, and so on. If $w \in L(G)$, then it must have a leftmost derivation of finite length. Thus, the method will eventually give a leftmost derivation of $w$.

For reference below, we will call this the exhaustive search parsing method. It is a form of top-down parsing, which we can view as the construction of a derivation tree from the root down.

Example 5.7

Consider the grammar

$$S \to SS \mid aSb \mid bSa \mid \lambda$$

and the string $w = aabb$. Round one gives us

1. $S \Rightarrow SS$,
2. $S \Rightarrow aSb$,
3. $S \Rightarrow bSa$,
4. $S \Rightarrow \lambda$.

The last two of these can be removed from further consideration for obvious reasons. Round two then yields sentential forms

$$S \Rightarrow SS \Rightarrow SSS,$$
$$S \Rightarrow SS \Rightarrow aSbS,$$
$$S \Rightarrow SS \Rightarrow bSaS,$$
$$S \Rightarrow SS \Rightarrow S,$$

which are obtained by replacing the leftmost $S$ in sentential form 1 with all applicable substitutes. Similarly, from sentential form 2 we get the additional sentential forms

$$S \Rightarrow aSb \Rightarrow aSSb,$$
$$S \Rightarrow aSb \Rightarrow aaSbb,$$
$$S \Rightarrow aSb \Rightarrow abSab,$$
$$S \Rightarrow aSb \Rightarrow ab.$$

Again, several of these can be removed from contention. On the next round, we find the actual target string from the sequence

$$S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb.$$

Therefore $aabb$ is in the language generated by the grammar under consideration.

Exhaustive search parsing has serious flaws. The most obvious one is its tediousness; it is not to be used where efficient parsing is required. But even when efficiency is a secondary issue, there is a more pertinent objection. While the method always parses a $w \in L(G)$, it is possible that it never terminates for strings not in $L(G)$. This is certainly the case in the previous example; with $w = abb$, the method will go on producing trial sentential forms indefinitely unless we build into it some way of stopping.

The problem of nontermination of exhaustive search parsing is relatively easy to overcome if we restrict the form that the grammar can have. If we examine Example 5.7, we see that the difficulty comes from the production $S \to \lambda$; this production can be used to decrease the length of successive sentential forms, so that we cannot tell easily when to stop. If we do not have any such productions, then we have many fewer difficulties. In fact, there are two types of productions we want to rule out, those of the form $A \to \lambda$ as well as those of the form $A \to B$. As we will see in the next chapter, this restriction does not affect the power of the resulting grammars in any significant way.

Example 5.8

The grammar

$$S \to SS \mid aSb \mid bSa \mid ab \mid ba$$

satisfies the given requirements. It generates the language in Example 5.7 without the empty string.

Given any $w \in \{a, b\}^+$, the exhaustive search parsing method will always terminate in no more than $|w|$ rounds. This is clear because the length of the sentential form grows by at least one symbol in each round. After $|w|$ rounds we have either produced a parsing or we know that $w \notin L(G)$.

The idea in this example can be generalized and made into a theorem for context-free languages in general.

Theorem 5.2

Suppose that $G = (V, T, S, P)$ is a context-free grammar which does not have any rules of the form

$$A \to \lambda,$$

or

$$A \to B,$$

where $A, B \in V$. Then the exhaustive search parsing method can be made into an algorithm which, for any $w \in \Sigma^*$, either produces a parsing of $w$, or tells us that no parsing is possible.

Proof: For each sentential form, consider both its length and the number of terminal symbols. Each step in the derivation increases at least one of these. Since neither the length of a sentential form nor the number of terminal symbols can exceed $|w|$, a derivation cannot involve more than $2|w|$ rounds, at which time we either have a successful parsing or $w$ cannot be generated by the grammar.

While the exhaustive search method gives a theoretical guarantee that parsing can always be done, its practical usefulness is limited because the number of sentential forms generated by it may be excessively large. Exactly how many sentential forms are generated differs from case to case; no precise general result can be established, but we can put some rough upper bounds on it. If we restrict ourselves to leftmost derivations, we can have no more than $|P|$ sentential forms after one round, no more than $|P|^2$ sentential forms after the second round, and so on. In the proof of Theorem 5.2, we observed that parsing cannot involve more than $2|w|$ rounds; therefore, the total number of sentential forms cannot exceed

$$M = |P| + |P|^2 + \cdots + |P|^{2|w|}. \tag{5.2}$$

This indicates that the work for exhaustive search parsing may grow exponentially with the length of the string, making the cost of the method prohibitive. Of course, Equation (5.2) is only a bound, and often the number of sentential forms is much smaller. Nevertheless, practical observation shows that exhaustive search parsing is very inefficient in most cases.

The construction of more efficient parsing methods for context-free grammars is a complicated matter that belongs to a course on compilers. We will not pursue it here except for some isolated results.

Theorem 5.3

For every context-free grammar there exists an algorithm that parses any $w \in L(G)$ in a number of steps proportional to $|w|^3$.

There are several known methods to achieve this, but all of them are sufficiently complicated that we cannot even describe them without developing some additional results. In Section 6.3 we will take this question up again briefly. More details can be found in Harrison 1978 and Hopcroft and Ullman 1979. One reason for not pursuing this in detail is that even these algorithms are unsatisfactory. A method in which the work rises with the third power of the length of the string, while better than an exponential algorithm, is still quite inefficient, and a compiler based on it would need an excessive amount of time to parse even a moderately long program. What we would like to have is a parsing method which takes time proportional to the length of the string. We refer to such a method as a linear time parsing algorithm.
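The exhaustive search method, together with the $2|w|$ bound from the proof of Theorem 5.2, can be sketched in a few lines of Python. This is only an illustrative sketch, not an algorithm given in the text: the function name `exhaustive_parse`, the dictionary encoding of the productions, and the two pruning tests are assumptions made for the example. It further assumes that every symbol is a single character and that the grammar has no $\lambda$-productions and no unit-productions.

```python
def exhaustive_parse(productions, start, w):
    """Breadth-first exhaustive search over leftmost derivations.

    productions: dict mapping each variable (one character) to a list of
    right-hand sides (strings). Assumed: no lambda- or unit-productions.
    Returns the sentential forms of one leftmost derivation of w, or None.
    """
    if not w:
        return None  # a lambda-free grammar cannot derive the empty string
    frontier = [[start]]  # every entry is a derivation whose last form has a variable
    seen = {start}
    for _ in range(2 * len(w)):  # round limit taken from Theorem 5.2
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            # Position of the leftmost variable in the current sentential form.
            i = next(k for k, s in enumerate(form) if s in productions)
            for rhs in productions[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if new == w:
                    return derivation + [new]
                j = next((k for k, s in enumerate(new) if s in productions), None)
                if j is None:
                    continue  # all terminals, but not the target string
                # Reject forms from which w can never be derived: too long,
                # or the terminal prefix disagrees with w.
                if len(new) > len(w) or not w.startswith(new[:j]) or new in seen:
                    continue
                seen.add(new)
                next_frontier.append(derivation + [new])
        frontier = next_frontier
    return None


# Grammar of Example 5.8: S -> SS | aSb | bSa | ab | ba
grammar = {"S": ["SS", "aSb", "bSa", "ab", "ba"]}
print(exhaustive_parse(grammar, "S", "aabb"))  # a leftmost derivation of aabb
print(exhaustive_parse(grammar, "S", "abb"))   # None: abb is not in the language
```

The length test is valid only because no production can shorten a sentential form; with the grammar of Example 5.7, which contains $S \to \lambda$, it would wrongly discard useful forms.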

We do not know any linear time parsing methods for context-free languages in general, but such algorithms can be found for restricted, but important, special cases.

Definition 5.4

A context-free grammar $G = (V, T, S, P)$ is said to be a simple grammar or s-grammar if all its productions are of the form

$$A \to ax,$$

where $A \in V$, $a \in T$, $x \in V^*$, and any pair $(A, a)$ occurs at most once in $P$.

Example 5.9

The grammar

$$S \to aS \mid bSS \mid c$$

is an s-grammar. The grammar

$$S \to aS \mid bSS \mid aSS \mid c$$

is not an s-grammar because the pair $(S, a)$ occurs in the two productions $S \to aS$ and $S \to aSS$.

While s-grammars are quite restrictive, they are of some interest. As we will see in the next section, many features of common programming languages can be described by s-grammars.

If $G$ is an s-grammar, then any string $w$ in $L(G)$ can be parsed with an effort proportional to $|w|$. To see this, look at the exhaustive search method and the string $w = a_1 a_2 \cdots a_n$. Since there can be at most one rule with $S$ on the left, and starting with $a_1$ on the right, the derivation must begin with

$$S \Rightarrow a_1 A_1 A_2 \cdots A_m.$$

Next, we substitute for the variable $A_1$, but since again there is at most one choice, we must have

$$S \stackrel{*}{\Rightarrow} a_1 a_2 B_1 B_2 \cdots A_2 \cdots A_m.$$

We see from this that each step produces one terminal symbol and hence the whole process must be completed in no more than $|w|$ steps.
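The linear-time argument for s-grammars can be made concrete with a short Python sketch. The function name `parse_s_grammar` and the encoding of productions as a dictionary keyed by the pair $(A, a)$ are assumptions for illustration; the dictionary keys enforce the condition that each pair occurs at most once.

```python
def parse_s_grammar(productions, start, w):
    """Parse w with an s-grammar, using one production per input symbol.

    productions: dict mapping (variable, terminal) to the string of
    variables x in the production A -> a x (hypothetical encoding).
    Returns the list of productions used, as (A, right side), or None.
    """
    stack = [start]  # variables still to be expanded, leftmost on top
    steps = []
    for a in w:
        if not stack:
            return None  # input remains but no variable is left to expand
        A = stack.pop()
        x = productions.get((A, a))
        if x is None:
            return None  # no production for the pair (A, a)
        steps.append((A, a + x))
        stack.extend(reversed(x))  # keep the leftmost variable on top
    return steps if not stack else None


# s-grammar of Example 5.9: S -> aS | bSS | c
s_grammar = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}
print(parse_s_grammar(s_grammar, "S", "abcc"))
print(parse_s_grammar(s_grammar, "S", "abc"))  # None: one S is left unexpanded
```

Each iteration consumes exactly one terminal of $w$, so a successful parse uses exactly $|w|$ productions.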

Ambiguity in Grammars and Languages

On the basis of our argument we can claim that given any $w \in L(G)$, exhaustive search parsing will produce a derivation tree for $w$. We say "a" derivation tree rather than "the" derivation tree because of the possibility that a number of different derivation trees may exist. This situation is referred to as ambiguity.

Definition 5.5

A context-free grammar $G$ is said to be ambiguous if there exists some $w \in L(G)$ that has at least two distinct derivation trees. Alternatively, ambiguity implies the existence of two or more leftmost or rightmost derivations.

Example 5.10

The grammar in Example 5.4, with productions $S \to aSb \mid SS \mid \lambda$, is ambiguous. The sentence $aabb$ has the two derivation trees shown in Figure 5.4.

[Figure 5.4]

Ambiguity is a common feature of natural languages, where it is tolerated and dealt with in a variety of ways. In programming languages, where there should be only one interpretation of each statement, ambiguity must be removed when possible. Often we can achieve this by rewriting the grammar in an equivalent, unambiguous form.

Example 5.11

Consider the grammar $G = (V, T, E, P)$ with

$$V = \{E, I\},$$
$$T = \{a, b, c, +, *, (, )\},$$

and productions

$$E \to I,$$
$$E \to E + E,$$
$$E \to E * E,$$
$$E \to (E),$$
$$I \to a \mid b \mid c.$$

The strings $(a + b) * c$ and $a * b + c$ are in $L(G)$. It is easy to see that this grammar generates a restricted subset of arithmetic expressions for C and Pascal-like programming languages. The grammar is ambiguous. For instance, the string $a + b * c$ has two different derivation trees, as shown in Figure 5.5.

[Figure 5.5: Two derivation trees for $a + b * c$.]

One way to resolve the ambiguity is, as is done in programming manuals, to associate precedence rules with the operators $+$ and $*$. Since $*$ normally has higher precedence than $+$, we would take Figure 5.5(a) as the correct parsing as it indicates that $b * c$ is a subexpression to be evaluated before performing the addition. However, this resolution is completely outside the grammar. It is better to rewrite the grammar so that only one parsing is possible.

Example 5.12

To rewrite the grammar in Example 5.11 we introduce new variables, taking $V$ as $\{E, T, F, I\}$, and replace the productions with

$$E \to T,$$
$$T \to F,$$
$$F \to I,$$
$$E \to E + T,$$
$$T \to T * F,$$
$$F \to (E),$$
$$I \to a \mid b \mid c.$$

A derivation tree of the sentence $a + b * c$ is shown in Figure 5.6. No other derivation tree is possible for this string: the grammar is unambiguous. It also is equivalent to the grammar in Example 5.11. It is not too hard to justify these claims in this specific instance, but, in general, the questions of whether a given context-free grammar is ambiguous or whether two given context-free grammars are equivalent are very difficult to answer. In fact, we will later show that there are no general algorithms by which these questions can always be resolved.

[Figure 5.6]

In the foregoing example the ambiguity came from the grammar in the sense that it could be removed by finding an equivalent unambiguous grammar. In some instances, however, this is not possible because the ambiguity is in the language.
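For one particular string, the claims made in Examples 5.11 and 5.12 can be checked mechanically by counting derivation trees. The following Python sketch does this with a memoized recursion over substrings. The function name `count_trees` and the string encoding of the grammars are assumptions for illustration, and the method works only under stated restrictions: single-character symbols, no $\lambda$-productions, and no cycles of unit-productions. It says nothing about ambiguity in general, which, as noted above, cannot be decided by any general algorithm.

```python
from functools import lru_cache


def count_trees(productions, start, w):
    """Count the distinct derivation trees of w from the variable start.

    productions: dict mapping each variable to a list of right-hand sides
    (strings of single-character symbols). Assumed: no lambda-productions
    and no cycles of unit-productions, so the recursion terminates.
    """

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees with root `symbol` and yield w[i:j].
        if symbol not in productions:  # terminal symbol
            return 1 if w[i:j] == symbol else 0
        return sum(count_seq(rhs, i, j) for rhs in productions[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbols of rhs can jointly derive w[i:j].
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol derives at least one character, so leave room for rest.
        for k in range(i + 1, j - len(rest) + 1):
            left = count(first, i, k)
            if left:
                total += left * count_seq(rest, k, j)
        return total

    return count(start, 0, len(w))


ambiguous = {"E": ["I", "E+E", "E*E", "(E)"], "I": ["a", "b", "c"]}
unambiguous = {
    "E": ["T", "E+T"],
    "T": ["F", "T*F"],
    "F": ["I", "(E)"],
    "I": ["a", "b", "c"],
}
print(count_trees(ambiguous, "E", "a+b*c"))    # 2
print(count_trees(unambiguous, "E", "a+b*c"))  # 1
```

A count greater than one for some string shows that a grammar is ambiguous; a count of one for a single string, as for the second grammar here, is consistent with unambiguity but does not prove it.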

Definition 5.6

If $L$ is a context-free language for which there exists an unambiguous grammar, then $L$ is said to be unambiguous. If every grammar that generates $L$ is ambiguous, then the language is called inherently ambiguous.

It is a somewhat difficult matter even to exhibit an inherently ambiguous language.

Example 5.13

The language

$$L = \{a^n b^n c^m\} \cup \{a^n b^m c^m\},$$

with $n$ and $m$ non-negative, is an inherently ambiguous context-free language.

That $L$ is context-free is easy to show. Notice that

$$L = L_1 \cup L_2,$$

where $L_1$ is generated by

$$S_1 \to S_1 c \mid A,$$
$$A \to aAb \mid \lambda,$$

and $L_2$ is given by an analogous grammar with start symbol $S_2$ and productions

$$S_2 \to aS_2 \mid B,$$
$$B \to bBc \mid \lambda.$$

Then $L$ is generated by the combination of these two grammars with the additional production

$$S \to S_1 \mid S_2.$$

The grammar is ambiguous since the string $a^n b^n c^n$ has two distinct derivations, one starting with $S \Rightarrow S_1$, the other with $S \Rightarrow S_2$. It does of course not follow from this that $L$ is inherently ambiguous as there might exist some other nonambiguous grammars for it. But in some way $L_1$ and $L_2$ have conflicting requirements, the first putting a restriction on the number of $a$'s and $b$'s, while the second does the same for $b$'s and $c$'s.

A few tries will quickly convince you of the impossibility of combining these requirements in a single set of rules that cover the case $n = m$ uniquely. A rigorous argument, though, is quite technical. One proof can be found in Harrison 1978.

EXERCISES

1. Find an s-grammar for $L(aaa^*b + b)$.

2. Find an s-grammar for $L = \{a^n b^n : n \ge 1\}$.

3. Find an s-grammar for $L = \{a^n b^{n+1} : n \ge 2\}$.

4. Show that every s-grammar is unambiguous.

5. Let $G = (V, T, S, P)$ be an s-grammar. Give an expression for the maximum size of $P$ in terms of $|V|$ and $|T|$.

6. Show that the following grammar is ambiguous.

$$S \to AB \mid aaB,$$
$$A \to a \mid Aa,$$
$$B \to b.$$

7. Construct an unambiguous grammar equivalent to the grammar in Exercise 6.

8. Give the derivation tree for $(((a + b) * c)) + a + b$, using the grammar in Example 5.12.

9. Show that a regular language cannot be inherently ambiguous.

10. Give an unambiguous grammar that generates the set of all regular expressions on $\Sigma = \{a, b\}$.

11. Is it possible for a regular grammar to be ambiguous?

12. Show that the language $L = \{ww^R : w \in \{a, b\}^*\}$ is not inherently ambiguous.

13. Show that the following grammar is ambiguous.

$$S \to aSbS \mid bSaS \mid \lambda$$

14. Show that the grammar in Example 5.4 is ambiguous, but that the language denoted by it is not.

15. Show that the grammar in Example 1.13 is ambiguous.

16. Show that the grammar in Example 5.5 is unambiguous.

17. Use the exhaustive search parsing method to parse the string $abbbbbb$ with the grammar in Example 5.5. In general, how many rounds will be needed to parse any string $w$ in this language?

18. Show that the grammar in Example 1.14 is unambiguous.

19. Find a grammar equivalent to that in Example 5.5 which satisfies the conditions of Theorem 5.2.

20. Prove the following result. Let $G = (V, T, S, P)$ be a context-free grammar in which every $A \in V$ occurs on the left side of at most one production. Then $G$ is unambiguous.

5.3 Context-Free Grammars and Programming Languages

One of the most important uses of the theory of formal languages is in the definition of programming languages and in the construction of interpreters and compilers for them. The basic problem here is to define a programming language precisely and to use this definition as the starting point for the writing of efficient and reliable translation programs. Both regular and context-free languages are important in achieving this. As we have seen, regular languages are used in the recognition of certain simple patterns which occur in programming languages, but as we argued in the introduction to this chapter, we need context-free languages to model more complicated aspects.

As with most other languages, we can define a programming language by a grammar. It is traditional in writing on programming languages to use a convention for specifying grammars called the Backus-Naur form or BNF. This form is in essence the same as the notation we have used here, but the appearance is different. In BNF, variables are enclosed in triangular brackets. Terminal symbols are written without any special marking. BNF also uses subsidiary symbols such as |, much in the way we have done. Thus, the grammar in Example 5.12 might appear in BNF as

<expression> ::= <term> | <expression> + <term>,
<term> ::= <factor> | <term> * <factor>,

and so on. The symbols + and * are terminals. The symbol | is used as an alternator as in our notation, but ::= is used instead of $\to$. BNF descriptions of programming languages tend to use more explicit variable identifiers to make the intent of the production explicit. But otherwise there are no significant differences between the two notations.

Many parts of a Pascal-like programming language are susceptible to definition by restricted forms of context-free grammars. For example, a Pascal if-then-else statement can be defined as

<if_statement> ::= if <expression> <then_clause> <else_clause>.

Here the keyword if is a terminal symbol. All other terms are variables which still have to be defined. If we check this against Definition 5.4, we see that this looks like an s-grammar production. The variable <if_statement> on the left is always associated with the terminal if on the right. For this reason such a statement is easily and efficiently parsed. We see here a reason why we use keywords in programming languages. Keywords not only provide some visual structure that can guide the reader of a program, but also make the work of a compiler much easier.

Unfortunately, not all features of a typical programming language can be expressed by an s-grammar. The rules for <expression> above are not of this type, so that parsing becomes less obvious. The question then arises for what grammatical rules we can permit and still parse efficiently. In compilers, extensive use has been made of what are called LL and LR grammars, which have the ability to express the less obvious features of a programming language, yet allow us to parse in linear time. This is not a simple matter, and much of it is beyond the scope of our discussion. We will briefly touch on this topic in Chapter 6, but for our purposes it suffices to realize that such grammars exist and have been widely studied.

In connection with this, the issue of ambiguity takes on added significance. The specification of a programming language must be unambiguous, otherwise a program may yield very different results when processed by different compilers or run on different systems. As Example 5.11 shows, a naive approach can easily introduce ambiguity in the grammar. To avoid such mistakes we must be able to recognize and remove ambiguities. A related question is whether a language is or is not inherently ambiguous. What we need for this purpose are algorithms for detecting and removing ambiguities in context-free grammars and for deciding whether or not a context-free language is inherently ambiguous. Unfortunately, these are very difficult tasks, impossible in the most general sense, as we will see later.

Those aspects of a programming language which can be modeled by a context-free grammar are usually referred to as its syntax. However, it is normally the case that not all programs which are syntactically correct in this sense are in fact acceptable programs. For Pascal, the usual BNF definition allows constructs such as

var x : integer;
var x : real

or

var x : integer;
x := 3.2.

Neither of these two constructs is acceptable to a Pascal compiler, since they violate other constraints, such as "an integer variable cannot be assigned a real value." This kind of rule is part of programming language semantics, since it has to do with how we interpret the meaning of a particular construct.

Programming language semantics are a complicated matter. Nothing as elegant and concise as context-free grammars exists for the specification of programming language semantics, and consequently some semantic features may be poorly defined or ambiguous. It is an ongoing concern both in programming languages and in formal language theory to find effective methods for defining programming language semantics. Several methods have been proposed, but none of them have been as universally accepted and as successful for semantic definition as context-free languages have been for syntax.

EXERCISES

1. Give a complete definition of <expression> for Pascal.

2. Give a BNF definition for the Pascal while statement (leaving the general concept <statement> undefined).

3. Give a BNF grammar that shows the relation between a Pascal program and its subprograms.

4. Give a BNF definition of a FORTRAN do statement.

5. Give a definition of the correct form of the if-else statement in C.

6. Find examples of features of C that cannot be described by context-free grammars.

Simplification of Context-Free Grammars and Normal Forms

Before we can study context-free languages in greater depth, we must attend to some technical matters. The definition of a context-free grammar imposes no restriction whatsoever on the right side of a production. However, complete freedom is not necessary, and in fact, is a detriment in some arguments. In Theorem 5.2, we saw the convenience of certain restrictions on grammatical forms; eliminating rules of the form $A \to \lambda$ and $A \to B$ made the arguments easier. In many instances, it is desirable to place even more stringent restrictions on the grammar. Because of this, we need to look at methods for transforming an arbitrary context-free grammar into an equivalent one that satisfies certain restrictions on its form. In this chapter we study several transformations and substitutions that will be useful in subsequent discussions.

We also investigate normal forms for context-free grammars. A normal form is one that, although restricted, is broad enough so that any grammar has an equivalent normal-form version. We introduce two of the most useful of these, the Chomsky normal form and the Greibach normal form. Both have many practical and theoretical uses. An immediate application of the Chomsky normal form to parsing is given in Section 6.3.

The somewhat tedious nature of the material in this chapter lies in the fact that many of the arguments are manipulative and give little intuitive insight. For our purposes, this technical aspect is relatively unimportant and can be read casually. The various conclusions are significant; they will be used many times in later discussions.

6.1 Methods for Transforming Grammars

We first raise an issue that is somewhat of a nuisance with grammars and languages in general: the presence of the empty string. The empty string plays a rather singular role in many theorems and proofs, and it is often necessary to give it special attention. We prefer to remove it from consideration altogether, looking only at languages that do not contain $\lambda$. In doing so, we do not lose generality, as we see from the following considerations. Let $L$ be any context-free language, and let $G = (V, T, S, P)$ be a context-free grammar for $L - \{\lambda\}$. Then the grammar we obtain by adding to $V$ the new variable $S_0$, making $S_0$ the start variable, and adding to $P$ the productions

$$S_0 \to S \mid \lambda,$$

generates $L$. Therefore any nontrivial conclusion we can make for $L - \{\lambda\}$ will almost certainly transfer to $L$. Also, given any context-free grammar $G$, there is a method for obtaining $\hat{G}$ such that $L(\hat{G}) = L(G) - \{\lambda\}$ (see Exercise 13 at the end of this section). Consequently, for all practical purposes, there is no difference between context-free languages that include $\lambda$ and those that do not. For the rest of this chapter, unless otherwise stated, we will restrict our discussion to $\lambda$-free languages.

A Useful Substitution Rule

Many rules govern generating equivalent grammars by means of substitutions. Here we give one that is very useful for simplifying grammars in various ways. We will not define the term simplification precisely, but we will use it nevertheless. What we mean by it is the removal of certain types of undesirable productions; the process does not necessarily result in an actual reduction of the number of rules.

Theorem 6.1

Let $G = (V, T, S, P)$ be a context-free grammar. Suppose that $P$ contains a production of the form

$$A \to x_1 B x_2. \tag{6.1}$$

Assume that $A$ and $B$ are different variables and that

$$B \to y_1 \mid y_2 \mid \cdots \mid y_n$$

is the set of all productions in $P$ which have $B$ as the left side. Let $\hat{G} = (V, T, S, \hat{P})$ be the grammar in which $\hat{P}$ is constructed by deleting (6.1) from $P$, and adding to it

$$A \to x_1 y_1 x_2 \mid x_1 y_2 x_2 \mid \cdots \mid x_1 y_n x_2.$$

Then

$$L(\hat{G}) = L(G).$$

Proof: Suppose that $w \in L(G)$, so that

$$S \stackrel{*}{\Rightarrow}_G w.$$

The subscript on the derivation sign $\Rightarrow$ is used here to distinguish between derivations with different grammars. If this derivation does not involve the production (6.1), then obviously

$$S \stackrel{*}{\Rightarrow}_{\hat{G}} w.$$

If it does, then look at the derivation the first time (6.1) is used. The $B$ so introduced eventually has to be replaced; we lose nothing by assuming that this is done immediately (see Exercise 17 at the end of this section). Thus

$$S \stackrel{*}{\Rightarrow}_G u_1 A u_2 \Rightarrow_G u_1 x_1 B x_2 u_2 \Rightarrow_G u_1 x_1 y_j x_2 u_2.$$

But with grammar $\hat{G}$ we can get

$$S \stackrel{*}{\Rightarrow}_{\hat{G}} u_1 A u_2 \Rightarrow_{\hat{G}} u_1 x_1 y_j x_2 u_2.$$

Thus we can reach the same sentential form with $G$ and $\hat{G}$. If (6.1) is used again later, we can repeat the argument. It follows then, by induction on the number of times the production is applied, that

$$S \stackrel{*}{\Rightarrow}_{\hat{G}} w.$$

Therefore, if $w \in L(G)$, then $w \in L(\hat{G})$.

By similar reasoning, we can show that if $w \in L(\hat{G})$, then $w \in L(G)$, completing the proof.

Theorem 6.1 is a simple and quite intuitive substitution rule: a production $A \to x_1 B x_2$ can be eliminated from a grammar if we put in its place the set of productions in which $B$ is replaced by all strings it derives in one step. In this result, it is necessary that $A$ and $B$ be different variables. The case when $A = B$ is partially addressed in Exercises 22 and 23 at the end of this section.

Example 6.1

Consider $G = (\{A, B\}, \{a, b, c\}, A, P)$ with productions

$$A \to a \mid aaA \mid abBc,$$
$$B \to abbA \mid b.$$

Using the suggested substitution for the variable $B$, we get the grammar $\hat{G}$ with productions

$$A \to a \mid aaA \mid ababbAc \mid abbc,$$
$$B \to abbA \mid b.$$

The new grammar $\hat{G}$ is equivalent to $G$. The string $aaabbc$ has the derivation

$$A \Rightarrow aaA \Rightarrow aaabBc \Rightarrow aaabbc$$

in $G$, and the corresponding derivation

$$A \Rightarrow aaA \Rightarrow aaabbc$$

in $\hat{G}$.

Notice that, in this case, the variable $B$ and its associated productions are still in the grammar even though they can no longer play a part in any derivation. We will see shortly how such unnecessary productions can be removed from a grammar.

Removing Useless Productions

One invariably wants to remove productions from a grammar that can never take part in any derivation. For example, in the grammar whose entire production set is

$$S \to aSb \mid \lambda \mid A,$$
$$A \to aA,$$

the production $S \to A$ clearly plays no role, as $A$ cannot be transformed into a terminal string. While $A$ can occur in a string derived from $S$, this can never lead to a sentence. Removing this production leaves the language unaffected and is a simplification by any definition.
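Before turning to useless productions in detail, the substitution of Theorem 6.1, as applied in Example 6.1, can be sketched in Python. The function name `substitute` and the dictionary encoding of the productions are assumptions for illustration, and symbols are taken to be single characters. The sketch also makes the remark in Example 6.1 visible: the productions for $B$ are still present after the substitution.

```python
def substitute(productions, A, rhs, pos):
    """Apply the substitution rule of Theorem 6.1 to one production.

    productions: dict mapping each variable to a list of right-hand sides.
    The production A -> rhs must have a variable B != A at index pos.
    Returns a new dict in which A -> x1 B x2 is replaced by the
    productions A -> x1 y x2 for every production B -> y.
    """
    B = rhs[pos]
    if B == A or B not in productions:
        raise ValueError("rhs[pos] must be a variable different from A")
    x1, x2 = rhs[:pos], rhs[pos + 1:]
    new = {v: list(r) for v, r in productions.items()}
    new[A].remove(rhs)  # delete A -> x1 B x2
    for y in productions[B]:
        candidate = x1 + y + x2
        if candidate not in new[A]:
            new[A].append(candidate)
    return new


# Grammar of Example 6.1: A -> a | aaA | abBc, B -> abbA | b
G = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
G_hat = substitute(G, "A", "abBc", 2)
print(G_hat["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
print(G_hat["B"])  # ['abbA', 'b'] is unchanged, although no longer reachable from A
```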

Definition 6.1

Let $G = (V, T, S, P)$ be a context-free grammar. A variable $A \in V$ is said to be useful if and only if there is at least one $w \in L(G)$ such that

$$S \stackrel{*}{\Rightarrow} xAy \stackrel{*}{\Rightarrow} w, \tag{6.2}$$

with $x, y$ in $(V \cup T)^*$. In words, a variable is useful if and only if it occurs in at least one derivation. A variable that is not useful is called useless. A production is useless if it involves any useless variable.

Example 6.2

A variable may be useless because there is no way of getting a terminal string from it. The case just mentioned is of this kind. Another reason a variable may be useless is shown in the next grammar. In a grammar with start symbol $S$ and productions

$$S \to A,$$
$$A \to aA \mid \lambda,$$
$$B \to bA,$$

the variable $B$ is useless and so is the production $B \to bA$. Although $B$ can derive a terminal string, there is no way we can achieve $S \stackrel{*}{\Rightarrow} xBy$.

This example illustrates the two reasons why a variable is useless: either because it cannot be reached from the start symbol or because it cannot derive a terminal string. A procedure for removing useless variables and productions is based on recognizing these two situations. Before we present the general case and the corresponding theorem, let us look at another example.

Example 6.3

Eliminate useless symbols and productions from $G = (V, T, S, P)$, where $V = \{S, A, B, C\}$ and $T = \{a, b\}$, with $P$ consisting of

$$S \to aS \mid A \mid C,$$
$$A \to a,$$
$$B \to aa,$$
$$C \to aCb.$$

First, we identify the set of variables that can lead to a terminal string. Because $A \to a$ and $B \to aa$, the variables $A$ and $B$ belong to this set. So does $S$, because $S \Rightarrow A \Rightarrow a$. However, this argument cannot be made for $C$, thus identifying it as useless. Removing $C$ and its corresponding productions, we are led to the grammar $G_1$ with variables $V_1 = \{S, A, B\}$, terminals $T = \{a\}$, and productions

$S \to aS \mid A,$
$A \to a,$
$B \to aa.$

Next we want to eliminate the variables that cannot be reached from the start variable. For this, we can draw a dependency graph for the variables. Dependency graphs are a way of visualizing complex relationships and are found in many applications. For context-free grammars, a dependency graph has its vertices labeled with variables, with an edge between vertices $C$ and $D$ if and only if there is a production of the form

$C \to xDy.$

A dependency graph for $V_1$ is shown in Figure 6.1. A variable is useful only if there is a path from the vertex labeled $S$ to the vertex labeled with that variable. In our case, Figure 6.1 shows that $B$ is useless. Removing it and the affected productions and terminals, we are led to the final answer $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ with $\hat{V} = \{S, A\}$, $\hat{T} = \{a\}$, and productions

$S \to aS \mid A,$
$A \to a.$

[Figure 6.1]

The formalization of this process leads to a general construction and the corresponding theorem.

Theorem 6.2

Let $G = (V, T, S, P)$ be a context-free grammar. Then there exists an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ that does not contain any useless variables or productions.
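The two-step process used in Example 6.3 can be sketched in Python. This is a minimal illustration, not the book's own code: the representation (a dict from each variable to a list of right sides, each right side a tuple of symbols) and the name `remove_useless` are assumptions made here, and every variable is assumed to appear as a key of the dict, so that any other symbol is treated as a terminal.

```python
def remove_useless(productions, start):
    """Drop useless variables and productions.

    Assumed representation: productions maps each variable to a list of
    right sides; a right side is a tuple of symbols. Symbols that are not
    keys of the dict are treated as terminals.
    """
    variables = set(productions)

    def usable(body, good):
        # True when every variable in the right side is already known good.
        return all(s in good or s not in variables for s in body)

    # Part 1: collect the variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    kept = {
        head: [b for b in bodies if usable(b, generating)]
        for head, bodies in productions.items()
        if head in generating
    }

    # Part 2: keep only variables reachable from the start variable,
    # following the edges of the dependency graph.
    reachable, stack = set(), [start]
    while stack:
        head = stack.pop()
        if head in reachable or head not in kept:
            continue
        reachable.add(head)
        for body in kept[head]:
            stack.extend(s for s in body if s in kept)
    return {head: kept[head] for head in kept if head in reachable}


# The grammar of Example 6.3.
example_6_3 = {
    "S": [("a", "S"), ("A",), ("C",)],
    "A": [("a",)],
    "B": [("a", "a")],
    "C": [("a", "C", "b")],
}
simplified = remove_useless(example_6_3, "S")
```

On the grammar of Example 6.3 the sketch keeps only $S \to aS \mid A$ and $A \to a$, which agrees with the answer found by hand.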

Proof: The grammar $\hat{G}$ can be generated from $G$ by an algorithm consisting of two parts. In the first part we construct an intermediate grammar $G_1 = (V_1, T_2, S, P_1)$ such that $V_1$ contains only variables $A$ for which

$A \overset{*}{\Rightarrow} w \in T^*$

is possible. The steps in the algorithm are:

1. Set $V_1$ to $\varnothing$.
2. Repeat the following step until no more variables are added to $V_1$. For every $A \in V$ for which $P$ has a production of the form $A \to x_1 x_2 \cdots x_n$, with all $x_i$ in $V_1 \cup T$, add $A$ to $V_1$.
3. Take $P_1$ as all the productions in $P$ whose symbols are all in $(V_1 \cup T)$.

Clearly this procedure terminates. It is equally clear that if $A \in V_1$, then $A \overset{*}{\Rightarrow} w \in T^*$ is a possible derivation with $G_1$. The remaining issue is whether every $A$ for which $A \overset{*}{\Rightarrow} w = ab\cdots$ is added to $V_1$ before the procedure terminates. To see this, consider any such $A$ and look at the partial derivation tree corresponding to that derivation (Figure 6.2). At level $k$, there are only terminals, so every variable $A_i$ at level $k - 1$ will be added to $V_1$ on the first pass through step 2 of the algorithm. Any variable at level $k - 2$ will then be added to $V_1$ on the second pass through step 2. The third time through step 2, all variables at level $k - 3$ will be added, and so on. The algorithm cannot terminate while there are variables in the tree that are not yet in $V_1$. Hence $A$ will eventually be added to $V_1$.

[Figure 6.2: partial derivation tree, levels $k - 2$, $k - 1$, $k$]

In the second part of the construction, we get the final answer $\hat{G}$ from $G_1$. We draw the variable dependency graph for $G_1$ and from it find all variables

that cannot be reached from $S$. These are removed from the variable set, as are the productions involving them. We can also eliminate any terminal that does not occur in some useful production. The result is the grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$.

Because of the construction, $\hat{G}$ does not contain any useless symbols or productions. Also, for each $w \in L(G)$ we have a derivation

$S \overset{*}{\Rightarrow} xAy \overset{*}{\Rightarrow} w.$

Since the construction of $\hat{G}$ retains $A$ and all associated productions, we have everything needed to make the derivation

$S \overset{*}{\Rightarrow}_{\hat{G}} xAy \overset{*}{\Rightarrow}_{\hat{G}} w.$

The grammar $\hat{G}$ is constructed from $G$ by the removal of productions, so that $\hat{P} \subseteq P$. Consequently $L(\hat{G}) \subseteq L(G)$. Putting the two results together, we see that $G$ and $\hat{G}$ are equivalent. ∎

Removing λ-Productions

One kind of production that is sometimes undesirable is one in which the right side is the empty string.

Definition 6.2

Any production of a context-free grammar of the form

$A \to \lambda$

is called a λ-production. Any variable $A$ for which the derivation

$A \overset{*}{\Rightarrow} \lambda \quad (6.3)$

is possible is called nullable.

A grammar may generate a language not containing $\lambda$, yet have some λ-productions or nullable variables. In such cases, the λ-productions can be removed.
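The nullable variables of Definition 6.2 can be found by a simple fixed-point iteration. The following Python sketch is an illustration under stated assumptions: the grammar is a dict from variables to lists of right sides, each right side is a tuple of symbols, the empty tuple `()` stands for $\lambda$, and the name `nullable_variables` and the sample grammar are chosen here for the example.

```python
def nullable_variables(productions):
    """Return the set of variables A with A =>* lambda.

    Assumed representation: productions maps each variable to a list of
    right sides (tuples of symbols); the empty tuple () stands for lambda.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # A right side derives lambda when all of its symbols are
            # nullable; this is trivially true for the empty right side.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Illustrative grammar:
# S -> ABaC, A -> BC, B -> b | lambda, C -> D | lambda, D -> d
sample = {
    "S": [("A", "B", "a", "C")],
    "A": [("B", "C")],
    "B": [("b",), ()],
    "C": [("D",), ()],
    "D": [("d",)],
}
nullable = nullable_variables(sample)
```

For the sample grammar the nullable variables are $A$, $B$, and $C$: the last two have λ-productions, and $A \to BC$ has only nullable symbols on its right side.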

rerepltr. Fbr all productions A --+.s th<lsegenerated lry repla.'oR TRAN$FoRMTNC. For t:ach srrr:hprocluction of P. we look at all productions in P of the fbrrrr ul+aDrfft".rrrl1. and onc iu which both ri and 11 a.if z1 and r:. 2.t:ed with . Irr rnore grlrreralsituations. 1. A. 51 -+ a51blA..\-productions..). This grammir. wc put into F tn.rnmar with .frm.$rblab' We can ea"silyshow that this n()w gramrrrar genr:rates the samr: language:as the clrigina.cing nullable variables witlr all Fcrl exa.s 167 S n aStb. I Let G be arrv r:otttext-fiee gr:r. T}re .\.tions.. Doing this we get the gr:Irrimar $'5r aS'rblab.Az. 2 wherc each nt.'. where . ca.ted. we are ready to construct F. usirrg the fbllowing strllls. tirere will he oue prorluction in F with fir r()placecl v/ith .rgenera. substitutions for . 'Io .\-protluctions can ht: rnade in a. . one irr which ri is rcplaced with A.1 Ml:'rHops l.. a. put B irtto l/r. > i}. Gnelrrverr.1.4142.he following step until no firrther vrrriables ir.readrlt:d to V. similar.swell ar.i are botlt nullable. A in all pousible corrrbina..41.\.6.n be removed after addirrg new prtiductions obtairred by srrbstitutittg A for 51 wherc it occurs on thc right.\ not in L ((. althougtr rnore r:ornplicil.. ro.( V U ?. . Proof: We lirst find tire set U1s of all mrlla. .An art:in !fu.\-production the 51 --+ ).ble variables of G.nb" : n. prrt A into H1y.mple.tes A-freelanguage {a.n. Onc:c the stlt V7lr hirs been founcl..nt procluction ir.l otre.\.. Repeat l. marrner. For all nroductions fi _. 'fhen there cxists an eqrlivalcnt gramrnar I having uo .

There is one exception: if all $x_i$ are nullable, the production $A \to \lambda$ is not put into $\hat{P}$.

The argument that this grammar $\hat{G}$ is equivalent to $G$ is straightforward and will be left to the reader. ∎

Example 6.5

Find a context-free grammar without λ-productions equivalent to the grammar defined by

$S \to ABaC,$
$A \to BC,$
$B \to b \mid \lambda,$
$C \to D \mid \lambda,$
$D \to d.$

From the first step of the construction in Theorem 6.3, we find that the nullable variables are $A$, $B$, $C$. Then, following the second step of the construction, we get

$S \to ABaC \mid BaC \mid AaC \mid ABa \mid aC \mid Aa \mid Ba \mid a,$
$A \to B \mid C \mid BC,$
$B \to b,$
$C \to D,$
$D \to d.$

Removing Unit-Productions

As we see from Theorem 6.1, productions in which both sides are a single variable are at times undesirable.

Definition 6.3

Any production of a context-free grammar of the form

$A \to B,$

where $A, B \in V$, is called a unit-production.

To remove unit-productions, we use the substitution rule discussed in Theorem 6.1. As the construction in the next theorem shows, this can be done if we proceed with some care.

Theorem 6.4

Let $G = (V, T, S, P)$ be any context-free grammar without λ-productions. Then there exists a context-free grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ that does not have any unit-productions and that is equivalent to $G$.

Proof: Obviously, any unit-production of the form $A \to A$ can be removed from the grammar without effect, and we need only consider $A \to B$, where $A$ and $B$ are different variables. At first sight, it may seem that we can use Theorem 6.1 directly with $x_1 = x_2 = \lambda$ to replace

$A \to B$

with

$A \to y_1 \mid y_2 \mid \cdots \mid y_n.$

But this will not always work; in the special case

$A \to B,$
$B \to A,$

the unit-productions are not removed. To get around this, we first find, for each $A$, all variables $B$ such that

$A \overset{*}{\Rightarrow} B. \quad (6.4)$

We can do this by drawing a dependency graph with an edge $(C, D)$ whenever the grammar has a unit-production $C \to D$; then (6.4) holds whenever there is a walk between $A$ and $B$. The new grammar $\hat{G}$ is generated by first putting into $\hat{P}$ all non-unit productions of $P$. Next, for all $A$ and $B$ satisfying (6.4), we add to $\hat{P}$

$A \to y_1 \mid y_2 \mid \cdots \mid y_n,$

where $B \to y_1 \mid y_2 \mid \cdots \mid y_n$ is the set of all rules in $\hat{P}$ with $B$ on the left. Note that since $B \to y_1 \mid y_2 \mid \cdots \mid y_n$ is taken from $\hat{P}$, none of the $y_i$ can be a single variable, so that no unit-productions are created by the last step.

To show that the resulting grammar is equivalent to the original one we can follow the same line of reasoning as in Theorem 6.1. ∎
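The construction in the proof of Theorem 6.4 can be sketched as follows. This Python fragment is a minimal illustration under assumptions made here: the grammar is a dict from variables to lists of right sides (tuples of symbols), it has no λ-productions, and the function name and sample grammar are hypothetical.

```python
def remove_unit_productions(productions):
    """Return an equivalent grammar without unit-productions.

    Assumed representation: productions maps each variable to a list of
    right sides (tuples of symbols); no lambda-productions are present.
    """
    variables = set(productions)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # Start with all non-unit productions of the original grammar.
    non_unit = {
        head: [b for b in bodies if not is_unit(b)]
        for head, bodies in productions.items()
    }
    result = {head: list(bodies) for head, bodies in non_unit.items()}

    for a in productions:
        # Find every B != A with A =>* B by walking the unit-production
        # dependency graph from A.
        targets, stack = set(), [a]
        while stack:
            v = stack.pop()
            for body in productions[v]:
                if is_unit(body) and body[0] != a and body[0] not in targets:
                    targets.add(body[0])
                    stack.append(body[0])
        # Add A -> y for every non-unit rule B -> y of each such B.
        for b in targets:
            for body in non_unit[b]:
                if body not in result[a]:
                    result[a].append(body)
    return result


# Illustrative grammar: S -> Aa | B, B -> A | bb, A -> a | bc | B
sample = {
    "S": [("A", "a"), ("B",)],
    "B": [("A",), ("b", "b")],
    "A": [("a",), ("b", "c"), ("B",)],
}
no_units = remove_unit_productions(sample)
```

The added rules are always taken from the original non-unit productions, never from rules added along the way, which mirrors the observation in the proof that the last step cannot create new unit-productions.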

[Figure 6.3]

Example 6.6

Remove all unit-productions from

$S \to Aa \mid B,$
$B \to A \mid bb,$
$A \to a \mid bc \mid B.$

The dependency graph for the unit-productions is given in Figure 6.3; we see from it that $S \overset{*}{\Rightarrow} A$, $S \overset{*}{\Rightarrow} B$, $B \overset{*}{\Rightarrow} A$, and $A \overset{*}{\Rightarrow} B$. Hence, we add to the original non-unit productions

$S \to Aa,$
$A \to a \mid bc,$
$B \to bb,$

the new rules

$S \to a \mid bc \mid bb,$
$A \to bb,$
$B \to a \mid bc,$

to obtain the equivalent grammar

$S \to a \mid bc \mid bb \mid Aa,$
$A \to a \mid bb \mid bc,$
$B \to a \mid bb \mid bc.$

Note that the removal of the unit-productions has made $B$ and the associated productions useless.

We can put all these results together to show that grammars for context-free languages can be made free of useless productions, λ-productions, and unit-productions.

Theorem 6.5

Let $L$ be a context-free language that does not contain $\lambda$. Then there exists a context-free grammar that generates $L$ and that does not have any useless productions, λ-productions, or unit-productions.

Proof: The procedures given in Theorems 6.2, 6.3, and 6.4 remove these kinds of productions in turn. The only point that needs consideration is that the removal of one type of production may introduce productions of another type; for example, the procedure for removing λ-productions can create new unit-productions. Also, Theorem 6.4 requires that the grammar have no λ-productions. But note that the removal of unit-productions does not create λ-productions (Exercise 15 at the end of this section), and the removal of useless productions does not create λ-productions or unit-productions (Exercise 16 at the end of this section). Therefore, we can remove all undesirable productions using the following sequence of steps:

1. Remove λ-productions
2. Remove unit-productions
3. Remove useless productions

The result will then have none of these productions, and the theorem is proved. ∎

EXERCISES

1. Complete the proof of Theorem 6.1 by showing that $S \overset{*}{\Rightarrow}_{\hat{G}} w$ implies $S \overset{*}{\Rightarrow}_{G} w$.

2. In Example 6.1, show a derivation tree for the string $ababbbac$, using both the original and the modified grammar.

3. Show that the two grammars

$S \to abAB \mid ba,$
$A \to aaa,$
$B \to aA \mid bb$

and

$S \to abAaA \mid abAbb \mid ba,$
$A \to aaa,$

are equivalent.

4. In Theorem 6.1, why is it necessary to assume that $A$ and $B$ are different variables?

5. Eliminate all useless productions from the grammar

$S \to aS \mid AB,$
$A \to bA,$
$B \to AA.$

What language does this grammar generate?

6. Eliminate useless productions from

$S \to a \mid aA \mid B \mid C,$
$A \to aB \mid \lambda,$
$B \to Aa,$
$C \to cCD,$
$D \to ddd.$

7. Eliminate all λ-productions from

$S \to AaB \mid aaB,$
$A \to \lambda,$
$B \to bbA \mid \lambda.$

8. Remove all unit-productions, all useless productions, and all λ-productions from the grammar

$S \to aA \mid aBB,$
$A \to aaA \mid \lambda,$
$B \to bB \mid bbC,$
$C \to B.$

What language does this grammar generate?

9. Eliminate all unit productions from the grammar in Exercise 7.

10. Complete the proof of Theorem 6.3.

11. Complete the proof of Theorem 6.4.

12. Use the construction in Theorem 6.3 to remove λ-productions from the grammar in Example 5.4. What language does the resulting grammar generate?

13. Suppose that $G$ is a context-free grammar for which $\lambda \in L(G)$. Show that if we apply the construction in Theorem 6.3, we obtain a new grammar $\hat{G}$ such that $L(\hat{G}) = L(G) - \{\lambda\}$.

14. Give an example of a situation in which the removal of λ-productions introduces previously nonexistent unit-productions.

15. Let $G$ be a grammar without λ-productions, but possibly with some unit-productions. Show that the construction of Theorem 6.4 does not then introduce any λ-productions.

16. Show that if a grammar has no λ-productions and no unit-productions, then the removal of useless productions by the construction of Theorem 6.2 does not introduce any such productions.

17. Justify the claim made in the proof of Theorem 6.1 that the variable $B$ can be replaced as soon as it appears.

18. Suppose that a context-free grammar $G = (V, T, S, P)$ has a production of the form

$A \to xy,$

where $x, y \in (V \cup T)^+$. Prove that if this rule is replaced by

$A \to By,$
$B \to x,$

where $B \notin V$, then the resulting grammar is equivalent to the original one.

19. Consider the procedure suggested in Theorem 6.2 for the removal of useless productions. Reverse the order of the two parts, first eliminating variables that cannot be reached from $S$, then removing those that do not yield a terminal string. Does the new procedure still work correctly? If so, prove it. If not, give a counterexample.

20. It is possible to define the term simplification precisely by introducing the concept of complexity of a grammar. This can be done in many ways; one of them is through the length of all the strings giving the production rules. For example, we might use

$\mathrm{complexity}(G) = \sum_{A \to v \in P} \{1 + |v|\}.$

Show that the removal of useless productions always reduces the complexity in this sense. What can you say about the removal of λ-productions and unit-productions?

21. A context-free grammar $G$ is said to be minimal for a given language $L$ if $\mathrm{complexity}(G) \le \mathrm{complexity}(\hat{G})$ for any $\hat{G}$ generating $L$. Show by example that the removal of useless productions does not necessarily produce a minimal grammar.

*22. Prove the following result. Let $G = (V, T, S, P)$ be a context-free grammar. Divide the set of productions whose left sides are some given variable (say, $A$), into two disjoint subsets

$A \to Ax_1 \mid Ax_2 \mid \cdots \mid Ax_n,$
$A \to y_1 \mid y_2 \mid \cdots \mid y_m,$

where $x_i, y_i$ are in $(V \cup T)^*$, but $A$ is not a prefix of any $y_i$. Consider the grammar $\hat{G} = (V \cup \{Z\}, T, S, \hat{P})$, where $Z \notin V$ and $\hat{P}$ is obtained by replacing all productions that have $A$ on the left by

$A \to y_i \mid y_i Z, \quad i = 1, 2, \ldots, m,$
$Z \to x_i \mid x_i Z, \quad i = 1, 2, \ldots, n.$

Then $L(G) = L(\hat{G})$.

23. Use the result of the preceding exercise to rewrite the grammar

$A \to Aa \mid aBc \mid \lambda,$
$B \to Bb \mid bc$

so that it no longer has productions of the form $A \to Ax$ or $B \to Bx$.

*24. Prove the following counterpart of Exercise 22. Let the set of productions involving the variable $A$ on the left be divided into two disjoint subsets

$A \to x_1 A \mid x_2 A \mid \cdots \mid x_n A$

and

$A \to y_1 \mid y_2 \mid \cdots \mid y_m,$

where $A$ is not a suffix of any $y_i$. Show that the grammar obtained by replacing these productions with

$A \to y_i \mid Z y_i, \quad i = 1, 2, \ldots, m,$
$Z \to x_i \mid Z x_i, \quad i = 1, 2, \ldots, n,$

is equivalent to the original grammar.

6.2 Two Important Normal Forms

There are many kinds of normal forms we can establish for context-free grammars. Some of these, because of their wide usefulness, have been studied extensively. We consider two of them briefly.

Chomsky Normal Form

One kind of normal form we can look for is one in which the number of symbols on the right of a production is strictly limited. In particular, we can ask that the string on the right of a production consist of no more than two symbols. One instance of this is the Chomsky normal form.

Definition 6.4

A context-free grammar is in Chomsky normal form if all productions are of the form

$A \to BC,$

or

$A \to a,$

where $A, B, C$ are in $V$, and $a$ is in $T$.

Example 6.7

The grammar

$S \to AS \mid a,$
$A \to SA \mid b$

is in Chomsky normal form. The grammar

$S \to AS \mid AAS,$
$A \to SA \mid aa$

is not; both productions $S \to AAS$ and $A \to aa$ violate the conditions of Definition 6.4.
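Definition 6.4 is easy to check mechanically. The Python sketch below is only an illustration: it assumes a grammar given as a dict from variables to lists of right sides (tuples of symbols), treats every symbol that is not a key of the dict as a terminal, and uses the hypothetical name `is_chomsky_normal_form`.

```python
def is_chomsky_normal_form(productions):
    """Check that every production has the form A -> BC or A -> a.

    Assumed representation: productions maps each variable to a list of
    right sides (tuples of symbols); non-key symbols are terminals.
    """
    variables = set(productions)
    for bodies in productions.values():
        for body in bodies:
            two_variables = len(body) == 2 and all(s in variables for s in body)
            one_terminal = len(body) == 1 and body[0] not in variables
            if not (two_variables or one_terminal):
                return False
    return True


# The two grammars of Example 6.7.
in_cnf = {
    "S": [("A", "S"), ("a",)],
    "A": [("S", "A"), ("b",)],
}
not_in_cnf = {
    "S": [("A", "S"), ("A", "A", "S")],
    "A": [("S", "A"), ("a", "a")],
}
```

Applied to the two grammars of Example 6.7, the check succeeds for the first and fails for the second, since $S \to AAS$ has three symbols on the right and $A \to aa$ has two terminals.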

Theorem 6.6

Any context-free grammar $G = (V, T, S, P)$ with $\lambda \notin L(G)$ has an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ in Chomsky normal form.

Proof: Because of Theorem 6.5, we can assume without loss of generality that $G$ has no λ-productions and no unit-productions. The construction of $\hat{G}$ will be done in two steps.

Step 1: Construct a grammar $G_1 = (V_1, T, S, P_1)$ from $G$ by considering all productions in $P$ in the form

$A \to x_1 x_2 \cdots x_n, \quad (6.5)$

where each $x_i$ is a symbol either in $V$ or $T$. If $n = 1$ then $x_1$ must be a terminal since we have no unit-productions. In this case, put the production into $P_1$. If $n \ge 2$, introduce new variables $B_a$ for each $a \in T$. For each production of $P$ in the form (6.5) we put into $P_1$ the production

$A \to C_1 C_2 \cdots C_n,$

where

$C_i = x_i$ if $x_i$ is in $V$,

and

$C_i = B_a$ if $x_i = a$.

For every $B_a$ we also put into $P_1$ the production

$B_a \to a.$

This part of the algorithm removes all terminals from productions whose right side has length greater than one, replacing them with newly introduced variables. At the end of this step we have a grammar $G_1$ all of whose productions have the form

$A \to a, \quad (6.6)$

or

$A \to C_1 C_2 \cdots C_n, \quad (6.7)$

where $C_i \in V_1$.

It is an easy consequence of Theorem 6.1 that

$L(G_1) = L(G).$

Step 2: In the second step, we introduce additional variables to reduce the length of the right sides of the productions where necessary. First we put all productions of the form (6.6) as well as all the productions of the form (6.7) with $n = 2$ into $\hat{P}$. For $n > 2$, we introduce new variables $D_1, D_2, \ldots$ and put into $\hat{P}$ the productions

$A \to C_1 D_1,$
$D_1 \to C_2 D_2,$
$\vdots$
$D_{n-2} \to C_{n-1} C_n.$

Obviously, the resulting grammar $\hat{G}$ is in Chomsky normal form. Repeated applications of Theorem 6.1 will show that $L(G_1) = L(\hat{G})$, so that

$L(\hat{G}) = L(G).$

This somewhat informal argument can easily be made more precise. We will leave this to the reader. ∎

Example 6.8

Convert the grammar with productions

$S \to ABa,$
$A \to aab,$
$B \to Ac,$

to Chomsky normal form.

As required by the construction of Theorem 6.6, the grammar does not have any λ-productions or any unit-productions.

In Step 1, we introduce new variables $B_a, B_b, B_c$ and use the algorithm to get

$S \to ABB_a,$
$A \to B_aB_aB_b,$
$B \to AB_c,$
$B_a \to a,$
$B_b \to b,$
$B_c \to c.$

In the second step, we introduce additional variables to get the first two productions into normal form and we get the final result

$S \to AD_1,$
$D_1 \to BB_a,$
$A \to B_aD_2,$
$D_2 \to B_aB_b,$
$B \to AB_c,$
$B_a \to a,$
$B_b \to b,$
$B_c \to c.$

Greibach Normal Form

Another useful grammatical form is the Greibach normal form. Here we put restrictions not on the length of the right sides of a production, but on the positions in which terminals and variables can appear. Arguments justifying Greibach normal form are a little complicated and not very transparent. Similarly, constructing a grammar in Greibach normal form equivalent to a given context-free grammar is tedious. We therefore deal with this matter very briefly. Nevertheless, Greibach normal form has many theoretical and practical consequences.

Definition 6.5

A context-free grammar is said to be in Greibach normal form if all productions have the form

$A \to ax,$

where $a \in T$ and $x \in V^*$.

If we compare this with Definition 5.4, we see that the form $A \to ax$ is common to both Greibach normal form and s-grammars; but Greibach normal form does not carry the restriction that the pair $(A, a)$ occur at most once. This additional freedom gives Greibach normal form a generality not possessed by s-grammars.
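The shape required by Definition 6.5 can also be tested directly. The Python sketch below is an illustration only; the dict-of-tuples grammar representation, the rule that non-key symbols are terminals, the name `is_greibach_normal_form`, and the sample grammars are all assumptions made for this example.

```python
def is_greibach_normal_form(productions):
    """Check that every production has the form A -> a x, with a a
    terminal and x a (possibly empty) string of variables.

    Assumed representation: productions maps each variable to a list of
    right sides (tuples of symbols); non-key symbols are terminals.
    """
    variables = set(productions)
    return all(
        len(body) >= 1
        and body[0] not in variables
        and all(s in variables for s in body[1:])
        for bodies in productions.values()
        for body in bodies
    )


# Illustrative grammars.
starts_with_variable = {
    "S": [("A", "B")],
    "A": [("a", "A"), ("b", "B"), ("b",)],
    "B": [("b",)],
}
greibach = {
    "S": [("a", "A", "B"), ("b", "B", "B"), ("b", "B")],
    "A": [("a", "A"), ("b", "B"), ("b",)],
    "B": [("b",)],
}
```

The first sample grammar fails only because of $S \to AB$, whose right side begins with a variable; the second differs from it only in the productions for $S$ and passes the check.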

If a grammar is not in Greibach normal form, we may be able to rewrite it in this form with some of the techniques encountered above. Here are two simple examples.

Example 6.9

The grammar

$S \to AB,$
$A \to aA \mid bB \mid b,$
$B \to b$

is not in Greibach normal form. However, using the substitution given by Theorem 6.1, we immediately get the equivalent grammar

$S \to aAB \mid bBB \mid bB,$
$A \to aA \mid bB \mid b,$
$B \to b,$

which is in Greibach normal form.

Example 6.10

Convert the grammar

$S \to abSb \mid aa,$

into Greibach normal form.

Here we can use a device similar to the one introduced in the construction of Chomsky normal form. We introduce new variables $A$ and $B$ that are essentially synonyms for $a$ and $b$, respectively. Substituting for the terminals with their associated variables leads to the equivalent grammar

$S \to aBSB \mid aA,$
$A \to a,$
$B \to b,$

which is in Greibach normal form.

In general, though, neither the conversion of a given grammar to Greibach normal form nor the proof that this can always be done are simple matters. We introduce Greibach normal form here because it will simplify the technical discussion of an important result in the next chapter. However, from

a conceptual viewpoint, Greibach normal form plays no further role in our discussion, so we only quote the following general result without proof.

Theorem 6.7

For every context-free grammar $G$ with $\lambda \notin L(G)$, there exists an equivalent grammar $\hat{G}$ in Greibach normal form.

EXERCISES

1. Provide the details of the proof of Theorem 6.6.

2. Convert the grammar $S \to aSb \mid ab$ into Chomsky normal form.

3. Transform the grammar

$S \to aSaA \mid A,$
$A \to abA \mid b$

into Chomsky normal form.

4. Transform the grammar with productions

$S \to abAB,$
$A \to bAB \mid \lambda,$
$B \to BAa \mid A \mid \lambda,$

into Chomsky normal form.

5. Convert the grammar

$S \to AB \mid aB,$
$A \to aab \mid \lambda,$
$B \to bbA$

into Chomsky normal form.

6. Let $G = (V, T, S, P)$ be any context-free grammar without any λ-productions or unit-productions. Let $k$ be the maximum number of symbols on the right of any production in $P$. Show that there is an equivalent grammar in Chomsky normal form with no more than $(k - 1)|P| + |T|$ production rules.

7. Draw the dependency graph for the grammar in Exercise 4.

8. A linear language is one for which there exists a linear grammar (for a definition, see Example 3.13). Let $L$ be any linear language not containing $\lambda$. Show that there exists a grammar $G = (V, T, S, P)$ all of whose productions have one of the forms

$A \to aB,$
$A \to Ba,$
$A \to a,$

where $a \in T$, $A, B \in V$, such that $L = L(G)$.

9. Show that for every context-free grammar $G = (V, T, S, P)$ there is an equivalent one in which all productions have the form

$A \to aBC,$

or

$A \to \lambda,$

where $a \in \Sigma \cup \{\lambda\}$, $A, B, C \in V$.

10. Convert the grammar

$S \to aSb \mid bSa \mid a \mid b$

into Greibach normal form.

11. Convert the following grammar into Greibach normal form.

$S \to aSb \mid ab.$

12. Convert the grammar

$S \to ab \mid aS \mid aaS$

into Greibach normal form.

13. Convert the grammar

$S \to ABb \mid a,$
$A \to aaA \mid B,$
$B \to bAb$

into Greibach normal form.

14. Can every linear grammar be converted to a form in which all productions look like $A \to ax$, where $a \in T$ and $x \in V \cup \{\lambda\}$?

15. A context-free grammar is said to be in two-standard form if all production rules satisfy the following pattern

$A \to aBC,$
$A \to aB,$
$A \to a,$

where $A, B, C \in V$ and $a \in T$.

Convert the grammar $G = (\{S, A, B, C\}, \{a, b\}, S, P)$ with $P$ given as

$S \to aSA,$
$A \to bABC,$
$B \to b,$
$C \to aBC,$

into two-standard form.

*16. Two-standard form is general; for any context-free grammar $G$ with $\lambda \notin L(G)$, there exists an equivalent grammar in two-standard form. Prove this.

6.3 A Membership Algorithm for Context-Free Grammars*

In Section 5.2, we claimed, without any elaboration, that membership and parsing algorithms for context-free grammars exist that require approximately $|w|^3$ steps to parse a string $w$. We are now in a position to justify this claim. The algorithm we will describe here is called the CYK algorithm, after its originators J. Cocke, D. H. Younger, and T. Kasami. The algorithm works only if the grammar is in Chomsky normal form and succeeds by breaking one problem into a sequence of smaller ones in the following way. Assume that we have a grammar $G = (V, T, S, P)$ in Chomsky normal form and a string

$w = a_1 a_2 \cdots a_n.$

We define substrings

$w_{ij} = a_i \cdots a_j,$

and subsets of $V$

$V_{ij} = \{A \in V : A \overset{*}{\Rightarrow} w_{ij}\}.$

Clearly, $w \in L(G)$ if and only if $S \in V_{1n}$.

To compute $V_{ij}$, observe that $A \in V_{ii}$ if and only if $G$ contains a production $A \to a_i$. Therefore, $V_{ii}$ can be computed for all $1 \le i \le n$ by inspection of $w$ and the productions of the grammar. To continue, notice that for $j > i$, $A$ derives $w_{ij}$ if and only if there is a production $A \to BC$, with $B \overset{*}{\Rightarrow} w_{ik}$ and $C \overset{*}{\Rightarrow} w_{k+1,j}$ for some $k$ with $i \le k$, $k < j$. In other words,

$V_{ij} = \bigcup_{k \in \{i, i+1, \ldots, j-1\}} \{A : A \to BC, \text{ with } B \in V_{ik},\ C \in V_{k+1,j}\}. \quad (6.8)$

An inspection of the indices in (6.8) shows that it can be used to compute all the $V_{ij}$ if we proceed in the sequence

1. Compute $V_{11}, V_{22}, \ldots, V_{nn}$,
2. Compute $V_{12}, V_{23}, \ldots, V_{n-1,n}$,
3. Compute $V_{13}, V_{24}, \ldots, V_{n-2,n}$,

and so on.

Example 6.11

Determine whether the string $w = aabbb$ is in the language generated by the grammar

$S \to AB,$
$A \to BB \mid a,$
$B \to AB \mid b.$

First note that $w_{11} = a$, so $V_{11}$ is the set of all variables that immediately derive $a$, that is, $V_{11} = \{A\}$. Since $w_{22} = a$, we also have $V_{22} = \{A\}$ and, similarly,

$V_{11} = \{A\},\ V_{22} = \{A\},\ V_{33} = \{B\},\ V_{44} = \{B\},\ V_{55} = \{B\}.$

Now we use (6.8) to get

$V_{12} = \{A : A \to BC,\ B \in V_{11},\ C \in V_{22}\}.$

Since $V_{11} = \{A\}$ and $V_{22} = \{A\}$, the set consists of all variables that occur on the left side of a production whose right side is $AA$. Since there are none, $V_{12}$ is empty. Next,

$V_{23} = \{A : A \to BC,\ B \in V_{22},\ C \in V_{33}\},$

so the required right side is $AB$, and we have $V_{23} = \{S, B\}$. A straightforward argument along these lines then gives

$V_{12} = \varnothing,\ V_{23} = \{S, B\},\ V_{34} = \{A\},\ V_{45} = \{A\},$
$V_{13} = \{S, B\},\ V_{24} = \{A\},\ V_{35} = \{S, B\},$
$V_{14} = \{A\},\ V_{25} = \{S, B\},$
$V_{15} = \{S, B\},$

so that $w \in L(G)$.

The CYK algorithm, as described here, determines membership for any language generated by a grammar in Chomsky normal form. With some additions to keep track of how the elements of $V_{ij}$ are derived, it can be converted into a parsing method. To see that the CYK membership algorithm requires $O(n^3)$ steps, notice that exactly $n(n+1)/2$ sets of $V_{ij}$ have to be computed. Each involves the evaluation of at most $n$ terms in (6.8), so the claimed result follows.
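The computation in Example 6.11 can be reproduced with a short program. The following Python sketch of the CYK membership test is an illustration under assumptions made here: the grammar is a dict from variables to lists of right sides (tuples of symbols) and is already in Chomsky normal form, indices are 0-based so that `V[i][j]` corresponds to $V_{i+1,j+1}$, and the function name `cyk` is hypothetical.

```python
def cyk(productions, start, w):
    """CYK membership test for a grammar in Chomsky normal form.

    Returns (accepted, V), where V[i][j] is the set of variables that
    derive w[i..j] (0-based, inclusive).
    """
    n = len(w)
    if n == 0:
        # A grammar in Chomsky normal form cannot generate lambda.
        return False, []
    V = [[set() for _ in range(n)] for _ in range(n)]

    # Substrings of length 1: use the productions A -> a.
    for i, symbol in enumerate(w):
        for head, bodies in productions.items():
            if (symbol,) in bodies:
                V[i][i].add(head)

    # Longer substrings, shortest first, following equation (6.8).
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for head, bodies in productions.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in V[i][k]
                                and body[1] in V[k + 1][j]):
                            V[i][j].add(head)
    return start in V[0][n - 1], V


# The grammar of Example 6.11.
example_6_11 = {
    "S": [("A", "B")],
    "A": [("B", "B"), ("a",)],
    "B": [("A", "B"), ("b",)],
}
accepted, table = cyk(example_6_11, "S", "aabbb")
```

The three nested loops over `length`, `i`, and `k` are the source of the cubic running time in $n$; the inner loops over the productions contribute a factor that depends only on the grammar.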

EXERCISES

1. Use the CYK algorithm to determine whether the strings $aabb$, $aabba$, and $abbbb$ are in the language generated by the grammar in Example 6.11.

2. Use the CYK algorithm to find a parsing of the string $aab$, using the grammar of Example 6.11.

3. Use the approach employed in Exercise 2 to show how the CYK membership algorithm can be made into a parsing method.

**4. Use the result in Exercise 3 to write a computer program for parsing with any context-free grammar in Chomsky normal form.

Chapter 7

Pushdown Automata

The description of context-free languages by means of context-free grammars is convenient, as illustrated by the use of BNF in programming language definition. The next question is whether there is a class of automata that can be associated with context-free languages. As we have seen, finite automata cannot recognize all context-free languages. Intuitively, we understand that this is because finite automata have strictly finite memories, whereas the recognition of a context-free language may require storing an unbounded amount of information. For example, when scanning a string from the language $L = \{a^n b^n : n \ge 0\}$, we must not only check that all $a$'s precede the first $b$, we must also count the number of $a$'s. Since $n$ is unbounded, this counting cannot be done with a finite memory. We want a machine that can count without limit. But as we see from other examples, such as $\{ww^R\}$, we need more than unlimited counting ability: we need the ability to store and match a sequence of symbols in reverse order. This suggests that we might try a stack as a storage mechanism, allowing unbounded storage that is restricted to operating like a stack. This gives us a class of machines called pushdown automata (pda).

In this chapter, we explore the connection between pushdown automata and context-free languages. We first show that if we allow pushdown automata to act nondeterministically, we get a class of automata that accepts exactly the family of context-free languages. But we will also see that here there is no longer an equivalence between the deterministic and nondeterministic versions. The class of deterministic pushdown automata defines a new family of languages, the deterministic context-free languages, forming a proper subset of the context-free languages. Since this is an important family for the treatment of programming languages, we conclude the chapter with a brief introduction to the grammars associated with deterministic context-free languages.

7.1 Nondeterministic Pushdown Automata

A schematic representation of a pushdown automaton is given in Figure 7.1. Each move of the control unit reads a symbol from the input file, while at the same time changing the contents of the stack through the usual stack operations. Each move of the control unit is determined by the current input symbol as well as by the symbol currently on top of the stack. The result of the move is a new state of the control unit and a change in the top of the stack. In our discussion, we will restrict ourselves to pushdown automata acting as accepters.

[Figure 7.1: schematic representation of a pushdown automaton]

Definition of a Pushdown Automaton

Formalizing this intuitive notion gives us a precise definition of a pushdown automaton.

Definition 7.1

A nondeterministic pushdown accepter (npda) is defined by the septuple
$$M = (Q, \Sigma, \Gamma, \delta, q_0, z, F),$$
where
$Q$ is a finite set of internal states of the control unit,
$\Sigma$ is the input alphabet,
$\Gamma$ is a finite set of symbols called the stack alphabet,
$\delta : Q \times (\Sigma \cup \{\lambda\}) \times \Gamma \to$ finite subsets of $Q \times \Gamma^{*}$ is the transition function,
$q_0 \in Q$ is the initial state of the control unit,
$z \in \Gamma$ is the stack start symbol,
$F \subseteq Q$ is the set of final states.

The complicated formal appearance of the domain and range of $\delta$ merits a closer examination. The arguments of $\delta$ are the current state of the control unit, the current input symbol, and the current symbol on top of the stack. The result is a set of pairs $(q, x)$, where $q$ is the next state of the control unit and $x$ is a string which is put on top of the stack in place of the single symbol there before. Note that the second argument of $\delta$ may be $\lambda$, indicating that a move that does not consume an input symbol is possible. We will call such a move a $\lambda$-transition. Note also that $\delta$ is defined so that it needs a stack symbol; no move is possible if the stack is empty. Finally, the requirement that the range of $\delta$ be a finite subset is necessary because $Q \times \Gamma^{*}$ is an infinite set and therefore has infinite subsets. While an npda may have several choices for its moves, this choice must be restricted to a finite set of possibilities.

Example 7.1

Suppose the set of transition rules of an npda contains
$$\delta(q_1, a, b) = \{(q_2, cd), (q_3, \lambda)\}.$$
If at any time the control unit is in state $q_1$, the input symbol read is $a$, and the symbol on top of the stack is $b$, then one of two things can happen: (1) the control unit goes into state $q_2$ and the string $cd$ replaces $b$ on top of the stack, or (2) the control unit goes into state $q_3$ with the symbol $b$ removed from the top of the stack. In our notation we assume that the insertion of a string into a stack is done symbol by symbol, starting at the right end of the string.
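The effect of a single transition rule can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the text: the table `delta` is a dictionary, the empty string `""` stands for $\lambda$, and the stack is a string whose leftmost character is the top. The function name `moves` is hypothetical.

```python
# Hypothetical encoding (an assumption of this sketch, not the book's notation):
# delta maps (state, input symbol or "" for lambda, stack top) to a set of
# (next state, string that replaces the top symbol).
delta = {("q1", "a", "b"): {("q2", "cd"), ("q3", "")}}


def moves(delta, state, unread, stack):
    """Return every configuration (state, unread input, stack) reachable in one move."""
    if not stack:
        # delta always needs a stack symbol, so an empty stack allows no move.
        return set()
    top, rest = stack[0], stack[1:]
    result = set()
    if unread:
        # Moves that consume the next input symbol.
        for nxt, push in delta.get((state, unread[0], top), ()):
            result.add((nxt, unread[1:], push + rest))
    # Lambda-transitions leave the input untouched.
    for nxt, push in delta.get((state, "", top), ()):
        result.add((nxt, unread, push + rest))
    return result
```

With the rule of Example 7.1, calling `moves(delta, "q1", "ab", "bz")` yields the two alternatives described above: one with `cd` in place of `b`, the other with `b` removed.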

Example 7.2

Consider an npda with
$$Q = \{q_0, q_1, q_2, q_3\}, \quad \Sigma = \{a, b\}, \quad \Gamma = \{0, 1\}, \quad z = 0, \quad F = \{q_3\},$$
and
$$\delta(q_0, a, 0) = \{(q_1, 10), (q_3, \lambda)\},$$
$$\delta(q_0, \lambda, 0) = \{(q_3, \lambda)\},$$
$$\delta(q_1, a, 1) = \{(q_1, 11)\},$$
$$\delta(q_1, b, 1) = \{(q_2, \lambda)\},$$
$$\delta(q_2, b, 1) = \{(q_2, \lambda)\},$$
$$\delta(q_2, \lambda, 0) = \{(q_3, \lambda)\}.$$

What can we say about the action of this automaton? First, notice that transitions are not specified for all possible combinations of input and stack symbols. For instance, there is no entry given for $\delta(q_0, b, 0)$. The interpretation of this is the same that we used for nondeterministic finite automata: an unspecified transition is to the null set and represents a dead configuration for the npda.

The crucial transitions are
$$\delta(q_1, a, 1) = \{(q_1, 11)\},$$
which adds a 1 to the stack when an $a$ is read, and
$$\delta(q_2, b, 1) = \{(q_2, \lambda)\},$$
which removes a 1 when a $b$ is encountered. These two steps count the number of $a$'s and match that count against the number of $b$'s. The control unit is in state $q_1$ until the first $b$ is encountered, at which time it goes into state $q_2$. This assures that no $b$ precedes the last $a$. After analyzing the remaining transitions, we see that the npda will end in the final state $q_3$ if and only if the input string is in the language
$$L = \{a^n b^n : n \ge 0\} \cup \{a\}.$$
As an analogy with finite automata, we might say that the npda accepts the above language. Of course, before making such a claim, we must define what we mean by an npda accepting a language.
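The claim about Example 7.2 can be checked experimentally with a small search over configurations. The following is a minimal sketch, not a definitive implementation: the dictionary encoding, the use of `""` for $\lambda$, the name `accepts`, and the `limit` safety bound are all assumptions of this sketch. A string is treated as accepted when some sequence of moves consumes all input and ends in a final state.

```python
from collections import deque

# Transition table of Example 7.2; "" stands for lambda and the leftmost
# character of a stack string is the top of the stack.
DELTA = {
    ("q0", "a", "0"): {("q1", "10"), ("q3", "")},
    ("q0", "", "0"): {("q3", "")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "", "0"): {("q3", "")},
}


def accepts(delta, start, z, finals, w, limit=100_000):
    """Breadth-first search over configurations (state, unread input, stack).

    `limit` is a safety bound on the number of configurations explored; the
    search gives up (returns False) if it is exceeded.
    """
    first = (start, w, z)
    seen = {first}
    queue = deque([first])
    while queue and len(seen) <= limit:
        state, unread, stack = queue.popleft()
        if not unread and state in finals:
            return True  # all input read and a final state reached
        if not stack:
            continue  # dead configuration: no move on an empty stack
        top, rest = stack[0], stack[1:]
        options = []
        if unread:
            options += [(n, unread[1:], p + rest)
                        for n, p in delta.get((state, unread[0], top), ())]
        options += [(n, unread, p + rest)
                    for n, p in delta.get((state, "", top), ())]
        for config in options:
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False
```

For instance, `accepts(DELTA, "q0", "0", {"q3"}, "aabb")` explores both nondeterministic choices available in $q_0$ and finds the accepting one.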

To simplify the discussion, we introduce a convenient notation for describing the successive configurations of an npda during the processing of a string. The relevant factors at any time are the current state of the control unit, the unread part of the input string, and the current contents of the stack. Together these completely determine all the possible ways in which the npda can proceed. The triplet
$$(q, w, u),$$
where $q$ is the state of the control unit, $w$ is the unread part of the input string, and $u$ is the stack contents (with the leftmost symbol indicating the top of the stack), is called an instantaneous description of a pushdown automaton. A move from one instantaneous description to another will be denoted by the symbol $\vdash$; thus
$$(q_1, aw, bx) \vdash (q_2, w, yx)$$
is possible if and only if
$$(q_2, y) \in \delta(q_1, a, b).$$
Moves involving an arbitrary number of steps will be denoted by $\vdash^{*}$. On occasions where several automata are under consideration we will use $\vdash_M$ to emphasize that the move is made by the particular automaton $M$.

The Language Accepted by a Pushdown Automaton

Definition 7.2

Let $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$ be a nondeterministic pushdown automaton. The language accepted by $M$ is the set
$$L(M) = \{w \in \Sigma^{*} : (q_0, w, z) \vdash^{*}_{M} (p, \lambda, u),\ p \in F,\ u \in \Gamma^{*}\}.$$
In words, the language accepted by $M$ is the set of all strings that can put $M$ into a final state at the end of the string. The final stack content $u$ is irrelevant to this definition of acceptance.

Example 7.3

Construct an npda for the language
$$L = \{w \in \{a, b\}^{*} : n_a(w) = n_b(w)\}.$$
As in Example 7.2, the solution to this problem involves counting the number of $a$'s and $b$'s, which is easily done with a stack. Here we need not even worry about the order of the $a$'s and $b$'s.

We can insert a counter symbol, say 0, into the stack whenever an $a$ is read, then pop one counter symbol from the stack when a $b$ is found. The only difficulty with this is that if there is a prefix of $w$ with more $b$'s than $a$'s, we will not find a 0 to use. But this is easy to fix; we can use a negative counter symbol, say 1, for counting the $b$'s that are to be matched against $a$'s later. The complete solution is an npda
$$M = (\{q_0, q_f\}, \{a, b\}, \{0, 1, z\}, \delta, q_0, z, \{q_f\}),$$
with $\delta$ given as
$$\delta(q_0, \lambda, z) = \{(q_f, z)\},$$
$$\delta(q_0, a, z) = \{(q_0, 0z)\},$$
$$\delta(q_0, b, z) = \{(q_0, 1z)\},$$
$$\delta(q_0, a, 0) = \{(q_0, 00)\},$$
$$\delta(q_0, b, 0) = \{(q_0, \lambda)\},$$
$$\delta(q_0, a, 1) = \{(q_0, \lambda)\},$$
$$\delta(q_0, b, 1) = \{(q_0, 11)\}.$$
In processing the string $baab$, the npda makes the moves
$$(q_0, baab, z) \vdash (q_0, aab, 1z) \vdash (q_0, ab, z) \vdash (q_0, b, 0z) \vdash (q_0, \lambda, z) \vdash (q_f, \lambda, z)$$
and hence the string is accepted.

Example 7.4

To construct an npda for accepting the language
$$L = \{ww^R : w \in \{a, b\}^{+}\},$$
we use the fact that the symbols are retrieved from a stack in the reverse order of their insertion. When reading the first part of the string, we push consecutive symbols on the stack. For the second part, we compare the current input symbol with the top of the stack, continuing as long as the two match. Since symbols are retrieved from the stack in reverse of the order in which they were inserted, a complete match will be achieved if and only if the input is of the form $ww^R$.

An apparent difficulty with this suggestion is that we do not know the middle of the string, that is, where $w$ ends and $w^R$ starts. But the nondeterministic nature of the automaton helps us with this; the npda correctly guesses where the middle is and switches states at that point. A solution to the problem is given by $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$, where
$$Q = \{q_0, q_1, q_2\}, \quad \Sigma = \{a, b\}, \quad \Gamma = \{a, b, z\}, \quad F = \{q_2\}.$$



The transition function can be visualized as having several parts: a set to push $w$ on the stack,
$$\delta(q_0, a, a) = \{(q_0, aa)\},$$
$$\delta(q_0, b, a) = \{(q_0, ba)\},$$
$$\delta(q_0, a, b) = \{(q_0, ab)\},$$
$$\delta(q_0, b, b) = \{(q_0, bb)\},$$
$$\delta(q_0, a, z) = \{(q_0, az)\},$$
$$\delta(q_0, b, z) = \{(q_0, bz)\},$$
a set to guess the middle of the string, where the npda switches from state $q_0$ to $q_1$,
$$\delta(q_0, \lambda, a) = \{(q_1, a)\},$$
$$\delta(q_0, \lambda, b) = \{(q_1, b)\},$$
a set to match $w^R$ against the contents of the stack,
$$\delta(q_1, a, a) = \{(q_1, \lambda)\},$$
$$\delta(q_1, b, b) = \{(q_1, \lambda)\},$$
and finally
$$\delta(q_1, \lambda, z) = \{(q_2, z)\},$$
to recognize a successful match.

The sequence of moves in accepting $abba$ is
$$(q_0, abba, z) \vdash (q_0, bba, az) \vdash (q_0, ba, baz) \vdash (q_1, ba, baz) \vdash (q_1, a, az) \vdash (q_1, \lambda, z) \vdash (q_2, \lambda, z).$$
The nondeterministic alternative for locating the middle of the string is taken at the third move. At that stage, the pda has the instantaneous description $(q_0, ba, baz)$ and has two choices for its next move. One is to use $\delta(q_0, b, b) = \{(q_0, bb)\}$ and make the move
$$(q_0, ba, baz) \vdash (q_0, a, bbaz),$$
the second is the one used above, namely $\delta(q_0, \lambda, b) = \{(q_1, b)\}$. Only the latter leads to acceptance of the input.
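The way the npda "guesses" the middle can be made concrete by searching all choices and reporting the successful one. The sketch below is illustrative only; the encoding (`""` for $\lambda$, stack top at the left) and the name `accepting_run` are assumptions, and the search terminates for this particular machine because every move that lengthens the stack also consumes an input symbol.

```python
from collections import deque

# Transitions of Example 7.4, built in the same three groups as in the text.
DELTA = {}
for x in "ab":
    for top in "abz":
        DELTA[("q0", x, top)] = {("q0", x + top)}  # push the first half
    DELTA[("q0", "", x)] = {("q1", x)}             # guess the middle
    DELTA[("q1", x, x)] = {("q1", "")}             # match the second half
DELTA[("q1", "", "z")] = {("q2", "z")}             # recognize a successful match


def accepting_run(w, delta=DELTA, start="q0", z="z", finals=("q2",)):
    """Return a list of instantaneous descriptions ending in acceptance, or None."""
    first = (start, w, z)
    parent = {first: None}  # remembers how each configuration was reached
    queue = deque([first])
    while queue:
        config = queue.popleft()
        state, unread, stack = config
        if not unread and state in finals:
            run = []
            while config is not None:  # walk back to the initial configuration
                run.append(config)
                config = parent[config]
            return run[::-1]
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        nexts = []
        if unread:
            nexts += [(n, unread[1:], p + rest)
                      for n, p in delta.get((state, unread[0], top), ())]
        nexts += [(n, unread, p + rest)
                  for n, p in delta.get((state, "", top), ())]
        for nxt in nexts:
            if nxt not in parent:
                parent[nxt] = config
                queue.append(nxt)
    return None
```

Calling `accepting_run("abba")` returns the same seven instantaneous descriptions listed above, while a string such as `"abab"` produces `None` because no choice of middle leads to a final state.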



1. Find a pda with fewer than four states that accepts the same language as the pda in Example 7.2.

2. Prove that the pda in Example 7.4 does not accept any string not in $\{ww^R\}$.

3. Construct npda's that accept the following regular languages.
(a) $L_1 = L(aaa^{*}b)$
(b) $L_2 = L(aab^{*}aba^{*})$
(c) the union of $L_1$ and $L_2$
(d) $L_1 - L_2$

4. Construct npda's that accept the following languages on $\Sigma = \{a, b, c\}$.
(a) $L = \{a^n b^{2n} : n \ge 0\}$
(b) $L = \{wcw^R : w \in \{a, b\}^{*}\}$
(c) $L = \{a^n b^m c^{n+m} : n \ge 0, m \ge 0\}$
(d) $L = \{a^n b^{n+m} c^m : n \ge 0, m \ge 1\}$
(e) $L = \{a^3 b^n c^n : n \ge 0\}$
(g) $L = \{w : n_a(w) = n_b(w) + 1\}$
(h) $L = \{w : n_a(w) = 2n_b(w)\}$
(i) $L = \{w : n_a(w) + n_b(w) = n_c(w)\}$
(j) $L = \{w : 2n_a(w) \le n_b(w) \le 3n_a(w)\}$
(k) $L = \{w : n_a(w) < n_b(w)\}$

5. Construct an npda that accepts the language $L = \{a^n b^m : n \ge 0, n \ne m\}$.

6. Find an npda on $\Sigma = \{a, b, c\}$ that accepts the language $L = \{w_1 c w_2 : w_1, w_2 \in \{a, b\}^{*}, w_1 \ne w_2^R\}$.

7. Find an npda for the concatenation of $L(a^{*})$ and the language in Exercise 6.

8. Find an npda for the language $L = \{ab(ab)^n b(ba)^n : n \ge 0\}$.

9. Is it possible to find a dfa that accepts the same language as the pda
$$M = (\{q_0, q_1\}, \{a, b\}, \{z\}, \delta, q_0, z, \{q_1\}),$$
with
$$\delta(q_0, a, z) = \{(q_1, z)\},$$
$$\delta(q_0, b, z) = \{(q_0, z)\},$$
$$\delta(q_1, a, z) = \{(q_1, z)\},$$
$$\delta(q_1, b, z) = \{(q_0, z)\}?$$

10. What language is accepted by the pda
$$M = (\{q_0, q_1, q_2, q_3, q_4, q_5\}, \{a, b\}, \{0, 1, z\}, \delta, q_0, z, \{q_5\}),$$
with
$$\delta(q_0, b, z) = \{(q_1, 1z)\},$$
$$\delta(q_1, b, 1) = \{(q_1, 11)\},$$
$$\delta(q_2, a, 1) = \{(q_3, \lambda)\},$$
$$\delta(q_3, a, 1) = \{(q_4, \lambda)\},$$
$$\delta(q_4, a, z) = \{(q_4, z), (q_5, z)\}?$$

11. What language is accepted by the npda $M = (\{q_0, q_1, q_2\}, \{a, b\}, \{a, b, z\}, \delta, q_0, z, \{q_2\})$ with transitions
$$\delta(q_0, a, z) = \{(q_1, a), (q_2, \lambda)\},$$
$$\delta(q_1, b, a) = \{(q_1, b)\},$$
$$\delta(q_1, b, b) = \{(q_1, b)\},$$
$$\delta(q_1, a, b) = \{(q_2, \lambda)\}?$$

12. What language is accepted by the npda in Example 7.3 if we use $F = \{q_0, q_f\}$?

13. What language is accepted by the npda in Exercise 11 above if we use $F = \{q_0, q_1, q_2\}$?

14. Find an npda with no more than two internal states that accepts the language


15. Suppose that in Example 7.2 we replace the given value of $\delta(q_2, \lambda, 0)$ with
$$\delta(q_2, \lambda, 0) = \{(q_0, \lambda)\}.$$
What is the language accepted by this new pda?

16. We can define a restricted npda as one that can increase the length of the stack by at most one symbol in each move, changing Definition 7.1 so that
$$\delta : Q \times (\Sigma \cup \{\lambda\}) \times \Gamma \to 2^{Q \times (\Gamma\Gamma \cup \Gamma \cup \{\lambda\})}.$$
The interpretation of this is that the range of $\delta$ consists of sets of pairs of the form $(q_i, ab)$, $(q_i, a)$, or $(q_i, \lambda)$. Show that for every npda $M$ there exists such a restricted npda $\hat{M}$ such that $L(M) = L(\hat{M})$.



17. An alternative to Definition 7.2 for language acceptance is to require the stack to be empty when the end of the input string is reached. Formally, an npda $M$ is said to accept the language $N(M)$ by empty stack if
$$N(M) = \{w \in \Sigma^{*} : (q_0, w, z) \vdash^{*}_{M} (p, \lambda, \lambda)\},$$
where $p$ is any element in $Q$. Show that this notion is effectively equivalent to Definition 7.2, in the sense that for any npda $M$ there exists an npda $\hat{M}$ such that $L(M) = N(\hat{M})$, and vice versa.


7.2 Pushdown Automata and Context-Free Languages

In the examples of the previous section, we saw that pushdown automata exist for some of the familiar context-free languages. This is no accident. There is a general relation between context-free languages and nondeterministic pushdown accepters that is established in the next two major results. We will show that for every context-free language there is an npda that accepts it, and conversely, that the language accepted by any npda is context-free.

Pushdown Automata for Context-Free Languages
We first show that for every context-free language there is an npda that accepts it. The underlying idea is to construct an npda that can, in some way, carry out a leftmost derivation of any string in the language. To simplify the argument a little, we assume that the language is generated by a grammar in Greibach normal form.

The pda we are about to construct will represent the derivation by keeping the variables in the right part of the sentential form on its stack, while the left part, consisting entirely of terminals, is identical with the input read. We begin by putting the start symbol on the stack. After that, to simulate the application of a production $A \to ax$, we must have the variable $A$ on top of the stack and the terminal $a$ as the input symbol. The variable on the stack is removed and replaced by the variable string $x$. What $\delta$ should be to achieve this is easy to see. Before we present the general argument, let us look at a simple example.


Example 7.5

Construct a pda that accepts the language generated by the grammar with productions
$$S \to aSbb \mid a.$$
We first transform the grammar into Greibach normal form, changing the productions to
$$S \to aSA \mid a,$$
$$A \to bB,$$
$$B \to b.$$
The corresponding automaton will have three states $\{q_0, q_1, q_2\}$, with initial state $q_0$ and final state $q_2$. First, the start symbol $S$ is put on the stack by
$$\delta(q_0, \lambda, z) = \{(q_1, Sz)\}.$$
The production $S \to aSA$ will be simulated in the pda by removing $S$ from the stack and replacing it with $SA$, while reading $a$ from the input. Similarly, the rule $S \to a$ should cause the pda to read an $a$ while simply removing $S$. Thus, the two productions are represented in the pda by
$$\delta(q_1, a, S) = \{(q_1, SA), (q_1, \lambda)\}.$$
In an analogous manner, the other productions give
$$\delta(q_1, b, A) = \{(q_1, B)\},$$
$$\delta(q_1, b, B) = \{(q_1, \lambda)\}.$$
The appearance of the stack start symbol on top of the stack signals the completion of the derivation, and the pda is put into its final state by
$$\delta(q_1, \lambda, z) = \{(q_2, \lambda)\}.$$
The construction of this example can be adapted to other cases, leading to a general result.

Theorem 7.1

For any context-free language $L$, there exists an npda $M$ such that
$$L = L(M).$$

Proof: If $L$ is a $\lambda$-free context-free language, there exists a context-free grammar in Greibach normal form for it. Let $G = (V, T, S, P)$ be such a grammar. We then construct an npda which simulates leftmost derivations in this grammar. As suggested, the simulation will be done so that the unprocessed part of the sentential form is in the stack, while the terminal prefix of any sentential form matches the corresponding prefix of the input string.

Specifically, the npda will be $M = (\{q_0, q_1, q_f\}, T, V \cup \{z\}, \delta, q_0, z, \{q_f\})$, where $z \notin V$. Note that the input alphabet of $M$ is identical with the set of terminals of $G$ and that the stack alphabet contains the set of variables of the grammar.

The transition function will include
$$\delta(q_0, \lambda, z) = \{(q_1, Sz)\}, \qquad (7.1)$$
so that after the first move of $M$, the stack contains the start symbol $S$ of the derivation. (The stack start symbol $z$ is a marker to allow us to detect the end of the derivation.) In addition, the set of transition rules is such that
$$(q_1, u) \in \delta(q_1, a, A), \qquad (7.2)$$
whenever
$$A \to au$$
is in $P$. This reads input $a$ and removes the variable $A$ from the stack, replacing it with $u$. In this way it generates the transitions that allow the pda to simulate all derivations. Finally, we have
$$\delta(q_1, \lambda, z) = \{(q_f, z)\}, \qquad (7.3)$$
to get $M$ into a final state.

To show that $M$ accepts any $w \in L(G)$, consider the partial leftmost derivation
$$S \Rightarrow^{*} a_1 a_2 \cdots a_n A_1 A_2 \cdots A_m \Rightarrow a_1 a_2 \cdots a_n b B_1 \cdots B_k A_2 \cdots A_m.$$
If $M$ is to simulate this derivation, then after reading $a_1 a_2 \cdots a_n$, the stack must contain $A_1 A_2 \cdots A_m$. To take the next step in the derivation, $G$ must have a production
$$A_1 \to b B_1 \cdots B_k.$$
But the construction is such that then $M$ has a transition rule in which
$$(q_1, B_1 \cdots B_k) \in \delta(q_1, b, A_1),$$
so that the stack now contains $B_1 \cdots B_k A_2 \cdots A_m$ after having read $a_1 a_2 \cdots a_n b$.

A simple induction argument on the number of steps in the derivation then shows that if
$$S \Rightarrow^{*} w,$$
then
$$(q_1, w, Sz) \vdash^{*} (q_1, \lambda, z).$$
Using (7.1) and (7.3) we have
$$(q_0, w, z) \vdash (q_1, w, Sz) \vdash^{*} (q_1, \lambda, z) \vdash (q_f, \lambda, z),$$
so that $L(G) \subseteq L(M)$.

To prove that $L(M) \subseteq L(G)$, let $w \in L(M)$. Then by definition
$$(q_0, w, z) \vdash^{*} (q_f, \lambda, u).$$
But there is only one way to get from $q_0$ to $q_1$ and only one way from $q_1$ to $q_f$. Therefore, we must have
$$(q_1, w, Sz) \vdash^{*} (q_1, \lambda, z).$$
Now let us write $w = a_1 a_2 a_3 \cdots a_n$. Then the first step in
$$(q_1, a_1 a_2 a_3 \cdots a_n, Sz) \vdash^{*} (q_1, \lambda, z) \qquad (7.4)$$
must be a rule of the form (7.2) to get
$$(q_1, a_1 a_2 a_3 \cdots a_n, Sz) \vdash (q_1, a_2 a_3 \cdots a_n, u_1 z).$$
But then the grammar has a rule of the form $S \to a_1 u_1$, so that
$$S \Rightarrow a_1 u_1.$$
Repeating this, writing $u_1 = A u_2$, we have
$$(q_1, a_2 a_3 \cdots a_n, A u_2 z) \vdash (q_1, a_3 \cdots a_n, u_3 u_2 z),$$
implying that $A \to a_2 u_3$ is in the grammar and that
$$S \Rightarrow^{*} a_1 a_2 u_3 u_2.$$
This makes it quite clear that at any point the stack contents (excluding $z$) are identical with the unmatched part of the sentential form, so that (7.4) implies
$$S \Rightarrow^{*} a_1 a_2 \cdots a_n.$$
In consequence, $L(M) \subseteq L(G)$, completing the proof if the language does not contain $\lambda$.

If $\lambda \in L$, we add to the constructed npda the transition
$$\delta(q_0, \lambda, z) = \{(q_f, z)\}$$
so that the empty string is also accepted.
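The construction in the proof is mechanical enough to be written down directly. The following sketch builds the transition table from rules (7.1), (7.2), and (7.3) for a grammar in Greibach normal form and then tests membership by searching configurations. It is only an illustration: representing each production $A \to au$ as a triple `(A, a, u)`, using single characters for variables, writing `""` for $\lambda$, and the names `gnf_to_npda` and `accepts` are all assumptions of this sketch.

```python
from collections import deque


def gnf_to_npda(productions, start="S"):
    """Build delta for M = ({q0, q1, qf}, T, V + {z}, delta, q0, z, {qf}).

    `productions` is a list of triples (A, a, u) standing for A -> a u,
    where u is a (possibly empty) string of single-character variables.
    """
    delta = {
        ("q0", "", "z"): {("q1", start + "z")},  # rule (7.1)
        ("q1", "", "z"): {("qf", "z")},          # rule (7.3)
    }
    for var, terminal, tail in productions:      # rule (7.2)
        delta.setdefault(("q1", terminal, var), set()).add(("q1", tail))
    return delta


def accepts(delta, w, start="q0", z="z", finals=("qf",)):
    """Search configurations (state, unread input, stack); stack top is leftmost."""
    first = (start, w, z)
    seen = {first}
    queue = deque([first])
    while queue:
        state, unread, stack = queue.popleft()
        if not unread and state in finals:
            return True
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        nexts = []
        if unread:
            nexts += [(n, unread[1:], p + rest)
                      for n, p in delta.get((state, unread[0], top), ())]
        nexts += [(n, unread, p + rest)
                  for n, p in delta.get((state, "", top), ())]
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Greibach normal form grammar S -> aSA | a, A -> bB, B -> b.
GRAMMAR = [("S", "a", "SA"), ("S", "a", ""), ("A", "b", "B"), ("B", "b", "")]
DELTA = gnf_to_npda(GRAMMAR)
```

The search terminates here because every move in state `q1` that comes from a production consumes one input symbol, which is exactly the property that Greibach normal form provides. For this grammar, `accepts(DELTA, "aabb")` corresponds to the derivation $S \Rightarrow aSA \Rightarrow aaA \Rightarrow aabB \Rightarrow aabb$.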

Example 7.6

Consider the grammar
$$S \to aA,$$
$$A \to aABC \mid bB \mid a,$$
$$B \to b,$$
$$C \to c.$$
Since the grammar is already in Greibach normal form, we can use the construction in the previous theorem immediately. In addition to rules
$$\delta(q_0, \lambda, z) = \{(q_1, Sz)\}$$
and
$$\delta(q_1, \lambda, z) = \{(q_f, z)\},$$
the pda will also have transition rules
$$\delta(q_1, a, S) = \{(q_1, A)\},$$
$$\delta(q_1, a, A) = \{(q_1, ABC), (q_1, \lambda)\},$$
$$\delta(q_1, b, A) = \{(q_1, B)\},$$
$$\delta(q_1, b, B) = \{(q_1, \lambda)\},$$
$$\delta(q_1, c, C) = \{(q_1, \lambda)\}.$$
The sequence of moves made by $M$ in processing $aaabc$ is
$$(q_0, aaabc, z) \vdash (q_1, aaabc, Sz) \vdash (q_1, aabc, Az) \vdash (q_1, abc, ABCz) \vdash (q_1, bc, BCz) \vdash (q_1, c, Cz) \vdash (q_1, \lambda, z) \vdash (q_f, \lambda, z).$$
This corresponds to the derivation
$$S \Rightarrow aA \Rightarrow aaABC \Rightarrow aaaBC \Rightarrow aaabC \Rightarrow aaabc.$$

In order to simplify the arguments, the proof in Theorem 7.1 assumed that the grammar was in Greibach normal form. It is not necessary to do this; we can make a similar and only slightly more complicated construction from a general context-free grammar. For example, for productions of the form
$$A \to Bx,$$
we remove $A$ from the stack and replace it with $Bx$, but consume no input symbol. For productions of the form
$$A \to abCx,$$
we must first match the $ab$ in the input against a similar string in the stack and then replace $A$ with $Cx$. We leave the details of the construction and the associated proof as an exercise.

Context-Free Grammars for Pushdown Automata

The converse of Theorem 7.1 is also true. The construction involved readily suggests itself: reverse the process in Theorem 7.1 so that the grammar simulates the moves of the pda. This means that the content of the stack should be reflected in the variable part of the sentential form, while the processed input is the terminal prefix of the sentential form. Quite a few details are needed to make this work.

To keep the discussion as simple as possible, we will assume that the npda in question meets the following requirements:

1. It has a single final state $q_f$ that is entered if and only if the stack is empty;

2. All transitions must have the form $\delta(q_i, a, A) = \{c_1, c_2, \ldots, c_n\}$, where
$$c_i = (q_j, \lambda), \qquad (7.5)$$
or
$$c_i = (q_j, BC). \qquad (7.6)$$
That is, each move either increases or decreases the stack content by a single symbol.

These restrictions may appear to be very severe, but they are not. It can be shown that for any npda there exists an equivalent one having properties 1 and 2. This equivalence was explored partially in Exercises 16 and 17 in Section 7.1. Here we need to explore it further, but again we will leave the arguments as an exercise (see Exercise 16 at the end of this section). Taking this as given, we now construct a context-free grammar for the language accepted by the npda.

As stated, we want the sentential form to represent the content of the stack. But the configuration of the npda also involves an internal state, and this has to be remembered in the sentential form as well. It is hard to see how this can be done, and the construction we give here is a little tricky.

Suppose for the moment that we can find a grammar whose variables are of the form $(q_i A q_j)$ and whose productions are such that
$$(q_i A q_j) \Rightarrow^{*} v,$$
if and only if the npda erases $A$ from the stack while reading $v$ and going from state $q_i$ to state $q_j$. "Erasing" here means that $A$ and its effects (i.e., all the successive strings by which it is replaced) are removed from the stack, bringing the symbol originally below $A$ to the top. If we can find such a grammar, and if we choose $(q_0 z q_f)$ as its start symbol, then
$$(q_0 z q_f) \Rightarrow^{*} w,$$
if and only if the npda removes $z$ (creating an empty stack) while reading $w$ and going from $q_0$ to $q_f$. But this is exactly how the npda accepts $w$. Therefore, the language generated by the grammar will be identical to the language accepted by the npda.

To construct a grammar that satisfies these conditions, we examine the different types of transitions that can be made by the npda. Since (7.5) involves an immediate erasure of $A$, the grammar will have a corresponding production
$$(q_i A q_j) \to a.$$
Transitions of type (7.6) generate the set of rules
$$(q_i A q_k) \to a (q_j B q_l)(q_l C q_k),$$
where $q_k$ and $q_l$ take on all possible values in $Q$. This is due to the fact that to erase $A$ we first replace it with $BC$, while reading an $a$ and going from state $q_i$ to $q_j$. Subsequently, we go from $q_j$ to $q_l$, erasing $B$, then from $q_l$ to $q_k$, erasing $C$.

In the last step, it may seem that we have added too much, as there may be some states $q_l$ that cannot be reached from $q_j$ while erasing $B$. This is true, but this does not affect the grammar. The resulting variables $(q_j B q_l)$ are useless variables and do not affect the language accepted by the grammar.

Finally, as start variable we take $(q_0 z q_f)$, where $q_f$ is the single final state of the npda.

Example 7.7

Consider the npda with transitions
$$\delta(q_0, a, z) = \{(q_0, Az)\},$$
$$\delta(q_0, a, A) = \{(q_0, A)\},$$
$$\delta(q_0, b, A) = \{(q_1, \lambda)\},$$
$$\delta(q_1, \lambda, z) = \{(q_2, \lambda)\}.$$
Using $q_0$ as initial state and $q_2$ as the final state, the npda satisfies condition 1 above, but not 2. To satisfy the latter, we introduce a new state $q_3$ and an intermediate step in which we first remove the $A$ from the stack, then replace it in the next move. The new set of transition rules is
$$\delta(q_0, a, z) = \{(q_0, Az)\},$$
$$\delta(q_3, \lambda, z) = \{(q_0, Az)\},$$
$$\delta(q_0, a, A) = \{(q_3, \lambda)\},$$
$$\delta(q_0, b, A) = \{(q_1, \lambda)\},$$
$$\delta(q_1, \lambda, z) = \{(q_2, \lambda)\}.$$
The last three transitions are of the form (7.5), so that they yield the corresponding productions
$$(q_0 A q_3) \to a, \quad (q_0 A q_1) \to b, \quad (q_1 z q_2) \to \lambda.$$
From the first two transitions we get, for $k = 0, 1, 2, 3$, the set of productions
$$(q_0 z q_k) \to a(q_0 A q_0)(q_0 z q_k) \mid a(q_0 A q_1)(q_1 z q_k) \mid a(q_0 A q_2)(q_2 z q_k) \mid a(q_0 A q_3)(q_3 z q_k),$$
$$(q_3 z q_k) \to (q_0 A q_0)(q_0 z q_k) \mid (q_0 A q_1)(q_1 z q_k) \mid (q_0 A q_2)(q_2 z q_k) \mid (q_0 A q_3)(q_3 z q_k).$$
The start variable will be $(q_0 z q_2)$. The string $aab$ is accepted by the pda, with successive configurations
$$(q_0, aab, z) \vdash (q_0, ab, Az) \vdash (q_3, b, z) \vdash (q_0, b, Az) \vdash (q_1, \lambda, z) \vdash (q_2, \lambda, \lambda).$$
The corresponding derivation with $G$ is
$$(q_0 z q_2) \Rightarrow a(q_0 A q_3)(q_3 z q_2) \Rightarrow aa(q_3 z q_2) \Rightarrow aa(q_0 A q_1)(q_1 z q_2) \Rightarrow aab(q_1 z q_2) \Rightarrow aab.$$
Although the construction yields a rather complicated grammar, it can be applied to any pda whose transition rules satisfy the given conditions. This forms the basis for the proof of the general result.
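The rule generation for transitions of the forms (7.5) and (7.6) can be sketched as a short program. This is an illustration under assumed conventions, not the book's own notation: a grammar variable $(q_i A q_j)$ is the Python tuple `(qi, A, qj)`, the empty string `""` stands for $\lambda$, a production is a pair `(head, body)` with the body a tuple of terminals and variables, and the name `npda_to_cfg` is hypothetical.

```python
from itertools import product


def npda_to_cfg(states, delta):
    """Generate productions from an npda whose moves have form (7.5) or (7.6).

    `delta` maps (qi, a, A) to a set of (qj, pushed), where pushed is either
    "" (the symbol A is erased) or a two-symbol string "BC".
    """
    rules = []
    for (qi, a, A), targets in delta.items():
        for qj, pushed in targets:
            if pushed == "":
                # Form (7.5): (qi A qj) -> a
                rules.append(((qi, A, qj), (a,)))
            elif len(pushed) == 2:
                # Form (7.6): (qi A qk) -> a (qj B ql)(ql C qk) for all qk, ql
                B, C = pushed
                for qk, ql in product(states, repeat=2):
                    rules.append(((qi, A, qk), (a, (qj, B, ql), (ql, C, qk))))
            else:
                raise ValueError("transition is not of form (7.5) or (7.6)")
    return rules


# The modified transition rules of Example 7.7.
STATES = ["q0", "q1", "q2", "q3"]
DELTA = {
    ("q0", "a", "z"): {("q0", "Az")},
    ("q3", "", "z"): {("q0", "Az")},
    ("q0", "a", "A"): {("q3", "")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "", "z"): {("q2", "")},
}
RULES = npda_to_cfg(STATES, DELTA)
```

Each transition of form (7.6) contributes one production for every pair of states, which is why the two pushing transitions of the example alone account for 32 of the productions, many of them involving useless variables.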

The steps in the proof of the following theorem will be easier to understand if you notice the correspondence between the successive instantaneous descriptions of the pda and the sentential forms in the derivation. The first $q_i$ in the leftmost variable of every sentential form is the current state of the pda, while the sequence of middle symbols is the same as the stack content.

Theorem 7.2

If $L = L(M)$ for some npda $M$, then $L$ is a context-free language.

Proof: Assume that $M = (Q, \Sigma, \Gamma, \delta, q_0, z, \{q_f\})$ satisfies conditions 1 and 2 above. We use the suggested construction to get the grammar $G = (V, T, S, P)$, with $T = \Sigma$ and $V$ consisting of elements of the form $(q_i c q_j)$. We will show that the grammar so obtained is such that for all $q_i, q_j \in Q$, $A \in \Gamma$, $X \in \Gamma^{*}$, $u, v \in \Sigma^{*}$,
$$(q_i, uv, AX) \vdash^{*} (q_j, v, X) \qquad (7.7)$$
implies that
$$(q_i A q_j) \Rightarrow^{*} u,$$
and vice versa.

The first part is to show that, whenever the npda is such that the symbol $A$ and its effects can be removed from the stack while reading $u$ and going from state $q_i$ to $q_j$, then the variable $(q_i A q_j)$ can derive $u$. This is not hard to see since the grammar was explicitly constructed to do this. We only need an induction on the number of moves to make this precise.

For the converse, consider a single step in the derivation such as
$$(q_i A q_k) \Rightarrow a (q_j B q_l)(q_l C q_k).$$
Using the corresponding transition for the npda
$$\delta(q_i, a, A) = \{(q_j, BC), \ldots\}, \qquad (7.8)$$
we see that the $A$ can be removed from the stack, $BC$ put on, reading $a$, with the control unit going from state $q_i$ to $q_j$. Similarly, if
$$(q_i A q_j) \Rightarrow a, \qquad (7.9)$$
then there must be a corresponding transition
$$\delta(q_i, a, A) = \{(q_j, \lambda)\} \qquad (7.10)$$
whereby the $A$ can be popped off the stack. We see from this that the sentential forms derived from $(q_i A q_j)$ define a sequence of possible configurations of the npda by which (7.7) can be achieved.

Notice that $(q_i A q_k) \Rightarrow a (q_j B q_l)(q_l C q_k)$ might be possible for some $(q_j B q_l)(q_l C q_k)$ for which there is no corresponding transition of the form (7.8) or (7.10). But, in that case, at least one of the variables on the right will be useless. For all sentential forms leading to a terminal string, the argument given holds.

If we now apply the conclusion to
$$(q_0, w, z) \vdash^{*} (q_f, \lambda, \lambda),$$
we see that this can be so if and only if
$$(q_0 z q_f) \Rightarrow^{*} w.$$
Consequently $L(M) = L(G)$.

1. Show that the pda constructed in Example 7.5 accepts the string $aaabbbb$ that is in the language generated by the given grammar.

2. Prove that the pda in Example 7.5 accepts the language $L = \{a^{n+1} b^{2n} : n \ge 0\}$.

3. Construct an npda that accepts the language generated by the grammar $S \to aSbb \mid aab$.

4. Construct an npda that accepts the language generated by the grammar $S \to aSSS \mid ab$.

5. Construct an npda corresponding to the grammar
$$S \to aABB \mid aAA,$$
$$A \to aBB \mid a,$$
$$B \to bBB \mid A.$$

6. Construct an npda that will accept the language generated by the grammar $G = (\{S, A\}, \{a, b\}, S, P)$, with productions $S \to AA \mid a$, $A \to SA \mid b$.

7. Show that Theorems 7.1 and 7.2 imply the following. For every npda $M$, there exists an npda $\hat{M}$ with at most three states, such that $L(M) = L(\hat{M})$.

8. Show how the number of states of $\hat{M}$ in the above exercise can be reduced to two.

9. Find an npda with two states for the language $L = \{a^n b^{n+1} : n \ge 0\}$.

10. Find an npda with two states that accepts $L = \{a^n b^{2n} : n \ge 1\}$.

11. Show that the npda in Example 7.7 accepts $L(aa^{*}b)$.

12. Show that the grammar in Example 7.7 generates the language $L(aa^{*}b)$.

13. In Example 7.7, show that the variables $(q_0 B q_0)$ and $(q_0 z q_1)$ are useless.

14. Use the construction in Theorem 7.1 to find an npda for the language in Example 7.5, Section 7.1.

15. Find a context-free grammar that generates the language accepted by the npda $M = (\{q_0, q_1\}, \{a, b\}, \{A, z\}, \delta, q_0, z, \{q_1\})$, with transitions
$$\delta(q_0, a, z) = \{(q_0, Az)\},$$
$$\delta(q_0, b, A) = \{(q_0, AA)\},$$
$$\delta(q_0, a, A) = \{(q_1, \lambda)\}.$$

16. Show that for every npda there exists an equivalent one satisfying conditions 1 and 2 in the preamble to Theorem 7.2.

17. Give full details of the proof of Theorem 7.2.

18. Give a construction by which an arbitrary context-free grammar can be used in the proof of Theorem 7.1.

chieved a modification rr.ffif. at most one mov{} carr be made.vINISTrc Pusunowrq Aurclnrnra. d.fl \1 # It is interestirrg to note the differt:rrce between this definitiorr and the corresponding definition of a detcrrrrinistic finite automaton" The do4ra. # .Nc. The second condition is that when a A-move is possible for some configuration.q. bv of Definition 7.(MJ.\-transitions does not automatically imply nondeterminism.1rathertharyb E. b) conttrins at rnost one eleinent. Ar'lrr DptpRL{rNISTrc Corurnx:r-FRHH L. no inJlut-consuming alternative is available. This does not affect the definition. $ornc: transitions of a.il.e E X U {I} and b € f.1.11 Dcfinition 7-4 language if and A language . the orrly criterion for determirrism is that at all times at rnost one possible rnove exists.3 Dprnn. qo)zt F) is said to be deterministic if it is a-nautomaton as defineldirr Definition 7.finrnqffi"!.L is said to be a deterministic context-free only if there exists a dpda "lb"sn'ch"tlrat t.caPrtc"we want to {gtain )-transitions. d (g. The lirst of these conditions simplv requires that for any givcn input symbol and any stack top. subject to the restrictions ttrat. 1.I. l.F r eL o n g u o g e s r e A deterministic pushdown accepter (dpda) is a pushdown automatorr that neverha^r choicein its move.1.uacns Pushdown Automqtoond ffiffi Deterministic D e t e r m i n i s t iC o n t e x t . This can be ir. tust be empty for evcry c € E. lffim. . so there may be dead rxrrrfigurations. undefined. =l. gf_!h!_Eg+q4g4 funqlio" iu still asin Definition7.ig be.7. a.il i A puslrdown automaton M : (Q. that is. AIso. Since the top of the stack playsE role in determining the ncxt move.clptla may be to the empty set. for every g E Q. the presenCbof .

Example 7.8

The language

$L = \{a^n b^n : n \ge 0\}$

is a deterministic context-free language. The pda $M = (\{q_0, q_1, q_2\}, \{a, b\}, \{0, 1\}, \delta, q_0, 0, \{q_0\})$ with

$\delta(q_0, a, 0) = \{(q_1, 10)\}$,
$\delta(q_1, a, 1) = \{(q_1, 11)\}$,
$\delta(q_1, b, 1) = \{(q_2, \lambda)\}$,
$\delta(q_2, b, 1) = \{(q_2, \lambda)\}$,
$\delta(q_2, \lambda, 0) = \{(q_0, \lambda)\}$

accepts the given language. It satisfies the conditions of Definition 7.4 and is therefore deterministic.

Look now at Example 7.4. The npda there is not deterministic because

$\delta(q_0, a, a) = \{(q_0, aa)\}$

and

$\delta(q_0, \lambda, a) = \{(q_1, a)\}$

violate condition 2 of Definition 7.3. This, of course, does not imply that the language $\{ww^R\}$ itself is nondeterministic, since there is the possibility of an equivalent dpda. But it is known that the language is indeed not deterministic. From this and the next example we see that, in contrast to finite automata, deterministic and nondeterministic pushdown automata are not equivalent. There are context-free languages that are not deterministic.

Example 7.9

Let

$L_1 = \{a^n b^n : n \ge 0\}$

and

$L_2 = \{a^n b^{2n} : n \ge 0\}$.

An obvious modification of the argument that $L_1$ is a context-free language shows that $L_2$ is also context-free. The language

$L = L_1 \cup L_2$

is context-free as well. This will follow from a general theorem to be presented in the next chapter, but can easily be made plausible at this point. Let $G_1 = (V_1, T, S_1, P_1)$ and $G_2 = (V_2, T, S_2, P_2)$ be context-free grammars such that $L_1 = L(G_1)$ and $L_2 = L(G_2)$. If we assume that $V_1$ and $V_2$ are disjoint and that $S \notin V_1 \cup V_2$, then, combining the two, the grammar $G = (V_1 \cup V_2 \cup \{S\}, T, S, P)$, where

$P = P_1 \cup P_2 \cup \{S \to S_1 | S_2\}$,

generates $L_1 \cup L_2$. This should be fairly clear at this point, but the details of the argument will be deferred until Chapter 8. Accepting this, we see that $L$ is context-free. But $L$ is not a deterministic context-free language. This seems reasonable, since the pda has either to match one $b$ or two against each $a$, and so has to make an initial choice whether the input is in $L_1$ or in $L_2$. There is no information available at the beginning of the string by which the choice can be made deterministically. Of course, this sort of argument is based on a particular algorithm we have in mind; it may lead us to the correct conjecture, but does not prove anything. There is always the possibility of a completely different approach that avoids an initial choice. But it turns out that there is not, and $L$ is indeed nondeterministic. To see this we first establish the following claim. If $L$ were a deterministic context-free language, then

$\hat{L} = L \cup \{a^n b^n c^n : n \ge 0\}$

would be a context-free language. We show the latter by constructing an npda $\hat{M}$ for $\hat{L}$, given a dpda $M$ for $L$.

The idea behind the construction is to add to the control unit of $M$ a similar part in which transitions caused by the input symbol $b$ are replaced with similar ones for input $c$. This new part of the control unit may be entered after $M$ has read $a^n b^n$. Since the second part responds to $c^n$ in the same way as the first part does to $b^n$, the process that recognizes $a^n b^{2n}$ now also accepts $a^n b^n c^n$. Figure 7.2 describes the construction graphically; a formal argument follows.

Figure 7.2 (the control unit of $M$ together with the added part)

Let $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$ with

$Q = \{q_0, q_1, \ldots, q_n\}$.

Then consider $\hat{M} = (\hat{Q}, \Sigma, \Gamma, \delta \cup \hat{\delta}, q_0, z, \hat{F})$ with

$\hat{Q} = Q \cup \{\hat{q}_0, \hat{q}_1, \ldots, \hat{q}_n\}$,
$\hat{F} = F \cup \{\hat{q}_i : q_i \in F\}$,

and $\hat{\delta}$ constructed from $\delta$ by including

$\hat{\delta}(q_f, \lambda, s) = \{(\hat{q}_f, s)\}$

for all $q_f \in F$, $s \in \Gamma$, and

$\hat{\delta}(\hat{q}_i, c, s) = \{(\hat{q}_j, u)\}$

for all

$\delta(q_i, b, s) = \{(q_j, u)\}$,

$q_i \in Q$, $s \in \Gamma$, $u \in \Gamma^*$. For $M$ to accept $a^n b^n$ we must have

$(q_0, a^n b^n, z) \vdash^* (q_i, \lambda, u)$,

with $q_i \in F$. Because $M$ is deterministic, it must also be true that

$(q_0, a^n b^{2n}, z) \vdash^* (q_i, b^n, u)$,

so that for it to accept $a^n b^{2n}$ we must further have

$(q_i, b^n, u) \vdash^* (q_j, \lambda, u_1)$,

for some $q_j \in F$. But then, by construction,

$(\hat{q}_i, c^n, u) \vdash^* (\hat{q}_j, \lambda, u_1)$,

so that $\hat{M}$ will accept $a^n b^n c^n$. It remains to be shown that no strings other than those in $\hat{L}$ are accepted by $\hat{M}$; this is considered in several exercises at the end of this section. The conclusion is that $\hat{L} = L(\hat{M})$, so that $\hat{L}$ is context-free. But we will show in the next chapter (Example 8.1) that $\hat{L}$ is not context-free. Therefore, our assumption that $L$ is a deterministic context-free language must be false.
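The deterministic behavior of the pda in Example 7.8 can be made concrete with a short simulation. The following is only a sketch: the dictionary encoding of $\delta$, the use of the empty string for $\lambda$, and the function name `dpda_accepts` are illustrative assumptions, not notation from the text.

```python
# Transition table of the dpda in Example 7.8, keyed by (state, input, stack top).
# "" stands for lambda; a pushed string is written with its top symbol first.
# This encoding is an assumption made for the sketch.
DELTA = {
    ("q0", "a", "0"): ("q1", "10"),
    ("q1", "a", "1"): ("q1", "11"),
    ("q1", "b", "1"): ("q2", ""),
    ("q2", "b", "1"): ("q2", ""),
    ("q2", "", "0"): ("q0", ""),
}
START, STACK_START, FINAL = "q0", "0", {"q0"}


def dpda_accepts(w):
    """Return True if the dpda consumes all of w and reaches a final state."""
    state, stack, i = START, STACK_START, 0
    while True:
        if i == len(w) and state in FINAL:
            return True
        top = stack[0] if stack else None
        if i < len(w) and (state, w[i], top) in DELTA:
            # Move that consumes one input symbol.
            state, push = DELTA[(state, w[i], top)]
            i += 1
        elif (state, "", top) in DELTA:
            # Lambda-move; determinism guarantees no competing input move.
            state, push = DELTA[(state, "", top)]
        else:
            return False  # no move is defined: the machine halts and rejects
        stack = push + stack[1:]
```

Because at most one table entry applies in any configuration, the loop never has to backtrack, which is the property that distinguishes this machine from the npda of Example 7.4.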

1. Show that $L = \{a^n b^{2n} : n \ge 0\}$ is a deterministic context-free language.
2. Show that $L = \{a^n b^m : m \ge n + 2\}$ is deterministic.
3. Is the language $L = \{a^n b^n : n \ge 1\} \cup \{b\}$ deterministic?
4. Is the language $L = \{a^n b^n : n \ge 1\} \cup \{a\}$ in Example 7.2 deterministic?
5. Show that the pushdown automaton in Example 7.3 is not deterministic, but that the language in the example is nevertheless deterministic.
6. For the language $L$ in Exercise 1, show that $L^*$ is a deterministic context-free language.
7. Give reasons why one might conjecture that the following language is not deterministic.
   $L = \{a^n b^m c^k : n = m \text{ or } m = k\}$
8. Is the language $L = \{a^n b^m : n = m \text{ or } n = m + 2\}$ deterministic?
9. Is the language $\{wcw^R : w \in \{a, b\}^*\}$ deterministic?
10. While the language in Exercise 9 is deterministic, the closely related language $L = \{ww^R : w \in \{a, b\}^*\}$ is known to be nondeterministic. Give arguments that make this statement plausible.
11. Show that $L = \{w \in \{a, b\}^* : n_a(w) \ne n_b(w)\}$ is a deterministic context-free language.
12. Show that $\hat{M}$ in Example 7.9 does not accept $a^n b^n c^k$ for $k \ne n$.
13. Show that $\hat{M}$ in Example 7.9 does not accept any string not in $L(a^* b^* c^*)$.
14. Show that $\hat{M}$ in Example 7.9 does not accept $a^n b^{2n} c^k$ with $k > 0$. Show also that it does not accept $a^n b^m c^k$ unless $m = n$ or $m = 2n$.
15. Show that every regular language is a deterministic context-free language.
16. Show that if $L_1$ is deterministic context-free and $L_2$ is regular, then the language $L_1 \cup L_2$ is deterministic context-free.
17. Show that under the conditions of Exercise 16, $L_1 \cap L_2$ is a deterministic context-free language.
18. Give an example of a deterministic context-free language whose reverse is not deterministic.

7.4 Grammars for Deterministic Context-Free Languages*

The importance of deterministic context-free languages lies in the fact that they can be parsed efficiently. We can see this intuitively by viewing the pushdown automaton as a parsing device. Since there is no backtracking involved, we can easily write a computer program for it, and we may expect that it will work efficiently. Since there may be $\lambda$-transitions involved, we cannot immediately claim that this will yield a linear-time parser, but it puts us on the right track nevertheless. To pursue this, let us see what grammars might be suitable for the description of deterministic context-free languages. Here we enter a topic important in the study of compilers, but somewhat peripheral to our interests. We will provide only a brief introduction to some important results, referring the reader to books on compilers for a more thorough treatment.

Suppose we are parsing top-down, attempting to find the leftmost derivation of a particular sentence. For the sake of discussion, we use the approach illustrated in Figure 7.3. We scan the input $w$ from left to right, while developing a sentential form whose terminal prefix matches the prefix of $w$ up to the currently scanned symbol. To proceed in matching consecutive symbols, we would like to know exactly which production rule is to be applied at each step. This would avoid backtracking and give us an efficient parser. The question then is whether there are grammars that allow us to do this. For a general context-free grammar, this is not the case, but if the form of the grammar is restricted, we can achieve our goal.

Figure 7.3 (the input $w = a_1 a_2 a_3 a_4 \cdots$ aligned with the sentential form, showing the matched part and the part yet to be matched)

As a first case, take the s-grammars introduced in Definition 5.4. From the discussion there, it is clear that at every stage in the parsing we know exactly which production has to be applied. Suppose that $w = w_1 w_2$ and that we have developed the sentential form $w_1 A x$. To get the next symbol of the sentential form matched against the next symbol in $w$, we simply look at the leftmost symbol of $w_2$, say $a$. If there is no rule $A \to ay$ in the grammar, the string $w$ does not belong to the language. If there is such a rule, the parsing can proceed. But in this case there is only one such rule, so there is no choice to be made.

Although s-grammars are useful, they are too restrictive to capture all aspects of the syntax of programming languages. We need to generalize the idea so that it becomes more powerful without losing its essential property for parsing. One type of grammar is called an LL grammar. In an LL grammar we still have the property that we can, by looking at a limited part of the input (consisting of the scanned symbol plus a finite number of symbols following it), predict exactly which production rule must be used. The term LL is standard usage in books on compilers; the first L stands for the fact that the input is scanned from left to right; the second L indicates that leftmost derivations are constructed. Every s-grammar is an LL grammar, but the concept is more general.

Example 7.10

The grammar

$S \to aSb \,|\, ab$

is not an s-grammar, but it is an LL grammar. In order to determine which production is to be applied, we look at two consecutive symbols of the input string. If the first is an $a$ and the second a $b$, we must apply the production $S \to ab$. Otherwise, the rule $S \to aSb$ must be used.

We say that a grammar is an LL($k$) grammar if we can uniquely identify the correct production, given the currently scanned symbol and a "look-ahead" of the next $k - 1$ symbols. The above is an example of an LL(2) grammar.

Example 7.11

The grammar

$S \to SS \,|\, aSb \,|\, ab$

generates the positive closure of the language in Example 7.10. As remarked in Example 5.4, this is the language of properly nested parenthesis structures. The grammar is not an LL($k$) grammar for any $k$. To see why this is so, look at the derivation of strings of length greater than two. To start, we have available two possible productions $S \to SS$ and $S \to aSb$. The scanned symbol does not tell us which is the right one. Suppose we now use a look-ahead and consider the first two symbols, finding that they are $aa$. Does this allow us to make the right decision? The answer is still no, since what we have seen could be a prefix of a number of strings, including both $aabb$ and $aabbab$. In the first case, we must start with $S \to aSb$, while in the second it is necessary to use $S \to SS$. The grammar is therefore not an LL(2) grammar. In a similar fashion, we can see that

no matter how many look-ahead symbols we have, there are always some situations that cannot be resolved.

This observation about the grammar does not imply that the language is not deterministic or that no LL grammar for it exists. We can find an LL grammar for the language if we analyze the reason for the failure of the original grammar. The difficulty lies in the fact that we cannot predict how many repetitions of the basic pattern $a^n b^n$ there are until we get to the end of the string, yet the grammar requires an immediate decision. Rewriting the grammar avoids this difficulty. The grammar

$S \to aSbS \,|\, \lambda$

is an LL grammar nearly equivalent to the original grammar. To see this, consider the leftmost derivation of $w = abab$. Then

$S \Rightarrow aSbS \Rightarrow abS \Rightarrow abaSbS \Rightarrow ababS \Rightarrow abab$.

We see that we never have any choice. When the input symbol examined is $a$, we must use $S \to aSbS$; when the symbol is $b$ or if we are at the end of the string, we must use $S \to \lambda$. But the problem is not yet completely solved because the new grammar can generate the empty string. We fix this by introducing a new start variable $S_0$ and a production to ensure that some nonempty string is generated. The final result

$S_0 \to aSbS$,
$S \to aSbS \,|\, \lambda$

is then an LL grammar equivalent to the original grammar.

While this informal description of LL grammars is adequate for understanding simple examples, we need a more precise definition if any rigorous results are to be developed. We conclude our discussion with such a definition.

Definition 7.5

Let $G = (V, T, S, P)$ be a context-free grammar. If for every pair of leftmost derivations

$S \Rightarrow^* w_1 A x_1 \Rightarrow w_1 y_1 x_1 \Rightarrow^* w_1 w_2$,
$S \Rightarrow^* w_1 A x_2 \Rightarrow w_1 y_2 x_2 \Rightarrow^* w_1 w_3$,

with $w_1, w_2, w_3 \in T^*$, the equality of the $k$ leftmost symbols of $w_2$ and $w_3$ implies $y_1 = y_2$, then $G$ is said to be an LL($k$) grammar. (If $|w_2|$ or $|w_3|$ is less than $k$, then $k$ is replaced by the smaller of these.)
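The rule "use $S \to aSbS$ on an $a$, use $S \to \lambda$ on a $b$ or at the end of the input" translates directly into a predictive parser. The sketch below is an illustration under stated assumptions: the function name `parse`, the textual form of the production labels, and the use of `None` as an end-of-input marker are all hypothetical choices, not part of the text.

```python
def parse(w):
    """Predictive parser for S0 -> aSbS, S -> aSbS | lambda.

    Returns the productions of the leftmost derivation in order,
    or None if w is not generated by the grammar.
    """
    pos = 0
    used = []

    def look():
        # One symbol of look-ahead; None marks the end of the input.
        return w[pos] if pos < len(w) else None

    def match(symbol):
        nonlocal pos
        if look() != symbol:
            raise SyntaxError(f"expected {symbol!r} at position {pos}")
        pos += 1

    def body():
        # Right side aSbS, shared by S0 -> aSbS and S -> aSbS.
        match("a")
        S()
        match("b")
        S()

    def S():
        if look() == "a":
            used.append("S -> aSbS")
            body()
        else:
            # Look-ahead is b or end of input: the only choice is S -> lambda.
            used.append("S -> lambda")

    try:
        used.append("S0 -> aSbS")
        body()
    except SyntaxError:
        return None
    return used if pos == len(w) else None
```

For `parse("abab")` the recorded productions follow the derivation shown above, with the first step taken from $S_0$. The parser never revisits a decision, which is the practical content of the LL property.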

Definition 7.5 makes precise what has already been indicated. If at any stage in the leftmost derivation ($w_1 A x$) we know the next $k$ symbols of the input, the next step in the derivation is uniquely determined (as expressed by $y_1 = y_2$).

The topic of LL grammars is an important one in the study of compilers. A number of programming languages can be defined by LL grammars, and many compilers have been written using LL parsers. But LL grammars are not sufficiently general to deal with all deterministic context-free languages. Consequently, there is interest in other, more general deterministic grammars. Particularly important are the so-called LR grammars, which also allow efficient parsing, but can be viewed as constructing the derivation tree from the bottom up. There is a great deal of material on this subject that can be found in books on compilers (e.g., Hunter 1981) or books specifically devoted to parsing methods for formal languages (such as Aho and Ullman 1972).

1. Show that the second grammar in Example 7.11 is an LL grammar and that it is equivalent to the original grammar.
2. Show that the grammar for $L = \{w : n_a(w) = n_b(w)\}$ given in Example 1.13 is not an LL grammar.
3. Find an LL grammar for the language in Exercise 2.
4. Construct an LL grammar for the language $L(a^* ba) \cup L(abbb^*)$.
5. Show that any LL grammar is unambiguous.
6. Show that if $G$ is an LL($k$) grammar, then $L(G)$ is a deterministic context-free language.
7. Show that a deterministic context-free language is never inherently ambiguous.
8. Let $G$ be a context-free grammar in Greibach normal form. Describe an algorithm which, for any given $k$, determines whether or not $G$ is an LL($k$) grammar.
9. Give LL grammars for the following languages, assuming $\Sigma = \{a, b, c\}$.
   (a) $L = \{a^n b^m c^{n+m} : n \ge 0, m \ge 0\}$
   (b) $L = \{a^{n+2} b^m c^{n+m} : n \ge 0, m \ge 0\}$
   (c) $L = \{a^n b^{n+2} c^m : n \ge 0, m > 1\}$
   (d) $L = \{w : n_a(w) < n_b(w)\}$
   (e) $L = \{w : n_a(w) + n_b(w) \ne n_c(w)\}$


Properties of Context-Free Languages

The family of context-free languages occupies a central position in a hierarchy of formal languages. On the one hand, context-free languages include important but restricted language families such as regular and deterministic context-free languages. On the other hand, there are broader language families of which context-free languages are a special case. To study the relationship between language families and to exhibit their similarities and differences, we investigate characteristic properties of the various families. As in Chapter 4, we look at closure under a variety of operations, algorithms for determining properties of members of the family, and structural results such as pumping lemmas. These all provide us with a means of understanding relations between the different families as well as for classifying specific languages in an appropriate category.

8.1 Two Pumping Lemmas

The pumping lemma given in Theorem 4.8 is an effective tool for showing that certain languages are not regular. Similar pumping lemmas are known for other language families. Here we will discuss two such results, one for context-free languages in general, the other for a restricted type of context-free language.

A Pumping Lemma for Context-Free Languages

Theorem 8.1

Let $L$ be an infinite context-free language. Then there exists some positive integer $m$ such that any $w \in L$ with $|w| \ge m$ can be decomposed as

$w = uvxyz$,    (8.1)

with

$|vxy| \le m$,    (8.2)

and

$|vy| \ge 1$,    (8.3)

such that

$u v^i x y^i z \in L$,    (8.4)

for all $i = 0, 1, 2, \ldots$. This is known as the pumping lemma for context-free languages.

Proof: Consider the language $L - \{\lambda\}$, and assume that we have for it a grammar $G$ without unit-productions or $\lambda$-productions. Since the length of the string on the right side of any production is bounded, say by $k$, the length of the derivation of any $w \in L$ must be at least $|w|/k$. Therefore, since $L$ is infinite, there exist arbitrarily long derivations and corresponding derivation trees of arbitrary height.

Consider now such a high derivation tree and some sufficiently long path from the root to a leaf. Since the number of variables in $G$ is finite, there must be some variable that repeats on this path, as shown schematically in Figure 8.1. Corresponding to the derivation tree in Figure 8.1, we have the derivation

$S \Rightarrow^* uAz \Rightarrow^* uvAyz \Rightarrow^* uvxyz$,

where $u$, $v$, $x$, $y$, and $z$ are all strings of terminals. From the above we see that $A \Rightarrow^* vAy$ and $A \Rightarrow^* x$, so all the strings $u v^i x y^i z$, $i = 0, 1, 2, \ldots$, can be generated by the grammar and are therefore in $L$. Furthermore, in the



Figure 8.1 Derivation tree for a long string.





derivations $A \Rightarrow^* vAy$ and $A \Rightarrow^* x$, we can assume that no variable repeats (otherwise, we just use the repeating variable as $A$). Therefore, the lengths of the strings $v$, $x$, and $y$ depend only on the productions of the grammar and can be bounded independently of $w$ so that (8.2) holds. Finally, since there are no unit-productions and no $\lambda$-productions, $v$ and $y$ cannot both be empty strings, giving (8.3). This completes the argument that (8.1) to (8.4) hold.

This pumping lemma is useful in showing that a language does not belong to the family of context-free languages. Its application is typical of pumping lemmas in general; they are used negatively to show that a given language does not belong to some family. As in Theorem 4.8, the correct argument can be visualized as a game against an intelligent opponent. But now the rules make it a little more difficult for us. For regular languages, the substring $xy$ whose length is bounded by $m$ starts at the left end of $w$. Therefore the substring $y$ that can be pumped is within $m$ symbols of the beginning of $w$. For context-free languages, we only have a bound on $|vxy|$. The substring $u$ that precedes $vxy$ can be arbitrarily long. This gives additional freedom to the adversary, making arguments involving Theorem 8.1 a little more complicated.

Example 8.1

Show that the language

$L = \{a^n b^n c^n : n \ge 0\}$

is not context-free.

Once the adversary has chosen $m$, we pick the string $a^m b^m c^m$, which is in $L$. The adversary now has several choices. If he chooses $vxy$ to contain only $a$'s, then the pumped string will obviously not be in $L$. If he chooses a string containing an equal number of $a$'s and $b$'s, then the pumped string $a^k b^k c^m$ with $k \ne m$ can be generated, and again we have generated a string not in $L$. In fact, the only way the adversary could stop us from winning is to pick $vxy$ so that $vy$ has the same number of $a$'s, $b$'s, and $c$'s. But this is not possible because of restriction (8.2). Therefore, $L$ is not context-free.

If we try the same argument on the language $L = \{a^n b^n\}$, we fail, as we must, since the language is context-free. If we pick any string in $L$, such as $w = a^m b^m$, the adversary can pick $v = a^k$ and $y = b^k$. Now, no matter what $i$ we choose, the resulting pumped string $w_i$ is in $L$. Remember, though, that this does not prove that $L$ is context-free; all we can say is that we have been unable to get any conclusion from the pumping lemma. That $L$ is context-free must come from some other argument, such as the construction of a context-free grammar.

The argument also justifies a claim made in Example 7.9 and allows us to close a gap in that example. The language

$\hat{L} = \{a^n b^n\} \cup \{a^n b^{2n}\} \cup \{a^n b^n c^n\}$

is not context-free. The string $a^m b^m c^m$ is in $\hat{L}$, but the pumped result is not.
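The game against the adversary in Example 8.1 can be checked mechanically for small $m$ by trying every decomposition allowed by (8.1) to (8.3). The sketch below is an illustration only: the function names and the cutoff `max_i` are assumptions, and testing finitely many values of $i$ can refute a decomposition but can never establish that a language is context-free.

```python
def in_anbncn(s):
    """Membership test for {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def adversary_survives(w, m, in_lang, max_i=3):
    """True if some split w = uvxyz with |vxy| <= m and |vy| >= 1
    keeps u v^i x y^i z in the language for every i in 0..max_i."""
    n = len(w)
    for start in range(n):                                   # u = w[:start]
        for end in range(start + 1, min(start + m, n) + 1):  # vxy = w[start:end]
            for p in range(start, end + 1):                  # v = w[start:p]
                for q in range(p, end + 1):                  # x = w[p:q], y = w[q:end]
                    u, v, x = w[:start], w[start:p], w[p:q]
                    y, z = w[q:end], w[end:]
                    if not v and not y:
                        continue                             # violates |vy| >= 1
                    if all(in_lang(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        return True
    return False
```

For $w = a^m b^m c^m$ with small $m$, `adversary_survives(w, m, in_anbncn)` finds no surviving decomposition, in line with the argument above. For $w = a^m b^m$ with $m \ge 2$ and `in_anbn`, the choice $v = a$, $y = b$ survives, which mirrors the remark that the lemma yields no conclusion for a context-free language.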



Example 8.2

Consider the language

$L = \{ww : w \in \{a, b\}^*\}$.

Although this language appears to be very similar to the context-free language of Example 5.1, it is not context-free. Consider the string $a^m b^m a^m b^m$. There are many ways in which the adversary can now pick $vxy$, but for all of them we have a winning countermove. For example, for the choice in Figure 8.2, we can use $i = 0$ to get a string of the form $a^k b^j a^m b^m$, $k < m$ or $j < m$, which is not in $L$. For other choices by the adversary similar arguments can be made. We conclude that $L$ is not context-free.



Figure 8.2





Example 8.3

Show that the language

$L = \{a^{n!} : n \ge 0\}$

is not context-free. In Example 4.11 we showed that this language is not regular. However, for a language over an alphabet with a single symbol, there is little difference between Theorem 8.1 and the pumping lemma for regular languages. In either case, the strings to be pumped consist entirely of $a$'s, and whatever new string can be generated by Theorem 8.1 can also be generated by Theorem 4.8. Therefore, we can use essentially the same arguments as in Example 4.11 to show that $L$ is not context-free.



Example 8.4

Show that the language

$L = \{a^n b^j : n = j^2\}$

is not context-free. Given $m$ in Theorem 8.1, we pick as our string $a^{m^2} b^m$. The adversary now has several choices. The only one that requires much thought is the one shown in Figure 8.3. Pumping $i$ times will yield a new string with $m^2 + (i - 1)k_1$ $a$'s and $m + (i - 1)k_2$ $b$'s. If the adversary takes $k_1 \ne 0$, $k_2 \ne 0$, we can pick $i = 0$. Since

$(m - k_2)^2 \le (m - 1)^2 = m^2 - 2m + 1 < m^2 - k_1$,

the result is not in $L$. If the opponent picks $k_1 = 0$, $k_2 \ne 0$ or $k_1 \ne 0$, $k_2 = 0$, then again with $i = 0$, the pumped string is not in $L$. We can conclude from this that $L$ is not a context-free language.
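The inequality used in Example 8.4 for the case $k_1 \ne 0$, $k_2 \ne 0$ can be spelled out step by step. The following is a sketch of the intermediate reasoning, assuming (as in the example) that $k_1$ and $k_2$ count the $a$'s and $b$'s in $vy$, so that $k_1 + k_2 \le |vxy| \le m$.

```latex
\begin{align*}
(m - k_2)^2 &\le (m - 1)^2      && \text{since } 1 \le k_2 \le m \\
            &= m^2 - 2m + 1 \\
            &< m^2 - (m - 1)    && \text{since } m \ge 1 \\
            &\le m^2 - k_1      && \text{since } k_1 \le m - k_2 \le m - 1 .
\end{align*}
```

With $i = 0$ the pumped string has $m^2 - k_1$ $a$'s and $m - k_2$ $b$'s, so the strict inequality shows that the number of $a$'s cannot equal the square of the number of $b$'s.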



Figure 8.3


A Pumping Lemma for Linear Languages

We previously made a distinction between linear and nonlinear context-free grammars. We now make a similar distinction between languages.

Definition 8.1

A context-free language $L$ is said to be linear if there exists a linear context-free grammar $G$ such that $L = L(G)$.

Clearly, every linear language is context-free, but we have not yet established whether or not the converse is true.

Example 8.5

The language $L = \{a^n b^n : n \ge 0\}$ is a linear language. A linear grammar for it is given in Example 1.10. The grammar given in Example 1.12 for the language $L = \{w : n_a(w) = n_b(w)\}$ is not linear, so the second language is not necessarily linear.
Of course, just because a specific grammar is not linear does not imply that the language generated by it is not linear. If we want to prove that a language is not linear, we must show that there exists no equivalent linear grammar. We approach this in the usual way, establishing structural properties for linear languages, then showing that some context-free languages do not have a required property.

Theorem 8.2

Let $L$ be an infinite linear language. Then there exists some positive integer $m$, such that any $w \in L$ with $|w| \ge m$ can be decomposed as $w = uvxyz$ with

$|uvyz| \le m$,    (8.5)

$|vy| \ge 1$,    (8.6)

such that

$u v^i x y^i z \in L$,    (8.7)

for all $i = 0, 1, 2, \ldots$.

Note that the conclusions of this theorem differ from those of Theorem 8.1, since (8.2) is replaced by (8.5). This implies that the strings $v$ and $y$ to be pumped must now be located within $m$ symbols of the left and right ends of $w$, respectively. The middle string $x$ can be of arbitrary length.

Proof: Our reasoning follows the proof of Theorem 8.1. Since the language is linear, there exists some linear grammar $G$ for it. To use the argument in Theorem 8.1, we also need to claim that $G$ contains no unit-productions and no $\lambda$-productions. An examination of the proofs of Theorem 6.3 and Theorem 6.4 will show that removing $\lambda$-productions and unit-productions does not destroy the linearity of the grammar. We can therefore assume that $G$ has the required property.

Consider now the derivation tree as shown in Figure 8.1. Because the grammar is linear, variables can appear only on the path from $S$ to the first $A$, on the path from the first $A$ to the second one, and on the path from the second $A$ to some leaf of the tree. Since there are only a finite number of variables on the path from $S$ to the first $A$, and since each of these generates a finite number of terminals, $u$ and $z$ must be bounded. By a similar argument, $v$ and $y$ are bounded, so (8.5) follows. The rest of the argument is as in Theorem 8.1.

Example 8.6

The language

$L = \{w : n_a(w) = n_b(w)\}$

is not linear. To show this, assume that the language is linear and apply Theorem 8.2 to the string

$w = a^m b^{2m} a^m$.

Inequality (8.5) shows that in this case the strings $u$, $v$, $y$, $z$ must all consist entirely of $a$'s. If we pump this string, we get $a^{m+k} b^{2m} a^{m+l}$, with either $k \ge 1$ or $l \ge 1$, a result that is not in $L$. This contradiction of Theorem 8.2 proves that the language is not linear.
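The difference between the two pumping lemmas is only in the length restriction, which a brute-force check makes visible. The sketch below enumerates every decomposition permitted by (8.5) and (8.6); the function names and the cutoff `max_i` are illustrative assumptions, and as before a finite check can refute a decomposition but cannot prove linearity.

```python
def in_equal_ab(s):
    """Membership test for {w in {a,b}* : n_a(w) = n_b(w)}."""
    return set(s) <= {"a", "b"} and s.count("a") == s.count("b")


def in_anbn(s):
    """Membership test for the linear language {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def adversary_survives_linear(w, m, in_lang, max_i=3):
    """True if some split w = uvxyz with |uvyz| <= m and |vy| >= 1
    keeps u v^i x y^i z in the language for every i in 0..max_i."""
    n = len(w)
    for i1 in range(n + 1):                  # u = w[:i1]
        for i2 in range(i1, n + 1):          # v = w[i1:i2]
            for i3 in range(i2, n + 1):      # x = w[i2:i3]
                for i4 in range(i3, n + 1):  # y = w[i3:i4], z = w[i4:]
                    u, v, x = w[:i1], w[i1:i2], w[i2:i3]
                    y, z = w[i3:i4], w[i4:]
                    if len(u) + len(v) + len(y) + len(z) > m:
                        continue             # violates (8.5)
                    if not v and not y:
                        continue             # violates (8.6)
                    if all(in_lang(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        return True
    return False
```

For $w = a^m b^{2m} a^m$ and small $m$, no decomposition survives, matching Example 8.6. For the linear language $\{a^n b^n\}$ and $w = a^m b^m$ with $m \ge 2$, the split $u = \lambda$, $v = a$, $x = a^{m-1} b^{m-1}$, $y = b$, $z = \lambda$ survives.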

This example answers the general question raised on the relation between the families of context-free and linear languages. The family of linear languages is a proper subset of the family of context-free languages.

1. Use reasoning similar to that in Example 4.11 to give a complete proof that the language in Example 8.3 is not context-free.
2. Show that the language $L = \{a^n : n \text{ is a prime number}\}$ is not context-free.
3. Show that $L = \{www : w \in \{a, b\}^*\}$ is not a context-free language.
4. Show that $L = \{w \in \{a, b, c\}^* : n_a^2(w) + n_b^2(w) = n_c^2(w)\}$ is not context-free.
5. Is the language $L = \{a^n b^m : n = 2^m\}$ context-free?
6. Show that the language $L = \{a^{n^2} : n \ge 0\}$ is not context-free.
7. Show that the following languages on $\Sigma = \{a, b, c\}$ are not context-free.
   (a) $L = \{a^n b^j : n \le j^2\}$
   (b) $L = \{a^n b^j : n \ge (j - 1)^3\}$
   (c) $L = \{a^n b^j c^k : k = jn\}$
   (d) $L = \{a^n b^j c^k : k > n, k > j\}$
   (e) $L = \{a^n b^j c^k : n < j, n \le k \le j\}$
   (f) $L = \{w : n_a(w) < n_b(w) < n_c(w)\}$
   (g) $L = \{w : n_a(w) / n_b(w) = n_c(w)\}$
   (h) $L = \{w \in \{a, b, c\}^* : n_a(w) + n_b(w) = 2 n_c(w)\}$
8. Determine whether or not the following languages are context-free.
   (a) $L = \{a^n w w^R a^n : n \ge 0, w \in \{a, b\}^*\}$
   (b) $L = \{a^n b^j a^n b^j : n \ge 0, j \ge 0\}$
   (c) $L = \{a^n b^j a^j b^n : n \ge 0, j \ge 0\}$
   (d) $L = \{a^n b^j a^k b^l : n + j \le k + l\}$
   (e) $L = \{a^n b^j a^k b^l : n \le k, j \le l\}$
   (f) $L = \{a^n b^n c^j : n \le j\}$
9. In Theorem 8.1, find a bound for $m$ in terms of the properties of the grammar $G$.
10. Determine whether or not the following language is context-free.
    $L = \{w_1 c w_2 : w_1, w_2 \in \{a, b\}^*, w_1 \ne w_2\}$
11. Show that the language $L = \{a^n b^n a^m b^m : n \ge 0, m \ge 0\}$ is context-free but not linear.
12. Show that the following language is not linear.
    $L = \{w : n_a(w) \ge n_b(w)\}$
13. Show that the language $L = \{w \in \{a, b, c\}^* : n_a(w) + n_b(w) = n_c(w)\}$ is context-free, but not linear.
14. Determine whether or not the language $L = \{a^n b^j : j \le n \le 2j - 1\}$ is linear.
15. Determine whether or not the language in Example 5.1? is linear.
16. In Theorem 8.2, find a bound on $m$ in terms of the properties of the grammar $G$.
17. Justify the claim made in Theorem 8.2 that for any linear language (not containing $\lambda$) there exists a linear grammar without $\lambda$-productions and unit-productions.
18. Consider the set of all strings $a/b$, where $a$ and $b$ are positive decimal integers such that $a < b$. The set of strings then represents all possible decimal fractions. Determine whether or not this is a context-free language.
19. Show that the complement of the language in Exercise 6 is not context-free.
20. Is the following language context-free?
    $L = \{a^{nm} : n \text{ and } m \text{ are prime numbers}\}$

8.2 Closure Properties and Decision Algorithms for Context-Free Languages

In Chapter 4 we looked at closure under certain operations and algorithms to decide on the properties of the family of regular languages. On the whole, the questions raised there had easy answers. When we ask the same questions about context-free languages, we encounter more difficulties. First, closure properties that hold for regular languages do not always hold for context-free languages. When they do, the arguments needed to prove them are often quite complicated. Second, many intuitively simple and important questions about context-free languages cannot be answered. This statement may seem at first surprising and will need to be elaborated as we proceed. In this section, we provide only a sample of some of the most important results.

Closure of Context-Free Languages

Theorem 8.3

The family of context-free languages is closed under union, concatenation, and star-closure.

Proof: Let $L_1$ and $L_2$ be two context-free languages generated by the context-free grammars $G_1 = (V_1, T_1, S_1, P_1)$ and $G_2 = (V_2, T_2, S_2, P_2)$, respectively. We can assume without loss of generality that the sets $V_1$ and $V_2$ are disjoint.

Consider now the language $L(G_3)$, generated by the grammar

$G_3 = (V_1 \cup V_2 \cup \{S_3\}, T_1 \cup T_2, S_3, P_3)$,

where $S_3$ is a variable not in $V_1 \cup V_2$. The productions of $G_3$ are all the productions of $G_1$ and of $G_2$, together with an alternative starting production that allows us to use one or the other grammar. More precisely,

$P_3 = P_1 \cup P_2 \cup \{S_3 \to S_1 | S_2\}$.

Obviously, $G_3$ is a context-free grammar, so that $L(G_3)$ is a context-free language. But it is easy to see that

$L(G_3) = L_1 \cup L_2$.    (8.8)

Suppose for instance that $w \in L_1$. Then

$S_3 \Rightarrow S_1 \Rightarrow^* w$

is a possible derivation in grammar $G_3$. A similar argument can be made for $w \in L_2$. Also, if $w \in L(G_3)$ then either

$S_3 \Rightarrow S_1$    (8.9)

or

$S_3 \Rightarrow S_2$    (8.10)

must be the first step of the derivation. Suppose (8.9) is used. Since sentential forms derived from $S_1$ have variables in $V_1$, and $V_1$ and $V_2$ are disjoint, the derivation

$S_1 \Rightarrow^* w$

can involve productions in $P_1$ only. Hence $w$ must be in $L_1$. Alternatively, if (8.10) is used first, then $w$ must be in $L_2$ and it follows that $L(G_3)$ is the union of $L_1$ and $L_2$.

Next, consider

$G_4 = (V_1 \cup V_2 \cup \{S_4\}, T_1 \cup T_2, S_4, P_4)$.

Here again $S_4$ is a new variable and

$P_4 = P_1 \cup P_2 \cup \{S_4 \to S_1 S_2\}$.

Then

$L(G_4) = L(G_1) L(G_2)$

follows easily.

Finally, consider $L(G_5)$ with

$G_5 = (V_1 \cup \{S_5\}, T_1, S_5, P_5)$,

where $S_5$ is a new variable and

$P_5 = P_1 \cup \{S_5 \to S_1 S_5 | \lambda\}$.

Then

$L(G_5) = L(G_1)^*$.

Thus we have shown that the family of context-free languages is closed under union, concatenation, and star-closure.

Theorem 8.4

The family of context-free languages is not closed under intersection and complementation.

Proof: Consider the two languages

$L_1 = \{a^n b^n c^m : n \ge 0, m \ge 0\}$

and

$L_2 = \{a^n b^m c^m : n \ge 0, m \ge 0\}$.

There are several ways one can show that $L_1$ and $L_2$ are context-free. For instance, a grammar for $L_1$ is

$S \to S_1 S_2$,
$S_1 \to a S_1 b \,|\, \lambda$,
$S_2 \to c S_2 \,|\, \lambda$.

Alternatively, we note that $L_1$ is the concatenation of two context-free languages, so it is context-free by Theorem 8.3. But

$L_1 \cap L_2 = \{a^n b^n c^n : n \ge 0\}$,

which we have already shown not to be context-free. Thus, the family of context-free languages is not closed under intersection.

The second part of the theorem follows from Theorem 8.3 and the set identity

$L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$.

If the family of context-free languages were closed under complementation, then the right side of the above expression would be a context-free language for any context-free $L_1$ and $L_2$. But this contradicts what we have just shown, that the intersection of two context-free languages is not necessarily context-free. Consequently, the family of context-free languages is not closed under complementation.

While the intersection of two context-free languages may produce a language that is not context-free, the closure property holds if one of the languages is regular.

Theorem 8.5

Let $L_1$ be a context-free language and $L_2$ be a regular language. Then $L_1 \cap L_2$ is context-free.

Proof: Let $M_1 = (Q, \Sigma, \Gamma, \delta_1, q_0, z, F_1)$ be an npda which accepts $L_1$ and $M_2 = (P, \Sigma, \delta_2, p_0, F_2)$ be a dfa that accepts $L_2$. We construct a pushdown automaton $\hat{M} = (\hat{Q}, \Sigma, \Gamma, \hat{\delta}, \hat{q}_0, z, \hat{F})$ which simulates the parallel action of $M_1$ and $M_2$: whenever a symbol is read from the input string, $\hat{M}$ simultaneously executes the moves of $M_1$ and $M_2$. To this end we let

$\hat{Q} = Q \times P$,
$\hat{q}_0 = (q_0, p_0)$,
$\hat{F} = F_1 \times F_2$,

and define $\hat{\delta}$ such that

$((q_k, p_l), x) \in \hat{\delta}((q_i, p_j), a, b)$

if and only if

$(q_k, x) \in \delta_1(q_i, a, b)$

and

$\delta_2(p_j, a) = p_l$.

In this, we also require that if $a = \lambda$, then $p_j = p_l$. In other words, the states of $\hat{M}$ are labeled with pairs $(q_i, p_j)$, representing the respective states in which $M_1$ and $M_2$ can be after reading a certain input string. It is a straightforward induction argument to show that

$((q_0, p_0), w, z) \vdash^*_{\hat{M}} ((q_r, p_s), \lambda, x)$,

with $q_r \in F_1$ and $p_s \in F_2$, if and only if

$(q_0, w, z) \vdash^*_{M_1} (q_r, \lambda, x)$

and

$\delta_2^*(p_0, w) = p_s$.

Therefore, a string is accepted by $\hat{M}$ if and only if it is accepted by $M_1$ and $M_2$, that is, if it is in $L(M_1) \cap L(M_2) = L_1 \cap L_2$.

The property addressed by this theorem is called closure under regular intersection. Because of the result of the theorem, we say that the family of context-free languages is closed under regular intersection. This closure property is sometimes useful for simplifying arguments in connection with specific languages.

Example 8.7

Show that the language

$L = \{a^n b^n : n \ge 0, n \ne 100\}$

is context-free. It is possible to prove this claim by constructing a pda or a context-free grammar for the language, but the process is tedious. We can get a much neater argument with Theorem 8.5. Let

$L_1 = \{a^{100} b^{100}\}$.

Then, because $L_1$ is finite, it is regular. Also, it is easy to see that

$L = \{a^n b^n : n \ge 0\} \cap \overline{L_1}$.

Therefore, by the closure of regular languages under complementation and the closure of context-free languages under regular intersection, the desired result follows.
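The product construction in the proof of Theorem 8.5 can be sketched in a few lines. Everything about the encoding below is an assumption made for illustration: transitions are stored in dictionaries, the empty string stands for $\lambda$, pushed strings are written top symbol first, and the dfa `M2` (accepting strings with an even number of $a$'s) is a hypothetical example. The npda `M1` reuses the transitions of the pda from Example 7.8.

```python
from collections import deque

LAMBDA = ""  # marks a lambda-move and also "push nothing"


def product_pda(delta1, finals1, delta2, finals2):
    """Build transitions and final states of the product automaton.

    delta1: dict (q, a, b) -> set of (q_next, push) pairs for the npda.
    delta2: dict (p, a) -> p_next for a dfa that is total on its alphabet.
    """
    dfa_states = {p for (p, _) in delta2} | set(delta2.values())
    delta = {}
    for (q, a, b), moves in delta1.items():
        for p in dfa_states:
            # On a lambda-move the dfa component does not change.
            p_next = p if a == LAMBDA else delta2[(p, a)]
            delta.setdefault(((q, p), a, b), set()).update(
                ((q_next, p_next), push) for q_next, push in moves
            )
    finals = {(q, p) for q in finals1 for p in finals2}
    return delta, finals


def accepts(delta, start, stack_start, finals, w, max_configs=100_000):
    """Breadth-first search over configurations (state, position, stack)."""
    first = (start, 0, stack_start)
    seen, queue = {first}, deque([first])
    while queue and len(seen) <= max_configs:
        state, i, stack = queue.popleft()
        if i == len(w) and state in finals:
            return True
        if not stack:
            continue  # no move is defined on an empty stack
        top, rest = stack[0], stack[1:]
        options = [(LAMBDA, i)]
        if i < len(w):
            options.append((w[i], i + 1))
        for a, j in options:
            for state_next, push in delta.get((state, a, top), ()):
                config = (state_next, j, push + rest)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False


# npda for {a^n b^n : n >= 0}, taken from Example 7.8.
M1 = {
    ("q0", "a", "0"): {("q1", "10")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "", "0"): {("q0", "")},
}
# Hypothetical dfa: accepts strings over {a, b} with an even number of a's.
M2 = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
DELTA, FINALS = product_pda(M1, {"q0"}, M2, {"even"})
START = ("q0", "even")
```

With these definitions, `accepts(DELTA, START, "0", FINALS, w)` holds exactly when $w = a^n b^n$ with $n$ even, the intersection of the two languages. The bound `max_configs` is a safeguard for machines whose $\lambda$-moves could grow the stack without limit; it is not needed for this example.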

Example 8.8
Show that the language

$$L = \{w \in \{a, b, c\}^{*} : n_a(w) = n_b(w) = n_c(w)\}$$

is not context-free. The pumping lemma can be used for this, but again we can get a much shorter argument using closure under regular intersection. Suppose that $L$ were context-free. Then

$$L \cap L(a^{*}b^{*}c^{*}) = \{a^n b^n c^n : n \geq 0\}$$

would also be context-free. But we already know that this is not so. We conclude that $L$ is not context-free.

Closure properties of languages play an important role in the theory of formal languages and many more closure properties for context-free languages can be established. Some additional results are explored in the exercises at the end of this section.

Some Decidable Properties of Context-Free Languages

By putting together Theorems 5.2 and 6.6, we have already established the existence of a membership algorithm for context-free languages. This is of course an essential feature of any language family useful in practice. Other simple properties of context-free languages can also be determined. For the purpose of this discussion, we assume that the language is described by its grammar.

Theorem 8.6
Given a context-free grammar $G = (V, T, S, P)$, there exists an algorithm for deciding whether or not $L(G)$ is empty.

Proof: For simplicity, assume that $\lambda \notin L(G)$. Slight changes have to be made in the argument if this is not so. We use the algorithm for removing useless symbols and productions. If $S$ is found to be useless, then $L(G)$ is empty; if not, then $L(G)$ contains at least one element. ∎

Theorem 8.7
Given a context-free grammar $G = (V, T, S, P)$, there exists an algorithm for determining whether or not $L(G)$ is infinite.

Proof: We assume that $G$ contains no $\lambda$-productions, no unit-productions, and no useless symbols. Suppose the grammar has a repeating variable in the sense that there exists some $A \in V$ for which there is a derivation

$$A \Rightarrow^{*} xAy.$$

Since $G$ is assumed to have no $\lambda$-productions and no unit-productions, $x$ and $y$ cannot be simultaneously empty. Since $A$ is neither nullable nor a useless symbol, we have

$$S \Rightarrow^{*} uAv \Rightarrow^{*} w \quad \text{and} \quad A \Rightarrow^{*} z,$$

where $u$, $v$, and $z$ are in $T^{*}$. But then

$$S \Rightarrow^{*} uAv \Rightarrow^{*} ux^{n}Ay^{n}v \Rightarrow^{*} ux^{n}zy^{n}v$$

is possible for all $n$, so that $L(G)$ is infinite.

If no variable can ever repeat, then the length of any derivation is bounded by $|V|$. In that case, $L(G)$ is finite.

Thus, to get an algorithm for determining whether $L(G)$ is finite, we need only to determine whether the grammar has some repeating variables. This can be done simply by drawing a dependency graph for the variables in such a way that there is an edge $(A, B)$ whenever there is a corresponding production

$$A \rightarrow xBy.$$

Then any variable that is at the base of a cycle is a repeating one. Consequently, the grammar has a repeating variable if and only if the dependency graph has a cycle.

Since we now have an algorithm for deciding whether a grammar has a repeating variable, we have an algorithm for determining whether or not $L(G)$ is infinite. ∎

Somewhat surprisingly, other simple properties of context-free languages are not so easily dealt with. As in Theorem 4.7, we might look for an algorithm to determine whether two context-free grammars generate the same language. But it turns out that there is no such algorithm. For the moment, we do not have the technical machinery for properly defining the meaning of "there is no algorithm," but its intuitive meaning is clear. This is an important point to which we will return later.

EXERCISES

1. Is the complement of the language in Example 8.8 context-free?
2. Consider the language $L_1$ in Theorem 8.4. Show that this language is linear.

3. Show that the family of context-free languages is closed under homomorphism.
4. Show that the family of linear languages is closed under homomorphism.
5. Show that the family of context-free languages is closed under reversal.
6. Which of the language families we have discussed are not closed under reversal?
7. Show that the family of context-free languages is not closed under difference in general, but is closed under regular difference, that is, if $L_1$ is context-free and $L_2$ is regular, then $L_1 - L_2$ is context-free.
8. Show that the family of deterministic context-free languages is closed under regular difference.
9. Show that the family of linear languages is closed under union, but not closed under concatenation.
10. Show that the family of linear languages is not closed under intersection.
11. Show that the family of deterministic context-free languages is not closed under union and intersection.
12. Give an example of a context-free language whose complement is not context-free.
13. Show that if $L_1$ is linear and $L_2$ is regular, then $L_1L_2$ is a linear language.
14. Show that the family of unambiguous context-free languages is not closed under union.
15. Show that the family of unambiguous context-free languages is not closed under intersection.
16. Let $L$ be a deterministic context-free language and define a new language $L_1 = \{w : aw \in L,\ a \in \Sigma\}$. Is it necessarily true that $L_1$ is a deterministic context-free language?
17. Show that the language $L = \{a^n b^n : n \geq 0,\ n \text{ is not a multiple of } 5\}$ is context-free.
18. Show that the following language is context-free: $L = \{w \in \{a, b\}^{*} : n_a(w) = n_b(w),\ w \text{ does not contain a substring } aab\}$.
19. Is the family of deterministic context-free languages closed under homomorphism?
20. Give the details of the inductive argument in Theorem 8.5.
21. Give an algorithm which, for any given context-free grammar $G$, can determine whether or not $\lambda \in L(G)$.
22. Show that there exists an algorithm to determine whether the language generated by some context-free grammar contains any words of length less than some given number $n$.
23. Let $L_1$ be a context-free language and $L_2$ be regular. Show that there exists an algorithm to determine whether or not $L_1$ and $L_2$ have a common element.
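The decision procedures behind the emptiness and infiniteness theorems of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's own algorithm text: the grammar encoding (a dictionary from variables to right-hand-side strings, one character per symbol) and the function names `generating`, `is_empty`, and `is_infinite` are hypothetical. The infiniteness test is valid only for grammars without λ-productions, unit-productions, or useless symbols, as the proof requires.

```python
from typing import Dict, List, Set

# Assumed encoding: each key is a variable, each right-hand side is a string
# of one-character symbols; any symbol that is not a key is a terminal.
Grammar = Dict[str, List[str]]

def generating(g: Grammar) -> Set[str]:
    """Variables that derive at least one terminal string."""
    gen: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in g.items():
            if var in gen:
                continue
            # var is generating if some right-hand side uses only
            # terminals and variables already known to be generating
            if any(all(s in gen or s not in g for s in rhs) for rhs in rhss):
                gen.add(var)
                changed = True
    return gen

def is_empty(g: Grammar, start: str = "S") -> bool:
    """L(G) is empty exactly when the start symbol is useless."""
    return start not in generating(g)

def is_infinite(g: Grammar) -> bool:
    """Cycle test on the variable dependency graph.

    Assumes G has no lambda-productions, unit-productions, or useless symbols.
    """
    edges = {a: {s for rhs in rhss for s in rhs if s in g} for a, rhss in g.items()}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {a: WHITE for a in g}

    def visit(a: str) -> bool:
        colour[a] = GREY                      # a is on the current search path
        for b in edges[a]:
            if colour[b] == GREY or (colour[b] == WHITE and visit(b)):
                return True                   # back edge: a repeating variable
        colour[a] = BLACK
        return False

    return any(colour[a] == WHITE and visit(a) for a in g)

g_infinite = {"S": ["aSb", "ab"]}
g_finite = {"S": ["AB"], "A": ["a"], "B": ["b", "bb"]}
g_empty = {"S": ["aS"]}
print(is_empty(g_empty), is_infinite(g_infinite), is_infinite(g_finite))
```

The depth-first search marks variables on the current path, so a back edge corresponds to a variable that lies on a cycle of the dependency graph.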

Chapter 9
Turing Machines

In the foregoing discussion, we have encountered some fundamental ideas, in particular the concepts of regular and context-free languages and their association with finite automata and pushdown accepters. Our study has revealed that the regular languages form a proper subset of the context-free languages, and therefore, that pushdown automata are more powerful than finite automata. We also saw that context-free languages, while fundamental to the study of programming languages, are limited in scope. This was made clear in the last chapter, where our results showed that some simple languages, such as $\{a^n b^n c^n\}$ and $\{ww\}$, are not context-free. This prompts us to look beyond context-free languages and investigate how one might define new language families that include these examples. To do so, we return to the general picture of an automaton. If we compare finite automata with pushdown automata, we see that the nature of the temporary storage creates the difference between them. If there is no storage, we have a finite automaton; if the storage is a stack, we have the more powerful pushdown automaton. Extrapolating from this observation, we can expect to discover even more powerful language families if we give the automaton more flexible storage. For example, what would happen if, in the general scheme of Figure 1.3, we used two stacks, three stacks, a queue, or some other storage device? Does each storage device define a new kind of automaton and through it a new language family? This approach raises a large number of questions, most of which turn out to be uninteresting. It is more instructive to ask a more ambitious question and consider how far the concept of an automaton can be pushed. What can we say about the most powerful of automata and the limits of computation? This leads to the fundamental concept of a Turing machine and, in turn, to a precise definition of the idea of a mechanical or algorithmic computation.

We begin our study with a formal definition of a Turing machine, then develop some feeling for what is involved by doing some simple programs. Next we argue that, while the mechanism of a Turing machine is quite rudimentary, the concept is broad enough to cover very complex processes. The discussion culminates in the Turing thesis, which maintains that any computational process, such as those carried out by present-day computers, can be done on a Turing machine.

9.1 The Standard Turing Machine

Although we can envision a variety of automata with complex and sophisticated storage devices, a Turing machine's storage is actually quite simple. It can be visualized as a single, one-dimensional array of cells, each of which can hold a single symbol. This array extends indefinitely in both directions and is therefore capable of holding an unlimited amount of information. The information can be read and changed in any order. We will call such a storage device a tape because it is analogous to the magnetic tapes used in actual computers.

Definition of a Turing Machine

A Turing machine is an automaton whose temporary storage is a tape. This tape is divided into cells, each of which is capable of holding one symbol. Associated with the tape is a read-write head that can travel right or left on the tape and that can read and write a single symbol on each move. To deviate slightly from the general scheme of Chapter 1, the automaton that we use as a Turing machine will have neither an input file nor any special output mechanism. Whatever input and output is necessary will be done on the machine's tape. We will see later that this modification of our general model in Section 1.2 is of little consequence. We could retain the input file and a specific output mechanism without affecting any of the conclusions we are about to draw, but we leave them out because the resulting automaton is a little easier to describe.

A diagram giving an intuitive visualization of a Turing machine is shown in Figure 9.1. Definition 9.1 makes the notion precise.

[Figure 9.1: the tape and its read-write head.]

Definition 9.1
A Turing machine $M$ is defined by

$$M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F),$$

where
$Q$ is the set of internal states,
$\Sigma$ is the input alphabet,
$\Gamma$ is a finite set of symbols called the tape alphabet,
$\delta$ is the transition function,
$\square \in \Gamma$ is a special symbol called the blank,
$q_0 \in Q$ is the initial state,
$F \subseteq Q$ is the set of final states.

In the definition of a Turing machine, we assume that $\Sigma \subseteq \Gamma - \{\square\}$, that is, that the input alphabet is a subset of the tape alphabet, not including the blank. Blanks are ruled out as input for reasons that will become apparent shortly. The transition function $\delta$ is defined as

$$\delta : Q \times \Gamma \rightarrow Q \times \Gamma \times \{L, R\}.$$

In general, $\delta$ is a partial function on $Q \times \Gamma$; its interpretation gives the principle by which a Turing machine operates. The arguments of $\delta$ are the current state of the control unit and the current tape symbol being read. The result is a new state of the control unit, a new tape symbol, which replaces the old one, and a move symbol, $L$ or $R$. The move symbol indicates whether the read-write head moves left or right one cell after the new symbol has been written on the tape.

Example 9.1
Figure 9.2 shows the situation before and after the move caused by the transition

$$\delta(q_0, a) = (q_1, d, R).$$

[Figure 9.2: The situation (a) before the move, with internal state $q_0$, and (b) after the move, with internal state $q_1$.]

We can think of a Turing machine as a rather simple computer. It has a processing unit, which has a finite memory, and in its tape, it has a secondary storage of unlimited capacity. The instructions that such a computer can carry out are very limited: it can sense a symbol on its tape and use the result to decide what to do next. The only actions the machine can perform are to rewrite the current symbol, to change the state of the control, and to move the read-write head. This small instruction set may seem inadequate for doing complicated things, but this is not so. Turing machines are quite powerful in principle. The transition function $\delta$ defines how this computer acts, and we often call it the "program" of the machine.

As always, the automaton starts in the given initial state with some information on the tape. It then goes through a sequence of steps controlled by the transition function $\delta$. During this process, the contents of any cell on the tape may be examined and changed many times. Eventually, the whole process may terminate, which we achieve in a Turing machine by putting it into a halt state. A Turing machine is said to halt whenever it reaches a configuration for which $\delta$ is not defined; this is possible because $\delta$ is a partial function. In fact, we will assume that no transitions are defined for any final state, so the Turing machine will halt whenever it enters a final state.
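A single move of such a machine is easy to mimic in code. The following Python sketch is an illustration only: the function name `move`, the dictionary representation of the tape, and the tape contents `abc` are assumptions made here, and the one transition in the table has the shape of the transition in Example 9.1.

```python
BLANK = "□"   # stand-in for the blank symbol

def move(delta, state, tape, head):
    """Apply one move of a standard Turing machine.

    delta maps (state, symbol) to (new_state, written_symbol, "L" or "R").
    tape maps cell indices to symbols; cells not present hold the blank.
    Returns the new (state, head), or None when delta is undefined for the
    current configuration, which is exactly when the machine halts.
    """
    key = (state, tape.get(head, BLANK))
    if key not in delta:
        return None
    new_state, written, direction = delta[key]
    tape[head] = written                      # rewrite the scanned cell
    return new_state, head + (1 if direction == "R" else -1)

delta = {("q0", "a"): ("q1", "d", "R")}       # one transition, as in Example 9.1
tape = dict(enumerate("abc"))                 # hypothetical tape contents
result = move(delta, "q0", tape, 0)
print(result, "".join(tape[i] for i in sorted(tape)))
```

Because `delta` is an ordinary dictionary, a missing key plays the role of an undefined value of the partial function, so a second call in state `q1` returns `None`.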

Example 9.2
Consider the Turing machine defined by

$$Q = \{q_0, q_1\}, \quad \Sigma = \{a, b\}, \quad \Gamma = \{a, b, \square\}, \quad F = \{q_1\},$$

and

$$\delta(q_0, a) = (q_0, b, R),$$
$$\delta(q_0, b) = (q_0, b, R),$$
$$\delta(q_0, \square) = (q_1, \square, L).$$

If this Turing machine is started in state $q_0$ with the symbol $a$ under the read-write head, the applicable transition rule is $\delta(q_0, a) = (q_0, b, R)$. Therefore the read-write head will replace the $a$ with a $b$, then move right on the tape. The machine will remain in state $q_0$. Any subsequent $a$ will also be replaced with a $b$, but $b$'s will not be modified. When the machine encounters the first blank, it will move left one cell, then halt in final state $q_1$.

Figure 9.3 shows several stages of the process for a simple initial configuration.

[Figure 9.3: A sequence of moves.]

Example 9.3
Take $Q$, $\Sigma$, $\Gamma$ as defined in the previous example, but let $F$ be empty. Define $\delta$ by

$$\delta(q_0, a) = (q_1, a, R),$$
$$\delta(q_0, b) = (q_1, b, R),$$
$$\delta(q_0, \square) = (q_1, \square, R),$$
$$\delta(q_1, a) = (q_0, a, L),$$
$$\delta(q_1, b) = (q_0, b, L),$$
$$\delta(q_1, \square) = (q_0, \square, L).$$

To see what happens here, we can trace a typical case. Suppose that the tape initially contains $ab\ldots$, with the read-write head on the $a$. The machine then reads the $a$, but does not change it. Its next state is $q_1$ and the read-write head moves right, so that it is now over the $b$. This symbol is also read and left unchanged. The machine goes back into state $q_0$ and the read-write head moves left. We are now back exactly in the original state, and the sequence of moves starts again. It is clear from this that the machine, whatever the initial information on its tape, will run forever, with the read-write head moving alternately right then left, but making no modifications to the tape. This is an instance of a Turing machine that does not halt. As an analogy with programming terminology, we say that the Turing machine is in an infinite loop.

Since one can make several different definitions of a Turing machine, it is worthwhile to summarize the main features of our model, which we will call a standard Turing machine:

1. The Turing machine has a tape that is unbounded in both directions, allowing any number of left and right moves.
2. The Turing machine is deterministic in the sense that $\delta$ defines at most one move for each configuration.
3. There is no special input file. We assume that at the initial time the tape has some specified content. Some of this may be considered input. Similarly, there is no special output device. Whenever the machine halts, some or all of the contents of the tape may be viewed as output.

These conventions were chosen primarily for the convenience of subsequent discussion. In Chapter 10, we will look at other versions of Turing machines and discuss their relation to our standard model.

To exhibit the configurations of a Turing machine, we use the idea of an instantaneous description. Any configuration is completely determined by the current state of the control unit, the contents of the tape, and the position of the read-write head. We will use the notation in which

$$x_1 q x_2$$

or

$$a_1 a_2 \cdots a_{k-1}\, q\, a_k a_{k+1} \cdots a_n$$

is the instantaneous description of a machine in state $q$ with the tape depicted in Figure 9.4. The symbols $a_1, \ldots, a_n$ show the tape contents, while $q$ defines the state of the control unit. This convention is chosen so that the position of the read-write head is over the cell containing the symbol immediately following $q$.

[Figure 9.4]

The instantaneous description gives only a finite amount of information to the right and left of the read-write head. The unspecified part of the tape is assumed to contain all blanks; normally such blanks are irrelevant and are not shown explicitly in the instantaneous description. If the position of blanks is relevant to the discussion, however, the blank symbol may appear in the instantaneous description. For example, the instantaneous description $q\square w$ indicates that the read-write head is on the cell to the immediate left of the first symbol of $w$ and that this cell contains a blank.

Example 9.4
The pictures drawn in Figure 9.3 correspond to the sequence of instantaneous descriptions

$$q_0aa, \quad bq_0a, \quad bbq_0\square, \quad bq_1b.$$

A move from one configuration to another will be denoted by $\vdash$. Thus, if

$$\delta(q_1, c) = (q_2, e, R),$$

then the move

$$abq_1cd \vdash abeq_2d$$

is made whenever the internal state is $q_1$, the tape contains $abcd$, and the read-write head is on the $c$. The symbol $\vdash^{*}$ has the usual meaning of an arbitrary number of moves. Subscripts, such as $\vdash_M$, are used in arguments to distinguish between several machines.

Example 9.5
The action of the Turing machine in Figure 9.3 can be represented by

$$q_0aa \vdash bq_0a \vdash bbq_0\square \vdash bq_1b$$

or

$$q_0aa \vdash^{*} bq_1b.$$
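The instantaneous-description notation lends itself to a mechanical check. The Python sketch below is illustrative only: the helper names `describe` and `trace` are hypothetical, and it prints the sequence of descriptions for the machine of Example 9.2 started on `aa`. A blank is shown only when it lies under the read-write head, following the convention that irrelevant blanks are omitted.

```python
BLANK = "□"   # stand-in for the blank symbol

def describe(state, tape, head):
    """Instantaneous description x1 q x2: the state precedes the scanned cell."""
    used = [i for i, s in tape.items() if s != BLANK]
    lo, hi = min(used + [head]), max(used + [head])
    cells = [tape.get(i, BLANK) for i in range(lo, hi + 1)]
    k = head - lo
    return "".join(cells[:k]) + state + "".join(cells[k:])

def trace(delta, tape_string, state="q0", max_steps=1000):
    """List the instantaneous descriptions until the machine halts."""
    tape = dict(enumerate(tape_string))
    head = 0
    ids = [describe(state, tape, head)]
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:                  # delta undefined: the machine halts
            break
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
        ids.append(describe(state, tape, head))
    return ids

example_9_2 = {
    ("q0", "a"): ("q0", "b", "R"),
    ("q0", "b"): ("q0", "b", "R"),
    ("q0", BLANK): ("q1", BLANK, "L"),
}
print(" ⊢ ".join(trace(example_9_2, "aa")))
# q0aa ⊢ bq0a ⊢ bbq0□ ⊢ bq1b
```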

For further discussion, it is convenient to summarize the various observations just made in a formal way.

Definition 9.2
Let $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ be a Turing machine. Then any string $a_1 \cdots a_{k-1}\, q_1\, a_k a_{k+1} \cdots a_n$, with $a_i \in \Gamma$ and $q_1 \in Q$, is an instantaneous description of $M$. A move

$$a_1 \cdots a_{k-1}\, q_1\, a_k a_{k+1} \cdots a_n \vdash a_1 \cdots a_{k-1}\, b\, q_2\, a_{k+1} \cdots a_n$$

is possible if and only if

$$\delta(q_1, a_k) = (q_2, b, R).$$

A move

$$a_1 \cdots a_{k-1}\, q_1\, a_k a_{k+1} \cdots a_n \vdash a_1 \cdots q_2\, a_{k-1}\, b\, a_{k+1} \cdots a_n$$

is possible if and only if

$$\delta(q_1, a_k) = (q_2, b, L).$$

$M$ is said to halt starting from some initial configuration $x_1 q_i x_2$ if

$$x_1 q_i x_2 \vdash^{*} y_1 q_j a y_2$$

for any $q_j$ and $a$, for which $\delta(q_j, a)$ is undefined. The sequence of configurations leading to a halt state will be called a computation.

Example 9.3 shows the possibility that a Turing machine will never halt, proceeding in an endless loop from which it cannot escape. This situation plays a fundamental role in the discussion of Turing machines, so we use a special notation for it. We will represent it by

$$x_1 q x_2 \vdash^{*} \infty,$$

indicating that, starting from the initial configuration $x_1 q x_2$, the machine never halts.
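A simulation can only ever observe that a machine has not halted yet, so any experiment with a non-halting machine needs a step budget. The following Python sketch makes that explicit; the function name `halts_within` and the budget values are assumptions of this illustration, and the two transition tables are transcribed from Examples 9.2 and 9.3.

```python
BLANK = "□"   # stand-in for the blank symbol

def halts_within(delta, tape_string, budget, state="q0"):
    """Return True if the machine halts after at most `budget` moves.

    A False result does not prove that the machine runs forever; it only
    reports that no halting configuration was reached within the budget.
    """
    tape = dict(enumerate(tape_string))
    head = 0
    for _ in range(budget):
        key = (state, tape.get(head, BLANK))
        if key not in delta:
            return True
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
    return (state, tape.get(head, BLANK)) not in delta

# Example 9.2: rewrites a's as b's and halts in q1.
example_9_2 = {
    ("q0", "a"): ("q0", "b", "R"),
    ("q0", "b"): ("q0", "b", "R"),
    ("q0", BLANK): ("q1", BLANK, "L"),
}

# Example 9.3: moves right in q0 and left in q1 without changing the tape.
example_9_3 = {("q0", s): ("q1", s, "R") for s in ("a", "b", BLANK)}
example_9_3.update({("q1", s): ("q0", s, "L") for s in ("a", "b", BLANK)})

print(halts_within(example_9_2, "aa", 3))        # True
print(halts_within(example_9_3, "ab", 10_000))   # False
```

For Example 9.3 the table is total on the two states, so no configuration is ever halting and the result stays `False` for every budget.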

Turing Machines as Language Accepters

Turing machines can be viewed as accepters in the following sense. A string $w$ is written on the tape, with blanks filling out the unused portions. The machine is started in the initial state $q_0$ with the read-write head positioned on the leftmost symbol of $w$. If, after a sequence of moves, the Turing machine enters a final state and halts, then $w$ is considered to be accepted.

Definition 9.3
Let $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ be a Turing machine. Then the language accepted by $M$ is

$$L(M) = \{w \in \Sigma^{+} : q_0 w \vdash^{*} x_1 q_f x_2 \text{ for some } q_f \in F,\ x_1, x_2 \in \Gamma^{*}\}.$$

This definition indicates that the input $w$ is written on the tape with blanks on either side. The reason for excluding blanks from the input now becomes clear: it assures us that all the input is restricted to a well-defined region of the tape, bracketed by blanks on the right and left. Without this convention, the machine could not limit the region in which it must look for the input; no matter how many blanks it saw, it could never be sure that there was not some nonblank input somewhere else on the tape.

Definition 9.3 tells us what must happen when $w \in L(M)$. It says nothing about the outcome for any other input. When $w$ is not in $L(M)$, one of two things can happen: the machine can halt in a nonfinal state or it can enter an infinite loop and never halt. Any string for which $M$ does not halt is by definition not in $L(M)$.

Example 9.6
For $\Sigma = \{0, 1\}$, design a Turing machine that accepts the language denoted by the regular expression $00^{*}$.

This is an easy exercise in Turing machine programming. Starting at the left end of the input, we read each symbol and check that it is a 0. If it is, we continue by moving right. If we reach a blank without encountering anything but 0, we terminate and accept the string. If the input contains a 1 anywhere, the string is not in $L(00^{*})$, and we halt in a nonfinal state. To keep track of the computation, two internal states $Q = \{q_0, q_1\}$ and one final state $F = \{q_1\}$ are sufficient. As transition function we can take

$$\delta(q_0, 0) = (q_0, 0, R),$$
$$\delta(q_0, \square) = (q_1, \square, R).$$
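The two transitions of Example 9.6 can be exercised with a small accepter loop. The sketch below is a Python illustration under stated assumptions: the function name `accepts` and the step budget are hypothetical, and a run that exhausts the budget is simply reported as not accepted.

```python
BLANK = "□"   # stand-in for the blank symbol

def accepts(delta, finals, w, state="q0", max_steps=10_000):
    """Return True if the machine halts in a final state on input w."""
    tape = dict(enumerate(w))                 # cell index -> symbol
    head = 0
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:                  # no applicable move: the machine halts
            return state in finals
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
    return False                              # budget exhausted, not accepted

example_9_6 = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", BLANK): ("q1", BLANK, "R"),
}
for w in ("0", "000", "010", "1"):
    print(w, accepts(example_9_6, {"q1"}, w))
```

Calling `accepts` with the empty string also returns `True` in this sketch, because the machine started on a blank moves straight to `q1`; Definition 9.3 nevertheless restricts $L(M)$ to nonempty strings.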

As long as a 0 appears under the read-write head, the head will move to the right. If at any time a 1 is read, the machine will halt in the nonfinal state $q_0$, since $\delta(q_0, 1)$ is undefined. Note that the Turing machine also halts in a final state if started in state $q_0$ on a blank. We could interpret this as acceptance of $\lambda$, but for technical reasons the empty string is not included in Definition 9.3.

The recognition of more complicated languages is more difficult. Since Turing machines have a primitive instruction set, the computations that we can program easily in a higher level language are often cumbersome on a Turing machine. Still, it is possible, and the concept is easy to understand, as the next examples illustrate.

Example 9.7
For $\Sigma = \{a, b\}$, design a Turing machine that accepts

$$L = \{a^n b^n : n \geq 1\}.$$

Intuitively, we solve the problem in the following fashion. Starting at the leftmost $a$, we check it off by replacing it with some symbol, say $x$. We then let the read-write head travel right to find the leftmost $b$, which in turn is checked off by replacing it with another symbol, say $y$. After that, we go left again to the leftmost $a$, replace it with an $x$, then move to the leftmost $b$ and replace it with $y$, and so on. Traveling back and forth this way, we match each $a$ with a corresponding $b$. If after some time no $a$'s or $b$'s remain, then the string must be in $L$.

Working out the details, we arrive at a complete solution for which

$$Q = \{q_0, q_1, q_2, q_3, q_4\}, \quad F = \{q_4\}, \quad \Sigma = \{a, b\}, \quad \Gamma = \{a, b, x, y, \square\}.$$

The transitions can be broken into several parts. The set

$$\delta(q_0, a) = (q_1, x, R),$$
$$\delta(q_1, a) = (q_1, a, R),$$
$$\delta(q_1, y) = (q_1, y, R),$$
$$\delta(q_1, b) = (q_2, y, L),$$

replaces the leftmost $a$ with an $x$, then causes the read-write head to travel right to the first $b$, replacing it with a $y$. When the $y$ is written, the machine enters a state $q_2$, indicating that an $a$ has been successfully paired with a $b$.

The next set of transitions reverses the direction until an $x$ is encountered, repositions the read-write head over the leftmost $a$, and returns control to the initial state.

$$\delta(q_2, y) = (q_2, y, L),$$
$$\delta(q_2, a) = (q_2, a, L),$$
$$\delta(q_2, x) = (q_0, x, R).$$

We are now back in the initial state $q_0$, ready to deal with the next $a$ and $b$.

After one pass through this part of the computation, the machine will have carried out the partial computation

$$q_0 aa \cdots abb \cdots b \vdash^{*} x q_0 a \cdots ayb \cdots b,$$

so that a single $a$ has been matched with a single $b$. After two passes, we will have completed the partial computation

$$q_0 aa \cdots abb \cdots b \vdash^{*} xx q_0 \cdots ayy \cdots b,$$

and so on, indicating that the matching process is being carried out properly.

When the input is a string $a^n b^n$, the rewriting continues this way, stopping only when there are no more $a$'s to be erased. When looking for the leftmost $a$, the read-write head travels left with the machine in state $q_2$. When an $x$ is encountered, the direction is reversed to get the $a$. But now, instead of finding an $a$ it will find a $y$. To terminate, a final check is made to see if all $a$'s and $b$'s have been replaced (to detect input where an $a$ follows a $b$). This can be done by

$$\delta(q_0, y) = (q_3, y, R),$$
$$\delta(q_3, y) = (q_3, y, R),$$
$$\delta(q_3, \square) = (q_4, \square, R).$$

If we input a string not in the language, the computation will halt in a nonfinal state. For example, if we give the machine a string $a^n b^m$, with $n > m$, the machine will eventually encounter a blank in state $q_1$. It will halt because no transition is specified for this case. Other input not in the language will also lead to a nonfinal halting state (see Exercise 3 at the end of this section).

The particular input $aabb$ gives the following successive instantaneous descriptions:

$$q_0aabb \vdash xq_1abb \vdash xaq_1bb \vdash xq_2ayb \vdash q_2xayb \vdash xq_0ayb \vdash xxq_1yb$$
$$\vdash xxyq_1b \vdash xxq_2yy \vdash xq_2xyy \vdash xxq_0yy \vdash xxyq_3y \vdash xxyyq_3\square \vdash xxyy\square q_4\square.$$
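The complete transition table of Example 9.7 can be tested against both members and non-members of the language. The Python sketch below is an illustration only: the function name `accepts` and the step budget are assumptions, while the table is transcribed from the three groups of transitions given in the example.

```python
BLANK = "□"   # stand-in for the blank symbol

def accepts(delta, finals, w, state="q0", max_steps=100_000):
    """Return True if the machine halts in a final state on input w."""
    tape = dict(enumerate(w))                 # cell index -> symbol
    head = 0
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:                  # no applicable move: the machine halts
            return state in finals
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
    return False                              # budget exhausted, not accepted

example_9_7 = {
    # check off the leftmost a, then find and check off the leftmost b
    ("q0", "a"): ("q1", "x", "R"),
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "y"): ("q1", "y", "R"),
    ("q1", "b"): ("q2", "y", "L"),
    # travel back left to the last x and return to the initial state
    ("q2", "y"): ("q2", "y", "L"),
    ("q2", "a"): ("q2", "a", "L"),
    ("q2", "x"): ("q0", "x", "R"),
    # final check: only y's may remain before the blank
    ("q0", "y"): ("q3", "y", "R"),
    ("q3", "y"): ("q3", "y", "R"),
    ("q3", BLANK): ("q4", BLANK, "R"),
}

for w in ("ab", "aabb", "aab", "abb", "aba", "ba"):
    print(w, accepts(example_9_7, {"q4"}, w))
```

Only `ab` and `aabb` are accepted; each of the other inputs halts in a nonfinal state because some transition is undefined.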

In the computation on $aabb$, the Turing machine halts in a final state, so the string $aabb$ is accepted. You are urged to trace this program with several more strings in $L$, as well as with some not in $L$.

Example 9.8
Design a Turing machine that accepts

$$L = \{a^n b^n c^n : n \geq 1\}.$$

The ideas used in Example 9.7 are easily carried over to this case. We match each $a$, $b$, and $c$ by replacing them in order by $x$, $y$, $z$, respectively. At the end, we check that all original symbols have been rewritten. Although conceptually a simple extension of the previous example, writing the actual program is tedious. We leave it as a somewhat lengthy, but straightforward exercise. Notice that even though $\{a^n b^n\}$ is a context-free language and $\{a^n b^n c^n\}$ is not, they can be accepted by Turing machines with very similar structures.

One conclusion we can draw from this example is that a Turing machine can recognize some languages that are not context-free, a first indication that Turing machines are more powerful than pushdown automata.

Turing Machines as Transducers

We have had little reason so far to study transducers; in language theory, accepters are quite adequate. But as we will shortly see, Turing machines are not only interesting as language accepters, they provide us with a simple abstract model for digital computers in general. Since the primary purpose of a computer is to transform input into output, it acts as a transducer. If we want to model computers using Turing machines, we have to look at this aspect more closely.

The input for a computation will be all the nonblank symbols on the tape at the initial time. At the conclusion of the computation, the output will be whatever is then on the tape. Thus, we can view a Turing machine transducer $M$ as an implementation of a function $f$ defined by

$$\widehat{w} = f(w),$$

provided that

$$q_0 w \vdash^{*}_{M} q_f \widehat{w},$$

for some final state $q_f$.

Definition 9.4
A function $f$ with domain $D$ is said to be Turing-computable or just computable if there exists some Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ such that

$$q_0 w \vdash^{*}_{M} q_f f(w), \quad q_f \in F,$$

for all $w \in D$.

As we will shortly claim, all the common mathematical functions, no matter how complicated, are Turing-computable. We start by looking at some simple operations, such as addition and arithmetic comparison.

Example 9.9
Given two positive integers $x$ and $y$, design a Turing machine that computes $x + y$.

We first have to choose some convention for representing positive integers. For simplicity, we will use unary notation in which any positive integer $x$ is represented by $w(x) \in \{1\}^{+}$, such that

$$|w(x)| = x.$$

We must also decide how $x$ and $y$ are placed on the tape initially and how their sum is to appear at the end of the computation. We will assume that $w(x)$ and $w(y)$ are on the tape in unary notation, separated by a single 0, with the read-write head on the leftmost symbol of $w(x)$. After the computation, $w(x + y)$ will be on the tape followed by a single 0, and the read-write head will be positioned at the left end of the result. We therefore want to design a Turing machine for performing the computation

$$q_0 w(x) 0 w(y) \vdash^{*} q_f w(x + y) 0,$$

where $q_f$ is a final state. Constructing a program for this is relatively simple. All we need to do is to move the separating 0 to the right end of $w(y)$, so that the addition amounts to nothing more than the coalescing of the two strings. To achieve this, we construct $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$, with

$$Q = \{q_0, q_1, q_2, q_3, q_4\}, \quad F = \{q_4\},$$

$$\delta(q_0, 1) = (q_0, 1, R),$$
$$\delta(q_0, 0) = (q_1, 1, R),$$
$$\delta(q_1, 1) = (q_1, 1, R),$$
$$\delta(q_1, \square) = (q_2, \square, L),$$
$$\delta(q_2, 1) = (q_3, 0, L),$$
$$\delta(q_3, 1) = (q_3, 1, L),$$
$$\delta(q_3, \square) = (q_4, \square, R).$$

Note that in moving the 0 right we temporarily create an extra 1, a fact that is remembered by putting the machine into state $q_1$. The transition $\delta(q_2, 1) = (q_3, 0, L)$ is needed to remove this at the end of the computation. This can be seen from the sequence of instantaneous descriptions for adding 111 to 11:

$$q_0111011 \vdash 1q_011011 \vdash 11q_01011 \vdash 111q_0011 \vdash 1111q_111 \vdash 11111q_11 \vdash 111111q_1\square$$
$$\vdash 11111q_21 \vdash 1111q_310 \vdash^{*} q_3\square 111110 \vdash q_4111110.$$

Unary notation, although cumbersome for practical computations, is very convenient for programming Turing machines. The resulting programs are much shorter and simpler than if we had used another representation, such as binary or decimal.

Adding numbers is one of the fundamental operations of any computer, one that plays a part in the synthesis of more complicated instructions. Other basic operations are copying strings and simple comparisons. These can also be done easily on a Turing machine.

Example 9.10
Design a Turing machine that copies strings of 1's. More precisely, find a machine that performs the computation

$$q_0 w \vdash^{*} q_f ww,$$

for any $w \in \{1\}^{+}$.
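Returning briefly to the adder of Example 9.9, its seven transitions can be checked by running the machine as a transducer. The Python sketch below is illustrative: the function name `compute`, the step budget, and the decision to read the output from the head position rightwards are assumptions made here to mirror the convention $q_0 w \vdash^{*} q_f f(w)$.

```python
BLANK = "□"   # stand-in for the blank symbol

def compute(delta, finals, w, state="q0", max_steps=100_000):
    """Run the machine on w and return the tape from the head rightwards.

    Returns None if the machine halts in a nonfinal state or does not
    halt within the step budget.
    """
    tape = dict(enumerate(w))
    head = 0
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:                  # halted
            break
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
    else:
        return None                           # budget exhausted
    if state not in finals:
        return None
    right = max([i for i, s in tape.items() if s != BLANK] + [head])
    return "".join(tape.get(i, BLANK) for i in range(head, right + 1)).rstrip(BLANK)

adder = {
    ("q0", "1"): ("q0", "1", "R"),
    ("q0", "0"): ("q1", "1", "R"),            # overwrite the separator with a 1
    ("q1", "1"): ("q1", "1", "R"),
    ("q1", BLANK): ("q2", BLANK, "L"),
    ("q2", "1"): ("q3", "0", "L"),            # turn the surplus 1 into the trailing 0
    ("q3", "1"): ("q3", "1", "L"),
    ("q3", BLANK): ("q4", BLANK, "R"),
}

print(compute(adder, {"q4"}, "111011"))       # 111110
```

The result `111110` is $w(5)$ followed by a single 0, in agreement with the sequence of instantaneous descriptions for adding 111 to 11.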

To solve the copying problem of Example 9.10, we implement the following intuitive process:

1. Replace every 1 by an $x$.
2. Find the rightmost $x$ and replace it with 1.
3. Travel to the right end of the current nonblank region and create a 1 there.
4. Repeat Steps 2 and 3 until there are no more $x$'s.

A Turing machine version of this is

$$\delta(q_0, 1) = (q_0, x, R),$$
$$\delta(q_0, \square) = (q_1, \square, L),$$
$$\delta(q_1, x) = (q_2, 1, R),$$
$$\delta(q_2, 1) = (q_2, 1, R),$$
$$\delta(q_2, \square) = (q_1, 1, L),$$
$$\delta(q_1, 1) = (q_1, 1, L),$$
$$\delta(q_1, \square) = (q_3, \square, R),$$

where $q_3$ is the only final state. This may be a little hard to see at first, so let us trace the program with the simple string 11. The computation performed in this case is

$$q_011 \vdash xq_01 \vdash xxq_0\square \vdash xq_1x \vdash x1q_2\square \vdash xq_111 \vdash q_1x11 \vdash 1q_211 \vdash 11q_21 \vdash 111q_2\square$$
$$\vdash 11q_111 \vdash 1q_1111 \vdash q_11111 \vdash q_1\square 1111 \vdash q_31111.$$

Example 9.11
Let $x$ and $y$ be two positive integers represented in unary notation. Construct a Turing machine that will halt in a final state $q_y$ if $x \geq y$, and that will halt in a nonfinal state $q_n$ if $x < y$. More specifically, the machine is to perform the computation

$$q_0 w(x) 0 w(y) \vdash^{*} q_y w(x) 0 w(y), \quad \text{if } x \geq y,$$
$$q_0 w(x) 0 w(y) \vdash^{*} q_n w(x) 0 w(y), \quad \text{if } x < y.$$
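The copying program of Example 9.10 can be checked in the same transducer style. In this Python sketch the function name `compute` and the step budget are assumptions of the illustration; the table is the seven-transition program for copying strings of 1's, with `q3` as the only final state.

```python
BLANK = "□"   # stand-in for the blank symbol

def compute(delta, finals, w, state="q0", max_steps=100_000):
    """Run the machine on w and return the tape from the head rightwards.

    Returns None if the machine halts in a nonfinal state or does not
    halt within the step budget.
    """
    tape = dict(enumerate(w))
    head = 0
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:                  # halted
            break
        state, tape[head], direction = delta[key]
        head += 1 if direction == "R" else -1
    else:
        return None                           # budget exhausted
    if state not in finals:
        return None
    right = max([i for i, s in tape.items() if s != BLANK] + [head])
    return "".join(tape.get(i, BLANK) for i in range(head, right + 1)).rstrip(BLANK)

copier = {
    ("q0", "1"): ("q0", "x", "R"),            # step 1: replace every 1 by an x
    ("q0", BLANK): ("q1", BLANK, "L"),
    ("q1", "x"): ("q2", "1", "R"),            # step 2: rightmost x becomes 1
    ("q2", "1"): ("q2", "1", "R"),            # step 3: travel to the right end
    ("q2", BLANK): ("q1", "1", "L"),          #         and create a 1 there
    ("q1", "1"): ("q1", "1", "L"),
    ("q1", BLANK): ("q3", BLANK, "R"),        # no x left: stop at the left end
}

print(compute(copier, {"q3"}, "11"))          # 1111
```

If the input contains a symbol other than 1, the machine halts in the nonfinal state `q0` and the sketch returns `None`.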

To solve this problem, we can use the idea in Example 9.7 with some minor modifications. Instead of matching $a$'s and $b$'s, we match each 1 on the left of the dividing 0 with the 1 on the right. At the end of the matching, we will have on the tape either

$$xx \cdots 110xx \cdots x\square$$

or

$$xx \cdots xx0xx \cdots x11\square,$$

depending on whether $x > y$ or $y > x$. In the first case, when we attempt to match another 1, we encounter the blank at the right of the working space. This can be used as a signal to enter the state $q_y$. In the second case, we still find a 1 on the right when all 1's on the left have been replaced. We use this to get into the other state $q_n$. The complete program for this is straightforward and is left as an exercise.

This example makes the important point that a Turing machine can be programmed to make decisions based on arithmetic comparisons. This kind of simple decision is common in the machine language of computers, where alternate instruction streams are entered, depending on the outcome of an arithmetic operation.

EXERCISES

** 1. Write a Turing machine simulator in some higher-level programming language. Such a simulator should accept as input the description of any Turing machine, together with an initial configuration, and should produce as output the result of the computation.

2. Design a Turing machine with no more than three states that accepts the language $L(a(a + b)^{*})$. Assume that $\Sigma = \{a, b\}$. Is it possible to do this with a two-state machine?

3. Determine what the Turing machine in Example 9.7 does when presented with the inputs $aba$ and $aaabbbb$.

4. Is there any input for which the Turing machine in Example 9.7 goes into an infinite loop?

5. What language is accepted by the machine $M = (\{q_0, q_1, q_2, q_3\}, \{a, b\}, \{a, b, \square\}, \delta, q_0, \square, \{q_3\})$ with

$$\begin{aligned}
\delta(q_0, a) &= (q_1, a, R), \\
\delta(q_0, b) &= (q_2, b, R), \\
\delta(q_1, b) &= (q_1, b, R), \\
\delta(q_1, \square) &= (q_3, \square, R), \\
\delta(q_2, b) &= (q_2, b, R), \\
\delta(q_2, a) &= (q_3, a, R)?
\end{aligned}$$

6. What happens in Example 9.10 if the string $w$ contains any symbol other than 1?

7. Construct Turing machines that will accept the following languages on $\{a, b\}$.

(a) $L = L(aba^{*}b)$
(b) $L = \{w : |w| \text{ is even}\}$
(c) $L = \{w : |w| \text{ is a multiple of } 3\}$
(d) $L = \{a^n b^m : n \ge 1, n \ne m\}$
(e) $L = \{w : n_a(w) = n_b(w)\}$
(f) $L = \{a^n b^m a^{n+m} : n \ge 0, m \ge 1\}$
(g) $L = \{a^n b^n a^n b^n : n \ge 0\}$
(h) $L = \{a^n b^{2n} : n \ge 1\}$

For each problem, write out $\delta$ in complete detail, then check your answers by tracing several test examples.

8. Design a Turing machine that accepts the language $L = \{ww : w \in \{a, b\}^{+}\}$.

9. Construct a Turing machine to compute the function $f(w) = w^R$, where $w \in \{0, 1\}^{+}$.

10. Design a Turing machine that finds the middle of a string of even length. Specifically, if $w = a_1 a_2 \cdots a_n a_{n+1} \cdots a_{2n}$, with $a_i \in \Sigma$, the Turing machine should produce $\widehat{w} = a_1 a_2 \cdots a_n c a_{n+1} \cdots a_{2n}$, where $c \in \Gamma - \Sigma$.

11. Design Turing machines to compute the following functions for $x$ and $y$ positive integers represented in unary.

(a) $f(x) = 3x$
(b) $f(x, y) = x - y$ if $x > y$, and $f(x, y) = 0$ if $x \le y$

(c) $f(x, y) = 2x + 3y$
(d) $f(x) = \frac{x}{2}$ if $x$ is even, and $f(x) = \frac{x+1}{2}$ if $x$ is odd
(e) $f(x) = x \bmod 5$
(f) $f(x) = \lfloor \frac{x}{2} \rfloor$, where $\lfloor \frac{x}{2} \rfloor$ denotes the largest integer less than or equal to $\frac{x}{2}$

12. Design a Turing machine with $\Gamma = \{0, 1, \square\}$ that, when started on any cell containing a blank or a 1, will halt if and only if its tape has a 0 somewhere on it.

13. Write out a complete solution for Example 9.8.

14. Give the sequence of instantaneous descriptions that the Turing machine in Example 9.10 goes through when presented with the input 111. What happens when this machine is started with 110 on its tape?

15. Give convincing arguments that the Turing machine in Example 9.10 does in fact carry out the indicated computation.

16. Complete all the details in Example 9.11.

17. Suppose that in Example 9.9 we had decided to represent $x$ and $y$ in binary. Write a Turing machine program for doing the indicated computation in this representation.

18. Sketch how Example 9.9 could be solved if $x$ and $y$ were represented in decimal.

19. You may have noticed that all the examples in this section had only one final state. Is it generally true that for any Turing machine, there exists another one with only one final state that accepts the same language?

20. Definition 9.2 excludes the empty string from any language accepted by a Turing machine. Modify the definition so that languages that contain $\lambda$ may be accepted.

9.2 Combining Turing Machines for Complicated Tasks

We have shown explicitly how some important operations found in all computers can be done on a Turing machine. Since, in digital computers, such primitive operations are the building blocks for more complex instructions, let us see how these basic operations can also be put together on a Turing machine. To demonstrate how Turing machines can be combined, we follow a practice common in programming. We start with a high-level description, then refine it successively until the program is in the actual language with which we are working. We can describe Turing machines several ways at a high level; block diagrams or pseudocode are the two approaches we will use

most frequently in subsequent discussions. In a block diagram, we encapsule computations in boxes whose function is described, but whose interior details are not shown. By using such boxes, we implicitly claim that they can actually be constructed. As a first example, we combine the machines in Examples 9.9 and 9.11.

Example 9.12

Design a Turing machine that computes the function

$$f(x, y) = \begin{cases} x + y, & \text{if } x \ge y, \\ 0, & \text{if } x < y. \end{cases}$$

For the sake of discussion, assume that $x$ and $y$ are positive integers in unary representation. The value zero will be represented by 0, with the rest of the tape blank.

The computation of $f(x, y)$ can be visualized at a high level by means of the diagram in Figure 9.5. The diagram shows that we first use a comparing machine, like that in Example 9.11, to determine whether or not $x \ge y$. If so, the comparer sends a start signal to the adder, which then computes $x + y$. If not, an erasing program is started that changes every 1 to a blank.

[Figure 9.5: block diagram of the comparer $C$, the adder $A$, and the eraser $E$ computing $f(x, y)$]

In subsequent discussions, we will often use such high-level, block-diagram representations of Turing machines. It is certainly quicker and clearer than the corresponding extensive set of $\delta$'s. Before we accept this high-level view, we must justify it. What, for example, is meant by saying that the comparer sends a start signal to the adder? There is nothing in Definition 9.1 that offers that possibility. Nevertheless, it can be done in a straightforward way.

The program for the comparer $C$ is written as suggested in Example 9.11, using a Turing machine having states indexed with $C$. For the adder, we use the idea in Example 9.9, with states indexed with $A$. For the eraser $E$, we construct a Turing machine having states indexed with $E$. The computations to be done by $C$ are

$$q_{C,0} w(x) 0 w(y) \vdash^{*} q_{A,0} w(x) 0 w(y), \quad \text{if } x \ge y,$$

and

$$q_{C,0} w(x) 0 w(y) \vdash^{*} q_{E,0} w(x) 0 w(y), \quad \text{if } x < y.$$

If we take $q_{A,0}$ and $q_{E,0}$ as the initial states of $A$ and $E$, respectively, we see that $C$ starts either $A$ or $E$. The computations performed by the adder will be

$$q_{A,0} w(x) 0 w(y) \vdash^{*} q_{A,f} w(x + y) 0,$$

and that of the eraser $E$ will be

$$q_{E,0} w(x) 0 w(y) \vdash^{*} q_{E,f} 0.$$

The result is a single Turing machine that combines the action of $C$, $A$, and $E$ as indicated in Figure 9.5.

Another useful, high-level view of Turing machines is one involving pseudocode. In computer programming, pseudocode is a way of outlining a computation using descriptive phrases whose meaning we claim to understand. While this description is not usable on the computer, we assume that we can translate it into the appropriate language when needed. One simple kind of pseudocode is exemplified by the idea of a macroinstruction, which is a single-statement shorthand for a sequence of lower level statements. We first define the macroinstruction in terms of the lower level language. We then use the macroinstruction in a program with the assumption that the relevant low-level code is substituted for each occurrence of the macroinstruction. This idea is very useful in Turing machine programming.

Example 9.13

Consider the macroinstruction

if $a$ then $q_j$ else $q_k$,

with the following interpretation. If the Turing machine reads an $a$, then regardless of its current state, it is to go into state $q_j$ without changing the tape content or moving the read-write head. If the symbol read is not an $a$, the machine is to go into state $q_k$ without changing anything.

To implement this macroinstruction requires several relatively obvious steps of a Turing machine.

$$\begin{aligned}
\delta(q_i, a) &= (q_{j0}, a, R) && \text{for all } q_i \in Q, \\
\delta(q_i, b) &= (q_{k0}, b, R) && \text{for all } q_i \in Q \text{ and all } b \in \Gamma - \{a\}, \\
\delta(q_{j0}, c) &= (q_j, c, L) && \text{for all } c \in \Gamma, \\
\delta(q_{k0}, c) &= (q_k, c, L) && \text{for all } c \in \Gamma.
\end{aligned}$$

The states $q_{j0}$ and $q_{k0}$ are new states, introduced to take care of complications arising from the fact that in a standard Turing machine the read-write

head changes position in each move. In the macroinstruction, we want to change the state, but leave the read-write head where it is. We let the head move right, but put the machine into a state $q_{j0}$ or $q_{k0}$. This indicates that a left move must be made before entering the desired state $q_j$ or $q_k$.

Going a step further, we can replace macroinstructions with subprograms. Normally, a macroinstruction is replaced by actual code at each occurrence, whereas a subprogram is a single piece of code that is invoked repeatedly whenever needed. Subprograms are fundamental to high-level programming languages, but they can also be used with Turing machines. To make this plausible, let us outline briefly how a Turing machine can be used as a subprogram that can be invoked repeatedly by another Turing machine. This requires a new feature: the ability to store information on the calling program's configuration so the configuration can be recreated on return from the subprogram. For example, say machine $A$ in state $q_i$ invokes machine $B$. When $B$ is finished, we would like to resume program $A$ in state $q_i$, with the read-write head (which may have moved during $B$'s operation) in its original place. At other times, $A$ may call $B$ from state $q_j$, in which case control should return to this state. To solve the control transfer problem, we must be able to pass information from $A$ to $B$ and vice versa, be able to recreate $A$'s configuration when it recovers control from $B$, and to assure that the temporarily suspended computations of $A$ are not affected by the execution of $B$. To solve this, we can divide the tape into several regions as shown in Figure 9.6.

[Figure 9.6: tape divided by region separators into the workspace for $A$, the region $T$, and the workspace for $B$]

Before $A$ calls $B$, it writes the information needed by $B$ (e.g., $A$'s current state, the arguments for $B$) on the tape in some region $T$. $A$ then passes control to $B$ by making a transition to the start state of $B$. After transfer, $B$ will use $T$ to find its input. The workspace for $B$ is separate from $T$ and from the workspace for $A$, so no interference can occur. When $B$ is finished, it will return relevant results to region $T$, where $A$ will expect to find it. In this way, the two programs can interact in the required fashion. Note that this is very similar to what actually happens in a real computer when a subprogram is called.
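The substitution of low-level code for the macroinstruction "if $a$ then $q_j$ else $q_k$" can be carried out mechanically. The sketch below is one way to generate the four families of transitions, under stated assumptions: the function names `expand_if_macro` and `two_moves`, the `"_0"` suffix for the auxiliary states, and the convention that `states` lists the states from which the macroinstruction may be invoked are all hypothetical choices made for this illustration.

```python
def expand_if_macro(states, gamma, a, q_j, q_k):
    """Transitions for 'if a then q_j else q_k', usable from each state in `states`.

    The auxiliary state names q_j + "_0" and q_k + "_0" are assumed
    not to clash with any existing state name.
    """
    q_j0, q_k0 = q_j + "_0", q_k + "_0"
    delta = {}
    for q in states:
        for b in gamma:
            # Move right, remembering in the state whether an a was read.
            delta[(q, b)] = (q_j0 if b == a else q_k0, b, "R")
    for c in gamma:
        # Step back left without changing the tape, entering the target state.
        delta[(q_j0, c)] = (q_j, c, "L")
        delta[(q_k0, c)] = (q_k, c, "L")
    return delta


def two_moves(delta, state, tape, head):
    """Apply exactly two moves; `tape` is a list with a cell to the right of the head."""
    for _ in range(2):
        state, tape[head], move = delta[(state, tape[head])]
        head += 1 if move == "R" else -1
    return state, tape, head


delta = expand_if_macro({"p", "r"}, {"a", "b", "_"}, "a", "qj", "qk")
print(two_moves(delta, "p", list("ba_"), 1))  # ('qj', ['b', 'a', '_'], 1)
print(two_moves(delta, "r", list("ba_"), 0))  # ('qk', ['b', 'a', '_'], 0)
```

In both calls the tape and the head position are unchanged after the two moves, and only the state differs, which is the behavior the macroinstruction is meant to have.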

We can now program Turing machines in pseudocode, provided that we know (in theory at least) how to translate this pseudocode into an actual Turing machine program.

Example 9.14

Design a Turing machine that multiplies two positive integers in unary notation.

A multiplication machine can be constructed by combining the ideas we encountered in adding and copying. Let us assume that the initial and final tape contents are to be as indicated in Figure 9.7. The process of multiplication can then be visualized as a repeated copying of the multiplicand $y$ for each 1 in the multiplier $x$, whereby the string $y$ is added the appropriate number of times to the partially computed product. The following pseudocode shows the main steps of the process.

[Figure 9.7: initial and final tape contents for the multiplication machine]

1. Repeat the following steps until $x$ contains no more 1's.
   Find a 1 in $x$ and replace it with another symbol $a$.
   Replace the leftmost 0 by $0y$.
2. Replace all $a$'s with 1's.

Although this pseudocode is sketchy, the idea is simple enough that there should be no doubt that it can be done.

In spite of the descriptive nature of these examples, it is not too farfetched to conjecture that Turing machines, while rather primitive in principle, can be combined in many ways to make them quite powerful. Our examples were not general and detailed enough for us to claim that we have proved anything, but it should be plausible at this point that Turing machines can do some quite complicated things.

EXERCISES

1. Write out the complete solution to Example 9.8.

2. Establish a convention for representing positive and negative integers in unary notation. With your convention, sketch the construction of a subtracter for computing $x - y$.

3. Using adders, subtracters, comparers, copiers, or multipliers, draw block diagrams for Turing machines that compute the functions

(a) $f(n) = n(n + 1)$,
(b) $f(n) = n^3$,
(c) $f(n) = 2^n$,
(d) $f(n) = n!$,
(e) $f(n) = n^{n!}$,

for all positive integers $n$.

4. Use a block diagram to sketch the implementation of a function $f$ defined for all $w_1, w_2, w_3 \in \{1\}^{+}$ by

$$f(w_1, w_2, w_3) = i,$$

where $i$ is such that $|w_i| = \max(|w_1|, |w_2|, |w_3|)$ if no two $w$ have the same length, and $i = 0$ otherwise.

5. Provide a "high-level" description for Turing machines that accept the following languages on $\{a, b\}$. For each problem, define a set of appropriate macroinstructions that you feel are reasonably easy to implement. Then use them for the solution.

(a) $L = \{ww^R\}$
(b) $L = \{w_1 w_2 : w_1 \ne w_2, |w_1| = |w_2|\}$
(c) The complement of the language in part (a)
(d) $L = \{a^n b^m : m = n^2, n \ge 1\}$
(e) $L = \{a^n : n \text{ is a prime number}\}$

6. Suggest a method for representing rational numbers on a Turing machine, then sketch a method for adding and subtracting such numbers.

7. Sketch the construction of a Turing machine that can perform the addition and multiplication of positive integers $x$ and $y$ given in the usual decimal notation.

8. Give an implementation of the macroinstruction

searchright $(a, q_i, q_j)$,

which indicates that the machine is to search its tape to the right of the current position for the first occurrence of the symbol $a$. If an $a$ is encountered before a blank, the machine is to go into state $q_i$, otherwise it is to go into state $q_j$.

9. Use the macroinstruction in the previous exercise to design a Turing machine on $\Sigma = \{a, b\}$ that accepts the language $L(ab^{*}ab^{*}a)$.

10. Use the macroinstruction searchright in Exercise 8 to create a Turing machine program that replaces the symbol immediately to the left of the leftmost $a$ by a blank. If the input contains no $a$, replace the rightmost nonblank symbol by a $b$.

9.3 Turing's Thesis

The preceding discussion not only shows how a Turing machine can be constructed from simpler parts, but also illustrates a negative aspect of working with such low-level automata. While it takes very little imagination or ingenuity to translate a block diagram or pseudocode into the corresponding Turing machine program, actually doing it is time consuming, error prone, and adds little to our understanding. The instruction set of a Turing machine is so restricted that any argument, solution, or proof for a nontrivial problem is quite tedious.

We now face a dilemma: we want to claim that Turing machines can perform not only the simple operations for which we have provided explicit and detailed programs, but more complex processes as well, describable by block diagrams or pseudocode. To defend such claims against challenge, we should show the relevant programs explicitly. But doing so is unpleasant and distracting, and ought to be avoided if possible. Somehow, we would like to find a way of carrying out a reasonably rigorous discussion of Turing machines without having to write lengthy, low-level code. There is unfortunately no completely satisfactory way of getting out of the predicament; the best we can do is to reach a reasonable compromise. To see how we might achieve such a compromise, we turn to a somewhat philosophical issue.

We can draw some simple conclusions from the examples in the previous section. The first is that Turing machines appear to be more powerful than pushdown automata (for a comment on this, see Exercise 2 at the end of this section). In Example 9.8, we sketched the construction of a Turing machine for a language which is not context-free and for which, consequently, no pushdown automaton exists. Examples 9.9, 9.10, and 9.11 show that Turing machines can do some simple arithmetic operations, perform string manipulations, and make some simple comparisons. The discussion also illustrates how primitive operations can be combined to solve more complex problems, how several Turing machines can be composed, and how one program can act as a subprogram for another. Since very complex operations can be built this way, we might suspect that a Turing machine begins to approach a typical computer in power.

Suppose we were to make the conjecture that, in some sense, Turing machines are equal in power to a typical digital computer? How might we

defend or refute such a hypothesis? To defend it, we could take a sequence of increasingly more difficult problems and show how they are solved by some Turing machine. We might also take the machine language instruction set of a specific computer and design a Turing machine that can perform all the instructions in the set. This would undoubtedly tax our patience, but it ought to be possible in principle if our hypothesis is correct. Still, while every success in this direction would strengthen our conviction of the truth of the hypothesis, it would not lead to a proof. The difficulty lies in the fact that we don't know exactly what is meant by "a typical digital computer" and that we have no means for making a precise definition.

We can also approach the problem from the other side. We might try to find some procedure for which we can write a computer program, but for which we can show that no Turing machine can exist. If this were possible, we would have a basis for rejecting the hypothesis. But no one has yet been able to produce a counterexample; the fact that all such tries have been unsuccessful must be taken as circumstantial evidence that it cannot be done. Every indication is that Turing machines are in principle as powerful as any computer.

Arguments of this type led A. M. Turing and others in the mid-1930's to the celebrated conjecture called the Turing thesis. This hypothesis states that any computation that can be carried out by mechanical means can be performed by some Turing machine.

This is a sweeping statement, so it is important to keep in mind what Turing's thesis is. It is not something that can be proved. To do so, we would have to define precisely the term "mechanical means." This would require some other abstract model and leave us no further ahead than before. The Turing thesis is more properly viewed as a definition of what constitutes a mechanical computation: a computation is mechanical if and only if it can be performed by some Turing machine.

If we take this attitude and regard the Turing thesis simply as a definition, we raise the question as to whether this definition is sufficiently broad. Is it far-reaching enough to cover everything we now do (and conceivably might do in the future) with computers? An unequivocal "yes" is not possible, but the evidence in its favor is very strong. Some arguments for accepting the Turing thesis as the definition of a mechanical computation are

1. Anything that can be done on any existing digital computer can also be done by a Turing machine.
2. No one has yet been able to suggest a problem, solvable by what we intuitively consider an algorithm, for which a Turing machine program cannot be written.
3. Alternative models have been proposed for mechanical computation, but none of them are more powerful than the Turing machine model.

These arguments are circumstantial, and Turing's thesis cannot be proved by them. In spite of its plausibility, Turing's thesis is still an assumption. But viewing Turing's thesis simply as an arbitrary definition misses an important point. In some sense, Turing's thesis plays the same role in computer science as do the basic laws of physics and chemistry. Classical physics, for example, is based largely on Newton's laws of motion. Although we call them laws, they do not have logical necessity; rather, they are plausible models that explain much of the physical world. We accept them because the conclusions we draw from them agree with our experience and our observations. Such laws cannot be proved to be true, although they can possibly be invalidated. If an experimental result contradicts a conclusion based on the laws, we might begin to question their validity. On the other hand, repeated failure to invalidate a law strengthens our confidence in it. This is the situation for Turing's thesis, so we have some reason for considering it a basic law of computer science. The conclusions we draw from it agree with what we know about real computers, and so far, all attempts to invalidate it have failed. There is always the possibility that someone will come up with another definition that will account for some subtle situations not covered by Turing machines but which still fall within the range of our intuitive notion of mechanical computation. In such an eventuality, some of our subsequent discussions would have to be modified significantly. However, the likelihood of this happening seems to be very small.

Having accepted Turing's thesis, we are in a position to give a precise definition of an algorithm.

Definition 9.5

An algorithm for a function $f : D \to R$ is a Turing machine $M$, which given as input any $d \in D$ on its tape, eventually halts with the correct answer $f(d) \in R$ on its tape. Specifically, we can require that

$$q_0 d \vdash^{*}_{M} q_f f(d), \quad q_f \in F,$$

for all $d \in D$.

Identifying an algorithm with a Turing machine program allows us to prove rigorously such claims as "there exists an algorithm ..." or "there is no algorithm ...." However, to construct explicitly an algorithm for even relatively simple problems is a very lengthy undertaking. To avoid such unpleasant prospects, we can appeal to Turing's thesis and claim that anything we can do on any computer can also be done on a Turing machine.

Consequently, we could substitute "Pascal program" for "Turing machine" in Definition 9.5. This would ease the burden of exhibiting algorithms considerably. Actually, as we have already done, we will go one step further and accept verbal descriptions or block diagrams as algorithms on the assumption that we could write a Turing machine program for them if we were challenged to do so. This greatly simplifies the discussion, but it obviously leaves us open to criticism. While "Pascal program" is well defined, "clear verbal description" is not, and we are in danger of claiming the existence of nonexistent algorithms. But this danger is more than offset by the fact that we can keep the discussion simple and intuitively clear, and that we can give concise descriptions for some rather complex processes. The reader who has any doubt of the validity of these claims can dispel them by writing a suitable program in some programming language.

EXERCISES

** 1. Consider the set of machine language instructions for a computer of your choice. Sketch how the various instructions in this set could be carried out by a Turing machine.

2. In the above discussion, we stated at one point that Turing machines appear to be more powerful than pushdown automata. Since the tape of a Turing machine can always be made to behave like a stack, it would seem that we can actually claim that a Turing machine is more powerful. What important factor is not taken into account in this argument?

** 3. There are a number of enjoyable articles on Turing machines in the popular literature. A good one is a paper in Scientific American, May 1984, titled "Turing Machines," by J. E. Hopcroft. This paper talks about the ideas we have introduced here and also gives some of the historical context in which the work of Turing and others was done. Get a copy of this article and read it, then write a brief review of it.


Other Models of Turing Machines

Our definition of a standard Turing machine is not the only possible one; there are alternative definitions that could serve equally well. The conclusions we can draw about the power of a Turing machine are largely independent of the specific structure chosen for it. In this chapter we look at several variations, showing that the standard Turing machine is equivalent, in a sense we will define, to other, more complicated models.

If we accept Turing's thesis, we expect that complicating the standard Turing machine by giving it a more complex storage device will not have any effect on the power of the automaton. Any computation that can be performed on such a new arrangement will still fall under the category of a mechanical computation and, therefore, can be done by a standard model. It is nevertheless instructive to study more complex models, if for no other reason than that an explicit demonstration of the expected result will demonstrate the power of the Turing machine and thereby increase our confidence in Turing's thesis. Many variations on the basic model of Definition 9.1 are possible. For example, we can consider Turing machines with more than one tape or with tapes that extend in several dimensions.

We will consider variants that will be useful in subsequent discussions. We also look at nondeterministic Turing machines and show that they are no more powerful than deterministic ones. This is unexpected, since Turing's thesis covers only mechanical computations and does not address the clever guessing implicit in nondeterminism. Another issue that is not immediately resolved by Turing's thesis is that of one machine executing different programs at different times. This leads to the idea of a "reprogrammable" or "universal" Turing machine. Finally, in preparation for later chapters, we look at linear bounded automata. These are Turing machines that have an infinite tape, but that can make use of the tape only in a restricted way.

10.1 Minor Variations on the Turing Machine Theme

We first consider some relatively minor changes in Definition 9.1 and investigate whether these changes make any difference in the general concept. Whenever we change a definition, we introduce a new type of automata and raise the question whether these new automata are in any real sense different from those we have already encountered. What do we mean by an essential difference between one class of automata and another? Although there may be clear differences in their definitions, these differences may not have any interesting consequences. We have seen an example of this in the case of deterministic and nondeterministic finite automata. These have quite different definitions, but they are equivalent in the sense that they both are identified exactly with the family of regular languages. Extrapolating from this, we can define equivalence or nonequivalence for classes of automata in general.

Equivalence of Classes of Automata

Whenever we define equivalence for two automata or classes of automata, we must carefully state what is to be understood by this equivalence. For the rest of this chapter, we follow the precedence established for nfa's and dfa's and define equivalence with respect to the ability to accept languages.

Definition 10.1

Two automata are equivalent if they accept the same language. Consider two classes of automata $C_1$ and $C_2$. If for every automaton $M_1$ in $C_1$ there

is an automaton $M_2$ in $C_2$ such that

$$L(M_1) = L(M_2),$$

we say that $C_2$ is at least as powerful as $C_1$. If the converse also holds and for every $M_2$ in $C_2$ there is an $M_1$ in $C_1$ such that $L(M_1) = L(M_2)$, we say that $C_1$ and $C_2$ are equivalent.

There are many ways to establish the equivalence of automata. The construction of Theorem 2.2 does this for dfa's and nfa's. For demonstrating the equivalence in connection with Turing's machines, we often use the important technique of simulation.

Let $M$ be an automaton. We say that another automaton $\widehat{M}$ can simulate a computation of $M$, if $\widehat{M}$ can mimic the computation of $M$ in the following manner. Let $d_0, d_1, \ldots$ be the sequence of instantaneous descriptions of the computation of $M$, that is

$$d_0 \vdash_M d_1 \vdash_M \cdots \vdash_M d_n \cdots.$$

Then $\widehat{M}$ simulates this computation if it carries out a computation analogous to that of $M$,

$$\widehat{d}_0 \vdash^{*}_{\widehat{M}} \widehat{d}_1 \vdash^{*}_{\widehat{M}} \cdots \vdash^{*}_{\widehat{M}} \widehat{d}_n \cdots,$$

where $\widehat{d}_0, \widehat{d}_1, \ldots$ are instantaneous descriptions, such that each of them is associated with a unique configuration of $M$. In other words, if we know the computation carried out by $\widehat{M}$, we can determine from it exactly what computations $M$ would have done, given the corresponding starting configuration.

Note that the simulation of a single move $d_i \vdash_M d_{i+1}$ of $M$ may involve several moves of $\widehat{M}$. The intermediate configurations in $\widehat{d}_i \vdash^{*}_{\widehat{M}} \widehat{d}_{i+1}$ may not correspond to any configuration of $M$, but this does not affect anything if we can tell which configurations of $\widehat{M}$ are relevant. As long as we can determine from the computation of $\widehat{M}$ what $M$ would have done, the simulation is proper. If $\widehat{M}$ can simulate every computation of $M$, we say that $\widehat{M}$ can simulate $M$. It should be clear that if $\widehat{M}$ can simulate $M$, then matters can be arranged so that $M$ and $\widehat{M}$ accept the same language, and the two automata are equivalent. To demonstrate the equivalence of two classes of automata, we show that for every machine in one class, there is a machine in the second class capable of simulating it.

Turing Machines with a Stay-Option

In our definition of a standard Turing machine, the read-write head must move either to the right or to the left. Sometimes it is convenient to provide

t}te simulating machine M does \ / the following. tve put into E the corresponding transitions : E ( a . it is obvious that arry starrtlard T\rring machine can be simulated hy one with a stay-ttptiorr. fio.d.) .n.{o. a.cJ : . . be a TlrringtnaTo show thtt txtrrverse. R ) .) . The sinulating machine can be constructed by M hv defining li as fcrllows: For ea. fbrallcel. u ): ( q i . It is reastlnallly otrvious that every cornputaLion of M llas a corresponding computation of M. b .rlateM. If the move of M does not irrvolve the stay-option. lhen M will make two moves: the first rewrites the syrnbol artd rttoves the read-write heacl right. If S is involvtxl in tlter rnove of M.S ) .F) chine with stay-option to be sirmrlated by a statrdard Ttrring machine M : ^\ //^ {O. F ).X.5} with the interpretation that 5 signilies rlo IIIoveIIIeItt of the read-write head. ' f .. d. Prooft Since a Ttrring machine with stay-option is <:ltlarly iln extension of the standard model. the secorrtl rrrovesthe read-write head left.n define a Tiuing rrtachitre with stay-option by replacing ci in De{initittn 9.b.R. \Qi . b . . the simllating machine perfbrms one nl()vc.1..^ 0 ((. we ca. Frrr cach S-transitiotr 6 ( q r .a ) : ( qj . Thus. h .tr or rt) . L o r R ) . o )( i j .1 by d:Qxf-8xfx{tr. c. For each rntlvc of M. Ieaving the tape t:ontents urtaltc:rcd.chtransition d ( q r .f. t . esscntially identical to the move to be simulated.262 Chopier l0 O'r'ntn Montls oF TURINGMacntnns a third option.tr. The class of Ttrring rnachines with stay-option is eqqivtrlent to the class of standard I\rring rnachines.. to have the read-write hcad stay itr place after rewriting the cell content.nd F. This option does not extcrxl the power of the automaton. we put into f E(A.IetM : (8.. so that M can simr. r. f . = (Qj .
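The construction in the proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: transition tables are plain dictionaries, the function name `remove_stay_moves` is hypothetical, and the intermediate state for a stay-move into state `p` is named `p + "_S"` on the assumption that no existing state already has that name.

```python
from typing import Dict, Set, Tuple

# A transition table maps (state, symbol) -> (next_state, written_symbol, move).
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def remove_stay_moves(delta: Delta, tape_alphabet: Set[str]) -> Delta:
    """Build a table that uses only L and R moves from one that may use S."""
    standard: Delta = {}
    for (q, a), (p, b, move) in delta.items():
        if move in ("L", "R"):
            # Ordinary moves are copied unchanged.
            standard[(q, a)] = (p, b, move)
        else:
            # A stay-move becomes "write and step right", then "step back left".
            helper = p + "_S"  # assumed not to clash with an existing state
            standard[(q, a)] = (helper, b, "R")
            for c in tape_alphabet:
                standard[(helper, c)] = (p, c, "L")
    return standard


# Small example: one stay-move and one ordinary move ("_" stands for the blank).
delta = {("q0", "a"): ("q1", "b", "S"), ("q1", "b"): ("q1", "b", "R")}
standard = remove_stay_moves(delta, {"a", "b", "_"})
```

Each stay-move costs the simulating table one extra state and one extra entry per tape symbol, which matches the two-move simulation described in the proof.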

Simulation is a standard technique for showing the equivalence of automata, and the formalism we have described makes it possible, as shown in the above theorem, to talk about the process precisely and prove theorems about equivalence. In our subsequent discussion, we use the notion of simulation frequently, but we generally make no attempt to describe everything in a rigorous and detailed way. Complete simulations with Turing machines are often cumbersome. To avoid this, we keep our discussion descriptive, rather than in theorem-proof form. The simulations are given only in broad outline, but it should not be hard to see how they can be made rigorous. The reader will find it instructive to sketch each simulation in some higher level language or in pseudocode.

Before introducing other models, we make one remark on the standard Turing machine. It is implicit in Definition 9.1 that each tape symbol can be a composite of characters rather than just a single one. This can be made more explicit by drawing an expanded version of Figure 9.1 (Figure 10.1), in which the tape symbols are triplets from some simpler alphabet.

[Figure 10.1: a tape divided into Track 1, Track 2, and Track 3.]

In the picture, we have divided each cell of the tape into three parts, called tracks, each containing one member of the triplet. Based on this visualization, such an automaton is sometimes called a Turing machine with multiple tracks, but such a view in no way extends Definition 9.1, since all we need to do is make $\Gamma$ an alphabet in which each symbol is composed of several parts.

However, other Turing machine models involve a change of definition, so the equivalence with the standard machine has to be demonstrated. Here we look at two such models, which are sometimes used as the standard definition. Some variants that are less common are explored in the exercises at the end of this section.

Turing Machines with Semi-Infinite Tape

Many authors do not consider the model in Figure 9.1 as standard, but use one with a tape that is unbounded only in one direction. We can visualize this as a tape that has a left boundary (Figure 10.2). This Turing machine is otherwise identical to our standard model, except that no left move is permitted when the read-write head is at the boundary.

It is not difficult to see that this restriction does not affect the power of the machine. To simulate a standard Turing machine $M$ by a machine $\hat{M}$ with a semi-infinite tape, we use the arrangement shown in Figure 10.3.

[Figure 10.3: Track 1 for right part of standard tape; Track 2 for left part of standard tape.]

The simulating machine $\hat{M}$ has a tape with two tracks. On the upper one, we keep the information to the right of some reference point on $M$'s tape. The reference point could be, for example, the position of the read-write head at the start of the computation. The lower track contains the left part of $M$'s tape in reverse order. $\hat{M}$ is programmed so that it will use information on the upper track only as long as $M$'s read-write head is to the right of the reference point, and work on the lower track as $M$ moves into the left part of its tape. The distinction can be made by partitioning the state set of $\hat{M}$ into two parts, say $Q_U$ and $Q_L$: the first to be used when working on the upper track, the second to be used on the lower one. Special end markers # are put on the left boundary of the tape to facilitate switching from one track to the other.

For example, assume that the machine to be simulated and the simulating machine are in the respective configurations shown in Figure 10.4 and that the move to be simulated is generated by

$$\delta(q_i, a) = (q_j, c, L).$$

[Figure 10.4: (a) Machine to be simulated. (b) Simulating machine.]

The simulating machine will first move via the transition

$$\hat{\delta}(\hat{q}_i, (a, b)) = (\hat{q}_j, (c, b), L),$$

where $\hat{q}_i \in Q_U$. Because $\hat{q}_i$ belongs to $Q_U$, only information in the upper track is considered at this point. Now, the simulating machine sees (#, #) in state $\hat{q}_j \in Q_U$. It next uses a transition

$$\hat{\delta}(\hat{q}_j, (\#, \#)) = (\hat{p}_j, (\#, \#), R),$$

with $\hat{p}_j \in Q_L$, putting it into the configuration shown in Figure 10.5. Now the machine is in a state from $Q_L$ and will work on the lower track. Further details of the simulation are straightforward.

[Figure 10.5: Sequence of configurations in simulating $\delta(q_i, a) = (q_j, c, L)$.]
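The two-track folding described above can be illustrated with a short Python sketch. The indexing convention is an assumption made for the example, not something fixed by the text: the reference point is taken as cell 0 of the two-way tape, cell 0 of the one-way tape holds the end markers, and the names `fold` and `fold_tape` are hypothetical.

```python
def fold(position: int) -> tuple:
    """Map a cell index of the two-way tape to (track, cell) on the one-way tape.

    Cell 0 of the one-way tape is reserved for the end markers (#, #).
    """
    if position >= 0:
        return ("upper", position + 1)   # right part, in its original order
    return ("lower", -position)          # left part, in reverse order


def fold_tape(cells: dict, blank: str = "_") -> list:
    """Build the two-track tape as a list of (upper, lower) symbol pairs."""
    width = max((abs(i) + 1 for i in cells), default=1)
    tape = [("#", "#")] + [(blank, blank)] * width
    for i, symbol in cells.items():
        track, cell = fold(i)
        upper, lower = tape[cell]
        tape[cell] = (symbol, lower) if track == "upper" else (upper, symbol)
    return tape


# Two-way tape contents ... x y [a] b ..., with the reference point at a.
folded = fold_tape({-2: "x", -1: "y", 0: "a", 1: "b"})
```

A left move from position 0 to position -1 corresponds to bouncing off the marker cell and continuing on the lower track, which is the switch from $Q_U$ to $Q_L$ in the example.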

The Off-Line Turing Machine

The general definition of an automaton in Chapter 1 contained an input file as well as temporary storage. In Definition 9.1 we discarded the input file for reasons of simplicity, claiming that this made no difference to the Turing machine concept. We now expand on this claim.

If we put the input file back into the picture, we get what is known as an off-line Turing machine. In such a machine, each move is governed by the internal state, what is currently read from the input file, and what is seen by the read-write head. A schematic representation of an off-line machine is shown in Figure 10.6. A formal definition of an off-line Turing machine is easily made, but we will leave this as an exercise.

[Figure 10.6: off-line machine with a read-only input file.]

What we want to do briefly is to indicate why the class of off-line Turing machines is equivalent to the class of standard machines.

First, the behavior of any standard Turing machine can be simulated by some off-line model. All that needs to be done by the simulating machine is to copy the input from the input file to the tape. Then it can proceed in the same way as the standard machine.

The simulation of an off-line machine $M$ by a standard machine $\hat{M}$ requires a lengthier description. A standard machine can simulate the computation of an off-line machine by using the four-track arrangement shown in Figure 10.7. In that picture, the tape contents shown represent the specific configuration of Figure 10.6. Each of the four tracks of $\hat{M}$ plays a specific role in the simulation. The first track has the input, the second marks the position at which the input is read, the third represents the tape of $M$, and the fourth shows the position of $M$'s read-write head.

The simulation of each move of $M$ requires a number of moves of $\hat{M}$. Starting from some standard position, say the left end, and with the relevant information marked by special end markers, $\hat{M}$ searches track 2 to locate the position at which the input file of $M$ is read. The symbol found in the corresponding cell on track 1 is remembered by putting the control unit of $\hat{M}$ into a state chosen for this purpose. Next, track 4 is searched for the position of the read-write head of $M$. With the remembered input and the symbol on track 3, we now know what $M$ is to do. This information is again remembered by $\hat{M}$ with an appropriate internal state. Next, all four tracks of $\hat{M}$'s tape are modified to reflect the move of $M$. Finally, the read-write head of $\hat{M}$ returns to the standard position for the simulation of the next move.

Exercises

1. Give a formal definition of a Turing machine with a semi-infinite tape. Then prove that the class of Turing machines with semi-infinite tape is equivalent to the class of standard Turing machines.

2. Give a formal definition of an off-line Turing machine.

3. Give convincing arguments that any language accepted by an off-line Turing machine is also accepted by some standard machine.

4. Consider a Turing machine that, on any particular move, can either change the tape symbol or move the read-write head, but not both.
(a) Give a formal definition of such a machine.
(b) Show that the class of such machines is equivalent to the class of standard Turing machines.

5. Consider a model of a Turing machine in which each move permits the read-write head to travel more than one cell to the left or right, the distance and direction of travel being one of the arguments of $\delta$. Give a precise definition of such an automaton and sketch a simulation of it by a standard Turing machine.

6. A nonerasing Turing machine is one that cannot change a nonblank symbol to a blank. This can be achieved by the restriction that if $\delta(q_i, a) = (q_j, \square, L \text{ or } R)$, then $a$ must be $\square$. Show that no generality is lost by making such a restriction.

7. Consider a Turing machine that cannot write blanks; that is, for all $\delta(q_i, a) = (q_j, b, L \text{ or } R)$, $b$ must be in $\Gamma - \{\square\}$. Show how such a machine can simulate a standard Turing machine.

8. Suppose we make the requirement that a Turing machine can halt only in a final state, that is, we ask that $\delta(q, a)$ be defined for all pairs $(q, a)$ with $a \in \Gamma$ and $q \notin F$. Does this restrict the power of the Turing machine?

9. Suppose we make the restriction that a Turing machine must always write a symbol different from the one it reads, that is, if $\delta(q_i, a) = (q_j, b, L \text{ or } R)$, then $a$ and $b$ must be different. Does this limitation reduce the power of the automaton?

10. Consider a version of the standard Turing machine in which transitions can depend not only on the cell directly under the read-write head, but also on the cells to the immediate right and left. Make a formal definition of such a machine, then sketch its simulation by a standard Turing machine.

11. Consider a Turing machine with a different decision process in which transitions are made if the current tape symbol is not one of a specified set. For example, $\delta(q_i, \{a, b\}) = (q_j, c, R)$ will allow the indicated move if the current tape symbol is neither $a$ nor $b$. Formalize this concept and show that this modification is equivalent to a standard Turing machine.

10.2 Turing Machines with More Complex Storage

The storage device of a standard Turing machine is so simple that one might think it possible to gain power by using more complicated storage devices. But this is not the case, as we now illustrate with two examples.

Multitape Turing Machines

A multitape Turing machine is a Turing machine with several tapes, each with its own independently controlled read-write head (Figure 10.8).

The formal definition of a multitape Turing machine goes beyond Definition 9.1, since it requires a modified transition function. Typically, we define an $n$-tape machine by $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$, where $Q$, $\Sigma$, $\Gamma$, $q_0$, $F$ are as in Definition 9.1, but where

$$\delta : Q \times \Gamma^n \to Q \times \Gamma^n \times \{L, R\}^n$$

specifies what happens on all the tapes. For example, if $n = 2$, with a current configuration shown in Figure 10.8, then

$$\delta(q_0, a, e) = (q_1, x, y, L, R)$$

is interpreted as follows. The transition rule can be applied only if the machine is in state $q_0$ and the first read-write head sees an $a$ and the second an $e$. The symbol on the first tape will then be replaced with an $x$ and its read-write head will move to the left. At the same time, the symbol on the second tape is rewritten as $y$ and the read-write head moves right. The control unit then changes its state to $q_1$ and the machine goes into the new configuration shown in Figure 10.9.

To show the equivalence between multitape and standard Turing machines, we argue that any given multitape Turing machine $M$ can be simulated by a standard Turing machine $\hat{M}$ and, conversely, that any standard Turing machine can be simulated by a multitape one. The second part of this claim needs no elaboration, since we can always elect to run a multitape machine with only one of its tapes doing useful work. The simulation of a multitape machine by one with a single tape is a little more complicated, but conceptually straightforward.
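The interpretation of a multitape transition given above can be made concrete with a small Python sketch of a single move. The representation is an assumption chosen for brevity: each tape is a dictionary from cell index to symbol, the table maps `(state, symbol_1, ..., symbol_n)` to `(new_state, writes, moves)`, and `step` is a hypothetical helper name.

```python
def step(delta, state, tapes, heads, blank="_"):
    """Apply one move of an n-tape machine, updating the tapes in place.

    Returns (new_state, new_heads), or None if no transition applies.
    """
    read = tuple(tape.get(head, blank) for tape, head in zip(tapes, heads))
    key = (state,) + read
    if key not in delta:
        return None  # no applicable transition: the machine halts
    new_state, writes, moves = delta[key]
    for tape, head, symbol in zip(tapes, heads, writes):
        tape[head] = symbol
    new_heads = [h + (1 if m == "R" else -1) for h, m in zip(heads, moves)]
    return new_state, new_heads


# The two-tape rule delta(q0, a, e) = (q1, x, y, L, R) from the text.
delta = {("q0", "a", "e"): ("q1", ("x", "y"), ("L", "R"))}
tapes = [{0: "a"}, {0: "e"}]
result = step(delta, "q0", tapes, [0, 0])
```

After the call, the first tape holds `x` with its head one cell to the left, and the second holds `y` with its head one cell to the right, as in the described move.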

Consider, for example, the two-tape machine in the configuration depicted in Figure 10.10. The simulating single-tape machine will have four tracks (Figure 10.11). The first track represents the contents of tape 1 of $M$. The nonblank part of the second track has all zeros, except for a single 1 marking the position of $M$'s read-write head. Tracks 3 and 4 play a similar role for tape 2 of $M$. Figure 10.11 makes it clear that, for the relevant configurations of $\hat{M}$ (that is, the ones that have the indicated form), there is a unique corresponding configuration of $M$.

The representation of a multitape machine by a single-tape machine is similar to that used in the simulation of an off-line machine. The actual steps in the simulation are also much the same, the only difference being that there are more tapes to consider. The outline given for the simulation of off-line machines carries over to this case with minor modifications and suggests a procedure by which the transition function $\hat{\delta}$ of $\hat{M}$ can be constructed from the transition function $\delta$ of $M$. While it is not difficult to make the construction precise, it takes a lot of writing. Certainly, the computations of $\hat{M}$ give the appearance of being lengthy and elaborate, but this has no bearing on the conclusion. Whatever can be done on $M$ can also be done on $\hat{M}$.

It is important to keep in mind the following point. When we claim that a Turing machine with multiple tapes is no more powerful than a standard one, we are making a statement only about what can be done by these machines, particularly, what languages can be accepted.

Example 10.1

Consider the language $\{a^n b^n\}$. In Example 9.7, we described a laborious method by which this language can be accepted by a Turing machine with one tape. Using a two-tape machine makes the job much easier. Assume that an initial string $a^n b^m$ is written on tape 1 at the beginning of the computation. We then read all the $a$'s, copying them onto tape 2. When we reach the end of the $a$'s, we match the $b$'s on tape 1 against the copied $a$'s on tape 2. This way, we can determine whether there are an equal number of $a$'s and $b$'s without repeated back-and-forth movement of the read-write head.
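The two-tape strategy of Example 10.1 can be mimicked in Python. This is a sketch of the idea rather than a transition table: the input string plays the role of tape 1, a list plays the role of tape 2, and the function name `accepts_anbn` is hypothetical. It assumes $n \geq 1$, so the empty string is rejected.

```python
def accepts_anbn(w: str) -> bool:
    """Single left-to-right pass over tape 1, using tape 2 as a scratch copy."""
    tape2 = []
    i = 0
    # Phase 1: read the a's on tape 1, copying each one onto tape 2.
    while i < len(w) and w[i] == "a":
        tape2.append("a")
        i += 1
    # Phase 2: match each b on tape 1 against one copied a on tape 2.
    while i < len(w) and w[i] == "b" and tape2:
        tape2.pop()
        i += 1
    # Accept only if the input is exhausted, no a is left unmatched,
    # and the input was not empty (assumption: n >= 1).
    return len(w) > 0 and i == len(w) and not tape2
```

The head on tape 1 never moves left, which is the point of the example: the second tape removes the need for repeated back-and-forth scanning.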

Remember that the various models of Turing machines are considered equivalent only with respect to their ability to do things, not with respect to ease of programming or any other efficiency measure we might consider. We will return to this important point in Chapter 14.

Multidimensional Turing Machines

A multidimensional Turing machine is one in which the tape can be viewed as extending infinitely in more than one dimension. A diagram of a two-dimensional Turing machine is shown in Figure 10.12.

The formal definition of a two-dimensional Turing machine involves a transition function $\delta$ of the form

$$\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R, U, D\},$$

where $U$ and $D$ specify movement of the read-write head up and down, respectively.

To simulate this machine on a standard Turing machine, we can use the two-track model depicted in Figure 10.13. First, we associate an ordering or address with the cells of the two-dimensional tape. This can be done in a number of ways, for example, in the two-dimensional fashion indicated in Figure 10.12. The two-track tape of the simulating machine will use one track to store cell contents and the other one to keep the associated address. In the scheme of Figure 10.12, the configuration in which cell (1, 2) contains $a$ and cell (10, -3) contains $b$ is shown in Figure 10.13. Note one complication: the cell address can involve arbitrarily large integers, so the address track cannot use a fixed-size field to store addresses. Instead, we must use a variable field-size arrangement, using some special symbols to delimit the fields, as shown in the picture.

[Figure 10.12: Two-dimensional address scheme.]
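The address idea can be sketched in Python as a sparse table keyed by cell address. Several details here are assumptions for illustration only: `U` is taken to increase the second coordinate, and the delimiters `#` and `$` in `serialize` are hypothetical stand-ins for the special field-delimiting symbols mentioned above.

```python
# Offsets for the four head movements; the orientation of U and D is assumed.
MOVES = {"L": (-1, 0), "R": (1, 0), "U": (0, 1), "D": (0, -1)}


def move(address, direction):
    """Compute the address of the cell reached by one head movement."""
    dx, dy = MOVES[direction]
    return (address[0] + dx, address[1] + dy)


def serialize(cells):
    """Lay the cells out linearly as variable-size 'address#symbol' fields."""
    return "$".join(f"{x},{y}#{symbol}" for (x, y), symbol in sorted(cells.items()))


# The configuration from the text: cell (1, 2) holds a, cell (10, -3) holds b.
cells = {(1, 2): "a", (10, -3): "b"}
linear = serialize(cells)
```

Because the fields are delimited rather than fixed in width, addresses of any size fit on the linear tape, which is the complication the text points out.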

Let us assume that, at the start of the simulation of each move, the read-write head of the two-dimensional machine $M$ and the read-write head of the simulating machine $\hat{M}$ are always on corresponding cells. To simulate a move, the simulating machine $\hat{M}$ first computes the address of the cell to which $M$ is to move. Using the two-dimensional address scheme, this is a simple computation. Once the address is computed, $\hat{M}$ finds the cell with this address on track 2 and then changes the cell contents to account for the move of $M$. Again, given $M$, there is a straightforward construction for $\hat{M}$.

The purpose of much of our discussion of Turing machines is to lend credence to Turing's thesis by showing how seemingly more complex situations can be simulated on a standard Turing machine. Unfortunately, detailed simulations are very tedious and conceptually uninteresting. In the exercises below, describe the simulations in just enough depth to show that the details can be worked out.

Exercises

1. A multihead Turing machine can be visualized as a Turing machine with a single tape and a single control unit but with multiple, independent read-write heads. Give a formal definition of a multihead Turing machine, and then show how such a machine can be simulated with a standard Turing machine.

2. Give a formal definition of a multihead-multitape Turing machine. Then show how such a machine can be simulated by a standard Turing machine.

3. Give a formal definition of a Turing machine with a single tape but multiple control units, each with a single read-write head. Show how such a machine can be simulated with a multitape machine.

*4. A queue automaton is an automaton in which the temporary storage is a queue. Assume that such a machine is an on-line machine, that is, it has no input file, with the string to be processed placed in the queue prior to the start of the computation. Give a formal definition of such an automaton, then investigate its power in relation to Turing machines.

*5. Show that for every Turing machine there exists an equivalent standard Turing machine with no more than six states.

6. Reduce the number of required states in Exercise 5 as far as you can (Hint: the smallest possible number is three).

*7. A counter is a stack with an alphabet of exactly two symbols, a stack start symbol and a counter symbol. Only the counter symbol can be put on the stack or removed from it. A counter automaton is a deterministic automaton with one or more counters as storage. Show that any Turing machine can be simulated using a counter automaton with four counters.

8. Show that every computation that can be done by a standard Turing machine can be done by a multitape machine with a stay-option and at most two states.

9. Write out a detailed program for the computation in Example 10.1.

10.3 Nondeterministic Turing Machines

While Turing's thesis makes it plausible that the specific tape structure is immaterial to the power of the Turing machine, the same cannot be said of nondeterminism. Since nondeterminism involves an element of choice and so has a nonmechanistic flavor, an appeal to Turing's thesis is inappropriate. We must look at the effect of nondeterminism in more detail if we want to argue that nondeterminism adds nothing to the power of a Turing machine. Again we resort to simulation, showing that nondeterministic behavior can be handled deterministically.

Definition 10.2

A nondeterministic Turing machine is an automaton as given by Definition 9.1, except that $\delta$ is now a function

$$\delta : Q \times \Gamma \to 2^{Q \times \Gamma \times \{L, R\}}.$$

As always when nondeterminism is involved, the range of $\delta$ is a set of possible transitions, any of which can be chosen by the machine.

Example 10.2

If a Turing machine has transitions specified by

$$\delta(q_0, a) = \{(q_1, b, R), (q_2, c, L)\},$$

it is nondeterministic. The moves

$$q_0 aaa \vdash b q_1 aa$$

and

$$q_0 aaa \vdash q_2 \square caa$$

are both possible.

Since it is not clear what role nondeterminism plays in computing functions, nondeterministic automata are usually viewed as accepters. A nondeterministic Turing machine is said to accept $w$ if there is any possible sequence of moves such that

$$q_0 w \vdash^* x_1 q_f x_2,$$

with $q_f \in F$. A nondeterministic machine may have moves available that lead to a nonfinal state or to an infinite loop. But, as always with nondeterminism, these alternatives are irrelevant; all we are interested in is the existence of some sequence of moves leading to acceptance.

To show that a nondeterministic Turing machine is no more powerful than a deterministic one, we need to provide a deterministic equivalent for the nondeterminism. We have already alluded to one. Nondeterminism can be viewed as a deterministic backtracking algorithm, and a deterministic machine can simulate a nondeterministic one as long as it can handle the bookkeeping involved in the backtracking. To see how this can be done simply, let us consider an alternative view of nondeterminism, one which is useful in many arguments: a nondeterministic machine can be seen as one that has the ability to replicate itself whenever necessary. When more than one move is possible, the machine produces as many replicas as needed and gives each replica the task of carrying out one of the alternatives. This view of nondeterminism may seem particularly nonmechanistic, since unlimited replication is certainly not within the power of present-day computers. Nevertheless, the process can be simulated by a deterministic Turing machine.

Consider a Turing machine with a two-dimensional tape (Figure 10.14). Each pair of horizontal tracks represents one machine, the top track containing the machine's tape, the bottom one indicating its internal state and the position of the read-write head. Whenever a new machine is to be created, two new tracks are started with the appropriate information. Figure 10.15 represents the initial configuration of the machine in Example 10.2 and its successor configurations. The simulating machine searches all active tracks; they are bracketed with special markers and so can always be found. It then carries out the indicated moves, activating new machines as needed. Quite a few details have to be resolved before we can claim to have a reasonable outline of the simulation, but we will leave this to the reader.

[Figure 10.14: pairs of tracks, each pair holding "Tape contents" and "Internal state and position of head".]

[Figure 10.15: Simulation of a nondeterministic move.]
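The replication view lends itself to a short Python sketch: keep a queue of active machine configurations and advance them in turn, which is a breadth-first search. This is an illustration under assumptions, not the two-dimensional tape construction itself: configurations are `(state, head, tape)` triples, `nd_accepts` is a hypothetical name, and the `max_steps` bound is added only so the sketch always returns (a `False` result therefore means "no acceptance found within the bound").

```python
from collections import deque


def nd_accepts(delta, start, finals, w, blank="_", max_steps=10_000):
    """Search all replicas breadth-first for one that reaches a final state."""
    initial = (start, 0, tuple(w) if w else (blank,))
    queue = deque([initial])     # the currently active replicas
    seen = {initial}
    steps = 0
    while queue and steps < max_steps:
        state, head, tape = queue.popleft()
        steps += 1
        if state in finals:
            return True
        # Create one replica per available alternative.
        for new_state, symbol, direction in delta.get((state, tape[head]), ()):
            cells = list(tape)
            cells[head] = symbol
            pos = head + (1 if direction == "R" else -1)
            if pos < 0:                  # extend the tape to the left
                cells.insert(0, blank)
                pos = 0
            elif pos == len(cells):      # extend the tape to the right
                cells.append(blank)
            config = (new_state, pos, tuple(cells))
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# The transitions of Example 10.2; treating q2 as final is an assumption here.
delta = {("q0", "a"): [("q1", "b", "R"), ("q2", "c", "L")]}
found = nd_accepts(delta, "q0", {"q2"}, "aaa")
```

Replicas with no applicable move simply drop out of the queue, mirroring how halted machines are removed from further consideration.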

Based on this simulation, our conclusion is that for every nondeterministic Turing machine there exists an equivalent deterministic one. Because of its importance, we state this formally.

Theorem 10.2

The class of deterministic Turing machines and the class of nondeterministic Turing machines are equivalent.

Proof: Use the construction suggested above to show that any nondeterministic Turing machine can be simulated by a deterministic one.

Exercises

1. Discuss in detail the simulation of a nondeterministic Turing machine by a deterministic one. Indicate explicitly how new machines are created, how active machines are identified, and how machines that halt are removed from further consideration.

2. Show how a two-dimensional nondeterministic Turing machine can be simulated by a deterministic machine.

3. Write a program for a nondeterministic Turing machine that accepts the language

$$L = \{ww : w \in \{a, b\}^+\}.$$

Contrast this with a deterministic solution.

4. Outline how one would write a program for a nondeterministic Turing machine to accept the language

$$L = \{ww^R w : w \in \{a, b\}^+\}.$$

5. Write a simple program for a nondeterministic Turing machine that accepts the language

$$L = \{xww^R y : x, y, w \in \{a, b\}^+, |x| \geq |y|\}.$$

How would you solve this problem deterministically?

6. Design a nondeterministic Turing machine that accepts the language

$$L = \{a^n : n \text{ is not a prime number}\}.$$

*7. A two-stack automaton is a nondeterministic pushdown automaton with two independent stacks. To define such an automaton, we modify Definition 7.1 so that

$$\delta : Q \times (\Sigma \cup \{\lambda\}) \times \Gamma \times \Gamma \to \text{finite subsets of } Q \times \Gamma^* \times \Gamma^*.$$

A move depends on the tops of the two stacks and results in new values being pushed on these two stacks. Show that the class of two-stack automata is equivalent to the class of Turing machines.

10.4 A Universal Turing Machine

Consider the following argument against Turing's thesis: "A Turing machine as presented in Definition 9.1 is a special purpose computer. Once $\delta$ is defined, the machine is restricted to carrying out one particular type of computation. Digital computers, on the other hand, are general purpose machines that can be programmed to do different jobs at different times. Consequently, Turing machines cannot be considered equivalent to general purpose digital computers."

This objection can be overcome by designing a reprogrammable Turing machine, called a universal Turing machine. A universal Turing machine $M_u$ is an automaton that, given as input the description of any Turing machine $M$ and a string $w$, can simulate the computation of $M$ on $w$. To construct such an $M_u$, we first choose a standard way of describing Turing machines. We may, without loss of generality, assume that

$$Q = \{q_1, q_2, \ldots, q_n\},$$

with $q_1$ the initial state, $q_2$ the single final state, and

$$\Gamma = \{a_1, a_2, \ldots, a_m\},$$

where $a_1$ represents the blank. We then select an encoding in which $q_1$ is represented by 1, $q_2$ is represented by 11, and so on. Similarly, $a_1$ is encoded as 1, $a_2$ as 11, etc. The symbol 0 will be used as a separator between the 1's. With the initial and final state and the blank defined by this convention, any Turing machine can be described completely with $\delta$ only. The transition function is encoded according to this scheme, with the arguments and result in some prescribed sequence. For example, $\delta(q_1, a_2) = (q_2, a_3, L)$ might appear as

$$\cdots 10110110111010 \cdots.$$
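The unary encoding can be sketched in Python. The field order (state, symbol, new state, new symbol, direction) and the code 1 for $L$ follow from the example string above; the code 11 for $R$ and the function names are assumptions made for this sketch.

```python
def encode_transition(i, j, k, l, move):
    """Encode delta(q_i, a_j) = (q_k, a_l, move) in unary with 0 as separator."""
    direction = {"L": 1, "R": 2}[move]   # assumed: L -> 1, R -> 11
    fields = [i, j, k, l, direction]
    return "".join("1" * n + "0" for n in fields)


def decode_transition(code):
    """Recover (i, j, k, l, move) from one encoded transition."""
    i, j, k, l, d = (len(field) for field in code.split("0")[:5])
    return i, j, k, l, ("L" if d == 1 else "R")


# delta(q1, a2) = (q2, a3, L), the example from the text.
code = encode_transition(1, 2, 2, 3, "L")
```

Here `code` is the string `10110110111010`, and `decode_transition(code)` returns the original indices, which illustrates that the encoding can be read back uniquely.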

It follows from this that any Turing machine has a finite encoding as a string on $\{0, 1\}^+$, and that, given any encoding of $M$, we can decode it uniquely. Some strings will not represent any Turing machine (e.g., the string 00011), but we can easily spot these, so they are of no concern.

A universal Turing machine $M_u$ then has an input alphabet that includes $\{0, 1\}$ and the structure of a multitape machine, as shown in Figure 10.16.

[Figure 10.16: control unit of $M_u$ with tapes holding the description of $M$, the tape contents of $M$, and the internal state of $M$.]

For any input $M$ and $w$, tape 1 will keep an encoded definition of $M$. Tape 2 will contain the tape contents of $M$, and tape 3 the internal state of $M$. $M_u$ looks first at the contents of tapes 2 and 3 to determine the configuration of $M$. It then consults tape 1 to see what $M$ would do in this configuration. Finally, tapes 2 and 3 will be modified to reflect the result of the move.

It is within reason to construct an actual universal Turing machine (see, for example, Denning, Dennis, and Qualitz 1978), but the process is uninteresting. We prefer instead to appeal to Turing's hypothesis. The implementation clearly can be done using some programming language; in fact, the program suggested in Exercise 1, Section 9.2 is a realization of a universal Turing machine in a higher level language. Therefore, we expect that it can also be done by a standard Turing machine. We are then justified in claiming the existence of a Turing machine that, given any program, can carry out the computations specified by that program and that is therefore a proper model for a general purpose computer.

The observation that every Turing machine can be represented by a string of 0's and 1's has important implications. But before we explore these implications, we need to review some results from set theory.

Some sets are finite, but most of the interesting sets (and languages) are infinite. For infinite sets, we distinguish between sets that are countable and sets that are uncountable. A set is said to be countable if its elements can be put into a one-to-one correspondence with the positive integers. By this we mean that the elements of the set can be written in some order, say, $x_1, x_2, x_3, \ldots$, so that every element of the set has some finite index.

For example, the set of all even integers can be written in the order $0, 2, 4, \ldots$. Since any positive integer $2n$ occurs in position $n + 1$, the set is countable. This should not be too surprising, but there are more complicated examples, some of which may seem counterintuitive. Take the set of all quotients of the form $p/q$, where $p$ and $q$ are positive integers. How should we order this set to show that it is countable? We cannot write

$$\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots$$

because then $\frac{2}{3}$ would never appear. This does not imply that the set is uncountable; in this case, there is a clever way of ordering the set to show that it is in fact countable. Look at the scheme depicted in Figure 10.17, and write down the elements in the order encountered following the arrows. This gives us

$$\frac{1}{1}, \frac{1}{2}, \frac{2}{1}, \frac{1}{3}, \frac{2}{2}, \ldots.$$

Here the element $\frac{2}{3}$ occurs in the eighth place, and every element has some place in the sequence. The set is therefore countable.

We see from this example that we can prove that a set is countable if we can produce a method by which its elements can be written in some sequence. We call such a method an enumeration procedure. Since an enumeration procedure is some kind of mechanical process, we can use a Turing machine model to define it formally.

Definition 10.3

Let $S$ be a set of strings on some alphabet $\Sigma$. Then an enumeration procedure for $S$ is a Turing machine that can carry out the sequence of steps

$$q_0 \square \vdash^* q_s x_1 \# s_1 \vdash^* q_s x_2 \# s_2 \cdots,$$

with $x_i \in \Gamma^* - \{\#\}$, $s_i \in S$, in such a way that any $s$ in $S$ is produced in a finite number of steps. The state $q_s$ is a state signifying membership in $S$; that is, whenever $q_s$ is entered, the string following # must be in $S$.
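The diagonal ordering of the quotients $p/q$ discussed above can be written as a Python generator. This is a sketch assuming each diagonal $p + q = \text{const}$ is traversed with $p$ increasing, which reproduces the first five terms listed in the text; the name `quotients` is hypothetical, and quotients with equal value (such as 1/1 and 2/2) are listed separately, as in the text.

```python
from itertools import count, islice


def quotients():
    """Yield pairs (p, q), standing for p/q, along diagonals p + q = 2, 3, 4, ..."""
    for total in count(2):
        for p in range(1, total):
            yield (p, total - p)


first_eight = list(islice(quotients(), 8))
```

The eighth element produced is `(2, 3)`, matching the claim that 2/3 occurs in the eighth place. Every pair `(p, q)` lies on the diagonal `p + q`, so it is reached after finitely many steps.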

Strictly speaking, an enumeration procedure cannot be called an algorithm, since it will not terminate when $S$ is infinite. Nevertheless, it can be considered a meaningful process, because it produces well-defined and predictable results.

Not every set is countable. As we will see in the next chapter, there are some uncountable sets. But any set for which an enumeration procedure exists is countable because the enumeration gives the required sequence.

Example 10.3
Let $\Sigma = \{a, b, c\}$. We can show that $S = \Sigma^+$ is countable if we can find an enumeration procedure that produces its elements in some order, say in the order in which they would appear in a dictionary. However, the order used in dictionaries is not suitable without modification. In a dictionary, all words beginning with $a$ are listed before the string $b$. But when there are an infinite number of $a$ words, we will never reach $b$, thus violating the condition of Definition 10.3 that any given string be listed after a finite number of steps.

Instead, we can use a modified order, in which we take the length of the string as the first criterion, followed by an alphabetic ordering of all equal-length strings. This is an enumeration procedure that produces the sequence

$a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, aaa, \ldots$

As we will have several uses for such an ordering, we will call it the proper order.

An important consequence of the above discussion is that Turing machines are countable.

Theorem 10.3
The set of all Turing machines, although infinite, is countable.

Proof: We can encode each Turing machine using 0 and 1. With this encoding, we then construct the following enumeration procedure.

1. Generate the next string in $\{0, 1\}^+$ in proper order.
2. Check the generated string to see if it defines a Turing machine. If so, write it on the tape in the form required by Definition 10.3. If not, ignore the string.
3. Return to Step 1.

Since every Turing machine has a finite description, any specific machine will eventually be generated by this process.
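The proper order and the three-step procedure in the proof can be sketched in Python as follows. The predicate passed to `enumerate_machines` is a placeholder: a real one would check whether a binary string is a valid machine encoding, which depends on the encoding chosen and is not spelled out here. All names are assumptions made for this illustration.

```python
from itertools import count, islice, product


def proper_order(alphabet):
    """Yield all nonempty strings: shorter first, alphabetic within a length."""
    symbols = sorted(alphabet)
    for length in count(1):
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


def enumerate_machines(defines_machine):
    """Steps 1 to 3 of the proof: generate, check, output or ignore, repeat.

    defines_machine is a hypothetical predicate on strings over {0, 1}.
    """
    for candidate in proper_order("01"):
        if defines_machine(candidate):
            yield candidate


print(list(islice(proper_order("abc"), 13)))
```

Because each length contributes only finitely many strings, any given string is reached after a finite number of steps, which is what Definition 10.3 demands.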

The particular ordering of Turing machines depends on the encoding we use; if we use a different encoding, we must expect a different ordering. This is of no consequence, however, and shows that the ordering itself is unimportant. What matters is the existence of some ordering.

EXERCISES

1. Sketch an algorithm that examines a string in $\{0, 1\}^+$ to determine whether or not it represents an encoded Turing machine.
2. Give a complete encoding, using the suggested method, for the Turing machine with
   $\delta(q_1, a_1) = (q_1, a_1, R)$, $\delta(q_1, a_2) = (q_3, a_1, L)$, $\delta(q_3, a_1) = (q_2, a_2, L)$.
3. Sketch a Turing machine program that enumerates the set $\{0, 1\}^+$ in proper order.
4. What is the index of $0^i 1^j$ in Exercise 3?
5. Design a Turing machine that enumerates the following set in proper order: $L = \{a^n b^n : n \geq 1\}$.
6. For Example 10.3, find a function $f(w)$ that gives for each $w$ its index in the proper ordering.
7. Show that the set of all triplets $(i, j, k)$, with $i$, $j$, $k$ positive integers, is countable.
8. Suppose that $S_1$ and $S_2$ are countable sets. Show that then $S_1 \cup S_2$ and $S_1 \times S_2$ are also countable.
9. Show that the Cartesian product of a finite number of countable sets is countable.

10.5 Linear Bounded Automata

While it is not possible to extend the power of the standard Turing machine by complicating the tape structure, it is possible to limit it by restricting the way in which the tape can be used. We have already seen an example of this with pushdown automata. A pushdown automaton can be regarded as a nondeterministic Turing machine with a tape that is restricted to being used like a stack. We can also restrict the tape usage in other ways; for example, we might permit only a finite part of the tape to be used as work space. It can be shown that this leads us back to finite automata (see Exercise 3 at the end of this section), so we need not pursue this.

But there is a way of limiting tape use that leads to a more interesting situation: we allow the machine to use only that part of the tape occupied by the input. Thus, more space is available for long input strings than for short ones, generating another class of machines, the linear bounded automata (or lba).

A linear bounded automaton, like a standard Turing machine, has an unbounded tape, but how much of the tape can be used is a function of the input. In particular, we restrict the usable part of the tape to exactly the cells taken by the input. To enforce this, we can envision the input as bracketed by two special symbols, the left-end marker ([) and the right-end marker (]). For an input $w$, the initial configuration of the Turing machine is given by the instantaneous description $q_0[w]$. The end markers cannot be rewritten, and the read-write head cannot move to the left of [ or to the right of ]. We sometimes say that the read-write head "bounces" off the end markers.

Definition 10.4
A linear bounded automaton is a nondeterministic Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$, as in Definition 10.2, subject to the restriction that $\Sigma$ must contain two special symbols [ and ], such that $\delta(q_i, [)$ can contain only elements of the form $(q_j, [, R)$, and $\delta(q_i, ])$ can contain only elements of the form $(q_j, ], L)$.

Definition 10.5
A string $w$ is accepted by a linear bounded automaton if there is a possible sequence of moves

$q_0[w] \vdash^* [x_1 q_f x_2]$

for some $q_f \in F$, $x_1, x_2 \in \Gamma^*$. The language accepted by the lba is the set of all such accepted strings.

Note that in this definition a linear bounded automaton is assumed to be nondeterministic. This is not just a matter of convenience but essential to the discussion of lba's. While one can define deterministic lba's, it is not known whether they are equivalent to the nondeterministic version. For some exploration of this, see Exercise 8 at the end of this section.

Example 10.4
The language

$L = \{a^n b^n c^n : n \geq 1\}$

is accepted by some linear bounded automaton. This follows from the discussion in Example 9.8. The computation outlined there does not require space outside the original input, so it can be carried out by a linear bounded automaton.

Example 10.5
Find a linear bounded automaton that accepts the language

$L = \{a^{n!} : n \geq 0\}$.

One way to solve the problem is to divide the number of $a$'s successively by 2, 3, 4, ..., until we can either accept or reject the string. If the input is in $L$, eventually there will be a single $a$ left; if not, at some point a nonzero remainder will arise. We sketch the solution to point out one tacit implication of Definition 10.4. The tape of a linear bounded automaton may be multitrack, and the extra tracks can be used as scratch space. For this problem, we can use a two-track tape. The first track contains the number of $a$'s left during the process of division, and the second track contains the current divisor (Figure 10.18). The actual solution is fairly simple. Using the divisor on the second track, we divide the number of $a$'s on the first track, say by removing all symbols except those at multiples of the divisor. After this, we increment the divisor by one, and continue until we either find a nonzero remainder or are left with a single $a$.

[Figure 10.18]
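The successive-division idea of Example 10.5 can be checked with a short Python sketch. It is not a linear bounded automaton: the two tape tracks are replaced by two integers, `remaining` and `divisor`, and the function name is invented for this illustration. It only confirms that the divide-and-increment test picks out exactly the lengths $n!$.

```python
def accepts_factorial_length(w):
    """Return True if w consists of a's only and its length is n! for some n."""
    if not w or set(w) != {"a"}:
        return False
    remaining = len(w)   # plays the role of the first track
    divisor = 2          # plays the role of the second track
    while remaining > 1:
        if remaining % divisor != 0:
            return False          # nonzero remainder: reject
        remaining //= divisor     # keep only every divisor-th symbol
        divisor += 1
    return True                   # a single a is left: accept


print([n for n in range(1, 30) if accepts_factorial_length("a" * n)])
```

Neither number ever exceeds the length of the input, which reflects why scratch space bounded by the input suffices for this problem.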

The last two examples suggest that linear bounded automata are more powerful than pushdown automata, since neither of the languages is context-free. To prove such a conjecture, we still have to show that any context-free language can be accepted by a linear bounded automaton. We will do this later in a somewhat roundabout way; a more direct approach is suggested in Exercises 5 and 6 at the end of this section. It is not so easy to make a conjecture on the relation between Turing machines and linear bounded automata. Problems like Example 10.5 are invariably solvable by a linear bounded automaton, since an amount of scratch space proportional to the length of the input is available. In fact, it is quite difficult to come up with a concrete and explicitly defined language that cannot be accepted by any linear bounded automaton. In Chapter 11 we will show that the class of linear bounded automata is less powerful than the class of unrestricted Turing machines, but a demonstration of this requires a lot more work.

EXERCISES

1. Give details for the solution of Example 10.4.
2. Find a solution for Example 10.5 that does not require a second track as scratch space.
3. Consider an off-line Turing machine in which the input can be read only once, moving left to right, and not rewritten. On its work tape, it can use at most $n$ extra cells for work space, where $n$ is fixed for all inputs. Show that such a machine is equivalent to a finite automaton.
4. Find linear bounded automata for the following languages.
   (a) $L = \{a^n : n = m^2, m \geq 1\}$
   (b) $L = \{a^n : n$ is a prime number$\}$
   (c) $L = \{a^n : n$ is not a prime number$\}$
   (d) $L = \{ww : w \in \{a, b\}^+\}$
   (e) $L = \{w^n : w \in \{a, b\}^+, n \geq 1\}$
   (f) $L = \{www^R : w \in \{a, b\}^+\}$
5. Find an lba for the complement of the language in Example 10.5, assuming that $\Sigma = \{a, b\}$.
6. Show that for every context-free language there exists an accepting pda, such that the number of symbols in the stack never exceeds the length of the input string by more than one.
7. Use the observation in the above exercise to show that any context-free language not containing $\lambda$ is accepted by some linear bounded automaton.
8. To define a deterministic linear bounded automaton, we can use Definition 10.4, but require that the Turing machine be deterministic. Examine your solutions to Exercise 4. Are the solutions all deterministic linear bounded automata? If not, try to find solutions that are.


Chapter 11
A Hierarchy of Formal Languages and Automata

We now return our attention to our main interest, the study of formal languages. Our immediate goal will be to examine the languages associated with Turing machines and some of their restrictions. Because Turing machines can perform any kind of algorithmic computation, we expect to find that the family of languages associated with them is quite broad. It includes not only regular and context-free languages, but also the various examples we have encountered that lie outside these families. The nontrivial question is whether there are any languages that are not accepted by some Turing machine. We will answer this question first by showing that there are more languages than Turing machines, so that there must be some languages for which there are no Turing machines. The proof is short and elegant, but nonconstructive, and gives little insight into the problem. For this reason, we will establish the existence of languages not recognizable by Turing machines through more explicit examples that actually allow us to identify one such language. Another avenue of investigation will be to look at the relation between Turing machines and certain types of grammars and to establish a connection between these grammars and regular and context-free grammars. This leads to a hierarchy of grammars and through it to a method for classifying language families. Some set-theoretic diagrams illustrate the relationships between various language families clearly.

Strictly speaking, many of the arguments in this chapter are valid only for languages that do not include the empty string. This restriction arises from the fact that Turing machines, as we have defined them, cannot accept the empty string. To avoid having to rephrase the definition or having to add a repeated disclaimer, we make the tacit assumption that the languages discussed in this chapter, unless otherwise stated, do not contain $\lambda$. It is a trivial matter to restate everything so that $\lambda$ is included, but we will leave this to the reader.

11.1 Recursive and Recursively Enumerable Languages

We start with some terminology for the languages associated with Turing machines. In doing so, we must make the important distinction between languages for which there exists an accepting Turing machine and languages for which there exists a membership algorithm. Because a Turing machine does not necessarily halt on input that it does not accept, the first does not imply the second.

Definition 11.1
A language $L$ is said to be recursively enumerable if there exists a Turing machine that accepts it.

This definition implies only that there exists a Turing machine $M$, such that, for every $w \in L$,

$q_0 w \vdash^*_M x_1 q_f x_2,$

with $q_f$ a final state. The definition says nothing about what happens for $w$ not in $L$; it may be that the machine halts in a nonfinal state or that it never halts and goes into an infinite loop. We can be more demanding and ask that the machine tell us whether or not any given input is in its language.

rrrd sri on. We sirrrply cornpare the given irrput string agairrst successivestrirrgs gcriera.ted by the errurnera.'i irr the enurneratiorr.1.nd clo one stcp on ur3.nceis depictcxl in Figure 11.1 First move wl -j uj Secondmove Third move . we gerrrlrate?lr3a. After this. wc will eventually get a rna. Wc first construct anotlrrtr lhring rnac:hirre. Flom this.1 R.. cvery string in . If a..nenumeraticlrr pror:edure clxistsis recursively crurmerahle. As wltir:h is modifled so that it writes strings orr its tape only if thcy are in L. then there cxists an easily cclrrstnrctedenuttterirtion procedure.11.nguage rrlcursive if and only if there exists a rncrnbership algorithur for is it. the cornllrtation is perforrnetl irr a. We cannot use the erhoveargumerrt ir. the third step on [r1. It is easy to sec tha. Then we ltlt M generate u1 on ur2 irnd let M executt: one rnove on ur2.nenurrrerationprocedure lbr cvery recursively enumera'ble languagc is not as easy to srlrl. language is lrx:ursive. let us sa/ '{t1. when startecl with ttj ou its tirltt:.ECURsrvrrt AND R.t cvery language frrr which a. clifferent way.nguage-L.lsoa..1.te arrrl lt:t M execute one rrr()ve it. that gcrreratesall strings in X+ in proper orrler. 'r'hat thcrr: is a.tr:h. it is clca.r)2. say M. and the processr. Defitdtions ll. that follow rr. The order of ltrlrftlrma. may never halt rrrrrltirerefore never grlt to the strings irr r.ture of either recursive or recur$ively enrurrerallkllanguages.llyproduced by M.tiorrprocedure.L is cvcntrra.ro in X+.J ancl 11.ncunslvl:lv Exulannenlt Larvcuacns 277 I |.. 'I'hese delirritions a. Since any ur e tr is generatecl by -ffi ancl acceptecl by M in a.ttilr:h narnes to languagc I'nmiliesa.but shcd u2 Figure 11. beciruseif sorne rui is rrot in 1. the machirrc M. finite numlrer of steps. Sr4lposethat M is rr T\uing machine thir.r that M will never get into an infinitcr Ioop.2 givc: rr$ very little insiglrt irrto the na.t rleternines mernbtirship iu a recursivtr la.L. 
lf 'ur E . In olher word$.Iriion A languirge tr on x is said to be recursive if there exists a 'rhrirrg machine M that ar:rx:pts L and tlut halts on every .ssociated with rtrring rnachines. Trr make sure that this does not hrlppen. folklwod by the second ilr()vc on urr. Wc Iirst get fr to glenera.rFirqflir$.they becourethr: irrprrt to M.sit stands..2 . the second step on ur2. a Ia.nn be terminatecl. thesc strings are generatrlrl.
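The interleaving of moves described above (often called dovetailing) can be sketched in Python. A Turing machine is modeled here, purely as an assumption for illustration, by a function that returns an iterator: each `next` call performs one move and yields `True` once the input is accepted and `False` otherwise. The machine `toy_machine` is invented; it loops forever on odd-length inputs.

```python
from itertools import count, islice, product


def proper_order(alphabet):
    """All nonempty strings over the alphabet, in proper order."""
    for length in count(1):
        for letters in product(sorted(alphabet), repeat=length):
            yield "".join(letters)


def toy_machine(w):
    """Hypothetical machine: accepts even-length strings, never halts on others."""
    if len(w) % 2 == 1:
        while True:
            yield False      # one more move, still not accepted
    for _ in w:
        yield False          # one move per input symbol
    yield True               # accept


def dovetail(alphabet, start):
    """Enumerate the strings accepted by the machine modeled by `start`."""
    running = []
    for w in proper_order(alphabet):
        running.append((w, start(w)))        # generate the next string
        still_running = []
        for word, moves in running:          # one move on every pending string
            accepted = next(moves, None)     # None: halted without accepting
            if accepted is True:
                yield word
            elif accepted is False:
                still_running.append((word, moves))
        running = still_running


print(list(islice(dovetail("a", toy_machine), 3)))
```

Within a round, the sketch steps the older strings before the newest one, which differs from the order in the text but does not affect which strings are produced. The non-halting runs on odd-length strings never block the enumeration, because each receives only one move per round.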

Nor do these definitions tell us much about the relationships between these languages or their connection to the language families we have encountered before. We are therefore immediately faced with questions such as "Are there languages that are recursively enumerable but not recursive?" and "Are there languages, describable somehow, that are not recursively enumerable?" While we will be able to supply some answers, we will not be able to produce very explicit examples to illustrate these questions, especially the second one.

Languages That Are Not Recursively Enumerable

We can establish the existence of languages that are not recursively enumerable in a variety of ways. One is very short and uses a very fundamental and elegant result of mathematics.

Theorem 11.1
Let $S$ be an infinite countable set. Then its powerset $2^S$ is not countable.

Proof: Let $S = \{s_1, s_2, s_3, \ldots\}$. Then any element $t$ of $2^S$ can be represented by a sequence of 0's and 1's, with a 1 in position $i$ if and only if $s_i$ is in $t$. For example, the set $\{s_2, s_3, s_6\}$ is represented by $01100100\ldots$, while $\{s_1, s_3, s_5, \ldots\}$ is represented by $10101\ldots$. Clearly, any element of $2^S$ can be represented by such a sequence, and any such sequence represents a unique element of $2^S$. Suppose that $2^S$ were countable; then its elements could be written in some order, say $t_1, t_2, \ldots$, and we could enter these into a table, as shown in Figure 11.2. In this table, take the elements in the main diagonal, and complement each entry, that is, replace 0 with 1, and vice versa. In the example in Figure 11.2, the elements are $1100\ldots$, so we get $0011\ldots$ as the result. The new sequence represents some element of $2^S$, say $t_i$ for some $i$. But it cannot be $t_1$ because it differs from $t_1$ through $s_1$. For the same reason it cannot be $t_2$, $t_3$, or any other $t_i$. This contradiction creates a logical impasse that can be removed only by throwing out the assumption that $2^S$ is countable.

[Figure 11.2: table of the 0-1 sequences representing $t_1, t_2, \ldots$]
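The diagonal step of the proof can be tried on a finite corner of such a table. In the Python sketch below the table entries are invented; only the diagonal $1100$ is taken from the example in the text. A finite table cannot, of course, reproduce the infinite argument; the sketch only shows why the complemented diagonal disagrees with every listed row.

```python
def complement_diagonal(rows):
    """Flip each entry on the main diagonal of a square 0-1 table."""
    return [1 - rows[i][i] for i in range(len(rows))]


# Made-up 4 x 4 corner of a table; its main diagonal reads 1, 1, 0, 0.
table = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
]

new_row = complement_diagonal(table)
print(new_row)

# The new row differs from row i in position i, so it equals none of them.
print(all(new_row[i] != table[i][i] for i in range(len(table))))
```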

This kind of argument, because it involves a manipulation of the diagonal elements of a table, is called diagonalization. The technique is attributed to the mathematician G. F. Cantor, who used it to demonstrate that the set of real numbers is not countable. In the next few chapters, we will see a similar argument in several contexts. Theorem 11.1 is diagonalization in its purest form.

As an immediate consequence of this result, we can show that, in some sense, there are fewer Turing machines than there are languages, so that there must be some languages that are not recursively enumerable.

Theorem 11.2
For any nonempty $\Sigma$, there exist languages that are not recursively enumerable.

Proof: A language is a subset of $\Sigma^*$, and every such subset is a language. Therefore the set of all languages is exactly $2^{\Sigma^*}$. Since $\Sigma^*$ is infinite, Theorem 11.1 tells us that the set of all languages on $\Sigma$ is not countable. But the set of all Turing machines can be enumerated, so the set of all recursively enumerable languages is countable. By Exercise 16 at the end of this section, this implies that there must be some languages on $\Sigma$ that are not recursively enumerable.

This proof, although short and simple, is in many ways unsatisfying. It is completely nonconstructive and, while it tells us of the existence of some languages that are not recursively enumerable, it gives us no feeling at all for what these languages might look like. In the next set of results, we investigate the conclusion more explicitly.

A Language That Is Not Recursively Enumerable

Since every language that can be described in a direct algorithmic fashion can be accepted by a Turing machine and hence is recursively enumerable, the description of a language that is not recursively enumerable must be indirect. Nevertheless, it is possible. The argument involves a variation on the diagonalization theme.

Theorem 11.3
There exists a recursively enumerable language whose complement is not recursively enumerable.

Proof: Let $\Sigma = \{a\}$, and consider the set of all Turing machines with this input alphabet. By Theorem 10.3, this set is countable, so we can associate an order $M_1, M_2, \ldots$ with its elements. For each Turing machine $M_i$, there is an associated recursively enumerable language $L(M_i)$. Conversely, for each recursively enumerable language on $\Sigma$, there is some Turing machine that accepts it.

We now consider a new language $L$ defined as follows. For each $i \geq 1$, the string $a^i$ is in $L$ if and only if $a^i \in L(M_i)$. It is clear that the language $L$ is well defined, since the statement $a^i \in L(M_i)$, and hence $a^i \in L$, must either be true or false. Next, we consider the complement of $L$,

$\overline{L} = \{a^i : a^i \notin L(M_i)\}$,   (11.1)

which is also well defined but, as we will show, is not recursively enumerable.

We will show this by contradiction, starting from the assumption that $\overline{L}$ is recursively enumerable. If this is so, then there must be some Turing machine, say $M_k$, such that

$\overline{L} = L(M_k)$.   (11.2)

Consider the string $a^k$. Is it in $L$ or in $\overline{L}$? Suppose that $a^k \in \overline{L}$. By (11.2) this implies that

$a^k \in L(M_k)$.

But (11.1) now implies that

$a^k \notin \overline{L}$.

Conversely, if we assume that $a^k$ is in $L$, then $a^k \notin \overline{L}$ and (11.2) implies that

$a^k \notin L(M_k)$.

But then from (11.1) we get that

$a^k \in \overline{L}$.

The contradiction is inescapable, and we must conclude that our assumption that $\overline{L}$ is recursively enumerable is false.

To complete the proof of the theorem as stated, we must still show that $L$ is recursively enumerable. We can use for this the known enumeration procedure for Turing machines. Given $a^i$, we first find $i$ by counting the number of $a$'s. We then use the enumeration procedure for Turing machines to find $M_i$. Finally, we give its description along with $a^i$ to a universal Turing machine $M_u$ that simulates the action of $M_i$ on $a^i$. If $a^i$ is in $L$, the computation carried out by $M_u$ will eventually halt. The combined effect of this is a Turing machine that accepts every $a^i \in L$. Therefore, by Definition 11.1, $L$ is recursively enumerable.

The proof of this theorem explicitly exhibits, through (11.1), a well-defined language that is not recursively enumerable. This is not to say that there is an easy, intuitive interpretation of $\overline{L}$; it would be difficult to exhibit more than a few trivial members of this language. Nevertheless, $\overline{L}$ is properly defined.

A Language That Is Recursively Enumerable But Not Recursive

Next, we show there are some languages that are recursively enumerable but not recursive. Again, we need to do so in a rather roundabout way. We begin by establishing a subsidiary result.

Theorem 11.4
If a language $L$ and its complement $\overline{L}$ are both recursively enumerable, then both languages are recursive. If $L$ is recursive, then $\overline{L}$ is also recursive, and consequently both are recursively enumerable.

Proof: If $L$ and $\overline{L}$ are both recursively enumerable, then there exist Turing machines $M$ and $\widehat{M}$ that serve as enumeration procedures for $L$ and $\overline{L}$, respectively. The first will produce $w_1, w_2, \ldots$ in $L$, the second $\widehat{w}_1, \widehat{w}_2, \ldots$ in $\overline{L}$. Suppose now we are given any $w \in \Sigma^+$. We first let $M$ generate $w_1$ and compare it with $w$. If they are not the same, we let $\widehat{M}$ generate $\widehat{w}_1$ and compare again. If we need to continue, we next let $M$ generate $w_2$, then $\widehat{M}$ generate $\widehat{w}_2$, and so on. Any $w \in \Sigma^+$ will be generated either by $M$ or $\widehat{M}$, so eventually we will get a match. If the matching string is produced by $M$, $w$ belongs to $L$, otherwise it is in $\overline{L}$. The process is a membership algorithm for both $L$ and $\overline{L}$, so they are both recursive.

For the converse, assume that $L$ is recursive. Then there exists a membership algorithm for it. But this becomes a membership algorithm for $\overline{L}$ by simply complementing its conclusion. Therefore $\overline{L}$ is recursive. Since any recursive language is recursively enumerable, the proof is completed.

From this, we conclude directly that the family of recursively enumerable languages and the family of recursive languages are not identical. The language $L$ in Theorem 11.3 is in the first but not in the second family.

Theorem 11.5
There exists a recursively enumerable language that is not recursive; that is, the family of recursive languages is a proper subset of the family of recursively enumerable languages.

Proof: Consider the language $L$ of Theorem 11.3. This language is recursively enumerable, but its complement is not. Therefore, by Theorem 11.4, it is not recursive, giving us the looked-for example.
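The membership algorithm in the proof of Theorem 11.4 alternates between the two enumeration procedures until one of them produces the input. A minimal Python sketch, with generators standing in for the two Turing machines and an invented toy language (even-length strings of $a$'s, with the odd-length strings as its complement in $\{a\}^+$):

```python
from itertools import count


def enum_even():
    """Hypothetical enumerator for L = {a^n : n even, n >= 2}."""
    for n in count(1):
        yield "a" * (2 * n)


def enum_odd():
    """Hypothetical enumerator for the complement of L within {a}+."""
    for n in count(0):
        yield "a" * (2 * n + 1)


def is_member(w, enum_language, enum_complement):
    """Alternate between both enumerators until w shows up in one of them.

    Terminates only if w really is produced by one of the two enumerators,
    which is what the hypothesis of the theorem guarantees.
    """
    in_language, in_complement = enum_language(), enum_complement()
    while True:
        if next(in_language, None) == w:
            return True
        if next(in_complement, None) == w:
            return False


print(is_member("aaaa", enum_even, enum_odd))
print(is_member("aaa", enum_even, enum_odd))
```

Swapping the two enumerators gives the membership test for the complement, which mirrors the remark that complementing the conclusion turns one membership algorithm into the other.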

We conclude from this that there are indeed well-defined languages for which one cannot construct a membership algorithm.

EXERCISES

1. Prove that the set of all real numbers is not countable.
2. Prove that the set of all languages that are not recursively enumerable is not countable.
3. Let $L$ be a finite language. Show that then $L^+$ is recursively enumerable. Suggest an enumeration procedure for $L^+$.
4. Let $L$ be a context-free language. Show that $L^+$ is recursively enumerable and suggest an enumeration procedure for it.
5. Show that, if a language is not recursively enumerable, its complement cannot be recursive.
6. Show that the family of recursively enumerable languages is closed under union.
7. Is the family of recursively enumerable languages closed under intersection?
8. Show that the family of recursive languages is closed under union and intersection.
9. Show that the families of recursively enumerable and recursive languages are closed under reversal.
10. Is the family of recursive languages closed under concatenation?
11. Prove that the complement of a context-free language must be recursive.
12. Let $L_1$ be recursive and $L_2$ recursively enumerable. Show that $L_2 - L_1$ is necessarily recursively enumerable.
13. Suppose that $L$ is such that there exists a Turing machine that enumerates the elements of $L$ in proper order. Show that this means that $L$ is recursive.
14. If $L$ is recursive, is it necessarily true that $L^+$ is also recursive?
15. Choose a particular encoding for Turing machines, and with it, find one element of the language $\overline{L}$ in Theorem 11.3.
16. Let $S_1$ be a countable set, $S_2$ a set that is not countable, and $S_1 \subset S_2$. Show that $S_2$ must then contain an infinite number of elements that are not in $S_1$.
17. In Exercise 16, show that in fact $S_2 - S_1$ cannot be countable.
18. Why does the argument in Theorem 11.1 fail when $S$ is finite?
19. Show that the set of all irrational numbers is not countable.

11.2 Unrestricted Grammars

To investigate the connection between recursively enumerable languages and grammars, we return to the general definition of a grammar in Chapter 1. In Definition 1.1 the production rules were allowed to take any form, but various restrictions were later made to get specific grammar types. If we take the general form and impose no restrictions, we get unrestricted grammars.

Definition 11.3
A grammar $G = (V, T, S, P)$ is called unrestricted if all the productions are of the form

$u \to v,$

where $u$ is in $(V \cup T)^+$ and $v$ is in $(V \cup T)^*$.

In an unrestricted grammar, essentially no conditions are imposed on the productions. Any number of variables and terminals can be on the left or right, and these can occur in any order. There is only one restriction: $\lambda$ is not allowed as the left side of a production.

As we will see, unrestricted grammars are much more powerful than restricted forms like the regular and context-free grammars we have studied so far. In fact, unrestricted grammars correspond to the largest family of languages we can hope to recognize by mechanical means; that is, unrestricted grammars generate exactly the family of recursively enumerable languages. We show this in two parts; the first is quite straightforward, but the second involves a lengthy construction.

Theorem 11.6
Any language generated by an unrestricted grammar is recursively enumerable.

Proof: The grammar in effect defines a procedure for enumerating all strings in the language systematically. For example, we can list all $w$ in $L$ such that

$S \Rightarrow w,$

that is, $w$ is derived in one step. Since the set of the productions of the grammar is finite, there will be a finite number of such strings. Next, we list all $w$ in $L$ that can be derived in two steps,

$S \Rightarrow x \Rightarrow w,$

and so on. We can simulate these derivations on a Turing machine and, therefore, have an enumeration procedure for the language. Hence it is recursively enumerable.
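The step-by-step listing used in the proof of Theorem 11.6 can be sketched in Python. Sentential forms are plain strings with one character per symbol, and the grammar below, a commonly used one for $\{a^n b^n c^n : n \geq 1\}$, is an assumption chosen for illustration and does not come from the text. A real enumeration procedure would run without a step bound; `max_steps` is there only so the sketch terminates.

```python
# Illustrative grammar (assumed here) for {a^n b^n c^n : n >= 1}.
PRODUCTIONS = [
    ("S", "aSBC"), ("S", "aBC"),
    ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]


def enumerate_by_steps(productions, start, terminals, max_steps):
    """List terminal strings in order of the number of derivation steps."""
    level = {start}            # forms derivable in exactly k steps
    listed = []
    for _ in range(max_steps):
        next_level = set()
        for form in level:
            for lhs, rhs in productions:
                i = form.find(lhs)
                while i != -1:                 # rewrite every occurrence
                    next_level.add(form[:i] + rhs + form[i + len(lhs):])
                    i = form.find(lhs, i + 1)
        for form in sorted(next_level):
            if set(form) <= terminals and form not in listed:
                listed.append(form)
        level = next_level
    return listed


print(enumerate_by_steps(PRODUCTIONS, "S", set("abc"), 7))
```

Each level is finite because the set of productions is finite, which is the observation the proof rests on.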

This part of the correspondence between recursively enumerable languages and unrestricted grammars is not surprising. The grammar generates strings by a well-defined algorithmic process, so the derivations can be done on a Turing machine. To show the converse, we describe how any Turing machine can be mimicked by an unrestricted grammar.

We are given a Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ and want to produce a grammar $G$ such that $L(G) = L(M)$. The idea behind the construction is relatively simple, but its implementation becomes notationally cumbersome.

Since the computation of the Turing machine can be described by the sequence of instantaneous descriptions

$q_0 w \vdash^* x q_f y,$   (11.3)

we will try to arrange it so that the corresponding grammar has the property that

$q_0 w \Rightarrow^* x q_f y,$   (11.4)

if and only if (11.3) holds. This is not hard to do; what is more difficult to see is how to make the connection between (11.4) and what we really want, namely,

$S \Rightarrow^* w$

for all $w$ satisfying (11.3). To achieve this, we construct a grammar which, in broad outline, has the following properties:

1. $S$ can derive $q_0 w$ for all $w \in \Sigma^+$.
2. (11.4) is possible if and only if (11.3) holds.
3. When a string $x q_f y$ with $q_f \in F$ is generated, the grammar transforms this string into the original $w$.

The complete sequence of derivations is then

$S \Rightarrow^* q_0 w \Rightarrow^* x q_f y \Rightarrow^* w.$   (11.5)

The third step in the above derivation is the troublesome one. How can the grammar remember $w$ if it is modified during the second step? We solve this by encoding strings so that the coded version originally has two copies of $w$. The first is saved, while the second is used in the steps in (11.4). When a final configuration is entered, the grammar erases everything except the saved $w$.

To produce two copies of $w$ and to handle the state symbol of $M$ (which eventually has to be removed by the grammar), we introduce variables $V_{ab}$ and $V_{aib}$ for all $a \in \Sigma \cup \{\square\}$, $b \in \Gamma$, and all $i$ such that $q_i \in Q$. The variable $V_{ab}$ encodes the two symbols $a$ and $b$, while $V_{aib}$ encodes $a$ and $b$ as well as the state $q_i$.

The first step in (11.5) can be achieved (in the encoded form) by

$S \to V_{\square\square} S \mid S V_{\square\square} \mid T,$   (11.6)
$T \to T V_{aa} \mid V_{a0a},$   (11.7)

for all $a \in \Sigma$. These productions allow the grammar to generate an encoded version of any string $q_0 w$ with an arbitrary number of leading and trailing blanks.

For the second step, for each transition

$\delta(q_i, c) = (q_j, d, R)$

of $M$, we put into the grammar productions

$V_{aic} V_{pq} \to V_{ad} V_{pjq},$   (11.8)

for all $a, p \in \Sigma \cup \{\square\}$, $q \in \Gamma$. For each

$\delta(q_i, c) = (q_j, d, L)$

of $M$, we include in $G$

$V_{pq} V_{aic} \to V_{pjq} V_{ad},$   (11.9)

for all $a, p \in \Sigma \cup \{\square\}$, $q \in \Gamma$.

If in the second step, $M$ enters a final state, the grammar must then get rid of everything except $w$, which is saved in the first indices of the $V$'s. Therefore, for every $q_j \in F$, we include productions

$V_{ajb} \to a,$   (11.10)

for all $a \in \Sigma \cup \{\square\}$, $b \in \Gamma$.

for all $a \in \Sigma \cup \{\square\}$, $b \in \Gamma$. This creates the first terminal in the string, which then causes a rewriting in the rest by

$$c V_{ab} \to ca, \tag{11.11}$$

$$V_{ab} c \to ac, \tag{11.12}$$

for all $a, c \in \Sigma \cup \{\square\}$, $b \in \Gamma$. We need one more special production

$$\square \to \lambda. \tag{11.13}$$

This last production takes care of the case when $M$ moves outside that part of the tape occupied by the input $w$. To make things work in this case, we must first use (11.6) and (11.7) to generate

$$\square \cdots \square\, q_0 w\, \square \cdots \square,$$

representing all the tape region used. The extraneous blanks are removed at the end by (11.13).

The following example illustrates this complicated construction. Carefully check each step in the example to see what the various productions do and why they are needed.

Example 11.1

Let $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ be a Turing machine with

$$Q = \{q_0, q_1\}, \quad \Gamma = \{a, b, \square\}, \quad \Sigma = \{a, b\}, \quad F = \{q_1\},$$

and

$$\delta(q_0, a) = (q_0, a, R),$$

$$\delta(q_0, \square) = (q_1, \square, L).$$

This machine accepts $L(aa^*)$. Consider now the computation

$$q_0 aa\square \vdash a q_0 a\square \vdash aa q_0 \square \vdash a q_1 a \square, \tag{11.14}$$

which accepts the string $aa$. To derive this string with $G$, we first use rules of the form (11.6) and (11.7) to get the appropriate starting string,

$$S \Rightarrow S V_{\square\square} \Rightarrow T V_{\square\square} \Rightarrow T V_{aa} V_{\square\square} \Rightarrow V_{a0a} V_{aa} V_{\square\square}.$$

The last sentential form is the starting point for the part of the derivation that mimics the computation of the Turing machine. It contains the original

input $aa\square$ in the sequence of first indices and the initial instantaneous description $q_0 aa\square$ in the remaining indices. Next, we apply

$$V_{a0a} V_{aa} \to V_{aa} V_{a0a}$$

and

$$V_{a0a} V_{\square\square} \to V_{aa} V_{\square 0 \square},$$

which are specific instances of (11.8), and

$$V_{aa} V_{\square 0 \square} \to V_{a1a} V_{\square\square},$$

coming from (11.9). Then the next steps in the derivation are

$$V_{a0a} V_{aa} V_{\square\square} \Rightarrow V_{aa} V_{a0a} V_{\square\square} \Rightarrow V_{aa} V_{aa} V_{\square 0 \square} \Rightarrow V_{aa} V_{a1a} V_{\square\square}.$$

The sequence of first indices remains the same, always remembering the initial input. The sequence of the other indices is

$$0aa\square,\quad a0a\square,\quad aa0\square,\quad a1a\square,$$

which is equivalent to the sequence of instantaneous descriptions in (11.14). Finally, (11.10) to (11.13) are used in the last steps

$$V_{aa} V_{a1a} V_{\square\square} \Rightarrow V_{aa}\, a\, V_{\square\square} \Rightarrow V_{aa}\, a\, \square \Rightarrow aa\square \Rightarrow aa.$$

The construction described in (11.6) to (11.13) is the basis of the proof of the following result.

Theorem 11.7

For every recursively enumerable language $L$, there exists an unrestricted grammar $G$, such that $L = L(G)$.

Proof: The construction described guarantees that if

$$x \vdash y,$$

then

$$e(x) \Rightarrow e(y),$$

where $e(x)$ denotes the encoding of a string according to the given convention. By an induction on the number of steps, we can then show that

$$e(q_0 w) \Rightarrow^* e(y)$$

if and only if

$$q_0 w \vdash^* y.$$

We also must show that we can generate every possible starting configuration and that $w$ is properly reconstructed if and only if $M$ enters a final configuration. The details, which are not too difficult, are left as an exercise.

These two theorems establish what we set out to do. They show that the family of languages associated with unrestricted grammars is identical with the family of recursively enumerable languages.

EXERCISES

1. What language does the unrestricted grammar

$$S \to S_1 B,$$
$$S_1 \to a S_1 b,$$
$$bB \to bbbB,$$
$$a S_1 b \to aa,$$
$$B \to \lambda$$

derive?

2. What difficulties would arise if we allowed the empty string as the left side of a production in an unrestricted grammar?

3. Consider a variation on grammars in which the starting point for any derivation can be a finite set of strings, rather than a single variable. Formalize this concept, then investigate how such grammars relate to the unrestricted grammars we have used here.

4. In Example 11.1, prove that the constructed grammar cannot generate any sentence with a $b$ in it.

5. Give the details of the proof of Theorem 11.7.

6. Construct a Turing machine for $L(01(01)^*)$, then find an unrestricted grammar for it using the construction in Theorem 11.7. Give a derivation for $0101$ using the resulting grammar.

7. Show that for every unrestricted grammar there exists an equivalent unrestricted grammar, all of whose productions have the form

$$u \to v,$$

with $u, v \in (V \cup T)^+$ and $|u| \le |v|$, or

$$A \to \lambda,$$

with $A \in V$.

8. Show that the conclusion of Exercise 7 still holds if we add the further conditions $|u| \le 2$ and $|v| \le 2$.

9. Some authors give a definition of unrestricted grammars that is not quite the same as our Definition 11.3. In this alternate definition, the productions of an unrestricted grammar are required to be of the form

$$x \to y,$$

where

$$x \in (V \cup T)^* V (V \cup T)^*$$

and

$$y \in (V \cup T)^*.$$

The difference is that here the left side must have at least one variable. Show that this alternate definition is basically the same as the one we use, in the sense that for every grammar of one type, there is an equivalent grammar of the other type.

11.3 Context-Sensitive Grammars and Languages

Between the restricted, context-free grammars and the general, unrestricted grammars, a great variety of "somewhat restricted" grammars can be defined. Not all cases yield interesting results; among the ones that do, the context-sensitive grammars have received considerable attention. These grammars generate languages associated with a restricted class of Turing machines, linear bounded automata, which we introduced in Section 10.5.

Definition 11.4

A grammar $G = (V, T, S, P)$ is said to be context-sensitive if all productions are of the form

$$x \to y,$$

where $x, y \in (V \cup T)^+$ and

$$|x| \le |y|. \tag{11.15}$$
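Condition (11.15) is easy to check mechanically. The following sketch assumes that each grammar symbol is a single character and that productions are given as pairs of strings; the function name `is_context_sensitive` and the sample productions are illustrative only.

```python
def is_context_sensitive(productions):
    """Check Definition 11.4: each x -> y has nonempty x, y and |x| <= |y|."""
    return all(1 <= len(x) <= len(y) for x, y in productions)


# Illustrative production sets (single-character symbols assumed).
noncontracting = [("S", "abc"), ("S", "aAbc"), ("Ab", "bA"), ("Ac", "Bbcc")]
contracting = [("S", "aSb"), ("S", "")]  # S -> lambda shrinks the string

print(is_context_sensitive(noncontracting))
print(is_context_sensitive(contracting))
```

The check looks only at the lengths of the two sides, so it accepts every production of a context-free grammar whose right sides are nonempty.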

This definition shows clearly one aspect of this type of grammar; it is noncontracting, in the sense that the length of successive sentential forms can never decrease. It is less obvious why such grammars should be called context-sensitive, but it can be shown (see, for example, Salomaa 1973) that all such grammars can be rewritten in a normal form in which all productions are of the form

$$xAy \to xvy.$$

This is equivalent to saying that the production

$$A \to v$$

can be applied only in the situation where $A$ occurs in a context of the string $x$ on the left and the string $y$ on the right. While we use the terminology arising from this particular interpretation, the form itself is of little interest to us here, and we will rely entirely on Definition 11.4.

Context-Sensitive Languages and Linear Bounded Automata

As the terminology suggests, context-sensitive grammars are associated with a language family with the same name.

Definition 11.5

A language $L$ is said to be context-sensitive if there exists a context-sensitive grammar $G$, such that $L = L(G)$ or $L = L(G) \cup \{\lambda\}$.

In this definition, we reintroduce the empty string. Definition 11.4 implies that $x \to \lambda$ is not allowed, so that a context-sensitive grammar can never generate a language containing the empty string. Yet, every context-free language without $\lambda$ can be generated by a special case of a context-sensitive grammar, say by one in Chomsky or Greibach normal form, both of which satisfy the conditions of Definition 11.4. By including the empty string in the definition of a context-sensitive language (but not in the grammar), we can claim that the family of context-free languages is a subset of the family of context-sensitive languages.
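The noncontracting property has a practical consequence that is used later in this section: when searching for a derivation of $w$, any sentential form longer than $w$ can be discarded, so only finitely many forms need to be examined. The sketch below illustrates this with an exhaustive search. It assumes single-character symbols, the function name `derives` is hypothetical, and the grammar used is the one for $\{a^n b^n c^n\}$ from Example 11.2.

```python
def derives(productions, start, w):
    """Decide whether start =>* w, assuming the grammar is noncontracting."""
    if not w:
        return False  # a noncontracting grammar cannot derive the empty string
    seen = {start}
    stack = [start]
    while stack:
        form = stack.pop()
        if form == w:
            return True
        for lhs, rhs in productions:
            pos = form.find(lhs)
            while pos != -1:  # apply the production at every position
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # Forms never shrink, so anything longer than w is useless.
                if len(new) <= len(w) and new not in seen:
                    seen.add(new)
                    stack.append(new)
                pos = form.find(lhs, pos + 1)
    return False


# Grammar of Example 11.2 for {a^n b^n c^n : n >= 1}.
grammar = [("S", "abc"), ("S", "aAbc"), ("Ab", "bA"), ("Ac", "Bbcc"),
           ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA")]
print(derives(grammar, "S", "aaabbbccc"))
print(derives(grammar, "S", "aabbc"))
```

The search terminates because there are only finitely many strings of length at most $|w|$ over a finite alphabet, and each is visited at most once. It is a brute-force illustration, not an efficient parser.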

Example 11.2

The language $L = \{a^n b^n c^n : n \ge 1\}$ is a context-sensitive language. We show this by exhibiting a context-sensitive grammar for the language. One such grammar is

$$S \to abc \mid aAbc,$$
$$Ab \to bA,$$
$$Ac \to Bbcc,$$
$$bB \to Bb,$$
$$aB \to aa \mid aaA.$$

We can see how this works by looking at a derivation of $a^3 b^3 c^3$:

$$S \Rightarrow aAbc \Rightarrow abAc \Rightarrow abBbcc \Rightarrow aBbbcc \Rightarrow aaAbbcc \Rightarrow aabAbcc \Rightarrow aabbAcc \Rightarrow aabbBbccc \Rightarrow aabBbbccc \Rightarrow aaBbbbccc \Rightarrow aaabbbccc.$$

The solution effectively uses the variables $A$ and $B$ as messengers. An $A$ is created on the left, travels to the right to the first $c$, where it creates another $b$ and $c$. It then sends the messenger $B$ back to the left in order to create the corresponding $a$. The process is very similar to the way one might program a Turing machine to accept the language $L$.

Since the language in the above example is not context-free, we see that the family of context-free languages is a proper subset of the family of context-sensitive languages. Example 11.2 also shows that it is not an easy matter to find a context-sensitive grammar even for relatively simple examples. Often the solution is most easily obtained by starting with a Turing machine program, then finding an equivalent grammar for it. A few examples will show that, whenever the language is context-sensitive, the corresponding Turing machine has predictable space requirements; in particular, it can be viewed as a linear bounded automaton.

Theorem 11.8

For every context-sensitive language $L$ not including $\lambda$, there exists some linear bounded automaton $M$ such that $L = L(M)$.

Proof: If $L$ is context-sensitive, then there exists a context-sensitive grammar for $L - \{\lambda\}$. We show that derivations in this grammar can be simulated by a linear bounded automaton. The linear bounded automaton will have

two tracks, one containing the input string $w$, the other containing the sentential forms derived using $G$. A key point of this argument is that no possible sentential form can have length greater than $|w|$. Another point to notice is that a linear bounded automaton is, by definition, nondeterministic. This is necessary in the argument, since we can claim that the correct production can always be guessed and that no unproductive alternatives have to be pursued. Therefore, the computation described in Theorem 11.6 can be carried out without using space except that originally occupied by $w$; that is, it can be done by a linear bounded automaton.

Theorem 11.9

If a language $L$ is accepted by some linear bounded automaton $M$, then there exists a context-sensitive grammar that generates $L$.

Proof: The construction here is similar to that in Theorem 11.7. All productions generated in Theorem 11.7 are noncontracting except (11.13),

$$\square \to \lambda.$$

But this production can be omitted. It is necessary only when the Turing machine moves outside the bounds of the original input, which is not the case here. The grammar obtained by the construction without this unnecessary production is noncontracting, completing the argument.

Relation Between Recursive and Context-Sensitive Languages

Theorem 11.8 tells us that every context-sensitive language is accepted by some Turing machine and is therefore recursively enumerable. Theorem 11.10 follows easily from this.

Theorem 11.10

Every context-sensitive language $L$ is recursive.

Proof: Consider the context-sensitive language $L$ with an associated context-sensitive grammar $G$, and look at a derivation of $w$,

$$S \Rightarrow x_1 \Rightarrow x_2 \Rightarrow \cdots \Rightarrow x_n \Rightarrow w.$$

We can assume without any loss of generality that all sentential forms in a single derivation are different, that is, $x_i \ne x_j$ for all $i \ne j$. The crux of our argument is that the number of steps in any derivation is a bounded function of $|w|$. We know that

$$|x_j| \le |x_{j+1}|,$$

because $G$ is noncontracting. The only thing we need to add is that there exist some $m$, depending only on $G$ and $w$, such that

$$|x_j| < |x_{j+m}|$$

for all $j$, with $m = m(|w|)$ a bounded function of $|V \cup T|$ and $|w|$. This follows because the finiteness of $|V \cup T|$ implies that there are only a finite number of strings of a given length. Therefore, the length of a derivation of $w \in L$ is at most $|w|\, m(|w|)$.

This observation gives us immediately a membership algorithm for $L$. We check all derivations of length up to $|w|\, m(|w|)$. Since the set of productions of $G$ is finite, there are only a finite number of these. If any of them give $w$, then $w \in L$; otherwise it is not.

Theorem 11.11

There exists a recursive language that is not context-sensitive.

Proof: Consider the set of all context-sensitive grammars on $T = \{a, b\}$. We can use a convention in which each grammar has a variable set of the form

$$V = \{V_0, V_1, V_2, \ldots\}.$$

Every context-sensitive grammar is completely specified by its productions; we can think of them as written as a single string

$$x_1 \to y_1;\ x_2 \to y_2;\ \ldots;\ x_m \to y_m.$$

To this string we now apply the homomorphism

$$h(a) = 010,$$
$$h(b) = 01^2 0,$$
$$h(\to) = 01^3 0,$$
$$h(;) = 01^4 0,$$
$$h(V_i) = 01^{i+5} 0.$$

Thus, any context-sensitive grammar can be represented uniquely by a string from $L((011^*0)^*)$. Furthermore, the representation is invertible in the sense that, given any such string, there is at most one context-sensitive grammar corresponding to it.

Let us introduce a proper ordering on $\{0, 1\}^+$, so we can write strings in the order $w_1, w_2$, etc. A given string $w_j$ may not define a context-sensitive grammar; if it does, call the grammar $G_j$. Next, we define a language $L$ by

$$L = \{w_i : w_i \text{ defines a context-sensitive grammar } G_i \text{ and } w_i \notin L(G_i)\}.$$
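The encoding used in this proof can be made concrete. The sketch below is an illustration under assumed conventions: a variable $V_i$ is represented by the integer `i`, the arrow by the string `"->"`, the separator by `";"`, and the function names `h`, `encode`, and `decode` are chosen here for convenience. The decoder only recovers the symbol sequence; it does not check that the result is a well-formed context-sensitive grammar.

```python
import re

# Number of 1's in the block for each fixed symbol, following h above.
CODES = {"a": 1, "b": 2, "->": 3, ";": 4}


def h(symbol):
    """Image of one symbol under h; an integer i stands for the variable V_i."""
    ones = CODES[symbol] if symbol in CODES else symbol + 5
    return "0" + "1" * ones + "0"


def encode(productions):
    """Encode a list of (lhs, rhs) symbol lists as a string over {0, 1}."""
    symbols = []
    for k, (lhs, rhs) in enumerate(productions):
        if k > 0:
            symbols.append(";")
        symbols += list(lhs) + ["->"] + list(rhs)
    return "".join(h(s) for s in symbols)


def decode(bits):
    """Invert the encoding; return None if bits is not a sequence of blocks."""
    if not re.fullmatch(r"(01+0)+", bits):
        return None
    names = {ones: sym for sym, ones in CODES.items()}
    out = []
    for block in re.findall(r"01+0", bits):
        ones = len(block) - 2
        out.append(names.get(ones, ones - 5))
    return out


# Illustrative grammar: V0 -> a V0 b ; V0 -> a b
grammar = [([0], ["a", 0, "b"]), ([0], ["a", "b"])]
code = encode(grammar)
print(code)
print(decode(code))
```

Each block has the shape $01^k0$, so a string of blocks can be split in only one way; this is the invertibility property used in the proof.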

$L$ is well defined and is in fact recursive. To see this, we construct a membership algorithm. Given $w_i$, we check it to see if it defines a context-sensitive grammar $G_i$. If not, then $w_i \notin L$. If the string does define a grammar, then $L(G_i)$ is recursive, and we can use the membership algorithm of Theorem 11.10 to find out if $w_i \in L(G_i)$. If it is not, then $w_i$ belongs to $L$.

But $L$ is not context-sensitive. If it were, there would exist some $w_j$ such that $L = L(G_j)$. We can then ask if $w_j$ is in $L(G_j)$. If we assume that $w_j \in L(G_j)$, then by definition $w_j$ is not in $L$. But $L = L(G_j)$, so we have a contradiction. Conversely, if we assume that $w_j \notin L(G_j)$, then by definition $w_j \in L$ and we have another contradiction. We must therefore conclude that $L$ is not context-sensitive.

The result in Theorem 11.11 indicates that linear bounded automata are indeed less powerful than Turing machines, since they accept only a proper subset of the recursive languages. It follows from the same result that linear bounded automata are more powerful than pushdown automata. Context-free languages, being generated by context-free grammars, are a subset of the context-sensitive languages. As various examples show, they are a proper subset. Because of the essential equivalence of linear bounded automata and context-sensitive languages on one hand, and pushdown automata and context-free languages on the other, we see that any language accepted by a pushdown automaton is also accepted by some linear bounded automaton, but that there are languages accepted by some linear bounded automata for which there are no pushdown automata.

EXERCISES

1. Find context-sensitive grammars for the following languages.

(a) $L = \{a^{n+1} b^n c^{n-1} : n \ge 1\}$
(b) $L = \{a^n b^n a^{2n} : n \ge 1\}$
(c) $L = \{a^n b^m c^n d^m : n \ge 1, m \ge 1\}$
(d) $L = \{ww : w \in \{a, b\}^+\}$

2. Find context-sensitive grammars for the following languages.

(a) $L = \{w : n_a(w) = n_b(w) = n_c(w)\}$
(b) $L = \{w : n_a(w) = n_b(w) < n_c(w)\}$

3. Show that the family of context-sensitive languages is closed under union.

4. Show that the family of context-sensitive languages is closed under reversal.

5. For $m$ in Theorem 11.10, give explicit bounds for $m$ as a function of $|w|$ and $|V \cup T|$.

6. Without explicitly constructing it, show that there exists a context-free grammar for the language $L = \{\ldots \in \{a, b\}^+\}$ [illegible].

11.4 The Chomsky Hierarchy

We have now encountered a number of language families, among them the recursively enumerable languages ($L_{RE}$), the context-sensitive languages ($L_{CS}$), the context-free languages ($L_{CF}$), and the regular languages ($L_{REG}$). One way of exhibiting the relationship between these families is by the Chomsky hierarchy. Noam Chomsky, a founder of formal language theory, provided an initial classification into four language types, type 0 to type 3. This original terminology has persisted and one finds frequent references to it, but the numeric types are actually different names for the language families we have studied. Type 0 languages are those generated by unrestricted grammars, that is, the recursively enumerable languages. Type 1 consists of the context-sensitive languages, type 2 consists of the context-free languages, and type 3 consists of the regular languages. As we have seen, each language family of type $i$ is a proper subset of the family of type $i - 1$. A diagram (Figure 11.3) exhibits the relationship clearly. Figure 11.3 shows the original Chomsky hierarchy. We have also met several other language families that can be fitted into this picture. Including the families of deterministic context-free languages ($L_{DCF}$) and recursive languages ($L_{REC}$), we arrive at the extended hierarchy shown in Figure 11.4.

Other language families can be defined and their place in Figure 11.4 studied, although their relationships do not always have the neatly nested structure of Figures 11.3 and 11.4. In some instances, the relationships are not completely understood.

Figure 11.3

Figure 11.4

Example 11.3

We have previously introduced the context-free language

$$L = \{w : n_a(w) = n_b(w)\}$$

and shown that it is deterministic, but not linear. On the other hand, the language

$$L = \{a^n b^n\} \cup \{a^n b^{2n}\}$$

is linear, but not deterministic. This indicates that the relationship between regular, linear, deterministic context-free, and nondeterministic context-free languages is as shown in Figure 11.5.

Figure 11.5

There is still an unresolved issue. We introduced the concept of a deterministic linear bounded automaton in Exercise 8, Section 10.5. We can now ask the question we asked in connection with other automata: What role does nondeterminism play here? Unfortunately, there is no easy answer. At this time, it is not known whether the family of languages accepted by deterministic linear bounded automata is a proper subset of the context-sensitive languages.

To summarize, we have explored the relationships between several language families and their associated automata. In doing so, we established a hierarchy of languages and classified automata by their power as language accepters. Turing machines are more powerful than linear bounded automata. These in turn are more powerful than pushdown automata. At the bottom of the hierarchy are finite accepters, with which we began our study.

EXERCISES

1. Collect examples given in this book that demonstrate that all the subset relations depicted in Figure 11.4 are indeed proper ones.

2. Find two examples (excluding the one in Example 11.3) of languages that are linear but not deterministic context-free.

3. Find two examples (excluding the one in Example 11.3) of languages that are deterministic context-free but not linear.


Chapter 12 Limits of Algorithmic Computation

Having talked about what Turing machines can do, we now look at what they cannot do. Although Turing's thesis leads us to believe that there are few limitations to the power of a Turing machine, we have claimed on several occasions that there could not exist any algorithms for the solution of certain problems. Now we make more explicit what we mean by this claim. Some of the results came about quite simply; if a language is nonrecursive, then by definition there is no membership algorithm for it. If this were all there was to this issue, it would not be very interesting; nonrecursive languages have little practical value. But the problem goes deeper. For example, we have stated (but not yet proved) that there exists no algorithm to determine whether a context-free grammar is unambiguous. This question is clearly of practical significance in the study of programming languages.

We first define the concept of decidability and computability to pin down what we mean when we say that something cannot be done by a Turing machine. We then look at several classical problems of this type, among them the well-known halting problem for Turing machines. From this follow a number of related problems for Turing machines and recursively

enumerable languages. After this, we look at some questions relating to context-free languages. Here we find quite a few important problems for which, unfortunately, there are no algorithms.

12.1 Some Problems That Cannot Be Solved by Turing Machines

The argument that the power of mechanical computations is limited is not surprising. Intuitively we know that many vague and speculative questions require special insight and reasoning well beyond the capacity of any computer that we can now construct or even plausibly foresee. What is more interesting to computer scientists is that there are questions that can be clearly and simply stated, with an apparent possibility of an algorithmic solution, but which are known to be unsolvable by any computer.

Computability and Decidability

In Definition 9.4, we stated that a function $f$ on a certain domain is said to be computable if there exists a Turing machine that computes the value of $f$ for all arguments in its domain. A function is uncomputable if no such Turing machine exists. There may be a Turing machine that can compute $f$ on part of its domain, but we call the function computable only if there is a Turing machine that computes the function on the whole of its domain. We see from this that, when we classify a function as computable or not computable, we must be clear on what its domain is.

Our concern here will be the somewhat simplified setting where the result of a computation is a simple "yes" or "no." In this case, we talk about a problem being decidable or undecidable. By a problem we will understand a set of related statements, each of which must be either true or false. For example, we consider the statement "For a context-free grammar $G$, the language $L(G)$ is ambiguous." For some $G$ this is true, for others it is false, but clearly we must have one or the other. The problem is to decide whether the statement is true for any $G$ we are given. Again, there is an underlying domain, the set of all context-free grammars. We say that a problem is decidable if there exists a Turing machine that gives the correct answer for every statement in the domain of the problem.

When we state decidability or undecidability results, we must always know what the domain is, because this may affect the conclusion. The problem may be decidable on some domain but not on another. Specifically, a single instance of a problem is always decidable, since the answer is either true or false. In the first case, a Turing machine that always answers "true" gives the correct answer, while in the second case one that always answers "false" is appropriate. This may seem like a facetious answer, but it emphasizes an important point. The fact that we do not know what the correct

answer is makes no difference; what matters is that there exists some Turing machine that does give the correct response.

The Turing Machine Halting Problem

We begin with some problems that have some historical significance and that at the same time give us a starting point for developing later results. The best-known of these is the Turing machine halting problem. Simply stated, the problem is: given the description of a Turing machine $M$ and an input $w$, does $M$, when started in the initial configuration $q_0 w$, perform a computation that eventually halts? Using an abbreviated way of talking about the problem, we ask whether $M$ applied to $w$, or simply $(M, w)$, halts or does not halt. The domain of this problem is to be taken as the set of all Turing machines and all $w$; that is, we are looking for a single Turing machine that, given the description of an arbitrary $M$ and $w$, will predict whether or not the computation of $M$ applied to $w$ will halt.

We cannot find the answer by simulating the action of $M$ on $w$, say by performing it on a universal Turing machine, because there is no limit on the length of the computation. If $M$ enters an infinite loop, then no matter how long we wait, we can never be sure that $M$ is in fact in a loop. It may simply be a case of a very long computation. What we need is an algorithm that can determine the correct answer for any $M$ and $w$ by performing some analysis on the machine's description and the input. But as we now show, no such algorithm exists.

For subsequent discussion, it is convenient to have a precise idea of what we mean by the halting problem; for this reason, we make a specific definition of what we stated somewhat loosely above.

Definition 12.1

Let $w_M$ be a string that describes a Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$, and let $w$ be a string in $M$'s alphabet. We will assume that $w_M$ and $w$ are encoded as a string of 0's and 1's, as suggested in Section 10.4. A solution of the halting problem is a Turing machine $H$, which for any $w_M$ and $w$ performs the computation

$$q_0 w_M w \vdash^* x_1 q_y x_2$$

if $M$ applied to $w$ halts, and

$$q_0 w_M w \vdash^* y_1 q_n y_2$$

if $M$ applied to $w$ does not halt. Here $q_y$ and $q_n$ are both final states of $H$.
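The earlier remark that simulation alone cannot settle the question can be illustrated with a small step-bounded simulator. This is a sketch under assumptions made here: the transition function is a Python dictionary from `(state, symbol)` to `(state, symbol, move)`, the blank is written `"_"`, the machine halts when no transition is defined, and the name `run` and its return convention are hypothetical.

```python
def run(delta, tape_input, start="q0", blank="_", max_steps=1000):
    """Simulate a deterministic Turing machine for at most max_steps moves.

    Returns ("halted", state, steps) if an undefined transition is reached,
    and ("unknown", state, max_steps) if the step budget runs out first.
    """
    tape = dict(enumerate(tape_input))
    state, head = start, 0
    for step in range(max_steps):
        key = (state, tape.get(head, blank))
        if key not in delta:
            return ("halted", state, step)
        state, symbol, move = delta[key]
        tape[head] = symbol
        head += 1 if move == "R" else -1
    return ("unknown", state, max_steps)


# The machine of Example 11.1: scan a's to the right, then step back once.
scanner = {("q0", "a"): ("q0", "a", "R"), ("q0", "_"): ("q1", "_", "L")}

# A hypothetical machine that moves right forever on a blank tape.
runaway = {("q0", "_"): ("q0", "_", "R")}

print(run(scanner, "aa"))
print(run(runaway, "", max_steps=50))
```

A `"halted"` result is conclusive, but `"unknown"` is not: raising `max_steps` may turn it into `"halted"`, or may not, and the simulator has no way of telling which. This is why the definition above asks for a machine that always ends in $q_y$ or $q_n$.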

Theorem 12.1

There does not exist any Turing machine $H$ that behaves as required by Definition 12.1. The halting problem is therefore undecidable.

Proof: We assume the contrary, namely, that there exists an algorithm, and consequently some Turing machine $H$, that solves the halting problem. The input to $H$ will be the string $w_M w$. The requirement is then that, given any $w_M w$, the Turing machine $H$ will halt with either a yes or no answer. We achieve this by asking that $H$ halt in one of two corresponding final states, say, $q_y$ or $q_n$. The situation can be visualized by a block diagram like Figure 12.1. The intent of this diagram is to indicate that, if $H$ is started in state $q_0$ with input $w_M w$, it will eventually halt in state $q_y$ or $q_n$. As required by Definition 12.1, we want $H$ to operate according to the following rules:

$$q_0 w_M w \vdash^*_H x_1 q_y x_2$$

if $M$ applied to $w$ halts, and

$$q_0 w_M w \vdash^*_H y_1 q_n y_2$$

if $M$ applied to $w$ does not halt.

Figure 12.1

Next, we modify $H$ to produce a Turing machine $H'$ with the structure shown in Figure 12.2. With the added states in Figure 12.2 we want to convey that the transitions between state $q_y$ and the new states $q_a$ and $q_b$ are to be made, regardless of the tape symbol, in such a way that the tape remains unchanged. The way this is done is straightforward. Comparing $H$ and $H'$ we see that, in situations where $H$ reaches $q_y$ and halts, the modified machine $H'$ will enter an infinite loop. Formally, the action of $H'$ is described by

$$q_0 w_M w \vdash^*_{H'} \infty$$

if $M$ applied to $w$ halts, and

$$q_0 w_M w \vdash^*_{H'} y_1 q_n y_2$$

if $M$ applied to $w$ does not halt.

Figure 12.2

From $H'$ we construct another Turing machine $\hat{H}$. This new machine takes as input $w_M$ and copies it, ending in its initial state $q_0$. After that, it behaves exactly like $H'$. Then the action of $\hat{H}$ is such that

$$q_0 w_M \vdash^*_{\hat{H}} q_0 w_M w_M \vdash^*_{\hat{H}} \infty$$

if $M$ applied to $w_M$ halts, and

$$q_0 w_M \vdash^*_{\hat{H}} q_0 w_M w_M \vdash^*_{\hat{H}} y_1 q_n y_2$$

if $M$ applied to $w_M$ does not halt.

Now $\hat{H}$ is a Turing machine, so it has a description in $\{0, 1\}^*$, say, $\hat{w}$. This string, in addition to being the description of $\hat{H}$, also can be used as input string. We can therefore legitimately ask what would happen if $\hat{H}$ is applied to $\hat{w}$. From the above, identifying $M$ with $\hat{H}$, we get

$$q_0 \hat{w} \vdash^*_{\hat{H}} \infty$$

if $\hat{H}$ applied to $\hat{w}$ halts, and

$$q_0 \hat{w} \vdash^*_{\hat{H}} y_1 q_n y_2$$

if $\hat{H}$ applied to $\hat{w}$ does not halt. This is clearly nonsense. The contradiction tells us that our assumption of the existence of $H$, and hence the assumption of the decidability of the halting problem, must be false.

One may object to Definition 12.1, since we required that, to solve the halting problem, $H$ had to start and end in very specific configurations. It is, however, not hard to see that these somewhat arbitrarily chosen conditions play only a minor role in the argument, and that essentially the same reasoning could be used with any other starting and ending configurations. We have tied the problem to a specific definition for the sake of the discussion, but this does not affect the conclusion.

It is important to keep in mind what Theorem 12.1 says. It does not preclude solving the halting problem for specific cases; often we can tell by an analysis of $M$ and $w$ whether or not the Turing machine will halt. What

the theorem says is that this cannot always be done; there is no algorithm that can make a correct decision for all $w_M$ and $w$.

The arguments for proving Theorem 12.1 were given because they are classical and of historical interest. The conclusion of the theorem is actually implied in previous results as the following argument shows.

Theorem 12.2

If the halting problem were decidable, then every recursively enumerable language would be recursive. Consequently, the halting problem is undecidable.

Proof: To see this, let $L$ be a recursively enumerable language on $\Sigma$, and let $M$ be a Turing machine that accepts $L$. Let $H$ be the Turing machine that solves the halting problem. We construct from this the following procedure:

1. Apply $H$ to $w_M w$. If $H$ says "no," then by definition $w$ is not in $L$.
2. If $H$ says "yes," then apply $M$ to $w$. But $M$ must halt, so it will eventually tell us whether $w$ is in $L$ or not.

This constitutes a membership algorithm, making $L$ recursive. But we already know that there are recursively enumerable languages that are not recursive. The contradiction implies that $H$ cannot exist, that is, that the halting problem is undecidable.

The simplicity with which the halting problem can be obtained from Theorem 11.5 is a consequence of the fact that the halting problem and the membership problem for recursively enumerable languages are nearly identical. The only difference is that in the halting problem we do not distinguish between halting in a final and nonfinal state, whereas in the membership problem we do. The proofs of Theorems 11.5 (via Theorem 11.3) and 12.1 are closely related, both being a version of diagonalization.

Reducing One Undecidable Problem to Another

The above argument, connecting the halting problem to the membership problem, illustrates the very important technique of reduction. We say that problem A is reduced to a problem B if the decidability of A follows from the decidability of B. Then, if we know that A is undecidable, we can conclude that B is also undecidable. Let us do a few examples to illustrate this idea.

Example 12.1

The state-entry problem is as follows. Given any Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, \square, F)$ and any $q \in Q$, $w \in \Sigma^+$, decide whether or not the state $q$ is ever entered when $M$ is applied to $w$. This problem is undecidable.

To reduce the halting problem to the state-entry problem, suppose that we have an algorithm $A$ that solves the state-entry problem. We could then use it to solve the halting problem. For example, given any $M$ and $w$, we first modify $M$ to get $\hat{M}$ in such a way that $\hat{M}$ halts in state $q$ if and only if $M$ halts. We can do this simply by looking at the transition function $\delta$ of $M$. If $M$ halts, it does so because some $\delta(q_i, a)$ is undefined. To get $\hat{M}$, we change every such undefined $\delta$ to

$\delta(q_i, a) = (q, a, R)$,

where $q$ is a final state. We apply the state-entry algorithm $A$ to $(\hat{M}, q, w)$. If $A$ answers yes, that is, the state $q$ is entered, then $(M, w)$ halts. If $A$ says no, then $(M, w)$ does not halt.

Thus, the assumption that the state-entry problem is decidable gives us an algorithm for the halting problem. Because the halting problem is undecidable, the state-entry problem must also be undecidable.

Example 12.2
The blank-tape halting problem is another problem to which the halting problem can be reduced. Given a Turing machine $M$, determine whether or not $M$ halts if started with a blank tape. This is undecidable.

To show how this reduction is accomplished, assume that we are given some $M$ and some $w$. We first construct from $M$ a new machine $M_w$ that starts with a blank tape, writes $w$ on it, then positions itself in configuration $q_0 w$. After that, $M_w$ acts like $M$. Clearly $M_w$ will halt on a blank tape if and only if $M$ halts on $w$.

Suppose now that the blank-tape halting problem were decidable. Given any $(M, w)$, we first construct $M_w$, then apply the blank-tape halting problem algorithm to it. The conclusion tells us whether $M$ applied to $w$ will halt. Since this can be done for any $M$ and $w$, an algorithm for the blank-tape halting problem can be converted into an algorithm for the halting problem. Since the latter is known to be undecidable, the same must be true for the blank-tape halting problem.

The construction in the arguments of these two examples illustrates an approach common in establishing undecidability results. A block diagram often helps us visualize the process. The construction in Example 12.2 is summarized in Figure 12.3. In that diagram, we first use an algorithm that transforms $(M, w)$ into $M_w$; such an algorithm clearly exists. Next, we use the algorithm for solving the blank-tape halting problem, which we assume exists. Putting the two together yields an algorithm for the halting problem. But this is impossible, and we can conclude that $A$ cannot exist.
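The modification of $\delta$ used in Example 12.1 can be sketched in Python. This is only an illustration under assumptions that are not part of the text: $\delta$ is encoded as a dictionary from (state, symbol) pairs to (state, symbol, move) triples, the character "_" stands for the blank, and `q_halt` is a hypothetical name for the fresh state $q$. The helper `enters_state` is a step-bounded simulation added for demonstration; it is not the algorithm $A$, which by the argument above cannot exist.

```python
def redirect_halts_to_state(delta, states, tape_symbols, q_new="q_halt"):
    """Return a copy of delta in which every undefined (state, symbol)
    pair leads to q_new, keeping the tape symbol and moving right."""
    if q_new in states:
        raise ValueError("q_new must be a fresh state name")
    new_delta = dict(delta)
    for state in states:
        for symbol in tape_symbols:
            if (state, symbol) not in delta:
                new_delta[(state, symbol)] = (q_new, symbol, "R")
    return new_delta


def enters_state(delta, start, target, tape, max_steps):
    """Bounded simulation: is `target` entered within max_steps moves?"""
    cells = dict(enumerate(tape))
    state, head = start, 0
    for _ in range(max_steps):
        if state == target:
            return True
        key = (state, cells.get(head, "_"))
        if key not in delta:
            return False  # the machine halted without entering target
        state, written, move = delta[key]
        cells[head] = written
        head += 1 if move == "R" else -1
    return state == target


# A toy machine: scan right over a's, then halt in q1 on the first blank.
delta = {("q0", "a"): ("q0", "a", "R"), ("q0", "_"): ("q1", "_", "R")}
delta_hat = redirect_halts_to_state(delta, {"q0", "q1"}, {"a", "_"})
```

On this toy machine, the original halts in `q1`, so the modified machine enters `q_halt`. A machine that never halts has no undefined transitions to redirect, so `q_halt` is never entered, which is the "if and only if" the example relies on.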

[Figure 12.3: Algorithm for halting problem.]

A decision problem is effectively a function with a range {0, 1}, that is, a true or false answer. We can look also at more general functions to see if they are computable. To do so, we follow the established method and reduce the halting problem (or any other problem known to be undecidable) to the problem of computing the function in question. Because of Turing's thesis, we expect that functions encountered in practical circumstances will be computable, so for examples of uncomputable functions we must look a little further. Most examples of uncomputable functions are associated with attempts to predict the behavior of Turing machines.

Example 12.3
Let $\Gamma = \{0, 1, \square\}$. Consider the function $f(n)$ whose value is the maximum number of moves that can be made by any $n$-state Turing machine that halts when started with a blank tape. This function, as it turns out, is not computable.

Before we set out to demonstrate this, let us make sure that $f(n)$ is defined for all $n$. Notice first that there are only a finite number of Turing machines with $n$ states. This is because $Q$ and $\Gamma$ are finite, so $\delta$ has a finite domain and range. This in turn implies that there are only a finite number of different $\delta$'s and therefore a finite number of different $n$-state Turing machines.

Of all of the $n$-state machines, there are some that always halt, for example machines that have only final states and therefore make no moves. Some of the $n$-state machines will not halt when started with a blank tape, but they do not enter the definition of $f$. Every machine that does halt will execute a certain number of moves; of these, we take the largest to give $f(n)$.

Take any Turing machine $M$ and positive number $m$. It is easy to modify $M$ to produce $\hat{M}$ in such a way that the latter will always halt with one of two answers: $M$ applied to a blank tape halts in no more than $m$ moves, or $M$ applied to a blank tape makes more than $m$ moves. All we have to do for this is to have $M$ count its moves and terminate when this count exceeds $m$. Assume now that $f(n)$ is computable by some Turing machine $F$. We can then put $\hat{M}$ and $F$ together as shown in Figure 12.4. First we compute $f(|Q|)$, where $Q$ is the state set of $M$. This tells us the maximum number of moves that $M$ can make if it is to halt. The value we get is then used as $m$ to construct $\hat{M}$ as outlined, and a description of $\hat{M}$ is given to a universal Turing machine for execution. This tells us whether $M$ applied to a blank tape halts or does not halt in less than $f(|Q|)$ steps. If we find that $M$ applied to a blank tape makes more than $f(|Q|)$ moves, then because of the definition of $f$, the implication is that $M$ never halts. Thus we have a solution to the blank tape halting problem. The impossibility of the conclusion forces us to accept that $f$ is not computable.

[Figure 12.4: Algorithm for blank-tape halting problem.]

EXERCISES

1. Describe in detail how $H$ in Theorem 12.1 can be modified to produce $H'$.

2. Suppose we change Definition 12.1 to require that $q_0 w_M w \vdash^* q_y w$ or $q_0 w_M w \vdash^* q_n w$, depending on whether $M$ applied to $w$ halts or not. Reexamine the proof of Theorem 12.1 to show that this difference in the definition does not affect the proof in any significant way.

3. Show that the following problem is undecidable. Given any Turing machine $M$, $a \in \Gamma$, and $w \in \Sigma^+$, determine whether or not the symbol $a$ is ever written when $M$ is applied to $w$.

4. In the general halting problem, we ask for an algorithm that gives the correct answer for any $M$ and $w$. We can relax this generality, for example, by looking for an algorithm that works for all $M$ but only a single $w$. We say that such a problem is decidable if for every $w$ there exists a (possibly different) algorithm that determines whether or not $(M, w)$ halts. Show that even in this restricted setting the problem is undecidable.

5. Show that there is no algorithm to decide whether or not an arbitrary Turing machine halts on all input.

6. Consider the question: "Does a Turing machine in the course of a computation revisit the starting cell (i.e., the cell under the read-write head at the beginning of the computation)?" Is this a decidable question?

7. Show that there is no algorithm for deciding if any two Turing machines $M_1$ and $M_2$ accept the same language.

8. How is the conclusion of Exercise 7 affected if $M_2$ is a finite automaton?

9. Is the halting problem solvable for deterministic pushdown automata; that is, given a pda as in Definition 7.3, can we always predict whether or not the automaton will halt on input $w$?

10. Let $M$ be any Turing machine and $x$ and $y$ two possible instantaneous descriptions of it. Show that the problem of determining whether or not $x \vdash^*_M y$ is undecidable.

11. In Example 12.3, give the values of $f(1)$ and $f(2)$.

12. Show that the problem of determining whether a Turing machine halts on any input is undecidable.

13. Let $B$ be the set of all Turing machines that halt when started with a blank tape. Show that this set is recursively enumerable, but not recursive.

14. Consider the set of all $n$-state Turing machines with tape alphabet $\Gamma = \{0, 1, \square\}$. Give an expression for $m(n)$, the number of distinct Turing machines with this $\Gamma$.

15. Let $\Gamma = \{0, 1, \square\}$ and let $b(n)$ be the maximum number of tape cells examined by any $n$-state Turing machine that halts when started with a blank tape. Show that $b(n)$ is not computable.

16. Determine whether or not the following statement is true: Any problem whose domain is finite is decidable.

12.2 Undecidable Problems for Recursively Enumerable Languages

We have determined that there is no membership algorithm for recursively enumerable languages. The lack of an algorithm to decide on some property is not an exceptional state of affairs for recursively enumerable languages, but rather is the general rule. As we now show, there is little we can say about these languages. Recursively enumerable languages are so general that, in essence, any question we ask about them is undecidable. Invariably, when we ask a question about recursively enumerable languages, we find that there is some way of reducing the halting problem to this question. We give here some examples to show how this is done and from these examples derive an indication of the general situation.

Theorem 12.3
Let $G$ be an unrestricted grammar. Then the problem of determining whether or not

$L(G) = \emptyset$

is undecidable.

Proof: We will reduce the membership problem for recursively enumerable languages to this problem. Suppose we are given a Turing machine $M$ and some string $w$. We can modify $M$ as follows. $M$ first saves its input on some special part of its tape. Then, whenever it enters a final state, it checks its saved input and accepts it if and only if it is $w$. We can do this by changing $\delta$ in a simple way, creating for each $w$ a machine $M_w$ such that

$L(M_w) = L(M) \cap \{w\}$.

Using Theorem 11.7, we then construct a corresponding grammar $G_w$. Clearly, the construction leading from $M$ and $w$ to $G_w$ can always be done. Equally clear is that $L(G_w)$ is nonempty if and only if $w \in L(M)$.

Assume now that there exists an algorithm $A$ for deciding whether or not $L(G) = \emptyset$. If we let $T$ denote an algorithm by which we generate $G_w$, then we can put $T$ and $A$ together as shown in Figure 12.5. Figure 12.5 is a Turing machine which for any $M$ and $w$ tells us whether or not $w$ is in $L(M)$. If such a Turing machine existed, we would have a membership algorithm for any recursively enumerable language, in direct contradiction to a previously established result. We conclude therefore that the stated problem "$L(G) = \emptyset$" is not decidable.

[Figure 12.5: Membership algorithm.]
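The machine $M_w$ in the proof of Theorem 12.3 can be pictured with a small Python sketch. It rests on a strong simplifying assumption that does not hold for Turing machines in general: the acceptor is modeled as a total Python predicate `accepts` that always returns, whereas a real $M$ may fail to halt. The names `restrict_to_string` and `accepts_even_length` are hypothetical and serve only the illustration.

```python
def restrict_to_string(accepts, w):
    """Model of M_w: accept x exactly when M accepts x and x equals w,
    so that the accepted language is L(M) intersected with {w}."""
    def accepts_w(x):
        saved = x                 # M_w first saves its input
        if not accepts(x):        # then it behaves like M
            return False
        return saved == w         # on acceptance, compare with w
    return accepts_w


def accepts_even_length(x):
    """A stand-in acceptor: strings of even length."""
    return len(x) % 2 == 0


m_ab = restrict_to_string(accepts_even_length, "ab")
m_a = restrict_to_string(accepts_even_length, "a")
```

Here `m_ab` accepts only "ab", while `m_a` accepts nothing because "a" is not accepted by the stand-in. The language of the restricted acceptor is nonempty exactly when $w$ is accepted by the original one, which is the property the reduction uses.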

Theorem 12.4
Let $M$ be any Turing machine. Then the question of whether or not $L(M)$ is finite is undecidable.

Proof: Consider the halting problem $(M, w)$. From $M$ we construct another Turing machine $\hat{M}$ that does the following. First, the halting states of $M$ are changed so that if any one is reached, all input is accepted by $\hat{M}$. This can be done by having any halting configuration go to a final state. Second, the original machine is modified so that the new machine $\hat{M}$ first generates $w$ on its tape, then performs the same computations as $M$, using the newly created $w$ and some otherwise unused space. In other words, the moves made by $\hat{M}$ after it has written $w$ on its tape are the same as would have been made by $M$ had it started in the original configuration $q_0 w$. If $M$ halts in any configuration, then $\hat{M}$ will halt in a final state.

Therefore, if $(M, w)$ halts, $\hat{M}$ will reach a final state for all input. If $(M, w)$ does not halt, then $\hat{M}$ will not halt either and so will accept nothing. In other words, $\hat{M}$ either accepts the infinite language $\Sigma^*$ or the finite language $\emptyset$.

If we now assume the existence of an algorithm $A$ that tells us whether or not $L(\hat{M})$ is finite, we can construct a solution to the halting problem as shown in Figure 12.6. Therefore no algorithm for deciding whether or not $L(M)$ is finite can exist.

[Figure 12.6]

Notice that in the proof of Theorem 12.4, the specific nature of the question asked, namely "Is $L(M)$ finite?", is immaterial. We can change the nature of the problem without significantly affecting the argument.

Example 12.4
Show that for an arbitrary Turing machine $M$ with $\Sigma = \{a, b\}$, the problem "$L(M)$ contains two different strings of the same length" is undecidable.

To show this, we use exactly the same approach as in Theorem 12.4, except that when $\hat{M}$ reaches a halting configuration, it will be modified to accept the two strings $a$ and $b$. For this, the initial input is saved and at the end of the computation compared with $a$ and $b$, accepting only these two strings. Thus, if $(M, w)$ halts, $\hat{M}$ will accept two strings of equal length, otherwise $\hat{M}$ will accept nothing. The rest of the argument then proceeds as in Theorem 12.4.

In exactly the same manner, we can substitute other questions such as "Does $L(M)$ contain any string of length five?" or "Is $L(M)$ regular?" without affecting the argument essentially. These questions, as well as similar questions, are all undecidable. A general result formalizing this is known as Rice's theorem. This theorem states that any nontrivial property of a recursively enumerable language is undecidable. The adjective "nontrivial" refers to a property possessed by some but not all recursively enumerable languages. A precise statement and a proof of Rice's theorem can be found in Hopcroft and Ullman (1979).

EXERCISES

1. Show in detail how the machine $\hat{M}$ in Theorem 12.4 is constructed.

2. Show that the two problems mentioned at the end of the preceding section, namely
(a) $L(M)$ contains any string of length five,
(b) $L(M)$ is regular,
are undecidable.

3. Let $M_1$ and $M_2$ be arbitrary Turing machines. Show that the problem "$L(M_1) \subseteq L(M_2)$" is undecidable.

4. Let $G$ be any unrestricted grammar. Does there exist an algorithm for determining whether or not $L(G)^R$ is recursively enumerable?

5. Let $G$ be any unrestricted grammar. Does there exist an algorithm for determining whether or not $L(G) = L(G)^R$?

6. Let $G_1$ be any unrestricted grammar, and $G_2$ any regular grammar. Show that the problem

$L(G_1) \cap L(G_2) = \emptyset$

is undecidable.

7. Show that the question in Exercise 6 is undecidable for any fixed $G_2$, as long as $L(G_2)$ is not empty.

8. For an unrestricted grammar $G$, show that the question "Is $L(G) = L(G)^*$?" is undecidable. Argue (a) from Rice's theorem and (b) from first principles.

12.3 The Post Correspondence Problem

The undecidability of the halting problem has many consequences of practical interest, particularly in the area of context-free languages. But in many instances it is cumbersome to work with the halting problem directly, and it is convenient to establish some intermediate results that bridge the gap between the halting problem and other problems. These intermediate results follow from the undecidability of the halting problem, but are more closely related to the problems we want to study and therefore make the arguments easier. One such intermediate result is the Post correspondence problem.

The Post correspondence problem can be stated as follows. Given two sequences of $n$ strings on some alphabet $\Sigma$, say

$A = w_1, w_2, \ldots, w_n$

and

$B = v_1, v_2, \ldots, v_n$,

we say that there exists a Post correspondence solution (PC-solution) for pair $(A, B)$ if there is a nonempty sequence of integers $i, j, \ldots, k$, such that

$w_i w_j \cdots w_k = v_i v_j \cdots v_k$.

The Post correspondence problem is to devise an algorithm that will tell us, for any $(A, B)$, whether or not there exists a PC-solution.

Example 12.5
Let $\Sigma = \{0, 1\}$ and take $A$ and B as

$w_1 = 11$, $w_2 = 100$, $w_3 = 111$,
$v_1 = 111$, $v_2 = 001$, $v_3 = 11$.

For this case, there exists a PC-solution as Figure 12.7 shows.

[Figure 12.7: the strings $w_1 w_2 w_3$ and $v_1 v_2 v_3$ aligned.]

If we take

$w_1 = 00$, $w_2 = 001$, $w_3 = 1000$,
$v_1 = 0$, $v_2 = 11$, $v_3 = 011$,

there cannot be any PC-solution simply because any string composed of elements of $A$ will be longer than the corresponding string from $B$.

In specific instances we may be able to show by explicit construction that a pair $(A, B)$ permits a PC-solution, or we may be able to argue, as we did above, that no such solution can exist. But in general, there is no algorithm for deciding this question under all circumstances. The Post correspondence problem is therefore undecidable.

To show this is a somewhat lengthy process. For the sake of clarity, we break it into two parts. In the first part, we introduce the modified Post correspondence problem. We say that the pair $(A, B)$ has a modified Post correspondence solution (MPC-solution) if there exists a sequence of integers $i, j, \ldots, k$, such that

$w_1 w_i w_j \cdots w_k = v_1 v_i v_j \cdots v_k$.

In the modified Post correspondence problem, the first elements of the sequences $A$ and $B$ play a special role. An MPC-solution must start with $w_1$ on the left side and with $v_1$ on the right side. Note that if there exists an MPC-solution, then there is also a PC-solution, but the converse is not true.

The modified Post correspondence problem is to devise an algorithm for deciding if an arbitrary pair $(A, B)$ admits an MPC-solution. This problem is also undecidable. We will demonstrate the undecidability of the modified Post correspondence problem by reducing a known undecidable problem, the membership problem for recursively enumerable languages, to it. To this end, we introduce the following construction. Suppose we are given an unrestricted grammar $G = (V, T, S, P)$ and a target string $w$. With these, we create the pair $(A, B)$ as shown in Figure 12.8.

Figure 12.8

    A       B
    FS⇒     F       F is a symbol not in V ∪ T
    a       a       for every a ∈ T
    V_i     V_i     for every V_i ∈ V
    E       ⇒wE     E is a symbol not in V ∪ T
    y_i     x_i     for every x_i → y_i in P
    ⇒       ⇒

In Figure 12.8, the string $FS\Rightarrow$ is to be taken as $w_1$ and the string $F$ as $v_1$. The order of the rest of the strings is immaterial. We want to claim eventually that $w \in L(G)$ if and only if the sets $A$ and $B$ constructed in this way have an MPC-solution. Since this is perhaps not immediately obvious, let us illustrate it with a simple example.

Example 12.6
Let $G = (\{A, B, C, S\}, \{a, b, c\}, S, P)$ with productions

$S \to aABb \mid Bbb$,
$Bb \to C$,
$AC \to aac$,

and take $w = aaac$. The sequences $A$ and $B$ obtained from the suggested construction are given in Figure 12.9. The string $w = aaac$ is in $L(G)$ and has a derivation

$S \Rightarrow aABb \Rightarrow aAC \Rightarrow aaac$.

How this derivation is paralleled by an MPC-solution with the constructed sets can be seen in Figure 12.10, where the first two steps in the derivation are shown. The integers above and below the derivation string show the indices for $w$ and $v$, respectively, used to create the string.

Figure 12.9

    i     w_i      v_i
    1     FS⇒      F
    2     a        a
    3     b        b
    4     c        c
    5     A        A
    6     B        B
    7     C        C
    8     S        S
    9     E        ⇒aaacE
    10    aABb     S
    11    Bbb      S
    12    C        Bb
    13    aac      AC
    14    ⇒        ⇒

[Figure 12.10: the first two steps of the derivation, with the indices of the $w_i$ shown above the string and those of the $v_i$ below.]

[Figure 12.11: the complete MPC-solution.]

Examine Figure 12.10 carefully to see what is happening. We want to construct an MPC-solution, so we must start with $w_1$, that is, $FS\Rightarrow$. This string contains $S$, so to match it we have to use $v_{10}$ or $v_{11}$. In this instance, we use $v_{10}$; this brings in $w_{10}$, leading us to the second string in the partial derivation. Looking at several more steps, we see that the string $w_1 w_i w_j \cdots$ is always longer than the corresponding string $v_1 v_i v_j \cdots$, and that the first is exactly one step ahead in the derivation. The only exception is the last step, where $w_9$ must be applied to let the $v$-string catch up. The complete MPC-solution is shown in Figure 12.11. The construction, together with the example, indicate the lines along which the next result is established.

Theorem 12.5
Let $G = (V, T, S, P)$ be any unrestricted grammar, with $w$ any string in $T^+$. Let $(A, B)$ be the correspondence pair constructed from $G$ and $w$ by the process exhibited in Figure 12.8. Then the pair $(A, B)$ permits an MPC-solution if and only if $w \in L(G)$.

Proof: The proof involves a formal inductive argument based on the outlined reasoning. We will omit the details.

With this result, we can reduce the membership problem for recursively enumerable languages to the modified Post correspondence problem and thereby demonstrate the undecidability of the latter.
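The construction of Figure 12.8 and the MPC-solution of Example 12.6 can be checked mechanically. The Python sketch below is an illustration under stated assumptions rather than anything prescribed by the text: every grammar symbol is a single character, the characters "F" and "E" are used for the two extra symbols, "⇒" is the derivation arrow, and the order in which terminals, variables, and productions are listed is chosen so that the indices match those quoted in the discussion of Figure 12.10 ($v_{10}$, $w_{10}$, $w_9$). The function names are hypothetical.

```python
ARROW = "⇒"


def build_mpc_pair(variables, terminals, start, productions, w, F="F", E="E"):
    """Build the sequences A and B of Figure 12.8.
    Position i in the text corresponds to list index i - 1."""
    pairs = [(F + start + ARROW, F)]              # w_1 and v_1
    pairs += [(a, a) for a in terminals]          # a / a for every terminal
    pairs += [(v, v) for v in variables]          # V_i / V_i for every variable
    pairs.append((E, ARROW + w + E))              # E / => w E
    pairs += [(y, x) for (x, y) in productions]   # y_i / x_i for x_i -> y_i
    pairs.append((ARROW, ARROW))                  # => / =>
    A = [p[0] for p in pairs]
    B = [p[1] for p in pairs]
    return A, B


def concat(strings, indices):
    """Concatenate the strings selected by a 1-based index sequence."""
    return "".join(strings[i - 1] for i in indices)


# Example 12.6: S -> aABb | Bbb, Bb -> C, AC -> aac, target w = aaac.
A, B = build_mpc_pair(
    variables=["A", "B", "C", "S"],
    terminals=["a", "b", "c"],
    start="S",
    productions=[("S", "aABb"), ("S", "Bbb"), ("Bb", "C"), ("AC", "aac")],
    w="aaac",
)

# Index sequence following S => aABb => aAC => aaac; it starts with 1.
solution = [1, 10, 14, 2, 5, 12, 14, 2, 13, 9]
left = concat(A, solution)    # "FS⇒aABb⇒aAC⇒aaacE"
right = concat(B, solution)   # the same string
```

Both concatenations spell out the whole derivation between the markers F and E, which is the sense in which the $w$-string runs one derivation step ahead until index 9 lets the $v$-string catch up.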

Theorem 12.6
The modified Post correspondence problem is undecidable.

Proof: Given any unrestricted grammar $G = (V, T, S, P)$ and $w \in T^+$, we construct the sets $A$ and $B$ as suggested above. By Theorem 12.5, the pair $(A, B)$ has an MPC-solution if and only if $w \in L(G)$.

Suppose now we assume that the modified Post correspondence problem is decidable. We can then construct an algorithm for the membership problem of $G$ as sketched in Figure 12.12. An algorithm for constructing $A$ and $B$ from $G$ and $w$ clearly exists, but a membership algorithm for $G$ and $w$ does not. We must therefore conclude that there cannot be any algorithm for deciding the modified Post correspondence problem.

[Figure 12.12: Membership algorithm.]

With this preliminary work, we are now ready to prove the Post correspondence problem in its original form.

Theorem 12.7
The Post correspondence problem is undecidable.

Proof: We argue that if the Post correspondence problem were decidable, the modified Post correspondence problem would be decidable.

Suppose we are given sequences $A = w_1, w_2, \ldots, w_n$ and $B = v_1, v_2, \ldots, v_n$ on some alphabet $\Sigma$. We then introduce new symbols ¢ and § and the new sequences

$C = y_0, y_1, \ldots, y_{n+1}$,
$D = z_0, z_1, \ldots, z_{n+1}$,

defined as follows. For $i = 1, 2, \ldots, n$,

$y_i = w_{i1}$¢$w_{i2}$¢$\cdots w_{im_i}$¢,
$z_i = $¢$v_{i1}$¢$v_{i2} \cdots $¢$v_{ir_i}$,

where $w_{ij}$ and $v_{ij}$ denote the $j$th letter of $w_i$ and $v_i$, respectively, and $m_i = |w_i|$, $r_i = |v_i|$. In words, $y_i$ is created from $w_i$ by appending ¢ to each character, while $z_i$ is obtained by prefixing each character of $v_i$ with ¢. To complete the definition of $C$ and $D$, we take

$y_0 = $¢$y_1$, $y_{n+1} = $§,
$z_0 = z_1$, $z_{n+1} = $¢§.

Consider now the pair $(C, D)$, and suppose it has a PC-solution. Because of the placement of ¢ and §, such a solution must have $y_0$ on the left and $y_{n+1}$ on the right and so must look like

¢$w_{11}$¢$w_{12} \cdots $¢$w_{j1} \cdots $¢$w_{k1} \cdots $¢§ $=$ ¢$v_{11}$¢$v_{12} \cdots $¢$v_{j1} \cdots $¢$v_{k1} \cdots $¢§.

Ignoring the characters ¢ and §, we see that this implies

$w_1 w_j \cdots w_k = v_1 v_j \cdots v_k$,

so that the pair $(A, B)$ permits an MPC-solution.

We can turn the argument around to show that if there is an MPC-solution for $(A, B)$ then there is a PC-solution for the pair $(C, D)$.

Assume now that the Post correspondence problem is decidable. We can then construct the machine shown in Figure 12.13. This machine clearly decides the modified Post correspondence problem. But the modified Post correspondence problem is undecidable; consequently, we cannot have an algorithm for deciding the Post correspondence problem.

[Figure 12.13: MPC algorithm.]

EXERCISES

1. Let $A = \{001, 0011, 11, 101\}$ and $B = \{01, 111, 111, 010\}$. Does the pair $(A, B)$ have a PC-solution? Does it have an MPC-solution?

2. Provide the details of the proof of Theorem 12.5.

3. Show that for $|\Sigma| = 1$, the Post correspondence problem is decidable, that is, there is an algorithm that can decide whether or not $(A, B)$ has a PC-solution for any given $(A, B)$ on a single-letter alphabet.

4. Suppose we restrict the domain of the Post correspondence problem to include only alphabets with exactly two symbols. Is the resulting correspondence problem decidable?

5. Show that the following modifications of the Post correspondence problem are undecidable.
(a) There is an MPC-solution if there is a sequence of integers such that $w_i w_j \cdots w_k w_1 = v_i v_j \cdots v_k v_1$.
(b) There is an MPC-solution if there is a sequence of integers such that $w_1 w_2 w_i w_j \cdots w_k = v_1 v_2 v_i v_j \cdots v_k$.

6. The correspondence pair $(A, B)$ is said to have an even PC-solution if and only if there exists a nonempty sequence of even integers $i, j, \ldots, k$ such that $w_i w_j \cdots w_k = v_i v_j \cdots v_k$. Show that the problem of deciding whether or not an arbitrary pair $(A, B)$ has an even PC-solution is undecidable.

12.4 Undecidable Problems for Context-Free Languages

The Post correspondence problem is a convenient tool for studying undecidable questions for context-free languages. We illustrate this with a few selected results.

Theorem 12.8
There exists no algorithm for deciding whether any given context-free grammar is ambiguous.

Proof: Consider two sequences of strings $A = (w_1, w_2, \ldots, w_n)$ and $B = (v_1, v_2, \ldots, v_n)$ over some alphabet $\Sigma$. Choose a new set of distinct symbols $a_1, a_2, \ldots, a_n$, such that

$\{a_1, a_2, \ldots, a_n\} \cap \Sigma = \emptyset$,

and consider the two languages

$L_A = \{w_i w_j \cdots w_l w_k a_k a_l \cdots a_j a_i\}$

and

$L_B = \{v_i v_j \cdots v_l v_k a_k a_l \cdots a_j a_i\}$.

Now look at the context-free grammar

$G = (\{S, S_A, S_B\}, \Sigma \cup \{a_1, a_2, \ldots, a_n\}, P, S)$,

where the set of productions $P$ is the union of the two subsets: the first set $P_A$ consists of

$S \to S_A$,
$S_A \to w_i S_A a_i \mid w_i a_i$, $i = 1, 2, \ldots, n$,

the second set $P_B$ has the productions

$S \to S_B$,
$S_B \to v_i S_B a_i \mid v_i a_i$, $i = 1, 2, \ldots, n$.

Now take

$G_A = (\{S, S_A\}, \Sigma \cup \{a_1, a_2, \ldots, a_n\}, P_A, S)$

and

$G_B = (\{S, S_B\}, \Sigma \cup \{a_1, a_2, \ldots, a_n\}, P_B, S)$.

Then clearly

$L_A = L(G_A)$, $L_B = L(G_B)$,

and

$L(G) = L_A \cup L_B$.

It is easy to see that $G_A$ and $G_B$ by themselves are unambiguous. If a given string in $L(G_A)$ ends with $a_i$, then its derivation with grammar $G_A$ must have started with $S \Rightarrow S_A \Rightarrow w_i S_A a_i$. Similarly, we can tell at any later stage which rule has to be applied. Thus, if $G$ is ambiguous it must be because there is a $w$ for which there are two derivations

$S \Rightarrow S_A \Rightarrow w_i S_A a_i \Rightarrow^* w_i w_j \cdots w_k a_k \cdots a_j a_i = w$

and

$S \Rightarrow S_B \Rightarrow v_i S_B a_i \Rightarrow^* v_i v_j \cdots v_k a_k \cdots a_j a_i = w$.

Consequently, if $G$ is ambiguous, then the Post correspondence problem with the pair $(A, B)$ has a solution. Conversely, if $G$ is unambiguous, then the Post correspondence problem cannot have a solution.

If there existed an algorithm for solving the ambiguity problem, we could adapt it to solve the Post correspondence problem as shown in Figure 12.14. But since there is no algorithm for the Post correspondence problem, we conclude that the ambiguity problem is undecidable.
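The link between PC-solutions and ambiguity in the proof of Theorem 12.8 can be illustrated with the pairs of Example 12.5. The Python sketch below makes assumptions that are not in the text: the marker symbols $a_i$ are written as the tokens "[a1]", "[a2]", and so on, and the function names are hypothetical. The search in `find_pc_solution` is bounded by `max_len`, so it can only report short solutions; it is not a decision procedure for the Post correspondence problem, which by Theorem 12.7 cannot exist.

```python
from itertools import product


def find_pc_solution(A, B, max_len):
    """Try all index sequences up to max_len; return a 1-based
    PC-solution as a list, or None if none of that length exists."""
    n = len(A)
    for length in range(1, max_len + 1):
        for seq in product(range(1, n + 1), repeat=length):
            top = "".join(A[i - 1] for i in seq)
            bottom = "".join(B[i - 1] for i in seq)
            if top == bottom:
                return list(seq)
    return None


def derived_string(strings, seq):
    """String derived from S_A (or S_B) by applying the rules
    S_X -> x_i S_X a_i | x_i a_i along the index sequence seq."""
    body = "".join(strings[i - 1] for i in seq)
    markers = "".join(f"[a{i}]" for i in reversed(seq))
    return body + markers


# First pair of Example 12.5: a PC-solution exists.
A1, B1 = ["11", "100", "111"], ["111", "001", "11"]
# Second pair of Example 12.5: every w_i is longer than v_i.
A2, B2 = ["00", "001", "1000"], ["0", "11", "011"]

short = find_pc_solution(A1, B1, 3)
none_found = find_pc_solution(A2, B2, 4)
```

For an index sequence that is a PC-solution, `derived_string(A1, seq)` and `derived_string(B1, seq)` coincide, so the combined grammar $G$ derives that string in two ways, once through $S_A$ and once through $S_B$. For a sequence that is not a solution the two strings differ.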

[Figure 12.14: PC algorithm.]

Theorem 12.9
There exists no algorithm for deciding whether or not

$L(G_1) \cap L(G_2) = \emptyset$

for arbitrary context-free grammars $G_1$ and $G_2$.

Proof: Take as $G_1$ the grammar $G_A$ and as $G_2$ the grammar $G_B$ as defined in the proof of Theorem 12.8. Suppose that $L(G_A)$ and $L(G_B)$ have a common element, that is

$S_A \Rightarrow^* w_i w_j \cdots w_k a_k \cdots a_j a_i$

and

$S_B \Rightarrow^* v_i v_j \cdots v_k a_k \cdots a_j a_i$.

Then the pair $(A, B)$ has a PC-solution. Conversely, if the pair does not have a PC-solution, then $L(G_A)$ and $L(G_B)$ cannot have a common element. We conclude that $L(G_A) \cap L(G_B)$ is nonempty if and only if $(A, B)$ has a PC-solution. This reduction proves the theorem.

There is a variety of other known results along these lines. Some of them can be reduced to the Post correspondence problem, while others are more easily solved by establishing different intermediate results first (see for example Exercises 6 and 7 at the end of this section). We will not give the arguments here, but point to some additional results in the exercises.

That there are many undecidable problems connected with context-free languages seems surprising at first and shows that there are limitations to computations in an area in which we might be tempted to try an algorithmic approach. For example, it would be helpful if we could tell if a programming language defined in BNF is ambiguous, or if two different specifications of a language are in fact equivalent. But the results that have been established tell us that this is not possible, and it would be a waste of time to look for an algorithm for either of these tasks. Keep in mind that this does not rule out the possibility that there may be ways of getting the answer for specific cases or perhaps even most interesting ones. What the undecidability results tell us is that there is no completely general algorithm and that no matter how many different cases a method can handle, there are invariably some situations for which it will break down.

EXERCISES

1. Prove the claim made in Theorem 12.8 that $G_A$ and $G_B$ by themselves are unambiguous.

2. Show that the problem of determining whether or not

$L(G_1) \subseteq L(G_2)$

is undecidable for context-free grammars $G_1$, $G_2$.

3. Show that, for arbitrary context-free grammars $G_1$ and $G_2$, the problem "$L(G_1) \cap L(G_2)$ is context-free" is undecidable.

* 4. Show that if the language $L(G_A) \cap L(G_B)$ in Theorem 12.8 is regular, then it must be empty. Use this to show that the problem "$L(G)$ is regular" is undecidable for context-free $G$.

* 5. Let $L_1$ be a regular language and $G$ a context-free grammar. Show that the problem "$L_1 \subseteq L(G)$" is undecidable.

* 6. Let $M$ be any Turing machine. We can assume without loss of generality that every computation involves an even number of moves. For any such computation

$q_0 w \vdash x_1 \vdash x_2 \vdash \cdots \vdash x_n$,

we can then construct the string

$q_0 w \vdash x_1^R \vdash x_2 \vdash x_3^R \vdash \cdots \vdash x_n$.

This is called a valid computation. Show that for every $M$ we can construct three context-free grammars $G_1$, $G_2$, $G_3$, such that
(a) the set of all valid computations is $L(G_1) \cap L(G_2)$,
(b) the set of all invalid computations (that is, the complement of the set of valid computations) is $L(G_3)$.

* 7. Use the results of the above exercise to show that "$L(G) = \Sigma^*$" is undecidable over the domain of all context-free grammars $G$.

* 8. Let $G_1$ be a context-free grammar and $G_2$ a regular grammar. Is the problem

$L(G_1) \cap L(G_2) = \emptyset$

decidable?

* 9. Let $G_1$ and $G_2$ be grammars with $G_1$ regular. Is the problem

$L(G_1) = L(G_2)$

decidable when
(a) $G_2$ is unrestricted,
(b) when $G_2$ is context-free,
(c) when $G_2$ is regular?

At variotrs times. irrtelrcst in ma. With the discovery of diffr:re:rrtiill irrrtl integra.t tha. No t:rlrnrn(:r(:ilrl rxlrnprrterswere availahle a. The results that were fourrtl slrrxl light not only on the concept of a mecha.A. however. some of which at flrst glance seernedto tre raclically different f'rorn Thring machines.tion for the study of digital r:ornputcrs. Although T\rring's ideas eventually becarnc vcry importattt irr cornputr:r sr:iunr:e. A mrmber of and 323 . Ttrring among them. we must brielly look at tlxr statc of ma.. other rnodels hirvt: llccrr proposed.nical cornputation. Much of the pioneering work in this area was tlorrr:in the period hetween 1!130 and 1940 and a nurnber of rnathernaticiirrrs.l way. To understand what Thring was trying to do. In fact.t tirne.t that time.l r:ir. but orr rnatlxlrnaticu a-r a. origina.O t h e rM o d e l s of Computqtion Ithough T\rrirrg rnar:hinersa.re the most general models of cornputation we carr corrstnrt:t.re not tlre only ones. M. T\rring's work was published in 1936.l goal was not to provicle a fotrrrhis da. contribuled to it.thematics inr:rea.lcrrlus Newton and bv Leibniz in the seveuteenth and eiglrteenth c:etrturic:s. all thr: rnorlcls were fbund to be equivalent.sed the discipline entered an era of explosivr:grclwth.whole.thematicsa. thev a. Eventually. lhe whole idca had bccn rxrnsideredonly in a very pt:ripherir.

different areas were studied, and significant advances were made in almost all of them. By the end of the nineteenth century, the body of mathematical knowledge had become quite large. Mathematicians also had become sufficiently sophisticated to recognize that some logical difficulties had arisen that required a more careful approach. This led to a concern with rigor in reasoning and a consequent examination of the foundations of mathematical knowledge in the process.

To see why this was necessary, consider what is involved in a typical proof in just about every book and paper dealing with mathematical subjects. A sequence of plausible claims is made, interspersed with phrases like "it can be seen easily" and "it follows from this." Such phrases are conventional, and what one means by them is that, if challenged to do so, one could give more detailed reasoning. Of course, this is very dangerous, since it is possible to overlook things, use faulty hidden assumptions, or make wrong inferences. Whenever we see arguments like this, we cannot help but wonder if the proof we are given is indeed correct. Often there is no way of telling, and long and involved proofs have been published and found erroneous only after a considerable amount of time. Because of practical limitations, however, this type of reasoning is accepted by most mathematicians. The arguments throw light on the subject and at least increase our confidence that the result is true. But to those demanding complete reliability, they are unacceptable.

One alternative to such "sloppy" mathematics is to formalize as far as possible. We start with a set of assumed givens, called axioms, and precisely defined rules for logical inference and deduction. The rules are used in a sequence of steps, each of which takes us from one proven fact to another. The rules must be such that the correctness of their application can be checked in a routine and completely mechanical way. A proposition is considered proven true if we can derive it from the axioms in a finite sequence of logical steps. If the proposition conflicts with another proposition that can be proved to be true, then it is considered false.

Finding such formal systems was a major goal of mathematics at the end of the nineteenth century. Two concerns immediately arose. The first was that the system should be consistent. By this we mean that there should not be any proposition that can be proved to be true by one sequence of steps, then shown to be false by another equally valid argument. Consistency is indispensable in mathematics, and anything derived from an inconsistent system would be contrary to all we agree on. A second concern was whether a system is complete, by which we mean that any proposition expressible in the system can be proved to be true or false. For some time it was hoped that consistent and complete systems for all of mathematics could be devised, thereby opening the door to rigorous but completely mechanical theorem proving. But this hope was dashed by the work of K. Gödel. In his famous Incompleteness Theorem, Gödel showed that any interesting consistent system must be incomplete; that is, it must contain some unprovable propositions. Gödel's revolutionary conclusion was published in 1931.

Gödel's work left unanswered the question of whether the unprovable statements could somehow be distinguished from the provable ones, so that there was still some hope that most of mathematics could be made precise with mechanically verifiable proofs. It was this problem that Turing and other mathematicians of the time, particularly A. Church, S. C. Kleene, and E. Post, addressed. In order to study the question, a variety of formal models of computation were established. Prominent among them were the recursive functions of Church and Kleene and Post systems, but there are many other such systems that have been studied. In this chapter we briefly review some of the ideas that arose out of these studies. There is a wealth of material here that we cannot cover. We will give only a very brief presentation, referring the reader to other references for detail. A quite accessible account of recursive functions and Post systems can be found in Denning, Dennis, and Qualitz (1978), while a good discussion of various other rewriting systems is given in Salomaa (1973) and Salomaa (1985).

The models of computation we study here, as well as others that have been proposed, have diverse origins. But it was eventually found that they were all equivalent in their power to carry out computations. The spirit of this observation is generally called Church's thesis. This thesis states that all possible models of computation, if they are sufficiently broad, must be equivalent. It also implies that there is an inherent limitation in this and that there are functions that cannot be expressed in any way that gives an explicit method for their computation. The claim is of course very closely related to Turing's thesis, and the combined notion is sometimes called the Church-Turing thesis. It provides a general principle for algorithmic computation and, while not provable, gives strong evidence that no more powerful models can be found.

13.1 Recursive Functions

The concept of a function is fundamental to much of mathematics. As summarized in Section 1.1, a function is a rule that assigns to an element of one set, called the domain of the function, a unique value in another set, called the range of the function. This is very broad and general and immediately raises the question of how we can explicitly represent this association. There are many ways in which functions can be defined. Some of them we use frequently, while others are less common.

We are all familiar with functional notation in which we write expressions like

$f(n) = n^2 + 1.$

This defines the function $f$ by means of a recipe for its computation: given any value for the argument $n$, multiply that value by itself, and then add one. Since the function is defined in this explicit way, we can compute its values in a strictly mechanical fashion. To complete the definition of $f$, we also must specify its domain. If, for example, we take the domain to be the set of all integers, then the range of $f$ will be some subset of the set of positive integers.

Since many very complicated functions can be specified this way, we may well ask to what extent the notation is universal. If a function is defined (that is, we know the relation between the elements of its domain and its range), can it be expressed in such a functional form? To answer the question, we must first clarify what the permissible forms are. For this we introduce some basic functions, together with rules for building from them some more complicated ones.

Primitive Recursive Functions

To keep the discussion simple, we will consider only functions of one or two variables, whose domain is either $I$, the set of all non-negative integers, or $I \times I$, and whose range is in $I$. In this setting, we start with the basic functions:

1. The zero function $z(x) = 0$, for all $x \in I$.
2. The successor function $s(x)$, whose value is the integer next in sequence to $x$, that is, in the usual notation, $s(x) = x + 1$.
3. The projector functions $p_k(x_1, x_2) = x_k$, $k = 1, 2$.

There are two ways of building more complicated functions from these:

1. Composition, by which we construct
$f(x, y) = h(g_1(x, y), g_2(x, y))$
from defined functions $g_1$, $g_2$, $h$.
2. Primitive recursion, by which a function can be defined recursively through
$f(x, 0) = g_1(x),$
$f(x, y + 1) = h(g_2(x, y), f(x, y)),$
from defined functions $g_1$, $g_2$, and $h$.
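The two construction rules can be read as higher-order functions. The following is only a sketch under our own assumptions: the names `compose` and `primitive_recursion` are hypothetical, the recursion is unfolded with a loop rather than literal self-reference, and the one-variable identity used for $g_1$ in the addition example is written as a lambda for brevity.

```python
from typing import Callable

Fn1 = Callable[[int], int]
Fn2 = Callable[[int, int], int]

# Basic functions on the non-negative integers.
def z(x: int) -> int:
    return 0

def s(x: int) -> int:
    return x + 1

def p1(x: int, y: int) -> int:
    return x

def p2(x: int, y: int) -> int:
    return y

def compose(h: Fn2, g1: Fn2, g2: Fn2) -> Fn2:
    """Composition: f(x, y) = h(g1(x, y), g2(x, y))."""
    return lambda x, y: h(g1(x, y), g2(x, y))

def primitive_recursion(g1: Fn1, g2: Fn2, h: Fn2) -> Fn2:
    """f(x, 0) = g1(x);  f(x, y + 1) = h(g2(x, y), f(x, y))."""
    def f(x: int, y: int) -> int:
        value = g1(x)                    # f(x, 0)
        for k in range(y):               # compute f(x, k + 1) from f(x, k)
            value = h(g2(x, k), value)
        return value
    return f

# Addition: add(x, 0) = x, add(x, y + 1) = s(add(x, y)).
add = primitive_recursion(lambda x: x, p1, lambda _, prev: s(prev))

# Multiplication: mult(x, 0) = 0, mult(x, y + 1) = add(x, mult(x, y)).
mult = primitive_recursion(z, p1, add)
```

Here `mult` takes $h$ to be `add` and $g_2$ to be the projector `p1`, so only the basic functions and the two rules are used beyond the identity lambda.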

We illustrate how this works by showing how the basic operations of integer arithmetic can be constructed in this fashion.

Example 13.1

Addition of integers $x$ and $y$ can be implemented with the function $add(x, y)$, defined by

$add(x, 0) = x,$
$add(x, y + 1) = add(x, y) + 1.$

To add 2 and 3, we apply these rules successively:

$add(3, 2) = add(3, 1) + 1 = (add(3, 0) + 1) + 1 = (3 + 1) + 1 = 4 + 1 = 5.$

Example 13.2

Using the add function defined in Example 13.1, we can now define multiplication by

$mult(x, 0) = 0,$
$mult(x, y + 1) = add(x, mult(x, y)).$

Formally, the second step is an application of primitive recursion, in which $h$ is identified with the add function, and $g_2(x, y)$ is the projector function $p_1(x, y)$.

Example 13.3

Subtraction is not quite so obvious. First, we must define it, taking into account that negative numbers are not permitted in our system. A kind of subtraction is defined from usual subtraction by

$x \dot{-} y = x - y$ if $x \ge y$,
$x \dot{-} y = 0$ if $x < y$.

The operator $\dot{-}$ is sometimes called the monus; it defines subtraction so that its range is $I$.

Now we define the predecessor function

$pred(0) = 0,$
$pred(y + 1) = y,$

and from it, the subtracting function

$subtr(x, 0) = x,$
$subtr(x, y + 1) = pred(subtr(x, y)).$

To prove that $5 \dot{-} 3 = 2$, we reduce the proposition by applying the definitions a number of times:

$subtr(5, 3) = pred(subtr(5, 2))$
$= pred(pred(subtr(5, 1)))$
$= pred(pred(pred(subtr(5, 0))))$
$= pred(pred(pred(5)))$
$= pred(pred(4))$
$= pred(3)$
$= 2.$

In much the same way, we can define integer division, but we will leave the demonstration of it as an exercise. If we accept this as given, we see that the basic arithmetic operations are all constructible by the elementary processes described. With the algebraic operations precisely defined, other more complicated ones can now be constructed, and very complex computations built from the simple ones. We call functions that can be constructed in such a manner primitive recursive.

Definition 13.1

A function is called primitive recursive if and only if it can be constructed from the basic functions $z$, $s$, $p_k$, by successive composition and primitive recursion.

Note that if $g_1$, $g_2$, and $h$ are total functions, then $f$ defined by composition and primitive recursion is also a total function. It follows from this that every primitive recursive function is a total function on $I$ or $I \times I$.
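The definitions of Examples 13.1 to 13.3 translate almost line for line into recursive code. This is an illustrative sketch, not part of the original text; the Python test `y == 0` and the expression `y - 1` stand in for the pattern "$y + 1$" on the left side of each defining equation.

```python
def add(x: int, y: int) -> int:
    # add(x, 0) = x;  add(x, y + 1) = add(x, y) + 1
    return x if y == 0 else add(x, y - 1) + 1

def mult(x: int, y: int) -> int:
    # mult(x, 0) = 0;  mult(x, y + 1) = add(x, mult(x, y))
    return 0 if y == 0 else add(x, mult(x, y - 1))

def pred(y: int) -> int:
    # pred(0) = 0;  pred(y + 1) = y
    return 0 if y == 0 else y - 1

def subtr(x: int, y: int) -> int:
    # subtr(x, 0) = x;  subtr(x, y + 1) = pred(subtr(x, y))
    return x if y == 0 else pred(subtr(x, y - 1))
```

Calling `subtr(5, 3)` unwinds through three nested `pred` calls, mirroring the reduction shown above, and `subtr(3, 5)` bottoms out at 0 as the monus requires.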

The expressive power of primitive recursive functions is considerable, and most common functions are primitive recursive. However, not all functions are in this class, as the following argument shows.

Theorem 13.1

Let $F$ denote the set of all functions from $I$ to $I$. Then there is some function in $F$ that is not primitive recursive.

Proof: Every primitive recursive function can be described by a finite string that indicates how it is defined. Such strings can be encoded and arranged in standard order. Therefore, the set of all primitive recursive functions is countable.

Suppose now that the set of all functions is also countable. We can then write all functions in some order, say, $f_1, f_2, \ldots$. We next construct a function $g$ defined as

$g(i) = f_i(i) + 1, \quad i = 1, 2, \ldots.$

Clearly, $g$ is well defined and is therefore in $F$, but equally clearly, $g$ differs from every $f_i$ in the diagonal position. This contradiction proves that $F$ cannot be countable. Combining these two observations proves that there must be some function in $F$ that is not primitive recursive.

Actually, this goes even further; not only are there functions that are not primitive recursive, there are in fact computable functions that are not primitive recursive.

Theorem 13.2

Let $C$ be the set of all total computable functions from $I$ to $I$. Then there is some function in $C$ that is not primitive recursive.

Proof: By the argument of the previous theorem, the set of all primitive recursive functions is countable. Let us denote the functions in this set as $r_1, r_2, \ldots$ and define a function $g$ by

$g(i) = r_i(i) + 1.$

By construction, the function $g$ differs from every $r_i$ and is therefore not primitive recursive. But clearly $g$ is computable, proving the theorem.

The nonconstructive proof that there are computable functions that are not primitive recursive is a fairly simple exercise in diagonalization. The actual construction of an example of such a function is a much more complicated matter. We will give here one example that looks quite simple; however, the demonstration that it is not primitive recursive is quite lengthy.
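The diagonal construction used in both proofs can be tried on a finite list of functions. This is only a toy sketch: the proofs work with an infinite enumeration, while the list `fs` below is an arbitrary four-element stand-in chosen for illustration.

```python
def diagonal(functions):
    """Return g with g(i) = f_i(i) + 1, where f_i is functions[i - 1]."""
    def g(i: int) -> int:
        return functions[i - 1](i) + 1
    return g

# A hypothetical finite "enumeration" f_1, ..., f_4.
fs = [
    lambda n: 0,
    lambda n: n + 1,
    lambda n: n * n,
    lambda n: 2 ** n,
]

g = diagonal(fs)

# g disagrees with f_i at argument i, for every i in the list.
differs = [g(i) != fs[i - 1](i) for i in range(1, len(fs) + 1)]
```

Because $g(i)$ is always one more than $f_i(i)$, no function in the list can equal $g$, which is the whole content of the argument.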

Ackermann's Function

Ackermann's function is a function from $I \times I$ to $I$, defined by

$A(0, y) = y + 1,$
$A(x, 0) = A(x - 1, 1),$
$A(x, y + 1) = A(x - 1, A(x, y)).$

It is not hard to see that $A$ is a total, computable function. In fact, it is quite elementary to write a recursive computer program for its computation. But in spite of its apparent simplicity, Ackermann's function is not primitive recursive.

Of course, we cannot argue directly from the definition of $A$. Even though this definition is not in the form required for a primitive recursive function, it is possible that an appropriate alternative definition could exist. The situation here is similar to the one we encountered when we tried to prove that a language was not regular or not context-free. We need to appeal to some general property of the class of all primitive recursive functions and show that Ackermann's function violates this property. For primitive recursive functions, one such property is the growth rate. There is a limit to how fast a primitive recursive function can grow as $i \to \infty$, and Ackermann's function violates this limit. That Ackermann's function grows very rapidly is easily demonstrated; see, for example, Exercises 9 to 11 at the end of this section. How this is related to the limit of growth for primitive recursive functions is made precise in the following theorem. Its proof, which is tedious and technical, will be omitted.

Theorem 13.3

Let $f$ be any primitive recursive function. Then there exists some integer $n$ such that

$f(i) < A(n, i),$

for all $i = n, n + 1, \ldots$.

Proof: For the details of the argument, see Denning, Dennis, and Qualitz (1978, p. 534).

If we accept this result, it follows easily that Ackermann's function is not primitive recursive.

Theorem 13.4

Ackermann's function is not primitive recursive.

Proof: Consider the function

$g(i) = A(i, i).$

If $A$ were primitive recursive, then so would $g$. But then, according to Theorem 13.3, there exists an $n$ such that

$g(i) < A(n, i),$

for all $i$. If we now pick $i = n$, we get the contradiction

$g(n) = A(n, n) < A(n, n),$

proving that $A$ cannot be primitive recursive.

$\mu$-Recursive Functions

To extend the idea of recursive functions to cover Ackermann's function and other computable functions, we must add something to the rules by which such functions can be constructed. One way is to introduce the $\mu$ or minimalization operator, defined by

$\mu y\,(g(x, y)) =$ smallest $y$ such that $g(x, y) = 0$.

In this definition, we assume that $g$ is a total function.

Example 13.4

Let

$g(x, y) = x + y - 3,$

which is a total function. If $x \le 3$, then

$y = 3 - x$

is the result of the minimalization, but if $x > 3$, then there is no $y \in I$ such that $x + y - 3 = 0$. Therefore,

$\mu y\,(g(x, y)) = 3 - x$, for $x \le 3$,
$\mu y\,(g(x, y))$ is undefined, for $x > 3$.

We see from this that even though $g(x, y)$ is a total function, $\mu y\,(g(x, y))$ may only be partial.

As the above example shows, the minimalization operation opens the possibility of defining partial functions recursively. But it turns out that it

also extends the power to define total functions so as to include all computable functions. Again, we merely quote the major result with references to the literature where the details may be found.

Definition 13.2

A function is said to be $\mu$-recursive if it can be constructed from the basis functions by a sequence of applications of the $\mu$-operator and the operations of composition and primitive recursion.

Theorem 13.5

A function is $\mu$-recursive if and only if it is computable.

Proof: For a proof, see Denning, Dennis, and Qualitz (1978, Chapter 13).

The $\mu$-recursive functions therefore give us another model for algorithmic computation.

EXERCISES

1. Use the definitions in Examples 13.1 and 13.2 to prove that $3 + 4 = 7$ and $2 * 3 = 6$.

2. Define the function
$greater(x, y) = 1$ if $x > y$,
$greater(x, y) = 0$ if $x \le y$.
Show that this function is primitive recursive.

3. Consider the function
$equals(x, y) = 1$ if $x = y$,
$equals(x, y) = 0$ if $x \ne y$.
Show that this function is primitive recursive.

4. Let $f$ be defined by
$f(x, y) = x$ if $x \ne y$,
$f(x, y) = 0$ if $x = y$.
Show that this function is primitive recursive.

5. Integer division can be defined by two functions div and rem:
$div(x, y) = n$, where $n$ is the largest integer such that $x \ge ny$, and
$rem(x, y) = x - ny$.
Show that the functions div and rem are primitive recursive.

6. Show that $f(n) = 2^n$ is primitive recursive.

7. Show that the function $g(x, y) = x^y$ is primitive recursive.

8. Write a computer program for computing Ackermann's function. Use it to evaluate $A(2, 5)$ and $A(3, 3)$.

9. Prove the following for the Ackermann function.
(a) $A(1, y) = y + 2$
(b) $A(2, y) = 2y + 3$
(c) $A(3, y) = 2^{y+3} - 3$

10. Use Exercise 9 to compute $A(4, 1)$ and $A(4, 2)$.

11. Give a general expression for $A(4, y)$.

12. Show the sequence of recursive calls in the computation of $A(5, 2)$.

13. Show that Ackermann's function is a total function in $I \times I$.

14. Try to use the program constructed for Exercise 8 to evaluate $A(5, 5)$. Can you explain what you observe?

15. For each $g$ below, compute $\mu y\,(g(x, y))$, and determine its domain.
(a) $g(x, y) = xy$
(b) $g(x, y) = 2^x + y - 3$
(c) $g(x, y) =$ integer part of $(x - 1)/(y + 1)$
(d) $g(x, y) = x \bmod (y + 1)$

16. The definition of pred in Example 13.3, although intuitively clear, does not strictly adhere to the definition of a primitive recursive function. Show how the definition can be rewritten so that it has the correct form.
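Two constructions from this section lend themselves to short programs: Ackermann's function and the minimalization operator. The sketch below is one possible rendering under stated assumptions. The memoization is our own addition to keep small cases fast. The `bound` parameter of `mu` is also an assumption: the true $\mu$-operator searches without limit and may never return, so the sketch gives up and reports `None` instead.

```python
from functools import lru_cache
from typing import Callable, Optional

@lru_cache(maxsize=None)
def ackermann(x: int, y: int) -> int:
    # A(0, y) = y + 1
    if x == 0:
        return y + 1
    # A(x, 0) = A(x - 1, 1)
    if y == 0:
        return ackermann(x - 1, 1)
    # A(x, y + 1) = A(x - 1, A(x, y))
    return ackermann(x - 1, ackermann(x, y - 1))

def mu(g: Callable[[int, int], int], x: int, bound: int = 1000) -> Optional[int]:
    """Smallest y < bound with g(x, y) == 0, or None if the search fails.

    The cutoff is an artifact of this sketch, not of the definition.
    """
    for y in range(bound):
        if g(x, y) == 0:
            return y
    return None

# The function of Example 13.4.
def g_example(x: int, y: int) -> int:
    return x + y - 3
```

Only small arguments are practical for `ackermann`; the recursion depth and the values themselves grow far too quickly for anything much beyond the first argument 3. For `g_example`, the search succeeds exactly when $x \le 3$.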

13.2 Post Systems

A Post system looks very much like an unrestricted grammar, consisting of an alphabet and some production rules by which successive strings can be derived. But there are significant differences in the way in which the productions are applied.

Definition 13.3

A Post system $\Pi$ is defined by

$\Pi = (C, V, A, P),$

where

$C$ is a finite set of constants, consisting of two disjoint sets $C_N$, called the nonterminal constants, and $C_T$, the set of terminal constants,
$V$ is a finite set of variables,
$A$ is a finite set from $C^*$, called the axioms,
$P$ is a finite set of productions.

The productions in a Post system must satisfy certain restrictions. They must be of the form

$x_1 V_1 x_2 \cdots V_n x_{n+1} \to y_1 W_1 y_2 \cdots W_m y_{m+1},$ (13.1)

where $x_i, y_i \in C^*$, and $V_i, W_i \in V$, subject to the requirement that any variable can appear at most once on the left, so that

$V_i \ne V_j$ for $i \ne j$,

and that each variable on the right must appear on the left, that is

$\bigcup_{i=1}^{m} W_i \subseteq \bigcup_{i=1}^{n} V_i.$

Suppose we have a string of terminals of the form $x_1 w_1 x_2 w_2 \cdots w_n x_{n+1}$, where the substrings $x_1, x_2, \ldots$ match the corresponding strings in (13.1) and $w_i \in C^*$. We can then make the identification $w_1 = V_1$, $w_2 = V_2$, ..., and substitute these values for the $W$'s on the right of (13.1). Since every

$W$ is some $V_i$ that occurs on the left, it is assigned a unique value, and we get the new string $y_1 w_i y_2 w_j \cdots y_{m+1}$. We write this as

$x_1 w_1 x_2 w_2 \cdots x_{n+1} \Rightarrow y_1 w_i y_2 w_j \cdots y_{m+1}.$

As for a grammar, we can now talk about the language derived by a Post system.

Definition 13.4

The language generated by the Post system $\Pi = (C, V, A, P)$ is

$L(\Pi) = \{w \in C_T^* : w_0 \Rightarrow^* w \text{ for some } w_0 \in A\}.$

Example 13.5

Consider the Post system with

$C_T = \{a, b\},$
$C_N = \emptyset,$
$V = \{V_1\},$
$A = \{\lambda\},$

and production

$V_1 \to a V_1 b.$

This allows the derivation

$\lambda \Rightarrow ab \Rightarrow aabb.$

In the first step, we apply (13.1) with the identification $x_1 = \lambda$, $V_1 = \lambda$, $x_2 = \lambda$, $y_1 = a$, $W_1 = V_1$, and $y_2 = b$. In the second step, we re-identify $V_1 = ab$, leaving everything else the same. If you continue with this, you will quickly convince yourself that the language generated by this particular Post system is $\{a^n b^n : n \ge 0\}$.

Example 13.6

Consider the Post system with

$C_T = \{1, +, =\},$
$C_N = \emptyset,$
$V = \{V_1, V_2, V_3\},$
$A = \{1 + 1 = 11\},$

and productions

$V_1 + V_2 = V_3 \to V_1 1 + V_2 = V_3 1,$
$V_1 + V_2 = V_3 \to V_1 + V_2 1 = V_3 1.$

The system allows the derivation

$1 + 1 = 11 \Rightarrow 11 + 1 = 111 \Rightarrow 11 + 11 = 1111.$

Interpreting the strings of 1's as unary representations of integers, the derivation can be written as

$1 + 1 = 2 \Rightarrow 2 + 1 = 3 \Rightarrow 2 + 2 = 4.$

The language generated by this Post system is the set of all identities of integer additions, such as $2 + 2 = 4$, derived from the axiom $1 + 1 = 2$.

Example 13.6 illustrates in a simple manner the original intent of Post systems as a mechanism for rigorously proving mathematical statements from a set of axioms. It also shows the inherent awkwardness of such a completely rigorous approach and why it is rarely used. But Post systems, even though they are cumbersome for proving complicated theorems, are general models for computation, as the next theorem shows.

Theorem 13.6

A language is recursively enumerable if and only if there exists some Post system that generates it.

Proof: The arguments here are relatively simple and we sketch them briefly. First, since a derivation by a Post system is completely mechanical, it can be carried out on a Turing machine. Therefore, any language generated by a Post system is recursively enumerable.

For the converse, remember that any recursively enumerable language is generated by some unrestricted grammar $G$, having productions all of the form

$x \to y,$

with $x, y \in (V \cup T)^*$. Given any unrestricted grammar $G$, we create a Post system $\Pi = (V_\Pi, C, A, P_\Pi)$, where $V_\Pi = \{V_1, V_2\}$, $C_N = V$, $C_T = T$, $A = \{S\}$, and with productions

$V_1 x V_2 \to V_1 y V_2,$

for every production $x \to y$ of the grammar. It is then an easy matter to show that a $w$ can be generated by the Post system $\Pi$ if and only if it is in the language generated by $G$.
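The Post system of Example 13.6 is small enough to run mechanically. The following sketch enumerates the strings derivable from the axiom up to a length limit. It is specialized to this one system and is not a general Post-system interpreter: the regular expression plays the role of the left side $V_1 + V_2 = V_3$, and it relies on the fact that every derived string contains exactly one `+` and one `=`, so the identification of the variables is unique. Spaces are omitted from the strings, and `max_len` is our own cutoff to keep the enumeration finite.

```python
import re

AXIOM = "1+1=11"

# Left side V1 + V2 = V3, with each variable bound to a run of 1's.
LEFT_SIDE = re.compile(r"(1*)\+(1*)=(1*)")

def successors(string: str) -> list:
    """Apply both productions of Example 13.6 to one string."""
    match = LEFT_SIDE.fullmatch(string)
    if match is None:
        return []
    v1, v2, v3 = match.groups()
    return [
        f"{v1}1+{v2}={v3}1",   # V1 + V2 = V3  ->  V1 1 + V2 = V3 1
        f"{v1}+{v2}1={v3}1",   # V1 + V2 = V3  ->  V1 + V2 1 = V3 1
    ]

def derive(max_len: int) -> set:
    """All strings of length <= max_len derivable from the axiom."""
    seen = {AXIOM}
    frontier = [AXIOM]
    while frontier:
        next_frontier = []
        for current in frontier:
            for new in successors(current):
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    next_frontier.append(new)
        frontier = next_frontier
    return seen

def is_true_identity(string: str) -> bool:
    """Check that the unary lengths satisfy a + b = c."""
    left, total = string.split("=")
    a, b = left.split("+")
    return len(a) + len(b) == len(total)
```

Every string the enumeration produces is a correct unary addition identity, and false statements such as `1+1=111` never appear, which is the sense in which the system "proves" identities.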

EXERCISES

1. For $\Sigma = \{a, b, c\}$, find a Post system that generates the following languages.
(a) $L(a^* b + a b^* c)$
(b) $L = \{ww\}$
(c) $L = \{a^n b^n c^n\}$

2. Find a Post system that generates
$L = \{w w^R : w \in \{a, b\}^*\}.$

3. For $\Sigma = \{a\}$, what language does the Post system with axiom $\{a\}$ and the following production generate?
$V_1 \to V_1 V_1$

4. What language does the Post system in Exercise 3 generate if the axiom set is $\{a, ab\}$?

5. Find a Post system for proving the identities of integer multiplication, starting from the axiom $1 * 1 = 1$.

6. Give the details of the proof of Theorem 13.6.

7. What language does the Post system with
$V_1 \to a V_1 V_1$
and axiom set $\{ab\}$ generate?

8. A restricted Post system is one on which every production $x \to y$ satisfies, in addition to the usual requirements, the further restriction that the number of variable occurrences on the right and left is the same, i.e., $n = m$ in (13.1). Show that for every language $L$ generated by some Post system, there exists a restricted Post system to generate $L$.

13.3 Rewriting Systems

The various grammars we have studied have a number of things in common with Post systems: They are all based on some alphabet from which one string can be obtained from another. Even a Turing machine can be viewed this way, since its instantaneous description is a string that completely defines its configuration. The program is then just a set of rules for producing one such string from a previous one. These observations can be formalized in the concept of a rewriting system. Generally, a rewriting system consists of an alphabet $\Sigma$ and a set of rules or productions by which

a string in $\Sigma^*$ can produce another. What distinguishes one rewriting system from another is the nature of $\Sigma$ and restrictions for the application of the productions.

The idea is quite broad and allows any number of specific cases in addition to the ones we have already encountered. Here we briefly introduce some less well-known ones that are interesting and also provide general models for computation. For details, see Salomaa (1973) and Salomaa (1985).

Matrix Grammars

Matrix grammars differ from the grammars we have previously studied (which are often called phrase-structure grammars) in how the productions can be applied. For matrix grammars, the set of productions consists of subsets $P_1, P_2, \ldots, P_n$, each of which is an ordered sequence

$x_1 \to y_1, \; x_2 \to y_2, \ldots.$

Whenever the first production of some set $P_i$ is applied, we must next apply the second one to the string just created, then the third one, and so on. We cannot apply the first production of $P_i$ unless all other productions in this set can also be applied.

Example 13.7

Consider the matrix grammar

$P_1: S \to S_1 S_2,$
$P_2: S_1 \to a S_1, \; S_2 \to b S_2 c,$
$P_3: S_1 \to \lambda, \; S_2 \to \lambda.$

A derivation with this grammar is

$S \Rightarrow S_1 S_2 \Rightarrow a S_1 b S_2 c \Rightarrow aa S_1 bb S_2 cc \Rightarrow aabbcc.$

Note that whenever the first rule of $P_2$ is used to create an $a$, the second one also has to be used, producing a corresponding $b$ and $c$. This makes it easy to see that the set of terminal strings generated by this matrix grammar is

$L = \{a^n b^n c^n : n \ge 0\}.$

Matrix grammars contain phrase-structure grammars as a special case in which each $P_i$ contains exactly one production. Also, since matrix grammars represent algorithmic processes, they are governed by Church's thesis. We conclude from this that matrix grammars and phrase-structure grammars have the same power as models of computation. But, as Example 13.7 shows, sometimes the use of a matrix grammar gives a much simpler solution than we can achieve with an unrestricted phrase-structure grammar.
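The all-or-nothing application of a matrix can be sketched in a few lines for the grammar of Example 13.7. Several choices here are our own assumptions: the variables $S_1$ and $S_2$ are renamed to the single characters `X` and `Y` so that plain string replacement works, each production rewrites the leftmost occurrence of its left side (a grammar would allow any occurrence), and `derive` follows one fixed strategy of using $P_2$ exactly $n$ times.

```python
from typing import List, Optional, Tuple

Matrix = List[Tuple[str, str]]

# Example 13.7 with S1 written as "X" and S2 as "Y".
P1: Matrix = [("S", "XY")]
P2: Matrix = [("X", "aX"), ("Y", "bYc")]
P3: Matrix = [("X", ""), ("Y", "")]

def apply_matrix(string: str, matrix: Matrix) -> Optional[str]:
    """Apply every production of the matrix in order.

    Returns None if some production in the sequence is not applicable,
    in which case the matrix as a whole may not be used.
    """
    for left, right in matrix:
        if left not in string:
            return None
        string = string.replace(left, right, 1)
    return string

def derive(n: int) -> Optional[str]:
    """Use P1 once, P2 n times, then P3."""
    string = apply_matrix("S", P1)
    for _ in range(n):
        string = apply_matrix(string, P2)
    return apply_matrix(string, P3)
```

With this strategy `derive(n)` yields $a^n b^n c^n$, since each use of `P2` is forced to add one `a` together with one `b` and one `c`.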

Markov Algorithms

A Markov algorithm is a rewriting system whose productions

$x \to y$

are considered ordered. In a derivation, the first applicable production must be used. Furthermore, the leftmost occurrence of the substring $x$ must be replaced by $y$. Some of the productions may be singled out as terminal productions; they will be shown as

$x \to \cdot y.$

A derivation starts with some string $w \in \Sigma^*$ and continues until either a terminal production is used or until there are no applicable productions.

For language acceptance, a set $T \subseteq \Sigma$ of terminals is identified. Starting with a terminal string, productions are applied until the empty string is produced.

Definition 13.5

Let $M$ be a Markov algorithm with alphabet $\Sigma$ and terminals $T$. Then the set

$L(M) = \{w \in T^* : w \Rightarrow^* \lambda\}$

is the language accepted by $M$.

Example 13.8

Consider the Markov algorithm with $\Sigma = T = \{a, b\}$ and productions

$ab \to \lambda,$
$ba \to \lambda.$

Every step in the derivation annihilates a substring $ab$ or $ba$, so

$L(M) = \{w \in \{a, b\}^* : n_a(w) = n_b(w)\}.$
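The control rule of a Markov algorithm (first applicable production, leftmost occurrence, stop on a terminal production or when nothing applies) is easy to express as a loop. The interpreter below is a sketch under our own conventions: productions are triples `(left, right, is_terminal)` listed in priority order, the empty string represents $\lambda$, and `max_steps` is a safety cutoff that is not part of the definition.

```python
from typing import List, Tuple

Production = Tuple[str, str, bool]   # (left side, right side, is_terminal)

def run_markov(productions: List[Production], word: str,
               max_steps: int = 10_000) -> str:
    """Run the algorithm on word and return the final string."""
    for _ in range(max_steps):
        for left, right, is_terminal in productions:
            if left in word:
                # Replace only the leftmost occurrence of the left side.
                word = word.replace(left, right, 1)
                if is_terminal:
                    return word
                break                # restart from the first production
        else:
            return word              # no production is applicable
    raise RuntimeError("step limit reached")

def accepts(productions: List[Production], word: str) -> bool:
    """A word is accepted if the derivation ends in the empty string."""
    return run_markov(productions, word) == ""

# Example 13.8: ab -> lambda, ba -> lambda.
EXAMPLE_13_8: List[Production] = [("ab", "", False), ("ba", "", False)]
```

For Example 13.8 the interpreter accepts exactly the strings with equally many a's and b's among those tried, for instance `abba` but not `aab`.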

Example 13.9

Find a Markov algorithm for

$L = \{a^n b^n : n > 0\}.$

An answer is

$ab \to S,$
$aSb \to S,$
$S \to \cdot \lambda.$

If in this last example we take the first two productions and reverse the left and right sides, we get a context-free grammar that generates the language $L$. In a certain sense, Markov algorithms are simply phrase-structure grammars working backward. This cannot be taken too literally, since it is not clear what to do with the last production. But the observation does provide a starting point for a proof of the following theorem that characterizes the power of Markov algorithms.

Theorem 13.7

A language is recursively enumerable if and only if there exists a Markov algorithm for it.

Proof: See Salomaa (1985, p. 35).

L-Systems

The origins of L-systems are quite different from what we might expect. Their developer, A. Lindenmayer, used them to model the growth pattern of certain organisms. L-systems are essentially parallel rewriting systems. By this we mean that in each step of a derivation, every symbol has to be rewritten. For this to make sense, the productions of an L-system must be of the form

$a \to u,$ (13.2)

where $a \in \Sigma$ and $u \in \Sigma^*$. When a string is rewritten, one such production must be applied to every symbol of the string before the new string is generated.

Example 13.10

Let $\Sigma = \{a\}$ and

$a \to aa$

define an L-system. Starting from the string $a$, we can make the derivation

$a \Rightarrow aa \Rightarrow aaaa \Rightarrow aaaaaaaa.$

The set of strings so derived is clearly

$L = \{a^{2^n} : n \ge 0\}.$

Note again how such special rewriting systems are able to deal with problems that are quite difficult for phrase-structure grammars.

It is known that L-systems with productions of the form (13.2) are not sufficiently general to provide for all algorithmic computations. An extension of the idea provides the necessary generalization. In an extended L-system, productions are of the form

$(x, a, y) \to u,$

where $a \in \Sigma$ and $x, y, u \in \Sigma^*$, with the interpretation that $a$ can be replaced by $u$ only if it occurs as part of the string $xay$. It is known that such extended L-systems are general models of computation. For details, see Salomaa (1985).

EXERCISES

1. Find a matrix grammar for
$L = \{ww : w \in \{a, b\}^*\}.$

2. What language is generated by the matrix grammar
$P_1: S \to S_1 S_2,$
$P_2: S_1 \to a S_1 b, \; S_2 \to b S_2 a,$
$P_3: S_1 \to \lambda, \; S_2 \to \lambda.$

3. Suppose that in Example 13.7 we change the last group of productions to
$P_3: S_1 \to \lambda, \; S_2 \to S.$
What language is generated by this matrix grammar?

4. Why does the Markov algorithm in Example 13.9 not accept $abab$?

5. Find a Markov algorithm that derives the language $L = \{a^n b^n c^n : n \ge 1\}$.

6. Find a Markov algorithm that accepts
$L = \{a^n b^m a^{nm} : n \ge 1, m \ge 1\}.$

7. Find an L-system that generates $L(aa^*)$.

8. What is the set of strings generated by the L-system with productions
$a \to aa,$
$a \to aaa$
when started with the string $a$?
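The parallel rewriting that distinguishes L-systems can be sketched as follows. The code assumes a deterministic system with exactly one production per symbol, stored in a dictionary; that covers Example 13.10 but not systems with several productions for the same symbol, such as the one in Exercise 8.

```python
from typing import Dict, List

def l_system_step(rules: Dict[str, str], string: str) -> str:
    """Rewrite every symbol of the string simultaneously."""
    return "".join(rules[symbol] for symbol in string)

def l_system_derive(rules: Dict[str, str], start: str, steps: int) -> List[str]:
    """Return the start string followed by the result of each step."""
    strings = [start]
    for _ in range(steps):
        strings.append(l_system_step(rules, strings[-1]))
    return strings

# Example 13.10: the single production a -> aa.
EXAMPLE_13_10 = {"a": "aa"}
```

Because every `a` is doubled in each step, the lengths of the derived strings run through the powers of two, in line with the language stated in Example 13.10.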

An Introduction to Computational Complexity

In studying algorithms and computations, we have so far paid little attention to what actually can be expected when we apply these ideas to real computers. We have been almost exclusively concerned with questions of the existence or nonexistence of algorithms for certain problems. This is an appropriate starting point for a theory but clearly of limited practical significance. For actual computations, we need not only to know that a problem can be solved in principle, but we also must be able to construct algorithms that can be carried out with reasonable efficiency. Problems that can be solved effectively are called tractable, a descriptive term that will be given a more precise meaning in this chapter.

In the practical world of software development, efficiency has many facets. Sometimes, we are concerned with the efficient use of the computer's resources, such as processor time and memory space. At other times, we may be more concerned with how quickly software can be created, how effectively it can be maintained, or how reliable it is. At still other times, we may emphasize the efficiency with which a user's problems can be solved. All this is much too complicated to be captured by any abstract theory. All we can do is to focus on some of the more tangible issues and create the

appropriate abstract framework for these. In the study of complexity, the primary concern is the efficiency of a computation as measured by its time and space requirements. We refer to this as the time-complexity and the space-complexity of algorithms. Computational complexity theory is an extensive topic, most of which is well outside the scope of this text. There are some results, however, that are simply stated and easily appreciated, and that throw further light on the nature of languages and computations. Most of the results that have been developed address the space and time efficiency of a computation, leading to the important topic of complexity theory.

In this section, we give a brief overview of some complexity results. For the most part, proofs are difficult and we will dispense with them by reference to appropriate sources. Our intent here is to present the flavor of the subject matter and show how it relates to what we know about languages and automata. For this reason we will allow ourselves a great deal of latitude, both in the selection of topics and in the formality of the discussion. We will limit our discussion here to issues of time-complexity. There are similar results for space-complexity, but time-complexity is a little more accessible.

14.1 Efficiency of Computation

Let us start with a concrete example. Given a list of one thousand integers, we want to sort them in some way, say, in ascending order. Sorting is a simple problem but also one that is very fundamental in computer science. If we now ask the question "How long will it take to do this task?" we see immediately that much more information is needed before we can answer it. Clearly, the number of items in the list plays an important role in how much time will be taken, but there are other factors. There is the question of what computer we use and how we write the program. Also, there are a number of sorting methods so that selection of the algorithm is important. There are probably a few more things you can think of that need to be looked at before you can even make a rough guess of the time requirements. If we have any hope of producing some general picture of sorting, most of these issues have to be ignored, and we must concentrate on those that are most fundamental.

For our discussion of computational complexity, we will make the following simplifying assumptions.

1. The model for our study will be a Turing machine. The exact type of Turing machine to be used will be discussed below.

2. The size of the problem will be denoted by $n$. For our sorting problem, $n$ is obviously the number of items in the list. Although the size of a

problem is not always so easily characterized, we can generally relate it in some way to a positive integer.

3. In analyzing an algorithm, we are less interested in its performance on a specific case than in its general behavior. We are particularly concerned with how the algorithm behaves when the problem size increases. Because of this, the primary question is how fast the resource requirements grow as $n$ becomes large.

Our immediate goal will then be to characterize the time requirement of a problem as a function of its size, using a Turing machine as the computer model. First, we give some meaning to the concept of time for a Turing machine. We think of a Turing machine as making one move per time unit, so the time taken by a computation is the number of moves made. As stated, we want to study how the computational requirements grow with the size of the problem. Normally, in the set of all problems of a given size, there is some variation. Here we are interested only in the worst case that has the highest resource requirements. By saying that a computation has a time-complexity $T(n)$, we mean that the computation for any problem of size $n$ can be completed in no more than $T(n)$ moves on some Turing machine.

After settling on a specific type of Turing machine as a computational model, we could analyze algorithms by writing explicit programs and counting the number of steps involved in solving the problem. But, for a variety of reasons, this is not overly useful. First, the number of operations performed may vary with the small details of the program and so may depend strongly on the programmer. Second, from a practical standpoint, we are interested in how the algorithm performs in the real world, which may differ considerably from how it does on a Turing machine. The best we can hope for is that the Turing machine analysis is representative of the major aspects of the real-life performance, for example, the asymptotic growth rate of the time complexity. Our first attempt at understanding the resource requirements of an algorithm is therefore invariably an order-of-magnitude analysis in which we use the $O$, $\Theta$, and $\Omega$ notation introduced in Chapter 1. In spite of the apparent informality of this approach, we often get very useful information.

Example 14.1: Given a set of $n$ numbers $x_1, x_2, \ldots, x_n$ and a key number $x$, determine if the set contains $x$. Unless the set is organized in some way, the simplest algorithm is just a linear search, in which we compare $x$ successively against $x_1, x_2, \ldots$, until we either find a match or we get to the last element of the set. Since we may find a match on the first comparison or on the last, we cannot predict how much work is involved, but we know that, in the worst case, we have

to make $n$ comparisons. We can then say that the time-complexity of this linear search is $O(n)$, or even better, $\Theta(n)$. In making this analysis, we made no specific assumptions about what machine this is run on or how the algorithm is implemented.

Exercises

1. Suppose you are given a set of $n$ numbers $x_1, x_2, \ldots, x_n$ and are asked to determine whether this set contains any duplicates.
(a) Suggest an algorithm and find an order-of-magnitude expression for its time-complexity.
(b) Examine if the implementation of the algorithm on a Turing machine affects your conclusions.

2. Repeat Exercise 1, this time determining if the set contains any triplicates. Is the algorithm as efficient as possible?

3. Review how the choice of algorithm affects the efficiency of sorting.

14.2 Turing Machines and Complexity

In Chapter 10 we argued that the various types of Turing machines were equivalent in their power to solve problems. This allowed us to take whatever type was most convenient for an argument and even use programs in higher-level computer languages to avoid some of the tedium involved in using the standard Turing machine model. But when we make complexity an issue, the equivalence between the various types of Turing machines no longer holds.

Example 14.2: In Example 9.4 we constructed a single-tape Turing machine for the language

$L = \{a^nb^n : n \ge 1\}.$

A look at that algorithm will show that for $w = a^nb^n$ it takes roughly $2n$ steps to match each $a$ with the corresponding $b$. Therefore the whole computation takes $O(n^2)$ time. But, as we later indicated in Example 10.1, with a two-tape machine we can use a different algorithm. We first copy all the $a$'s to the second

tape, then match them against the $b$'s on the first. The situation before and after the copying is shown in Figure 14.1. Both the copying and the matching can be done in $O(n)$ time and we see that a two-tape machine has time-complexity $O(n)$.

Figure 14.1: (a) Initial tapes. (b) Tapes after copying of $a$'s.

Example 14.3: In Sections 5.2 and 6.3 we discussed the membership problem for context-free languages. If we take the length of the input string $w$ as the problem size $n$, then the exhaustive search method has complexity $O(n^M)$, where $M$ depends on the grammar. The more efficient CYK algorithm has complexity $O(n^3)$. Both these algorithms are deterministic. A nondeterministic algorithm for this problem proceeds by simply guessing which sequence of productions is applied in the derivation of $w$. If we work with a grammar that has no unit or $\lambda$-productions, the length of the derivation is essentially $|w|$, so we have an $O(n)$ algorithm.

Example 14.4: We now introduce the satisfiability problem, which plays an important role in complexity theory. A logic or boolean constant or variable is one that can take on exactly two values, true or false, which we will denote by 1 and 0, respectively. Boolean operators are then used to combine boolean constants and variables into boolean expressions. The simplest boolean operators are or, denoted by $\lor$ and defined by

$0 \lor 1 = 1 \lor 0 = 1 \lor 1 = 1,$
$0 \lor 0 = 0,$

and the and operation ($\land$), defined by

$0 \land 0 = 0 \land 1 = 1 \land 0 = 0,$
$1 \land 1 = 1.$

Also needed is negation, denoted by a bar, and defined by

$\bar{0} = 1,$
$\bar{1} = 0.$

We consider now boolean expressions in conjunctive normal form. In this form, we create expressions from variables $x_1, x_2, \ldots, x_n$, starting with

$e = t_i \land t_j \land \cdots \land t_k. \qquad (14.1)$

The terms $t_i, t_j, \ldots, t_k$ are created by or-ing together variables and their negation, that is,

$t_i = s_l \lor s_m \lor \cdots \lor s_p, \qquad (14.2)$

where each $s_l, s_m, \ldots, s_p$ stands for some variable or the negation of a variable. The satisfiability problem is then simply stated: given an expression $e$ in conjunctive normal form, is there an assignment of values to the variables $x_1, x_2, \ldots, x_n$ that will make the value of $e$ true? For a specific case, look at

$e_1 = (\bar{x}_1 \lor x_2) \land (x_1 \lor x_3).$

The assignment $x_1 = 0$, $x_2 = 1$, $x_3 = 1$ makes $e_1$ true so that this expression is satisfiable. On the other hand,

$e_2 = (x_1 \lor x_2) \land \bar{x}_1 \land \bar{x}_2$

is not satisfiable because every assignment for the variables $x_1$ and $x_2$ will make $e_2$ false.

A deterministic algorithm for the satisfiability problem is easy to discover. We take all possible values for the variables $x_1, x_2, \ldots, x_n$ and evaluate the expression. Since there are $2^n$ such choices, this exhaustive approach has exponential time complexity.

Again, the nondeterministic approach simplifies matters. If $e$ is satisfiable, we guess the value of each $x_i$ and then evaluate $e$. This is essentially an $O(n)$ algorithm. As in Example 14.3, we have a deterministic exhaustive search algorithm whose complexity is exponential and a linear nondeterministic one. However, unlike the previous example, we do not know of any nonexponential deterministic algorithm.
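The exhaustive deterministic approach to satisfiability can be made concrete with a short sketch. This is only an illustration under stated assumptions: clauses are encoded as lists of nonzero integers (a positive integer $k$ for $x_k$, a negative one for its negation), the names `evaluate` and `satisfying_assignment` are hypothetical, and the two test expressions are read as $e_1 = (\bar{x}_1 \lor x_2) \land (x_1 \lor x_3)$ and $e_2 = (x_1 \lor x_2) \land \bar{x}_1 \land \bar{x}_2$.

```python
from itertools import product
from typing import Optional

Clause = list[int]  # k stands for x_k, -k for the negation of x_k


def evaluate(cnf: list[Clause], assignment: dict[int, int]) -> bool:
    """Evaluate a conjunctive-normal-form expression under a 0/1 assignment."""
    return all(
        any((assignment[abs(lit)] == 1) == (lit > 0) for lit in clause)
        for clause in cnf
    )


def satisfying_assignment(cnf: list[Clause], n: int) -> Optional[dict[int, int]]:
    """Try all 2**n assignments of x_1..x_n; return the first one that works."""
    for values in product((0, 1), repeat=n):
        assignment = dict(zip(range(1, n + 1), values))
        if evaluate(cnf, assignment):
            return assignment
    return None


if __name__ == "__main__":
    e1 = [[-1, 2], [1, 3]]    # assumed reading of e1
    e2 = [[1, 2], [-1], [-2]]  # assumed reading of e2
    print(satisfying_assignment(e1, 3))
    print(satisfying_assignment(e2, 2))
```

The function `evaluate` alone corresponds to the checking half of the nondeterministic algorithm: once values have been guessed, one pass over the expression decides whether the guess works. The loop in `satisfying_assignment` is what replaces the guess in the deterministic version, and it is the source of the $2^n$ factor.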

These examples suggest that complexity questions are affected by the type of Turing machine we use and that the issue of determinism versus nondeterminism is a particularly crucial one. Example 14.1 suggests that algorithms for a multitape machine may be reasonably close to what we might use when we program in a computer language. For this reason, we will use a multitape Turing machine as our model for studying complexity issues.

Exercises

For the exercises in this set, assume that the Turing machines involved are all deterministic.

1. Find a linear-time algorithm for membership in $\{ww : w \in \{a, b\}^*\}$ using a two-tape Turing machine. What is the best you could expect on a one-tape machine?

2. Show that any computation that can be performed on a single-tape, off-line Turing machine in time $O(T(n))$ also can be performed on a standard Turing machine in time $O(T(n))$.

3. Show that any computation that can be performed on a standard Turing machine in time $O(T(n))$ also can be performed on a Turing machine with one semi-infinite tape in time $O(T(n))$.

4. Show that any computation that can be performed on a two-tape machine in time $O(T(n))$ can be performed on a standard Turing machine in time $O(T^2(n))$.

5. In Example 14.2 we claimed that the first algorithm had time complexity $O(n^2)$ and the second $O(n)$. Can we be more precise and claim that $T(n) = \Theta(n^2)$ for the first case, and $T(n) = \Theta(n)$ for the second? How does this strengthen the argument in Example 14.2?

6. Rewrite the boolean expression $(x_1 \land x_2) \lor x_3$ in conjunctive normal form.

7. Determine whether or not the expression
$(x_1 \lor x_2 \lor x_3) \land (x_1 \lor x_2 \lor x_3) \land (x_1 \lor x_2 \lor x_3)$ [negation bars not legible in the source]
is satisfiable.
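The gap between the one-tape and two-tape algorithms for $\{a^nb^n : n \ge 1\}$ in Example 14.2 can be seen by counting head moves. The sketch below is not a Turing machine simulator; it only mimics the head movement of the two strategies under a simplified counting convention (one unit per cell traversed), and the function names are hypothetical.

```python
def single_tape_moves(w: str) -> tuple[bool, int]:
    """Mark-and-sweep on one tape: match each a with a b by sweeping right and back."""
    tape = list(w)
    n = len(tape)
    head = 0   # leftmost cell not yet marked 'x'
    moves = 0
    while head < n and tape[head] == "a":
        tape[head] = "x"
        j = head + 1
        # Sweep right over unmatched a's and already matched b's (marked 'y').
        while j < n and tape[j] in ("a", "y"):
            j += 1
        moves += j - head
        if j == n or tape[j] != "b":
            return False, moves          # no b left for this a
        tape[j] = "y"
        moves += j - (head + 1)          # sweep back to the next unmarked cell
        head += 1
    moves += n - head                    # final sweep to check for leftovers
    accepted = head > 0 and all(c == "y" for c in tape[head:])
    return accepted, moves


def two_tape_moves(w: str) -> tuple[bool, int]:
    """Copy the a's to a second tape, then match them against the b's."""
    i = 0
    copied = 0  # number of a's currently on the second tape
    moves = 0
    while i < len(w) and w[i] == "a":
        i, copied, moves = i + 1, copied + 1, moves + 1
    while i < len(w) and w[i] == "b" and copied > 0:
        i, copied, moves = i + 1, copied - 1, moves + 1
    accepted = len(w) > 0 and i == len(w) and copied == 0
    return accepted, moves


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        w = "a" * n + "b" * n
        print(n, single_tape_moves(w)[1], two_tape_moves(w)[1])
```

Under this convention the counters for $a^nb^n$ come out as $2n^2$ and $2n$, which matches the $O(n^2)$ and $O(n)$ estimates of the example; the exact constants depend on the counting convention and carry no significance.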

14.3 Language Families and Complexity Classes

In the Chomsky hierarchy for language classification, we associate language families with classes of automata, where each class of automata is defined by the nature of its temporary storage. Another way of classifying languages is to use a Turing machine of a particular type but consider time complexity a distinguishing factor. To do so, we first define the time complexity of a language.

Definition 14.1: We say that a Turing machine accepts a language $L$ in time $T(n)$ if every $w$ in $L$ with $|w| \le n$ is accepted in $O(T(n))$ moves. If $M$ is nondeterministic, this implies that for every $w \in L$, there is at least one sequence of moves of length $O(T(|w|))$ that leads to acceptance.

Definition 14.2: A language $L$ is said to be a member of the class $DTIME(T(n))$ if there exists a deterministic multitape Turing machine that accepts $L$ in time $T(n)$. A language $L$ is said to be a member of the class $NTIME(T(n))$ if there exists a nondeterministic multitape Turing machine that accepts $L$ in time $T(n)$.

Some relations between these complexity classes, such as

$DTIME(T(n)) \subseteq NTIME(T(n)),$

and

$T_1(n) = O(T_2(n))$

implies

$DTIME(T_1(n)) \subseteq DTIME(T_2(n)),$

are obvious, but from here the situation gets obscure quickly. What we can say is that as the order of $T(n)$ increases, we take in progressively more languages.

Theorem 14.1: For every integer $k \ge 1$,

$DTIME(n^k) \subset DTIME(n^{k+1}).$

Proof: This follows from a result in Hopcroft and Ullman (1979, p. 299).

The conclusion we can draw from this is that there are some languages that can be accepted in time $n^2$ for which there is no linear membership algorithm, that there are languages in $DTIME(n^3)$ that are not in $DTIME(n^2)$, and so on. This gives us an infinite number of nested complexity classes. We get even more if we allow exponential time complexity. In fact, there is no limit to this; no matter how rapidly the complexity function $T(n)$ grows, there is always something outside $DTIME(T(n))$.

Theorem 14.2: There is no total Turing computable function $f(n)$ such that every recursive language is in $DTIME(f(n))$.

Proof: Consider the alphabet $\Sigma = \{0, 1\}$, with all strings in $\Sigma^+$ arranged in proper order $w_1, w_2, \ldots$. Also, assume that we have a proper ordering for the Turing machines in $M_1, M_2, \ldots$.

Assume now that the function $f(n)$ in the statement of the theorem exists. We can then define the language

$L = \{w_i : M_i \text{ does not accept } w_i \text{ in } f(|w_i|) \text{ steps}\}. \qquad (14.3)$

We claim that $L$ is recursive. To see this, consider any $w \in \Sigma^+$ and compute first $f(|w|)$. By assuming that $f$ is a total Turing computable function, this is possible. We next find the position $i$ of $w$ in the sequence $w_1, w_2, \ldots$. This is also possible because the sequence is in proper order. When we have $i$, we find $M_i$ and let it operate on $w$ for $f(|w|)$ steps. This will tell us whether or not $w$ is in $L$, so $L$ is recursive.

But we can now show that $L$ is not in $DTIME(f(n))$. Suppose it were. Since $L$ is recursive, there is some $M_k$ such that $L = L(M_k)$. Is $w_k$ in $L$? If we claim that $w_k$ is in $L$, then $M_k$ accepts $w_k$ in $f(|w_k|)$ steps. This is because $L \in DTIME(f(n))$ and every $w \in L$ is accepted by $M_k$ in time $f(|w|)$. But this contradicts (14.3). Conversely, we get a contradiction if we assume that $w_k \notin L$. The inability to resolve this issue is a typical diagonalization result and leads us to conclude that the original assumption, namely the existence of a computable $f(n)$, must be false.

Theorems 14.1 and 14.2 allow us to make various claims, for example, that there is a language in $DTIME(n^4)$ that is not in $DTIME(n^3)$. Although this may be of theoretical interest, it is not clear that such a result

has any practical significance. At this point, we have no clue what the characteristics of a language in $DTIME(n^4)$ might be. We can get a little more insight into the matter if we relate the complexity classification to the languages in the Chomsky hierarchy. We will look at some simple examples that give some of the more obvious results.

Example 14.5: Every regular language can be recognized by a deterministic finite automaton in time proportional to the length of the input. Therefore

$L_{REG} \subseteq DTIME(n).$

But $DTIME(n)$ includes much more than $L_{REG}$. We have already established in Example 14.2 that the context-free language $\{a^nb^n : n \ge 0\}$ can be recognized in time $O(n)$. The argument given there can be used for even more complicated languages.

Example 14.6: The non-context-free language $L = \{ww : w \in \{a, b\}^*\}$ is in $NTIME(n)$. This is straightforward, as we can recognize strings in this language by the algorithm

1. Copy the input from the input file to tape 1. Nondeterministically guess the middle of this string.
2. Copy the second part to tape 2.
3. Compare the symbols on tape 1 and tape 2 one by one.

Clearly all of the steps can be done in $O(|w|)$ time, so $L \in NTIME(n)$.

Actually, we can show that $L \in DTIME(n)$ if we can devise an algorithm for finding the middle of a string in $O(n)$ time. This can be done: we look at each symbol on tape 1, keeping a count on tape 2, but counting only every second symbol. We leave the details as an exercise.

Example 14.7: It follows from Example 14.3 that

$L_{CF} \subseteq DTIME(n^3)$

and

$L_{CF} \subseteq NTIME(n).$
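The deterministic version of the $\{ww\}$ algorithm hinges on locating the middle of the input in one pass by counting only every second symbol. A minimal sketch of that idea follows; it works on an ordinary Python string rather than on Turing machine tapes, and the names `find_middle` and `is_ww` are hypothetical.

```python
def find_middle(w: str) -> int:
    """Scan w once, advancing a counter on every second symbol.

    The counter plays the role of the marks written on the second tape;
    after the scan it equals len(w) // 2.
    """
    count = 0
    for position, _symbol in enumerate(w, start=1):
        if position % 2 == 0:
            count += 1
    return count


def is_ww(w: str) -> bool:
    """Decide membership in {ww : w in {a, b}*} in time linear in len(w)."""
    if len(w) % 2 != 0:
        return False          # odd-length strings cannot be of the form ww
    middle = find_middle(w)
    # Compare the first half with the second half, symbol by symbol.
    return all(w[i] == w[middle + i] for i in range(middle))


if __name__ == "__main__":
    for candidate in ["", "aa", "abab", "abba", "aba", "abaaba"]:
        print(repr(candidate), is_ww(candidate))
```

Each of the two passes touches every symbol a constant number of times, so the total work is proportional to $|w|$, in line with the claim that the guess of the middle is not essential here.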

Example 14.8: Consider now the family of context-sensitive languages. Exhaustive search parsing is possible here also since at every step only a limited number of productions are applicable. Therefore, every string of length $n$ can be parsed in time $n^M$, where $M$ depends on the grammar. Note, however, that we cannot claim from this that

$L_{CS} \subseteq DTIME(n^M)$

because we cannot put an upper bound on $M$.

From these examples we note a trend: as $T(n)$ increases, more and more of the families $L_{REG}$, $L_{CF}$, $L_{CS}$ are covered. But the connection between the Chomsky hierarchy and the complexity classes is tenuous and not very clear.

Exercises

1. Complete the argument in Example 14.6.

2. Show that $L = \{ww^Rw : w \in \{a, b\}^+\}$ is in $DTIME(n)$.

3. Show that $L = \{www : w \in \{a, b\}^+\}$ is in $DTIME(n)$.

4. Show that there are languages that are not in $NTIME(2^n)$.

14.4 The Complexity Classes P and NP

Since the attempt to produce meaningful language hierarchies via time-complexities with different growth rates appears to be unproductive, let us ignore some factors that are less important, for example by removing some uninteresting distinctions, such as that between $DTIME(n^k)$ and $DTIME(n^{k+1})$. We can argue that the difference between, say, $DTIME(n)$ and $DTIME(n^2)$ is not fundamental, since some of it depends on the specific model of Turing machine we have (e.g., how many tapes), and it is not a priori clear which model is most appropriate for a real computer. This leads us to consider the famous complexity class

$P = \bigcup_{i \ge 1} DTIME(n^i).$

This class includes all languages that are accepted by some deterministic Turing machine in polynomial time, without any regard to the degree of the polynomial. As we have already seen, $L_{REG}$ and $L_{CF}$ are in $P$.

Since the distinction between deterministic and nondeterministic complexity classes appears to be fundamental, we also introduce

$NP = \bigcup_{i \ge 1} NTIME(n^i).$

Obviously

$P \subseteq NP,$

but what is not known is if this containment is proper. While it is generally believed that there are some languages in $NP$ that are not in $P$, no one has yet found an example of this.

The interest in these complexity classes, particularly in the class $P$, comes from an attempt to distinguish between realistic and unrealistic computations. Certain computations, although theoretically possible, have such high resource requirements that in practice they must be rejected as unrealistic on existing computers, as well as on supercomputers yet to be designed. Such problems are sometimes called intractable to indicate that, while in principle computable, there is no realistic hope of a practical algorithm. To understand this better, computer scientists have attempted to put the idea of intractability on a formal basis. One attempt to define the term intractable is made in what is generally called the Cook-Karp thesis. In the Cook-Karp thesis, a problem that is in $P$ is called tractable, and one that is not is said to be intractable.

Is the Cook-Karp thesis a good way of separating problems we can work with realistically from those we cannot? The answer is not clear-cut. Obviously, any computation that is not in $P$ has time complexity that grows faster than any polynomial, and its requirements will increase very quickly with the problem size. Even for a function like $2^{0.1n}$, this will be excessive for large $n$, say $n \ge 1000$, and we might feel justified in calling a problem with this complexity intractable. But what about problems that are in $DTIME(n^{100})$? While the Cook-Karp thesis calls such a problem tractable, one surely cannot do much with it even for small $n$. The justification for the Cook-Karp thesis seems to lie in the empirical observation that most practical problems in $P$ are in $DTIME(n)$, $DTIME(n^2)$, or $DTIME(n^3)$, while those outside this class tend to have exponential complexities. Among practical problems, a clear distinction exists between problems in $P$ and those not in $P$.

The study of the relation between the complexity classes $P$ and $NP$ has generated particular interest among computer scientists. At the root of this is the question whether or not

$P = NP.$

This is one of the fundamental unsolved problems in the theory of computation. To explore it, computer scientists have introduced a variety of

L. Definition 14. therr .'1l-H:til.discussion of which t:irrrbe fcrund in Hopcroft and Ullman (1979).t this rrreans ha.L/ e NP is polynomia. Loosely speaking.L2 in such a way that w1 € L1 if nnd only if w2 €. an NP-complete problem is one that is as hirrcl as any NP problem and in sorrre sen$e is equivalent to all of thern. if Lz e NP.t is P:NP.14.4 A Iangrrage /. Florr this we see that if . rr. is said to be NP*corrrplct<: if I e NP and if every . We e'cocle specific instances as a string that is actrrpted if and only if the expressionis satisfinble. L2.L2. a largc rurrrrberof other NPcomplete probltrms have heen found. Similarly.T.ngrragein NP is also in P.4 THU Col.t the satisfiability problem is NP-complete is known as Cookts theorern.L2 if there exists a determirilstic T\rring ma.L1 e P. a' be vir:w.chineby which an1' u1 irr thr: alphabet of "L1can be transfonned irr polynomial time to a tu2 in the alphabet of . It follows easily frorn these definitions that if sorne . llM A ltr"nguage.Lr is polyrromial-time reducible to . The staltementtha.sa language problem. thcrr trr E NP.lplnxrry Cllessns P aNn NP 355 related concepts and questions. I In addition to the satisfiability problern.*. tha. This problem is NP-complett:.l-time reducible to . For all of thenr we carr lirrd txponuntirrl .central role for the study of this tpcstion.L1 is NP-txrmplete ruljn". This puts NP-cornpltlterrttssin tr.T tinte algorithm f'or any NP-cornplete larrguage."*j#l"ll'-'. Wha.Llis said to be polynomial-time reducible to sornelirrrgrra.sto be explained. therr cvcry la. Orre of them is the idea of an NP-<:omplete problern. arrd if L2 e P.-'r'J::'::.ge .rl a.

algorithms, but for none of them has anyone discovered a polynomial-time algorithm. These failures lead us to believe that probably

$P \ne NP,$

but until someone produces an actual language in $NP$ that is not in $P$ or, alternatively, until someone proves that no such language exists, the question remains open.

Exercises

1. Prove the statement that if a language $L_1$ is NP-complete and polynomial-time reducible to $L_2$, then $L_2$ is also NP-complete (assume that $L_2$ is in $NP$).

2. Consult books on complexity theory, and compile a list of problems that are NP-complete.

3. Is it possible that the question $P = NP$ is undecidable?
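The remark in Section 14.4 that a problem in $DTIME(n^{100})$ is nominally tractable while one with complexity $2^{0.1n}$ is not can be checked with a little exact integer arithmetic. The sketch below is only a numerical illustration; the helper names are hypothetical, and $n$ is restricted to multiples of 10 so that $0.1n$ is an integer.

```python
def exponential(n: int) -> int:
    """2**(0.1 * n), computed exactly for n divisible by 10."""
    assert n % 10 == 0, "this sketch only handles multiples of 10"
    return 2 ** (n // 10)


def polynomial(n: int, degree: int) -> int:
    """n**degree as an exact integer."""
    return n ** degree


def digits(value: int) -> int:
    """Number of decimal digits, used as a rough measure of magnitude."""
    return len(str(value))


if __name__ == "__main__":
    print("n, digits of 2^(0.1n), digits of n^3, digits of n^100")
    for n in (10, 100, 1000, 10_000, 100_000):
        print(n, digits(exponential(n)), digits(polynomial(n, 3)),
              digits(polynomial(n, 100)))
```

At $n = 1000$ the exponential already dwarfs $n^3$, yet it is still far below $n^{100}$, and it overtakes $n^{100}$ only for considerably larger $n$. This is the sense in which the polynomial/exponential boundary is a useful rule of thumb rather than a sharp practical criterion.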

Solutions and Hints for Selected Exercises

Chapter 1

Section 1.1

5. To prove that two sets are equal, we must show that an element is in the first set if and only if it is in the second. Suppose $x \in \overline{S_1 \cup S_2}$. Then $x \notin S_1 \cup S_2$, which means that $x$ cannot be in $S_1$ or in $S_2$, that is, $x \in \overline{S_1} \cap \overline{S_2}$. Conversely, if $x \in \overline{S_1} \cap \overline{S_2}$, then $x$ is not in $S_1$ and $x$ is not in $S_2$, that is, $x \in \overline{S_1 \cup S_2}$.

6. This can be proven by an induction on the number of sets. Let $Z = S_1 \cup S_2 \cup \cdots \cup S_n$. Then

$S_1 \cup S_2 \cup \cdots \cup S_n \cup S_{n+1} = Z \cup S_{n+1}.$

By the standard DeMorgan's law,

$\overline{Z \cup S_{n+1}} = \overline{Z} \cap \overline{S_{n+1}}.$

With the inductive assumption, the relation is true for up to $n$ sets, that is,

$\overline{Z} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n}.$

Therefore

$\overline{Z} \cap \overline{S_{n+1}} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_n} \cap \overline{S_{n+1}},$

completing the inductive step.

8. Suppose $S_1 = S_2$. Then $S_1 \cap \overline{S_2} = \overline{S_1} \cap S_2 = S_1 \cap \overline{S_1} = \emptyset$ and the entire expression is the empty set. Suppose now that $S_1 \ne S_2$ and that there is an element $x$ in $S_1$ but not in $S_2$. Then $x \in \overline{S_2}$ so that $S_1 \cap \overline{S_2} \ne \emptyset$. The complete expression can then not be equal to the empty set.

12. If $x$ is in $S_1$ and $x$ is in $S_2$, then $x$ is not in $(S_1 \cup S_2) - S_2$. Because of this, a necessary and sufficient condition is that the two sets be disjoint.

15. (c) Since

$\frac{n!}{n^n} = \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{1}{n}$

is the product of factors less than or equal to one, $n! = O(n^n)$.

21. An argument by contradiction works. Suppose that $2 - \sqrt{2}$ were rational. Then

$2 - \sqrt{2} = \frac{n}{m}$

gives

$\sqrt{2} = \frac{2m - n}{m},$

contradicting the fact that $\sqrt{2}$ is not rational.

27. By induction. Suppose that every integer less than $n$ can be written as a product of primes. If $n$ is a prime, there is nothing to prove; if not, it can be written as the product

$n = n_1n_2,$

where both factors are less than $n$. By the inductive assumption, they both can be written as the product of primes, and so can $n$.

Section 1.2

2. Many string identities can be proven by induction. Suppose that $(uv)^R = v^Ru^R$ for all $u \in \Sigma^*$ and all $v$ of length $n$. Take now a string of length $n + 1$, say $w = va$. Then

$(uw)^R = (uva)^R = a(uv)^R = av^Ru^R = w^Ru^R.$

By induction then, the result holds for all strings.

4. Since $abaabaaabaa$ can be decomposed into strings $ab$, $aa$, $baa$, $ab$, $aa$, each of which is in $L$, the string is in $L^*$. Similarly, $baaaaabaa$ is in $L^*$. However, there is no possible decomposition for $baaaaabaaaab$, so this string is not in $L^*$.

10. $S \Rightarrow aA \Rightarrow abS \Rightarrow abaA \Rightarrow ababS$, from which we see that

$L(G) = \{(ab)^n : n \ge 0\}.$

11. (d) We first generate three $a$'s, then add an arbitrary number of $a$'s and $b$'s anywhere.

$S \to AaAaAaA$
$A \to aA \mid bA \mid \lambda$

The first production generates three $a$'s. The second can generate any number of $a$'s and $b$'s in any position. This shows that the grammar can generate any string $w \in \{a, b\}^*$ as long as $n_a(w) \ge 3$.

13. (a) Generate one $b$, then an equal number of $a$'s and $b$'s, finally as many more $b$'s as needed. [The productions of this solution are not legible in the source.]

(d) The answer is easier to see if you notice that

$L_4 = \{a^{m+3}b^m : m \ge 0\}.$

This leads to the easy solution

$S \to aaaA$
$A \to aAb \mid \lambda$

14. (b) The problem is simplified if you break it into two cases, $|w| \bmod 3 = 1$ and $|w| \bmod 3 = 2$. The first is covered by

$S_1 \to aaaS_1 \mid a,$

the second by

$S_2 \to aaaS_2 \mid aa.$

The two can be combined into a single grammar by

$S \to S_1 \mid S_2.$

16. (a) We can use the trick and results of Example 1.13. Let $L_1$ be the language in Example 1.13 and modify that grammar so that the start symbol is $S_1$. Consider then a string $w \in L$. If this string starts with an $a$, then it has the form $w = aw_1$, where $w_1 \in L_1$. This situation can be taken care of by $S \to aS_1$. If it starts with a $b$, it can be derived by $S \to S_1S$.

Section 1.3

1. integer $\to$ sign magnitude
sign $\to + \mid - \mid \lambda$
magnitude $\to$ digit $\mid$ digit magnitude
digit $\to 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9$

This can be considered an ideal version of C, as it puts no limit on the length of an integer. Most real compilers, though, place a limit on the number of digits.

7. The automaton has to remember the input for one time period so that it can be reproduced for output later. Remembering can be done by labeling the state with the appropriate information. The label of the state is then produced as output later.

[Figure: transition graph of the transducer]

10. We remember input by labeling the states mnemonically. When a set of three bits is done, we produce output and return to the beginning to process the next three bits. The following solution is partial, but the completion should be obvious.

[Figure: partial transition graph of the transducer]

11. In this case, the transducer must remember the two preceding input symbols and make transitions so that the needed information is kept track of.

Chapter 2

Section 2.1

2. (c) Break it into cases, each with an accepting state: no $a$'s, one $a$, two $a$'s, three $a$'s. A fourth $a$ will then send the dfa into a non-accepting trap state. A solution:

[Figure: transition graph of the dfa]

(a) The first six symbols are checked. If they are not correct, the string is rejected. If the prefix is correct, we keep track of the last two symbols read, putting the dfa in an accepting state if the suffix is $bb$.

[Figure: transition graph of the dfa]

7. (a) Use states labeled with $|w| \bmod 3$. The solution then is quite simple.

[Figure: transition graph of the dfa]

(d) For this we use nine states, with the first part of each label $n_a(w) \bmod 3$, the second part $n_b(w) \bmod 3$. The transitions and the final states are then simple to figure out.

9. (a) Count consecutive zeros, to get the main part of the dfa.

[Figure: main part of the dfa]

Then put in additional transitions to keep track of consecutive zeros and to trap unacceptable strings.

(d) Here we need to remember all combinations of three bits. This requires 8 states plus some start-up. The solution is a little long but not hard. A partial sketch of the solution is below.

[Figure: partial transition graph of the dfa]

13. The easiest way to solve this problem is to construct a dfa for $L = \{a^n : n = 4\}$, then complement the solution.

21. (a) By contradiction. Suppose $G_M$ has no cycles in any path from the initial state to any final state. Then every walk has a finite number of steps, and so every accepted string has to be of finite length. But this implies that the language is finite.

(b) Also by contradiction. Assume that $G_M$ has some cycle in a path from the initial state to some accepting state. We can then use the cycle to generate an arbitrarily long walk labeled with an accepted string. But a finite language cannot contain arbitrarily long strings.

24. There are many different solutions. Here is one of them.

[Figure: transition graph of a dfa]

Section 2.2

4. δ*(q0, a) = {q0, q1, q2}, δ*(q1, λ) = {q0, q1, q2}.

7. A four-state solution is trivial, but it takes a little experimenting to get a three-state one. Here is one answer:

[Figure: three-state nfa]

No. The string abc has three different symbols and there is no way this can be accepted with fewer than three states.

15. This is the kind of problem in which you just have to try different ways. Probably most of your tries will not work. Here is one that does.

[Figure: transition graph of an nfa]

17. Introduce a single starting state p0. Then add a transition

δ(p0, λ) = Q0.

Next, remove starting state status from Q0. It is straightforward to see that the new nfa is equivalent to the original one.

20. Introduce a new final state p_f and for every q ∈ F add the transitions

δ(q, λ) = {p_f}.

Then make p_f the only final state. It is a simple matter then to argue that if δ*(q0, w) ∈ F originally, then δ*(q0, w) = {p_f} after the modification, so both the original and the modified nfa's are equivalent. Since this construction requires λ-transitions, it cannot be made for dfa's. Generally it is impossible to have only one final state in a dfa, as can be seen by constructing dfa's that accept {λ, a}.

Introduce a non-accepting trap state and make all undefined transitions to this new state.

Section 2.3

2. Just follow the procedure nfa-to-dfa. This gives the dfa

[Figure: transition graph of the dfa]

7. Solution:

[Figure: transition graph]
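The nfa-to-dfa procedure referred to in Section 2.3 can be sketched in Python as follows. This is a minimal sketch of the usual subset construction, not the book's own pseudocode: the dictionary encoding of δ, the use of the empty string for λ, and all function names are assumptions made for illustration. The two small nfa's at the end are hypothetical examples, not the automata of the exercises.

```python
from collections import deque


def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction. delta maps (state, symbol) to a set of states;
    the symbol "" stands for a lambda-transition."""

    def closure(states):
        # Add everything reachable through lambda-transitions.
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return frozenset(seen)

    d_start = closure({start})
    d_delta, d_finals = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        subset = queue.popleft()
        if subset & finals:               # contains an nfa final state
            d_finals.add(subset)
        for a in alphabet:
            moved = set()
            for q in subset:
                moved |= set(delta.get((q, a), ()))
            target = closure(moved)       # the empty set acts as trap state
            d_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return d_delta, d_start, d_finals


def dfa_accepts(d_delta, d_start, d_finals, w):
    state = d_start
    for a in w:
        state = d_delta[(state, a)]
    return state in d_finals


# Hypothetical nfa for strings over {a, b} that end in ab.
ENDS_IN_AB = nfa_to_dfa(
    {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}, 0, {2}, "ab"
)

# Hypothetical nfa with a lambda-transition, accepting a*.
A_STAR = nfa_to_dfa({(0, ""): {1}, (1, "a"): {1}}, 0, {1}, "ab")
```

Because every subset, including the empty one, receives a transition for every symbol, the resulting dfa is complete.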

8. One solution is

[Figure: transition graph of an nfa]

Suppose that L = {w1, w2, ..., wn}. Then the nfa

[Figure: nfa with one path from the initial state for each of w1, w2, ..., wn]

accepts L, so the language is regular.

14. This is not easy to see. Getting an answer requires some thought. The trick is to use a dfa for L and modify it so that it remembers if it has read an even or an odd number of symbols. This can be done by doubling the number of states and adding O or E to the labels. For example, if part of the dfa is

[Figure: part of the original dfa]

its equivalent becomes

[Figure: corresponding part of the dfa with states labeled O and E]

Now replace all transitions from an E state to an O state with λ-transitions. With a few examples you should be able to convince yourself that if the original dfa accepts a1a2a3a4..., the new automaton will accept a2a4..., and therefore even(L).

15. Suppose we have a dfa that accepts L. We then

(a) identify all states Q2 that can be reached from q0, reading any two-symbol prefix v, that is

Q2 = {qj ∈ Q : δ*(q0, v) = qj, |v| = 2};

(b) introduce a new initial state p0 and add

δ(p0, λ) = Q2.

It should not be hard to see that the new nfa accepts chop2(L). Although the construction is plausible, a complete answer requires a proof of the last statement.

Section 2.4

2. (c)

[Figure: five-state dfa]

This is minimal for the following reason. q3 ∉ F and q4 ∈ F, so q3 and q4 are distinguishable. Next, δ*(q2, a) ∉ F and δ*(q4, a) ∈ F, so q2 and q4 are distinguishable. Similarly, δ*(q1, aa) ∉ F and δ*(q3, aa) ∈ F, so q1 and q3 are distinguishable. Continuing this way, we see that all states are distinguishable and therefore the dfa is minimal.

First, remove the inaccessible states q2 and q4. Then use the procedure mark to find the indistinguishable pairs (q0, q1) and (q3, ...). This then gives the minimal dfa.

[Figure: minimal dfa]

6. By contradiction. Assume that the dfa for the complement of L, obtained from M by complementing the final state set, is not minimal. Then we can construct a smaller dfa that accepts the complement of L. Next, complement the final state set of this smaller dfa to give a dfa for L. But this dfa is smaller than M, contradicting the assumption that M is minimal.

10. By contradiction. Assume that qb and qc are indistinguishable. Since qa and qb are indistinguishable and indistinguishability is an equivalence relation (Exercise 7), qa and qc must be indistinguishable.

Chapter 3

Section 3.1

2. Yes, because ((0 + 1)(0 + 1)*)* denotes any string of 0's and 1's. So does (0 + 1)*.

5. (a) Separate into cases m = 0, 1, 2, 3. Generate 4 or more a's, followed by the requisite number of b's. Solution:

aaaaa*(λ + b + bb + bbb).

(c) The complement of the language in 5(a) is harder to find. A string is not in L if it is of the form a^n b^m, with either n < 4 or m > 3, but

this does not completely describe the complement of L. We must also take in the strings in which a b is followed by an a. Solution:

(λ + a + aa + aaa)b* + a*bbbbb* + (a + b)*ba(a + b)*.

9. Split into three cases: m = 1, n ≥ 3; n ≥ 2, m ≥ 2; and n = 1, m ≥ 3. Each case has a straightforward solution.

12. Enumerate all cases with |v| = 2 to get

aa(a + b)*aa + ab(a + b)*ab + ba(a + b)*ba + bb(a + b)*bb.

14. (c) Create two 0's, interspersed with 1's, then repeat. But don't forget the case when there are no 0's at all. Solution:

(1*01*01*)* + 1*.

15. (a) Create all strings of length three and repeat. A short solution is

((a + b + c)(a + b + c)(a + b + c))*.

16. (c) You just have to get in each symbol at least once. The term

(a + b + c)*a(a + b + c)*b(a + b + c)*c(a + b + c)*

will do this, but is not enough since the a will precede the b, etc. For the complete solution you must generate all permutations of the three symbols, giving six terms that can be added. The answer, although quite long, is conceptually not hard.

18. (c) The statement

(r1 + r2)* ≡ (r1*r2*)*

is true. By the given rules (r1 + r2)* denotes the language (L(r1) ∪ L(r2))*, that is the set of all strings of arbitrary concatenations of elements of L(r1) and L(r2). But (r1*r2*)* denotes ((L(r1))*(L(r2))*)*, which is the same set.

21. The expression for an infinite language must involve at least one starred subexpression, otherwise it can only denote finite strings. If there is one starred subexpression that denotes a non-empty string, then this string can be repeated as often as desired and therefore denote arbitrarily long strings.

23. A closed contour will be generated by an expression r if and only if n_l(r) = n_r(r) and n_u(r) = n_d(r).
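The remark in the answer on strings containing each symbol at least once, that one term is not enough and that all six permutations are needed, can be checked mechanically. The sketch below is an illustration only: it translates the book's regular expressions into Python `re` syntax (with `[abc]*` standing for (a + b + c)*) and compares them with a direct membership test on all short strings. The helper names are invented here.

```python
import re
from itertools import permutations, product

SIGMA = "abc"
ANY = "[abc]*"            # plays the role of (a + b + c)*

# One term per ordering of the three symbols, e.g. ANY a ANY b ANY c ANY.
TERMS = [ANY + ANY.join(order) + ANY for order in permutations(SIGMA)]
FULL = re.compile("|".join(TERMS))


def matches_all_six(w: str) -> bool:
    """Membership in the sum of all six permutation terms."""
    return FULL.fullmatch(w) is not None


def matches_first_term_only(w: str) -> bool:
    """Membership in the single term with a before b before c."""
    return re.fullmatch(TERMS[0], w) is not None


def contains_each_symbol(w: str) -> bool:
    """Direct definition of the language."""
    return all(symbol in w for symbol in SIGMA)


def words(max_len: int):
    """All strings over SIGMA up to the given length."""
    for n in range(max_len + 1):
        for letters in product(SIGMA, repeat=n):
            yield "".join(letters)


if __name__ == "__main__":
    ok = all(matches_all_six(w) == contains_each_symbol(w) for w in words(6))
    print("six-term expression agrees with the definition:", ok)
```

The string cba is a convenient witness: it contains every symbol but is rejected by the single term, because there the a must come first.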

25. Notice several things. The bit string must be at least 6 bits long. If it is longer than 6 bits, its value is at least 64, so anything will do. If it is exactly 6 bits, then either the second bit from the left (16) or the third bit from the left (8) must be 1. If you see this, then the solution

(111 + 110 + 101)(0 + 1)(0 + 1)(0 + 1) + 1(0 + 1)(0 + 1)(0 + 1)(0 + 1)(0 + 1)(0 + 1)(0 + 1)*

readily suggests itself.

Section 3.2

3. This can be solved from first principles, without going through the regular expression-to-nfa construction. The latter will of course work, but gives a more complicated answer. Solution:

[Figure: transition graph of the nfa]

4. (a) Start with

[Figure: transition graph of the nfa]

Then use the nfa-to-dfa algorithm in a routine manner.
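The reasoning in answer 25 can be checked by brute force. The sketch below assumes that the exercise asks for bit strings with leading bit 1 whose binary value is at least 40; that threshold is inferred from the bit weights 32, 16 and 8 mentioned in the answer, not stated there. The regular expression is the one from the answer, rewritten in Python `re` syntax.

```python
import re
from itertools import product

# (111 + 110 + 101)(0 + 1)^3  +  1(0 + 1)^6 (0 + 1)*
PATTERN = re.compile(r"(111|110|101)[01]{3}|1[01]{6}[01]*")


def matches(bits: str) -> bool:
    return PATTERN.fullmatch(bits) is not None


def expected(bits: str) -> bool:
    """Assumed specification: leading bit 1 and value at least 40."""
    return bits.startswith("1") and int(bits, 2) >= 40


def bit_strings(max_len: int):
    """All non-empty bit strings up to the given length."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


if __name__ == "__main__":
    print(all(matches(s) == expected(s) for s in bit_strings(10)))
```

The smallest accepted six-bit string is 101000, whose value is 32 + 8 = 40, which is where the assumed threshold comes from.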

7. Removing the middle vertex gives

[Figure: generalized transition graph with edge label bb + ab]

The language accepted then is L(r) where

r = a*(a + b)ab(bb + ab + aa*(a + b)ab)*.

10. One case is

[Figure: part of a generalized transition graph]

Since there is no path from qj to qi, the edges in the general case created by such a path are omitted. The result, gotten by looking at all possible paths, is

[Figure: reduced generalized transition graph]

The other case can be analyzed in a similar manner.

(b) First, we have to modify the nfa so that it satisfies the conditions imposed by the construction in Theorem 3.2, one of which is q0 ∉ F. This is easily done.

[Figure: modified nfa]

Then remove state 3.

[Figure: generalized transition graph with edge label aa + b]

Next, remove state 4.

[Figure: generalized transition graph with edge label ab + (aa + b)(ba)*bb]

The regular expression then is

r = (ab + (aa + b)(ba)*bb)*.

17. (a) This is a hard problem until you see the trick. Start with a dfa with states q0, q1, ..., and introduce a "parallel" automaton with states Q0, Q1, .... Then arrange matters so that the spurious symbol nondeterministically transfers from any state of the original automaton to the corresponding state in the parallel part. For example, if part of the original dfa looks like

[Figure: part of the original dfa]

then the dfa with its parallel will be an nfa whose corresponding part is

[Figure: corresponding part of the nfa]

It is not hard to make the argument that the original dfa accepts L if and only if the constructed nfa accepts insert(L).

Section 3.3

Right linear grammar:

S → aaA
A → aA | B
B → bbbC
C → bC | λ.

Left linear grammar:

S → Abbb
A → Ab | B
B → Caa
C → Ca | λ.

We can show by induction that if w is a sentential form derived with G, then w^R can be derived in the same number of steps by Ĝ. Because w is created with left linear derivations, it must have the form w = Aw1, with A ∈ V and w1 ∈ T*. By the inductive assumption w^R = w1^R A can be derived via Ĝ. If we now apply A → Bv, then

w ⇒ Bvw1.

But Ĝ contains the rule A → v^R B, so we can make the derivation

w^R ⇒ w1^R v^R B = (Bvw1)^R,

completing the inductive step.

10. Split this into two cases: (i) n and m are both even and (ii) n and m are both odd. The solution then falls out easily, with

S → aaS | A
A → bbA | λ

taking care of case (i).

12. (a) First construct a dfa for L. This is straightforward and gives transitions such as

δ(q0, a) = q1, δ(q0, b) = q2,
δ(q1, a) = q0, δ(q1, b) = q3,
δ(q2, a) = q3, δ(q2, b) = q0,
δ(q3, a) = q2, δ(q3, b) = q1,

with q0 the initial and final state. Then the construction of Theorem 3.4 gives the answer

q0 → aq1 | bq2 | λ
q1 → bq3 | aq0
q2 → aq3 | bq0
q3 → aq2 | bq1.

16. Obviously, S1 is regular as is S2. We can show that their union is also regular by constructing the following dfa.

[Figure: transition graph]

The condition that V1 and V2 should be disjoint is essential so that the two nfa's are distinct.

Chapter 4

Section 4.1

2. (a) The construction is straightforward, but tedious. A dfa for L((a + b)a*) is given by δ(q0, a) = q1, δ(q0, b) = q1, δ(q1, a) = q1, δ(q1, b) = qt, etc., with qt a trap state and final state q1. A dfa for L(baa*) is given by δ(p0, a) = pt, δ(p0, b) = p1, δ(p1, a) = p2, δ(p1, b) = pt, δ(p2, a) = p2, δ(p2, b) = pt, etc., with final state p2. From this we find

δ((q0, p0), a) = (q1, pt),
δ((q0, p0), b) = (q1, p1),
δ((q1, p1), a) = (q1, p2),
δ((q1, p2), a) = (q1, p2),

etc. When we complete this construction, we see that the only final state is (q1, p2) and that L((a + b)a*) ∩ L(baa*) = baa*.
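The product construction used in the answer on L((a + b)a*) ∩ L(baa*) is tedious by hand but short in code. The sketch below is an illustration under stated assumptions: the two transition tables are one possible pair of complete dfa's for L((a + b)a*) and L(baa*), with trap states named qt and pt by choice, and the function names are invented here. It builds only the reachable pairs and then compares the result with the expression baa* on all short strings.

```python
import re
from itertools import product

# Assumed complete dfa for L((a + b)a*): final state q1, trap state qt.
D1 = {("q0", "a"): "q1", ("q0", "b"): "q1",
      ("q1", "a"): "q1", ("q1", "b"): "qt",
      ("qt", "a"): "qt", ("qt", "b"): "qt"}
F1 = {"q1"}

# Assumed complete dfa for L(baa*): final state p2, trap state pt.
D2 = {("p0", "a"): "pt", ("p0", "b"): "p1",
      ("p1", "a"): "p2", ("p1", "b"): "pt",
      ("p2", "a"): "p2", ("p2", "b"): "pt",
      ("pt", "a"): "pt", ("pt", "b"): "pt"}
F2 = {"p2"}


def product_dfa(d1, s1, f1, d2, s2, f2, alphabet):
    """Build the reachable part of the product automaton for intersection."""
    start = (s1, s2)
    delta, finals = {}, set()
    todo, seen = [start], {start}
    while todo:
        q, p = todo.pop()
        if q in f1 and p in f2:           # final only if both components are
            finals.add((q, p))
        for a in alphabet:
            nxt = (d1[(q, a)], d2[(p, a)])
            delta[((q, p), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return delta, start, finals


DELTA, START, FINALS = product_dfa(D1, "q0", F1, D2, "p0", F2, "ab")


def accepts(w: str) -> bool:
    state = START
    for a in w:
        state = DELTA[(state, a)]
    return state in FINALS


def words(max_len: int):
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)
```

With these tables the only final pair turns out to be (q1, p2), in line with the answer.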

7. Notice that

nor(L1, L2) = L̄1 ∩ L̄2.

The result then follows from closure under intersection and complementation.

12. The answer is yes. It can be obtained by starting from the set identity

L2 = ((L1 ∪ L2) ∩ L̄1) ∪ (L1 ∩ L2).

The key observation is that since L1 is finite, L1 ∩ L2 is finite and therefore regular for all L2. The rest then follows easily from the known closures under union and complementation.

14. By closure under reversal, L^R is regular. The result then follows from closure under concatenation.

16. We can use the following construction. Find all states P such that there is a path from the initial vertex to some element of P, and from that element to a final state. Then make every element of P a final state.

18. Use L1 = Σ*. Then, for any L2, L1 ∪ L2 = Σ*, which is regular. The given statement would then imply that any L2 is regular.

26. Suppose G1 = (V1, T, S1, P1) and G2 = (V2, T, S2, P2). Without loss of generality, we can assume that V1 and V2 are disjoint. Combine the two grammars and

(a) Make S the new start symbol and add productions S → S1 | S2.

(b) In P1, replace every production of the form A → x, with A ∈ V1 and x ∈ T*, by A → xS2.

(c) In P1, replace every production of the form A → x, with A ∈ V1 and x ∈ T*, by A → xS1.

Section 4.2

1. Since by Example 4.1 L1 − L2 is regular, there exists a membership algorithm for it.

2. If L1 ⊆ L2, then L1 ∪ L2 = L2. Since L1 ∪ L2 is regular and we have an algorithm for set equality, we also have an algorithm for set inclusion.

5. From the dfa for L, construct the dfa for L^R, using the construction suggested in Theorem 4.2. Then use the equality algorithm in Theorem 4.7.

12. Here you need a little trick. If L contains no even length strings, then

L ∩ L((aa + ab + ba + bb)*) = ∅.

The left side is regular, so we can use Theorem 4.6.

Section 4.3

2. For the dfa for L to process the middle string v requires a walk in the transition graph of length |v|. If this is longer than the number of states in the dfa, there must be a cycle labeled y in this walk. But clearly this cycle can be repeated as often as desired without changing the acceptability of a string.

4. (a) Given m, pick w = a^m b^m a^(2m). The string y must then be a^k and the pumped strings will be

w_i = a^(m+(i−1)k) b^m a^(2m).

If we take i ≥ 2, then m + (i − 1)k > m, and w_i is not in L.

(e) It does not seem easy to apply the pumping lemma directly, so we proceed indirectly. Suppose that L were regular. Then by the closure of regular languages under complementation, L̄ would also be regular. But L̄ = {w : n_a(w) = n_b(w)} which, as is easily shown, is not regular. By contradiction, L is not regular.

5. (a) Take p to be the smallest prime number greater or equal to m and choose w = a^p. Now y is a string of a's of length k, so that

w_i = a^(p+(i−1)k).

If we take i − 1 = p, then p + (i − 1)k = p(k + 1) is composite and w_(p+1) is not in the language.

9. (a) The language is regular, since any string that has two consecutive symbols the same is in the language. A regular expression for L is (a + b)(a + b)*(aa + bb)(a + b)(a + b)*.

11. L is regular. We see this from L = L1 ∩ L2^R and the known closures for regular languages.

15. (a) The language is regular. This is most easily seen by splitting the problem into cases such as l = 0, k = 0, n > 5, for which one can easily construct regular expressions.

(b) This language is not regular. If we choose w = aaaaaab^m a^m, our opponent has several choices. If y consists of only a's, we use i = 0 to violate the condition n > 5. If the opponent chooses y as consisting of b's, we can then violate the condition k ≤ l.

The proposition is false. As a counterexample, take L1 = {a^n b^m : n ≤ m} and L2 = {a^n b^m : n > m}, both of which are non-regular. But L1 ∪ L2 = L(a*b*), which is regular.

(b) The language is not regular. Take w = (ab)^m aa (ba)^m. The adversary now has several choices, such as y = (ab)^k or y = (ab)^k a. In the first case w_0 = (ab)^(m−k) aa (ba)^m. Since the only possible identification is ww^R = aa, w_0 is not in L. With the second choice, the length of w_0 is odd, so it cannot be in L either.

15. Take L_i = {a^i b^i}, i = 0, 1, .... Obviously, each L_i is finite and therefore regular, but the union of all the languages is the non-regular language L = {a^n b^n : n ≥ 0}.

17. No, it is not. As counterexample, take the languages

L_i = {v_i u v_i^R : |v_i| = i} ∪ {v_i v_i^R : |v_i| < i}, i = 0, 1, 2, ....

We claim that the intersection of all the L_i is the set {ww^R}. To justify this, take any string z = ww^R, with |w| = n. If n ≥ i, then z ∈ {v_i u v_i^R : |v_i| = i} and therefore in L_i. If n < i, then z ∈ {v_i v_i^R : |v_i| < i} and so also in L_i. Consequently, z is in every L_i and hence in their intersection. Conversely, take any string z of length m that is in all of the L_i. If we take i greater than m, it cannot be in {v_i u v_i^R : |v_i| = i} because it is not long enough. It must therefore be in {v_i v_i^R : |v_i| < i}, so that it has the form ww^R. As the final step we must show that for each i, L_i is regular. This follows from the fact that for each i there are only a finite number of substrings v_i.

Chapter 5

Section 5.1

It is quite obvious that any string generated by this grammar has the same number of a's as b's. To show that the prefix condition n_a(v) ≥ n_b(v) holds, we carry out an induction on the length of the derivation. Suppose that for every sentential form derived from S in n steps this condition holds. To get a sentential form in n + 1 steps, we apply S → λ or S → SS. Since neither of these changes the number of a's and b's or the location of those already there, the prefix condition continues to hold. Alternatively, we apply S → aSb. This adds an extra a and an extra b, but since the added a is to the left of the added b, the prefix condition will still be satisfied. Thus, if the prefix condition holds after n steps, it will still hold after n + 1 steps. Obviously, the prefix condition holds after one step, so we have a basis and the induction succeeds.
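The induction on the length of the derivation can be illustrated experimentally. The sketch below assumes the grammar S → aSb | SS | λ, which is the set of productions the argument refers to, and enumerates every sentential form reachable in a bounded number of steps, checking the prefix condition on each. It is a finite sanity check, not a replacement for the proof; the function names and the step bound are choices made here.

```python
PRODUCTIONS = ["aSb", "SS", ""]     # right sides for S; "" stands for lambda


def prefix_condition(form: str) -> bool:
    """True if every prefix has at least as many a's as b's
    (the variable S is ignored when counting)."""
    balance = 0
    for ch in form:
        if ch == "a":
            balance += 1
        elif ch == "b":
            balance -= 1
            if balance < 0:
                return False
    return True


def sentential_forms(max_steps: int) -> set:
    """All sentential forms derivable from S in at most max_steps steps."""
    level, seen = {"S"}, {"S"}
    for _ in range(max_steps):
        nxt = set()
        for form in level:
            for i, ch in enumerate(form):
                if ch == "S":
                    for rhs in PRODUCTIONS:
                        nxt.add(form[:i] + rhs + form[i + 1:])
        level = nxt
        seen |= nxt
    return seen


if __name__ == "__main__":
    forms = sentential_forms(6)
    print(len(forms), all(prefix_condition(f) for f in forms))
```

Restricting the enumeration to forms without S gives the generated strings themselves, for which the numbers of a's and b's agree.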

7. (a) First, solve the case n = m + 3. Then add more b's. This can be done by

S → aaaA
A → aAb | B
B → Bb | λ.

But this is incomplete since it creates at least three a's. To take care of the cases n = 0, 1, 2, we add

S → A | aA | aaA.

(d) This has an unexpectedly simple solution

S → aSbb | aSbbb | λ.

These productions nondeterministically produce either bb or bbb for each generated a.

8. (a) For the first case n = m and k is arbitrary. This can be achieved by

S1 → AC
A → aAb | λ
C → Cc | λ.

In the second case, n is arbitrary and m ≤ k. Here we use

S2 → BD
B → aB | λ
D → bDc | E
E → Ec | λ.

Finally, we start productions with S → S1 | S2.

(e) Split the problem into two cases: n = k + m and m = k + n. The first case is solved by

S → aSc | S1 | λ
S1 → aS1b | λ.

It is normally not possible to use a grammar for L directly to get a grammar for L̄, so we need another, hopefully recursive description for L̄.

15. (a) If S derives L, then S1 → SS derives L².

19. The only possible derivations start with

S ⇒ aaB ⇒ aaAa ⇒ aabBba ⇒ aabAaba.

But this sentential form has the suffix aba, so it cannot possibly lead to the sentence aabbabba.

22. E → E + E | E · E | E* | (E) | λ | ∅.

One obvious subset of L̄ contains the strings of odd length, but this is not all. Suppose we have an even length string that is not of the form ww^R. Working from the center to the left and to the right simultaneously, compare corresponding symbols. While some part around the center can be of the form ww^R, at some point we get an a on the left and a b in the corresponding place on the right, or vice versa. The string must therefore be of the form uaww^Rbv or ubww^Rav with |u| = |v|. Once we see this, we can then construct grammars for these types of strings. One solution is

S → ASA | B
A → a | b
B → bCa | aCb
C → aCa | bCb | λ.

The first two productions generate the u and v, the third the two disagreeing symbols, and the last the innermost palindrome.

Section 5.2

2. A solution is

S → aA
A → aAB | b
B → b.

Note that the more obvious grammar

S → aS1B
S1 → aS1B | λ
B → b

is not an s-grammar.

6. There are two leftmost derivations for w = aab.

S ⇒ aaB ⇒ aab
S ⇒ AB ⇒ AaB ⇒ aaB ⇒ aab.

9. From the dfa for a regular language we can get a regular grammar by the method of Theorem 3.4. The grammar is an s-grammar except for q_f → λ. But this rule does not create any ambiguity. Since the dfa never has a choice, there is never any choice in the production that can be applied.

14. Ambiguity of the grammar is obvious from the derivations

S ⇒ aSb ⇒ ab
S ⇒ SS ⇒ abS ⇒ ab.

An equivalent unambiguous grammar is

S → A | λ
A → aAb | ab | AA.

20. It is not easy to see that this grammar is unambiguous. To make it plausible, consider the two typical situations, w = aabb, which can only be derived by starting with A → aAb, and w = abab, which can only be derived starting with A → AA. More complicated strings are built from these two situations, so they can be parsed only in one way.

Chapter 6

Section 6.1

3. Use the rule in Theorem 6.1 to substitute for B in the first grammar. Then B becomes useless and the associated productions can be removed. By Theorems 6.1 and 6.2 the two grammars are equivalent.

The only nullable variable is A, so removing λ-productions gives

S → aA | a | aBB
A → aaA | aa
B → bC | bbC
C → B.

C → B is the only unit-production and removing it results in

S → aA | a | aBB
A → aaA | aa
B → bC | bbC
C → bC | bbC.

Finally, B and C are useless, so we get

S → aA | a
A → aaA | aa.

The language generated by this grammar is L((aa)*a).

14. An example is

S → aA
A → BB
B → aBb | λ.

When we remove λ-productions we get

S → aA | a
A → BB | B
B → aBb | ab.

16. This is obvious since the removal of useless productions never adds anything to the grammar.

21. The grammar S → aA, A → a does not have any useless productions, any unit productions, or any λ-productions. But it is not minimal since S → aa is an equivalent grammar.

Section 6.2

5. First we must eliminate λ-productions. This gives

S → AB | B | aB
A → aab
B → bbA | bb.

This has introduced a unit-production, which is not acceptable in the construction of Theorem 6.6. Removal of this unit-production is easy.

S → AB | bbA | aB | bb
A → aab
B → bbA | bb.

We can now apply the construction and get

S → AB | V_bV_bA | V_aB | V_bV_b
A → V_aV_aV_b
B → V_bV_bA | V_bV_b
V_a → a
V_b → b.

Introducing additional variables for the right sides of length three then puts the grammar into Chomsky normal form.

8. Consider the general form for a production in a linear grammar

A → a1a2...anBb1b2...bm.

Introduce a new variable V1 with the productions

A → a1V1

and

V1 → a2...anBb1b2...bm.

Continue this process, introducing V2 and

V2 → a3...anBb1b2...bm,

until no terminals remain on the left. Then use a similar process to remove terminals on the right.

9. Solutions: S → aV_b | aS | aV_aS, V_a → a, V_b → b.

12. This normal form can be reached easily from CNF. Productions of the form A → BC are permitted since a = λ is possible. For A → a, create new variables V1, V2 and productions A → aV1V2, V1 → λ, V2 → λ.

15. Only A → bABC is not in the required form, so we introduce A → bAV and V → BC. The latter is not in correct form, but after substituting for B, we have

S → aSA
A → bAV
V → bC
C → aBC.

Section 6.3

2. Since aab is a prefix of the string in Example 6.11, we can use the V_ij computed there. Since S ∈ V_13, the string aab is in the language generated by the grammar and can therefore be parsed.
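The V_ij sets used in the Section 6.3 answer come from the CYK membership algorithm, which can be sketched as follows. The grammar in the sketch is an assumption: the productions S → AB, B → AB, A → a, B → b are the ones the parse of aab relies on, while A → BB is added here on the assumption that it belongs to the grammar of the referenced example. The encoding of productions as Python lists and the function name are likewise choices made for illustration.

```python
# Assumed grammar in Chomsky normal form:
#   S -> AB,  A -> BB | a,  B -> AB | b
UNIT = [("A", "a"), ("B", "b")]                      # A -> terminal
BINARY = [("S", ("A", "B")),                         # A -> BC
          ("A", ("B", "B")),
          ("B", ("A", "B"))]


def cyk(word, unit=UNIT, binary=BINARY, start="S"):
    """Return (membership, table) where table[(i, j)] is V_ij, the set of
    variables deriving the substring from position i to j (1-indexed)."""
    n = len(word)
    if n == 0:
        return False, {}
    table = {}
    for i, ch in enumerate(word, start=1):
        table[(i, i)] = {var for var, terminal in unit if terminal == ch}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):                    # split point
                for var, (left, right) in binary:
                    if left in table[(i, k)] and right in table[(k + 1, j)]:
                        cell.add(var)
            table[(i, j)] = cell
    return start in table[(1, n)], table


if __name__ == "__main__":
    member, table = cyk("aab")
    print(member, table[(1, 3)])
```

Reading the table backwards from V_13 shows which productions justify membership, and those are the ones needed to write down a derivation.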

For parsing, we determine the productions that were used in justifying S ∈ V_13:

S ∈ V_13 because S → AB, with A ∈ V_11 and B ∈ V_23
B ∈ V_23 because B → AB, with A ∈ V_22 and B ∈ V_33
A ∈ V_11 because A → a
A ∈ V_22 because A → a
B ∈ V_33 because B → b.

This shows the productions needed to justify membership. These can then be used in the parsing

S ⇒ AB ⇒ aB ⇒ aAB ⇒ aaB ⇒ aab.

Chapter 7

Section 7.1

2. The key to the argument is the switch from q0 to q1, which is done nondeterministically and need not happen in the middle of the string. However, if a switch is made at some other point or if the input is not of the form ww^R, an accepting configuration cannot be reached. Suppose the content of the stack at the time of the switch is x1x2...xkz. To accept a string we must get to the configuration (q1, λ, z). By examining the transition function, we see that we can get to this configuration only if at this point the unread part of the input is x1x2...xk, that is, if the original input is of the form ww^R and the switch was made exactly in the middle of the input string.

4. (a) The solution is obtained by letting each a put two markers on the stack, while each b consumes one. Solution:

[Transition function: not recoverable from the scan]

(f) Here we use nondeterminism to generate one, two, or three tokens by

δ(q0, a, z) = {(q1, 1z), (q1, 11z), (q1, 111z)}

and

δ(q1, a, 1) = {(q1, 11), (q1, 111), (q1, 1111)}.

The rest of the solution is then essentially the same as 4(a).

10. Here we use internal states to remember symbols to be put on the stack. For example,

δ(qi, a, b) = {(qj, cde)}

is replaced by

δ(qi, a, b) = {(qjc, de)},
δ(qjc, λ, d) = {(qj, cd)}.

Since δ can have only a finite number of elements and each can only add a finite amount of information to the stack, this construction can be carried out for any pda.

11. Here we are not allowed enough states to track the switch from a's to b's and back. To overcome this, we put a symbol in the stack that remembers where in the sequence we are. What would normally be tracked by different states is now tracked by the symbol in the stack. We have only two states, the initial state q0 and the accepting state q1. For example, the solution is

[Transition function: not recoverable from the scan]

14. Trace through the process, taking one path at a time. The transition from q0 to q2 can be made with a single a. The alternative path requires one a, followed by one or more b's, terminated by an a. These are the only choices. The pda therefore accepts the language L = {a} ∪ L(abb*a).

16. This is a pda that makes no use of the stack, so that it is, in effect, a finite accepter. The state transitions can then be taken directly from the pda, to give

[Transition function: not recoverable from the scan]
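The token idea behind answers 4(a) and 4(f) of Section 7.1 can be sketched by simulating all nondeterministic choices at once. The sketch assumes that 4(f) concerns the language {a^n b^m : n ≤ m ≤ 3n} with n ≥ 0; this is inferred from the "one, two, or three tokens" description and is not stated in the answer. Since the stack holds only identical tokens, it is represented by a counter, and the names used are invented for illustration.

```python
from itertools import product


def accepts(w: str) -> bool:
    """Simulate the npda: each a pushes one, two, or three tokens,
    each b pops one token, and the input is accepted if some choice
    of pushes leaves the stack empty at the end."""
    # A configuration is (phase, tokens): phase 0 reads a's, phase 1 reads b's.
    configs = {(0, 0)}
    for ch in w:
        nxt = set()
        for phase, tokens in configs:
            if ch == "a" and phase == 0:
                nxt.update((0, tokens + k) for k in (1, 2, 3))
            elif ch == "b" and tokens > 0:
                nxt.add((1, tokens - 1))
        configs = nxt
    return any(tokens == 0 for _, tokens in configs)


def in_language(w: str) -> bool:
    """Direct definition of the assumed language."""
    n = len(w) - len(w.lstrip("a"))
    rest = w[n:]
    return set(rest) <= {"b"} and n <= len(rest) <= 3 * n


def words(max_len: int):
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            yield "".join(letters)


if __name__ == "__main__":
    print(all(accepts(w) == in_language(w) for w in words(10)))
```

Replacing the tuple (1, 2, 3) by (2,) gives the deterministic variant described in 4(a), where every a contributes exactly two tokens.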

Section 7.2

3. You can follow the construction of Theorem 7.1 or you can notice that the language is {a^(n+2) b^(2n+1) : n ≥ 0}. With the latter observation we get a solution

[Transition function: not recoverable from the scan]

where q0 is the initial state and qf is the final state.

4. First convert the grammar into Greibach normal form, giving S → aSSS, S → aB, B → b. Then follow the construction of Theorem 7.1.

δ(q0, λ, z) = {(q1, Sz)}
δ(q1, a, S) = {(q1, SSS), (q1, B)}
δ(q1, b, B) = {(q1, λ)}
δ(q1, λ, z) = {(qf, z)}.

From Theorem 7.2, given any npda, we can construct an equivalent context-free grammar. From that grammar we can then construct an equivalent three-state npda, using Theorem 7.1. Because of the transitivity of equivalence, the original and the final npda's are also equivalent.

We first obtain a grammar in Greibach normal form for L. Next, we apply the construction in Theorem 7.1 to get an npda with three states. The state q1 can be eliminated if we use a special stack symbol z1 to mark it. A complete solution is

[Transition function: not recoverable from the scan]

11. There must be at least one a to get started. After that,

δ(q0, a, A) = {(q0, A)}

simply reads a's without changing the stack. Finally, when

the first b is encountered, the pda goes into state q1, from which it can only make a λ-transition to the final state. Therefore, a string will be accepted if and only if it consists of one or more a's, followed by a single b.

Section 7.3

4. At first glance, this may seem to be a nondeterministic language, since the prefix a calls for two different types of suffixes. Nevertheless, the language is deterministic, as we can construct a dpda. This dpda goes into a final state when the first input symbol is an a. If more symbols follow, it goes out of this state and then accepts a^n b^n. Complete solution:

[Transition function: not recoverable from the scan]

where F = {q2, qf}.

9. The solution is straightforward. Put a's and b's on the stack. The c signals the switch from saving to matching, so everything can be done deterministically.

11. There are two states, the initial, non-accepting state q0 and the final state q1. The pda will be in state q1 unless a z is on top of the stack. When this happens, the pda will switch states to q0. The rest is essentially the same as Example 7.3. Thus we have δ(q1, a, 1) = {(q1, 11)}, etc., with the stack handled as it is for Example 7.3.

15. This is obvious since every regular language can be accepted by a dfa and such a dfa is a dpda with an unused stack.

16. The basic idea here is to combine a dpda with a dfa along the lines of the construction in Theorem 4.1. It should not be too hard to see that the result is a dpda. When you write this all out, you will see that the pda is deterministic.

Section 7.4

2. Consider the strings aabb and aabbbbaa. In the first case, the derivation must start with S ⇒ aSB, while in the second S ⇒ SS is the necessary first step. But if we see only the first four symbols, we cannot decide which case applies. The grammar is therefore not in LL(4). Since

similar examples can be made for arbitrarily long strings, the grammar is not LL(k) for any k.

Look at the first three symbols. If they are aaa, aab, or aba, then the string can only be in L(a*ba). If the first three symbols are abb, then any parsable string must be in L(abbb*). For each case, we can find an LL grammar and the two can be combined in an obvious fashion. A solution is

S → S1 | S2
S1 → aS1 | ba
S2 → abbB
B → bB | λ.

Looking at the first three symbols tells us if S ⇒ S1 or S ⇒ S2 is necessary. The grammar is therefore LL(3).

For a deterministic CFL there exists a dpda. When this dpda is converted into a grammar, the grammar is unambiguous.

(a) S → aSc | S1 | λ, S1 → bS1c | λ. As long as the currently scanned symbol is a, we must apply S → aSc, if it is b, we must use S → S1, if it is c, we can only use S → λ. This is almost an s-grammar, and the grammar is LL(1).

Chapter 8

Section 8.1

(a) Use the pumping lemma. Given m, pick w = a^m b^m b^m a^m a^m b^m. The adversary now has several choices that have to be considered. If, for example, v = a^k and y = a^l, with v and y located in the prefix a^m, then w_0 = a^(m−k−l) b^m b^m a^m a^m b^m, which is not in L. There are a number of other possible choices, but in all cases the string can be pumped out of the language.

The only choice of v and y that needs any serious examination is v = a^k and y = b^l,

with k and l non-zero. Suppose that l = 1. Then choose i = 2, so that w_2 has m² + k a's and m + 1 b's. But

(m + 1)² = m² + 2m + 1 > m² + k.

Since w_2 is not in the language, the language cannot be context-free. Similar arguments hold a fortiori for l > 1.

(f) Given m, choose w = a^m b^(m+1) c^(m+2), which is easily pumped out of the language.

Use w = a^(pq), where p and q are primes such that p > m and q > m. If |vy| = k, then |w_(i+1)| = pq + ik. If we choose i = pq, then

w_(i+1) = a^(pq(1+k)),

which is not in the language.

10. Use the pumping lemma with w = (...()...) + (...()...), where (...( and )...) stand for m left or right parentheses, respectively. We can easily pump so that for some prefix z, n_( (z) < n_) (z), which results in an improper expression.

12. The language is not linear. Use the pumping lemma for linear languages. With a given m, choose w = a^m b^(2m) a^m. Now v and y are entirely made of a's, so w is easily pumped out of the language. Similar arguments hold for other decompositions.

15. (b) The language is not context-free. Use w = a^m b^m a^m b^m and examine various choices of v and y.

20. Perhaps surprisingly, this language is context-free. Construct an npda that counts to some value k (by putting k tokens on the stack) and remembers the k-th symbol. It then examines the k-th symbol in w2. If this does not match the remembered symbol, the string is accepted. If w ∈ L there must be some k for which this happens. The npda chooses the k nondeterministically.

Section 8.2

Given two linear grammars G1 = (V1, T, S1, P1) and G2 = (V2, T, S2, P2), with V1 ∩ V2 = ∅, form the combined grammar Ĝ = (V1 ∪ V2 ∪ {S}, T, S, P1 ∪ P2 ∪ {S → S1 | S2}). Then Ĝ is linear and L(Ĝ) = L(G1) ∪ L(G2).

To show that linear languages are not closed under concatenation, take the linear language L = {a^n b^n : n ≥ 1}. The language L^2 is not linear, as can be shown by an application of the pumping lemma.

Let G1 = (V1, T, S1, P1) be a linear grammar for L1 and let G2 = (V2, T, S2, P2) be a left-linear grammar for L2. Construct a grammar Ĝ2 from G2 by replacing every production of the form V → x, x ∈ T*, with V → S1 x. Combine grammars G1 and Ĝ2, choosing S2 as a start symbol. It is then easily shown that in this grammar S2 ⇒* S1 w ⇒* uw if and only if u ∈ L1 and w ∈ L2.

Given a context-free grammar G, construct a context-free grammar Ĝ by replacing every production A → x by A → x^R. We can then show by an induction on the number of steps in a derivation that if w is a sentential form for G then w^R is a sentential form for Ĝ.

The languages L1 = {a^n b^n c^m} and L2 = {a^n b^m c^m} are both unambiguous. But their intersection is not even context-free.

The complement is context-free. The complement involves two cases: n_a(w) ≠ n_b(w) and n_b(w) ≠ n_c(w). These in turn can be broken into n_a(w) > n_b(w), n_a(w) < n_b(w), n_b(w) > n_c(w), and n_b(w) < n_c(w). Each of these is context-free, as can be shown by construction of a CFG. The full language is then the union of these four cases and by closure under union is context-free.

λ ∈ L(G) if and only if S is nullable.

Chapter 9

Section 9.1

A three-state solution that scans the entire input is

δ(q0, a) = (q1, a, R),
δ(q1, a) = (q1, a, R),
δ(q1, b) = (q1, b, R),
δ(q1, □) = (q2, □, R),

with F = {q2}.
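The union construction for two linear grammars can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionary layout, the function names `union_grammar` and `is_linear`, and the sample grammars `G1` and `G2` are hypothetical and do not come from the text.

```python
def union_grammar(g1, g2, new_start="S"):
    """Combine two grammars with disjoint variable sets into a grammar for the union.

    Assumed layout (hypothetical): each grammar is a dict with keys
    'variables' (a set), 'start' (a variable) and 'productions'
    (a list of (lhs, rhs) pairs, rhs being a tuple of symbols).
    """
    v1, v2 = g1["variables"], g2["variables"]
    if v1 & v2:
        raise ValueError("variable sets must be disjoint")
    if new_start in v1 | v2:
        raise ValueError("new start symbol must be fresh")
    productions = list(g1["productions"]) + list(g2["productions"])
    # The only new rules are S -> S1 and S -> S2.
    productions.append((new_start, (g1["start"],)))
    productions.append((new_start, (g2["start"],)))
    return {
        "variables": v1 | v2 | {new_start},
        "start": new_start,
        "productions": productions,
    }


def is_linear(g):
    """A grammar is linear if every right side holds at most one variable."""
    return all(
        sum(sym in g["variables"] for sym in rhs) <= 1
        for _, rhs in g["productions"]
    )


# Sample linear grammars: G1 generates a^n b^n (n >= 1), G2 generates a^n (n >= 1).
G1 = {"variables": {"S1"}, "start": "S1",
      "productions": [("S1", ("a", "S1", "b")), ("S1", ("a", "b"))]}
G2 = {"variables": {"S2"}, "start": "S2",
      "productions": [("S2", ("a", "S2")), ("S2", ("a",))]}
```

Since the two added rules each have a single variable on the right, linearity of the combined grammar follows directly from linearity of the two inputs.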

12.b. l t ): ( q z . (b) d ( q o . (ii) Replace the two-symbol combination ca on the left by ac arrcl the twcrsymbol combinatiorr ar: on the right by ca. 19.Sot. R ) with . We rrrust proceed in a back-and-fbrth fashion.on Snr.n. ftlr example.F : tqi]. R) forallq€Fnnda€1. ) : d ( s o . If the final state set .F: {qr}. but it is typical of the cumbersone wayir in which Thring rrrachirrers often do simple things.a ) : d ( q r . 7. b .( q s . .a) : (8r. introduce a new firral state g.b): (gs. The general scherntrlooks something like this: (i) Place a marker symbol c at cach end of the strirrg.i and the trarrsitions d (q. a) : (q. l ( S t .a. We cannot just search in one direction since we dorrtt know when to stop.lly simple.urrolrs arul Hrmrs l. ft) . 10. R. a.a) * (gz. fi) with .a. (iii) R. ) R 6 (qz. d (qo. The solution is conr:eptua.b ) . f l ) a d (qo. .n .a.D . (a) d (go. ft) d ( q r . lt) . but tedious to write out in detail.F corrtains more than one elernent.emove one of the cr's arrd move the rest of the string to fill the gap' Obviously this is a lorrg joh.) 6 ((tz.b ) : ( q 1 .tr) : (qz. Repeat until the two c's meet in the middle of the string. placirrg markers at the riglrt and left boundaries of the searched region arrd rrrovingJ the markers outward.r.ncrnDExERCISEs Bg1 It is also possible lo get a two-state solutiorr by just examining the first sytrbol and ignoring the rcst of the input.a) : (qz.
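The marker scheme for locating the middle of a string (a marker c at each end, the markers moved inward one cell at a time until they meet, then one marker removed) can be mimicked on a Python list. This is a rough sketch, not a Turing machine: the function name `insert_middle_marker` is hypothetical, and it assumes the input does not already contain the marker symbol.

```python
def insert_middle_marker(s, marker="c"):
    """Return s with a single marker placed at its middle.

    Assumption: `marker` does not occur in s.
    """
    tape = [marker] + list(s) + [marker]   # (i) a marker at each end
    left, right = 0, len(tape) - 1
    while right - left > 1:
        # (ii) swap the left marker with its right neighbour ...
        tape[left], tape[left + 1] = tape[left + 1], tape[left]
        left += 1
        if right - left > 1:
            # ... and the right marker with its left neighbour.
            tape[right], tape[right - 1] = tape[right - 1], tape[right]
            right -= 1
    # (iii) the markers are now adjacent; remove one and close the gap.
    del tape[right]
    return "".join(tape)
```

For a string of odd length the left half ends up one symbol longer, which is a choice made in this sketch rather than something the text specifies.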

d The state qs is any state in which the searchrightinstruction rrray be applied.5} with the restriction that for all transitions d (qz.L.1' Then comparethe two parts' in symbol until a mismatch is fouttd' symbol by corresponding A solution: d (go. and so on.1 a. b.. we introduce new tratrsitions d (qa. S) and 6 (qiz.b) : (si.. an add. We have ignored the fact that a Ti.2 s. (b) To simulate 6(si. while a pda can be non-deterministic.u) : (Qj. This can be done as suggested Exercise10. as defined so far' is deterministic. Section 9.L) with a'+ h of the standard machine. Section9. o. : (er. R) . a) a.b.b. the condition a: b must hold. . Therefbre. L) for all c € f .3 2.l) : (qi. : (Qo. . Stltetnatically they are cornbined in a simple fashion.a) : (rljz. split the input into two eqrral parts.B). (so. we cannot yet claim that Thring machines are more powerful than a pushdown autolnata.R) . 8. c) c.-onerrtachinethat jrrst adds one to the input. | Chopter 0 Section 10..rring machirre. (a) We catr think of the machine as constitrrted of two main parts. and a multiplier that multiplies two numbers.n.b. (c) First.l (qo. L or .392 ANSWHRfi Section 9.a) : (qi . lt) for all c e X * to} . (a) The rrrachine has a transition function d:Qxf-8xfx{.rt.

d): (qi.R) fbralldef-{a. to write n. One issue to consider is wha. while the other rrutracks are used to show the position of OM's tape heads. fbr each r) (qi.rt) by 6(qe. where rn is the rrurrrbcr of reacl-write heads. F o r t h e f o r r n a d e f i n i t i o n s ef 7 ' : I ' x f l u x.b}. L) wt':ilckl .. Section 10. xl andd: Q xlr 8 x fz x {L.t are origirra. say A.nningand updating its active area.b.{o.c.chine is equivalent to a standard Ttrring machine and tlrrt therefbre a queue is a rnore powerful sttlrage . 1 1 . the new rnachine writes B.b}) : (qr.ll : (qi. fr) : (qi.nJ"'.nk Whenever theroriginal machine wants F. Thc frrrmrr. we let 5M have rrr f 1 trirr:ks. For each symbol {t € l.l so trarrsitirrnd(S.. tape content ofOM position oftape head # 1 position oftape head # 2 .rl by a.L) must be retained to handle blarrkstha.ldefinition must provide ftrr the resolution of possible conflicts.ce 6 (s.'uema. To simulate the original rnachirrc (OM) by a standard'furing machirxr (SM). This exercise shows tlnt a qrrr..u'r'ror.Sol.c. Of course. On one track we will keep the tape contcnts of the OM..2 L .9M will simulate each move of OM by sca. L This dotls rrot limit the power of the rrrachine.. the origintr.ls exn Hrr-{rsFoR SELECTEI Exnncrsns 393 6 .t happens when two reatl-write heads are on the same cell. We introduce a pserrdo-trlrr.l (qi. wt: first write A.!) = (Qi. Then.llyon the tape. Wc repla. Whenever we need to preservtr this a. rrrrcl on. we introdut:tr a pseudo-symbol.tr).b.b. then return to the cell in question to replace .

device than a stack. To simulate a standard TM by a queue machine, we can, for example, keep the right side of the OM in the front of the queue, the left side in the back.

[Figure: tape of OM and its simulation by a queue]

A right move is easy as we just remove the front symbol in the queue and place something in the back. A left move, however, goes against the grain, so the queue contents have to be circulated several times to get everything in the right place. It helps to use additional markers Y and Z to denote boundaries. For example, to simulate δ(qi, c) = (qj, z, L) we carry out the following steps.

(i) Remove c from the front and add zY to the back.
(ii) Circulate contents to get bzYdefgXa.
(iii) Add Z to the back, then circulate, discarding Y and Z as they come to the front.

8. We need just two tapes, one that mirrors the tape of the OM, the second that stores the state of the OM.

[Figure: configuration of OM and configuration of SM]

SM needs only two states: an accepting and a non-accepting state.
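The queue simulation of a left move can be traced with a small sketch using `collections.deque`. The function names are hypothetical. The sketch assumes the queue holds the separator X with at least one symbol to its right (that is, the left side of the tape is nonempty), and it peeks two places into the queue in step (ii), which a real queue machine would have to do by buffering symbols in its finite control.

```python
from collections import deque


def move_right(queue, write):
    """Right move: drop the scanned symbol at the front, append what was written."""
    queue.popleft()
    queue.append(write)


def move_left(queue, write, y="Y", z="Z"):
    """Left move, following steps (i) to (iii) with markers Y and Z."""
    # (i) Remove the scanned symbol and add the written symbol plus Y to the back.
    queue.popleft()
    queue.append(write)
    queue.append(y)
    # (ii) Circulate until the cell to the left of the old head position is at
    # the front, recognised here by Y sitting two places behind the front.
    while queue[2] != y:
        queue.rotate(-1)          # move the front symbol to the back
    # (iii) Add Z, then circulate once more, discarding Y and Z.
    queue.append(z)
    while True:
        symbol = queue.popleft()
        if symbol == y:
            continue              # discard Y
        if symbol == z:
            break                 # discard Z and stop
        queue.append(symbol)
```

Starting from the queue cdefgXab and writing z, step (ii) produces bzYdefgXa as in the text, and the final contents are bzdefgXa, with the new scanned symbol b at the front.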

Section 10.3

(i) Start at the left of the input. Remember the symbol by putting the machine in the appropriate state. Then replace it with X.
(ii) Move the read-write head to the right, stopping (nondeterministically) at the center of the input.
(iii) Compare the symbol there with the remembered one. If they don't match, reject input. If they match, write Y in the cell.
(iv) With the center of the input marked with Y, we can now proceed deterministically, alternately moving left and right, comparing symbols.

For a completely deterministic solution, we first find the center of the input (e.g., by putting markers at each end, and moving them inwards until they meet).

Nondeterministically choose a value for n. Determine if the length of the input is a multiple of n. If it is, accept. If the input is in L, then there is some n for which this works.

Section 10.4

An algorithm, in outline, is as follows.

(i) Start with a copy of the preceding string.
(ii) Find the rightmost 0. Change it to a 1. Then change all the 1's to the right of this to 0's.
(iii) If there are no 0's, change all 1's to 0's and add a 1 on the left.
(iv) Repeat from step (i).

Let S1 = {s1, s2, ...} and S2 = {t1, t2, ...}. Then their union can be enumerated by S1 ∪ S2 = {s1, t1, s2, t2, ...}. If some si = tj, we list it only once. The union of the two sets is therefore countable. For S1 × S2, use the ordering in Figure 10.17.

Section 10.5

First, divide the input by two and move the result to one part of the tape. This frees space initially occupied by the input. This space can then be used to store successive divisors.
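The enumeration outline (find the rightmost 0, change it to 1, reset the 1's to its right, and grow the string when no 0 is left) is binary increment on strings. A minimal sketch follows; the function name `next_binary` is hypothetical.

```python
def next_binary(s):
    """Return the binary string that follows s in the enumeration."""
    i = s.rfind("0")                      # position of the rightmost 0
    if i == -1:
        # No 0's: change all 1's to 0's and add a 1 on the left.
        return "1" + "0" * len(s)
    # Change the rightmost 0 to 1; everything to its right is 1's,
    # which all become 0's.
    return s[:i] + "1" + "0" * (len(s) - i - 1)
```

Repeated application starting from "0" yields 0, 1, 10, 11, 100, and so on, each string obtained from a copy of the preceding one.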

(e) Use a three-track machine as shown below. On the second track, we place dividers every |w| cells. On the third track, we keep the current trial value for |w|. We then compare the cell contents between the markers.

[Figure: three tracks holding the input, the dividers, and the trial value of |w|]

Stack1 contains the symbol under the read-write head of the OM and everything on the left. Stack2 contains all the information to the right of the read-write head. Left and right moves of the OM are easily simulated. For example, δ(qi, a) = (qj, b, L) can be simulated by popping the a off Stack1 and putting a b on Stack2.

[Figure: example configuration of OM and the corresponding stack configurations]

Use Exercise 16, Section 6.2 to find a grammar in two-standard form. Then use the construction in Theorem 7.1. The pda we get from this consumes one input symbol on every move and never increases the stack contents by more than one symbol each time.

Chapter 11

Section 11.1

2. We know that the union of two countable sets is countable and that the set of all recursively enumerable languages is countable. If the set of all languages that are not recursively enumerable were also countable, then the set of all languages would be countable. But this is not the case, as we know.

The argument attempting to show by diagonalization that 2^S is not countable for finite S fails because the table in Figure 11.2 is not square, having |2^S| rows and |S| columns. When we diagonalize, the result on the diagonal could be in one of the rows below.

Let L1 and L2 be two recursively enumerable languages and M1 and M2 be the respective Turing machines that accept these two languages. When presented with an input w, we nondeterministically choose M1 or M2 to process w. The result is a Turing machine that accepts L1 ∪ L2.

For any given w, consider all splits w = w1 w2 ... wn. For each split, determine whether or not wi ∈ L. Since for each w there are only a finite number of splits, we can decide whether or not w ∈ L+.

A context-free language is recursive, so by Theorem 11.4 its complement is also recursive. Note, however, that the complement is not necessarily context-free.

Section 11.2

1. Look at a typical derivation:

S ⇒* aS1bB ⇒ aaS1bbB ⇒* a^n S1 b^n B ⇒ a^(n+1) b^(n-1) B ⇒ a^(n+1) b^(n+1) B.

From this it is not hard to conjecture that the grammar derives L = {a^(n+1) b^(n+k) : n ≥ 1, k = -1, 1, 3, ...}.
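The split argument for deciding membership in L+ can be made concrete as follows. This is a sketch under assumptions: `in_L` stands for any total membership test for a recursive language L, and the names `splits` and `in_plus` are hypothetical. A string of length n has 2^(n-1) splits into nonempty pieces, so the search always terminates.

```python
from itertools import combinations


def splits(w):
    """Yield every way of cutting a nonempty string w into nonempty pieces."""
    n = len(w)
    for k in range(n):                              # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [w[i:j] for i, j in zip(bounds, bounds[1:])]


def in_plus(w, in_L):
    """Decide w in L+ given a decision procedure in_L for L."""
    if w == "":
        return in_L("")                             # lambda is in L+ iff it is in L
    return any(all(in_L(piece) for piece in split) for split in splits(w))


# Example with a small finite language (an assumption made for illustration).
sample_L = {"ab", "c"}
```

With `sample_L`, the string abcab is accepted through the split ab, c, ab, while abca is rejected because no split consists only of members of the language.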

introduced in Example 11.398 ANSwHRS Fornnlly. cE-Ec bE-Eb aE .B --+CI) D . (c) Working with context-sensitive grammars is not always easy. insert dummy variables on the righi whenever l"l > lul. The rrnrestricted gra..sa€ S.rSp --+ .T.rs in Dtllirritiorr 11.. thc: B travcls to thc right to nxret the D. is often usefirl.mma. 7.\.Erl.ced by A. Irr the rtext sttlll. the grammar carr be describedby G: (V u r)+ and L(G): { r e T * : . f{rr all .9}. the first stcp is to crtlatt: thc scrrttlrrtial fclrrn o.rrdstop.AclaBc.3 1. When that happens.3 irrt: txpivalent to this exterrsiorr becarrse to arry given unrestricted grarrurrar we can always add startirtg rrrlrr. In this problern.ub. BD . with 5 f r for any s € .S.P). . Section 11. fhe idea of a messenger.5.2. For example.nBt:nD. The first part is achievtxl casily with the produr:tiorrs S + aAcDlaBcD A + a. AB-C r:anhe repla. To get tlfs forur for unrestricted grammars. s* s (V. we can create one d and a return messengerthat will put the b in the right place a. Thtl equivaklrr(jc irrgunro\nt is straightfrlrwarrl. The variables -B and D will act as rrrarktlrs irrrcl urcsscrrgcrsto assrrrc that the correct nurnber clf b's and dts are r:reaterl irr the riglrt places. bv Bc --+cB Bh --+hB.

we first rewrite it as tuft.nrl cr)rrrpiirc syrnbols rrrrtil yorr grtt ir mat<:h. 'I'he rnachine {w : wR €.. / L(Mr) therr (M. rnoclifv M to get fr. there exists an Iba M that acceots it.Sol. which ha. we apply ft tu fr with a: f ... wlrether or not a specified syrnbol o is ever written.:T. Given (M. with a diff'erent messenger thatcr. \ .fr) # L(M) then (M.. and M writes 'Ihus. then stops. Because LR : ( LR . we could apply the supposed algorithm to the rnodified rnachine. there is an lha tha. rxrntext-sensitive grammar must exiut. This ca. Just stirrt irt oppositt: enrl$ a. easiest argument is from an lba. / \ . reject otherwise. Given (M. Accept if so.t can recognize arty strirrg of thc ftrrtr u. Now construct a simple Ttrring ma.*. 6. 'l'he I-b ubB.u. say M1 . a tF --+ Fr b.ru) does not halt arrcl we havc a solutiorr to tli trattirrg problem. say {a}. M halts implies the M writes f. Ihus.nlba.urroNs alo HrNrs ron Snrlcrrr Exnncrsls 399 Alternatively. and qf?r.w) we modify M so that it always halts irr the configuration eyw. If the giverr problern was decidable. Given ?u. the larrguage is cxrrrtext-sen$itiveemd a.1 3.r'url.d plus a. Suppose that a language is 'I'hen context-sensitive. If L(fr\: t w o l a n g u a g e sw e c o u l d u s e i t t o s e ei f L ( f r \ : .nbe done by M first checking the input irnd rurncmbcring whcthe:r thc input was a.. check if the input was a.. we create a.t reversesa string and applies M is a. Therefore -Lft is contextsensitive.s b. L}. if we have an algorithm that tells us f implies that M halts.w) halts if and only if fr accepts some simple language.lts if ancl only if a sllcr:ial syrntxrl.chine. Therefore M accepts {c. Clearlv. Wc (:arr rlrglle frorn a.)halts. ?. M accepts trrn if and only if w tha. then apply M to it.F aF 4.If we had an algorithm that checks tbr the equality of L(M). Given M and tu. Then M carries out its rrorrrral cornputatiorrs. with configurations qgu.n lba. is writterr. When it halls. 10.*t.but**" tT. 
say arl irrtrrlrluc:ctl syrntrol ff. Chopter 2 | Section 12. that accepts rz. This would solve the halting problem.} if and only if M halts. We carr do this by changing the halting configurations of M so lhat every one writes f. Sirrce there is tln lbtr. .w) rnoclify M to fr so that (M.marker D. This would give us a solution of the halting problem. If I (.

ntrlgrr rithrn A that lists all Ttrring urachines that halt on a blank tape input in some order of increasing lengths of the program.i?2. wQ corr$trlr(]ta Turing machine that behaves a^rf'ollows: if problem : gr1thr)n rcturrr firlst: if problem .rap1.We could therr construct a machine M2 such that ..lgoritltrrr to decide whether or not L(ML) C L(Mr). Suppose we had rr. The set is therefbre recursively enumerirbk). sint:e v/e can construct M1 frorn anv given grammar G.ice's theorem.r h e n . we have-a sohrtion to tlx: blirrrk tape halting problem. the compa. the undeciclability follows from R. 6. M will accept ia]" and it (M. Then L (Mr) Q L (Mz) if and only it L (Mr) : fi.pe. L ( d ) = r ( d ) . then return true Whatever thc truth vahres of the various instances are.400 ANSwER.L (Gz) : E*. Suppose now the set were recursive. Whernevtlr the sintulated computations halt.pz then return true : if problem : p.. Sincc the length of the original program is fixed. The universal Ttrring machine is therefore atr accepter for all Ttrring machines that halt when applicd to a blank tape. Thus. But this contradicts Theorerm 12. accept the T\rring rnac:lftrc:bcirrg sittrulated.L (Mr): a and apply the algorithm. M accepts fl.3 and is therefore undecidable. the problem becomesthe question of Theorem 12. FI'om M get the grammar G by the construction leading to T h e o r e m1 . only to guarantee that it exists.. 2I.S 1 3 . If we take . See if the.But . and some for which this is not ${r. Take a universal T\rring machirre and let it simulate computations on an empty ta.*) halts. Since there are $omc grirmrrrars for vrhich -L (G) : L (G). : {a}*. Remember that it is rrot nccessary to know what the I\rring machine actually is. There would then exist rr. Section 12. there is always some T[rrirrg machirre that gives the right answer. 1 6 . E.r t ( f r ) 1 = { o } * .2 3. If the specific instances of the proble[ fl. 
To do this from first principles is a little harder.*) and modify it (along the lines of Theorem 12. ry that if (M.rison will stolr when this length is exceeded.4).n ir.w) does not halt.3.pn. Take the halting problem (M.original Ttrring machine is amongst the T[rring rnachines generated hy A..

Since wrwi. For a one-letter alphabet.uk: (w[.3. ("). llsing the firnctiorr subtr irt Example 13. rherefore.. g . the other 01.f afld u$e the assumed algorithrn.n} such that flrrl:fluil.y-2)+2 : .... we form tuf. 2.A) : subtr(1. we would have an algorithm for deciding the original MPC-prohlem. 0 ) E + :v12- . If it were decidable..0):1. 3.1 2. 9 * . they can all be checkctl and therefore the problem is decidable..u. a ): m t t l t ( * .Solurrons errp Hrrurs noR SnlncreD ExERctsHs 401 fi L (fr): o.I Since there are only a finite numher of subsets.. (a) The problem is undecidable. JEJ JE.. ( r s f u .y))) ' greater 7. 4( 1 . Given u)Lj'trz. we could get a solution of the haltirrg probletn.. A ( 1 . y ) : A ( 0 . A PC-solution is u3u4w1 : uzutat.3 r.wfu.... tte original MPC-problern has a solution if and only if the new MPC-problern has a solution.we get the solution (*.) n'n.y-1)+1 :A(1. Section 12.subtr(I.f)f . (t) : @ and .: {A}. s' (a) A ( 1 . w{. if this problem were decidable. Chopter 3 | Section 13.subtr(r. There is no MPC-solution becarrse one string would have a prefix 001. g(r.. there is a PC-solution if and only if there is some subset J of {1.1 ) ) . 5. then .-s 1 ) ) :A(1.

Tlrt:rcfrrrc pa(2"*s-B):1. Since A(2. t ) . giving a minimum of value of g : 1. Section 13.2 > 1}.2 1. t h e uA : 3 2 ' " . 1}.2n+1) :2n* 3. Assume that for A : I. tFor exanrple 1 * 1 : 1 + 1 1 +1 : 1 1+ 1 1* 1 1:.0): A(1.. ir 15. the only possibleidentification of Vr is with the entire derivcd string. (b) UseCr : {a.4OZ AuswnRs (b) With the resultsof part (a) we can useinductionto provethe next identity. the r: is removedby VnVz + VVy 3. and so on.A ( z . Then A ( Z .I ) ) :A(1..b.c}. so the doma.a) :2U + S..in p is {0. T h e o n l y v a l u e s f r t h a t g i v ea o positivegrare 0 and 1.1) -3. At every step. This results in a doubling of the string and n: 5.V 1 I + V 2 : V t V z V1+ V2 : Vt -+ Vt * VzL: VsVr. {a'". fiom part (a). 1 1 1 1 .D. wtt havqt basisand the equation iu true fbr all y. n ' ) : A ( 1 .If 2"'IA-3:0.V2{r V lVrhrVzbl crVpl At the end. The non-terminal is r usedas a boundary betweenthe left and right side of the target string and the two ru's are built sirnultaneously by V1rV2--+V1 ar. Cx7: {ru} and A : {c}. we haveA (2. r V 2 : V r .. A solutionis V 1.2. n .1.
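The identities for Ackermann's function obtained in this exercise, A(1, y) = y + 2 and A(2, y) = 2y + 3, can be checked numerically for small arguments. The sketch below assumes the usual recursive definition A(0, y) = y + 1, A(x, 0) = A(x - 1, 1), A(x, y + 1) = A(x - 1, A(x, y)); the function name `ackermann` is a label chosen here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ackermann(x, y):
    """Ackermann's function under the assumed standard definition."""
    if x == 0:
        return y + 1
    if y == 0:
        return ackermann(x - 1, 1)
    return ackermann(x - 1, ackermann(x, y - 1))


# Check the two identities for small y only; larger x grows far too fast.
identity_1_holds = all(ackermann(1, y) == y + 2 for y in range(10))
identity_2_holds = all(ackermann(2, y) == 2 * y + 3 for y in range(10))
```

A numerical check of this kind is no substitute for the induction argument, but it is a quick guard against slips in the algebra.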

say a255. .\.92 --+ .Sz .91 --+ ..Sol.ecrnD ExERcrsEs 403 Section 13.9r. Sz + aSz P3 ^9r- b. This can be derived from a127 by applying & + &&& once and (t + ee. 5.bSz Pa . Thus every string in L (aa*) can be derived.3 1. Take any string. 8.\. P1 ^9* ^9r^92 P2 5r --+a5r.urtot-rsann Htrvrs noR Snl. Although this is not so easy to see.\. The solution here is reminiscent of the use of messengerswith contextsensltlve grammars. 1?6 times. this is one way to solve Exercise 7. and so on. ab-+n nb+br rc .r way. Then a127 can be derived from a63 in a simila.




