Information theory and molecular biology

HUBERT P. YOCKEY

CAMBRIDGE

Contents

Preface
Prologue

Part I: The basic mathematical ideas

1 Basic ideas in probability theory
  1.1.2 The meaning of probability
  1.2.2 The sample space in probability theory
  1.2.4 The axiomatic foundation of probability theory
  1.2.8 Random variables
  1.2.9 Some important probability distributions
  1.3 Probability matrices and probability vectors
  1.3.1 Definition of matrices and vectors
  1.3.2 Algebraic properties of matrices and vectors
  1.3.4 The eigenvalues and eigenvectors associated with a matrix
  1.3.5 Markov chains and the random walk
2 The role of entropy as a quantitative measure of information, uncertainty and complexity
  2.1.3 The entropy of probability distributions
  2.3.1 The entropy of Markov chains
3 The principle of maximum entropy
4 Coding theory and codes with a Central Dogma
  4.3 Error detecting and error correcting codes
  4.4 Codes with a Central Dogma
5 The source, transmission and reception of information

Part II: Applications to problems in molecular biology

6 The information content or complexity of protein families
7 Evolution of the genetic code and its modern characteristics
8 The early Earth and the primeval soup
  8.2 Chemical evolution and the primeval soup
9 Did life emerge by chance from a primeval soup?
10 Self-organization origin of life scenarios
11 Error theories of aging
12 Information theory and molecular evolution
Epilogue
References
Author index
Subject index

Preface

This monograph is intended to introduce molecular biologists to the application of information theory and coding theory in molecular biology. A curious lack of communication between molecular biologists and the mathematicians and electrical engineers who developed information theory and coding theory has resulted in molecular biologists being unaware that the correct formulation and, in some cases, the solution of their problems has already been developed.
By the same token, mathematicians are unaware of the very important opportunities for applied mathematics in molecular biology. This monograph is intended to be read by both groups with the expectation of significant mutual benefit.

Part I presents the mathematical ideas and background in sufficiently complete form that it is not necessary to refer to other books. Part II is devoted to the application of these mathematical ideas to problems in molecular biology. Those readers not familiar with a reasonably sophisticated presentation of probability theory are advised to study the material in Part I carefully. Mathematicians will miss σ-algebras, Borel sets and Lebesgue measure theory, but most readers will find quite enough mathematical sophistication in Part I. On the other hand, I suggest that mathematicians have a copy of a good text on molecular biology handy as they read this book. Mathematicians unfamiliar with molecular biology will find they need an explanation of the basic ideas and experimental results in molecular biology to understand the application of information theory and coding theory to problems in Part II.

I have tried to be reasonably consistent in the definition of mathematical symbols. Nevertheless, the notation may vary from chapter to chapter because I have adopted the notation found in the literature for the different subjects that are discussed. No confusion will result since the terms are defined in each chapter. The theorems in Part I are referenced in Part II so that the reader may turn back to refresh his or her memory. In no case have I tried to limit the sophistication of the mathematics. I have provided enough explanation of both the mathematics and the molecular biology so that an advanced undergraduate student or graduate student in either biology or mathematics can understand the argument. This monograph may serve as a text for a graduate seminar attended by both molecular biologists and mathematicians.
Established professionals also will find sufficient interest in information theory and coding theory and their application in molecular biology. They may wish to broaden their appreciation by studying the references. I have tried to make the references as complete as the subject demands. This is a monograph and not an encyclopedia, so I have made no attempt to cite all references available. I have not regarded it necessary to call attention to every paper I believe does not make an important contribution or to cite those which are incorrect. I have called attention in the literature only to the work contrary to the point being made. Some readers may think that I have neglected an important paper here and there. I acknowledge that this may be the case, but there are times when one must hew to the line and let the chips fall where they may. The reference material is up to date as of August 3, 1991.

This monograph follows my interest in the subject, which was first stimulated by the work of Dr Henry Quastler. I am indebted to Professor Thomas H. Jukes, whose strong recommendations resulted in my original papers being published. Many of Professor Jukes' important contributions to molecular biology have shaped the ideas presented in this book, especially those concerning the evolution of the genetic code. Professor Jukes also introduced me to Dr Adam S. Wilkins of Cambridge University Press, who saw the merit of my proposal to write this book. I also am grateful to Dr Gregory J. Chaitin, whose original and seminal work in algorithmic information theory is reflected throughout the book. I thank both Professor Jukes and Dr Chaitin for the careful review of a draft of this book. Each made important suggestions for its improvement. I thank Dr Paul Wool, who reviewed the penultimate draft with much zeal and also contributed a number of suggestions that have removed errors and improved the clarity of the text. I would also like to thank Dr Robin C. Smith, who graciously completed the editing of this book after Dr Wilkins left for other work. Dr Susan Parkinson copy-edited the book. She called my attention to a number of errors and contributed in an important measure to the clarity of the text. I am indebted to Dr Bryan Mackay of the Biology Department at the University of Maryland, Baltimore County campus, for the use of the facilities at the Albin O. Kuhn Library and Gallery.
My daughter, Cynthia Ann Yockey, edited this manuscript from proposal to final draft and contributed much to improve the clarity and organization of the material. Without the contribution of these people I would not have been able to write this book.

Hubert P. Yockey
Bel Air, Maryland, USA

Prologue

[…]

The purpose of this book is to show that the sequence hypothesis and the genetic code, together with the laws of chemistry, are essential to the mathematical foundations of molecular biology. The sequence hypothesis states that the sequences of nucleotides and amino acids carry the information that controls the folding and the activity of proteins (Watson & Crick, 1953; Crick et al., 1961; Anfinsen, 1973; Lak, 1988). The essential difference between living and non-living matter is that the formation and function of the principal biological molecules is governed by genetic messages. No case of a comparable specification by a code between sequences exists in non-living matter (Mayr, 1985). The issue of the sufficiency of physics and chemistry to deal with the complexity of biology was discussed long ago by Bohr (1933):

"The recognition of the essential importance of fundamentally atomistic features in the functions of living organisms is by no means sufficient, however, for a comprehensive explanation of biological phenomena. […] before we can reach an understanding of life on the basis of physical experience."

Although molecular biology must be consistent with chemistry and physics, biological principles are not derivable from physics and chemistry alone (Monod, 1972; Mayr, 1982, 1988; Küppers, 1986, 1990). Non-reducibility of this sort is not unusual in science. For instance, the second law of thermodynamics is consistent with Newton's laws of motion, […] but is not derivable from them.

Bohr (1933) drew an analogy between the existence of life and the complementarity principle of quantum mechanics. The photon may act as a wave, for example in a diffraction experiment, or in the same diffraction experiment it may act as a particle. In experiments at very low light levels the photon is detected at the positions in the diffraction pattern appropriate to its wave-like properties, in stochastic fashion, as if it were a particle.
The same wave-particle duality has been found in the case of electrons and neutrons. Bohr regarded these apparently contradictory properties as being complementary aspects of the same thing. The wave-particle duality imposes a limit on our ability to measure both the momentum and position of an atomic particle such as an electron or a neutron. The experimental fact is that the measuring process affects both the momentum and the position, and induces an experimental error in each. The relation between these errors is given by the Heisenberg uncertainty principle, $\Delta p\,\Delta x \geq h/2\pi$. Bohr extended the complementarity principle to the question of the ability of science to understand life:

"Thus, we should doubtless kill an animal if we tried to carry the investigation of its organs so far that we could describe the role played by single atoms in vital functions. In every experiment on living organisms, there must remain an uncertainty as regards the physical conditions to which they are subjected, and the idea suggests itself that the minimal freedom we must allow the organism in this respect is just large enough to permit it, so to say, to hide its ultimate secrets from us. On this view, the existence of life must be considered as an elementary fact that cannot be explained, but must be taken as a starting point in biology, in a similar way as the quantum of action, which appears as an irrational element from the point of view of classical mechanical physics, taken together with the existence of elementary particles, forms the foundation of atomic physics." (Bohr, 1933)

Intuition led many to believe that the inheritance of characters such as tall and short, as in Mendel's experiments with strains of peas, would yield a blending of these traits and produce plants of medium height. The facts with which the reader is familiar show that this is the case only when neither trait is dominant. The theory of blending inheritance predicts that experiments should show that the variance of the offspring would be reduced to one-half of that of the parents in each generation (Fisher, 1930). This predicts either the disappearance of favored traits or that mutations must be several thousand times as frequent as they are known to be. Therefore, according to the blending theory of inheritance, Darwinian evolution would never occur. Küppers (1986, 1990) credits Fleeming Jenkin, a contemporary of Darwin, with having pointed this out.
Mendel's laws of dominance, recessiveness and segregation are counter-intuitive and therefore must be regarded as axioms or elementary facts that cannot be reduced to simpler ones. This may be one of the reasons that Mendel's work was ignored until the early years of the twentieth century (Mayr, 1982). To adopt this point of view may cause considerable Angst […] classical physics when Bohr […], for which he was awarded the Nobel Prize in 1922, that […] electrons, which […] should spiral into the positively charged center of the atom, […] are prevented by the quantum of action from radiating energy as required by Maxwell's laws. The sequence hypothesis may be taken as a postulate or axiom of molecular biology and plays the same role for […]. Like the quantum of action, it is extremely well supported by experimental evidence. Stent (1989) has reviewed Bohr's 'Light and Life' lecture, in which he drew attention to the implications for biology of the fundamental changes in natural law that quantum mechanics had brought to physics.

The sequence hypothesis, first suggested by Schrödinger (1944) and given its correct formulation by Watson and Crick (Watson & Crick, 1953; Crick, 1970), accounts for the diversity of living organisms in the […] number of genetic messages that may be recorded in the sequences of nucleotides and amino acids. The phenomenon of […] shows that there is enough information in a linear sequence of symbols selected from a finite alphabet to determine a three-dimensional object. One is reminded of the linear sequence of a signal from a television station which, because of the scan, describes a two-dimensional object, namely, the picture on the screen. […]

Reasoning with words alone often […], since almost every word in the dictionary has multiple meanings. The proper meaning depends on context. A writer may use the same word in different senses in the same paragraph or even in the same sentence. An example of reasoning with
words alone that would be accepted by few reasonable people today is the argument given by Aristotle (Politics), in which he points out that people are unequal in ability and says that the fact that some people, especially barbarians (that is, non-Greeks), […] is decreed by Nature. Aristotle considered that slavery was natural and just, as was enslaving prisoners of war. He argued that slaves were actually benefited by being taken from their uncivilized native land and becoming part of a civilized Greek society, even at the bottom of the social scale. He took this position in spite of the fact that he was aware of objections by others, who argued that the rule of master over slave is based on force and is therefore wrong. Aristotle, the inventor of formal logic, based his arguments on the meanings of words (ancient Greek words, what else?) in order to show what is 'contrary to Nature' […] the social […] of life.

The Greek mathematicians during the same period were discovering the axiomatic method of reasoning. They found that all geometry, both plane and solid, could be reduced to a set of ten axioms from which all theorems could be proved (Euclid, Elements of Geometry). Thus Pythagoras found a proof of the many isolated facts about the square on the hypotenuse that were well known to the Babylonians and the Egyptians […]. Mathematical symbols and the postulates on which […] the argument do not change or have multiple meanings. It is not possible to draw unwarranted conclusions if the argument is carried on in the context of a mathematical discussion. Plato, in Meno, has Socrates draw from the slave boy the correct way to double a square just by asking questions. (Quickly now, how does one double a square?) In this book I shall show a number of cases where reasoning with words alone has led […]

[…] that the DNA sequence of four letters determined the protein sequence of 20 amino acids by means of a code so unconventional that, had Gamow's paper been submitted by almost anyone else, it would have most […] been rejected. Peer review may take on the […], protecting the readers of technical journals from heresy (Maddox, […]). But the world of science […], and although this code was soon shown to be […], […]. The search was on for the correct genetic code.
We must distinguish clearly between axioms and the theorems that are derived from them. For example, it is easy to show that the Central Dogma (Crick, 1970) is a theorem of coding theory and not a fundamental biological phenomenon. The Central Dogma states that information may be transferred from DNA to DNA, DNA to mRNA and mRNA to protein. Under special circumstances, information may be transferred from mRNA to mRNA, mRNA to DNA and DNA to protein. Three transfers that the Central Dogma states never occur are protein to protein, protein to DNA and protein to mRNA. The DNA and mRNA genetic codes have 64 code words made of triplets from an alphabet of four letters, […] protein, which uses an alphabet of 20 […].

[…] (Monod, 1972) […] No harm comes from this since they are so well supported by […].

The mathematical formalism that deals with the properties of sequences is called information theory. The formalism that deals with the relations between sequences is called coding theory.
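The counting behind the code-word figure can be checked directly. The following is a minimal sketch, not the author's derivation: it enumerates the triplets over a four-letter alphabet and uses a handful of well-established assignments from the standard genetic code (the names `PARTIAL_CODE` and `preimages` are hypothetical, chosen for illustration) to show that a many-to-one mapping cannot be read backwards uniquely. Reading this as the reason the Central Dogma follows from coding theory is an assumption offered for illustration.

```python
from itertools import product

NUCLEOTIDES = "UCAG"  # the four-letter mRNA alphabet

# Every triplet over a four-letter alphabet: 4**3 code words.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# A deliberately partial table: only a few assignments from the
# standard genetic code, enough to illustrate degeneracy.
PARTIAL_CODE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "UGG": "Trp",
    "AUG": "Met",
}

def preimages(amino_acid):
    """Return every codon in PARTIAL_CODE that maps to amino_acid."""
    return sorted(c for c, aa in PARTIAL_CODE.items() if aa == amino_acid)

print(len(codons))        # number of distinct triplets
print(preimages("Ala"))   # several codons share one amino acid
print(preimages("Trp"))   # a single codon
```

Translation from codon to amino acid is a single table lookup, whereas going from "Ala" back to a codon leaves four candidates even in this small table, so the original code word cannot be recovered from the protein letter alone.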
Terms such as 'information', 'complexity', […] and 'randomness' are often used with quite different meanings by different authors. Yet the precise meanings of these terms flow from the mathematical formalism […].

Great ideas and discoveries often have deep roots; so it is with information theory. One must have a means of measurement in order to embody these concepts in a mathematical formalism. A logarithmic measure of information was first suggested by R. V. L. Hartley, an engineer with the Bell Telephone Laboratories (Hartley, 1928). Important ideas were contributed by Norbert Wiener (1948), who also used a logarithmic measure for information and noted that the measure had many of the properties of the Boltzmann expression for entropy in statistical mechanics. According to Albert Szent-Györgyi, discovery is seeing what everyone has seen but thinking what no one has thought (cited by Musskeneim, 1986). And so it fell to the lot of Claude E. Shannon (1948) to think what no one had thought and to supply the missing ideas for the mathematical formalisms we now know as information theory and coding theory. In these papers he found a unique measure of information based on the mathematical principles of set theory and proved a number of theorems that provided an answer to the questions of how well the communication systems of the telephone and telegraph companies were being used and how they could be improved. Messages are written as a sequence of symbols chosen from a finite number of symbols called an alphabet. These symbols are selected at the source, where $p_i$ is the probability of the $i$th symbol. Shannon's measure of the information per symbol is

$$H = -\sum_i p_i \log_2 p_i \qquad (0.1)$$

This expression is minus the expectation value of the logarithm of the probability $p_i$. The information per symbol in a sequence is given a quantitative definition by equation (0.1), and therefore information is measurable though not material.
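As a numerical illustration of Shannon's measure, here is a minimal sketch that evaluates minus the expectation value of the logarithm of the symbol probabilities. It assumes base-2 logarithms (entropy in bits per symbol) and estimates probabilities from symbol frequencies; the function names `shannon_entropy` and `entropy_of_sequence` are hypothetical and not taken from the text.

```python
import math
from collections import Counter

def shannon_entropy(probs, base=2.0):
    """Return H = -sum(p_i * log(p_i)); symbols with p_i == 0 contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)

def entropy_of_sequence(seq):
    """Estimate the entropy per symbol from the symbol frequencies in seq."""
    counts = Counter(seq)
    n = len(seq)
    return shannon_entropy([c / n for c in counts.values()])

# Four equally probable letters versus a strongly biased source.
uniform_dna = shannon_entropy([0.25] * 4)
biased_dna = shannon_entropy([0.7, 0.1, 0.1, 0.1])

print(uniform_dna)
print(biased_dna)
print(entropy_of_sequence("ACGTACGTAAAA"))
```

The uniform four-letter source reaches the maximum for that alphabet, and any bias in the probabilities lowers the value. Note that the frequency-based estimate treats symbols as independent and ignores any dependence between neighbouring symbols.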
Since information is a function of probabilities, information theory is a branch of the mathematical theory of probability, and so Chapter 1 is devoted to that subject. Equation (0.1) shows that the word 'information' can be identified with a mathematical function that captures most, but not all, of what the word conveys. Shannon's measure does not measure the meaning of a message, as he pointed out at the beginning of his first paper (Shannon, 1948). Critics of information theory decry this as a defect and fail to notice the second paragraph of Shannon's paper, which explains why this is so. The communication system that the engineer designs must be capable of transmitting, decoding and recording any member of a large ensemble of messages. By the same token, the biological information system must be able to accommodate the genetic messages of all organisms that have lived in the past, are now living or may live in the future. As we shall see in Chapter 2, both communication systems are essentially data recording and processing systems.

There is an anecdote (Tribus, 1979) that Shannon, desiring a name for the function given in equation (0.1), sought the advice of Professor John von Neumann, who replied, 'You should call it "entropy" for two reasons. First, the function is already in use in statistical mechanics under that name (Boltzmann, 187[…]). Second, and more importantly, most people don't know what "entropy" really is. If you use "entropy" in an argument you will win every time!' The Maxwell-Boltzmann-Gibbs entropy of thermodynamics is based on a probability space […] that is different from that of Shannon, and the two have no relation. To reflect this fact it is usual in the literature to call the function expressed in equation (0.1) the Shannon entropy.

It is perhaps curious to the reader that the genetic system is isomorphic with communication systems designed by communications engineers. As a matter of fact, biological systems have historical priority, since organisms have been using the principles of information theory and coding theory for at least $3.8 \times 10^9$ years. However, terms peculiar to communication engineering colored the original papers by Shannon. These concepts have been found to be useful in the development of contiguous disciplines, namely, statistics, computer science, decision theory and, of course, molecular biology.
It will often be emphasized in this book that theorems and discoveries […] to which we would not be led by intuition. This is not unusual in science; quantum mechanics is full of them. X-rays and gamma rays have zero mass and are certainly electromagnetic waves. Yet, as was predicted, they collide with electrons like particles. Electrons, on the other hand, have mass and are certainly particles. Yet, as was predicted, in reflecting from a crystal they behave like waves. These very important results are contrary to intuition and take some getting used to. In information theory, Shannon's channel capacity theorem is an especially significant example of a counter-intuitive concept. This theorem proves that a code exists such that an original message may be recovered, by the proper use of redundancy, from the received message altered by noise, with as small a probability of error as we please. It is due to this theorem that the genetic apparatus has the ability to preserve for so many billion years the information needed to form a protein. This result runs so much counter to naive expectations that it has been challenged publicly by communication engineers ([…]). An approach based on intuition would lead one to expect that the only way to overcome the effect of noise would be to repeat the message several times. There is no way of knowing how many times the message must be sent to achieve a given accuracy. Since most of the messages sent are correct, this would in any event be very wasteful. In conversation, simply repeating a message is the only way we have to reduce the effect of noise. 'Channel capacity' is a […] term. The channel capacity theorem shows that there is a well-defined information capacity for a communication channel.

The practical value of this, and other theorems in information theory and coding theory, to the world's economy is above measure. It permits the transfer of enormous funds between financial institutions and among nations at the speed of light. Obviously,
errors could not be tolerated in such transactions. Spectacular color photographs have been received from spacecraft as far away as Neptune. These pictures were sent by use of the powerful Reed-Muller code. The brightness of each dot in the picture was encoded in a code word (or codon) of 32 bits having 26 redundancies. These are only two examples of the use of Shannon's channel capacity theorem, which we now take for granted.

I shall replace the […] amino acids […] by certain symbols in applying information theory and coding theory to molecular biology. These symbols represent what is important in the DNA, mRNA and protein molecules and are the elements that carry the genetic messages. The exact location of the various atoms that compose these information […] is unimportant and merely clutters our thinking. In molecular biology the molecules, each with chemical and physical properties, are like chessmen, each with its inherent possible moves and rank. The essence of chess is in the rules of the game by which the game is played, not in the substance of the chessmen. The physical chessmen can be replaced by marks on paper, by electronic signals carrying messages between the players, or even by the mental image of the board in the minds of the players, without changing the character of the game. The physics and chemistry are the rules of the game; the genetic messages are the strategy. Thus the enormous diversity of living organisms is seen in the enormous diversity of genetic messages that determine the life and characteristics of each organism.

The need for a fundamental theory in molecular biology, called for by Maddox ([…]), can be satisfied by including information theory and coding theory as well as physics and chemistry. According to the strict reductionist paradigm, molecular biology, and indeed all biology, is a branch of physics and chemistry (Maddox, 1983). The strict materialist who insists that science must not be concerned with anything that is not matter or energy is already in serious trouble. In quantum theory one must use the wave function that is a solution to the Schrödinger equation.
These wave functions often incorporate the square root of minus one, and are clearly not matter. Wave functions are kind enough not to appear in the real world except in the form of the product with their complex conjugate, which must be a real number.

In considering the question of whether biology is totally contained in physics and chemistry, let us ask: Is celestial mechanics totally contained in Newton's laws of motion and Newton's law of gravitation? An examination of these laws reveals no mention of the orbit of Mars. The problems that can be solved by Newton's mechanics include the pendulum, the gyroscope and the various problems in celestial mechanics. In each case one must first pose the problem […] and then apply these laws to the solution. By the same token, in biology we must first define the problem and then apply the appropriate laws. As we shall see later in this book, there must be considerable information in the message that defines the problem. For example, in celestial mechanics the Sun and the planets are first regarded as point masses. This is sufficient to derive Kepler's laws of planetary motion but it does not explain the precession of the equinoxes. To handle that problem one must define the Earth as a rigid body with certain characteristics. A number of bodies in the solar system are close enough that the gravitational field varies appreciably over the dimension of the attracting bodies. The Earth-Moon system is a familiar example. The action of the Moon causes tides. Some flexibility in the Earth itself and some motion of the oceans must be allowed in order to describe the tides. One must use more information about the two bodies in order to define the problem in each case. We may find the consequences of the information in the definition of the problem, but we cannot get more information out of a problem than we put in. By the same token, biology is not contained in physics and chemistry (Monod, 1972; Mayr, 1982, 1988). That which the sequence hypothesis and the genetic code can […] must be added in the definition of the problem. We shall find in this book that biology is not part of physics and chemistry, but rather physics and chemistry are the handmaidens of biology.

Jacques Loeb was the […] of the reductionist paradigm during the controversies with the vitalist philosophers Henri Bergson and Hans Driesch in the late nineteenth and early twentieth centuries (Pauly, 1987).
The title of his book, The Mechanistic Conception of Life: Biological Essays (1912), was a deliberate pun. Loeb's experiments with artificial parthenogenesis and Muller's experiments with the genetics of mutations induced with X-rays were prominent in supporting the reductionist paradigm. These experiments, together with progress in all fields of biochemistry, gradually diminished the influence of the vitalist philosophers. Bohr (1933) made it clear that he was not reviving vitalism in his 'Light and Life' lecture. The information content of the genome is non-material but nevertheless measurable. Thus we can avoid the defects of strict reductionism without adopting any of the 'romanticism', as Loeb called it, of vitalism.

The strict reductionist paradigm has been so widely accepted that the full role of the sequence hypothesis has remained largely undeveloped. Most of the reason is that information theory has been greatly misunderstood, and this is due to an error, similar to that made by Aristotle, in attempting to base the argument on the meaning of words, namely, the anthropomorphism of the word 'information'. In order to force Shannon entropy to measure meaning, […] have […] the information content of a sequence into purposeful, purposeless, […] and derivative information (Johnson, 1970). The notion of negative entropy has also […] the literature. The anthropomorphism associated with the word 'information' is avoided if one regards the genetic system as a data recording and processing system.

It is appropriate to define, in the beginning, what is meant by 'theory' in this book. A theory is a statement and a set of axioms (or postulates) together with all the theorems that can be proved from these axioms. Famous examples of theories are Euclidean geometry, Mendel's laws, thermodynamics, Newton's mechanics and Maxwell's laws. Some may […]

[…]

Let $\sum'$ denote summation over the first set and let $\sum''$ denote summation over the second set. Then the two sets may be […]

[…]

Therefore […] approach […] non-zero constant limits.
If the stochastic matrix is doubly stochastic and the vector $p$ satisfies the equation $pP = p$, then each component of $p$ is $1/n$. This problem and further developments can be found in Moran (1980).

1.3.4 The eigenvalues and eigenvectors associated with a matrix

I pointed out in section 1.1.2 that the matrix A may be thought of as operating on a vector to change it to another vector. That is, the matrix A can be thought of as a mapping of the points in the space X of the m-component vectors onto the points in the space Y of the p-component vectors. The situation is very complicated if one considers all vectors in these spaces. However, there are special vectors that behave in a very simple way when operated on by a matrix A that maps space X into itself. Such a vector, $x$, when multiplied by the matrix becomes a scalar multiple of itself, $\lambda x$. This is called a characteristic vector, latent vector or proper vector of the matrix. The more frequent designation is the Deutsch-English hybrid 'eigenvector'. The German word 'eigen' is translated as peculiar, proper, special. This describes the relationship of the eigenvector to the matrix A. Eigenvectors have a number of important properties, among which is that they are mutually orthogonal if A is symmetric.

Definition: An $m \times n$ matrix $A(\lambda)$ whose elements are polynomials in $\lambda$ is called a $\lambda$-matrix. $A(\lambda)$ is said to be singular or non-singular according to whether the determinant $|A(\lambda)|$ is zero or not.

Definition: If A is a square matrix of order $n$ and $x$ is a non-zero column vector that satisfies the equation

$$Ax = \lambda x$$

then $x$ is called a right characteristic vector, latent vector, proper vector or, more frequently, the right eigenvector of A corresponding to $\lambda$. Non-zero means that at least one component of $x$ is not zero.

If A is a square matrix of order $n$ and $x$ is a non-zero row vector that satisfies the equation

$$xA = \lambda x$$

then $x$ is called a left characteristic vector, latent vector, proper vector or, more frequently, the left eigenvector corresponding to $\lambda$. Since matrix multiplication is not commutative, left and right eigenvectors are not necessarily equal.

Definitions: If A is a square matrix of order $n$ then the $\lambda$-matrix $(A - \lambda I)$ is called the characteristic matrix of A. The determinant $|A - \lambda I|$ is called the characteristic determinant of A. The expansion of this determinant is a polynomial of degree $n$ in $\lambda$ and is called the characteristic function of A. When set equal to zero this polynomial is called the characteristic equation of A.
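The roots of the characteristic polynomial are called the characteristic roots of A. Other terms that appear in the literature are latent roots, proper roots and secular values. The most popular term in the scientific literature, however, is eigenvalue, another Deutsch-English hybrid. The set of eigenvalues is called the spectrum of A. If the eigenvalues are all different the spectrum is said to be distinct. If two or more eigenvalues are equal the spectrum is said to be degenerate.

Theorem 1.6: There is a non-trivial solution of the equation $A\mathbf{x} = \lambda\mathbf{x}$ if and only if the determinant $|A - \lambda I|$ vanishes.

Proof: Setting the characteristic determinant equal to zero gives the characteristic equation. This equation may be solved for each eigenvalue $\lambda$. Since A is a square matrix, $A\mathbf{x} = \lambda\mathbf{x}$ may be written as a set of simultaneous equations and solved by Cramer's rule.

Corollary: A has the eigenvector $\mathbf{x}$ if and only if $\lambda$ is an eigenvalue of A.

Theorem 1.7: If A is symmetric ($a_{jk} = a_{kj}$) and its eigenvalues are distinct, then its eigenvectors are orthogonal.

Proof: Given the n distinct eigenvalues $\lambda_i$, then $[A - \lambda_i I]\mathbf{x} = 0$ may be written as a system of n simultaneous equations in n unknowns for the components of $\mathbf{x}$. Let the components of $\mathbf{x}_i$, which corresponds to the eigenvalue $\lambda_i$ of A, be designated $x_{ik}$. Then, calling the elements of A $a_{jk}$, we have

$$\sum_k a_{jk} x_{ik} = \lambda_i x_{ij} \qquad (1.52)$$

Multiply both sides of equation (1.52) by $x_{mj}$ and sum over j:

$$\sum_j \sum_k a_{jk} x_{ik} x_{mj} = \lambda_i \sum_j x_{ij} x_{mj} \qquad (1.53)$$

The indices are only labels; therefore, we may write the same equation as

$$\sum_j \sum_k a_{jk} x_{mk} x_{ij} = \lambda_m \sum_j x_{mj} x_{ij} \qquad (1.54)$$

Now interchange j and k on the left of this equation:

$$\sum_k \sum_j a_{kj} x_{mj} x_{ik} = \lambda_m \sum_j x_{mj} x_{ij} \qquad (1.55)$$

Since A is symmetric ($a_{jk} = a_{kj}$) the left-hand sides of equation (1.53) and equation (1.55) are equal, so

$$(\lambda_i - \lambda_m) \sum_j x_{ij} x_{mj} = 0 \qquad (1.56)$$

Since the eigenvalues are distinct, then

$$\sum_j x_{ij} x_{mj} = 0 \qquad (1.57)$$

The expression $\sum_j x_{ij} x_{mj}$ is the inner product of the eigenvectors $\mathbf{x}_i$ and $\mathbf{x}_m$, which vanishes when two vectors are orthogonal. Therefore the eigenvectors of a symmetric matrix are orthogonal if the eigenvalues are distinct.

The eigenvectors form the basis of an abstract Euclidean space in which all other vectors in that space can be expressed by linear combination. The axioms of Euclidean geometry are obeyed in this abstract space as they are in the space in which we live. This is another illustration of the power and generality of the axiomatic method. If one takes a literal interpretation of Euclid's axioms this point would be missed. Generalizations of this kind are frequently made in mathematics and theoretical physics.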
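Theorem 1.7 and the remark about eigenvectors forming a basis can be checked numerically on a small case. The sketch below is only an illustration: the 3 × 3 symmetric matrix, its hand-derived eigenpairs and every function name are assumptions chosen for the example, not quantities from the text.

```python
import math

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, x):
    """Matrix times column vector."""
    return [dot(row, x) for row in m]

# Hypothetical symmetric matrix (a_jk = a_kj) with three distinct eigenvalues.
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

ROOT2 = math.sqrt(2.0)

# Eigenvalue / eigenvector pairs worked out by hand for this matrix.
EIGENPAIRS = [
    (2.0 + ROOT2, [1.0, ROOT2, 1.0]),
    (2.0, [1.0, 0.0, -1.0]),
    (2.0 - ROOT2, [1.0, -ROOT2, 1.0]),
]

def residual(m, lam, x):
    """Largest component of (A x - lam x); near zero for a true eigenpair."""
    return max(abs(ax - lam * xi) for ax, xi in zip(mat_vec(m, x), x))

def expand_in_eigenbasis(v, eigenpairs):
    """Coefficients of v as a linear combination of orthogonal eigenvectors."""
    return [dot(v, x) / dot(x, x) for _, x in eigenpairs]

def rebuild(coeffs, eigenpairs):
    """Recombine the coefficients into the original vector."""
    n = len(eigenpairs[0][1])
    return [sum(c * x[k] for c, (_, x) in zip(coeffs, eigenpairs))
            for k in range(n)]

for lam, x in EIGENPAIRS:
    print("lambda =", lam, "residual =", residual(A, lam, x))

for i in range(3):
    for m in range(i + 1, 3):
        print("inner product", i, m, "=", dot(EIGENPAIRS[i][1], EIGENPAIRS[m][1]))

v = [3.0, -1.0, 4.0]  # an arbitrary vector in the same space
coeffs = expand_in_eigenbasis(v, EIGENPAIRS)
print("coefficients:", coeffs, "rebuilt:", rebuild(coeffs, EIGENPAIRS))
```

The projection formula used in `expand_in_eigenbasis` is valid only because the eigenvectors are mutually orthogonal; for a non-symmetric matrix a linear system would have to be solved instead.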
Borstnik & Hofacker (1985) have applied the eigenvectors of the mutation probability matrix of Dayhoff, Eck & Park (1972), which is approximately symmetric, to the construction of an abstract Euclidean property space of amino acids. The position of the amino acids and the distance between them is a measure of their functional equivalence as replacements in mutation (section 3.2).

The reader will be delighted (or appalled) to know that not only is there a vector and matrix algebra but a differential and integral calculus of vectors and matrices (Bellman, 1970). Furthermore, mathematicians have more complicated arrays of numbers, called tensors, which also have an algebra and calculus.

The algebra and calculus of matrices and tensors will find many applications in theoretical molecular biology because they enable us to make computations with large amounts of data that conserve all the knowledge in that data. Most quantities in molecular biology cannot be expressed by a single number without destroying knowledge.

1.3.5 Markov chains and the random walk

In Chapter 12 I shall apply matrix algebra to the construction of phylogenetic chains. Phylogenetic chains, which relate one protein to another by a series of mutational steps, are examples of the general mathematical theory of the transitions between successive states of a system. The sequence of Markov states (or events) relating the transitions between the initial and final states of a system is known as a Markov process or Markov chain. In molecular biology we are concerned only with discrete and finite Markov chains (Kemeny, Snell & Knapp, 1978).

In general the probability of the final state is governed by the probabilities of a sequence of states preceding the final state. If there are m such states the Markov process is known as an mth-order Markov process. In information theory, where one is interested in the generation of a sequence of symbols, drawn from an alphabet, that generates a genetic message, an mth-order Markov process is known as an m-memory source. We write the conditional transition probability that state $E_j$ will be occupied after the sequence $E_1, E_2, \ldots, E_m$ has occurred as $p(E_j \mid E_1, E_2, \ldots, E_m)$. Note the reverse order of the listing. This is standard notation in information theory.
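One way to make the idea of an m-memory source concrete is to estimate the conditional transition probabilities from an observed chain by simple counting. The Python sketch below does this under stated assumptions: relative frequencies stand in for probabilities, the toy nucleotide sequence is invented for the example, and the function name is hypothetical rather than anything defined in the text.

```python
from collections import Counter, defaultdict

def conditional_transition_probabilities(states, m):
    """Estimate p(next state | previous m states) by counting in one chain.

    `states` is a list of observed Markov states in chronological order and
    `m` is the assumed memory of the source. Relative frequencies are used as
    estimates, which is an assumption of this sketch.
    """
    if m < 1 or len(states) <= m:
        raise ValueError("need m >= 1 and more than m observed states")
    counts = defaultdict(Counter)
    for i in range(m, len(states)):
        history = tuple(states[i - m:i])   # the m states preceding position i
        counts[history][states[i]] += 1
    probs = {}
    for history, followers in counts.items():
        total = sum(followers.values())
        probs[history] = {s: c / total for s, c in followers.items()}
    return probs

# Invented toy sequence of nucleotides, treated as a 2-memory source.
toy_sequence = list("ACGACGACT")
for history, followers in conditional_transition_probabilities(toy_sequence, 2).items():
    for nxt, p in followers.items():
        print("p(%s | %s) = %.3f" % (nxt, ", ".join(history), p))
```

With so short a sequence the estimates are crude; the sketch shows only how the conditioning history enters the calculation.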
The transition graphs for two kinds of Markov chains are shown in Figures 1.2 and 1.3. As another example, consider the sequence of mutations Lys (AAG) → Asn (AAC) → Tyr (UAC) → Phe (UUC). The conditional transition probability for this sequence of mutations is written p(UUC | AAG, AAC, UAC). Since all three nucleotides have been changed this is a three-memory Markov process. This is an example of the fact that the probability of an event is often strongly influenced by events occurring previously. In general, several previous events may affect the conditional transition probability.

Two states are said to communicate if one can be reached from the other. A state such that once entered it cannot be left is called an absorbing state. A state through which a system may move is called a transient state. A reflecting state is one that sends the system into another state. In molecular biology the nucleotides in DNA, in an mRNA sequence and the codons in the genetic code may be regarded as Markov states. All codons communicate, even the stop codons, by means of a sequence of single base interchanges, that is, by a Markov chain. The stop codons are usually absorbing states, but they may be transient states (section 12.3).

The elements of the transition probability matrix of Markov states may change as the process proceeds from one step to the next. If the elements of the transition probability matrix do not change, the Markov process is called homogeneous or stationary. Bernoulli processes or Bernoulli trials, in which succeeding events are independent, are simple examples of stationary Markov processes. The sequences so produced are called Bernoulli sequences (Solomonoff, 1968). The sequence of outcomes of the toss of a coin is a Bernoulli sequence because there are only two outcomes, the trials are independent and the probability distribution is stationary. The sequence of the numbers generated by a sequence of throws of a die is not a Bernoulli sequence since there are six outcomes.

Among the examples of the Markov state process, for which we shall have use, is one known as the random walk. Suppose a man is standing on the middle line of a football field. He flips a coin to determine which way to go: heads one way, tails the other. The final distance moved in one direction or the other is a random variable.
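The distribution of that random variable can be written down exactly for a fair coin. The following Python sketch is a minimal illustration under assumptions not stated in the text: each flip moves the man exactly one unit, heads counts as +1 and tails as -1, and the field is treated as unbounded. The function name is hypothetical.

```python
from math import comb

def final_position_distribution(n_flips):
    """Exact distribution of the net displacement after n fair coin flips.

    Assumptions of this sketch: heads = +1 unit, tails = -1 unit, fair coin,
    no boundaries. With h heads the final position is h - (n - h), and h
    follows a binomial distribution.
    """
    dist = {}
    for heads in range(n_flips + 1):
        position = heads - (n_flips - heads)
        dist[position] = comb(n_flips, heads) / 2 ** n_flips
    return dist

dist = final_position_distribution(10)
mean = sum(pos * p for pos, p in dist.items())
variance = sum((pos - mean) ** 2 * p for pos, p in dist.items())
for pos in sorted(dist):
    print("%+3d  %.4f" % (pos, dist[pos]))
print("mean =", mean, "variance =", variance)
```

Under these assumptions the mean displacement is zero while the variance grows in proportion to the number of flips, so the man tends to wander further from the middle line the longer he walks.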
A number of questions can be asked. For example, what is the probability of crossing a goal line? This is the example of a one-dimensional random walk. The scenario may be generalized to more than one dimension. The man might be on the corner of two city streets. He has forgotten where his hotel is. It is very late at night and there is no one to ask. Here the Markov state is the pair of coordinates giving the position on the city map. He goes one block at a time and at each corner he flips a coin twice, to determine which of the four ways to go. He starts out with the following code: HT = north, HH = south, TH = west, TT = east. All streets going in the north direction end at a river, which is an absorbing state: if he goes that far, he will fall in. There is a wall on the south end of the north-south streets, which is a reflecting wall. Of course, if he does find his hotel he goes in, and so the hotel is also an absorbing state. All other states are transient states. Suppose his wife is at the hotel and she goes out to look for him. She would need the probability distribution so she could go to the more probable street intersections first. He might forget his code between corners. This change in code is a mutation after which, for example, HH = north and HT = south. There are many other questions that could be asked and such problems may get quite complicated.
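The probability distribution that the wife would need can be obtained by propagating probability over the street grid one block at a time. The Python sketch below is a hedged illustration, not the author's method: the grid size, the starting corner, the hotel location and the treatment of the east and west city limits (he stays where he is) are all assumptions, as are the function names.

```python
def step(dist, width, height, hotel):
    """Advance the probability distribution by one block (two coin flips).

    Rows run from y = 0 (southern wall) to y = height (the river).
    Assumption of this sketch: at the east or west city limit he stays put.
    """
    new = {}
    for (x, y), p in dist.items():
        if y == height or (x, y) == hotel:
            # Absorbing states: the river row and the hotel keep their mass.
            new[(x, y)] = new.get((x, y), 0.0) + p
            continue
        # north, south, west, east, each with probability 1/4
        for dx, dy in ((0, 1), (0, -1), (-1, 0), (1, 0)):
            nx, ny = x + dx, y + dy
            if ny < 0:
                ny = 1          # reflecting wall: sent one block north instead
            if nx < 0 or nx >= width:
                nx = x          # assumed behaviour at the city limits
            new[(nx, ny)] = new.get((nx, ny), 0.0) + p / 4.0
    return new

def walk_distribution(start, n_blocks, width, height, hotel):
    """Distribution over intersections after n_blocks steps from `start`."""
    dist = {start: 1.0}
    for _ in range(n_blocks):
        dist = step(dist, width, height, hotel)
    return dist

def absorbed_mass(dist, height, hotel):
    """Probability of having fallen in the river and of being in the hotel."""
    river = sum(p for (x, y), p in dist.items() if y == height)
    return river, dist.get(hotel, 0.0)

# Hypothetical city: 5 streets wide, river at y = 4, hotel at (3, 2).
WIDTH, HEIGHT, HOTEL, START = 5, 4, (3, 2), (1, 1)
dist = walk_distribution(START, 20, WIDTH, HEIGHT, HOTEL)
river, hotel_mass = absorbed_mass(dist, HEIGHT, HOTEL)
print("P(river) =", river, "P(hotel) =", hotel_mass)

# Intersections where he is most likely still wandering, most probable first.
wandering = sorted(((p, xy) for xy, p in dist.items()
                    if xy != HOTEL and xy[1] != HEIGHT), reverse=True)
for p, xy in wandering[:5]:
    print(xy, round(p, 4))
```

A mutation of his code, such as the exchange of north and south, would correspond to permuting the four direction entries inside `step`, after which the distribution would be propagated with the altered transition rule.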
