In Appendix 2, the following result is established:

Theorem 2: The only $H$ satisfying the three above assumptions is of the form:

$$H = -K \sum_{i=1}^{n} p_i \log p_i$$

where $K$ is a positive constant.
This theorem, and the assumptions required for its proof, are in no way necessary for the present theory. It is given chiefly to lend a certain plausibility to some of our later definitions. The real justification of these definitions, however, will reside in their implications.

Quantities of the form $H = -\sum p_i \log p_i$ (the constant $K$ merely amounts to a choice of a unit of measure) play a central role in information theory as measures of information, choice and uncertainty. The form of $H$ will be recognized as that of entropy as defined in certain formulations of statistical mechanics⁸ where $p_i$ is the probability of a system being in cell $i$ of its phase space. $H$ is then, for example, the $H$ in Boltzmann's famous $H$ theorem. We shall call $H = -\sum p_i \log p_i$ the entropy of the set of probabilities $p_1, \dots, p_n$. If $x$ is a chance variable we will write $H(x)$ for its entropy; thus $x$ is not an argument of a function but a label for a number, to differentiate it from $H(y)$, say, the entropy of the chance variable $y$.
The entropy in the case of two possibilities with probabilities $p$ and $q = 1 - p$, namely

$$H = -(p \log p + q \log q)$$

is plotted in Fig. 7 as a function of $p$.

Fig. 7: Entropy in the case of two possibilities with probabilities $p$ and $(1-p)$. (Vertical axis: $H$ in bits; horizontal axis: $p$.)
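The curve of Fig. 7 is generated directly by the two-outcome entropy formula; a minimal Python sketch (the function name and sample points are ours, for illustration):

```python
import math

def binary_entropy(p: float) -> float:
    """H = -(p log p + q log q), in bits, for q = 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # the limit of x log x as x -> 0 is 0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))

# The shape of Fig. 7: zero at p = 0 and p = 1, maximum of 1 bit at p = 1/2.
for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  H = {binary_entropy(p):.3f} bits")
```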
The quantity $H$ has a number of interesting properties which further substantiate it as a reasonable measure of choice or information.

1. $H = 0$ if and only if all the $p_i$ but one are zero, this one having the value unity. Thus only when we are certain of the outcome does $H$ vanish. Otherwise $H$ is positive.

2. For a given $n$, $H$ is a maximum and equal to $\log n$ when all the $p_i$ are equal (i.e., $\frac{1}{n}$). This is also intuitively the most uncertain situation.

⁸ See, for example, R. C. Tolman, Principles of Statistical Mechanics, Oxford, Clarendon, 1938.
3. Suppose there are two events, $x$ and $y$, in question with $m$ possibilities for the first and $n$ for the second. Let $p(i,j)$ be the probability of the joint occurrence of $i$ for the first and $j$ for the second. The entropy of the joint event is

$$H(x,y) = -\sum_{i,j} p(i,j) \log p(i,j)$$

while

$$H(x) = -\sum_{i,j} p(i,j) \log \sum_j p(i,j)$$

$$H(y) = -\sum_{i,j} p(i,j) \log \sum_i p(i,j).$$

It is easily shown that

$$H(x,y) \le H(x) + H(y)$$

with equality only if the events are independent (i.e., $p(i,j) = p(i)p(j)$). The uncertainty of a joint event is less than or equal to the sum of the individual uncertainties.
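These joint-entropy relations are easy to check numerically. A minimal Python sketch, with an arbitrary illustrative joint distribution $p(i,j)$ of our own choosing:

```python
import math

def entropy(ps):
    """H = -sum p log p in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

# A hypothetical joint distribution p(i, j) over two binary events x and y,
# deliberately correlated so the inequality is strict.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

H_xy = entropy(p.values())
H_x = entropy([p[(0, 0)] + p[(0, 1)], p[(1, 0)] + p[(1, 1)]])  # marginal of x
H_y = entropy([p[(0, 0)] + p[(1, 0)], p[(0, 1)] + p[(1, 1)]])  # marginal of y

# Property 3: H(x,y) <= H(x) + H(y), equality only under independence.
assert H_xy <= H_x + H_y
print(f"H(x,y) = {H_xy:.4f}  <=  H(x) + H(y) = {H_x + H_y:.4f}")
```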
4. Any change toward equalization of the probabilities $p_1, p_2, \dots, p_n$ increases $H$. Thus if $p_1 < p_2$ and we increase $p_1$, decreasing $p_2$ an equal amount so that $p_1$ and $p_2$ are more nearly equal, then $H$ increases. More generally, if we perform any averaging operation on the $p_i$ of the form

$$p_i' = \sum_j a_{ij} p_j$$

where $\sum_i a_{ij} = \sum_j a_{ij} = 1$, and all $a_{ij} \ge 0$, then $H$ increases (except in the special case where this transformation amounts to no more than a permutation of the $p_j$ with $H$ of course remaining the same).
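This averaging property can also be checked numerically; in the sketch below the doubly stochastic matrix $a_{ij}$ and the starting distribution are arbitrary illustrations of our own:

```python
import math

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

# A doubly stochastic matrix a_ij (rows and columns each sum to 1), not a
# permutation, applied to an uneven distribution p.
a = [[0.7, 0.2, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.2, 0.7]]
p = [0.7, 0.2, 0.1]

p_new = [sum(a[i][j] * p[j] for j in range(3)) for i in range(3)]

assert abs(sum(p_new) - 1.0) < 1e-12   # the result is still a distribution
assert entropy(p_new) > entropy(p)      # and H has increased
print(f"H before = {entropy(p):.4f}, H after = {entropy(p_new):.4f}")
```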
5. Suppose there are two chance events $x$ and $y$ as in 3, not necessarily independent. For any particular value $i$ that $x$ can assume there is a conditional probability $p_i(j)$ that $y$ has the value $j$. This is given by

$$p_i(j) = \frac{p(i,j)}{\sum_j p(i,j)}.$$

We define the conditional entropy of $y$, $H_x(y)$, as the average of the entropy of $y$ for each value of $x$, weighted according to the probability of getting that particular $x$. That is

$$H_x(y) = -\sum_{i,j} p(i,j) \log p_i(j).$$

This quantity measures how uncertain we are of $y$ on the average when we know $x$. Substituting the value of $p_i(j)$ we obtain

$$H_x(y) = -\sum_{i,j} p(i,j) \log p(i,j) + \sum_{i,j} p(i,j) \log \sum_j p(i,j) = H(x,y) - H(x)$$

or

$$H(x,y) = H(x) + H_x(y).$$

The uncertainty (or entropy) of the joint event $x, y$ is the uncertainty of $x$ plus the uncertainty of $y$ when $x$ is known.

6. From 3 and 5 we have

$$H(x) + H(y) \ge H(x,y) = H(x) + H_x(y).$$

Hence

$$H(y) \ge H_x(y).$$

The uncertainty of $y$ is never increased by knowledge of $x$. It will be decreased unless $x$ and $y$ are independent events, in which case it is not changed.
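Properties 5 and 6 can be verified for any joint distribution; a small Python check (the joint distribution here is an arbitrary example of our own):

```python
import math

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

# An illustrative joint distribution p(i, j), as in property 3.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px = {i: sum(v for (a, b), v in p.items() if a == i) for i in (0, 1)}
py = {j: sum(v for (a, b), v in p.items() if b == j) for j in (0, 1)}

H_xy = entropy(p.values())
H_x, H_y = entropy(px.values()), entropy(py.values())

# Conditional entropy Hx(y) = -sum p(i,j) log p_i(j), with p_i(j) = p(i,j)/p(i).
H_x_of_y = -sum(v * math.log2(v / px[i]) for (i, j), v in p.items())

assert abs(H_xy - (H_x + H_x_of_y)) < 1e-12  # H(x,y) = H(x) + Hx(y)
assert H_x_of_y <= H_y + 1e-12               # knowing x never increases uncertainty of y
print(f"H(x,y) = {H_xy:.4f}, H(x) + Hx(y) = {H_x + H_x_of_y:.4f}")
```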
7. THE ENTROPY OF AN INFORMATION SOURCE

Consider a discrete source of the finite state type considered above. For each possible state $i$ there will be a set of probabilities $p_i(j)$ of producing the various possible symbols $j$. Thus there is an entropy $H_i$ for each state. The entropy of the source will be defined as the average of these $H_i$ weighted in accordance with the probability of occurrence of the states in question:

$$H = \sum_i P_i H_i = -\sum_{i,j} P_i p_i(j) \log p_i(j).$$

This is the entropy of the source per symbol of text. If the Markoff process is proceeding at a definite time rate there is also an entropy per second

$$H' = \sum_i f_i H_i$$

where $f_i$ is the average frequency (occurrences per second) of state $i$. Clearly

$$H' = mH$$

where $m$ is the average number of symbols produced per second. $H$ or $H'$ measures the amount of information generated by the source per symbol or per second. If the logarithmic base is 2, they will represent bits per symbol or per second.
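As an illustration, the weighted average $H = \sum_i P_i H_i$ can be computed for a small hypothetical two-state source (the transition probabilities and resulting stationary state probabilities below are our own example, not from the text):

```python
import math

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Hypothetical two-state source: in state i the symbol probabilities are
# p_i(j); here the emitted symbol also determines the next state.
p = {0: [0.8, 0.2],   # p_0(j) for symbols j = 0, 1 in state 0
     1: [0.4, 0.6]}   # p_1(j) in state 1

# Stationary state probabilities P_i satisfy P = P * (transition matrix).
# For this chain: P0 = 0.8 P0 + 0.4 P1 with P0 + P1 = 1, so P0 = 2/3.
P = [2 / 3, 1 / 3]

H = sum(P[i] * entropy(p[i]) for i in p)   # entropy per symbol of the source
print(f"H = {H:.4f} bits per symbol")
```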
If successive symbols are independent then $H$ is simply $-\sum p_i \log p_i$ where $p_i$ is the probability of symbol $i$. Suppose in this case we consider a long message of $N$ symbols. It will contain with high probability about $p_1 N$ occurrences of the first symbol, $p_2 N$ occurrences of the second, etc. Hence the probability of this particular message will be roughly

$$p = p_1^{p_1 N} p_2^{p_2 N} \cdots p_n^{p_n N}$$

or

$$\log p \doteq N \sum_i p_i \log p_i = -NH$$

$$H \doteq \frac{\log 1/p}{N}.$$

$H$ is thus approximately the logarithm of the reciprocal probability of a typical long sequence divided by the number of symbols in the sequence. The same result holds for any source. Stated more precisely we have (see Appendix 3):
Theorem 3: Given any $\epsilon > 0$ and $\delta > 0$, we can find an $N_0$ such that the sequences of any length $N \ge N_0$ fall into two classes:

1. A set whose total probability is less than $\epsilon$.
2. The remainder, all of whose members have probabilities satisfying the inequality

$$\left| \frac{\log p^{-1}}{N} - H \right| < \delta.$$

In other words we are almost certain to have $\frac{\log p^{-1}}{N}$ very close to $H$ when $N$ is large.

A closely related result deals with the number of sequences of various probabilities. Consider again the sequences of length $N$ and let them be arranged in order of decreasing probability. We define $n(q)$ to be the number we must take from this set starting with the most probable one in order to accumulate a total probability $q$ for those taken.
Theorem 4:

$$\lim_{N \to \infty} \frac{\log n(q)}{N} = H$$

when $q$ does not equal 0 or 1.

We may interpret $\log n(q)$ as the number of bits required to specify the sequence when we consider only the most probable sequences with a total probability $q$. Then $\frac{\log n(q)}{N}$ is the number of bits per symbol for the specification. The theorem says that for large $N$ this will be independent of $q$ and equal to $H$. The rate of growth of the logarithm of the number of reasonably probable sequences is given by $H$, regardless of our interpretation of "reasonably probable." Due to these results, which are proved in Appendix 3, it is possible for most purposes to treat the long sequences as though there were just $2^{HN}$ of them, each with a probability $2^{-HN}$.
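Theorem 3 can be observed empirically for a source with independent symbols: for a long random sequence, $\frac{\log p^{-1}}{N}$ clusters tightly around $H$. A Python sketch (the symbol probabilities and sequence length are arbitrary illustrations):

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
H = -sum(p * math.log2(p) for p in probs.values())

N = 10_000
symbols, weights = zip(*probs.items())
seq = random.choices(symbols, weights=weights, k=N)

# log2 of the reciprocal probability of this particular sequence.
log_recip_p = -sum(math.log2(probs[s]) for s in seq)

print(f"H = {H}, (log 1/p)/N = {log_recip_p / N:.4f}")
assert abs(log_recip_p / N - H) < 0.05   # very close to H for large N
```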
The next two theorems show that $H$ and $H'$ can be determined by limiting operations directly from the statistics of the message sequences, without reference to the states and transition probabilities between states.

Theorem 5: Let $p(B_i)$ be the probability of a sequence $B_i$ of symbols from the source. Let

$$G_N = -\frac{1}{N} \sum_i p(B_i) \log p(B_i)$$

where the sum is over all sequences $B_i$ containing $N$ symbols. Then $G_N$ is a monotonic decreasing function of $N$ and

$$\lim_{N \to \infty} G_N = H.$$

Theorem 6: Let $p(B_i, S_j)$ be the probability of sequence $B_i$ followed by symbol $S_j$ and $p_{B_i}(S_j) = p(B_i, S_j)/p(B_i)$ be the conditional probability of $S_j$ after $B_i$. Let

$$F_N = -\sum_{i,j} p(B_i, S_j) \log p_{B_i}(S_j)$$

where the sum is over all blocks $B_i$ of $N - 1$ symbols and over all symbols $S_j$. Then $F_N$ is a monotonic decreasing function of $N$,

$$F_N = N G_N - (N - 1) G_{N-1},$$

$$G_N = \frac{1}{N} \sum_{n=1}^{N} F_n,$$

$$F_N \le G_N,$$

and $\lim_{N \to \infty} F_N = H$.
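Theorems 5 and 6 can be checked numerically. The sketch below uses an illustrative two-symbol Markov source of our own choosing, computes $G_N$ by brute force over all blocks of length $N$, and derives $F_N$ from $F_N = N G_N - (N-1) G_{N-1}$:

```python
import math
from itertools import product

# Illustrative Markov source: P(next symbol | current symbol), started in its
# stationary distribution (which solves pA = 0.9 pA + 0.5 pB, pA + pB = 1).
trans = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
start = {"A": 5 / 6, "B": 1 / 6}

def block_prob(block):
    """Probability p(B) of a block of symbols."""
    p = start[block[0]]
    for a, b in zip(block, block[1:]):
        p *= trans[a][b]
    return p

def G(N):
    """G_N = -(1/N) * sum over all N-symbol blocks of p(B) log p(B)."""
    return -sum(block_prob(b) * math.log2(block_prob(b))
                for b in product("AB", repeat=N)) / N

# True entropy of the source: H = sum_i P_i H_i.
H = sum(start[s] * -sum(p * math.log2(p) for p in trans[s].values()) for s in "AB")

results = {}
for N in range(1, 6):
    F = N * G(N) - (N - 1) * G(N - 1) if N > 1 else G(1)
    results[N] = (G(N), F)
    print(f"N={N}  G_N={G(N):.4f}  F_N={F:.4f}")
```

$G_N$ decreases monotonically toward $H$; and since this source has no statistical influences extending over more than two symbols, $F_N$ equals $H$ from $N = 2$ on, illustrating that $F_N$ is the better approximation.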
These results are derived in Appendix 3. They show that a series of approximations to $H$ can be obtained by considering only the statistical structure of the sequences extending over $1, 2, \dots, N$ symbols. $F_N$ is the better approximation. In fact $F_N$ is the entropy of the $N$th order approximation to the source of the type discussed above. If there are no statistical influences extending over more than $N$ symbols, that is if the conditional probability of the next symbol knowing the preceding $(N-1)$ is not changed by a knowledge of any before that, then $F_N = H$. $F_N$ of course is the conditional entropy of the next symbol when the $(N-1)$ preceding ones are known, while $G_N$ is the entropy per symbol of blocks of $N$ symbols.

The ratio of the entropy of a source to the maximum value it could have while still restricted to the same symbols will be called its relative entropy. This is the maximum compression possible when we encode into the same alphabet. One minus the relative entropy is the redundancy. The redundancy of ordinary English, not considering statistical structure over greater distances than about eight letters, is roughly 50%. This means that when we write English half of what we write is determined by the structure of the language and half is chosen freely. The figure 50% was found by several independent methods which all gave results in this neighborhood. One is by calculation of the entropy of the approximations to English. A second method is to delete a certain fraction of the letters from a sample of English text and then let someone attempt to restore them. If they can be restored when 50% are deleted the redundancy must be greater than 50%. A third method depends on certain known results in cryptography.
Two extremes of redundancy in English prose are represented by Basic English and by James Joyce's book "Finnegans Wake". The Basic English vocabulary is limited to 850 words and the redundancy is very high. This is reflected in the expansion that occurs when a passage is translated into Basic English. Joyce on the other hand enlarges the vocabulary and is alleged to achieve a compression of semantic content.

The redundancy of a language is related to the existence of crossword puzzles. If the redundancy is zero any sequence of letters is a reasonable text in the language and any two-dimensional array of letters forms a crossword puzzle. If the redundancy is too high the language imposes too many constraints for large crossword puzzles to be possible. A more detailed analysis shows that if we assume the constraints imposed by the language are of a rather chaotic and random nature, large crossword puzzles are just possible when the redundancy is 50%. If the redundancy is 33%, three-dimensional crossword puzzles should be possible, etc.
8. REPRESENTATION OF THE ENCODING AND DECODING OPERATIONS

We have yet to represent mathematically the operations performed by the transmitter and receiver in encoding and decoding the information. Either of these will be called a discrete transducer. The input to the transducer is a sequence of input symbols and its output a sequence of output symbols. The transducer may have an internal memory so that its output depends not only on the present input symbol but also on the past history. We assume that the internal memory is finite, i.e., there exist a finite number $m$ of possible states of the transducer and that its output is a function of the present state and the present input symbol. The next state will be a second function of these two quantities. Thus a transducer can be described by two functions:

$$y_n = f(x_n, \alpha_n)$$

$$\alpha_{n+1} = g(x_n, \alpha_n)$$

where

$x_n$ is the $n$th input symbol,
$\alpha_n$ is the state of the transducer when the $n$th input symbol is introduced,
$y_n$ is the output symbol (or sequence of output symbols) produced when $x_n$ is introduced if the state is $\alpha_n$.
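A transducer of this kind is straightforward to sketch in code. The example below is our own (a differential encoder over binary symbols, with the state being the previous output); it illustrates the pair of functions $f$ and $g$, and a nonsingular transducer together with its inverse:

```python
def f(x, s):
    """Output symbol y_n produced from input x_n in state s_n."""
    return x ^ s

def g(x, s):
    """Next state s_{n+1}; here the state is simply the output just produced."""
    return x ^ s

def transduce(xs, s0=0):
    out, s = [], s0
    for x in xs:
        out.append(f(x, s))
        s = g(x, s)
    return out

def invert(ys, s0=0):
    """The inverse transducer: recovers x_n = y_n XOR y_{n-1}."""
    xs, s = [], s0
    for y in ys:
        xs.append(y ^ s)
        s = y
    return xs

msg = [1, 0, 1, 1, 0]
assert invert(transduce(msg)) == msg   # so this transducer is nonsingular
```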
If the output symbols of one transducer can be identified with the input symbols of a second, they can be connected in tandem and the result is also a transducer. If there exists a second transducer which operates on the output of the first and recovers the original input, the first transducer will be called nonsingular and the second will be called its inverse.
Theorem 7: The output of a finite state transducer driven by a finite state statistical source is a finite state statistical source, with entropy (per unit time) less than or equal to that of the input. If the transducer is nonsingular they are equal.

Let $\alpha$ represent the state of the source, which produces a sequence of symbols $x_i$; and let $\beta$ be the state of the transducer, which produces, in its output, blocks of symbols $y_j$. The combined system can be represented by the "product state space" of pairs $(\alpha, \beta)$. Two points in the space, $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$, are connected by a line if $\alpha_1$ can produce an $x$ which changes $\beta_1$ to $\beta_2$, and this line is given the probability of that $x$ in this case. The line is labeled with the block of $y_j$ symbols produced by the transducer. The entropy of the output can be calculated as the weighted sum over the states. If we sum first on $\beta$ each resulting term is less than or equal to the corresponding term for $\alpha$, hence the entropy is not increased. If the transducer is nonsingular let its output be connected to the inverse transducer. If $H_1'$, $H_2'$ and $H_3'$ are the output entropies of the source, the first and second transducers respectively, then $H_1' \ge H_2' \ge H_3' = H_1'$ and therefore $H_1' = H_2'$.
Suppose we have a system of constraints on possible sequences of the type which can be represented by a linear graph as in Fig. 2. If probabilities $p_{ij}^{(s)}$ were assigned to the various lines connecting state $i$ to state $j$ this would become a source. There is one particular assignment which maximizes the resulting entropy (see Appendix 4).

Theorem 8: Let the system of constraints considered as a channel have a capacity $C = \log W$. If we assign

$$p_{ij}^{(s)} = \frac{B_j}{B_i} W^{-\ell_{ij}^{(s)}}$$

where $\ell_{ij}^{(s)}$ is the duration of the $s$th symbol leading from state $i$ to state $j$ and the $B_i$ satisfy

$$B_i = \sum_{s,j} B_j W^{-\ell_{ij}^{(s)}}$$

then $H$ is maximized and equal to $C$.

By proper assignment of the transition probabilities the entropy of symbols on a channel can be maximized at the channel capacity.

9. THE FUNDAMENTAL THEOREM FOR A NOISELESS CHANNEL
We will now justify our interpretation of $H$ as the rate of generating information by proving that $H$ determines the channel capacity required with most efficient coding.

Theorem 9: Let a source have entropy $H$ (bits per symbol) and a channel have a capacity $C$ (bits per second). Then it is possible to encode the output of the source in such a way as to transmit at the average rate $\frac{C}{H} - \epsilon$ symbols per second over the channel where $\epsilon$ is arbitrarily small. It is not possible to transmit at an average rate greater than $\frac{C}{H}$.

The converse part of the theorem, that $\frac{C}{H}$ cannot be exceeded, may be proved by noting that the entropy of the channel input per second is equal to that of the source, since the transmitter must be nonsingular, and also this entropy cannot exceed the channel capacity. Hence $H' \le C$ and the number of symbols per second $= H'/H \le C/H$.

The first part of the theorem will be proved in two different ways. The first method is to consider the set of all sequences of $N$ symbols produced by the source. For $N$ large we can divide these into two groups, one containing less than $2^{(H+\eta)N}$ members and the second containing less than $2^{RN}$ members (where $R$ is the logarithm of the number of different symbols) and having a total probability less than $\mu$. As $N$ increases $\eta$ and $\mu$ approach zero. The number of signals of duration $T$ in the channel is greater than $2^{(C-\theta)T}$ with $\theta$ small when $T$ is large. If we choose

$$T = \left(\frac{H}{C} + \lambda\right) N$$

then there will be a sufficient number of sequences of channel symbols for the high probability group when $N$ and $T$ are sufficiently large (however small $\lambda$) and also some additional ones. The high probability group is coded in an arbitrary one-to-one way into this set. The remaining sequences are represented by larger sequences, starting and ending with one of the sequences not used for the high probability group. This special sequence acts as a start and stop signal for a different code. In between a sufficient time is allowed to give enough different sequences for all the low probability messages. This will require

$$T_1 = \left(\frac{R}{C} + \varphi\right) N$$

where $\varphi$ is small. The mean rate of transmission in message symbols per second will then be greater than

$$\left[(1-\delta)\frac{T}{N} + \delta \frac{T_1}{N}\right]^{-1} = \left[(1-\delta)\left(\frac{H}{C}+\lambda\right) + \delta\left(\frac{R}{C}+\varphi\right)\right]^{-1}.$$

As $N$ increases $\delta$, $\lambda$ and $\varphi$ approach zero and the rate approaches $\frac{C}{H}$.
Another method of performing this coding and thereby proving the theorem can be described as follows: Arrange the messages of length $N$ in order of decreasing probability and suppose their probabilities are $p_1 \ge p_2 \ge p_3 \ge \cdots \ge p_n$. Let $P_s = \sum_{i=1}^{s-1} p_i$; that is, $P_s$ is the cumulative probability up to, but not including, $p_s$. We first encode into a binary system. The binary code for message $s$ is obtained by expanding $P_s$ as a binary number. The expansion is carried out to $m_s$ places, where $m_s$ is the integer satisfying:

$$\log_2 \frac{1}{p_s} \le m_s < 1 + \log_2 \frac{1}{p_s}.$$

Thus the messages of high probability are represented by short codes and those of low probability by long codes. From these inequalities we have

$$\frac{1}{2^{m_s}} \le p_s < \frac{1}{2^{m_s - 1}}.$$

The code for $P_s$ will differ from all succeeding ones in one or more of its $m_s$ places, since all the remaining $P_i$ are at least $\frac{1}{2^{m_s}}$ larger and their binary expansions therefore differ in the first $m_s$ places. Consequently all the codes are different and it is possible to recover the message from its code. If the channel sequences are not already sequences of binary digits, they can be ascribed binary numbers in an arbitrary fashion and the binary code thus translated into signals suitable for the channel.

The average number $H'$ of binary digits used per symbol of original message is easily estimated. We have

$$H' = \frac{1}{N} \sum m_s p_s.$$

But,

$$\frac{1}{N} \sum \left(\log_2 \frac{1}{p_s}\right) p_s \le \frac{1}{N} \sum m_s p_s < \frac{1}{N} \sum \left(1 + \log_2 \frac{1}{p_s}\right) p_s$$

and therefore,

$$G_N \le H' < G_N + \frac{1}{N}.$$

As $N$ increases $G_N$ approaches $H$, the entropy of the source, and $H'$ approaches $H$.

We see from this that the inefficiency in coding, when only a finite delay of $N$ symbols is used, need not be greater than $\frac{1}{N}$ plus the difference between the true entropy $H$ and the entropy $G_N$ calculated for sequences of length $N$. The per cent excess time needed over the ideal is therefore less than

$$\frac{G_N}{H} + \frac{1}{HN} - 1.$$

This method of encoding is substantially the same as one found independently by R. M. Fano.⁹ His method is to arrange the messages of length $N$ in order of decreasing probability. Divide this series into two groups of as nearly equal probability as possible. If the message is in the first group its first binary digit will be 0, otherwise 1. The groups are similarly divided into subsets of nearly equal probability and the particular subset determines the second binary digit. This process is continued until each subset contains only one message. It is easily seen that apart from minor differences (generally in the last digit) this amounts to the same thing as the arithmetic process described above.
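The arithmetic process just described can be implemented directly. The following Python sketch (the function name is ours) sorts the messages by decreasing probability, computes each $m_s$, and expands the cumulative probability $P_s$ to $m_s$ binary places:

```python
import math

def shannon_code(probs):
    """Code each message by the first m_s binary places of its cumulative
    probability P_s, with m_s the integer in [log2(1/p_s), 1 + log2(1/p_s))."""
    probs = sorted(probs, reverse=True)       # p1 >= p2 >= ...
    codes, P = [], 0.0
    for p in probs:
        m = math.ceil(math.log2(1 / p))       # smallest integer >= log2(1/p)
        bits, frac = "", P
        for _ in range(m):                    # binary expansion of P_s to m places
            frac *= 2
            bits += "1" if frac >= 1 else "0"
            frac -= int(frac)
        codes.append(bits)
        P += p
    return codes

codes = shannon_code([0.5, 0.25, 0.125, 0.125])
print(codes)   # probable messages get short codes; all codes are distinct
assert len(set(codes)) == len(codes)
```

For the probabilities $\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}$ this yields the codes 0, 10, 110, 111.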
10. DISCUSSION AND EXAMPLES

In order to obtain the maximum power transfer from a generator to a load, a transformer must in general be introduced so that the generator as seen from the load has the load resistance. The situation here is roughly analogous. The transducer which does the encoding should match the source to the channel in a statistical sense. The source as seen from the channel through the transducer should have the same statistical structure

⁹ Technical Report No. 65, The Research Laboratory of Electronics, M.I.T., March 17, 1949.
as the source which maximizes the entropy in the channel. The content of Theorem 9 is that, although an exact match is not in general possible, we can approximate it as closely as desired. The ratio of the actual rate of transmission to the capacity $C$ may be called the efficiency of the coding system. This is of course equal to the ratio of the actual entropy of the channel symbols to the maximum possible entropy.

In general, ideal or nearly ideal encoding requires a long delay in the transmitter and receiver. In the noiseless case which we have been considering, the main function of this delay is to allow reasonably good matching of probabilities to corresponding lengths of sequences. With a good code the logarithm of the reciprocal probability of a long message must be proportional to the duration of the corresponding signal, in fact

$$\left| \frac{\log p^{-1}}{T} - C \right|$$

must be small for all but a small fraction of the long messages. If a source can produce only one particular message its entropy is zero, and no channel is required.
For
example,acomputingmachinesetuptocalculatethesuccessivedigitsof ?produces
adefinitesequencewithnochanceelement.Nochannelisrequiredtotransmit
thistoanotherpoint.Onecouldconstructasecondmachinetocomputethesame
sequenceatthepoint.However,thismaybeimpractical.Insuchacasewecan
choosetoignoresomeorallofthestatisticalknowledgewehaveofthesource.We
mightconsiderthedigitsof?tobearandomsequenceinthatweconstructasystem
capableofsendinganysequenceofdigits.Inasimilarwaywemaychoosetouse
someofourstatisticalknowledgeofEnglishinconstructingacode,butnotallofit.
Insuchacaseweconsiderthesourcewiththemaximumentropysubjecttothe
statisticalconditionswewishtoretain.Theentropyofthissourcedeterminesthe
channelcapacitywhichisnecessaryandsufficient.Inthe?exampletheonly
informationretainedisthatallthedigitsarechosenfromtheset0?19.Inthecase
ofEnglishonemightwishtousethestatisticalsavingpossibleduetoletter
frequencies,butnothingelse.Themaximumentropysourceisthenthefirst
approximationtoEnglishanditsentropydeterminestherequiredchannelcapacity.
As a simple example of some of these results consider a source which produces a sequence of letters chosen from among A, B, C, D with probabilities $\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}$, successive symbols being chosen independently. We have

$$H = -\left(\tfrac{1}{2} \log \tfrac{1}{2} + \tfrac{1}{4} \log \tfrac{1}{4} + \tfrac{2}{8} \log \tfrac{1}{8}\right) = \tfrac{7}{4} \text{ bits per symbol.}$$

Thus we can approximate a coding system to encode messages from this source into binary digits with an average of $\frac{7}{4}$ binary digits per symbol. In this case we can actually achieve the limiting value by the following code (obtained by the method of the second proof of Theorem 9):

A → 0
B → 10
C → 110
D → 111

The average number of binary digits used in encoding a sequence of $N$ symbols will be

$$N\left(\tfrac{1}{2} \times 1 + \tfrac{1}{4} \times 2 + \tfrac{2}{8} \times 3\right) = \tfrac{7}{4} N.$$

It is easily seen that the binary digits 0, 1 have probabilities $\frac{1}{2}, \frac{1}{2}$ so the $H$ for the coded sequences is one bit per symbol. Since, on the average, we have $\frac{7}{4}$ binary symbols per original letter, the entropies on a time basis are the same. The maximum possible entropy for the original set is $\log 4 = 2$, occurring when A, B, C, D have probabilities $\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4}$. Hence the relative entropy is $\frac{7}{8}$. We can translate the binary sequences into the original set of symbols on a two-to-one basis by the following table:

00 → A′
01 → B′
10 → C′
11 → D′

This double process then encodes the original message into the same symbols but with an average compression ratio $\frac{7}{8}$.
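The numbers in this example are easy to verify mechanically; a short Python check:

```python
import math

# The worked example: probabilities 1/2, 1/4, 1/8, 1/8 with the code
# A -> 0, B -> 10, C -> 110, D -> 111.
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
code = {"A": "0", "B": "10", "C": "110", "D": "111"}

H = -sum(p * math.log2(p) for p in probs.values())
avg_len = sum(probs[s] * len(code[s]) for s in probs)

assert abs(H - 1.75) < 1e-12        # 7/4 bits per symbol
assert abs(avg_len - 1.75) < 1e-12  # the code achieves the entropy exactly

# Fraction of 0s in the coded stream: each binary digit is equally likely,
# so the coded sequence carries one bit per binary symbol.
p0 = sum(probs[s] * code[s].count("0") for s in probs) / avg_len
assert abs(p0 - 0.5) < 1e-12
```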
As a second example consider a source which produces a sequence of A's and B's with probability $p$ for A and $q$ for B. If $p \ll q$ we have

$$H = -\log p^p (1-p)^{1-p} = -p \log p (1-p)^{(1-p)/p} \doteq p \log \frac{e}{p}.$$

In such a case one can construct a fairly good coding of the message on a 0, 1 channel by sending a special sequence, say 0000, for the infrequent symbol A and then a sequence indicating the number of B's following it. This could be indicated by the binary representation with all numbers containing the special sequence deleted. All numbers up to 16 are represented as usual; 16 is represented by the next binary number after 16 which does not contain four zeros, namely $17 = 10001$, etc. It can be shown that as $p \to 0$ the coding approaches ideal provided the length of the special sequence is properly adjusted.
PART II: THE DISCRETE CHANNEL WITH NOISE

11. REPRESENTATION OF A NOISY DISCRETE CHANNEL

We now consider the case where the signal is perturbed by noise during transmission or at one or the other of the terminals. This means that the received signal is not necessarily the same as that sent out by the transmitter. Two cases may be distinguished. If a particular transmitted signal always produces the same received signal, i.e., the received signal is a definite function of the transmitted signal, then the effect may be called distortion. If this function has an inverse (no two transmitted signals producing the same received signal) distortion may be corrected, at least in principle, by merely performing the inverse functional operation on the received signal.

The case of interest here is that in which the signal does not always undergo the same change in transmission. In this case we may assume the received signal $E$ to be a function of the transmitted signal $S$ and a second variable, the noise $N$:

$$E = f(S, N).$$

The noise is considered to be a chance variable just as the message was above. In general it may be represented by a suitable stochastic process. The most general type of noisy discrete channel we shall consider is a generalization of the finite state noise-free channel described previously. We assume a finite number of states and a set of probabilities

$$p_{\alpha, i}(\beta, j).$$

This is the probability, if the channel is in state $\alpha$ and symbol $i$ is transmitted, that symbol $j$ will be received and the channel left in state $\beta$. Thus $\alpha$ and $\beta$ range over the possible states, $i$ over the possible transmitted signals and $j$ over the possible received signals. In the case where successive symbols are independently perturbed by the noise there is only one state, and the channel is described by the set of transition probabilities $p_i(j)$, the probability of transmitted symbol $i$ being received as $j$.
If a noisy channel is fed by a source there are two statistical processes at work: the source and the noise. Thus there are a number of entropies that can be calculated. First there is the entropy $H(x)$ of the source or of the input to the channel (these will be equal if the transmitter is nonsingular). The entropy of the output of the channel, i.e., the received signal, will be denoted by $H(y)$. In the noiseless case $H(y) = H(x)$. The joint entropy of input and output will be $H(x,y)$. Finally there are two conditional entropies $H_x(y)$ and $H_y(x)$, the entropy of the output when the input is known and conversely. Among these quantities we have the relations

$$H(x,y) = H(x) + H_x(y) = H(y) + H_y(x).$$

All of these entropies can be measured on a per-second or a per-symbol basis.
12. EQUIVOCATION AND CHANNEL CAPACITY

If the channel is noisy it is not in general possible to reconstruct the original message or the transmitted signal with certainty by any operation on the received signal $E$. There are, however, ways of transmitting the information which are optimal in combating noise. This is the problem which we now consider.

Suppose there are two possible symbols 0 and 1, and we are transmitting at a rate of 1000 symbols per second with probabilities $p_0 = p_1 = \frac{1}{2}$. Thus our source is producing information at the rate of 1000 bits per second. During transmission the noise introduces errors so that, on the average, 1 in 100 is received incorrectly (a 0 as 1, or 1 as 0). What is the rate of transmission of information? Certainly less than 1000 bits per second since about 1% of the received symbols are incorrect. Our first impulse might be to say the rate is 990 bits per second, merely subtracting the expected number of errors. This is not satisfactory since it fails to take into account the recipient's lack of knowledge of where the errors occur. We may carry it to an extreme case and suppose the noise so great that the received symbols are entirely independent of the transmitted symbols. The probability of receiving 1 is $\frac{1}{2}$ whatever was transmitted and similarly for 0. Then about half of the received symbols are correct due to chance alone, and we would be giving the system credit for transmitting 500 bits per second while actually no information is being transmitted at all. Equally "good" transmission would be obtained by dispensing with the channel entirely and flipping a coin at the receiving point.
Evidently the proper correction to apply to the amount of information transmitted is the amount of this information which is missing in the received signal, or alternatively the uncertainty when we have received a signal of what was actually sent. From our previous discussion of entropy as a measure of uncertainty it seems reasonable to use the conditional entropy of the message, knowing the received signal, as a measure of this missing information. This is indeed the proper definition, as we shall see later. Following this idea the rate of actual transmission, $R$, would be obtained by subtracting from the rate of production (i.e., the entropy of the source) the average rate of conditional entropy.

$$R = H(x) - H_y(x)$$

The conditional entropy $H_y(x)$ will, for convenience, be called the equivocation. It measures the average ambiguity of the received signal.
In the example considered above, if a 0 is received the a posteriori probability that a 0 was transmitted is .99, and that a 1 was transmitted is .01. These figures are reversed if a 1 is received. Hence

$$H_y(x) = -[.99 \log .99 + .01 \log .01] = .081 \text{ bits/symbol}$$

or 81 bits per second. We may say that the system is transmitting at a rate $1000 - 81 = 919$ bits per second. In the extreme case where a 0 is equally likely to be received as a 0 or 1 and similarly for 1, the a posteriori probabilities are $\frac{1}{2}, \frac{1}{2}$ and

$$H_y(x) = -\left[\tfrac{1}{2} \log \tfrac{1}{2} + \tfrac{1}{2} \log \tfrac{1}{2}\right] = 1 \text{ bit per symbol}$$

or 1000 bits per second. The rate of transmission is then 0 as it should be.

The following theorem gives a direct intuitive interpretation of the equivocation and also serves to justify it as the unique appropriate measure. We consider a communication system and an observer (or auxiliary device) who can see both what is sent and what is recovered (with errors due to noise). This observer notes the errors in the recovered message and transmits data to the receiving point over a "correction channel" to enable the receiver to correct the errors. The situation is indicated schematically in Fig. 8.
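Both equivocation calculations above can be reproduced numerically. A Python sketch (the function name is ours) treats the channel as flipping each symbol independently with a given error probability, with equiprobable 0/1 input as in the example:

```python
import math

def equivocation(err):
    """Hy(x) per symbol for the binary example: equiprobable 0/1 input, each
    symbol independently received in error with probability err."""
    if err in (0.0, 1.0):
        return 0.0
    return -(err * math.log2(err) + (1 - err) * math.log2(1 - err))

rate = 1000  # symbols (and hence source bits) per second
for err in (0.01, 0.5):
    Hyx = equivocation(err)
    R = rate * (1 - Hyx)   # R = H(x) - Hy(x), on a per-second basis
    print(f"error prob {err}: equivocation {Hyx:.3f} bits/symbol, R = {R:.0f} bits/s")
```

For an error probability of .01 this gives the .081 bits/symbol (rate 919 bits per second) computed above, and for .5 it gives 1 bit/symbol, i.e. a rate of zero.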
Theorem 10: If the correction channel has a capacity equal to $H_y(x)$ it is possible to so encode the correction data as to send it over this channel and correct all but an arbitrarily small fraction $\epsilon$ of the errors. This is not possible if the channel capacity is less than $H_y(x)$.