
In Appendix 2, the following result is established:

Theorem 2: The only H satisfying the three above assumptions is of the form:
$$H = -K\sum_{i=1}^{n} p_i \log p_i$$
where K is a positive constant.

This theorem, and the assumptions required for its proof, are in no way necessary for the present theory. It is given chiefly to lend a certain plausibility to some of our later definitions. The real justification of these definitions, however, will reside in their implications.
Quantities of the form $H = -\sum p_i \log p_i$ (the constant K merely amounts to a choice of a unit of measure) play a central role in information theory as measures of information, choice and uncertainty. The form of H will be recognized as that of entropy as defined in certain formulations of statistical mechanics⁸ where $p_i$ is the probability of a system being in cell i of its phase space. H is then, for example, the H in Boltzmann's famous H theorem. We shall call $H = -\sum p_i \log p_i$ the entropy of the set of probabilities $p_1, \dots, p_n$. If x is a chance variable we will write H(x) for its entropy; thus x is not an argument of a function but a label for a number, to differentiate it from H(y) say, the entropy of the chance variable y.
The entropy in the case of two possibilities with probabilities p and q = 1 − p, namely
$$H = -(p\log p + q\log q)$$
is plotted in Fig. 7 as a function of p.
[Figure. Fig. 7. Entropy in the case of two possibilities with probabilities p and (1 − p). Vertical axis: H in bits; horizontal axis: p.]
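The curve of Fig. 7 is easy to reproduce numerically. Below is a minimal Python sketch (our addition, not part of the paper); the name binary_entropy is ours, and logarithms are taken to base 2 so the result is in bits:

```python
import math

def binary_entropy(p: float) -> float:
    """H = -p log2 p - (1 - p) log2 (1 - p), in bits.

    By the usual convention 0 log 0 = 0, so H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    q = 1.0 - p
    return -p * math.log2(p) - q * math.log2(q)

# The shape of Fig. 7: zero at p = 0 and p = 1, maximum of 1 bit at p = 1/2.
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:4.2f}  H = {binary_entropy(p):.4f} bits")
```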

The quantity H has a number of interesting properties which further substantiate it as a reasonable measure of choice or information.

1. H = 0 if and only if all the $p_i$ but one are zero, this one having the value unity. Thus only when we are certain of the outcome does H vanish. Otherwise H is positive.
2. For a given n, H is a maximum and equal to log n when all the $p_i$ are equal (i.e., 1/n). This is also intuitively the most uncertain situation.

⁸ See, for example, R. C. Tolman, Principles of Statistical Mechanics, Oxford, Clarendon, 1938.

3. Suppose there are two events, x and y, in question with m possibilities for the first and n for the second. Let p(i, j) be the probability of the joint occurrence of i for the first and j for the second. The entropy of the joint event is
$$H(x, y) = -\sum_{i,j} p(i,j)\log p(i,j)$$
while
$$H(x) = -\sum_{i,j} p(i,j)\log \sum_j p(i,j)$$
$$H(y) = -\sum_{i,j} p(i,j)\log \sum_i p(i,j).$$
It is easily shown that
$$H(x, y) \le H(x) + H(y)$$
with equality only if the events are independent (i.e., p(i, j) = p(i)p(j)). The uncertainty of a joint event is less than or equal to the sum of the individual uncertainties.
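As a numerical illustration (ours, with an assumed small joint distribution), the following sketch computes H(x), H(y) and H(x, y) and checks the inequality of property 3:

```python
import math

def entropy(probs):
    """Entropy in bits of a collection of probabilities (zero terms contribute 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A joint distribution p(i, j) for two dependent binary events x and y.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px, py = {}, {}  # marginals: sum over the other index
for (i, j), p in joint.items():
    px[i] = px.get(i, 0) + p
    py[j] = py.get(j, 0) + p

H_xy = entropy(joint.values())
H_x, H_y = entropy(px.values()), entropy(py.values())
print(H_xy, H_x + H_y)            # 1.722... <= 2.0
assert H_xy <= H_x + H_y + 1e-12  # equality would require independence
```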
4. Any change toward equalization of the probabilities $p_1, p_2, \dots, p_n$ increases H. Thus if $p_1 < p_2$ and we increase $p_1$, decreasing $p_2$ an equal amount so that $p_1$ and $p_2$ are more nearly equal, then H increases. More generally, if we perform any averaging operation on the $p_i$ of the form
$$p_i' = \sum_j a_{ij} p_j$$
where $\sum_i a_{ij} = \sum_j a_{ij} = 1$, and all $a_{ij} \ge 0$, then H increases (except in the special case where this transformation amounts to no more than a permutation of the $p_j$ with H of course remaining the same).
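A short sketch of property 4 (ours), using an assumed doubly stochastic matrix $a_{ij}$ whose rows and columns each sum to 1 and which is not a permutation:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = [0.7, 0.2, 0.1]

# An example doubly stochastic averaging matrix: it mixes the p_j together.
a = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

p_new = [sum(a[i][j] * p[j] for j in range(3)) for i in range(3)]
print(p_new)                        # [0.59, 0.24, 0.17] -- more nearly equal
print(entropy(p), entropy(p_new))   # H increases: 1.157 -> 1.378 bits
```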
5. Suppose there are two chance events x and y as in 3, not necessarily independent. For any particular value i that x can assume there is a conditional probability $p_i(j)$ that y has the value j. This is given by
$$p_i(j) = \frac{p(i,j)}{\sum_j p(i,j)}.$$
We define the conditional entropy of y, $H_x(y)$, as the average of the entropy of y for each value of x, weighted according to the probability of getting that particular x. That is
$$H_x(y) = -\sum_{i,j} p(i,j)\log p_i(j).$$
This quantity measures how uncertain we are of y on the average when we know x. Substituting the value of $p_i(j)$ we obtain
$$H_x(y) = -\sum_{i,j} p(i,j)\log p(i,j) + \sum_{i,j} p(i,j)\log \sum_j p(i,j) = H(x,y) - H(x)$$
or
$$H(x,y) = H(x) + H_x(y).$$
The uncertainty (or entropy) of the joint event x, y is the uncertainty of x plus the uncertainty of y when x is known.

6. From 3 and 5 we have
$$H(x) + H(y) \ge H(x,y) = H(x) + H_x(y).$$
Hence
$$H(y) \ge H_x(y).$$
The uncertainty of y is never increased by knowledge of x. It will be decreased unless x and y are independent events, in which case it is not changed.
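The chain rule of property 5 can be verified on the same kind of toy distribution; this sketch (ours) computes $H_x(y)$ directly from the definition and compares H(x, y) with H(x) + $H_x(y)$:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {i: sum(p for (a, _), p in joint.items() if a == i) for i in (0, 1)}

# H_x(y): entropy of y for each x, weighted by the probability of that x.
H_cond = 0.0
for i in (0, 1):
    cond = [joint[(i, j)] / px[i] for j in (0, 1)]
    H_cond += px[i] * entropy(cond)

H_xy = entropy(joint.values())
H_x = entropy(px.values())
print(H_xy, H_x + H_cond)  # both 1.722...: H(x,y) = H(x) + H_x(y)
```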
7. THE ENTROPY OF AN INFORMATION SOURCE

Consider a discrete source of the finite state type considered above. For each possible state i there will be a set of probabilities $p_i(j)$ of producing the various possible symbols j. Thus there is an entropy $H_i$ for each state. The entropy of the source will be defined as the average of these $H_i$ weighted in accordance with the probability of occurrence of the states in question:
$$H = \sum_i P_i H_i = -\sum_{i,j} P_i\, p_i(j)\log p_i(j).$$
This is the entropy of the source per symbol of text. If the Markoff process is proceeding at a definite time rate there is also an entropy per second
$$H' = \sum_i f_i H_i$$
where $f_i$ is the average frequency (occurrences per second) of state i. Clearly
$$H' = mH$$
where m is the average number of symbols produced per second. H or H′ measures the amount of information generated by the source per symbol or per second. If the logarithmic base is 2, they will represent bits per symbol or per second.
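As an illustration (ours), here is a minimal sketch computing $H = \sum_i P_i H_i$ for an assumed two-state source in which the emitted symbol also serves as the next state; both the transition matrix and that modeling choice are our assumptions:

```python
import math

# State i emits symbol j with probability p[i][j]; the emitted symbol
# becomes the next state (an illustrative assumption, not from the paper).
p = [[0.9, 0.1],
     [0.4, 0.6]]

# Stationary state probabilities P_i, found by iterating the chain.
P = [0.5, 0.5]
for _ in range(200):
    P = [sum(P[i] * p[i][j] for i in range(2)) for j in range(2)]

def entropy(row):
    return -sum(q * math.log2(q) for q in row if q > 0)

# H = sum_i P_i H_i: the entropy of the source per symbol.
H = sum(P[i] * entropy(p[i]) for i in range(2))
print(P, H)  # P ~ [0.8, 0.2], H ~ 0.569 bits per symbol
```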
If successive symbols are independent then H is simply $-\sum p_i \log p_i$ where $p_i$ is the probability of symbol i. Suppose in this case we consider a long message of N symbols. It will contain with high probability about $p_1 N$ occurrences of the first symbol, $p_2 N$ occurrences of the second, etc. Hence the probability of this particular message will be roughly
$$p \doteq p_1^{p_1 N} p_2^{p_2 N} \cdots p_n^{p_n N}$$
or
$$\log p \doteq N\sum_i p_i \log p_i$$
$$\log p \doteq -NH$$
$$H \doteq \frac{\log 1/p}{N}.$$
H is thus approximately the logarithm of the reciprocal probability of a typical long sequence divided by the number of symbols in the sequence. The same result holds for any source. Stated more precisely we have (see Appendix 3):
Theorem 3: Given any ε > 0 and δ > 0, we can find an $N_0$ such that the sequences of any length $N \ge N_0$ fall into two classes:
1. A set whose total probability is less than ε.
2. The remainder, all of whose members have probabilities satisfying the inequality
$$\left|\frac{\log p^{-1}}{N} - H\right| < \delta.$$
In other words we are almost certain to have $\frac{\log p^{-1}}{N}$ very close to H when N is large.

A closely related result deals with the number of sequences of various probabilities. Consider again the sequences of length N and let them be arranged in order of decreasing probability. We define n(q) to be the number we must take from this set starting with the most probable one in order to accumulate a total probability q for those taken.
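Theorem 3 is easy to check empirically. The sketch below (ours) draws one long sequence from an independent-symbol source, an assumption made for simplicity, and compares $-\log p / N$ with H:

```python
import math, random

# An independent-symbol source with an assumed distribution; H = 1.75 bits.
probs = {'A': 0.5, 'B': 0.25, 'C': 0.125, 'D': 0.125}
H = -sum(p * math.log2(p) for p in probs.values())

random.seed(1)
N = 10_000
seq = random.choices(list(probs), weights=list(probs.values()), k=N)

# -log p / N for the sampled sequence: close to H for large N (Theorem 3).
log_p = sum(math.log2(probs[s]) for s in seq)
print(H, -log_p / N)
```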
Theorem 4:
$$\lim_{N\to\infty}\frac{\log n(q)}{N} = H$$
when q does not equal 0 or 1.

We may interpret $\log n(q)$ as the number of bits required to specify the sequence when we consider only the most probable sequences with a total probability q. Then $\frac{\log n(q)}{N}$ is the number of bits per symbol for the specification. The theorem says that for large N this will be independent of q and equal to H. The rate of growth of the logarithm of the number of reasonably probable sequences is given by H, regardless of our interpretation of "reasonably probable." Due to these results, which are proved in Appendix 3, it is possible for most purposes to treat the long sequences as though there were just $2^{HN}$ of them, each with a probability $2^{-HN}$.

The next two theorems show that H and H′ can be determined by limiting operations directly from the statistics of the message sequences, without reference to the states and transition probabilities between states.

Theorem 5: Let $p(B_i)$ be the probability of a sequence $B_i$ of symbols from the source. Let
$$G_N = -\frac{1}{N}\sum_i p(B_i)\log p(B_i)$$
where the sum is over all sequences $B_i$ containing N symbols. Then $G_N$ is a monotonic decreasing function of N and
$$\lim_{N\to\infty} G_N = H.$$

Theorem 6: Let $p(B_i, S_j)$ be the probability of sequence $B_i$ followed by symbol $S_j$ and $p_{B_i}(S_j) = p(B_i, S_j)/p(B_i)$ be the conditional probability of $S_j$ after $B_i$. Let
$$F_N = -\sum_{i,j} p(B_i, S_j)\log p_{B_i}(S_j)$$
where the sum is over all blocks $B_i$ of N − 1 symbols and over all symbols $S_j$. Then $F_N$ is a monotonic decreasing function of N,
$$F_N = N G_N - (N-1)G_{N-1},$$
$$G_N = \frac{1}{N}\sum_{n=1}^{N} F_n,$$
$$F_N \le G_N,$$
and $\lim_{N\to\infty} F_N = H$.
These results are derived in Appendix 3. They show that a series of approximations to H can be obtained by considering only the statistical structure of the sequences extending over 1, 2, …, N symbols. $F_N$ is the better approximation. In fact $F_N$ is the entropy of the Nth order approximation to the source of the type discussed above. If there are no statistical influences extending over more than N symbols, that is if the conditional probability of the next symbol knowing the preceding (N − 1) is not changed by a knowledge of any before that, then $F_N = H$. $F_N$ of course is the conditional entropy of the next symbol when the (N − 1) preceding ones are known, while $G_N$ is the entropy per symbol of blocks of N symbols.
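The approach of $G_N$ and $F_N$ to H can be observed numerically. The sketch below (ours) enumerates all blocks of an assumed two-state Markoff source, the same one used in the earlier sketch, starting the chain in its stationary distribution:

```python
import math
from itertools import product

p = [[0.9, 0.1], [0.4, 0.6]]  # assumed transitions; previous symbol = state
P = [0.8, 0.2]                # its stationary probabilities

def block_prob(block):
    """p(B) for a block of symbols, starting in the stationary distribution."""
    pr = P[block[0]]
    for a, b in zip(block, block[1:]):
        pr *= p[a][b]
    return pr

def G(N):
    """G_N = -(1/N) * sum over N-blocks of p(B) log p(B)."""
    return -sum(block_prob(B) * math.log2(block_prob(B))
                for B in product((0, 1), repeat=N)) / N

for N in range(1, 8):
    GN = G(N)
    FN = N * GN - (N - 1) * G(N - 1) if N > 1 else GN
    print(N, round(GN, 4), round(FN, 4))  # both decrease toward H ~ 0.569
```

Since this toy source is first order, $F_N$ equals H exactly from N = 2 on, while $G_N$ decreases toward H more slowly, in line with the remark that $F_N$ is the better approximation.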
The ratio of the entropy of a source to the maximum value it could have while still restricted to the same symbols will be called its relative entropy. This is the maximum compression possible when we encode into the same alphabet. One minus the relative entropy is the redundancy. The redundancy of ordinary English, not considering statistical structure over greater distances than about eight letters, is roughly 50%. This means that when we write English half of what we write is determined by the structure of the language and half is chosen freely. The figure 50% was found by several independent methods which all gave results in this neighborhood. One is by calculation of the entropy of the approximations to English. A second method is to delete a certain fraction of the letters from a sample of English text and then let someone attempt to restore them. If they can be restored when 50% are deleted the redundancy must be greater than 50%. A third method depends on certain known results in cryptography.
Two extremes of redundancy in English prose are represented by Basic English and by James Joyce's book Finnegans Wake. The Basic English vocabulary is limited to 850 words and the redundancy is very high. This is reflected in the expansion that occurs when a passage is translated into Basic English. Joyce on the other hand enlarges the vocabulary and is alleged to achieve a compression of semantic content.
The redundancy of a language is related to the existence of crossword puzzles. If the redundancy is zero any sequence of letters is a reasonable text in the language and any two-dimensional array of letters forms a crossword puzzle. If the redundancy is too high the language imposes too many constraints for large crossword puzzles to be possible. A more detailed analysis shows that if we assume the constraints imposed by the language are of a rather chaotic and random nature, large crossword puzzles are just possible when the redundancy is 50%. If the redundancy is 33%, three-dimensional crossword puzzles should be possible, etc.
8. REPRESENTATION OF THE ENCODING AND DECODING OPERATIONS
We have yet to represent mathematically the operations performed by the transmitter and receiver in encoding and decoding the information. Either of these will be called a discrete transducer. The input to the transducer is a sequence of input symbols and its output a sequence of output symbols. The transducer may have an internal memory so that its output depends not only on the present input symbol but also on the past history. We assume that the internal memory is finite, i.e., there exist a finite number m of possible states of the transducer and that its output is a function of the present state and the present input symbol. The next state will be a second function of these two quantities. Thus a transducer can be described by two functions:
$$y_n = f(x_n, \alpha_n)$$
$$\alpha_{n+1} = g(x_n, \alpha_n)$$
where $x_n$ is the nth input symbol, $\alpha_n$ is the state of the transducer when the nth input symbol is introduced, and $y_n$ is the output symbol (or sequence of output symbols) produced when $x_n$ is introduced if the state is $\alpha_n$.
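A minimal sketch of such a transducer in Python (ours). The differential encoder chosen as the example is our assumption, not taken from the paper; it is nonsingular, and running it in tandem with its inverse recovers the input:

```python
def transducer(inputs, f, g, state=0):
    """Run a finite-state transducer: y_n = f(x_n, s_n), s_{n+1} = g(x_n, s_n)."""
    out = []
    for x in inputs:
        out.append(f(x, state))
        state = g(x, state)
    return out

# Example: a differential encoder over {0, 1}. The state is the last input;
# the output marks where the input changed.
encode_f = lambda x, s: x ^ s   # emit 1 when input differs from the state
encode_g = lambda x, s: x       # remember the last input symbol

# Its inverse transducer: the state is the last recovered input symbol.
decode_f = lambda y, s: y ^ s
decode_g = lambda y, s: y ^ s

msg = [0, 1, 1, 1, 0, 0, 1]
enc = transducer(msg, encode_f, encode_g)
dec = transducer(enc, decode_f, decode_g)
assert dec == msg  # the tandem connection recovers the original input
```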
If the output symbols of one transducer can be identified with the input symbols of a second, they can be connected in tandem and the result is also a transducer. If there exists a second transducer which operates on the output of the first and recovers the original input, the first transducer will be called nonsingular and the second will be called its inverse.
Theorem 7: The output of a finite state transducer driven by a finite state statistical source is a finite state statistical source, with entropy (per unit time) less than or equal to that of the input. If the transducer is nonsingular they are equal.

Let α represent the state of the source, which produces a sequence of symbols $x_i$; and let β be the state of the transducer, which produces, in its output, blocks of symbols $y_j$. The combined system can be represented by the "product state space" of pairs (α, β). Two points in the space, $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$, are connected by a line if $\alpha_1$ can produce an x which changes $\beta_1$ to $\beta_2$, and this line is given the probability of that x in this case. The line is labeled with the block of $y_j$ symbols produced by the transducer. The entropy of the output can be calculated as the weighted sum over the states. If we sum first on β each resulting term is less than or equal to the corresponding term for α, hence the entropy is not increased. If the transducer is nonsingular let its output be connected to the inverse transducer. If $H_1'$, $H_2'$ and $H_3'$ are the output entropies of the source, the first and second transducers respectively, then $H_1' \ge H_2' \ge H_3' = H_1'$ and therefore $H_1' = H_2'$.

Suppose we have a system of constraints on possible sequences of the type which can be represented by a linear graph as in Fig. 2. If probabilities $p_{ij}^{(s)}$ were assigned to the various lines connecting state i to state j this would become a source. There is one particular assignment which maximizes the resulting entropy (see Appendix 4).

Theorem 8: Let the system of constraints considered as a channel have a capacity C = log W. If we assign
$$p_{ij}^{(s)} = \frac{B_j}{B_i} W^{-\ell_{ij}^{(s)}}$$
where $\ell_{ij}^{(s)}$ is the duration of the sth symbol leading from state i to state j and the $B_i$ satisfy
$$B_i = \sum_{s,j} B_j W^{-\ell_{ij}^{(s)}}$$
then H is maximized and equal to C.

By proper assignment of the transition probabilities the entropy of symbols on a channel can be maximized at the channel capacity.

9. THE FUNDAMENTAL THEOREM FOR A NOISELESS CHANNEL
We will now justify our interpretation of H as the rate of generating information by proving that H determines the channel capacity required with most efficient coding.

Theorem 9: Let a source have entropy H (bits per symbol) and a channel have a capacity C (bits per second). Then it is possible to encode the output of the source in such a way as to transmit at the average rate $\frac{C}{H} - \epsilon$ symbols per second over the channel where ε is arbitrarily small. It is not possible to transmit at an average rate greater than $\frac{C}{H}$.

The converse part of the theorem, that $\frac{C}{H}$ cannot be exceeded, may be proved by noting that the entropy of the channel input per second is equal to that of the source, since the transmitter must be nonsingular, and also this entropy cannot exceed the channel capacity. Hence $H' \le C$ and the number of symbols per second $= H'/H \le C/H$.

The first part of the theorem will be proved in two different ways. The first method is to consider the set of all sequences of N symbols produced by the source. For N large we can divide these into two groups, one containing less than $2^{(H+\eta)N}$ members and the second containing less than $2^{RN}$ members (where R is the logarithm of the number of different symbols) and having a total probability less than μ. As N increases η and μ approach zero. The number of signals of duration T in the channel is greater than $2^{(C-\theta)T}$ with θ small when T is large. If we choose
$$T = \left(\frac{H}{C} + \lambda\right)N$$
then there will be a sufficient number of sequences of channel symbols for the high probability group when N and T are sufficiently large (however small λ) and also some additional ones. The high probability group is coded in an arbitrary one-to-one way into this set. The remaining sequences are represented by larger sequences, starting and ending with one of the sequences not used for the high probability group. This special sequence acts as a start and stop signal for a different code. In between a sufficient time is allowed to give enough different sequences for all the low probability messages. This will require
$$T_1 = \left(\frac{R}{C} + \varphi\right)N$$
where φ is small. The mean rate of transmission in message symbols per second will then be greater than
$$\left[(1-\delta)\frac{T}{N} + \delta\frac{T_1}{N}\right]^{-1} = \left[(1-\delta)\left(\frac{H}{C}+\lambda\right) + \delta\left(\frac{R}{C}+\varphi\right)\right]^{-1}.$$
As N increases δ, λ and φ approach zero and the rate approaches $\frac{C}{H}$.

Another method of performing this coding and thereby proving the theorem can be described as follows: Arrange the messages of length N in order of decreasing probability and suppose their probabilities are $p_1 \ge p_2 \ge p_3 \ge \cdots \ge p_n$. Let $P_s = \sum_{i=1}^{s-1} p_i$; that is, $P_s$ is the cumulative probability up to, but not including, $p_s$. We first encode into a binary system. The binary code for message s is obtained by expanding $P_s$ as a binary number. The expansion is carried out to $m_s$ places, where $m_s$ is the integer satisfying:
$$\log_2\frac{1}{p_s} \le m_s < 1 + \log_2\frac{1}{p_s}.$$
Thus the messages of high probability are represented by short codes and those of low probability by long codes. From these inequalities we have
$$\frac{1}{2^{m_s}} \le p_s < \frac{1}{2^{m_s - 1}}.$$
The code for $P_s$ will differ from all succeeding ones in one or more of its $m_s$ places, since all the remaining $P_i$ are at least $\frac{1}{2^{m_s}}$ larger and their binary expansions therefore differ in the first $m_s$ places. Consequently all the codes are different and it is possible to recover the message from its code. If the channel sequences are not already sequences of binary digits, they can be ascribed binary numbers in an arbitrary fashion and the binary code thus translated into signals suitable for the channel.

The average number H′ of binary digits used per symbol of original message is easily estimated. We have
$$H' = \frac{1}{N}\sum m_s p_s.$$
But,
$$\frac{1}{N}\sum \left(\log_2\frac{1}{p_s}\right)p_s \le \frac{1}{N}\sum m_s p_s < \frac{1}{N}\sum\left(1 + \log_2\frac{1}{p_s}\right)p_s$$
and therefore,
$$G_N \le H' < G_N + \frac{1}{N}.$$
As N increases $G_N$ approaches H, the entropy of the source, and H′ approaches H.

We see from this that the inefficiency in coding, when only a finite delay of N symbols is used, need not be greater than $\frac{1}{N}$ plus the difference between the true entropy H and the entropy $G_N$ calculated for sequences of length N. The per cent excess time needed over the ideal is therefore less than
$$\frac{G_N}{H} + \frac{1}{HN} - 1.$$
This method of encoding is substantially the same as one found independently by R. M. Fano.⁹ His method is to arrange the messages of length N in order of decreasing probability. Divide this series into two groups of as nearly equal probability as possible. If the message is in the first group its first binary digit will be 0, otherwise 1. The groups are similarly divided into subsets of nearly equal probability and the particular subset determines the second binary digit. This process is continued until each subset contains only one message. It is easily seen that apart from minor differences (generally in the last digit) this amounts to the same thing as the arithmetic process described above.
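The arithmetic process just described is easy to implement. The sketch below (ours; the name shannon_code is our label) sorts the probabilities, forms the cumulative sums $P_s$, and expands each to $m_s = \lceil\log_2 1/p_s\rceil$ binary places. On the four-letter source of the next section it reproduces the code lengths 1, 2, 3, 3:

```python
import math

def shannon_code(probs):
    """Binary codes per the second proof of Theorem 9: expand the cumulative
    probability P_s to m_s places, log2(1/p_s) <= m_s < 1 + log2(1/p_s)."""
    probs = sorted(probs, reverse=True)
    codes, P = [], 0.0
    for p in probs:
        m = math.ceil(-math.log2(p))  # the integer m_s from the inequality
        bits, frac = "", P
        for _ in range(m):            # binary expansion of P_s to m_s places
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
        codes.append((p, bits))
        P += p
    return codes

for p, c in shannon_code([0.5, 0.25, 0.125, 0.125]):
    print(p, c)   # 0.5 -> 0, 0.25 -> 10, 0.125 -> 110, 0.125 -> 111
```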
10. DISCUSSION AND EXAMPLES
In order to obtain the maximum power transfer from a generator to a load, a transformer must in general be introduced so that the generator as seen from the load has the load resistance. The situation here is roughly analogous. The transducer which does the encoding should match the source to the channel in a statistical sense. The source as seen from the channel through the transducer should have the same statistical structure as the source which maximizes the entropy in the channel. The content of Theorem 9 is that, although an exact match is not in general possible, we can approximate it as closely as desired. The ratio of the actual rate of transmission to the capacity C may be called the efficiency of the coding system. This is of course equal to the ratio of the actual entropy of the channel symbols to the maximum possible entropy.

⁹ Technical Report No. 65, The Research Laboratory of Electronics, M.I.T., March 17, 1949.
In general, ideal or nearly ideal encoding requires a long delay in the transmitter and receiver. In the noiseless case which we have been considering, the main function of this delay is to allow reasonably good matching of probabilities to corresponding lengths of sequences. With a good code the logarithm of the reciprocal probability of a long message must be proportional to the duration of the corresponding signal, in fact
$$\left|\frac{\log p^{-1}}{T} - C\right|$$
must be small for all but a small fraction of the long messages.

If a source can produce only one particular message its entropy is zero, and no channel is required. For example, a computing machine set up to calculate the successive digits of π produces a definite sequence with no chance element. No channel is required to "transmit" this to another point. One could construct a second machine to compute the same sequence at the point. However, this may be impractical. In such a case we can choose to ignore some or all of the statistical knowledge we have of the source. We might consider the digits of π to be a random sequence in that we construct a system capable of sending any sequence of digits. In a similar way we may choose to use some of our statistical knowledge of English in constructing a code, but not all of it. In such a case we consider the source with the maximum entropy subject to the statistical conditions we wish to retain. The entropy of this source determines the channel capacity which is necessary and sufficient. In the π example the only information retained is that all the digits are chosen from the set 0, 1, …, 9. In the case of English one might wish to use the statistical saving possible due to letter frequencies, but nothing else. The maximum entropy source is then the first approximation to English and its entropy determines the required channel capacity.
As a simple example of some of these results consider a source which produces a sequence of letters chosen from among A, B, C, D with probabilities 1/2, 1/4, 1/8, 1/8, successive symbols being chosen independently. We have
$$H = -\left(\tfrac{1}{2}\log\tfrac{1}{2} + \tfrac{1}{4}\log\tfrac{1}{4} + \tfrac{2}{8}\log\tfrac{1}{8}\right) = \tfrac{7}{4} \text{ bits per symbol.}$$
Thus we can approximate a coding system to encode messages from this source into binary digits with an average of 7/4 binary digits per symbol. In this case we can actually achieve the limiting value by the following code (obtained by the method of the second proof of Theorem 9):

A  0
B  10
C  110
D  111
The average number of binary digits used in encoding a sequence of N symbols will be
$$N\left(\tfrac{1}{2}\times 1 + \tfrac{1}{4}\times 2 + \tfrac{2}{8}\times 3\right) = \tfrac{7}{4}N.$$
It is easily seen that the binary digits 0, 1 have probabilities 1/2, 1/2 so the H for the coded sequences is one bit per symbol. Since, on the average, we have 7/4 binary symbols per original letter, the entropies on a time basis are the same. The maximum possible entropy for the original set is log 4 = 2, occurring when A, B, C, D have probabilities 1/4, 1/4, 1/4, 1/4. Hence the relative entropy is 7/8. We can translate the binary sequences into the original set of symbols on a two-to-one basis by the following table:

00  A′
01  B′
10  C′
11  D′

This double process then encodes the original message into the same symbols but with an average compression ratio 7/8.
As a second example consider a source which produces a sequence of A's and B's with probability p for A and q for B. If p ≪ q we have
$$H = -\log p^p(1-p)^{1-p} = -p\log p(1-p)^{(1-p)/p} \doteq p\log\frac{e}{p}.$$
In such a case one can construct a fairly good coding of the message on a 0, 1 channel by sending a special sequence, say 0000, for the infrequent symbol A and then a sequence indicating the number of B's following it. This could be indicated by the binary representation with all numbers containing the special sequence deleted. All numbers up to 16 are represented as usual; 16 is represented by the next binary number after 16 which does not contain four zeros, namely 17 = 10001, etc. It can be shown that as p → 0 the coding approaches ideal provided the length of the special sequence is properly adjusted.
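The counting scheme with forbidden numerals can be sketched as follows (our illustration; only the numeral assignment is shown, not the full encoder): representation(n) returns the numeral given to n once every binary number containing 0000 has been deleted from the counting sequence.

```python
def representation(n):
    """The numeral for n after binary numbers containing the special
    sequence 0000 are deleted from the counting sequence."""
    k, m = -1, -1
    while k < n:
        m += 1
        if "0000" not in bin(m)[2:]:
            k += 1   # m is the next surviving numeral
    return bin(m)[2:]

# Numbers up to 16 are represented as usual; 16 skips 10000 and uses 10001.
for n in (5, 15, 16, 17):
    print(n, representation(n))
```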

PART II: THE DISCRETE CHANNEL WITH NOISE

11. REPRESENTATION OF A NOISY DISCRETE CHANNEL
We now consider the case where the signal is perturbed by noise during transmission or at one or the other of the terminals. This means that the received signal is not necessarily the same as that sent out by the transmitter. Two cases may be distinguished. If a particular transmitted signal always produces the same received signal, i.e., the received signal is a definite function of the transmitted signal, then the effect may be called distortion. If this function has an inverse (no two transmitted signals producing the same received signal) distortion may be corrected, at least in principle, by merely performing the inverse functional operation on the received signal.

The case of interest here is that in which the signal does not always undergo the same change in transmission. In this case we may assume the received signal E to be a function of the transmitted signal S and a second variable, the noise N:
$$E = f(S, N).$$
The noise is considered to be a chance variable just as the message was above. In general it may be represented by a suitable stochastic process. The most general type of noisy discrete channel we shall consider is a generalization of the finite state noise-free channel described previously. We assume a finite number of states and a set of probabilities
$$p_{\alpha,i}(\beta, j).$$
This is the probability, if the channel is in state α and symbol i is transmitted, that symbol j will be received and the channel left in state β. Thus α and β range over the possible states, i over the possible transmitted signals and j over the possible received signals. In the case where successive symbols are independently perturbed by the noise there is only one state, and the channel is described by the set of transition probabilities $p_i(j)$, the probability of transmitted symbol i being received as j.
If a noisy channel is fed by a source there are two statistical processes at work: the source and the noise. Thus there are a number of entropies that can be calculated. First there is the entropy H(x) of the source or of the input to the channel (these will be equal if the transmitter is nonsingular). The entropy of the output of the channel, i.e., the received signal, will be denoted by H(y). In the noiseless case H(y) = H(x). The joint entropy of input and output will be H(xy). Finally there are two conditional entropies $H_x(y)$ and $H_y(x)$, the entropy of the output when the input is known and conversely. Among these quantities we have the relations
$$H(x, y) = H(x) + H_x(y) = H(y) + H_y(x).$$
All of these entropies can be measured on a per-second or a per-symbol basis.
12. EQUIVOCATION AND CHANNEL CAPACITY
If the channel is noisy it is not in general possible to reconstruct the original message or the transmitted signal with certainty by any operation on the received signal E. There are, however, ways of transmitting the information which are optimal in combating noise. This is the problem which we now consider.

Suppose there are two possible symbols 0 and 1, and we are transmitting at a rate of 1000 symbols per second with probabilities $p_0 = p_1 = \frac{1}{2}$. Thus our source is producing information at the rate of 1000 bits per second. During transmission the noise introduces errors so that, on the average, 1 in 100 is received incorrectly (a 0 as 1, or 1 as 0). What is the rate of transmission of information? Certainly less than 1000 bits per second since about 1% of the received symbols are incorrect. Our first impulse might be to say the rate is 990 bits per second, merely subtracting the expected number of errors. This is not satisfactory since it fails to take into account the recipient's lack of knowledge of where the errors occur. We may carry it to an extreme case and suppose the noise so great that the received symbols are entirely independent of the transmitted symbols. The probability of receiving 1 is 1/2 whatever was transmitted and similarly for 0. Then about half of the received symbols are correct due to chance alone, and we would be giving the system credit for transmitting 500 bits per second while actually no information is being transmitted at all. Equally "good" transmission would be obtained by dispensing with the channel entirely and flipping a coin at the receiving point.
Evidently the proper correction to apply to the amount of information transmitted is the amount of this information which is missing in the received signal, or alternatively the uncertainty when we have received a signal of what was actually sent. From our previous discussion of entropy as a measure of uncertainty it seems reasonable to use the conditional entropy of the message, knowing the received signal, as a measure of this missing information. This is indeed the proper definition, as we shall see later. Following this idea the rate of actual transmission, R, would be obtained by subtracting from the rate of production (i.e., the entropy of the source) the average rate of conditional entropy:
$$R = H(x) - H_y(x).$$
The conditional entropy $H_y(x)$ will, for convenience, be called the equivocation. It measures the average ambiguity of the received signal.

In the example considered above, if a 0 is received the a posteriori probability that a 0 was transmitted is .99, and that a 1 was transmitted is .01. These figures are reversed if a 1 is received. Hence
$$H_y(x) = -[.99\log .99 + 0.01\log 0.01] = .081 \text{ bits/symbol}$$
or 81 bits per second. We may say that the system is transmitting at a rate 1000 − 81 = 919 bits per second. In the extreme case where a 0 is equally likely to be received as a 0 or 1 and similarly for 1, the a posteriori probabilities are 1/2, 1/2 and
$$H_y(x) = -\left[\tfrac{1}{2}\log\tfrac{1}{2} + \tfrac{1}{2}\log\tfrac{1}{2}\right] = 1 \text{ bit per symbol}$$
or 1000 bits per second. The rate of transmission is then 0 as it should be.
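The figures in this example are quick to reproduce; a minimal sketch (ours), with logarithms to base 2:

```python
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

rate = 1000   # symbols (and source bits) per second
err = 0.01    # 1 received symbol in 100 is wrong

# Equivocation H_y(x): uncertainty about the input given the output.
equivocation = H([1 - err, err])   # 0.0808... ~ .081 bits per symbol
R = rate * (1 - equivocation)      # 1000 - 81 = 919 bits per second
print(round(equivocation, 4), round(R))

# With the error probability at 1/2 the output is independent of the input:
print(H([0.5, 0.5]))  # equivocation of 1 bit/symbol, so the rate is 0
```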

The following theorem gives a direct intuitive interpretation of the equivocation and also serves to justify it as the unique appropriate measure. We consider a communication system and an observer (or auxiliary device) who can see both what is sent and what is recovered (with errors due to noise). This observer notes the errors in the recovered message and transmits data to the receiving point over a "correction channel" to enable the receiver to correct the errors. The situation is indicated schematically in Fig. 8.
Theorem 10: If the correction channel has a capacity equal to $H_y(x)$ it is possible to so encode the correction data as to send it over this channel and correct all but an arbitrarily small fraction ε of the errors. This is not possible if the channel capacity is less than $H_y(x)$.