CHAPTER ONE
Source Coding

"Not everything that can be counted counts, and not everything that counts can be counted."
- Albert Einstein (1879-1955)

Learning Objectives
After studying this chapter the students will be able to:
LO1 Understand the concept of information.
LO2 State and prove the source coding theorem and discuss its ramifications.
LO3 Analyze five source coding techniques: the Huffman encoding, Shannon-Fano-Elias encoding, Arithmetic encoding, the Lempel-Ziv encoding and Run Length encoding.
LO4 Underline the concept of the rate distortion function and the design of the optimum quantizer.
LO5 Apply the knowledge to study image compression, one of the important application areas of source coding.

1.1 Introduction to Information Theory

This chapter comes with a video overview by the author. Scan here to know more or visit http://qrcode.lipick.com/index.php/S82

Today we live in the information age. The Internet has become an integral part of our lives, making this third planet from the sun a global village. People are seen talking over cellular phones, sometimes even in cinema halls. Movies can be rented in the form of a DVD disk or streamed directly to our smart phones. Email and web addresses are commonly visible on business cards. Most people today prefer to send emails and e-cards to their friends rather than the regular snail mail. Stock quotes and cricket scores can be checked using the mobile phone. "Selfies" can be captured and uploaded on social media sites with just a click of a button.

Information has become the key to success, and in the digital world it is carried by the ubiquitous, omnipresent bits that flow all around us. Yet the present day's information age owes its existence to a theory that was initiated by one man, the American Electrical Engineer Claude E. Shannon, whose landmark paper "The Mathematical Theory of Communication", published in the Bell System Technical Journal (1948), laid the foundation of the wonderful field of information theory. The term information theory can be interpreted to include the messages occurring in any of the standard communications media, such as telegraphy, radio, or television, and the signals involved in electronic computers, electromechanical systems, and other data-processing devices. The theory is even applicable to the signals appearing in the nerve networks of humans and other animals.

The chief concern of information theory is to discover mathematical laws governing systems designed to communicate or manipulate information. It sets up quantitative measures of information and of the capacity of various systems to transmit, store, and otherwise process information. Some of the problems treated are related to finding the best methods of using various available communication systems and the best methods for separating the wanted information, or signal, from the extraneous information, or noise. Another problem is the setting of upper bounds on what it is possible to achieve with a given information-carrying medium (often called an information channel). While the results are chiefly of interest to communication engineers, some of the concepts have been adopted and found useful in fields like psychology and linguistics. The notion of mutual information has also found applications in population based gene mapping, whose aim is to find the DNA regions (genotypes) responsible for particular traits (phenotypes).
The boundaries of information theory are quite fuzzy. The theory overlaps heavily with communication theory but is more oriented toward the fundamental limitations on the processing and communication of information and less oriented toward the detailed operation of the devices employed.

In this chapter
We will start with an intuitive understanding of information and link uncertainty to information. We will then define self-information, average self-information (entropy), mutual information, average mutual information and relative entropy, in LO1.
Next, we shall state and prove the source coding theorem and also compare variable and fixed length codes. We will also introduce the notion of a prefix code, in LO2.
We will then study five interesting source coding techniques: the Huffman encoding, Shannon-Fano-Elias encoding, Arithmetic encoding, Lempel-Ziv encoding and the Run Length encoding, in LO3.
Next, we will understand the concept of the rate distortion function and the design of the optimum quantizer. We will find out how to calculate the entropy rate for a stochastic process. We will also study the Markov process, in LO4.
Finally, we will use all the beautiful concepts that we have learnt and apply the knowledge to study image compression, one of the important application areas of source coding. In particular, the JPEG compression standard will be discussed, in LO5.

1.2 Uncertainty and Information

Any information source produces an output that is random in nature. If the source output had no randomness, i.e., the output were known exactly, there would be no need to transmit it! There exist both analog and discrete information sources. Actually, we live in an analog world, and most sources are analog sources, for example, speech, temperature fluctuations, etc. The discrete sources are man-made sources, for example, a source (say, a man) that generates a sequence of letters from a finite alphabet (while typing an email). Before we go on to develop a mathematical measure of information, let us develop an intuitive feel for it. Read the following sentences:

(A) Tomorrow, the sun will rise from the East.
(B) The phone will ring in the next one hour.
(C) It will snow in Delhi this winter.

The three sentences carry different amounts of information. In fact, the first sentence hardly carries any information. It is a sure-shot thing. Everybody knows that the sun rises from the East and the probability of this happening again is almost unity ("Making predictions is risky, especially when it involves the future." - N. Bohr). Sentence (B) appears to carry more information than sentence (A). The phone may ring, or it may not. There is a finite probability that the phone will ring in the next one hour (unless the maintenance people are at work again). The last sentence probably made you read it over twice. This is because it has never snowed in Delhi, and the probability of a snowfall is very low.
It is interesting to note that the amount of information carried by the sentences listed above has something to do with the probability of occurrence of the events stated in the sentences, and we observe an inverse relationship. Sentence (A), which talks about an event with a probability of occurrence very close to 1, carries almost no information. Sentence (C), which has a very low probability of occurrence, appears to carry a lot of information (it made us read it twice to be sure we got the information right!). The other interesting thing to note is that the length of the sentence has nothing to do with the amount of information it conveys. In fact, sentence (A) is the longest of the three sentences but carries the minimum information. We will now develop a mathematical measure of information.

Definition 1.1 Consider a discrete random variable X with possible outcomes x_i, i = 1, 2, ..., n. The Self-Information of the event X = x_i is defined as

I(x_i) = -log P(x_i)     (1.1)

We note that a high probability event conveys less information than a low probability event. For an event with P(x_i) = 1, I(x_i) = 0. Since a lower probability implies a higher degree of uncertainty (and vice versa), a random variable with a higher degree of uncertainty contains more information. We will use this correlation between uncertainty and information for physical interpretations throughout this chapter.

The units of self-information depend on the base of the logarithm, which is usually selected as 2 or e. When the base is 2, the units are in bits, and when the base is e, the units are in nats (natural units). Since 0 <= P(x_i) <= 1, self-information is non-negative. The following examples illustrate why a logarithmic measure of information is appropriate.

Example 1.1 Consider a discrete, memoryless binary source that behaves as a fair coin and produces an output 0 or 1 (a head or a tail). For this source, P(1) = P(0) = 0.5. The information content of each output from the source is

I(x_i) = -log2 P(x_i) = -log2 (0.5) = 1 bit

Indeed, we have to use only one bit to represent the output from this binary source (say, a 1 to represent H and a 0 to represent T). Now, suppose the successive outputs from this binary source are statistically independent, i.e., the source is memoryless. Consider a block of m binary digits. There are 2^m possible m-bit blocks, each of which is equally probable with probability 2^-m. The self-information of an m-bit block is

I(x_i) = -log2 P(x_i) = -log2 2^-m = m bits

Again, we observe that we indeed need m bits to represent the possible m-bit blocks. Thus, the logarithmic measure of information possesses the desired additive property when a number of source outputs are considered as a block.

Example 1.2 Consider a discrete, memoryless source (source C) that generates two bits at a time. This source comprises two binary sources (sources A and B) as mentioned in Example 1.1, each source contributing one bit. The two binary sources within the source C are independent. Intuitively, the information content of the aggregate source (source C) should be the sum of the information contained in the outputs of the two independent sources that constitute this source C. Let us look at the information content of the outputs of source C. There are four possible outcomes {00, 01, 10, 11}, each with a probability P(C) = P(A)P(B) = (0.5)(0.5) = 0.25, because the sources A and B are independent. The information content of each output from the source C is

I(x_i) = -log2 P(x_i) = -log2 (0.25) = 2 bits

Indeed, we have to use two bits to represent the output from this combined binary source. Thus, the logarithmic measure of information possesses the desired additive property for independent events.
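The logarithmic measure and its additive property are easy to check numerically. The short Python sketch below is not from the text (the helper function name is my own); it merely evaluates the self-information of Examples 1.1 and 1.2 and confirms that the information of two independent bits is the sum of the individual values.

```python
import math

def self_information(p, base=2):
    """Self-information I(x) = -log_base P(x) of an outcome with probability p."""
    return -math.log(p, base)

# Example 1.1: a fair binary source, P(0) = P(1) = 0.5
print(self_information(0.5))          # 1.0 bit

# A block of m independent fair bits has probability 2**-m
m = 4
print(self_information(2 ** -m))      # 4.0 bits, i.e., m bits

# Example 1.2: two independent binary sources, joint probability 0.25
print(self_information(0.25))                          # 2.0 bits
print(self_information(0.5) + self_information(0.5))   # also 2.0 bits (additivity)
```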
Next, consider two discrete random variables X and Y with possible outcomes x_i, i = 1, 2, ..., n and y_j, j = 1, 2, ..., m, respectively.

CODING - INTRODUCTION TO INFORMATION THEORY, UNCERTAINTY AND INFORMATION, AVERAGE MUTUAL INFORMATION AND ENTROPY, SOURCE CODING, THE HUFFMAN CODING, SHANNON-FANO-ELIAS CODING

Questions

Q.1 Write an introductory note on information theory. Also define the term information. (R.G.P.V., Dec. 2013)
Ans. Information theory was proposed by the communication scientist Claude E. Shannon, whose ideas appeared in the article "The Mathematical Theory of Communication" in the Bell System Technical Journal in 1948. Information theory is a branch of probability theory, which can be applied to the study of communication systems.
Today, we are living in the age of information. The internet has become an important part of our lives, making this whole world a global village. People talking over cell phones is a common sight, sometimes even in theatres and cinemas. Movies are rented in the form of a DVD disk. Web addresses and e-mail addresses are common on business cards. A large number of persons prefer to send e-cards and e-mails to other persons.
Information, in this context, is the content of any of the standard communication media, such as television, telephony, telegraphy, or radio, and of the signals of electronic computers, servo-mechanism systems, and other data-processing devices.

Q.2 Define the rate of information. (R.G.P.V., Dec. 2013)
Ans. If a source of messages generates messages at the rate of r messages per second, then the rate of information R is defined as the average number of information bits per second. Now, H is the average number of information bits per message. Therefore,
R = rH bits/sec

Q.3 Define various units of information.
Ans. We can define various units of information for various bases of the logarithm. The unit is a bit if the base is 2, a nat if the base is e, and a decit if the base is 10. Thus, the base of the logarithm is important. If the base is not mentioned, it is considered as base 2, or binary, and the unit of information is the bit. The relationships among the units are shown in table 5.1.

Table 5.1 Relation between the units
Bits:   1 bit  = 1/log2 e = 0.6932 nat;   1 bit = 1/log2 10 = 0.3010 decit
Nats:   1 nat  = 1/ln 2   = 1.4426 bits;  1 nat = 1/ln 10   = 0.4343 decit
Decits: 1 decit = 1/log10 2 = 3.3219 bits

Q.4 Define the term entropy. (R.G.P.V., Dec. 2013)
Ans. A communication system is not only meant to deal with a single message, but with all possible messages. Hence, although the instantaneous information flows corresponding to individual messages from the source may be erratic, we may describe the source in terms of the average information per individual message, known as the entropy of the source.

Q.5 Explain the uncertainty of the communication system.
Ans. In information theory, any information source is considered statistical in nature, i.e., the system's performance can never be explained in a deterministic sense; it can always be explained in statistical terms. Hence, the most important parameter of a communication system is its uncertainty or unpredictability. Consider the communication system of fig. 5.1, where the transmitter sends at random any one of the pre-defined messages. The probability of every individual message transmission is known. As the communication system model is statistically defined, its average or overall performance can be determined in terms of a statistical parameter associated with the probability scheme, a parameter which gives a relative measure of the uncertainty relevant to the occurrence of every message in the message ensemble.

Fig. 5.1 A Communication System (transmitter and receiver)
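The conversions in Table 5.1 and the rate formula R = rH of Q.2 can be verified with a few lines of Python. This is only an illustrative sketch; the values chosen for r and H below are made-up numbers, not taken from the text.

```python
import math

# Unit conversions (Table 5.1): the same information measured with log base 2, e and 10.
one_bit_in_nats   = math.log(2)        # = ln 2       ~ 0.6932 nat
one_bit_in_decits = math.log10(2)      # = log10 2    ~ 0.3010 decit
one_nat_in_bits   = 1 / math.log(2)    # = 1/ln 2     ~ 1.4426 bits
one_decit_in_bits = 1 / math.log10(2)  # = 1/log10 2  ~ 3.3219 bits
print(one_bit_in_nats, one_bit_in_decits, one_nat_in_bits, one_decit_in_bits)

# Rate of information (Q.2): R = r * H bits per second.
r = 1000   # messages per second (assumed value)
H = 2.5    # average information per message, in bits (assumed value)
print("R =", r * H, "bits/sec")
```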
Consider a binary channel that always changes the transmitted bit, i.e., it has a 100% bit error rate. Interestingly, this is actually a useful channel: since the transition probability is 1, if a 1 is received it can be concluded that a 0 was actually transmitted (and vice versa), so at the receiver we just flip the received bit.

1.3 Average Mutual Information and Entropy

So far we have studied the mutual information associated with a pair of events x_i and y_j, which are the possible outcomes of the two random variables X and Y. We now want to find out the average mutual information between the two random variables. This can be obtained simply by weighting I(x_i; y_j) by the probability of occurrence of the joint event and summing over all possible joint events.

Definition 1.4 The Average Mutual Information between two random variables X and Y is given by

I(X; Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(x_i, y_j) I(x_i; y_j) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(x_i, y_j) log [ P(x_i, y_j) / (P(x_i) P(y_j)) ]

For the case when X and Y are statistically independent, I(X; Y) = 0, i.e., there is no average mutual information between X and Y. An important property of the average mutual information is that I(X; Y) >= 0, with equality if and only if X and Y are statistically independent.

Definition 1.5 The Average Self-Information of a random variable X is defined as

H(X) = Σ_{i=1}^{n} P(x_i) I(x_i) = -Σ_{i=1}^{n} P(x_i) log P(x_i)

When X represents the alphabet of possible output letters from a source, H(X) represents the average self-information per source letter. In this case H(X) is called the entropy. The entropy of X can also be interpreted as the expected value of -log P(X). The term entropy has been borrowed from statistical mechanics, where it is used to denote the level of disorder in a system. It is interesting to see that the Chinese character for entropy is 熵. We observe that since 0 <= P(x_i) <= 1, each term -P(x_i) log P(x_i) is non-negative, and hence H(X) >= 0.

Example 1.6 Consider a discrete binary source that emits a sequence of statistically independent symbols. The output is either 0 with probability p or 1 with probability 1 - p. The entropy of this binary source is

H(X) = -p log2 p - (1 - p) log2 (1 - p)

Let us now explore the efficient representation of the symbols generated by a source, the primary motivation being to represent them using as few bits as possible. Suppose a discrete memoryless source (DMS) outputs a symbol every τ seconds. Each symbol is selected from a finite set of symbols x_i, i = 1, 2, ..., L. The entropy of this DMS in bits per source symbol is

H(X) = -Σ_{i=1}^{L} P(x_i) log2 P(x_i) <= log2 L

where the equality holds when the symbols are equally probable. This implies that the average number of bits per source symbol is H(X) and the source rate is H(X)/τ bits per second.

Now let us suppose that we wish to represent the 26 letters of the English alphabet using bits. We observe that 2^5 = 32 > 26. Hence, each of the letters can be represented using 5 bits, so that each letter has a corresponding 5-bit long codeword. This is an example of a Fixed Length Code (FLC).
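Definitions 1.4 and 1.5 translate directly into code. The sketch below uses my own helper functions (not the book's); it computes H(X) and I(X; Y) from a joint probability table, confirms that I(X; Y) = 0 for independent variables, and shows that the bit-flipping channel above still carries 1 bit of mutual information.

```python
import math

def entropy(p):
    """H(X) = -sum_i p_i log2 p_i for a probability vector p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """I(X;Y) = sum_ij P(x_i,y_j) log2[ P(x_i,y_j) / (P(x_i) P(y_j)) ]."""
    px = [sum(row) for row in joint]          # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                total += pxy * math.log2(pxy / (px[i] * py[j]))
    return total

# Independent X and Y (joint = product of marginals): I(X;Y) = 0
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))   # 0.0

# The bit-flipping channel: Y is always the complement of X, yet I(X;Y) = 1 bit
print(mutual_information([[0.0, 0.5], [0.5, 0.0]]))       # 1.0

# Example 1.6 with p = 0.5: entropy of a fair binary source
print(entropy([0.5, 0.5]))                                # 1.0 bit
```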
Definition A code is a set of vectors called codewords.

Suppose a discrete memoryless source (DMS) outputs a symbol selected from a finite set of symbols x_i, i = 1, 2, ..., L. The number of binary digits (bits) R required for unique coding, when L is a power of 2, is

R = log2 L

and, when L is not a power of 2, it is

R = ⌊log2 L⌋ + 1

As we saw earlier, to encode the letters of the English alphabet we need R = ⌊log2 26⌋ + 1 = 5 bits.

The fixed length code for the English alphabet suggests that each of the letters in the alphabet is equally important (probable) and hence each one requires 5 bits for representation. However, we know that some of the letters are less common (x, q, z etc.) while some others are more frequently used (e, t, a etc.). It appears that allotting an equal number of bits to both the frequently used letters and the not so commonly used letters is not an efficient way of representation (coding). Intuitively, we should represent the more frequently occurring letters by a fewer number of bits and the less frequently occurring letters by a larger number of bits. In this manner, if we have to encode a whole page of written text, we might end up using fewer bits overall. When the source symbols are not equally probable, a more efficient method is to use a Variable Length Code (VLC).

Example 1.10 Suppose we have only the first eight letters of the English alphabet in our vocabulary. The fixed length code for this set of letters would be

Fixed length code
Letter   Codeword        Letter   Codeword
A        000             E        100
B        001             F        101
C        010             G        110
D        011             H        111

A variable length code for the same set of letters can be

Variable length code 1
Letter   Codeword        Letter   Codeword
A        00              E        101
B        010             F        110
C        011             G        1110
D        100             H        1111

Suppose we have to code the series of letters: "A BAD CAB". The fixed length and the variable length representations of this pseudo sentence would be

Fixed Length Code:       000 001 000 011 010 000 001    Total bits = 21
Variable Length Code 1:  00 010 00 100 011 00 010       Total bits = 18

Note that the variable length code uses a fewer number of bits simply because the letters appearing more frequently in the pseudo sentence are represented with fewer bits.

We look at yet another variable length code for the first 8 letters of the English alphabet:

Variable length code 2
Letter   Codeword        Letter   Codeword
A        0               E        10
B        1               F        11
C        00              G        000
D        01              H        111

This second variable length code appears to be even more efficient in terms of representation of the letters:

Variable Length Code 1:  00 010 00 100 011 00 010    Total bits = 18
Variable Length Code 2:  0 1001 0001                 Total bits = 9

However, there is a problem with VLC2. Consider the sequence of bits 0 1001 0001 which is used to represent A BAD CAB in the second variable length coding scheme. We could regroup the bits in a different manner to have [0][10][0][1][0][0][01], which translates to A EAB AAD, or we can decode the vector as [0][1][0][0][1][0][0][0][1], which stands for A BAAB AAAB! Obviously there is a problem with the unique decoding of the code. We have no clue where the codeword of one letter (symbol) ends and where the next one begins, since the lengths of the codewords are variable. However, this problem does not exist with the VLC1. It can be seen that no codeword forms the prefix of any other codeword. This is called the prefix condition. So, as soon as a sequence of bits corresponding to any one of the possible codewords is detected, we can declare that symbol decoded. Such codes are called Instantaneous Codes. There is no decoding delay in these codes.
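The bit counts of Example 1.10 and the prefix condition can be checked mechanically. The following Python sketch is illustrative only; the dictionaries simply restate the three code tables above, and the helper names are my own.

```python
FLC  = {'A': '000', 'B': '001', 'C': '010', 'D': '011',
        'E': '100', 'F': '101', 'G': '110', 'H': '111'}
VLC1 = {'A': '00', 'B': '010', 'C': '011', 'D': '100',
        'E': '101', 'F': '110', 'G': '1110', 'H': '1111'}
VLC2 = {'A': '0', 'B': '1', 'C': '00', 'D': '01',
        'E': '10', 'F': '11', 'G': '000', 'H': '111'}

def encode(text, code):
    """Concatenate the codewords of the letters in 'text'."""
    return ''.join(code[ch] for ch in text)

def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword (the prefix condition)."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

msg = "ABADCAB"
print(len(encode(msg, FLC)))     # 21 bits
print(len(encode(msg, VLC1)))    # 18 bits
print(len(encode(msg, VLC2)))    # 9 bits
print(is_prefix_free(VLC1))      # True  -> instantaneous code
print(is_prefix_free(VLC2))      # False -> decoding is ambiguous
```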
If decoding delays are permitted, we may use Uniquely Decodable Codes, in which the encoded string could be generated by only one possible input string. However, one may have to wait until the entire string is obtained before decoding even the first symbol. In this example, the VLC2 is not a uniquely decodable code, hence not a code of any practical utility. The VLC1 is uniquely decodable, though less economical in terms of bits per symbol.

1.7 Huffman Coding

We will now study an algorithm for constructing efficient source codes for a DMS with source symbols that are not equally probable. A variable length encoding algorithm was suggested by Huffman in 1952, based on the source symbol probabilities P(x_i), i = 1, 2, ..., L. The algorithm is optimal in the sense that the average number of bits required to represent the source symbols is a minimum, provided the prefix condition is met. The steps of the Huffman coding algorithm are as follows:

(i) Arrange the source symbols in decreasing order of their probabilities.
(ii) Take the bottom two symbols and tie them together as shown in Fig. 1.11. Add the probabilities of the two symbols and write it on the combined node. Label the two branches with a '1' and a '0' as depicted in Fig. 1.11.
(iii) Treat this sum of probabilities as a new probability associated with a new symbol. Again pick the two smallest probabilities and tie them together to form a new probability. Each time we perform the combination of two symbols we reduce the total number of symbols by one. Whenever we tie together two probabilities (nodes) we label the two branches with a '1' and a '0'.

Fig. 1.11 Combining probabilities in Huffman coding.

Example 1.14 This example shows that Huffman coding is not unique. Consider a DMS with seven possible symbols x_i, i = 1, 2, ..., 7 and the corresponding probabilities P(x1) = 0.46, P(x2) = 0.30, P(x3) = 0.12, P(x4) = 0.06, P(x5) = 0.03, P(x6) = 0.02, and P(x7) = 0.01.

Symbol   Probability   Self-Information   Codeword   Codeword Length
x1       0.46          1.1203             1          1
x2       0.30          1.7370             00         2
x3       0.12          3.0589             010        3
x4       0.06          4.0589             0110       4
x5       0.03          5.0589             01110      5
x6       0.02          5.6439             011110     6
x7       0.01          6.6439             011111     6

The entropy of the source is found out to be

H(X) = -Σ_{i=1}^{7} P(x_i) log2 P(x_i) = 1.9781 bits

Fig. 1.13 Huffman coding for Example 1.14.

and the average number of binary digits per symbol is calculated to be

R = Σ_{i=1}^{7} n_i P(x_i) = 1(0.46) + 2(0.30) + 3(0.12) + 4(0.06) + 5(0.03) + 6(0.02) + 6(0.01) = 1.9900 bits

The efficiency of this code is η = (1.9781/1.9900) = 0.9940.

We shall now see that Huffman coding is not unique. Consider the combination of the two smallest probabilities (symbols x6 and x7). Their sum is equal to 0.03, which is equal to the next higher probability corresponding to the symbol x5. So, for the second step, we may choose to put this combined probability (belonging to, say, symbol x6') higher than, or lower than, the probability of the symbol x5.
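A minimal Huffman coder, written here as an independent sketch rather than the book's own code, reproduces the numbers of Example 1.14: codeword lengths 1, 2, 3, 4, 5, 6, 6, an average length of 1.99 bits, an entropy of about 1.9781 bits and an efficiency of about 0.994.

```python
import heapq, math

def huffman_lengths(probs):
    """Return the codeword length of each symbol produced by Huffman's algorithm.

    Repeatedly merge the two least probable nodes; every merge adds one bit to
    the length of every symbol contained in the two merged nodes.
    """
    heap = [(p, [i]) for i, p in enumerate(probs)]   # (probability, symbols in node)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1                          # one more branch above these symbols
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

probs = [0.46, 0.30, 0.12, 0.06, 0.03, 0.02, 0.01]
lengths = huffman_lengths(probs)
H = -sum(p * math.log2(p) for p in probs)            # source entropy
R = sum(n * p for n, p in zip(lengths, probs))       # average codeword length
print(lengths)            # e.g. [1, 2, 3, 4, 5, 6, 6] (ties may be broken differently)
print(round(H, 4))        # 1.9781
print(round(R, 4))        # 1.99
print(round(H / R, 4))    # efficiency, about 0.994
```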
CHAPTER TWO
Channel Capacity and Coding

"Experimentalists think that it is a mathematical theorem while the mathematicians believe it to be an experimental fact."
- Gabriel Lippmann (1845-1921), remarking to Poincaré on the Gaussian law of errors

Learning Objectives
After studying this chapter the students will be able to:
LO1 Categorise different types of channels and channel models.
LO2 Determine channel capacity mathematically and the capacity of a given channel.
LO3 Explain the need for channel coding in the context of the Noisy Channel Coding Theorem.
LO4 State and prove the Information Capacity Theorem and discuss its ramifications.
LO5 Discuss random selection of codes and define the cutoff rate.

2.1 Introduction

This chapter comes with a video overview by the author. Scan here to know more or visit http://qrcode.lipick.com/index.php/589

In the previous chapter we saw that most of the natural sources have inherent redundancies and it is possible to compress data by removing these redundancies. Compression is possible by different source coding techniques. After efficient representation of source symbols by the minimum possible number of bits, we need to transmit these bit-streams over channels (e.g., telephone lines, optical fibres, wireless channels). These bits may be transmitted as they are (for baseband communications), or after modulation (for passband communications). Unfortunately, all real-life channels are noisy (this is not so unfortunate for those who make a living out of designing communication systems for noisy channels!). The term noise designates unwanted waves that tend to disturb the transmission and processing of the wanted signals in communication systems. The sources of noise may be external to the system (e.g., atmospheric noise, man-made noise) or internal to the system (e.g., thermal noise, shot noise etc.). In effect, the bit stream obtained at the receiver end is likely to be different from the bit stream that is transmitted.

In the case of passband communication, the demodulator processes the channel-corrupted waveform and reduces each waveform to a scalar or a vector that represents an estimate of the transmitted data symbols. The detector, which follows the demodulator, may decide on whether the transmitted bit is a 0 or a 1. This is called hard decision decoding. This decision process at the decoder is like a binary quantization with two levels. If there are more than 2 levels of quantization, the detector is said to perform soft decision decoding. In the extreme case, no quantization is performed for soft decision decoding.

The use of hard decision decoding causes an irreversible loss of information at the receiver. Suppose the modulator sends only binary symbols but the demodulator has an alphabet with Q symbols. Assuming the use of the quantizer depicted in Fig. 2.1 (a), we have Q = 8. Such a channel is called a Binary-input Q-ary-output Discrete Memoryless Channel. The corresponding channel is shown in Fig. 2.1 (b). The decoder performance depends on the location of the representation levels of the quantizer, which in turn depends on the signal level and the noise power. Accordingly, the demodulator must incorporate automatic gain control in order to realize an effective multilevel quantizer. It is clear that the construction of such a decoder is more complicated than the hard decision decoder. However, soft decision decoding can provide significant improvement in performance over hard decision decoding.

Fig. 2.1 (a) Transfer characteristic of a multilevel quantizer. (b) Binary input Q-ary output discrete memoryless channel.
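To make the hard versus soft decision distinction concrete, the sketch below quantizes a noisy bipolar (+1/-1) demodulator output either to a single bit (hard decision) or to Q = 8 uniform levels, i.e., a 3-bit soft decision. It is a simplified illustration, not the quantizer of Fig. 2.1; the signal mapping, noise level and quantizer range are assumed values.

```python
import random

def hard_decision(r):
    """Binary quantization: decide bit 1 if the received sample is non-negative, else 0."""
    return 1 if r >= 0 else 0

def soft_decision(r, q_levels=8, max_amp=2.0):
    """Uniform Q-level quantizer over [-max_amp, +max_amp]; returns a level index 0..Q-1."""
    step = 2 * max_amp / q_levels
    index = int((r + max_amp) // step)
    return min(max(index, 0), q_levels - 1)    # clip samples that fall outside the range

random.seed(1)
tx_bit = 1
tx_symbol = 1.0 if tx_bit else -1.0            # bipolar mapping (assumed)
r = tx_symbol + random.gauss(0, 0.5)           # received sample corrupted by Gaussian noise
print(r, hard_decision(r), soft_decision(r))   # the soft output keeps reliability information
```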
There are three balls that a digital communication engineer must juggle: (i) the transmit signal power, (ii) the channel bandwidth, and (iii) the reliability of the communication system (in terms of the bit error rate). Channel coding allows us to trade off one of these commodities (signal power, bandwidth or reliability) with respect to the others. In this chapter, we will study how to achieve reliable communication in the presence of noise. We shall ask ourselves questions like: how many bits per second can be sent reliably over a channel of given bandwidth and for a given signal to noise ratio (SNR)? For that, we begin by studying channel models first.

In this chapter
We will start with defining the different types of channel models, such as the Binary Symmetric Channel, which is a special case of the Discrete-input, Discrete-output channel, the Discrete Memoryless Channel, the Multiple Input Multiple Output channel, the Relay Channel, the Multiple Access Channel and the Broadcast Channel, in LO1.
Next, we will define channel capacity and learn how to calculate the capacity of a given channel. We will look at specific examples of the Binary Symmetric Channel, the Binary Erasure Channel and Weakly Symmetric Channels, through LO2.
We will then motivate the need for channel coding and look at the Noisy Channel Coding Theorem. We will also discuss what is meant by the achievability of a rate. All this in LO3.
Next, we will state and prove the Information Capacity Theorem, and discuss reliable communication over unreliable channels. We will apply this knowledge to study parallel Gaussian channels, the capacity of Multiple Input Multiple Output channels and the capacity region for Multiple Access Channels, in LO4.
Finally, in LO5, we will discuss random selection of codes and define the Cutoff Rate.

2.2 Channel Models

We have already come across the simplest of the channel models, the Binary Symmetric Channel (BSC), in the previous chapter. If the modulator employs binary waveforms and the detector makes hard decisions, then the channel may be viewed as one in which a binary bit stream enters at the transmitting end and another bit stream comes out at the receiving end. The BSC is shown in Fig. 2.2 (a). Figure 2.2 (b) shows one possible output for a binary image sent through a given BSC with p = 0.1.

Fig. 2.2 (a) A binary symmetric channel (BSC). (b) How an image might look after transmission through a BSC.

This binary Discrete-input, Discrete-output channel is characterized by the set X = {0, 1} of possible inputs, the set Y = {0, 1} of possible outputs, and a set of conditional probabilities that relate the possible outputs to the possible inputs. Let the noise in the channel cause independent errors in the transmitted binary sequence with average probability of error p. Then,

P(Y = 0 | X = 1) = P(Y = 1 | X = 0) = p
P(Y = 1 | X = 1) = P(Y = 0 | X = 0) = 1 - p
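The effect shown in Fig. 2.2 (b) is easy to reproduce in software: pass a bit stream through a BSC that flips each bit independently with probability p. The sketch below is illustrative; p = 0.1 matches the figure, while the input bits and the random seed are arbitrary choices.

```python
import random

def bsc(bits, p, rng):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ 1 if rng.random() < p else b for b in bits]

rng = random.Random(42)
tx = [rng.randint(0, 1) for _ in range(20)]   # arbitrary transmitted bit stream
rx = bsc(tx, p=0.1, rng=rng)
errors = sum(t != r for t, r in zip(tx, rx))
print("bit errors:", errors, "out of", len(tx))
```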
Fig. 2.3 A composite discrete-input, discrete-output channel formed by including the modulator and demodulator/detector (binary encoder, modulator, channel, demodulator/detector, decoder).

The BSC is a special case of a composite Discrete-input, Discrete-output channel, depicted in Fig. 2.3. Let the input to the channel be q-ary symbols, i.e., X = {x_0, x_1, ..., x_{q-1}}, and let the output of the detector at the receiving end of the channel consist of Q-ary symbols, i.e., Y = {y_0, y_1, ..., y_{Q-1}}. We assume that the channel and the modulation are both memoryless. The inputs and the outputs can then be related by a set of qQ conditional probabilities

P(Y = y_j | X = x_i) = P(y_j | x_i)     (2.2)

where i = 0, 1, ..., q-1 and j = 0, 1, ..., Q-1. This channel is known as a Discrete Memoryless Channel (DMC) and is depicted in Fig. 2.4.

Fig. 2.4 A discrete memoryless channel with q-ary input and Q-ary output.

Definition 2.1 The conditional probability P(y_j | x_i) is defined as the Channel Transition Probability and is denoted by p_ij.

Definition 2.2 The conditional probabilities {P(y_j | x_i)} that characterize a DMC can be arranged in the matrix form P = [p_ij]. P is called the Probability Transition Matrix for the channel. Note: Sometimes, P is also referred to as the Transition Probability Matrix, the Channel Transition Matrix or the Probability Matrix.

So far we have discussed a single channel with discrete inputs and discrete outputs. We can decouple the modulator and the demodulator from the physical channel (as depicted in Fig. 2.3). Thus, the input to the channel will be a waveform and the output of the channel will also be a waveform. These are called Waveform Channels. Such channels are typically associated with a given bandwidth W. Suppose x(t) is a bandlimited input to a waveform channel and y(t) is the output of the channel. Then,

y(t) = x(t) + n(t)     (2.3)

where n(t) is the additive noise at the receiver. It is also possible to realize multiple channels between the transmitter and receiver. Such channels can be readily realized in wireless communication scenarios by using multiple antennas at the transmitter and receiver. The four obvious combinations are as follows:

(i) Single Input Single Output (SISO): This refers to the familiar configuration with a single antenna both at the transmitter and the receiver.
(ii) Single Input Multiple Output (SIMO): This refers to the configuration with a single antenna at the transmitter and multiple antennas at the receiver.
(iii) Multiple Input Single Output (MISO): This refers to the configuration with multiple antennas at the transmitter but only a single antenna at the receiver.
(iv) Multiple Input Multiple Output (MIMO): This refers to the most general configuration with multiple antennas both at the transmitter and the receiver.

Consider a MIMO system with M_T transmit antennas and M_R receive antennas. Let us denote the impulse response between the j-th (j = 1, 2, ..., M_T) transmit antenna and the i-th (i = 1, 2, ..., M_R) receive antenna by h_ij(τ; t). Note that here we are considering the waveform channel, and the modulator/demodulator is not a part of the channel. The MIMO channel can be represented using an M_R x M_T matrix as follows:

H(τ; t) = [ h_11(τ; t)    h_12(τ; t)    ...   h_1M_T(τ; t)
            h_21(τ; t)    h_22(τ; t)    ...   h_2M_T(τ; t)
            ...
            h_M_R1(τ; t)  h_M_R2(τ; t)  ...   h_M_RM_T(τ; t) ]

The variable t is used to capture the time-varying nature of the channel; a wireless channel is time varying. If a signal s_j(t) is transmitted from the j-th transmit antenna, the signal received at the i-th receive antenna is given by

y_i(t) = Σ_{j=1}^{M_T} h_ij(τ; t) ⊛ s_j(t),    i = 1, 2, ..., M_R

where ⊛ denotes convolution. The input-output relation of a MIMO channel can be expressed succinctly in matrix notation as

y(t) = H(τ; t) ⊛ s(t)     (2.6)

where s(t) = [s_1(t) s_2(t) ... s_{M_T}(t)]^T and y(t) = [y_1(t) y_2(t) ... y_{M_R}(t)]^T. Each link between a pair of transmit and receive antennas can be independently represented as a discrete memoryless channel.
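Definition 2.2 suggests a compact numerical representation of a DMC. The sketch below is illustrative only: a channel with q = 2 inputs and Q = 3 outputs is described by an assumed probability transition matrix, and the output distribution is obtained by weighting each row by the corresponding input probability.

```python
# Probability transition matrix P[i][j] = P(y_j | x_i) for a binary-input,
# ternary-output DMC; the numerical values are assumed for illustration.
P = [[0.80, 0.15, 0.05],   # P(y | x = x0)
     [0.05, 0.15, 0.80]]   # P(y | x = x1)

p_x = [0.5, 0.5]           # input distribution (assumed)

# P(y_j) = sum_i P(x_i) P(y_j | x_i)
p_y = [sum(p_x[i] * P[i][j] for i in range(len(p_x))) for j in range(len(P[0]))]
print(p_y)                 # approximately [0.425, 0.15, 0.425]

# Sanity check: every row of a transition matrix must sum to 1.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in P)
```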
We can also have relay channels, where there are a source, a destination and intermediate relay nodes that aid the communication between the source and the destination, as shown in Fig. 2.5 (a). There could be several ways to facilitate the transfer of information from the source to the destination by hopping over the intermediate nodes. One possibility is that each relay node simply amplifies the received signal and forwards it to the next relay node, maintaining a fixed average transmit power. This protocol is known as the Amplify-and-Forward (AF) scheme. Alternately, the relay node can first decode the received signal and then regenerate the signal before forwarding it to the next relay node. The processing of the signal at each relay node requires making a hard decision. This protocol is known as the Decode-and-Forward (DF) scheme.

So far we have considered only a single transmitter and a single receiver (though each one of them could have multiple antennas!). In real life, there can be multiple senders and multiple receivers (each still having the possibility of multiple antennas!). Suppose M transmitters (say, mobile users) want to communicate with a single receiver (say, the base station) over a common channel, as depicted in Fig. 2.5 (b). This scenario is known as a Multiple Access Channel. We can also reverse the scenario. Suppose a single transmitter (say, a low earth orbit satellite) wants to communicate with M receivers (say, the home dish antennas) over a common channel, as shown in Fig. 2.5 (c). This is an example of a Broadcast Channel.

Fig. 2.5 (a) Relay Channel. (b) Multiple Access Channel. (c) Broadcast Channel.

Definition 2.7 A rate r is said to be achievable if there exists a coding scheme (n, k) such that the maximal probability of error tends to 0 as n tends to infinity. The (n, k) code may also be expressed as a (M, n) code, where M = 2^k.

Let us now introduce the concept of time in our discussion. We wish to look at how many bits per second can be sent over a given channel with arbitrarily low error rates. Suppose we have a discrete memoryless source with source alphabet X and entropy H(X) bits per source symbol, and let the source generate a symbol every T_s seconds. Then the average information rate of the source is H(X)/T_s bits per second. Let us assume that the channel can be used once every T_c seconds, and that the capacity of the channel is C bits per channel use. Then the channel capacity per unit time is C/T_c bits per second. We now state Shannon's second theorem, known as the Noisy Channel Coding Theorem or simply the Channel Coding Theorem.

Theorem 2.1 Noisy Channel Coding Theorem
(i) Let a DMS with an alphabet X have entropy H(X) and produce symbols every T_s seconds. Let a discrete memoryless channel have capacity C and be used once every T_c seconds. Then, if

H(X)/T_s <= C/T_c

there exists a coding scheme for which the source output can be transmitted over the noisy channel and reconstructed with an arbitrarily small probability of error.
(ii) Conversely, if

H(X)/T_s > C/T_c

it is not possible to transmit information over the channel and reconstruct it with an arbitrarily small probability of error.

The parameter C/T_c is called the Critical Rate. Thus all rates below capacity, C, are achievable. The channel coding theorem is a very important result in information theory. The theorem specifies the channel capacity, C, as a fundamental limit on the rate at which reliable communication can be carried out over an unreliable (noisy) DMS channel. It should be noted that the channel coding theorem tells us about the existence of codes that can achieve reliable communication in a noisy environment. Unfortunately, it does not give us the recipe for constructing these codes. Therefore, channel coding is still an active area of research, as the search for better and better codes is still going on. From the next chapter onwards, we shall study some good channel codes.
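The inequality in Theorem 2.1 is a one-line check. The sketch below uses made-up values of H(X), T_s, C and T_c purely to show how the source information rate is compared against the critical rate C/T_c.

```python
def reliable_communication_possible(H_X, T_s, C, T_c):
    """Noisy channel coding theorem condition: H(X)/T_s <= C/T_c."""
    source_rate = H_X / T_s     # bits per second produced by the source
    critical_rate = C / T_c     # bits per second the channel can support
    return source_rate <= critical_rate

# Assumed values: a source emitting 1 bit of entropy every 1 ms, over a channel
# with capacity 0.5 bit per use.
print(reliable_communication_possible(H_X=1.0, T_s=1e-3, C=0.5, T_c=0.4e-3))   # True
print(reliable_communication_possible(H_X=1.0, T_s=1e-3, C=0.5, T_c=0.6e-3))   # False
```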
Therefore, channel coding is still an active area of research as the searel ode ill goir Cees sl going on Prom the next chapter nrc we shall study some good chamel eo MS channel, shoul codes that cat the recipe 5) on Example 2.4 Consi der a DMS source that emits e i : squally likely bi 5 ‘every 7, seconds, This entropy for this binary source is ee ee H(p) == log, p ~ (1 ~p)log,(1 =p) = 1 bit A) J The information rate of this source is H(X) 1, bits/second Suppose we wish to transmit the source symbols over a noisy channel. ‘The source sequence is applied toa channel coder with code rate r. This channel coder uses the channel once every T, seconds to send the coded sequence. We want to have reliable communication (the probability of error as small as desired). From the channel coding theorem, if pe TST (2.13) we can make the probability of error as small as desired by a suitable choice of a channel coding scheme, and hence have reliable communication. We note that the ratio is equal to the code rate of the coder. Therefore, ff, a 2.14) T. (2.14) Hence the condition for reliable communication can be rewritten as r wie! pos the ‘i uuth AWG —_-messe__ “Wench Looed pas Wen only the tow Fo ane dibboas cf So, mace Fi A 0 ee Awan eC { j Tefbl) tahun fk | bo sin panence ol ase 1 ete = t \ coro cathy it a pe : = et) [ “ip Jeu a a [punailon fa eves Dydulh a hen ogo Jokes aire sate me the” af gin She acting to, Ales ba es “Ohi: tho aE oi bSaio “al + jos ae pone Page No] | | Date i ou i i ‘- oat ] , N= Nace POU ey. éel__Ng = wale 9) Saeal Ve — C Spee nflege ? Vite mms Value {tisk bay {{[Page No.l \ Date | Ti wo know” _cthod- Nese = Vo a" Noun Sana fousey S$ = Ve i 7g 3 Noise" Pousu vt . R Suppose fold fo S- VFA N= Va iit! Mgt [Su g Maye TN Now, madRuMUrE, 900. . Jouels Seen sen euan dion Ly ueirs! Sey —slch: ae ie x wheoe ca PAs value 01 trorieuad — An Jw RMS. Pe ajav noise gee we Knoro tho he og ena) Meer con he > boa ipa ce jog. (4) Patvioua choles which das Hh ppt Visunl_p} Eshop ae te Page No ! Date Ad [sant mann Ne 2 log es a ir Sin iN \ Jo\/2 v NY ws \ “ah ‘bY lace! aaa n) & if “How if Hho! sheaal “Calas rans fac = fc = aa ( i tarts | secana| © tow -prtindless.. chown e) ao t Ao Je =O, bal apace tor armies er AD hae _qitke foe. ip _— endl —tagay ON ma otis paste a4 wig. Loeececee -- Boaydui oth, £0 ASnoud o| inly _ remsniitted ail) he || sinecoased Wiig) adn aygvenehlas O@sse fest IL. gq rot _- Cane C > Maite Bea 4 C= than Nowe yi alo Satu Ru=O -sham—etoo be WiC: satus Livdcke prot “We eguctticn (AL. ie ennui) an hannon— Matthey -"ege [+ 4 a
