This chapter introduces the basics of digital signal (DS) processors, examines their architectures and hardware units, investigates fixed-point and floating-point formats, and illustrates the implementation of digital filters in real time.

9.1 Digital Signal Processor Architecture

Unlike microprocessors and microcontrollers, digital signal (DS) processors have special features that support operations such as the fast Fourier transform (FFT), filtering, convolution and correlation, and real-time sample-based and block-based processing. Therefore, DS processors use a different, dedicated hardware architecture.

We first compare the architecture of the general microprocessor with that of the DS processor. The design of general microprocessors and microcontrollers is based on the Von Neumann architecture, developed in a research paper written by John von Neumann and others in the 1940s, which suggested that computer instructions, as we shall discuss, be stored in memory along with data rather than built from special wiring. Figure 9.1 shows the Von Neumann architecture. As shown in Figure 9.1, a Von Neumann processor contains a single memory for programs and data, a single bus for memory access, an arithmetic unit, and a program control unit, and its central processing unit (CPU) proceeds serially through fetch and execution cycles. The CPU fetches an instruction from memory and decodes it to figure out what operation to do, then executes the instruction.

FIGURE 9.1 General microprocessor based on the Von Neumann architecture (program control unit, address generator, address bus, program and data memory, data bus, and input/output devices).

The instruction (in machine code) has two parts: the opcode and the operand. The opcode specifies what the operation is, that is, it tells the CPU what to do. The operand informs the CPU what data to operate on. These instructions may modify memory or input and output (I/O). After an instruction is completed, the cycles resume for the next instruction.
Only one instruction or one piece of data can be retrieved at a time. Since the processor proceeds in this serial fashion, most units stay in a wait state. As noted, the Von Neumann architecture operates its cycles of fetching and execution by fetching an instruction from memory, decoding it via the program control unit, and finally executing the instruction. When execution requires data movement (that is, data to be read from or written to memory), the next instruction will be fetched only after the current instruction is completed. The Von Neumann architecture has this bottleneck mainly due to the use of a single memory for both program instructions and data. Increasing the speed of the bus, memory, and computational units can improve performance, but not significantly.

To accelerate the execution speed of digital signal processing, DS processors are designed based on the Harvard architecture, which originated from the Mark I relay-based computer built by IBM in 1944 at Harvard University. This computer stored its instructions on punched tape and its data in relay latches. Figure 9.2 shows today's Harvard architecture. As depicted, the Harvard processor has two separate memory spaces: one is dedicated to the program code, while the other is employed for data. Hence, to accommodate the two memory spaces, two corresponding address buses and two data buses are used. In this way, the program memory and the data memory have their own connections to the program memory bus and the data memory bus, respectively. This means that the Harvard processor can fetch a program instruction and a datum in parallel, the former via the program memory bus and the latter via the data memory bus.

FIGURE 9.2 Digital signal processors based on the Harvard architecture (program control unit, program memory and data memory with separate address and data buses, arithmetic logic unit, multiplier/accumulator, and shift unit).
There is an additional unit called the multiplier and accumulator (MAC), which is the dedicated hardware used for digital filtering operations. The last additional unit, the shift unit, is used for the scaling operation in fixed-point implementation when the processor performs digital filtering.

Figure 9.3 describes the execution cycle based on the Von Neumann architecture. In the fetch cycle, the processor fetches the instruction code from memory, and the control unit decodes it to determine the operation. Next comes the execute cycle: based on the decoded information, execution modifies the content of the registers or of memory. Once this is completed, the processor fetches the next instruction and continues; it operates on one instruction at a time in a serial fashion. In the Harvard architecture, by contrast, the processor uses two independent registers during the execution cycle: one register holds the instruction code fetched from program memory, while the other holds the data on which the instruction operates.

FIGURE 9.3 Execution cycle based on the Von Neumann architecture.

FIGURE 9.4 Execution cycle based on the Harvard architecture.

As shown in Figure 9.4, the execute and fetch cycles are overlapped. We call this the pipelining operation. The DS processor performs one execution cycle while also fetching the next instruction to be executed. Hence, the processing speed is dramatically increased.

The Harvard architecture is preferred for all DS processors due to the requirements of most DSP algorithms, such as filtering, convolution, and FFT, which need repetitive arithmetic operations, including multiplications, additions, memory accesses, and heavy data flow through the CPU. For other applications, such as those depending on simple microcontrollers with less of a timing requirement, the Von Neumann architecture may be a better choice, since it requires much less silicon area and is thus less expensive.

9.2 Digital Signal Processor Hardware Units

In this section, we briefly discuss the special DS processor hardware units.
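Before looking at those units individually, it may help to see the workload they are built for. The repetitive multiply-and-accumulate inner loop of FIR filtering is the canonical example; each pass of the loop below corresponds to one hardware MAC operation, with the coefficient and the sample fetched in parallel on a Harvard machine. This is a plain-Python illustration only (the three-tap coefficients are made up for the example), not processor code:

```python
def fir_output(coeffs, samples):
    """One FIR filter output y(n) = sum of b[k] * x(n - k).

    Each loop pass mirrors one hardware MAC operation: fetch a
    coefficient, fetch a sample, multiply, and accumulate.
    """
    acc = 0.0  # the accumulator register
    for b, x in zip(coeffs, samples):
        acc += b * x  # multiply-accumulate
    return acc

# Made-up 3-tap filter; samples ordered newest first: x(n), x(n-1), x(n-2).
print(fir_output([0.5, 0.25, 0.25], [4.0, 8.0, 4.0]))  # 5.0
```

A filter of length N requires N such MAC operations per output sample, which is why dedicated MAC hardware and dual memory buses pay off so dramatically.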
9.2.1 Multiplier and Accumulator

As compared with the general microprocessors based on the Von Neumann architecture, the DS processor uses the multiplier and accumulator (MAC), a special hardware unit dedicated to DSP, to enhance the speed of digital filtering. The corresponding instruction is generally referred to as the MAC operation. The basic structure of the MAC is shown in Figure 9.5.

FIGURE 9.5 The multiplier and accumulator (MAC) dedicated to DSP.

As shown in Figure 9.5, in a typical hardware MAC the multiplier has a pair of input registers, each holding a 16-bit input to the multiplier. The result of the multiplication is accumulated in a 32-bit accumulator unit. The output register holds the double precision data from the accumulator.

9.2.2 Shifters

In digital filtering, a scaling operation is required to prevent overflow. A scaling-down operation shifts data to the right, while a scaling-up operation shifts data to the left. Shifting data to the right is equivalent to dividing the data by 2 and truncating the fractional part. For example, with 3-bit data words, shifting 011 (decimal 3) to the right gives 001; that is, 3/2 truncates to 1. Shifting data to the left is equivalent to multiplying the data by 2; shifting 011 to the left gives 110, that is, 3 × 2 = 6. The DS processor often shifts data by several bits for each data word. To speed up such operations, a special hardware shift unit is designed to accommodate the scaling operation, as depicted in Figure 9.2.

9.2.3 Address Generators

The DS processor generates the addresses for each datum on the data buffer to be processed. A special hardware unit for circular buffering is used (see the address generator in Figure 9.2). Figure 9.6 describes the basic mechanism of circular buffering for a buffer holding eight data samples. In circular buffering, a pointer is used and always points to the newest data sample, as shown in the figure. After the next sample is obtained from analog-to-digital conversion (ADC), the datum is placed at the location of x(n − 7), and the oldest sample is pushed out. Thus, the location that held x(n − 7) becomes the location for the current sample.
The original location for x(n) becomes the location for the past sample x(n − 1). The process continues in the same manner: for each new data sample, only one location on the circular buffer needs to be updated.

FIGURE 9.6 Illustration of circular buffering.

The circular buffer acts like a first-in/first-out (FIFO) buffer, but each datum on the buffer does not have to be moved. Figure 9.7 gives a simple illustration of a 2-bit (four-location) circular buffer. In the figure, there is a data flow to the ADC (a, b, c, d, e, f, g, ...) and a circular buffer initially containing a, b, c, and d. The pointer specifies the current datum d, and the equivalent FIFO buffer is shown on the right side with the current datum d at the top of the memory. When e comes in, as shown in the middle drawing of Figure 9.7, the circular buffer changes the pointer to the next position and overwrites the old datum a with the new datum e. It costs the pointer only one movement to update one datum in one step. On the right side, however, the FIFO has to move each of the other data down to let in the new datum e at the top; for this FIFO, it takes four data movements. In the bottom drawing of Figure 9.7, the incoming datum f for both the circular buffer and the FIFO buffer continues to confirm our observations. In finite impulse response (FIR) filtering, the data buffer size can reach several hundred samples. Hence, using the circular buffer significantly enhances the processing speed.

FIGURE 9.7 Comparison of a circular buffer and its equivalent FIFO buffer for the data flow a, b, c, d, e, f, g, ...

9.3 Digital Signal Processors and Manufacturers

DS processors fall into two categories: general-DSP processors and special-DSP processors. The general-DSP processor is designed and optimized for applications such as digital filtering, correlation, convolution, and FFT. In addition to these applications, the special-DSP processor has features that are optimized for unique applications.
Examples of such unique applications include audio processing, compression, echo cancellation, and adaptive filtering. Here we concentrate on the general-DSP processor.

The major manufacturers in the DSP industry are Texas Instruments (TI), Analog Devices, and Motorola. TI and Analog Devices offer both fixed-point and floating-point DSP families, while Motorola offers fixed-point DSP families. We will concentrate on the TI series, review their architectures, and study real-time implementation using the fixed-point and floating-point formats.

9.4 Fixed-Point and Floating-Point Formats

In order to process real-world data, we need to select an appropriate DS processor, as well as a DSP algorithm, for a certain application. A key part of the selection is how the processor's CPU performs arithmetic. A fixed-point DS processor represents data in a 2's complement integer format and manipulates data using integer arithmetic, while a floating-point processor represents numbers using a mantissa (fractional part) and an exponent in addition to the integer format, and operates on data using floating-point arithmetic (discussed in a later section).

Since the fixed-point DS processor operates on the integer format, which represents only a very narrow dynamic range of integer numbers, problems such as overflow during data manipulation may occur. Hence, much more coding effort must be spent dealing with such problems. As we shall see, we may instead use floating-point DS processors, which offer a wider dynamic range of data, so that coding becomes much easier. However, the floating-point DS processor contains more hardware units, to handle both integer arithmetic and floating-point arithmetic, and hence is more expensive, and slower than fixed-point processors in terms of instruction cycles. It is usually a choice for prototyping or proof-of-concept development in the early design stage.
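The overflow hazard that makes fixed-point coding laborious is easy to demonstrate. In the sketch below (plain Python for illustration; `wrap16` is our own helper that imitates what a 16-bit fixed-point register does, not a vendor routine), the same addition succeeds in floating point but wraps around in a 16-bit integer format:

```python
def wrap16(n):
    """Wrap an integer into the 16-bit 2's complement range
    [-32768, 32767], the way a 16-bit fixed-point register would."""
    n &= 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

# Fixed-point view: the sum of two large positive numbers wraps
# around into a negative result (overflow).
print(wrap16(30000 + 20000))   # -15536, not 50000

# Floating-point view: the exponent scales the dynamic range,
# so the same sum is represented with no trouble.
print(30000.0 + 20000.0)       # 50000.0
```

Avoiding such wraparound is exactly what the scaling methods of Section 9.4 are for.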
When it is time to make the DSP into an application-specific integrated circuit (ASIC), that is, a chip designed for a particular application, a dedicated hand-coded fixed-point implementation can be the best choice in terms of performance and small silicon area. The formats used in DSP implementation can thus be classified as fixed point or floating point.

9.4.1 Fixed-Point Format

We begin with the 2's complement representation. Considering a 3-bit 2's complement word, we can represent all the decimal numbers shown in Table 9.1.

TABLE 9.1 A 3-bit 2's complement number representation.

Decimal Number    2's Complement
 3                011
 2                010
 1                001
 0                000
-1                111
-2                110
-3                101
-4                100

Let us review the 2's complement number system using Table 9.1. Converting a decimal number to its 2's complement requires the following steps:

1. Convert the magnitude of the decimal number to its binary equivalent using the required number of bits.
2. If the decimal number is positive, its binary number is its 2's complement representation; if the decimal number is negative, perform the 2's complement operation, in which we negate the binary number by changing the logic 1's to logic 0's and the logic 0's to logic 1's and then add a logic 1 to the result. For example, the decimal number 3 is converted to its 3-bit 2's complement as 011. To convert the decimal number −3, we first get the 3-bit binary number for the magnitude, that is, 011. Next, negating the binary number 011 yields 100. Finally, adding a binary logic 1 achieves the 3-bit 2's complement representation of −3, that is, 100 + 1 = 101, as shown in Table 9.1.

As we can see, the 3-bit 2's complement number system has a dynamic range from −4 to 3, which is very narrow. Since the basic DSP operations include multiplications and additions, the results of operations can cause overflow problems. Let us examine the multiplications in Example 9.1, each using the 2's complement.

Example 9.1.
a. Using the 3-bit 2's complement system, perform the multiplications (1) 2 × (−1) and (2) 2 × (−3) by multiplying the magnitudes and applying the 2's complement to each result.
Solution:
a. 1. For 2 × (−1), we multiply the magnitudes:

     010
   × 001
   -----
     010
    000
  + 000
  ------
   00010

and the 2's complement of 00010 is 11110. Removing the two extended sign bits gives 110. The answer is 110 (−2), which is within the dynamic range of the system.

2. For 2 × (−3), we multiply the magnitudes:

     010
   × 011
   -----
     010
    010
  + 000
  ------
   00110

and the 2's complement of 00110 is 11010. Removing the two extended sign bits gives 010. Since the binary number 010 is 2, which is not the −6 we expect, overflow occurs; that is, the result of the multiplication (−6) is out of our dynamic range (−4 to 3).

Let us design a system treating all the decimal values as fractional numbers, so that we obtain the fractional binary 2's complement system shown in Table 9.2.

TABLE 9.2 A 3-bit 2's complement system using fractional representation.

Decimal Number    Decimal Fraction    2's Complement
 3                 3/4                0.11
 2                 2/4                0.10
 1                 1/4                0.01
 0                 0                  0.00
-1                -1/4                1.11
-2                -2/4                1.10
-3                -3/4                1.01
-4                -4/4 = -1           1.00

To become familiar with the fractional binary 2's complement system, let us convert a positive fraction 3/4 and a negative fraction −1/4 from their decimal forms to their 2's complements. Since

3/4 = 0 × 2^0 + 1 × 2^-1 + 1 × 2^-2,

its 2's complement is 011. Note that we did not mark the binary point. Again, since

1/4 = 0 × 2^0 + 0 × 2^-1 + 1 × 2^-2,

its positive-number 2's complement is 001. For the negative number −1/4, applying the 2's complement operation to the binary number 001 leads to 110 + 1 = 111, as shown in Table 9.2.

Now let us focus on the fractional binary 2's complement system. The data are normalized to the fractional range from −1 to 1 − 2^-2 = 3/4. When we carry out multiplications with two fractions, the result is still a fraction, so that multiplication overflow can be prevented. Let us verify this with the multiplication (0.10) × (1.01), which corresponds to the overflow case in Example 9.1. Multiplying the magnitudes,

     0.10
   × 0.11
  -------
   0.0110

and the 2's complement of 0.0110 is 1.1010. The result in decimal form should be

-(0.0110)₂ = -(0 × 2^-1 + 1 × 2^-2 + 1 × 2^-3 + 0 × 2^-4) = -3/8.
That is, we can verify from Table 9.2 that (2/4) × (−3/4) = −6/16 = −3/8. If we truncate the last two least-significant bits to keep the 3-bit binary number, we have the approximated answer 1.10, that is,

-(1 × 2^-1 + 0 × 2^-2) = -1/2.

A truncation error occurs. The error should be bounded by 2^-2 = 1/4, and we can verify that

|-1/2 - (-3/8)| = 1/8 < 1/4.

Using such a scheme, we can avoid the overflow due to multiplications, but we cannot prevent the overflow due to addition. In the following addition example,

  0.11
+ 0.01
------
  1.00

the result 1.00 is a negative number: adding two positive fractional numbers has yielded a negative number, so overflow occurs. We see that this signed fractional number scheme only partially solves the overflow problem, namely for multiplications. Such a fractional number format is called the signed Q-2 format, where there are 2 magnitude bits plus one sign bit. The additive overflow will be tackled using a scaling method discussed in a later section.

Q-format number representation is the most common one used in fixed-point DSP implementation. It is defined in Figure 9.8.

FIGURE 9.8 Q-15 (fixed-point) format, with the implied binary point just after the sign bit.

As indicated in Figure 9.8, Q-15 means that the data are in a sign-magnitude form in which there are 15 bits for magnitude and one bit for sign; the dot shown in Figure 9.8, just after the sign bit, is the implied binary point. The number is normalized to the fractional range from −1 to 1. The range is divided into 2^16 intervals, each of size 2^-15. The most negative number is −1, while the most positive number is 1 − 2^-15. Any result from a multiplication lies within the fractional range of −1 to 1. Let us study the following examples to become familiar with Q-format number representation.

Example 9.2.
a. Find the signed Q-15 representation for the decimal number 0.560123.

Solution:
a. The conversion process is illustrated in Table 9.3. We multiply the number successively by 2.
For a positive fraction, if a product is larger than 1, we carry bit 1 to the most-significant bit (MSB) position and copy the fractional part of the product to the next line for the next multiplication by 2; if a product is less than 1, we carry bit 0. The procedure continues until all 15 magnitude bits are collected. We obtain the Q-15 format representation as

0.560123 ≈ 0.100011110110010.

Since we use only 16 bits to represent the number, a truncation error is introduced. However, this error is less than the interval size, in this case 2^-15 = 0.000030517. We shall verify this in Example 9.5. An alternative way of conversion is to scale the fraction by the number of magnitude bits, truncate to an integer, and convert that integer to binary; for example, to convert the fraction 3/4 to the Q-2 format,

(3/4) × 2^2 = 3 = 011.

TABLE 9.3 Conversion process of the Q-15 representation.

Product                        Carry
0.560123 × 2 = 1.120246        1 (MSB)
0.120246 × 2 = 0.240492        0
0.240492 × 2 = 0.480984        0
0.480984 × 2 = 0.961968        0
0.961968 × 2 = 1.923936        1
0.923936 × 2 = 1.847872        1
0.847872 × 2 = 1.695744        1
0.695744 × 2 = 1.391488        1
0.391488 × 2 = 0.782976        0
0.782976 × 2 = 1.565952        1
0.565952 × 2 = 1.131904        1
0.131904 × 2 = 0.263808        0
0.263808 × 2 = 0.527616        0
0.527616 × 2 = 1.055232        1
0.055232 × 2 = 0.110464        0 (LSB)

MSB, most-significant bit; LSB, least-significant bit.

In this way, it follows that

(0.560123) × 2^15 = 18354.1,

and converting the truncated integer 18354 to its binary representation achieves the same answer. The next example illustrates the signed Q-15 representation for a negative number.

Example 9.3.
a. Find the signed Q-15 representation for the decimal number −0.160123.

Solution:
a. Converting the same magnitude to the Q-15 format using the procedure described in Example 9.2, we have the corresponding positive number as

0.160123 ≈ 0.001010001111110.

Then, after applying the 2's complement, the Q-15 format becomes

−0.160123 ≈ 1.110101110000010.

Alternative way: since (−0.160123) × 2^15 = −5246.9, converting the truncated number −5246 to its 16-bit 2's complement yields 1110101110000010.

Example 9.4.
a. Convert the Q-15 signed number 1.110101110000010 to its decimal form.

Solution:
a. Since the number is negative, applying the 2's complement yields 0.001010001111110. Then the decimal magnitude is

2^-3 + 2^-5 + 2^-9 + 2^-10 + 2^-11 + 2^-12 + 2^-13 + 2^-14 = 0.160095,

so the decimal number is −0.160095.
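The Q-15 conversions worked out in Examples 9.2 through 9.4 can be cross-checked with a short script. This is a plain-Python sketch (the helper names `to_q15` and `from_q15` are ours); note that Python's `int()` truncates toward zero, matching the truncation used in the examples:

```python
def to_q15(x):
    """Encode a decimal fraction -1 <= x < 1 as a 16-bit Q-15 word,
    truncating as in Examples 9.2 and 9.3."""
    return int(x * 2**15) & 0xFFFF    # & 0xFFFF applies the 2's complement wrap

def from_q15(word):
    """Decode a 16-bit Q-15 word back to a decimal fraction."""
    if word >= 0x8000:                # sign bit set: negative number
        word -= 0x10000
    return word / 2**15

print(format(to_q15(0.560123), "016b"))   # 0100011110110010  (Example 9.2)
print(format(to_q15(-0.160123), "016b"))  # 1110101110000010  (Example 9.3)
print(from_q15(0b1110101110000010))       # -0.16009521484375 (Example 9.4)
```

The decoded value −0.16009521484375 differs from the original −0.160123 by less than the interval size 2^-15, as claimed.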
Example 9.5.
a. Convert the Q-15 signed number 0.100011110110010 to its decimal form.

Solution:
a. The decimal number is

2^-1 + 2^-5 + 2^-6 + 2^-7 + 2^-8 + 2^-10 + 2^-11 + 2^-14 = 0.560120.

As we know, the truncation error in Example 9.2 is bounded by 2^-15 = 0.000030517. We verify that

|0.560120 − 0.560123| = 0.000003 < 0.000030517.

Note that the larger the number of bits used, the smaller the round-off error that accompanies the conversion.

Examples 9.6 and 9.7 are devoted to illustrating data manipulations in the Q format.

Example 9.6.
a. Add the two Q-15 signed numbers 1.110101110000010 and 0.100011110110010.

Solution:
a. Performing the binary addition,

   1.110101110000010
 + 0.100011110110010
 -------------------
  10.011001100110100

and dropping the carry bit, the result is 0.011001100110100. Its decimal form can be found to be

2^-2 + 2^-3 + 2^-6 + 2^-7 + 2^-10 + 2^-11 + 2^-13 = 0.400024.

Example 9.7.
a. Perform the fixed-point multiplication of 0.25 and 0.5 in the Q-3 2's complement format.

Solution:
a. Since 0.25 = 0.010 and 0.5 = 0.100, we carry out the binary multiplication as follows:

     0.010
   × 0.100
  --------
  0.001000

Truncating the least-significant bits to convert the result back to the Q-3 format, we have

0.010 × 0.100 = 0.001.

Note that 0.001 = 2^-3 = 0.125, and we can verify directly that 0.25 × 0.5 = 0.125.

As a result, the Q-format number representation is a better choice than the 2's complement integer representation. But we need to be concerned with the following problems:

1. In converting a decimal number to its Q-N format, where N denotes the number of magnitude bits, we may lose accuracy due to the truncation error, which is bounded by the size of the interval, that is, 2^-N.

2. Addition and subtraction may cause overflow, where adding two positive numbers leads to a negative number, or adding two negative numbers yields a positive number; similarly, subtracting a positive number from a negative number may give a positive number, while subtracting a negative number from a positive number may result in a negative number.
3. Multiplying two numbers in the Q-15 format leads to a Q-30 format number, 32 bits in total. As in Example 9.7, the multiplication of two Q-3 numbers gives a Q-6 number, that is, 6 magnitude bits and a sign bit. In practice, a double-word-size register is common in a DS processor to hold the multiplication result of the MAC operation, as shown in Figure 9.9. Note that when two numbers in the Q-15 format are multiplied, the Q-30 result contains a sign-extended bit. We may get rid of it by shifting the result left by one bit, obtaining and then maintaining the Q-31 format for each MAC operation.

FIGURE 9.9 Sign bit extended Q-30 format (a sign bit, a sign-extended bit, and 30 magnitude bits).

Sometimes the number in the Q-31 format needs to be converted to Q-15; for example, the 32-bit data in the accumulator may need to be sent out for 16-bit digital-to-analog conversion (DAC), where the upper, most-significant 16 bits of the Q-31 format must be used to maintain accuracy. We can shift the Q-30 number to the right by 15 bits, or shift the Q-31 number to the right by 16 bits; the useful result is then stored in the lower 16-bit memory location. Note that after truncation, the maximum error is bounded by the interval size of 2^-15, which satisfies most applications. In using the Q format in the fixed-point DS processor, it is costly to maintain the accuracy of data manipulation.

4. Underflow can happen when the result of a multiplication is too small to be represented in the Q format. As an example, in the Q-2 system shown in Table 9.2, multiplying 0.01 × 0.01 leads to 0.0001. To keep the result in the Q-2 format, we truncate the last two bits of 0.0001 to achieve 0.00, which is zero. Hence, underflow occurs.

9.4.2 Floating-Point Format

To increase the dynamic range of number representation, a floating-point format, similar to scientific notation, is used. The general format for floating-point number representation is given by

x = M × 2^E,   (9.1)

where M is the mantissa, or fractional part, in the Q format, and E is the exponent. The mantissa and exponent are both signed numbers.
If we assign 12 bits for the mantissa and 4 bits for the exponent, the format looks like Figure 9.10. Since the 12-bit mantissa has limits between −1 and +1, the dynamic range is controlled by the number of bits assigned to the exponent: the more bits assigned to the exponent, the larger the dynamic range. The number of bits for the mantissa defines the interval size in the normalized range; for the format in Figure 9.10, the interval size is 2^-11 in the normalized range.

FIGURE 9.10 Floating-point format with a 4-bit exponent and a 12-bit mantissa.

Using the format in Figure 9.10, we can determine the most negative and the most positive numbers as

Most negative number = (1.00000000000)₂ × 2^0111 = (−1) × 2^7 = −128
Most positive number = (0.11111111111)₂ × 2^0111 = (1 − 2^-11) × 2^7 = 127.9375.

The smallest positive number is given by

Smallest positive number = (0.00000000001)₂ × 2^1000 = 2^-11 × 2^-8 = 2^-19.

As we can see, the exponent acts like a scale factor that increases the dynamic range of the number representation. We study the floating-point format in the following example.

Example 9.8.
a. Convert each of the following decimal numbers to a floating-point number using the format specified in Figure 9.10.
1. 0.1601230
2. −20.430527

Solution:
a. 1. We first scale the number: 0.160123/2^-2 = 0.640492, choosing an exponent of −2 (other choices could be 0 or −1), so that 0.160123 = 0.640492 × 2^-2. Using the 2's complement, the 4-bit exponent is 1110. Converting the value 0.640492 to the Q-11 format gives the mantissa bits 010100011111. Cascading the exponent bits and the mantissa bits yields

1110 010100011111.

2. Since −20.430527/2^5 = −0.638454, we can split the number into fractional and exponent parts as −20.430527 = −0.638454 × 2^5. Note that this conversion is not unique; other valid choices would be −0.319227 × 2^6, −0.159613 × 2^7, and so on. The 4-bit exponent is 0101. Converting the magnitude 0.638454 to the Q-11 format gives 010100011011, and negating this with the 2's complement yields the mantissa bits 101011100101. Cascading the exponent bits and the mantissa bits, we achieve

0101 101011100101.
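The conversions of Example 9.8 can be cross-checked in code. The sketch below (plain Python; the function name and its truncation behavior are our assumptions, chosen to match the worked steps) builds the 4-bit 2's complement exponent and the 12-bit sign-plus-Q-11 mantissa and cascades them:

```python
def encode(x, exponent):
    """Encode x = M * 2**exponent in the Figure 9.10 format:
    a 4-bit 2's complement exponent followed by a 12-bit
    (sign + Q-11) 2's complement mantissa."""
    m = x / 2.0**exponent                 # mantissa; requires |m| < 1
    exp_bits = format(exponent & 0xF, "04b")
    mant_bits = format(int(m * 2**11) & 0xFFF, "012b")  # truncate to Q-11
    return exp_bits + " " + mant_bits

print(encode(0.1601230, -2))    # 1110 010100011111  (Example 9.8, part 1)
print(encode(-20.430527, 5))    # 0101 101011100101  (Example 9.8, part 2)
```

As in the hand calculation, choosing a different valid exponent (for example 6 instead of 5) would produce a different but equally acceptable bit pattern.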
Floating-point arithmetic is more complicated: we must obey the rules for manipulating two floating-point numbers. Given

x1 = M1 × 2^E1 and x2 = M2 × 2^E2,

the rule for arithmetic addition is

x1 + x2 = (M1 + M2 × 2^-(E1-E2)) × 2^E1 if E1 ≥ E2,
x1 + x2 = (M1 × 2^-(E2-E1) + M2) × 2^E2 if E1 < E2.

That is, the number with the smaller exponent is shifted right until the exponents match, and the mantissas are then added. As the multiplication rule, given two properly normalized floating-point numbers

x1 = M1 × 2^E1 and x2 = M2 × 2^E2,

where 0.5 ≤ |M1| < 1 and 0.5 ≤ |M2| < 1, multiplication can be performed as

x1 × x2 = (M1 × M2) × 2^(E1+E2) = M × 2^E.

That is, the mantissas are multiplied while the exponents are added:

M = M1 × M2,
E = E1 + E2.

Examples 9.9 and 9.10 serve to illustrate these floating-point manipulations.

Example 9.9.
a. Add the two floating-point numbers achieved in Example 9.8:

1110 010100011111 = 0.640136718 × 2^-2
0101 101011100101 = −0.638183593 × 2^5.

Solution:
a. Before the addition, we change the first number so that it has the same exponent as the second, that is,

0101 000000001010 = 0.005001068 × 2^5.

Then we add the two mantissa numbers:

   000000001010
 + 101011100101
 --------------
   101011101111

and we get the floating-point number

0101 101011101111.

We can verify the result as follows:

0101 101011101111 = −(2^-1 + 2^-3 + 2^-7 + 2^-11) × 2^5 = −0.633300781 × 2^5 = −20.265625.

Example 9.10.
a. Multiply the two floating-point numbers achieved in Example 9.8:

1110 010100011111 = 0.640136718 × 2^-2
0101 101011100101 = −0.638183593 × 2^5.

Solution:
a. From the results in Example 9.8, we have the bit patterns for these two numbers:

E1 = 1110, E2 = 0101, M1 = 010100011111, M2 = 101011100101.

Adding the two exponents in 2's complement form leads to

E = E1 + E2 = 1110 + 0101 = 0011,

that is, +3, as expected, since in the decimal domain (−2) + 5 = 3.

According to the multiplication rule, the mantissas are multiplied using their corresponding positive values, and the sign of the product is determined afterward. In our example, M1 = 010100011111 is a positive mantissa. However, M2 = 101011100101 is a negative mantissa, since its MSB is 1. To perform the mantissa multiplication, we first use the 2's complement to convert M2 to its positive value.
The 2's complement of M2 gives its positive value, 010100011011, and we note that the final multiplication result must be negative. We multiply the two positive mantissas and truncate the result to 12 bits, to give

010100011111 × 010100011011 = 001101000100.

Now we need to attach the negative sign to the multiplication result. Applying the 2's complement operation, we have the mantissa

M = 110010111100.

Hence, the product is achieved by cascading the 4-bit exponent and the 12-bit mantissa:

0011 110010111100.

Converting this number back to decimal form, we verify the result to be −0.408203125 × 2^3 = −3.265625.

Next, we examine overflow and underflow in the floating-point number system. As we discussed for the fixed-point format, overflow occurs when a number is too large to be represented in the number system. Adding two mantissa numbers may lead to a number larger than 1 or less than −1, and multiplying two numbers causes the addition of their exponents, so the sum of the two exponents may go out of range. Consider the addition of the following two numbers:

0111 011000000000 + 0111 010000000000.

Note that the two exponents are the same and each is the biggest positive exponent in the 4-bit 2's complement representation. We add the two positive mantissa numbers:

   0.11000000000
 + 0.10000000000
 ---------------
   1.01000000000

The result of adding the two positive mantissas is negative; hence, overflow occurs.

Now consider the multiplication of the following two numbers:

0111 011000000000 × 0111 011000000000.

Adding the two positive exponents gives

0111 + 0111 = 1110 (negative; overflow occurs),

while multiplying the two mantissa numbers gives

0.11000000000 × 0.11000000000 = 0.10010000000 (OK).
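Before moving on, the hand computation of Example 9.10 can be verified numerically with a small decoder (plain Python; `decode` is our own helper for the Figure 9.10 format, not part of the text's notation):

```python
def decode(word):
    """Decode a '4-bit exponent, 12-bit mantissa' word in the
    Figure 9.10 format back to a decimal value."""
    exp_bits, mant_bits = word.split()
    e = int(exp_bits, 2)
    if e >= 8:               # 4-bit 2's complement: MSB set means negative
        e -= 16
    m = int(mant_bits, 2)
    if m >= 2**11:           # 12-bit 2's complement sign correction
        m -= 2**12
    return (m / 2.0**11) * 2.0**e

# The product word obtained in Example 9.10:
print(decode("0011 110010111100"))   # -3.265625

# Compare with the exact product of the two encoded operands; the small
# difference comes from truncating the mantissa product to 12 bits.
print(decode("1110 010100011111") * decode("0101 101011100101"))
```

The exact product differs from the truncated hardware-style result by only a few thousandths, consistent with the Q-11 truncation error.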
in this case, the expected resulting exponent is —14 in decimal, which However, mplement system. Hence the is too small to be presented in the 4-bit 2’s col underflow occurs, Understanding basic principles of the floating-point formats, we can next examine two floating-point formats of the Institute of Electrical and Electronies Engineers (IEEE). 9.4.3 IEEE Floating-Point Formats Single Precision Format IEEE floating-point formats are widely used in many modern DS processors There = ae types of IEEE floating-point formats (IEEE 154 stander) Oneis feat Tie ingle precision format, and the other is the IEEE double preciso? orm | e single precision format is described in Figure 9.11. me nie ee single precision floating-point standard representali0? bits for each ere 8 exponent bits E, and I sign bit S, with a total o!* seechned ton e ss € mantissa in 2’s complement positive binary fractio limits between rl ond to ene ehe mantissa is within the normalized ran iiinbs here whe We . The sign bit S is employed to indicate the sig? ° u ie oct ene wen S = | the number is negative, and when S = 0 them, pee exponent E is in excess 127 form, The value of 127 8 the from -127 to Soe rie range from 0 to 255, so that E-127 will bave ® "7 the IEEE 754 deandlea (aan: shown in Figure 9.11 can be applice cori sok cams asp eta eoresen =m 9.4 Fixedet Pelnt and Flocting-Point Format Saas, 31 30 23 22 EZ ‘exponent fraction 2 X= (AEX (LF) x 2127 figure 9.11 _ IEEE single precision floating-point for mat, 00000000: «00000 % }00000000000000 = (— 1)° x (1.05) x 2128127 _ > 00011 hose = (=1)" x (1.101,) x 229-27 + 00000! 10100000000000000000000 = (— 1)! x (1.1013) x 29-1 _ Let us look at Example 9.11 for more explanation. 0 51 s=1,E=2' =128 LF = 1.01) = 2)° + Q? = 1.25. » applying the conversion formula leads to =(— 1.25) x 28-127 21.95 x2! = -2.5. ed by the word can be determined g all the exceptional cases: NaN (“Not a number”). 
In conclusion, the value x represented by the word can be determined based on the following rules, including all the exceptional cases:

- If E = 255 and F is nonzero, then x = NaN ("Not a Number").
- If E = 255, F is zero, and S is 1, then x = −Infinity.
- If E = 255, F is zero, and S is 0, then x = +Infinity.
- If 0 < E < 255, then x = (−1)^S × (1.F) × 2^(E−127), where 1.F represents the binary number created by prefixing F with an implicit leading 1 and a binary point.
- If E = 0 and F is nonzero, then x = (−1)^S × (0.F) × 2^(−126). This is an "unnormalized" value.
- If E = 0, F is zero, and S is 1, then x = −0.
- If E = 0, F is zero, and S is 0, then x = 0.

Typical and exceptional examples are shown as follows:

0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = −0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = −Infinity
0 11111111 00000100000000000000000 = NaN
1 11111111 00100010001001010101010 = NaN
0 00000001 00000000000000000000000 = (−1)^0 × (1.0₂) × 2^(1−127) = 2^-126
0 00000000 10000000000000000000000 = (−1)^0 × (0.1₂) × 2^(−126) = 2^-127
0 00000000 00000000000000000000001 = (−1)^0 × (0.00000000000000000000001₂) × 2^(−126) = 2^-149 (smallest positive value)

Double Precision Format

The IEEE double precision format is described in Figure 9.12.

The IEEE double precision floating-point standard representation requires a 64-bit word, which may be numbered from 0 to 63, left to right. The first bit is the sign bit S, the next eleven bits are the exponent bits E, and the final 52 bits are the fraction bits F. The IEEE double precision format offers a wider dynamic range of number representation, since it uses eleven exponent bits; it also reduces the interval size in the mantissa range of +1 to +2, since there are 52 mantissa bits as compared with the 23 bits of the single precision case. Applying the conversion formula shown in Figure 9.12 is similar to the single precision case.

FIGURE 9.12 IEEE double precision floating-point format: the odd register holds the sign bit (bit 31), the exponent bits (bits 30 to 20), and the upper fraction bits (bits 19 to 0), while the even register holds the remaining 32 fraction bits, with x = (−1)^S × (1.F) × 2^(E−1023).
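The single precision rules above can be checked against Python's standard `struct` module, which packs a 32-bit pattern and reinterprets it as an IEEE 754 single precision value:

```python
import struct

def ieee_single(s, e, f):
    """Build the 32-bit word from sign s, biased exponent e, and
    23-bit fraction f, then reinterpret it as an IEEE 754 float."""
    word = (s << 31) | (e << 23) | f
    return struct.unpack(">f", struct.pack(">I", word))[0]

# Example 9.11: S = 1, E = 128, F = 0100...0  ->  -1.25 * 2**1 = -2.5
print(ieee_single(1, 128, 0b01 << 21))     # -2.5

# The smallest positive (denormalized) value from the list above:
print(ieee_single(0, 0, 1) == 2.0**-149)   # True
```

The same module's `">d"`/`">Q"` format characters perform the analogous reinterpretation for the 64-bit double precision word.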
Example 9.12.
Convert the following number in IEEE double precision format to the decimal format:

0 01000000000 110000...000

Solution:
Using the bit pattern in Figure 9.12, we have S = 0, E = 2^9 = 512, and 1.F = 1.11₂ = 2^0 + 2^(-1) + 2^(-2) = 1.75. Applying the double precision formula yields

x = (-1)^0 × 1.75 × 2^(512-1023) = 1.75 × 2^(-511) = 2.6104 × 10^(-154).

For purposes of completeness, the rules for determining the value x represented by the double precision word are listed as follows:

■ If E = 2047 and F is nonzero, then x = NaN ("Not a Number").
■ If E = 2047, F is zero, and S is 1, then x = -Infinity.
■ If E = 2047, F is zero, and S is 0, then x = +Infinity.
■ If 0 < E < 2047, then x = (-1)^S × (1.F) × 2^(E-1023), where 1.F is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.
■ If E = 0 and F is nonzero, then x = (-1)^S × (0.F) × 2^(-1022). This is an "unnormalized" (denormalized) value.
■ If E = 0, F is zero, and S is 1, then x = -0.
■ If E = 0, F is zero, and S is 0, then x = 0.

9.4.4 Fixed-Point Digital Signal Processors

Texas Instruments, Analog Devices, and Motorola all manufacture fixed-point DS processors. Texas Instruments offers a wide variety of practical fixed-point devices with different costs, architectural features, and performance, such as the first-generation TMS320C1x, along with the TMS320C2x, TMS320C5x, and TMS320C62x families. Analog Devices and Motorola offer their own families of fixed-point DS processors, and the varieties of fixed-point DS processors are expected to continue to grow.

Since fixed-point DS processors share some basic common features, such as program memory and data memory with their associated address buses, arithmetic logic units (ALUs), program control units, MACs, shift units, and address generators, here we give a brief overview of the TMS320C54x processor. The typical TMS320C54x architecture appears in Figure 9.13. The fixed-point TMS320C54x families supporting 16-bit data have a program memory and a data memory in various sizes and configurations, which include data RAM (random access memory) and program ROM (read-only memory) used for program code, instructions, and data.
Four data buses and four address buses are accommodated to work with the data memories and program memories. The program memory address bus and the program memory data bus are responsible for fetching the program instruction. As shown in Figure 9.13, the C and D data memory address buses and the C and D data memory data buses deal with fetching data from data memory, while the E data memory address bus and the E data memory data bus are dedicated to moving data into data memory.

Figure 9.13 Basic architecture of the TMS320C54x family: a program control unit and address generator; the program memory address bus and the C, D, and E data memory address buses; program and data memory; the corresponding program memory data bus and C, D, and E data memory data buses; input/output devices; and computational units consisting of an arithmetic logic unit, a multiplier/accumulator, and a shift unit.

The computational units consist of an ALU, a MAC, and a shift unit. The ALU can fetch data from the C, D, and program memory data buses, while the MAC can fetch data from the C and D data memory data buses. The multiplier, which is capable of operating on 17-bit numbers, feeds its product to the accumulator, which has the same capability of bus access as the ALU. The shift unit is used for scaling and for fractional arithmetic such as the Q-format operations.

The program control unit fetches instructions via the program memory data bus. Again, in order to speed up memory access, an advanced Harvard architecture is employed, in which several accesses proceed at the same time for a given single instruction cycle. Processing performance reaches 40 MIPS (million instructions per second). To further explore this subject, the reader is referred to Dahnoun (2000), Embree (1995), Ifeachor and Jervis (2002), and van der Vegte (2002), as well as the website for Texas Instruments (www.ti.com).
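The role of the shift unit in Q-format arithmetic can be sketched in a few lines. In Q-15, a 16 × 16 multiplication produces a Q-30 product; products are summed in a wide accumulator (the hardware provides guard bits for this), and a final right shift rescales the sum back to Q-15. The Python below is an illustrative model of this behavior, not actual TMS320C54x code:

```python
def to_q15(x):
    """Quantize a value in [-1, 1) to a 16-bit Q-15 integer."""
    return max(-32768, min(32767, int(round(x * 32768))))

def mac_q15(coeffs, samples):
    """Q-15 multiply-accumulate: Q-30 products are summed in a wide
    accumulator, then shifted right by 15 to return to Q-15."""
    acc = 0                        # wide accumulator (guard bits in hardware)
    for c, x in zip(coeffs, samples):
        acc += c * x               # Q-15 x Q-15 -> Q-30 product
    return acc >> 15               # rescale Q-30 -> Q-15

# 0.5*0.25 + 0.5*0.25 = 0.25
y = mac_q15([to_q15(0.5)] * 2, [to_q15(0.25)] * 2)
print(y / 32768.0)                 # 0.25
```

Keeping the intermediate sum in the wide accumulator and shifting only once at the end is exactly what avoids losing precision between taps.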
9.4.5 Floating-Point Processors

Floating-point DS processors perform DSP operations using floating-point arithmetic, as we discussed before. The advantages of using the floating-point processor include getting rid of finite word length effects such as overflows, roundoff errors, truncation errors, and coefficient quantization errors. Hence, in terms of coding, we do not need to do scaling of input samples to avoid overflow, shift the accumulator result to fit the DAC word size, scale the filter coefficients, or apply Q-format arithmetic. The floating-point DS processor, with its high computation precision, facilitates a friendly environment for developing DSP algorithms.

Analog Devices provides floating-point DSP families such as the ADSP210xx series. Texas Instruments offers a wide range of floating-point DSP families, in which the TMS320C3x is the first generation, followed by succeeding families. Since the first generation of a floating-point DS processor is less complicated than later generations but still explains the principal architecture well, we discuss the first-generation architecture first. Figure 9.14 shows the typical architecture of the TMS320C3x families. We discuss some of its key features briefly; further detail can be found in the TMS320C3x User's Guide (Texas Instruments, 1991), the TMS320C6x CPU and Instruction Set Reference Guide (Texas Instruments, 1998), and other studies (Dahnoun, 2000; Embree, 1995; Ifeachor and Jervis, 2002; Kehtarnavaz and Simsek, 2000; Sorensen and Chen, 1997; van der Vegte, 2002).

Figure 9.14 The typical TMS320C3x floating-point DS processor: a program cache (64 × 32) and RAM blocks 0 and 1 (1K × 32 each); a CPU containing an integer and floating-point multiplier, an integer and floating-point ALU, eight extended-precision registers, address generators 0 and 1, eight auxiliary registers, and control registers; DMA; and a peripheral bus.
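The coefficient quantization error that floating-point hardware avoids is easy to quantify for the fixed-point case: storing a coefficient in Q-15 rounds it to the nearest multiple of 2^(-15), so the representation error is at most 2^(-16). A small illustrative Python check (not from the text):

```python
def q15_round(x):
    """Nearest Q-15 representation of a coefficient in [-1, 1)."""
    return round(x * 32768) / 32768.0

for c in [0.75, 0.3725, 0.9]:
    err = abs(q15_round(c) - c)
    print(c, err)                  # each error is at most 2**-16 ~ 1.5e-5
```

Such per-coefficient errors shift the realized pole and zero locations slightly, which is why fixed-point designs must be verified after quantization while floating-point designs generally need not be.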
The TMS320C3x family consists of 32-bit single-chip floating-point processors supporting both integer and floating-point operations. Similar to the fixed-point processor, it uses the Harvard architecture, in which there are separate buses used for program and data, so that instructions can be fetched at the same time that data are being accessed. There also exist memory buses and data buses for direct memory access (DMA) for concurrent I/O and CPU operations, and for peripheral access such as serial ports.

The CPU contains the hardware multiplier, an ALU, internal buses, and auxiliary register arithmetic units (ARAUs). The multiplier performs single-cycle multiplications on 24-bit integers and on 32-bit floating-point values. Using parallel instructions, in which a multiplication and an ALU operation are performed in a single cycle, a multiplication and an addition are equally fast. The ARAUs support addressing modes, some of which are specific to DSP, such as circular buffering and bit-reversed addressing (used in digital filtering and FFT operations, respectively). The CPU register file offers 28 registers, which can be operated on by the multiplier and the ALU. The special functions of the registers include eight 40-bit extended precision registers for maintaining accuracy of the floating-point results, eight auxiliary registers for addressing and for integer arithmetic, and twelve control registers. The registers also provide storage of internal variables (instead of external memory) to allow arithmetic between registers. In this way, program efficiency is greatly improved.

The prominent feature of the C3x is its floating-point capability, allowing operation on numbers with a very large dynamic range. It offers implementation of a DSP algorithm without worrying about problems such as overflows. Three floating-point formats are supported. A short 16-bit floating-point format has 4 exponent bits, 1 sign bit, and 11 mantissa bits. A 32-bit single precision format has 8 exponent bits, 1 sign bit, and 23 fraction bits. A 40-bit extended precision format contains 8 exponent bits, 1 sign bit, and 31 fraction bits.
Although these formats are slightly different from the IEEE 754 standard, conversions between them and the IEEE formats are available.

The TMS320C30 offers high-speed performance with a 60-nanosecond single-instruction execution time, which is equivalent to 16.7 MIPS. For speech-quality applications with an 8 kHz sampling rate, it can handle over 2,000 single-cycle instructions between two samples (125 microseconds). With instruction enhancements such as pipelines executing each instruction in a single cycle (the time required from fetch to execution by the instruction itself) and an on-chip memory structure, this high-speed processor makes implementation of real-time applications in floating-point arithmetic practical.

9.5 Finite Impulse Response and Infinite Impulse Response Filter Implementations in Fixed-Point Systems

With the knowledge of the IEEE formats and of the filter realization structures such as the direct form I, direct form II, cascade, and parallel forms (Chapter 6), we can study digital filter implementation in the fixed-point processor. In the fixed-point processor, where integer arithmetic is used, the multiplications and accumulations may cause overflow due to the limited dynamic range; scaling of the input samples and the filter coefficients must be taken into account to maintain the desired accuracy. We develop the FIR filter implementation in Q-format first, and then the IIR filter implementation next.
However, it shall prevent the overflow, Equation (9.2) means that the adder output can actually be expressed as , convolution output: adder output = bya Sealy h(0)x(n) + ACL) x(n = 1) + A(2)x(n — 2) +--+ ‘Assuming the worst condition, that is, that all the inputs x(v) reach a maximum value of Imax and all the impulse coefficients are positive, the sum of the adder gives the most conservative scale factor, as shown in Equation (9.2). Hence, scaling down of the input by a factor of S will guarantee that the output of the adder is in Q-format. ‘When some of the FIR coefficients are larger than 1, which is beyond the range of Q-format representation, coefficient scaling is required. The idea is that scaling down the coefficients will make them less than 1, and later the filtered output will be scaled up by the same amount before it is sent to DAC. Figure 9.15 describes the implementation. In the figure, the scale factor B makes the coefficients by. /B convertible to the Tea sd eae factors of S and B are usually chosen to be a power of 2, 30 peration can be.used in the coding process. Let us implemen! an FIR filter containing filter coeffici i " Q e icier ed-pom oe UR Ges o nts larger than. 1 in the fixed-p x(n) VS X(N) BB. .-ys(n) > coe y(n) FIGURE 9:15 Directs 1S) Direct-form ti iter i ‘implementation of the FIR filter. y 9.5 Filter im plementation: 5 In Fixed-Point § ystems 443 oo le 9.13- ie te FIR filter oi ie y(n) = 0.9x(n) + 3x(n — 1) — 0.9x(n — 2), and gain of 4, and assuming that the input range occupies only 1/4 ona pass : eal range for a particular application, of : E pevelop the DSP implementation equations in the Q-15 fixed-point “ gystem- solution: a, The adder dynamic ran which consis 1 = 4 (My + HCD + 1H) = 1(o9+3+09)=12. we select 5 = 2 (a power of 2). We choose 1, so the Q-15 format can d difference equations are may cause overflow if the input data exist for } of a full nge. 
The scale factor is determined using the impulse response, ists of the FIR filter coefficients, as discussed in Chapter 3. Overflow may occur. Hence, p= 4 to scale all the coefficients to be less than According to Figure 9.15, the develope by x) = xo y.(n) = 0.225x,(1) + 0.75x,(1 — 1) — y(n) = 8ys(n) ‘direct-form I implementation e of a scale factor t. The factor Cis usually cl tion in DSP. 0.225x,(n — 2) of the IIR filter is illustrated in Cis to scale down the hosen to be a in Figure 9.16, the purpos ‘riginal filter coefficients to the Q-forma' of 2 for using a simple shift opera! yda) ee) ESSORS aaa @ DIGITAL SIGNAL PROC Example 9.14. ‘The following HR filter, y(n) = 2x(n) + 0.5y(n — 1), uses the direct form I, and for a particular application, the maximum Tiyay = 0.010... 02 = 0.25, A. Develop the DSP implementation equations in the Q.15 fixed system, Mut ig “Poin Solution: a, This is an IIR filter whose transfer function is 2 2z HO = 9s Applying the inverse z-transform, we have the impulse Tesponse A(n) = 2 x (0.5)"u(n). To prevent overflow in the adder, we can compute the S factor with the help of the Maclaurin series or approximate Equation (9.2) numerically ‘We get 5 = 025 x (2(0.5)842(0.5)'42(0.5)?+-.) = aan anh, MATLAB function impz() can also be applied to find the impulse response and the S factor: > h = impz (2, (1-0.5}); $ Find ti > sf =0.25" st=1 he impulse response sum(abs(h)) & Determine the sum of absolute values of h(x) Hence, we do not need to scale down all the coeffi selected. From Figure Perform input scaling. However, we need '° cients to use the Q-15 format. A factor of C= 4i5 9.16, we get the difference equations as x(n) = x(n) Yl) = O.5xs(n) + 0.125yr(n — 1) Ip) = 4y,(n) 2) = y(n). We : : vide the orisi™! "<.can develop these equations directly. ‘First, we divide the sents 0 DF rence equation by a factor of 4 all the coefficients less than 1, that i to scale Fn | 98 Filler Implementations In Fixed. 
(1/4)y_f(n) = (1/4) × 2x_s(n) + (1/4) × 0.5y_f(n - 1),

and define a scaled output

y_s(n) = (1/4)y_f(n).

Finally, substituting y_s(n) into the left side of the scaled equation and scaling up the filter output as y_f(n) = 4y_s(n), we have the same results as before.

The fixed-point implementation for the direct form II is more complicated. The scaled direct-form II implementation of the IIR filter is illustrated in Figure 9.17.

Figure 9.17 Direct-form II implementation of the IIR filter: the input x(n) is scaled by 1/S, the denominator coefficients are scaled by 1/A, the numerator coefficients are scaled by 1/B, and the output y_s(n) is scaled by S·B to recover y(n).

As shown in Figure 9.17, two scale factors A and B are designated to scale the denominator coefficients and the numerator coefficients to their Q-format representations, respectively. Here S is a special factor that scales down the input sample so that numerical overflow in the first sum in Figure 9.17 can be prevented. The difference equations are given in Chapter 6 and listed here:

w(n) = x(n) - a_1·w(n - 1) - a_2·w(n - 2) - ··· - a_M·w(n - M)
y(n) = b_0·w(n) + b_1·w(n - 1) + ··· + b_M·w(n - M).

Scaling the input by S and scaling the first equation down by the factor A, to ensure that all the denominator coefficients are less than 1, gives

x_s(n) = x(n)/S
w_s(n) = (1/A)x_s(n) - (a_1/A)w(n - 1) - (a_2/A)w(n - 2) - ··· - (a_M/A)w(n - M)
w(n) = A·w_s(n).

Similarly, scaling the second equation yields

y_s(n) = (b_0/B)w(n) + (b_1/B)w(n - 1) + ··· + (b_M/B)w(n - M)
y(n) = S·B·y_s(n).

To avoid overflow of the first adder (first equation), the scale factor S can be safely determined by Equation (9.3):

S = I_max × (|h(0)| + |h(1)| + |h(2)| + ···),    (9.3)

where h(k) is the impulse response of the filter whose transfer function is the reciprocal of the denominator polynomial, since the poles can cause a large value at the first sum:

h(n) = Z^(-1){1/(1 + a_1·z^(-1) + ··· + a_M·z^(-M))}.    (9.4)

All the scale factors A, B, and S are usually chosen to be powers of 2, so that shift operations can be used in the coding process. Example 9.15 serves for illustration.

Example 9.15.
Given the following IIR filter,

y(n) = 0.75x(n) + 1.49x(n - 1) + 0.75x(n - 2) - 1.52y(n - 1) - 0.64y(n - 2),

with a passband gain of 1 and a full range of input,
a. Use the direct-form II implementation to develop the DSP implementation equations in the Q-15 fixed-point system.

Solution:
a. The difference equations without scaling in the direct-form II implementation are given by

w(n) = x(n) - 1.52w(n - 1) - 0.64w(n - 2)
y(n) = 0.75w(n) + 1.49w(n - 1) + 0.75w(n - 2).

To prevent overflow in the first adder, we form the reciprocal of the denominator polynomial as

H_1(z) = 1/(1 + 1.52z^(-1) + 0.64z^(-2)).

Using the MATLAB functions leads to

>> h = impz(1, [1 1.52 0.64]);
>> sf = sum(abs(h))
sf = 10.4093

We choose the S factor as S = 16, and we choose A = 2 to scale down the denominator coefficients by half. Since the second adder after scaling becomes

y_s(n) = (0.75/B)w(n) + (1.49/B)w(n - 1) + (0.75/B)w(n - 2),

we have to ensure that each coefficient is less than 1, as well as the sum of the absolute values,

0.75/B + 1.49/B + 0.75/B < 1,

to avoid overflow of the second adder. Hence B = 4 is selected. We develop the DSP equations as

x_s(n) = x(n)/16
w_s(n) = 0.5x_s(n) - 0.76w(n - 1) - 0.32w(n - 2)
w(n) = 2w_s(n)
y_s(n) = 0.1875w(n) + 0.3725w(n - 1) + 0.1875w(n - 2)
y(n) = 64y_s(n).

The implementation for cascading the second-order section filters can be found in Ifeachor and Jervis (2002). A practical example will be presented in the next section. Note that if a floating-point DS processor is used, all the scaling concerns can be ignored, since the floating-point format offers a large dynamic range, so that overflow hardly ever happens.
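The scaled equations of Example 9.15 can be cross-checked numerically: the impulse response sum that impz() produced is reproduced by running the denominator recursion directly, and the scaled direct-form II equations, after rescaling by S·B = 64, should match the unscaled filter in floating point (quantization effects are ignored in this illustrative Python sketch):

```python
def denom_impulse_sum(n_terms=500):
    """Sum of |h(k)| for h = inverse z-transform of
    1/(1 + 1.52z^-1 + 0.64z^-2), as computed with impz above."""
    h1 = h2 = 0.0
    total = 0.0
    for n in range(n_terms):
        h = (1.0 if n == 0 else 0.0) - 1.52 * h1 - 0.64 * h2
        total += abs(h)
        h1, h2 = h, h1
    return total

def iir_unscaled(x):
    """Original direct-form II equations of Example 9.15."""
    y, w1, w2 = [], 0.0, 0.0
    for v in x:
        w = v - 1.52 * w1 - 0.64 * w2
        y.append(0.75 * w + 1.49 * w1 + 0.75 * w2)
        w1, w2 = w, w1
    return y

def iir_scaled(x):
    """Example 9.15 scaled equations with S = 16, A = 2, B = 4."""
    y, w1, w2 = [], 0.0, 0.0
    for v in x:
        xs = v / 16.0
        ws = 0.5 * xs - 0.76 * w1 - 0.32 * w2
        w = 2.0 * ws
        ys = 0.1875 * w + 0.3725 * w1 + 0.1875 * w2
        y.append(64.0 * ys)                  # y = S*B*ys = 64*ys
        w1, w2 = w, w1
    return y

print(round(denom_impulse_sum(), 4))         # close to the 10.4093 in the text
x = [0.9, -0.5, 1.0, -1.0, 0.3, 0.0, 0.2]
print(all(abs(a - b) < 1e-9 for a, b in zip(iir_unscaled(x), iir_scaled(x))))
```

Because all the scale factors are powers of 2, the scaled and unscaled recursions agree essentially exactly, which is precisely why power-of-2 factors are preferred.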
