Library of Congress Cataloging-in-Publication Data

Embree, Paul M.
C algorithms for real-time DSP / Paul M. Embree.
Includes bibliographical references.
ISBN 0-13-337353-3
1. C (Computer program language)  2. Computer algorithms.  3. Real-time data processing.  I. Title.

© 1995 by Prentice Hall PTR
Prentice-Hall, Inc.
A Simon & Schuster Company
Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher. The publisher offers discounts on this book when ordered in bulk quantities. For more information contact: Corporate Sales Department, Prentice Hall PTR, One Lake Street, Upper Saddle River, New Jersey 07458.

Printed in the United States of America

ISBN 0-13-337353-3

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

CONTENTS

PREFACE vii

CHAPTER 1 DIGITAL SIGNAL PROCESSING FUNDAMENTALS 1
1.1 SEQUENCES 2
 1.1.1 The Sampling Function 3
 1.1.2 Sampled Signal Spectra 4
 1.1.3 Spectra of Continuous Time and Discrete Time Signals 5
1.2 LINEAR TIME-INVARIANT OPERATORS 8
 1.2.1 Causality 10
 1.2.2 Difference Equations 10
 1.2.3 The z-Transform Description of Linear Operators 11
 1.2.4 Frequency Domain Transfer Function of an Operator 14
 1.2.5 Frequency Response from the z-Transform Description 15
1.3 DIGITAL FILTERS 17
 1.3.1 Finite Impulse Response (FIR) Filters 18
 1.3.2 Infinite Impulse Response (IIR) Filters 21
 1.3.3 Examples of Filter Responses 22
 1.3.4 Filter Specifications 23
1.4 DISCRETE FOURIER TRANSFORMS 25
 1.4.1 Form 25
 1.4.2 Properties 26
 1.4.3 Power Spectrum 27
 1.4.4 Averaged Periodograms 28
 1.4.5 The Fast Fourier Transform (FFT) 28
 1.4.6 An Example of the FFT 30
1.5 NONLINEAR OPERATORS 32
 1.5.1 μ-Law and A-Law Compression 33
1.6 PROBABILITY AND RANDOM PROCESSES 35
 1.6.1 Basic Probability 35
 1.6.2 Random Variables 36
 1.6.3 Mean, Variance, and Gaussian Random Variables 37
 1.6.4 Quantization of Sequences 40
 1.6.5 Random Processes, Autocorrelation, and Spectral Density 42
 1.6.6 Modeling Real-World Signals with AR Processes 43
1.7 ADAPTIVE FILTERS AND SYSTEMS 46
 1.7.1 Wiener Filter Theory 48
 1.7.2 LMS Algorithms 50
1.8 REFERENCES 51

CHAPTER 2 C PROGRAMMING FUNDAMENTALS 53
2.1 THE ELEMENTS OF REAL-TIME DSP PROGRAMMING 53
2.2 VARIABLES AND DATA TYPES 56
 2.2.1 Types of Numbers 56
 2.2.2 Arrays 58
2.3 OPERATORS 59
 2.3.1 Assignment Operators 59
 2.3.2 Arithmetic and Bitwise Operators 60
 2.3.3 Combined Operators 61
 2.3.4 Logical Operators 61
 2.3.5 Operator Precedence and Type Conversion 62
2.4 PROGRAM CONTROL 63
 2.4.1 Conditional Execution: if-else 63
 2.4.2 The switch Statement 64
 2.4.3 Single-Line Conditional Expressions 65
 2.4.4 Loops: while, do-while, and for 66
 2.4.5 Program Jumps: break, continue, and goto 67
2.5 FUNCTIONS 69
 2.5.1 Defining and Declaring Functions 69
 2.5.2 Storage Class, Privacy, and Scope 71
 2.5.3 Function Prototypes 73
2.6 MACROS AND THE C PREPROCESSOR 74
 2.6.1 Conditional Preprocessor Directives 74
 2.6.2 Aliases and Macros 75
2.7 POINTERS AND ARRAYS 77
 2.7.1 Special Pointer Operators 77
 2.7.2 Pointers and Dynamic Memory Allocation 78
 2.7.3 Arrays of Pointers 80
2.8 STRUCTURES 82
 2.8.1 Declaring and Referencing Structures 82
 2.8.2 Pointers to Structures 84
 2.8.3 Complex Numbers 85
2.9 COMMON C PROGRAMMING PITFALLS 87
 2.9.1 Array Indexing 87
 2.9.2 Failure to Pass-by-Address 87
 2.9.3 Misusing Pointers 88
2.10 NUMERICAL C EXTENSIONS 90
 2.10.1 Complex Data Types 90
 2.10.2 Iteration Operators 91
2.11 COMMENTS ON PROGRAMMING STYLE 92
 2.11.1 Software Quality 93
 2.11.2 Structured Programming 95
2.12 REFERENCES 97

CHAPTER 3 DSP MICROPROCESSORS IN EMBEDDED SYSTEMS 98
3.1 TYPICAL FLOATING-POINT DIGITAL SIGNAL PROCESSORS 99
 3.1.1 AT&T DSP32C and DSP3210 100
 3.1.2 Analog Devices ADSP-210XX 104
 3.1.3 Texas Instruments TMS320C3X and TMS320C40 108
3.2 TYPICAL PROGRAMMING TOOLS FOR DSP 111
 3.2.1 Basic C Compiler Tools 111
 3.2.2 Memory Map and Memory Bandwidth Considerations 113
 3.2.3 Assembly Language Simulators and Emulators 114
3.3 ADVANCED C SOFTWARE TOOLS FOR DSP 117
 3.3.1 Source Level Debuggers 117
 3.3.2 Assembly-C Language Interfaces 120
 3.3.3 Numeric C Compilers 121
3.4 REAL-TIME SYSTEM DESIGN CONSIDERATIONS 124
 3.4.1 Physical Input/Output (Memory Mapped, Serial, Polled) 124
 3.4.2 Interrupts and Interrupt-Driven I/O 125
 3.4.3 Efficiency of Real-Time Compiled Code 128
 3.4.4 Multiprocessor Architectures 130

CHAPTER 4 REAL-TIME FILTERING 132
4.1 REAL-TIME FIR AND IIR FILTERS 132
 4.1.1 FIR Filter Function 134
 4.1.2 FIR Filter Coefficient Calculation 136
 4.1.3 IIR Filter Function 145
 4.1.4 Real-Time Filtering Example 151
4.2 FILTERING TO REMOVE NOISE 158
 4.2.1 Gaussian Noise Generation 158
 4.2.2 Signal-to-Noise Ratio Improvement 160
4.3 SAMPLE RATE CONVERSION 160
 4.3.1 FIR Interpolation 163
 4.3.2 Real-Time Interpolation Followed by Decimation 163
 4.3.3 Real-Time Sample Rate Conversion 167
4.4 FAST FILTERING ALGORITHMS 168
 4.4.1 Fast Convolution Using FFT Methods 170
 4.4.2 Interpolation Using the FFT 176
4.5 OSCILLATORS AND WAVEFORM SYNTHESIS 178
 4.5.1 IIR Filters as Oscillators 178
 4.5.2 Table-Generated Waveforms 179
4.6 REFERENCES 184

CHAPTER 5 REAL-TIME DSP APPLICATIONS 186
5.1 FFT POWER SPECTRUM ESTIMATION 186
 5.1.1 Speech Spectrum Analysis 187
 5.1.2 Doppler Radar Processing 190
5.2 PARAMETRIC SPECTRAL ESTIMATION 193
 5.2.1 ARMA Modeling of Signals 193
 5.2.2 AR Frequency Estimation 198
5.3 SPEECH PROCESSING 200
 5.3.1 Speech Compression 201
 5.3.2 ADPCM (G.722) 202
5.4 MUSIC PROCESSING 218
 5.4.1 Equalization and Noise Removal 218
 5.4.2 Pitch-Shifting 220
 5.4.3 Music Synthesis 225
5.5 ADAPTIVE FILTER APPLICATIONS 228
 5.5.1 LMS Signal Enhancement 228
 5.5.2 Frequency Tracking with Noise 233
5.6 REFERENCES 237

APPENDIX—DSP FUNCTION LIBRARY AND PROGRAMS 238

INDEX 241

PREFACE

Digital signal processing techniques have become the method of choice in signal processing as digital computers have increased in speed, convenience, and availability. As microprocessors have become less expensive and more powerful, the number of DSP applications which have become commonly available has exploded.
Thus, some DSP microprocessors can now be considered commodity products. Perhaps the most visible high volume DSP applications are the so-called "multimedia" applications in digital audio, speech processing, digital video, and digital communications. In many cases, these applications contain embedded digital signal processors where a host CPU works in a loosely coupled way with one or more DSPs to control the signal flow or DSP algorithm behavior at a real-time rate. Unfortunately, the development of signal processing algorithms for these specialized embedded DSPs is still difficult and often requires specialized training in a particular assembly language for the target DSP.

The tools for developing new DSP algorithms are slowly improving as the need to design new DSP applications more quickly becomes important. The C language is proving itself to be a valuable programming tool for real-time computationally intensive software tasks. C has high-level language capabilities (such as structures, arrays, and functions) as well as low-level assembly language capabilities (such as bit manipulation, direct hardware input/output, and macros) which makes C an ideal language for embedded DSP. Most of the manufacturers of digital signal processing devices (such as Texas Instruments, AT&T, Motorola, and Analog Devices) provide C compilers, simulators, and emulators for their parts. These C compilers offer standard C language with extensions for DSP to allow for very efficient code to be generated. For example, an inline assembly language capability is usually provided in order to optimize the performance of time critical parts of an application. Because the majority of the code is C, an application can be transferred to another processor much more easily than an all assembly language program.

This book is constructed in such a way that it will be most useful to the engineer who is familiar with DSP and the C language, but who is not necessarily an expert in both. All of the example programs in this book have been tested using standard C compilers in the UNIX and MS-DOS programming environments. In addition, the examples have been compiled utilizing the real-time programming tools of specific real-time embedded DSP microprocessors (Analog Devices' ADSP-21020 and ADSP-21062; Texas Instruments' TMS320C30 and TMS320C40; and AT&T's DSP32C) and then tested with real-time hardware using real world signals. All of the example programs presented in the text are provided in source code form on the IBM PC floppy disk included with the book.

The text is divided into several sections. Chapters 1 and 2 cover the basic principles of digital signal processing and C programming. Readers familiar with these topics may wish to skip one or both chapters. Chapter 3 introduces the basic real-time DSP programming techniques and typical programming environments which are used with DSP microprocessors. Chapter 4 covers the basic real-time filtering techniques which are the cornerstone of one-dimensional real-time digital signal processing. Finally, several real-time DSP applications are presented in Chapter 5, including speech compression, music signal processing, radar signal processing, and adaptive signal processing techniques.

The floppy disk included with this text contains C language source code for all of the DSP programs discussed in this book. The floppy disk has a high density format and was written by MS-DOS.
The appendix and the READ.ME files on the floppy disk provide more information about how to compile and run the C programs. These programs have been tested using Borland's TURBO C (version 3 and greater) as well as Microsoft C (versions 6 and greater) for the IBM PC. Real-time DSP platforms using the Analog Devices ADSP-21020 and ADSP-21062, the Texas Instruments TMS320C30, and the AT&T DSP32C have been used extensively to test the real-time performance of the algorithms.

ACKNOWLEDGMENTS

I thank the following people for their generous help: Laura Meres for help in preparing the electronic manuscript and the software for the DSP32C; the engineers at Analog Devices (in particular Steve Cox, Marc Hoffman, and Hans Rempel) for their review of the manuscript as well as hardware and software support; Texas Instruments for hardware and software support; Jim Bridges at Communication Automation & Control, Inc. and Talal Itani at Domain Technologies, Inc.

Paul M. Embree

TRADEMARKS

IBM and IBM PC are trademarks of the International Business Machines Corporation.
MS-DOS and Microsoft C are trademarks of the Microsoft Corporation.
TURBO C is a trademark of Borland International.
UNIX is a trademark of American Telephone and Telegraph Corporation.
DSP32C and DSP3210 are trademarks of American Telephone and Telegraph Corporation.
TMS320C30, TMS320C31, and TMS320C40 are trademarks of Texas Instruments Incorporated.
ADSP-21020, ADSP-21060, and ADSP-21062 are trademarks of Analog Devices Incorporated.

CHAPTER 1

DIGITAL SIGNAL PROCESSING FUNDAMENTALS

Digital signal processing begins with a digital signal which appears to the computer as a sequence of digital values. Figure 1.1 shows an example of a digital signal processing operation or simple DSP system. There is an input sequence x(n), the operator O{ }, and an output sequence y(n). A complete digital signal processing system may consist of many operations on the same sequence as well as operations on the result of operations. Because digital sequences are processed, all operators in DSP are discrete time operators (as opposed to the continuous time operators employed by analog systems). Discrete time operators may be classified as time-varying or time-invariant and linear or nonlinear. Most of the operators described in this text will be time-invariant with the exception of adaptive filters, which are discussed in Section 1.7. Linearity will be discussed in Section 1.2 and several nonlinear operators will be introduced in Section 1.5. Operators are applied to sequences in order to effect the following results:

(1) Extract parameters or features from the sequence.
(2) Produce a similar sequence with particular features enhanced or eliminated.
(3) Restore the sequence to some earlier state.
(4) Encode or compress the sequence.

This chapter is divided into several sections. Section 1.1 deals with sequences of numbers: where and how they originate, their spectra, and their relation to continuous signals. Section 1.2 describes the common characteristics of linear time-invariant operators, which are the most often used in DSP. Section 1.3 discusses the class of operators called digital filters. Section 1.4 introduces the discrete Fourier transform (DFTs and FFTs).

FIGURE 1.1 DSP operation.
Section 1.5 describes the properties of commonly used nonlinear operators. Section 1.6 covers basic probability theory and random processes and discusses their application to signal processing. Finally, Section 1.7 discusses the subject of adaptive digital filters.

1.1 SEQUENCES

In order for the digital computer to manipulate a signal, the signal must have been sampled at some interval. Figure 1.2 shows an example of a continuous function of time which has been sampled at intervals of T seconds. The resulting set of numbers is called a sequence. If the continuous time function was x(t), then the samples would be x(nT) for n, an integer extending over some finite range of values. It is common practice to normalize the sample interval to 1 and drop it from the equations. The sequence then becomes x(n). Care must be taken, however, when calculating power or energy from the sequences. The sample interval, including units of time, must be reinserted at the appropriate points in the power or energy calculations.

FIGURE 1.2 Sampling.

A sequence as a representation of a continuous time signal has the following important characteristics:

(1) The signal is sampled. It has finite value at only discrete points in time.
(2) The signal is truncated outside some finite length representing a finite time interval.
(3) The signal is quantized. It is limited to discrete steps in amplitude, where the step size and, therefore, the accuracy (or signal fidelity) depends on how many steps are available in the A/D converter and on the arithmetic precision (number of bits) of the digital signal processor or computer.

In order to understand the nature of the results that DSP operators produce, these characteristics must be taken into account. The effect of sampling will be considered in Section 1.1.1. Truncation will be considered in the section on the discrete Fourier transform (Section 1.4) and quantization will be discussed in Section 1.6.4.

1.1.1 The Sampling Function

The sampling function is the key to traveling between the continuous time and discrete time worlds. It is called by various names: the Dirac delta function, the sifting function, the singularity function, and the sampling function among them. It has the following properties:

Property 1.  \int_{-\infty}^{+\infty} f(t)\,\delta(t-\tau)\,dt = f(\tau)    (1.1)

Property 2.  \int_{-\infty}^{+\infty} \delta(t-\tau)\,dt = 1    (1.2)

In the equations above, \tau can be any real number.

To see how this function can be thought of as the ideal sampling function, first consider the realizable sampling function, \Delta(t), illustrated in Figure 1.3. Its pulse width is one unit of time and its amplitude is one unit of amplitude. It clearly exhibits Property 2 of the sampling function. When \Delta(t) is multiplied by the function to be sampled, however, the \Delta(t) sampling function chooses not a single instant in time but a range from -1/2 to +1/2. As a result, Property 1 of the sampling function is not met. Instead the following integral would result:

\int_{-\infty}^{+\infty} f(t)\,\Delta(t-\tau)\,dt = \int_{\tau-1/2}^{\tau+1/2} f(t)\,dt    (1.3)

This can be thought of as a kind of smearing of the sampling process across a band which is related to the pulse width of \Delta(t). A better approximation to the sampling function would be a function \Delta(t) with a narrower pulse width. As the pulse width is narrowed, however, the amplitude must be increased. In the limit, the ideal sampling function must have infinitely narrow pulse width so that it samples at a single instant in time, and infinitely large amplitude so that the sampled signal still contains the same finite energy.

FIGURE 1.3 Realizable sampling function.
Figure 1.2 illustrates the sampling process at sample intervals of T. The resulting time waveform can be written

x_s(t) = \sum_{n=-\infty}^{+\infty} x(t)\,\delta(t-nT)    (1.4)

The waveform that results from this process is impossible to visualize due to the infinite amplitude and zero width of the ideal sampling function. It may be easier to picture a somewhat less than ideal sampling function (one with very small width and very large amplitude) multiplying the continuous time waveform. It should be emphasized that x_s(t) is a continuous time waveform made from the superposition of an infinite set of continuous time signals x(t)\,\delta(t-nT). It can also be written

x_s(t) = \sum_{n=-\infty}^{+\infty} x(nT)\,\delta(t-nT)    (1.5)

since the sampling function gives a nonzero multiplier only at the values t = nT. In this last equation, the sequence x(nT) makes its appearance. This is the set of numbers or samples on which almost all DSP is based.

1.1.2 Sampled Signal Spectra

Using Fourier transform theory, the frequency spectrum of the continuous time waveform x(t) can be written

X(f) = \int_{-\infty}^{+\infty} x(t)\,e^{-j2\pi ft}\,dt    (1.6)

and the time waveform can be expressed in terms of its spectrum as

x(t) = \int_{-\infty}^{+\infty} X(f)\,e^{j2\pi ft}\,df    (1.7)

Since this is true for any continuous function of time, x(t), it is also true for x_s(t):

X_s(f) = \int_{-\infty}^{+\infty} x_s(t)\,e^{-j2\pi ft}\,dt    (1.8)

Replacing x_s(t) by the sampling representation,

X_s(f) = \int_{-\infty}^{+\infty} \left[ \sum_{n=-\infty}^{+\infty} x(nT)\,\delta(t-nT) \right] e^{-j2\pi ft}\,dt    (1.9)

The order of the summation and integration can be interchanged and Property 1 of the sampling function applied to give

X_s(f) = \sum_{n=-\infty}^{+\infty} x(nT)\,e^{-j2\pi fnT}    (1.10)

This equation is the exact form of a Fourier series representation of X_s(f), a periodic function of frequency having period 1/T. The coefficients of the Fourier series are x(nT) and they can be calculated from the following integral:

x(nT) = T \int_{-1/2T}^{+1/2T} X_s(f)\,e^{j2\pi fnT}\,df    (1.11)

The last two equations are a Fourier series pair which allow calculation of either the time signal or the frequency spectrum in terms of the opposite member of the pair. Notice that the use of the problematic signal x_s(t) is eliminated and the sequence x(nT) can be used instead.

1.1.3 Spectra of Continuous Time and Discrete Time Signals

By evaluating Equation (1.7) at t = nT and setting the result equal to the right-hand side of Equation (1.11), the following relationship between the two spectra is obtained:

x(nT) = \int_{-\infty}^{+\infty} X(f)\,e^{j2\pi fnT}\,df = T \int_{-1/2T}^{+1/2T} X_s(f)\,e^{j2\pi fnT}\,df    (1.12)

The right-hand side of Equation (1.7) can be expressed as the infinite sum of a set of integrals with finite limits:

x(nT) = \sum_{m=-\infty}^{+\infty} \int_{(2m-1)/2T}^{(2m+1)/2T} X(f)\,e^{j2\pi fnT}\,df    (1.13)

By changing variables to \lambda = f - m/T (substituting f = \lambda + m/T and df = d\lambda),

x(nT) = \sum_{m=-\infty}^{+\infty} \int_{-1/2T}^{+1/2T} X(\lambda + m/T)\,e^{j2\pi\lambda nT} e^{j2\pi nm}\,d\lambda    (1.14)

Moving the summation inside the integral, recognizing that e^{j2\pi nm} is equal to 1 for all integer values of n and m, and equating everything inside the integral to the similar part of Equation (1.11) give the following relation:

X_s(f) = \sum_{m=-\infty}^{+\infty} X(f + m/T)    (1.15)

Equation (1.15) shows that the sampled time frequency spectrum is equal to an infinite sum of shifted replicas of the continuous time frequency spectrum overlaid on each other. The shift of the replicas is equal to the sample frequency, 1/T. It is interesting to examine the conditions under which the two spectra are equal to each other, at least over a limited range of frequencies. In the case where there are no spectral components of frequency greater than 1/2T in the original continuous time waveform, the two spectra are equal over the frequency range f = -1/2T to f = +1/2T.

FIGURE 1.4 Aliasing in the frequency domain. (a) Input spectrum. (b) Sampled spectrum. (c) Reconstructed spectrum.
Of course, the sampled time spee- trum will repeat this same set of amplitudes periodically forall frequencies, while the continuous time spectrum is identically zero for all frequencies outside the specified range. “The Nyquist sampling criverion is based on the derivation just presented and asserts that a continuous time waveform, when sampled at a frequency greater than twice the ‘maximum frequency component in its spectrum, can be reconstructed completely from the sampled waveform. Conversely, if a continuous time waveform is sampled at a frequency lower than twice its maximum frequency component a phenomenon called aliasing occurs. If a continuous time signal is reconstructed from an aliased representa- tion, distortions will be introduced into the result and the degree of distortion is depen- ‘dent on the degree of aliasing, Figure 1.¢ shows the spectra of sampled signals without aliasing and with aliasing, Figure 1.5 shows the reconstructed waveforms of an aliased signal at) " \ Wo ee at) af (>) Sampled rat oft) ‘ ‘. *s >t Tw ee ‘signal. (b Sampled signal. (©) Reconstructed signal {e) Reconstructed signal. 8 Digital Signal Processing Fundamentals Chap. 1 1.2 LINEAR TIME-INVARIANT OPERATORS ‘The most commonly used DSP operators are linear and time-invariant (ot LI). The lin- cearity property is stated as follows: Given x(n), finite sequence, and O{ }, an operator in n-space, let y(n) = fxm} (1.16) i x(n) =a54(n) + bxy(n) aan ‘where a and b are constant with respect ton, then, if {) isa linear operator 4) = a(x (n)]+ BO(2, 6m), a8) ‘The time-invariant property means that if Xn) = Of(m)} then the sifted version gives the same response or nm) = Olam) (as) ‘Another way to state tis property is that ifx(n) i periodic with period NV such that xin N= x(n) then if O( ) is time-invariant operator in n space Olx(n+ N)} = Ox(m)} [Next the LITI properties of the operator G{ } will be used to derive an expression and ‘method of calculation for O{x(n)). First, the impulse sequence can be used to represent x0) ina different manner, a= Sxtmain-m) 20) This is because aap otherwise ‘The impulse sequence acts as a sampling or sifting function on the function x(m), using the dummy variable m to sift trough and find the single desired value s(n). Now this somewhat devious representation of x(n) is substituted into the operator Equation (1.16) ‘See. 1.2 Linear Timestnvariant Operators a x= of Seimie-m} az Recalling that O{ } operates only on functions of n and using the linearity property = LY slmyOlugln—m)) (1.23) Every operator has a set of outputs that are its response when an impulse sequence is ap- plied to its input. The impulse response is represented by H(n) so that 1a) = Olu(n)) 12H ‘This impulse response isa sequence that has special significance for Of }, since itis the sequence that occurs at the output of the block labeled © } in Figure 1.1 when an in pulse sequence is applied at the input. By time invariance it must be tre that Ham) = Oluyin—m)) (128) so that yem= S xem nm) (126 Equation (1.26) states that y(n) is equal to the convolution of x(n) with the impulse re sponse hin). By substituting m = n ~ pinto Equation (1.26) an equivalent form is derived vin) = YM p)axtn= 7). an 1c must be remembered that m and p are dummy vatiables and are used for purposes of the summation only. From the equations just derived i is clear thatthe impulse fesponse ‘completely characterizes the operator O{ } and can be used to label the block representing the operator as in Figure 1.6, *——+} i) +r) FIQURE 1.8. 
1.2.1 Causality

In the mathematical descriptions of sequences and operators thus far, it was assumed that the impulse responses of operators may include values that occur before any applied input stimulus. This is the most general form of the equations and has been suitable for the development of the theory to this point. However, it is clear that no physical system can produce an output in response to an input that has not yet been applied. Since DSP operators and sequences have their basis in physical systems, it is more useful to consider that subset of operators and sequences that can exist in the real world.

The first step in representing realizable sequences is to acknowledge that any sequence must have started at some time. Thus, it is assumed that any element of a sequence in a realizable system whose time index is less than zero has a value of zero. Sequences which start at times later than this can still be represented, since an arbitrary number of their beginning values can also be zero. However, the earliest true value of any sequence must be at a value of n that is greater than or equal to zero. This attribute of sequences and operators is called causality, since it allows all attributes of the sequence to be caused by some physical phenomenon. Clearly, a sequence that has already existed for infinite time lacks a cause, as the term is generally defined.

Thus, the convolution relation for causal operators becomes:

y(n) = \sum_{m=0}^{+\infty} h(m)\,x(n-m)    (1.28)

This form follows naturally since the impulse response is a sequence and can have no values for m less than zero.

1.2.2 Difference Equations

All discrete time, linear, causal, time-invariant operators can be described in theory by the Nth order difference equation

\sum_{m=0}^{N-1} a_m\,y(n-m) = \sum_{p=0}^{N-1} b_p\,x(n-p)    (1.29)

where x(n) is the stimulus for the operator and y(n) is the result or output of the operator. The equation remains completely general if all coefficients are normalized by the value of a_0, giving

y(n) + \sum_{m=1}^{N-1} a_m\,y(n-m) = \sum_{p=0}^{N-1} b_p\,x(n-p)    (1.30)

and the equivalent form

y(n) = \sum_{p=0}^{N-1} b_p\,x(n-p) - \sum_{m=1}^{N-1} a_m\,y(n-m)    (1.31)

or

y(n) = b_0 x(n) + b_1 x(n-1) + b_2 x(n-2) + \ldots + b_{N-1} x(n-N+1) - a_1 y(n-1) - a_2 y(n-2) - \ldots - a_{N-1} y(n-N+1)    (1.32)

To represent an operator properly may require a very high value of N, and for some complex operators N may have to be infinite. In practice, the value of N is kept within limits manageable by a computer; there are often approximations made of a particular operator to make N an acceptable size.

In Equations (1.30) and (1.31) the terms y(n-m) and x(n-p) are shifted or delayed versions of the functions y(n) and x(n), respectively. For instance, Figure 1.7 shows a sequence x(n) and x(n-3), which is the same sequence delayed by three sample periods. Using this delaying property and Equation (1.32), a structure or flow graph can be constructed for the general form of a discrete time LTI operator. This structure is shown in Figure 1.8. Each of the boxes is a delay element with unity gain. The coefficients are shown next to the legs of the flow graph to which they apply. The circles enclosing the summation symbol (Σ) are adder elements.

FIGURE 1.7 Shifting of a sequence.

FIGURE 1.8 Flow graph structure of linear operators.
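The difference equation of Equation (1.32) maps directly onto a short C routine. The sketch below is an illustration added here (it is not code from the book's disk; the history-buffer layout and variable names are assumptions); it computes one output sample per call from the current input and the stored past inputs and outputs.

/* One output sample of the difference equation (1.32):
   y(n) = b0*x(n) + b1*x(n-1) + ... + b(N-1)*x(n-N+1)
          - a1*y(n-1) - ... - a(N-1)*y(n-N+1)
   xhist[i] holds x(n-i) and yhist[i] holds y(n-i); both arrays have N elements.
   a[0] is assumed normalized to 1 and is not used. Call once per input sample. */
float difference_equation(const float *b, const float *a, int N,
                          float *xhist, float *yhist, float xn)
{
    float yn = 0.0f;
    int i;

    /* shift the histories to make room for the new sample */
    for (i = N - 1; i > 0; i--) {
        xhist[i] = xhist[i - 1];
        yhist[i] = yhist[i - 1];
    }
    xhist[0] = xn;

    for (i = 0; i < N; i++)
        yn += b[i] * xhist[i];      /* feed-forward terms */
    for (i = 1; i < N; i++)
        yn -= a[i] * yhist[i];      /* feedback terms */

    yhist[0] = yn;                  /* store the new output */
    return yn;
}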
1.2.3 The z-Transform Description of Linear Operators

There is a linear transform—called the z-transform—which is as useful to discrete time analysis as the Laplace transform is to continuous time analysis. Its definition is

Z\{x(n)\} = \sum_{n=0}^{\infty} x(n)\,z^{-n}    (1.33)

where the symbol Z{ } stands for "z-transform of" and the z in the equation is a complex number. One of the most important properties of the z-transform is its relationship to time delay in sequences. To show this property, take a sequence, x(n), with a z-transform as follows:

Z\{x(n)\} = \sum_{n=0}^{\infty} x(n)\,z^{-n} = X(z)    (1.34)

A shifted version of this sequence has a z-transform:

Z\{x(n-p)\} = \sum_{n=0}^{\infty} x(n-p)\,z^{-n}    (1.35)

By letting m = n - p, substitution gives:

Z\{x(n-p)\} = \sum_{m=0}^{\infty} x(m)\,z^{-(m+p)}    (1.36)

= z^{-p} \sum_{m=0}^{\infty} x(m)\,z^{-m}    (1.37)

But comparing the summation in this last equation to Equation (1.33) for the z-transform of x(n), it can be seen that

Z\{x(n-p)\} = z^{-p}\,Z\{x(n)\} = z^{-p}X(z)    (1.38)

This property of the z-transform can be applied to the general equation for LTI operators as follows:

Z\left\{ y(n) + \sum_{m=1}^{N-1} a_m\,y(n-m) \right\} = Z\left\{ \sum_{p=0}^{N-1} b_p\,x(n-p) \right\}    (1.39)

Since the z-transform is a linear transform, it possesses the distributive and associative properties. Equation (1.39) can be simplified as follows:

Z\{y(n)\} + \sum_{m=1}^{N-1} a_m\,Z\{y(n-m)\} = \sum_{p=0}^{N-1} b_p\,Z\{x(n-p)\}    (1.40)

Using the shift property of the z-transform (Equation (1.38)),

Y(z) + \sum_{m=1}^{N-1} a_m z^{-m}\,Y(z) = \sum_{p=0}^{N-1} b_p z^{-p}\,X(z)    (1.41)

Y(z)\left[ 1 + \sum_{m=1}^{N-1} a_m z^{-m} \right] = X(z)\left[ \sum_{p=0}^{N-1} b_p z^{-p} \right]    (1.42)

Finally, Equation (1.42) can be rearranged to give the transfer function in the z-transform domain:

H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{p=0}^{N-1} b_p z^{-p}}{1 + \sum_{m=1}^{N-1} a_m z^{-m}}    (1.43)

Using Equation (1.41), Figure 1.8 can be redrawn in the z-transform domain and this structure is shown in Figure 1.9. The flow graphs are identical if it is understood that a multiplication by z^{-1} in the transform domain is equivalent to a delay of one sampling time interval in the time domain.

FIGURE 1.9 Flow graph structure for the z-transform of an operator.

1.2.4 Frequency Domain Transfer Function of an Operator

Taking the Fourier transform of both sides of Equation (1.28) (which describes any LTI causal operator) results in the following:

F\{y(n)\} = \sum_{m=0}^{\infty} h(m)\,F\{x(n-m)\}    (1.44)

Using one of the properties of the Fourier transform,

F\{x(n-m)\} = e^{-j2\pi fm}\,F\{x(n)\}    (1.45)

From Equation (1.45) it follows that

Y(f) = \sum_{m=0}^{\infty} h(m)\,e^{-j2\pi fm}\,X(f)    (1.46)
or, dividing both sides by X(f),

\frac{Y(f)}{X(f)} = \sum_{m=0}^{\infty} h(m)\,e^{-j2\pi fm}    (1.47)

which is easily recognized as the Fourier transform of the series h(m). Rewriting this equation,

\frac{Y(f)}{X(f)} = H(f) = F\{h(m)\}    (1.48)

Figure 1.10 shows the time domain block diagram of Equation (1.48) and Figure 1.11 shows the Fourier transform (or frequency domain) block diagram and equation. The frequency domain description of a linear operator is often used to describe the operator. Most often it is shown as an amplitude and a phase angle plot as a function of the variable f (sometimes normalized with respect to the sampling rate, 1/T).

FIGURE 1.10 Time domain block diagram of LTI system.

FIGURE 1.11 Frequency domain block diagram of LTI system: Y(f) = H(f) X(f).

1.2.5 Frequency Response from the z-Transform Description

Recall the Fourier transform pair

X_s(f) = \sum_{n=-\infty}^{+\infty} x(nT)\,e^{-j2\pi fnT}    (1.49)

and

x(nT) = T \int_{-1/2T}^{+1/2T} X_s(f)\,e^{j2\pi fnT}\,df    (1.50)

In order to simplify the notation, the value of T, the period of the sampling waveform, is normalized to be equal to one.

Now compare Equation (1.49) to the equation for the z-transform of x(n):

X(z) = \sum_{n=0}^{\infty} x(n)\,z^{-n}    (1.51)

Equations (1.49) and (1.51) are equal for sequences x(n) which are causal (i.e., x(n) = 0 for all n < 0) if z is set as follows:

z = e^{j2\pi f}    (1.52)

A plot of the locus of values for z in the complex plane described by Equation (1.52) is shown in Figure 1.12. The plot is a circle of unit radius. Thus, the z-transform of a causal sequence, x(n), when evaluated on the unit circle in the complex plane, is equivalent to the frequency domain representation of the sequence. This is one of the properties of the z-transform which make it very useful for discrete signal analysis.

FIGURE 1.12 The unit circle in the z-plane.

Summarizing the last few paragraphs, the impulse response of an operator is simply a sequence, h(m), and the Fourier transform of this sequence is the frequency response of the operator. The z-transform of the sequence h(m), called H(z), can be evaluated on the unit circle to yield the frequency domain representation of the sequence. This can be written as follows:

H(z)\big|_{z=e^{j2\pi f}} = H(f)    (1.53)

1.3 DIGITAL FILTERS

The linear operators that have been presented and analyzed in the previous sections can be thought of as digital filters. The concept of filtering is an analogy between the action of a physical strainer or sifter and the action of a linear operator on sequences when the operator is viewed in the frequency domain. Such a filter might allow certain frequency components of the input to pass unchanged to the output while blocking other components. Naturally, any such action will have its corresponding result in the time domain. This view of linear operators opens a wide area of theoretical analysis and provides increased understanding of the action of digital systems.

There are two broad classes of digital filters. Recall the difference equation for a general operator:

y(n) = \sum_{q=0}^{Q-1} b_q\,x(n-q) - \sum_{p=1}^{P-1} a_p\,y(n-p)    (1.54)

Notice that the infinite sums have been replaced with finite sums. This is necessary in order that the filters can be physically realizable.

The first class of digital filters have a_p equal to 0 for all p. The common name for filters of this type is finite impulse response (FIR) filters, since their response to an impulse dies away in a finite number of samples. These filters are also called moving average (or MA) filters, since the output is simply a weighted average of the input values:

y(n) = \sum_{q=0}^{Q-1} b_q\,x(n-q)    (1.55)

There is a window of these weights (b_q) that takes exactly the Q most recent values of x(n) and combines them to produce the output.

The second class of digital filters are infinite impulse response (IIR) filters. This class includes both autoregressive (AR) filters and the most general form, autoregressive moving average (ARMA) filters. In the AR case all b_q for q = 1 to Q - 1 are set to 0:

y(n) = b_0\,x(n) - \sum_{p=1}^{P-1} a_p\,y(n-p)    (1.56)

For ARMA filters, the more general Equation (1.54) applies. In either type of IIR filter, a single impulse response at the input can continue to provide output of infinite duration with a given set of coefficients. Stability can be a problem for IIR filters, since with poorly chosen coefficients, the output can grow without bound for some inputs.

1.3.1 Finite Impulse Response (FIR) Filters

Restating the general equation for FIR filters:

y(n) = \sum_{q=0}^{Q-1} b_q\,x(n-q)    (1.57)

Comparing this equation with the convolution relation for linear operators,

y(n) = \sum_{m=0}^{\infty} h(m)\,x(n-m),

one can see that the coefficients in an FIR filter are identical to the elements in the impulse response sequence if this impulse response is finite in length:

b_q = h(q)  for  q = 0, 1, 2, 3, \ldots, Q-1

This means that if one is given the impulse response sequence for a linear operator with a finite impulse response, one can immediately write down the FIR filter coefficients.
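Equation (1.57) also maps directly into C. The sketch below is only an illustration added here (the coefficient and history array layout are assumptions; the book's own FIR filter function appears in section 4.1.1): it forms the output as the weighted sum of the Q most recent input samples and then updates the delay line.

/* Direct-form FIR filter, Equation (1.57): y(n) = sum_{q=0}^{Q-1} b_q * x(n-q).
   hist[0..Q-2] holds the previous Q-1 input samples, newest first.
   Called once per input sample; returns one output sample. */
float fir_sample(const float *b, float *hist, int Q, float xn)
{
    float yn;
    int q;

    yn = b[0] * xn;                      /* b_0 * x(n) */
    for (q = 1; q < Q; q++)
        yn += b[q] * hist[q - 1];        /* b_q * x(n-q) */

    /* update the delay line: shift and insert the newest sample */
    for (q = Q - 2; q > 0; q--)
        hist[q] = hist[q - 1];
    hist[0] = xn;

    return yn;
}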
However, as was mentioned at the start of this section, filter theory looks at linear operators primarily from the frequency domain point of view. Therefore, one is most often given the desired frequency domain response and asked to determine the FIR filter coefficients.

There are a number of methods for determining the coefficients for FIR filters given the frequency domain response. The two most popular FIR filter design methods are listed and described briefly below.

1. Use of the DFT on the sampled frequency response. In this method the required frequency response of the filter is sampled at a frequency interval of 1/T, where T is the time between samples in the DSP system. The inverse discrete Fourier transform (see section 1.4) is then applied to this sampled response to produce the impulse response of the filter. Best results are usually achieved if a smoothing window is applied to the frequency response before the inverse DFT is performed. A simple method to obtain FIR filter coefficients based on the Kaiser window is described in section 4.1.2 in chapter 4.

2. Optimal mini-max approximation using linear programming techniques. There is a well-known program written by Parks and McClellan (1973) that uses the REMEZ exchange algorithm to produce an optimal set of FIR filter coefficients, given the required frequency response of the filter. The Parks-McClellan program is available on the IEEE digital signal processing tape or as part of many of the filter design packages available for personal computers. The program is also printed in several DSP texts (see Elliott 1987 or Rabiner and Gold 1975). The program REMEZ.C is a C language implementation of the Parks-McClellan program and is included on the enclosed disk. An example of a filter designed using the REMEZ program is shown at the end of section 4.1.2 in chapter 4.

The design of digital filters will not be considered in detail here. Interested readers may wish to consult references listed at the end of this chapter giving complete descriptions of all the popular techniques.

The frequency response of FIR filters can be investigated by using the transfer function developed for a general linear operator:

H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{q=0}^{Q-1} b_q z^{-q}}{1 + \sum_{p=1}^{P-1} a_p z^{-p}}    (1.58)

Notice that the sums have been made finite to make the filter realizable. Since for FIR filters the a_p are all equal to 0, the equation becomes:

H(z) = \frac{Y(z)}{X(z)} = \sum_{q=0}^{Q-1} b_q z^{-q}    (1.59)

The Fourier transform or frequency response of the transfer function is obtained by letting z = e^{j2\pi f}, which gives

H(f) = H(z)\big|_{z=e^{j2\pi f}} = \sum_{q=0}^{Q-1} b_q\,e^{-j2\pi fq}    (1.60)

This is a polynomial in powers of z^{-1}, or a sum of products of the form

H(z) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \ldots + b_{Q-1} z^{-(Q-1)}

There is an important class of FIR filters for which this polynomial can be factored into a product of sums of the form

H(z) = \prod_{m=0}^{M-1} \left[ z^{-2} + \alpha_m z^{-1} + \beta_m \right] \prod_{n=0}^{N-1} \left[ z^{-1} + \gamma_n \right]    (1.61)

This expression for the transfer function makes explicit the values of the variable z^{-1} which cause H(z) to become zero. These points are simply the roots of the quadratic equation

z^{-2} + \alpha_m z^{-1} + \beta_m = 0,

which in general provides complex conjugate zero pairs, and the values \gamma_n, which provide single zeros.

In many communication and image processing applications it is essential to have filters whose transfer functions exhibit a phase characteristic that changes linearly with a change in frequency.
This characteristic is important because it is the phase transfer relationship that gives minimum distortion to a signal passing through the filter. A very useful feature of FIR filters is that for a simple relationship of the coefficients, b_q, the resulting filter is guaranteed to have a linear phase response. The derivation of the relationship which provides a linear phase filter follows.

A linear phase relationship to frequency means that

H(f) = |H(f)|\,e^{j(\alpha f + \beta)},

where \alpha and \beta are constants. If the transfer function of a filter can be separated into a real function of f multiplied by a phase factor e^{j(\alpha f + \beta)}, then this transfer function will exhibit linear phase.

Taking the FIR filter transfer function

H(z) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \ldots + b_{Q-1} z^{-(Q-1)}

and replacing z by e^{j2\pi f} to give the frequency response,

H(f) = b_0 + b_1 e^{-j2\pi f} + b_2 e^{-j2\pi f(2)} + \ldots + b_{Q-1} e^{-j2\pi f(Q-1)}

Factoring out the factor e^{-j2\pi f(Q-1)/2} and letting \zeta equal (Q-1)/2 gives

H(f) = e^{-j2\pi f\zeta}\left[ b_0 e^{j2\pi f\zeta} + b_1 e^{j2\pi f(\zeta-1)} + b_2 e^{j2\pi f(\zeta-2)} + \ldots + b_{Q-1} e^{-j2\pi f\zeta} \right]

Combining the coefficients with complex conjugate phases and placing them together in brackets,

H(f) = e^{-j2\pi f\zeta}\left\{ \left[ b_0 e^{j2\pi f\zeta} + b_{Q-1} e^{-j2\pi f\zeta} \right] + \left[ b_1 e^{j2\pi f(\zeta-1)} + b_{Q-2} e^{-j2\pi f(\zeta-1)} \right] + \left[ b_2 e^{j2\pi f(\zeta-2)} + b_{Q-3} e^{-j2\pi f(\zeta-2)} \right] + \ldots \right\}

If each pair of coefficients inside the brackets is set equal as follows,

b_0 = b_{Q-1}, \quad b_1 = b_{Q-2}, \quad b_2 = b_{Q-3}, \ldots

each term in brackets becomes a cosine function and the linear phase relationship is achieved. This is a common characteristic of FIR filter coefficients.

1.3.2 Infinite Impulse Response (IIR) Filters

Repeating the general equation for IIR filters,

y(n) = \sum_{q=0}^{Q-1} b_q\,x(n-q) - \sum_{p=1}^{P-1} a_p\,y(n-p)

The z-transform of the transfer function of an IIR filter is

H(z) = \frac{\sum_{q=0}^{Q-1} b_q z^{-q}}{1 + \sum_{p=1}^{P-1} a_p z^{-p}}

No simple relationship exists between the coefficients of the IIR filter and the impulse response sequence such as that which exists in the FIR case. Also, obtaining linear phase IIR filters is not a straightforward coefficient relationship as is the case for FIR filters. However, IIR filters have an important advantage over FIR structures: In general, IIR filters require fewer coefficients to approximate a given filter frequency response than do FIR filters. This means that results can be computed faster on a general purpose computer or with less hardware in a special purpose design. In other words, IIR filters are computationally efficient. The disadvantage of the recursive realization is that IIR filters are much more difficult to design and implement. Stability, roundoff noise, and sometimes phase nonlinearity must be considered carefully in all but the most trivial IIR filter designs.

The direct form IIR filter realization shown in Figure 1.9, though simple in appearance, can have severe response sensitivity problems because of coefficient quantization, especially as the order of the filter increases. To reduce these effects, the transfer function is usually decomposed into second order sections and then realized as cascade sections. The C language implementation given in section 4.1.3 uses single precision floating-point numbers in order to avoid coefficient quantization effects associated with fixed-point implementations that can cause instability and significant changes in the transfer function.

IIR digital filters can be designed in many ways, but by far the most common IIR ... are all the same. This is an example of clipping. A region of interest is defined and any values outside the region are given a constant output.

1.5.1 μ-Law and A-Law Compression

There are two other compression laws worth listing because of their use in telephony—the μ-law and A-law conversions.
The μ-law conversion is defined as follows:

x_{out} = \mathrm{sgn}(x_{in})\,\frac{\ln(1 + \mu|x_{in}|)}{\ln(1 + \mu)}    (1.80)

where sgn( ) is a function that takes the sign of its argument, and μ is the compression parameter (255 for North American telephone transmission). The input value x_{in} must be normalized to lie between -1 and +1. The A-law conversion equations are as follows:

x_{out} = \mathrm{sgn}(x_{in})\,\frac{A|x_{in}|}{1 + \ln A}  for |x_{in}| between 0 and 1/A

and

x_{out} = \mathrm{sgn}(x_{in})\,\frac{1 + \ln(A|x_{in}|)}{1 + \ln A}  for |x_{in}| between 1/A and 1    (1.81)

In these equations, A is the compression parameter (87.6 for European telephone transmission).

FIGURE 1.19 (a) Linear 12-bit ADC. (b) Linear 8-bit ADC. (c) Nonlinear conversion.

An extreme version of clipping is used in some applications of image processing to produce binary pictures. In this technique a threshold is chosen (usually based on a histogram of the picture elements) and any image element with a value higher than the threshold is set to 1 and any element with a value lower than the threshold is set to zero. In this way the significant bits are reduced to only one. Pictures properly thresholded can produce excellent outlines of the most interesting objects in the image, which simplifies further processing considerably.

FIGURE 1.20 Clipping to 8 bits.
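A direct C rendering of the μ-law curve of Equation (1.80) is sketched below. It is an illustration added here, not the table-driven telephony encoder used in practice; the function name and the assumption that the input is already normalized to lie between -1 and +1 are choices made for the example.

#include <math.h>

/* mu-law compression, Equation (1.80):
   F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu),  with mu = 255 for
   North American telephone transmission. x must lie in [-1, +1]. */
float mu_law(float x, float mu)
{
    float mag = (float)(log(1.0 + mu * fabs(x)) / log(1.0 + mu));
    return (x < 0.0f) ? -mag : mag;
}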
This is writen: PAM B)= PCA and B) = PCAB), (1.82) Conditional probability is defined as the probability of occurrence of an event A given that B has occurred, The probability assigned to event A is conditioned by some know! edge of event B, This is written PCA given B) = PCAIB). «183) If this conditional probability, P(AIB), and the probability of B are both known, the proba- bility of both ofthese events occurring (joint probability) is PAB) = PCAIB)P(B), (18 So the conditional probability is multiplied by the probability of the condition (event B) to get the joint probability. Another way to write this equation is PAB) PB) This is another way to define conditional probability once joint probability is understood. POALB) (1.85) 1.6.2 Random Variables In signal processing, the probability of a signal taking on a certain value or lying in a cer- lain range of values is often desired. The signal in this case can be thought of as random Variable (an element whose set of possible values isthe set of outcomes of the activity), For instance, for the random variable X, the following set of events, which could occur, may exist: Event A X takes on the value of 5 (X= 5) Event BX= 19 Event CX = 1.66 etc ‘This is a useful set of events for discrete variables that can only take on certain specified values. A more practical set of events for continuous variables associates each event with ‘Sec. 1.6 Probability and Random Processes a the variable lying within a range of values. A cumulative distribution function (or CDF) for a random variable can be defined as follows: FQ)= PX S09), (186) ‘This cumulative distribution, function, then, is a monotonically increasing function of the independent variable x and is valid only forthe particular random variable, X. Figure 121 shows an example of a distribution function for a random variable. If F(x) is dfferenti- ated with respect to x the probability density function (or PDF) for X is obtained, repre sented as follows: F(x) P(x): wares 87) Inegting the dtbtnfcton ack ghn alos: Po = [a ss Since F(x) is always monotonically increasing, p(x) must be always positive or zero. Figure 1.22 shows the density function forthe distribution of Figure 1.21. The uly of these functions can be illustrated by determining the probability that the random variable Xlies between a and b. By using probability Property 3 from above PX Sb)= PlacX _auarrizxrion |} ——+f reconstruction} ——e t t t FIGURE 124 Quantization and reconstruction of a signal Sec. 1.6 Probability and Random Processes a 64 au wit oa, Reconstuction —_« nin0 ——* Levels —____# === HH Gat 3 2 ' 000000 ----- o a Ofginal —— xin) ¥ Hay teageaz™ AR Fitor MA Filter Wo) Hey = Bebe" soz tear raz ARMA Filter FIGURE 126 AR, MA, and ARMA fitor structures. “6 Digital Signal Processing Fundamentals Chap. 1 decomposition. This theorem states that any real-world process can be decomposed into a deterministic component (such as a sum of sine waves at specified amplitudes, phases, land frequencies) and a noise process. In addition, the theorem states that the noise process can be modeled asthe outpot of a linear filter excited at its input by a white noise signal. \DAPTIVE FILTERS AND SYSTEMS “The problem of determining the optimum linear filter was solved by Norbert Wiener and others. 
1.7 ADAPTIVE FILTERS AND SYSTEMS

The problem of determining the optimum linear filter was solved by Norbert Wiener and others. The solution is referred to as the Wiener filter and is discussed in section 1.7.1. Adaptive filters and adaptive systems attempt to find an optimum set of filter parameters (often by approximating the Wiener optimum filter) based on the time varying input and output signals. In this section, adaptive filters and their application in closed loop adaptive systems are discussed briefly. Closed-loop adaptive systems are distinguished from open-loop systems by the fact that in a closed-loop system the adaptive processor is controlled based on information obtained from the input signal and the output signal of the processor. Figure 1.27 illustrates a basic adaptive system consisting of a processor that is controlled by an adaptive algorithm, which is in turn controlled by a performance calculation algorithm that has direct knowledge of the input and output signals.

FIGURE 1.27 (a) Closed-loop adaptive system; (b) prediction; (c) system modeling.

Closed-loop adaptive systems have the advantage that the performance calculation algorithm can continuously monitor the input signal (d) and the output signal (y) and determine if the performance of the system is within acceptable limits. However, because several feedback loops may exist in this adaptive structure, the automatic optimization algorithm may be difficult to design, and the system may become unstable or may result in nonunique and/or nonoptimum solutions. In other situations, the adaptation process may not converge and lead to a system with grossly poor performance. In spite of these possible drawbacks, closed-loop adaptive systems are widely used in communications, digital storage systems, radar, sonar, and biomedical systems.

The general adaptive system shown in Figure 1.27(a) can be applied in several ways. The most common application is prediction, where the desired signal (d) is the application provided input signal and a delayed version of the input signal is provided to the input of the adaptive processor (x), as shown in Figure 1.27(b). The adaptive processor must then try to predict the current input signal in order to reduce the error signal (e) toward a mean squared value of zero. Prediction is often used in signal encoding (for example, speech compression), because if the next values of a signal can be accurately predicted, then these samples need not be transmitted or stored. Prediction can also be used to reduce noise or interference and therefore enhance the signal quality if the adaptive processor is designed to only predict the signal and ignore random noise elements or known interference patterns.

As shown in Figure 1.27(c), another application of adaptive systems is system modeling of an unknown or difficult to characterize system. The desired signal (d) is the unknown system's output and the input to the unknown system and the adaptive processor (x) is a broadband test signal (perhaps white Gaussian noise). After adaptation, the unknown system is modeled by the final transfer function of the adaptive processor. By using an AR, MA, or ARMA adaptive processor, different system models can be obtained.
The magnitude of the error (e) can be used to judge the relative success of each model.

1.7.1 Wiener Filter Theory

The problem of determining the optimum linear filter given the structure shown in Figure 1.28 was solved by Norbert Wiener and others. The solution is referred to as the Wiener filter. The statement of the problem is as follows: Determine a set of coefficients, w_k, that minimize the mean of the squared error of the filtered output as compared to some desired output. The error is written

e(n) = d(n) - \sum_{k=1}^{M} w_k\,u(n-k+1)    (1.106)

or in vector form

e(n) = d(n) - \mathbf{w}^H \mathbf{u}(n)    (1.107)

FIGURE 1.28 Wiener filter problem.

The mean squared error is a function of the tap weight vector w chosen and is written

J(\mathbf{w}) = E\{e(n)\,e^*(n)\}    (1.108)

Substituting in the expression for e(n) gives

J(\mathbf{w}) = E\{d(n)d^*(n) - d(n)\mathbf{u}^H(n)\mathbf{w} - \mathbf{w}^H\mathbf{u}(n)d^*(n) + \mathbf{w}^H\mathbf{u}(n)\mathbf{u}^H(n)\mathbf{w}\}    (1.109)

J(\mathbf{w}) = \mathrm{var}(d) - \mathbf{p}^H\mathbf{w} - \mathbf{w}^H\mathbf{p} + \mathbf{w}^H\mathbf{R}\mathbf{w}    (1.110)

where p = E{u(n)d*(n)} is the vector that is the product of the cross correlation between the desired signal and each element of the input vector, and R is the autocorrelation matrix of the input vector.

In order to minimize J(w) with respect to w, the tap weight vector, one must set the derivative of J(w) with respect to w equal to zero. This will give an equation which, when solved for w, gives w_O, the optimum value of w. Setting the total derivative equal to zero gives

-2\mathbf{p} + 2\mathbf{R}\mathbf{w}_O = 0    (1.111)

or

\mathbf{R}\mathbf{w}_O = \mathbf{p}    (1.112)

If the matrix R is invertible (nonsingular) then w_O can be solved as

\mathbf{w}_O = \mathbf{R}^{-1}\mathbf{p}    (1.113)

So the optimum tap weight vector depends on the autocorrelation of the input process and the cross correlation between the input process and the desired output. Equation (1.113) is called the normal equation because a filter derived from this equation will produce an error that is orthogonal (or normal) to each element of the input vector. This can be written

E\{\mathbf{u}(n)\,e^*(n)\} = \mathbf{0}    (1.114)

It is helpful at this point to consider what must be known to solve the Wiener filter problem:

(1) The M × M autocorrelation matrix of u(n), the input vector, and
(2) The cross correlation vector between u(n) and d(n), the desired response.

It is clear that knowledge of any individual u(n) will not be sufficient to calculate these statistics. One must take the ensemble average, E{ }, to form both the autocorrelation and the cross correlation. In practice, a model is developed for the input process and from this model the second order statistics are derived.

A legitimate question at this point is: What is d(n)? It depends on the problem. One example of the use of Wiener filter theory is in linear predictive filtering. In this case, the desired signal is the next value of u(n), the input. The actual u(n) is always available one sample after the prediction is made and this gives the ideal check on the quality of the prediction.
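For a filter with only two taps, the normal equation (1.113) can be solved in closed form. The following C sketch is added here as an illustration (the 2 × 2 size, the function name, and the numeric values of R and p are arbitrary assumptions) and shows the calculation.

#include <stdio.h>

/* Solve the 2x2 normal equation R * w = p (Equation 1.113) directly.
   r is the 2x2 autocorrelation matrix of the input, p is the cross
   correlation vector, and w receives the optimum tap weights.
   Returns 0 on success, -1 if R is singular. */
int wiener2(const float r[2][2], const float p[2], float w[2])
{
    float det = r[0][0] * r[1][1] - r[0][1] * r[1][0];
    if (det == 0.0f)
        return -1;
    w[0] = ( r[1][1] * p[0] - r[0][1] * p[1]) / det;
    w[1] = (-r[1][0] * p[0] + r[0][0] * p[1]) / det;
    return 0;
}

int main(void)
{
    /* example second order statistics (assumed values for illustration) */
    float r[2][2] = { { 1.0f, 0.5f }, { 0.5f, 1.0f } };
    float p[2]    = { 0.7f, 0.3f };
    float w[2];

    if (wiener2(r, p, w) == 0)
        printf("w0 = %f, w1 = %f\n", w[0], w[1]);
    return 0;
}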
1.7.2 LMS Algorithms

The LMS algorithm is the simplest and most used adaptive algorithm in use today. In this brief section, the LMS algorithm as it is applied to the adaptation of time-varying FIR filters (MA systems) and IIR filters (adaptive recursive filters or ARMA systems) is described. A detailed derivation, justification, and convergence properties can be found in the references.

For the adaptive FIR system the transfer function is described by

y(n) = \sum_{q=0}^{Q-1} b_q(k)\,x(n-q)    (1.115)

where b_q(k) indicates the time-varying coefficients of the filter. With an FIR filter the mean squared error performance surface in the multidimensional space of the filter coefficients is a quadratic function and has a single minimum mean squared error (MMSE). The coefficient values at the optimal solution are called the MMSE solution. The goal of the adaptive process is to adjust the filter coefficients in such a way that they move from their current position toward the MMSE solution. If the input signal changes with time, the adaptive system must continually adjust the coefficients to follow the MMSE solution. In practice, the MMSE solution is often never reached.

The LMS algorithm updates the filter coefficients based on the method of steepest descent. This can be described in vector notation as follows:

\mathbf{B}_{k+1} = \mathbf{B}_k - \mu\nabla_k    (1.116)

where B_k is the coefficient column vector, μ is a parameter that controls the rate of convergence, and the gradient is approximated as

\nabla_k \approx -2 e_k \mathbf{X}_k    (1.117)

where X_k is the input signal column vector and e_k is the error signal as shown in Figure 1.27. Thus, the basic LMS algorithm can be written as

\mathbf{B}_{k+1} = \mathbf{B}_k + 2\mu\,e_k \mathbf{X}_k    (1.118)

The selection of the convergence parameter must be done carefully, because if it is too small the coefficient vector will adapt very slowly and may not react to changes in the input signal. If the convergence parameter is too large, the system will adapt to noise in the signal and may never converge to the MMSE solution.

For the adaptive IIR system the transfer function is described by

y(n) = \sum_{q=0}^{Q-1} b_q(k)\,x(n-q) - \sum_{p=1}^{P-1} a_p(k)\,y(n-p)    (1.119)

where b_q(k) and a_p(k) indicate the time-varying coefficients of the filter. With an IIR filter, the mean squared error performance surface in the multidimensional space of the filter coefficients is not a quadratic function and can have multiple minimums that may cause the adaptive algorithm to never reach the MMSE solution. Because the IIR system has poles, the system can become unstable if the poles ever move outside the unit circle during the adaptive process. These two potential problems are serious disadvantages of adaptive recursive filters that limit their application and complexity. For this reason, most applications are limited to a small number of poles. The LMS algorithm can again be used to update the filter coefficients based on the method of steepest descent. This can be described in vector notation as follows:

\mathbf{W}_{k+1} = \mathbf{W}_k - \mathbf{M}\nabla_k    (1.120)

where W_k is the coefficient column vector containing the a and b coefficients, and M is a diagonal matrix containing the convergence parameters μ for the a coefficients and ν_0 through ν_{Q-1} that control the rate of convergence of the b coefficients. In this case, the gradient is approximated using the error signal e_k (as shown in Figure 1.27) together with the recursively computed signals α_k and β_k:

\alpha_k(n-m) = x(n-m) + \sum_{p=1}^{P-1} b_p(k)\,\alpha_k(n-p)    (1.122)

\beta_k(n-m) = y(n-m) + \sum_{p=1}^{P-1} b_p(k)\,\beta_k(n-p)    (1.123)

The selection of the convergence parameters must be done carefully because if they are too small the coefficient vector will adapt very slowly and may not react to changes in the input signal. If the convergence parameters are too large, the system will adapt to noise in the signal or may become unstable. The proposed new location of the poles should also be tested before each update to determine if an unstable adaptive filter is about to be used. If an unstable pole location is found the update should not take place, and the next update value may lead to a better solution.
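The FIR form of the LMS update, Equation (1.118), is only a few lines of C. The sketch below is an illustration added here (the buffer layout and parameter names are assumptions; the book's own LMS applications appear in section 5.5): each call filters one sample, forms the error against the desired signal, and adjusts every coefficient by 2μ e_k x(k-q).

/* One iteration of the LMS adaptive FIR filter (Equation 1.118):
     y(k)  = sum_q b[q]*x(k-q)
     e(k)  = d(k) - y(k)
     b[q] += 2*mu*e(k)*x(k-q)
   x[0..Q-1] holds the current and past inputs, newest in x[0].
   Returns the error e(k); the caller updates the x delay line. */
float lms_fir(float *b, const float *x, int Q, float d, float mu)
{
    float y = 0.0f, e;
    int q;

    for (q = 0; q < Q; q++)
        y += b[q] * x[q];               /* filter output */
    e = d - y;                          /* error against the desired signal */
    for (q = 0; q < Q; q++)
        b[q] += 2.0f * mu * e * x[q];   /* steepest-descent coefficient update */
    return e;
}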
This ean be described in vector notation as follows By.) =B,-AV, a116) ‘where B, is the coefficient column vector, 1 is a parameter that controls the rate of con- vergence and the gradient is approximated as ‘here % isthe input signal column vector and e, isthe error signal as shown on Figure 1.27, Thus, the basic LMS algorithm can be writen as Byay =By +2 Xp «atts, ‘The selection of the convergence parameter must be done carefully, because if tis {00 smal the coefficient vector will adapt very slowly and may not react to changes in the input signal. I the convergence parameter is too large, the system will adapt to noise in the signal and may never converge to the MMSE solution. FFor the adaptive IR system the transfer function is described by y(n) es e Da@ur-9-Fa,@yin—p), any ‘Sec.1.8 References 51 Where b(k) and a(k) indicate the time-varying coefficients ofthe filter. With an IR filter, the mean squared enror performance surface in the multidimensional space of the filter Cocfficients isnot a quadratic function and can have multiple minimums that may cause the adaptive algorithm to never reach the MMSE solution. Because the IR system has ples, the system can become unstable if the poles ever move outside the unit circle dur. ing the adaptive process. These two potential problems ae serious disedvantages of aap tive recursive filters that limit thee application and complexity. For this reason, most ap. Plcations are limited toa small number of poles. The LMS algorithm can again te used {0 update the filter coefficients based on the method of steepest descent. This can be de. scribed in vector notation a follows: Waa = WMV, (1.120) Where W is the coefficient column vector containing the a and b coefficients, M is a di- ‘zonal matrix containing convergence parameters ut fr thea coefficients and vy through Voc that controls the rate of convergence of the b coefficients. In this cas, the gradient is approximated as ay het isthe eror signal as showa in Figure 1.27, and (hm + F bya —9) (um Bath) = 1h—m)+ $0,048, (8~ p. «urs ‘The selection ofthe convergence parameters must be done carefully because if they a too small the coefficient vector will adapt very slowly and may not rest wo changes in the input signal. Ifthe convergence parameters are too large the system will adapt to noise in the signal or may become unstable. The proposed new location of the poles should also be tested before each update to determine if an unstable adaptive filter is ‘about to be used. If an unstable pote location is found the update should not take place ‘and the next update value may lead to a beter solution 1.8 REFERENCES BRiGHAM,E. (1974). The Fast Fourier Transform. Englewood Cliff, NI: Prentice Hall ‘CLARKSON, P (1993). Optimal and Adaptive Signal Processing. FL: CRC Press, Buotr, DF. (BA). (1987), Handbook of Digital Signal Processing. San Diego, CA: Acsdemic Press = C Programming Fundamentals Chap. 2 (4) A method of organizing the data and the operations so that a sequence of program steps can be executed from anywhere in the program (functions and data structures) and (8) A method to move data back and forth between the outside world and the program (Gnpurvourpur) ‘These five elements are required for efficient programming of DSP algorithms. Their im- ‘plementation in C is described inthe remainder of this chapter. ‘As a preview of the C programming language, a simple real-time DSP program is shown in Listing 2.1. It illustrates each ofthe five elements of DSP programming. 
The listing is divided into six sections as indicated by the comments in the program. This simple DSP program gets a series of numbers from an input source such as an A/D converter (the function getinput() is not shown, since it would be hardware specific) and determines the average and variance of the numbers which were sampled. In signal-processing terms, the output of the program is the DC level and total AC power of the signal.

The first line of Listing 2.1, main(), declares that the program called main, which has no arguments, will be defined after the next left brace ({ on the next line). The main program (called main because it is executed first and is responsible for the main control of the program) is declared in the same way as the functions. Between the left brace on the second line and the right brace halfway down the page (before the line that starts float average...) are the statements that form the main program. As shown in this example, all statements in C end in a semicolon (;) and may be placed anywhere on the input line. In fact, all spaces and carriage control characters are ignored by most C compilers. Listing 2.1 is shown in a format intended to make it easier to follow and modify.

The third and fourth lines of Listing 2.1 are statements declaring the functions (average, variance, sqrt) that will be used in the rest of the main program (the function sqrt() is defined in the standard C library as discussed in the Appendix). This first section of Listing 2.1 relates to program organization (element four of the above list). The beginning of each section of the program is indicated by comments in the program source code (i.e., /* section 1 */). Most C compilers allow any sequence of characters (including multiple lines and, in some cases, nested comments) between the /* and */ delimiters.

Section two of the program declares the variables to be used. Some variables are declared as single floating-point numbers (such as ave and var); some variables are declared as single integers (such as i, count, and number); and some variables are arrays (such as signal[100]). This program section relates to element one, data organization.

Section three reads 100 floating-point values into an array called signal using a for loop (similar to a DO loop in FORTRAN). This loop is inside an infinite while loop that is common in real-time programs. For every 100 samples, the program will display the results and then get another 100 samples. Thus, the results are displayed in real-time. This section relates to element five (input/output) and element three (program control).
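Because getinput() is hardware specific, it is not part of the listing. For trying the program out on a general-purpose computer, a stand-in such as the following can be used (a sketch only, not part of the original listing; the use of rand() to synthesize a test signal is an assumption made here for illustration):

    #include <stdlib.h>

    /* hypothetical stand-in for the hardware-specific getinput() of
       Listing 2.1: returns a pseudo-random sample uniformly
       distributed between -0.5 and +0.5                            */
    float getinput(void)
    {
        return ((float)rand()/(float)RAND_MAX) - 0.5;
    }

With this stub, the program should report an average near zero and a variance near 1/12 (about 0.083), which gives a quick check that the average and variance functions are working.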
Section four of the example program uses the functions average and variance , # maint) ‘The Elements of Re Sec.21 ime DSP Programming y+ section 1 */ float average(),variance(),sqrt(); /* declare functions */ float signal 100} ,ave, var: Prsection 2 +/ 7 declare variables */ whtle(a) ( for(count = 0; count < 100 ; signal [count] = getinput count ++) ( /* section 3 */ (7 road input signal */ , ave = average signal, count) : var © variance(signal count) ” i section 4 */ calculate results */ pelnté(*\n\naverage = $£°,ave) printé(* Variance = #£",var) > ) £ ‘ ” i section 5 */ Print results */ float average(flost array(],int size) /* ” jection 6 */ function calculates average */ float sun = 0.0; ° /* intielize and declare sun */ fort ie size ; int) ‘sum = sum + array{i] rovurn (eun/size); calculate sun */ return average */ r oat variance (float exray(),int size) /* function calculates variance */ int i; J declare local variables */ Float ave; float sum = 0.0; (* intielize sum of signal */ Aloat sum = 0.0; (sum of signal squared */ for(i = 0; i < size; ist) ‘pum = sup + arvaylil? und = sun? + arraylij*array{i}; /* calculate both sums */ ) ave = sun/size J calculate average */ return ({sui2 ~ suntave) /(size-1)); /* return variance */ ce of Listing 2.1. Example C program that calcula ‘8 sequence of numbers the average and var to calculate the statistics to be printed, The variables ave and var are used to store the results and the library function print is used to display the results. This part of the program relates to element four (Functions and data structures) because the operations de- fined in functions average and variance are executed and stored. 56 C Programming Fundamentals Chap, 2 Section five wes the library function pede to dplay the resus ave, vay anal cll he function eatin oe 6 dup he senda dvi Ta a of the program relates loment our (anton) and clement eeu ause the operons defined in function aga ae executed an he race played The to functions, average and variance, are defied inthe remain Listing 2. This ast seston relates primi to element two (opera) aa ee tiled operation ofeach function i defined inthe same Way tate mate Pa fied The anton an argument peste dnd and te lc vale ee cach function ze declared. The operation reed by each function se ten deen lowed by artum statement that passes the result bak othe mais ee 2.2 VARIABLES AND DATA TYPES All programs work by manipalating some kindof information. variable in Cis defined by declaring that a sequence of characters (the variable identifier or name) are to be {tealed as a particular predefined type of data. An identifier may be any sequence of chan acters (usually with some length restrictions) that obeys the following three rues (2) All identifiers start witha letter or an underscore (_), (2) The rest ofthe identifier can consist of fetes, underscores, and/or digits, (@) The rest of the identifier does not match any ofthe C keywords. (Check compiler implementation for ais ofthese.) 4m particular, C is case sensitive; making the variables Average, AVERAGE, and AVeRaGe all diferent. The C language supports several different data types that represent inte resent integers (ectared Ant), floating-point numbers (declared fat or double), and text data (de clated char), Also, arrays of each variable type and pointers of each type may be de > arithmetic shift right (numberof bits is operand) ‘The unary bitwise NOT operator, which inverts all the bits in the operand, is imple- ‘mented with the ~ symbol. 
For example, if 4 is declared as an uneignea int, then 4-07 sets 4 to the maximum integer Value for an unsigned int. 2.3.3 Combined Operators C allows operators to be combined withthe assignment operator (=) so that almost any statement ofthe form = cvariable> ‘can be replaced with = where represents the same variable name in all cases. For example, the following pairs of expressions involving x and y perform the same function: In many eases, the left-hand column of statements will result in a more readable and eas- ier to understand program. For this reason, use of combined operators is often avoided. Unfortunately, some compiler implementations may generate more efficient code if the ‘combined operator is used, 2.3.4 Logical Operators Like all C expressions, an expression involving a logical operator also has a value. A log: ical operator is any operator that gives a result of true of false, This could be a compar son between two values, or the result ofa series of ANDSs and ORS, Ifthe result ofa logi- cal operation is true, it has a nonzero value; if itis false, it has the value O. Loops and if e Programming Fundamentals Chap. 2 statements (covered in section 2.4) check the result of logical operations and change pro. ‘gram flow accordingly. The nine logical operators ae as follows: <—essthan <= less than or equal to == equalto >= greater than or equal 1 > greater than f= notequal to & logical AND TL ogical OR £ —Iogical NOT (onary operator) Note that == can easily be confused with the assignment operator (=) and will result in a valid expression because the assignment also has a value, which is then interpreted as ‘mse or false. Also, && and | | should not be confused with their bitwise counterparts (& and |) as this may result in hard to find logic problems, because the bitwise results may ‘not give true or false when expected. 2.35 Operator Precedence and Type Conversion Like all computer languages, C has an operator precedence that defines which operators in an expression are evaluated firs. If this order isnot desired, then parentheses can be used to change the order. Thus, things in parentheses are evaluated first and items of equal precedence are evaluated from left to right. The operators contained in the parentheses oF ‘expression are evaluated in te following order (listed by deereasing precedence): tee increment, decrement - unary minus ‘multiplication, division, modulus Addition, subtraction shift lft, shift ight relational with less than or greater than equal, not equal bitwise AND * Ditwise exclusive OR 1 bitwise OR ee logical AND I logical OR Statements and expressions using the operators just described should normally use vari- ables and constants of the same type. If, however, you mix types, C doesn’t stop dead Sec.24 Program Contro! 63 (like Pascal) or produce a strange unexpected result (like FORTRAN). Instead, C uses a set of rules to make type conversions automatically. The two basic rules (1) fan operation involves two types, the value with a lower rank is converted to the type of higher rank. This process is called promotion and the ranking from highest to lowest type is double, flost, long, int, short, and char. Unsigned of each of the types outranks the individual signed type. (2) In an assignment statement, the final result is converted tothe type of the variable that is being assigned. This may result in promotion or demotion where the value is truncated to a lower ranking type. 
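A short example consistent with these two rules (added here as an illustration; the variable names are arbitrary):

    int   i = 7;
    float x;

    x = i / 2.0;    /* rule 1: i is promoted to double before the divide,
                       giving 3.5, which is then assigned to the float x  */
    i = x * 2.1;    /* rule 2: the double result 7.35 is demoted
                       (truncated) to the int value 7 when assigned to i  */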
Usually these rules work quite well, but sometimes the conversions must be stated explicitly in order to demand that a conversion be done in a certain way. This is accom= plished by type casting the quantity by placing the name of the desired type in parenthe- ses before the variable or expression, Thus, if 4 is an Amt, then the statement 4=10*(1.55¢1.67) ; would set 4 to 32 (the truncation of 32.2), while the statement 410*((int)1.55¢1.67); would set 4 to 26 (the truncation of 26.7 since (4nt) 1.55 is truncated to 1). 2.4 PROGRAM CONTROL ‘The large set of operators in C allows a great deal of programming flexibility for DSP ap- plications. Programs that must perform fast binary or logical operations can do so without Lusing special functions to do the bitwise operations. C also has a complete set of program control features that allow conditional execution or repetition of statements based on the result of an expression. Proper use of these control structures is discussed in section 2.11.2, where structured programming techniques are considered, 2.4.1 Conditional Execution: {£-e18 In C, as in many languages, the 4€ statement is used to conditionally execute a series of statements based on the result of an expression. The 4¢ statement has the following ‘generic format: s£(valwe) else statenent2; ‘where value is any expression that results in (or can be converted to) an integer value, If value is nonzero (indicating a true result), then statement: is executed; otherwise, atatement2 is executed. Note that the result of an expression used for value need ‘ot be the result ofa logical operation—al that is required is thatthe expression results in zero value when statement? should be executed instead of statement. Also, the os Programming Fundamentals Chap. else statement2; portion of the above form is optional, allowing statement io be skipped if value is false, ‘When more than one statement needs to be executed if a particular value is te, a ‘compound statement is used. A compound statement consists of a left brace ({), some ‘number of statements (each ending with a semicolon), and a right brace (9). Note that the body of the maim () program and functions in Listing 2.1 are compound statements, In fact, a single statement can be replaced by a compound statement in any of the contol structures described in this section. By using compound statements, the ££-e1.8e con trol structure can be nested asin the following example, which converts a floating-point ‘number (result) to a 2-bit wos complement number (out): sf trout > 0) ( /* positive outputs if (result > sigma) out = ar /* biggest output */ else J? 0 « result a2) ? a0-al : a2-al; Conditional expressions are not necessary, since 4€-ele statements can provide the Same function. Conditional expressions are more compact and sometimes lead to more 66 Programming Fundamentals Chap. 2 efficent machine code. On the other hand, they are often more confusing than the fami. iar L€-e1ae control structure 2.4.4 Loops: while, do-while, and for Chas three control structures that allow a statement or group of statements tobe repeated a fixed or variable number of times. The whe loop repeats the statements until atest ex- pression becomes false, or zero, The decision to go through the loop is made before the loop is ever started. Thus, itis possible that the loop is never traversed. The general form is while (expression) where statement can be 2 single statement or a compound statement enclosed in braces. 
An example of the latter that counts the number of spaces in a null-terminated siring (an array of characters) follows: spacecount = 0; /* space.count is an int */ b= 0; 1 array index, = 0*/ while(stringli)) ( Le(stringli] «= * ") space_counters ie 1 poxt char */ > Note that ifthe string is zero length, then the value of string J will initially point to ‘the null terminatoe (which has @ Zero or false value) and the white loop will not be exe- ‘cuted. Normally, the whi.1e loop will continue counting the spaces in the tring until he null terminator is reached. ‘The do-whi Le loop is used when a group of statements need to be repeated and the exit condition should be tested atthe end of the loop. The decision to go through the loop one more time is made ater the loop is traversed so thatthe loop is always executed at least once. The format of do-wh41e is similar tothe whi. 1e loop, except thatthe do keyword starts the statement and while (expression) ends the statement. A single ‘of compound statement may appear between the do and the while keywords. A common use for this loop isin testing the bounds on an input variable as the following example i- Tusteaes: a ( print#(*\ninter FFP leogth (1 seant ("8d", ££ ft_length) ) wnile(tte_engen > 1024); 1 whan 1025) =") In this code segment, ifthe integer ££t_ength entered by the users larger than 1024, the user is prompted again until the ££¢_ength entered is 1024 of less. ‘Sec.2.4 Program Control oa ‘The fox loop combines an initialization statement, an end condition statement, and an action statement (executed atthe end of the lop) into one very powerful control struc- ture. The standard form is for(initialize ; test condition ; end update) ‘The three expressions are all optional (for (7) + is an infinite loop) and the statement may bbe single statement, a compound statement or just a semicolon (a nul statement). The most frequent use ofthe fox loop i indexing an array through its elements. For example, for(i = 0; i < length ; is) alil sets the elements ofthe array a to zero from a0} up to and including a [Length-1} This for statement sets 4 t0 zero, checks to see if 4 is less than Length, if so it exe- cutes the statement a [4] #07, increments 4, and then repeats the loop until 4 i equal to Length. The integer 4 is incremented or updated at the end ofthe loop and then the test condition statement is executed. Thus, the statement after a €or loop is only executed if the test condition in the fox loop is tue. For loops can be much more complicated, be- ‘cause each statement can be multiple expressions as the following example illustrates: for(i m0, i322; i < 257 ite, 3 = 3853) printe(*\nta $2", i,13); ‘This statement uses two Ants in the for loop (4, 43) to print the frst 25 powers of 3. [Note thatthe end condition is sila single expression (4 < 25), but that the initialization ‘and end expressions are two assignments for the two integers separated by a comma, 2.45 Program Jumps: break, continue, and goto ‘The loop control structures just discussed and the conditional statements (Af, 4£-e180, and switch) are the most important control structures in C. They should be used ex: clusively in the majority of programs. The last three control statements (break, continue, and goto) allow for conditional program jumps. If used excessively, they will make a program harder to follow, more difficult to debug, and harder to modify. 
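As a brief illustration of the first two of these statements (a sketch added here; the array x and the variables i, n, and sum are assumed to be declared elsewhere), the following loop sums the positive values in x and stops early at the first zero sample:

    sum = 0.0;
    for(i = 0 ; i < n ; i++) {
        if(x[i] < 0.0) continue;    /* skip negative samples          */
        if(x[i] == 0.0) break;      /* zero marks the end of the data */
        sum = sum + x[i];
    }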
‘The break statement, which was already illustrated in conjunction with the switeh Statement, causes the program flow to break free of the eviteh, for, while, or ‘do-whi Le that encloses it and proceed to the next statement after the associated contol Structure. Sometimes break is used to leave a loop when there are two or more reasons to end the loop. Usually, however, it is much clearer to combine the end conditions in a single logical expression in the loop test condition. The exception to this is when a large ‘number of executable statements are contained in the loop and the result of some state ‘ment should cause a premature end ofthe loop (for example, an end of file or other error condition), 68 Programming Fundamental Chap, 2 ‘The continue statement is almost the opposite of break; the continue causes the rest of an iteration to be skipped and the next iteration to be started. The ‘cont nue statement can be used with for, while, and do-whi.ie loops, but cannes be used with switeh. The flow of the lop in which the continue statement appear, is interrupted, but the loop isnot terminated. Although the comtAmue statement can re. sult in very hard-o-follow code, it can shorten programs with nested 4f-e1 0 sta ‘ments inside one of three loop structures The goto statement is available in C, even though itis never requited in C pro ramming. Most programmers with a background in FORTRAN or BASIC computer Ins, ‘guages (both of which require the goto for program contol) have developed bad pro, ‘gramming habits that make them depend on the goto. The goto statement in C uses » Jabel rather than number making things alte beter. For example, one possible legit imate use of goto is for consolidated erro detection and cleanup asthe following simple ‘example illustrates rogram statements status = function_one (alpha, beta, constant); Af(status t= 0) goto error_oxit status = function two (delta, time) ; i€{status != 0) goto error_exie; error_exit! (rend up hare from all errors */ switch (status) ( case 1: print £(*\nbivide by zero error\n*); exit: case 2 print€(*\nout of menoty error\n*) ecto; cave 3: printf (*\ntog overflow error\n"); exe: dotaule rine£(*\nnknow error\n*); ent; Sec.2.6 Functions 69 In the above example, both of the fictitious functions, function_one and function_two (sec the next section concerning the definition and use of functions), Perform some set of operations that can result in one of several errors. If no errors ate de. tected, the function retus zero and the program proceeds normally. If an error is de- fected, the integer status is set to an error code and the program jumps to the label error_excit where a message indicating the type of error is printed before the program is terminated, 25 FUNCTIONS All C programs consist of one or more functions. Even the program executed first is @ function called maim(), as illustrated in Listing 2.1. Thus, unlike other programming languages, there is no distinction between the main program and programs that are called by the main program (sometimes called subroutines). A C function may or may not re- {um a value thereby removing another distinction between subroutines and functions in languages such as FORTRAN. Each C function isa program equal to every other fne- tion. 
Any function can call any other function (a function can even call itself), or be called by any other function, This makes C functions somewhat different than Pascal pro- cedures, where procedures nested inside one procedure are ignorant of procedures elsc- ‘where in the program. It should also be pointed out that unlike FORTRAN and several other languages, C always passes functions arguments by value not by reference. Because arguments are passed by value, when a function must modify a variable in the calling program, the C programmer must specify the function argument as a pointer to the begin. ning ofthe variable in the calling program's memory (see section 2.7 for a discussion of pointers), 2 Defining and Declaring Functions ‘A function is defined by the function type, a function name, a par of parentheses contain- ing an optional formal argument list, and a pair of braces containing. the optional exe- cutable statements. The general format for ANSI Cis a follows: type nano(formai argent List with declarations) i function body , The type determines the type of value the function returns, not the type of arguments. If no type is given, the function is assumed to return an Aint (actually, a variable is also assumed to be of type dint if no type specifier is provided). Ifa function does not return a Value, it should be declared with the type ved. For example, Listing 2.1 contains the function average as follows: 0 Programming Fundamentals Chap. 2 Float average(float array{},int size) ‘ Float sum = 0.0; /* Anitialize and declare eum */ for(i = 0; i < size ; is) sums sun + arraylil; /* calculate sun */ roturnisun/size); — /* return average */ ‘The first line inthe above code segment declares a function called avexage will retum 1 single-precision floating-point value and will accept two arguments, The (Wo argument names (axvay and #4:ze) are defined in the formal argument list (also called the formal parameter list). The type ofthe two arguments specify that array is a one-dimensional array (of unknown length) and size is an int. Most modern C compilers allow the ar _gument declarations for a function to be condensed into the argument list. ‘Note that the variable array is actualy just a pointer to the beginning of the oat: array that was allocated by the calling program. By passing the pointer, oly one value is passed to the function and not the large floating-point array. Infact, the Function could also be declared as follows: float average(float *array, int size) This method, although more correct in the sense that it conveys what is passed to the function, may be more confusing because the function body references the variable as array{i} ‘The body of the function that defines the executable statements and local variables to be used by the function are contained between the two braces. Before the ending brace (@), a retum statement is used to return the #Loae result back to the calling program. If the function did not return a value (in which case it should be declared vod), simply omitting the retur statement would return control to the calling program after the lst statement before the ending brace. When a function with no return value must be termi nated before the ending brace (if an error is detected, for example), a returny state ‘ment without a value should be used. The parentheses following the return statement are only required when the result of an expression is returned. 
Otherwise, a constant or vai able may be retumed without enclosing it in parentheses (for example, return 0; of return ny), ‘Arguments are used to convey values from the calling program to the function, Because the arguments are passed by value, a local copy of each argument is made for the function to use (usually the variables are stored on the stack by the calling program). ‘The local copy of the arguments may be freely modified by the function body, but will not change the values inthe calling program since only the copy is changed. The retum statement can communicate one value from the function to the calling program, Other than this retumed value, the function may not directly communicate back to the calling prograrn. This method of passing arguments by value, such thatthe calling program's Sec.2.5 Functions n variables are igolated from the function, avoids the common problem in FORTRAN ‘where modifications of arguments by a function get passed back to the calling program, resulting in the occasional modification of constants within the calling program, ‘When a fonction must return more than one value, one or more pointer arguments ‘must be used, The calling program must allocate the storage for the result and pass the function a pointer to the memory area to be modified. The function then gets @ copy of the pointer, which it uses (with the indirection operator, *, discussed in more detail in Section 2.7.1) to modify the variable allocated by the calling program. For example, the functions average and variance in Listing 2.1 can be combined into one function that passes the arguments back to the calling program in two #1oat pointers called ave and var, as follows: void stata(float ‘array, int size,float favo, float *var) ‘ float sum = 0.0; /* initialize sun of signal */ float sun? = 0.07 /* sum of signal squared */ for($ = 07 1 < size; ise) ( ‘gun = gun + arrayli];—/* calculate suns */ sund = sum? + array(i]*arrey(i]: save © sun/eize; —/* pass average and variance */ svar = (eus2-sum* (*ave) /(eize-1) , In this function, no value is returned, so itis declared type voLd and no retum statement js used. This stats function is more efficient than the functions average and vari ‘ance together, because the sum of the array elements was calculated by both the average function and the variance function. Ifthe variance is not required by the ealing program, then the average function alone is much more efficient, because the sum of the squares of the array elements isnot required to determine the average alone, 2.5.2 Storage Class, Privacy, and Scope In addition to type, variables and functions have a property called storage class. There are four storage classes with four storage class designators: auto for automatic variables stored on the stack, exttexn for external variables stored outside the current module, ‘static for variables known only in the current module, and register for temporary variables to be stored in one of the registers of the target computer. Each of these four storage classes defines the scope or degree of the privacy a particular variable or function holds. The storage class designator keyword (auto, extern, static, or register) ‘must appear first in the variable declaration before any type specification. The privacy of 1 variable or fumetion is the degree to which other modules or functions cannot access a variable or calla function, Scope is, in some ways, the complement of privacy because R C Programming Fundamentals Chap. 
2 the soope ofa variable describes how many modules o functions have acces tothe va. ‘Auto variables can ony be declared within a function, ae created when the fine. ton is voked nda los! whon te fneton is exited. ae. varsbles are known oy tothe fonction in which they are declared and dono retain tit value fom ne Son ofa fineton to ander. Because auto variables ae stored on tach fan that uses only auto variables can call iself recursively. The auto keyed fn sed in Cpogams sinc variables declared win func dea ote ates Another important dino ofthe auto sorage class is that an auto variable only defined within the conto stcture hat surrounds i. That the scope of an ne aril inte wothe expressions between he races (Cand ) conaing tc vege aration For example, the following simple program would geacnie a coro ‘since J is unknown outside of the for loop: = maint) ¢ ine 4; for(i 0; i < 10; see) { int 3) (* declare 5 here */ ges, Printé(*ea", 4); , Print€("84",3); + 3 unknown here */ Register variables have the same scope as auto variables, but are stored in fore typeof register inthe target computer Ifthe target computer doesnot have res, ber of registers that can be accessed much faster than outside memory, registee ver, ables can be used to speed up program execution significantly. Most compilers init the ke of register variables to pointers, integers, and characters, because the target man hines rarely have the ability to use registers for floating-point or doubleprecision ope tions. Extern variables have the broadest scope. They are known to all functions in # ‘module and are even known outside of the module in that they are declared, Raters, {avlables are stored in their own separate data area and must be declared outsie of any functions, Functions that access eactern variables must be careful not to call themselves oF call other functions that access the same extern variables, since extern variables ‘etain their values as functions are entered and exited, Ratera is the default sorege Include £4.16 from standard C library #18 expression Conditionally compile the following code if result of expression is tue #ifdef symbol Conditionally compile the following code ifthe ‘symbol is defined #ifndef syabol Conditionally compile the following code ifthe symbol is not defined fer Conditionally compile the following code ifthe associated #4 is not true fenaie Indicates the end of previous #else, #4£, Medes, or #Endet undef macro Undefine previously defined macro 2.6.1. Conditional Preprocessor Directives Most of the above preprocessor directives are used for conditional compilation of por tions of a program. For example, in the following version of the stats function (de scribed previously in seetion 2.5.1), the definition of DEBUG is used to indicate thatthe print statements should be compile: ‘Sec.2.6 Macros and the C Preprocessor B void state(float ‘array, int size, float *ave, float *var) ‘ Float sum = 0.0; /* initialize sun of signal */ float sun2 = 0.0; /* sum of signal squared */ for(i = 0; 4 ay + ta) Adecine wxia,b) Wad > @) ? @ so #etine Miula.b) (la) < 0b) 2 (abs a) #eefine ROND(a) (a) <0) 2 (Ge) ((a)-0.5) : (dnt) ((a)+0.5)) Not that each ofthe above macros is enclosed in parentheses so that it can be used freely in expressions without uncertainty about the order of operations. Parentheses are alse ee All of the macros defined so far have names that contain only capital leters. 
While this is not required, it does make it easy to separate macro from normal C keywords iB programs where macros may be defined in one module and included (using the ‘Hinclude directive) in another. This practice of capitalizing all macro names and sing Sec.2.7 Pointers and Arrays n lower case for variable and function names will be used in all programs in this book and ‘on the accompanying disk. 12.7 POINTERS AND ARRAYS ‘A pointer is a variable that holds an address of some data, rather than the data itself. The tse of pointers is usually closely elated to manipulating (assigning or changing) the ele- ments of an array of data, Pointers are used primarily for three purposes (1) To point to different data elements within an aay (@) Toallow a program to create new variables while a program is executing (dynamic ‘memory allocation) 8) To access different locations in a data structure ‘The first two uses of pointers will be discussed inthis section; pointers to data structures are considered in section 2.8.2, 2.7.1 Special Pointer Operators ‘Two special pointer operators are required to effectively manipulate pointers: the indi- rection operator (*) and the address of operator (&). The indirection operator (#) is used Whenever the data stored at the address pointed to by a pointer is required, that is, when- ever indirect addressing is required. Consider the following simple program: maint) i int i, eptes iat Jt set the value of i */ per = ef, Y point co address of 1 */ Brinte(*\ntd",i); 7 print twa ways */ printé(*\nkar ptr); sper /* change 1 with pointer */ Printe(*\ntd ta*,ptr,i); /* print change #/ > ‘This program declares that 4 isan integer variable and that ptx is a pointer to an integer ‘ariable, The program firs sets & to 7 and then seis the pointer to the address of by the Statement permed. The compiler assigns 4 and ptxr storage locations somewhere in ‘memory. Atrun time, Dex is sett the starting address of the integer variable 4. "The above rogram uses the function printf (sce section 2.91) o print the integer value of jin two ifferent ways—by printing the contents of the variable 4 (printf ("\akam, 4 and by using the indirection operator (print#("\nea", *ptr) ;). The presence of the * operator in front of ptx directs the compiler to pass the value stored atthe address 8 C Programming Fundamentals Chap. 2 pty to the print function (in this case, 7). only pes were used, then the address as. signed to ptx would be displayed instead ofthe value 7. The lat two lines of the exam pie illustrate indirect storage: the data atthe pt address is changed to 11. Ths results in changing the value of 4 only because pe is pointing o the address of 4. ‘An array is essentially a section of memory that is allocated by the compiler and assigned the name given in the declaration statement. In fact, the name given is nothing more than a fixed pointer to the beginning ofthe array. In C, the array name can be used as pointer of it can be used to reference an element ofthe aay (i.c,, a[21). If is de- clared as some type of array then *a and a[0} are exacly equivalent. Furthermore, (a) and al] are also the same (as long as 4 is declared as an integer), although the meaning ofthe second is often more clear. Arrays can be rapidly and sequentially ac- cessed by using pointers and the increment operator (++). For example, the following three stalements set the first 100 elements ofthe array a to 10: Ant. 
*pointer; for(i = 0; i < 100 7 i++) *pointerss 0 ‘On many computers this code will execute faster than the single statement for (4=0; 4<1007 L++) a{1]=107, bocause the post increment ofthe pointer is faster than the array index calculation, 2.7.2 Pointers and Dynamic Memory Allocation ‘Chas a set of four standard functions that allow the programmer to dynamically change the type and size of variables and arrays of variables stored in the computer’s memory. C ‘Programs can use the same memory for diferent purposes and not wast large sections of ‘memory on arrays only used in one small section of a program. In addition, auto var- ables are automatically allocated on the stack atthe beginning of a function (or any see- tion of code where the variable is declared within a pair of braces) and removed from the stack when a function is exited (or atthe right brace, }). By proper use of auto variables (Gee section 2.5.2) and the dynamic memory allocation functions, the memory used by a particular C program can be very litle more than the memory required by the program at every step of execution. This feature of C is especially attractive in multiuser environ: ‘ments where the product of the memory size required by a user and the time that memory is used ultimately determines the overall system performance. In many DSP applications, the proper use of dynamic memory allocation can enable a complicated DSP function 10 be performed with an inexpensive single chip signal processor with a small limited inter nal memory size instead of a more costly processor with a lager extemal memory. Four stindard functions are used fo manipulate the memory available toa particular program (sometimes called the heap to differentiate it from the stack). Ma Loe allocates bytes of storage, eaLoe allocates items which may be any number of bytes long, fee removes a previously allocated item from the heap, and reaLoe changes the size of 8 previously allocated item, ‘When using each function, the size ofthe item to be allocated must be passed tothe ‘Sec.2.7 Pointers and Arrays nm function. The function then returns a pointer to a block of memory atleast the size of the item or items requested. In order to make the use of the memory allocation functions portable from one machine to another, the built-in compiler macro sizeof must be Used. For example: int *ptr: per = (int *) malloc(sizeot (ine) )2 allocates storage for one integer and points the integer pointer, px, to the beginning of the memory block. On 32-bit machines this will be a four-byte memory block (or one word) and on 16-bit machines (such asthe IBM PC) this will typically be only two bytes. Because mal Loc (as well 3; ea. Loc and rea1oc) retums a character pointer, it must bbe cast to the integer type of pointer by the (Amt *) cast operator. Similarly, calLoe and a pointer, array, can be used to define a 25-element integer array as follows: array = (int *) calloc(25, sizest(int)); ‘This statement will allocate an array of 25 elements, each of which is the size of an int. ‘on the target machine. The array can then be referenced by using another pointer (chang. ing the pointer array is unwise, because it holds the position of the beginning ofthe allo- cated memory) or by an array reference such as array {4} (where 4 may be from 0 to 24), The memory block allocated by €a1L0¢ is also initialized to zer0s. 
Malloc, caLloc, and free provide a simple general purpose memory allocation package, The argument to free (cast as a character pointer) is a pointer toa block previ ‘ously allocated by maLloc or ealee; this space is made available for further alloca tion, but its contents are let undisturbed. Needless to say, grave disorder will result ifthe space assigned by maL2oe is overrun, or if some random number is handed to free. ‘The function f2ree has no return value, because memory is always assumed to be hap- pily given up by the operating system. Realloc changes the size of the block previously allocated to a new size in bytes ‘and returns a pointer to the (possibly moved) block. The contents of the old memory block will be unchanged up to the lesser of the new and old sizes. RealL2oe is used less than ea1oe and mallee, because the size of an array is usually known ahead of time. However, ifthe size of the integer aray of 25 elements allocated inthe last example must be increased to 100 elements, the following statement can be used: array = (Int *) realloc( (char *Jarray, 100*sizeof(int)) Note that ualike ea.10¢, which takes two arguments (one forthe numberof items and ‘one forthe item size), real.loe works similar to ma1oc and takes the total size ofthe array in bytes. ICs also important to recognize that the following two statements are not ‘equivalent tothe previous rea oc statement; free( (char *array): array = (int *) callec(100,sizeot tint) ): 20 CProgramming Fundamentals Chap, 2 ‘These statements do change the size of the integer aray from 25 to 100 clements, bt do ‘ot preserve the contents of the first 25 elements. Infact, cal Loe wil initialize all 190 Jntegers to zero, while reaioc will retain the frst 25 and not set the remaining 19 ay elements to any particular value. Unlike fee, which returns no value, ma0e, realoc, and calloe retum 4 ‘ull pointes (0) if there is no available memory or ifthe area has Been corrupted by slog ing outside the bounds of the memory block. When realoc retums 0, the block pointed to by the original pointer may be destroyed. 2.7.3 Arrays of Pointers ‘Any ofthe C datatypes or pointers to ach ofthe datatypes can be declared as an ary, Arrays of pointers are especially useful in accessing large matrices. An array of poimeg to 10 rows each of 20 integer elements can be dynamically allocated as follows: ine ‘mae (20); Ane i; for(i= 0; i <30; ion) ¢ mBEL] = (int *)ealloc(20, sizeof (int); (mae (41) Drintf(*\nError in matrix allocation\n*); exit) |n this code segment, the array of 10 integer pointers is declared and then each pointer is ‘Set to 10 different memory blocks allocated by 10 successive calls to ealloc. After each call (o caLoc, the pointer must be checked to ingure that the memory was availeble (tat [41 will be tue if mae [41 is null). Each element in the matix mat can now be ‘accessed by using pointers and the indirection operator. For example, * (mat [11 * 4) gives the value of the matix element at the Ath row (0.9) and the th column (0-19) and is exactly equivalent to mat {1} (31. In fact, the above code segment is equivalent (in ‘the way mat may be referenced at least) tothe array declaration int mat {101 12011, except that mat: [10] [20} is allocated as an auto variable on the stack and the above calls to eai1.Loc allocates the space for mat on the heap. 
Note, however, that when mat is allocated on the stack as an auto variable, it cannot be used with free or reatloe ‘and may be accessed by the resulting code in a completly different way ‘The calculations required by the compiler to access a particular clement in a two- dimensional matrix (by using matzix{4] [3], for example) usually take more insnde- tions and more execution time than accessing the same matrix using pointers, This is es- pecially true if many references tothe same matrix row or column are required. However, Sepending on the compiler and the speed of pointer operations onthe target machine, 80, cess to a two-dimensional array with pointers and simple pointers opecands (even incre ‘ment and decrement) may take almost the same time as a reference to a matrix such as Sec.2.7 Pointers and Arrays an ‘241 [31. For example, the product of two 100 x 100 matrices could be coded using ‘two-dimensional array references as follows: int {1001 {1007,®{200] {100} ,ct100) (1001; /+ 3 matrices */ int 4,3, indices */ 7 code to set up mat and vec goes here */ J do matrix miltiply atby for(i= 0; i < 100; jes) { for(i = 0; 4 < 100; je0) ¢ et) = 0; for(k = 0; k < 100 ; kes) eG) = ali)0d * boa L; ) ‘The samme matrix product could aso be performed using arrays of pointers as follows: nt af260) 1100184200) (100) ,cf100) (1001; /* 3 matrices */ Aint *aper, "epee, “epee; 7+ pointers to a,bie */ int 154 1 Andicies */ /* code to set up mat and vec goes here *+/ J @ocearbey for(i = 0; 4 < 100; tee) ( ctr = efi; wwtr = blo}; for(j = 07 J < 100; Ses) ¢ faptr = ali}; septr = (aptres) * (*bpte+s): fork = 1; k< 100; kev) { Noptr += (*aptrss) * Ek) (517 eters ‘The latter form ofthe matin multiply code using arrays of pointers runs 10 to 20 percent aster, depending on te degree of optimization done by the compiler and the capabilities ‘of the target machine. Note that © [4] and a [4] ae references to arrays of pointers each Pointing to 100 integer values. Three factors help make the program with pointers faster: (1) Pointer increments (such as *aptz-+-+) are usually faster than pointer adds (@) No multiplies or shifts are required to access a particular element of each matrix, e2 CProgramming Fundamentals Chap. 2 (8) The first addin the inner most loop (the one involving ) was taken outside the Joop (using pointers apex and bptz) and the initialization of {41 £33 to zero ‘was removed. 2.8 STRUCTURES Pointers and arrays allow the same type of data to be arranged ina list and easily accessed by a program. Pointers also allow arrays to be passed to functions efficiently and dynami- cally created in memory. When unlike logically related data types must be manipulated, the use of several arrays becomes cumbersome. While itis always necessary to process the individual data types separately, itis often desirable to move all of the related data types as a single unit. The powerful C data construct called a structure allows new data types to be defined asa combination of any nurnber of the standard C datatypes. Once the size and datatypes contained ina structure ae defined (as described in the next section), the named structure may be used as any ofthe other data types in C. Arrays of structures, pointers to structures, and structures containing other structures may all be defined. ‘One drawback ofthe user defined structure is that the standard operators in C do not work with the new data structure. 
Although the enhancements to C available with the C++ programming language do allow the user to define structure operators (see The Ch Programming Language, Stoustrup, 1986), the widely used standard C language does not support such concepts. Thus, functions or macros are usually created to manipulate the structures defined by the user. As an example, some of the functions and macros required to manipulate structures of complex floating-point data are discussed in section 2.8.3. 2.8.1 Declaring and Referencing Structures. A structure is defined by a structure template indicating the type and name to be used to reference each element listed between a pair of braces. The general form of an N-element structure is as follows: struct tag_nane ( ‘ypel elenent panel type? element pane? ) variable name; In each case, typed, type2, ..., typeN refer to a valid C data type (char, int, float, or double without any storage class descriptor) and element_name}, element_nane2, ..., element_nameNi refer to the name of one of the elements of the data structure. The tag_name is an optional name used for referencing the strac= Sec.28 Structures a ture later. The optional variable_name, of list of variable names, defines the names of the structures to be defined. The following structure template with a tag name of record defines a structure containing an integer called length, a float. called sample_cate, a character pointer called name, and a pointer to an integer array called data: struct record ( int length: float sample rate; char *nane; Int ‘data; a» ‘This strcture template can be used to declare a structure called voice as follows: struct record voice: ‘The structure called voice of type record can then be initialize as follows: voice-Lengeh = 1000; voice.sample_rate = 10.23; voice nane = “voice eigmal*; ‘The last clement of the structure isa pointer to the data and must be set to the beginning ‘fa 1000-element integer array (because length is 1000 in the above initialization), Each clement of the structure is referenced with the form atruct_name.element, Thus, the 1000-clement array associated with the voice structure can be allocated as follows: voice.data = (int *) calloc(1000, sizeof (int)) Similarly, the other three elements ofthe structure can be displayed with the following code segment: print€(*\ntength = #4", voice. length) ; print f(*\nsampling rate = 8£",voice-sample rate); printf(*\nRecord name = 2", voice.nanel; A typedef statement can be used with a structure to make a user-defined data type and make declaring a structure even easier. The typedef defines an alternative name forthe structure data type, but is more powerful than #4efine, since it is a compiler directive 88 opposed to a preprocessor directive. An alternative to the record structure is a ‘typedef called RECORD 1s follows: typedef struct record RECORD; a C Programming Fundamentals Chap. 2 This statement essentially eplaces all occurences of RECORD in the program With the struct record definition thereby defining a new type of variable called RECORG that may be used to define the voice structure with the simple statement RECORD vos ex ‘The typedet statement and the structure definition can be combined so that the tag name record is avoided as follows: typedef struct ( lint length; Hloat sample rate: ‘char “name; ine tanta) ) RECORD; In fact, the typedee statement can be used to define a shorthand form of any type of ata type including pointers, arays, arays of pointes, or another typedef. For tran. 
ple, typedef char gma3(801; allows $0-character arrays to be casily defined with the simple statement STRING ame, name2}. This shorthand form using the typedef is an exact replacement for the stitement char name1 {80} ,name2 (80) ; 2.8.2 Pointers to Structures Pointers to strctures can be used to dynamically allocate arrays of structures and eff- ciently access strictures within functions, Using the RECORD typedef defined in the last section, the following code segment can be used to dynamically allocate a five clement array of RECORD structures: voices = (RECORD *) calloe(S, sizeof (RECORD) ‘These two statements are equivalent to the single-array definition “RECORD ‘voices 151; excep that the memory block allocated by ea.10¢ can be deallocated bY ‘call to the fee function. With either definition, the length of each element ofthe array ‘could be printed as follows: fords 0; $< 5; ie) Print£(*\nlength of voice 4d = ta", 1,voices{i) length) ‘The voices array can also be accessed by using a pointer to the array of structures. Ir woice._ptr is a RECORD pointer (by declaring it with RECORD *voice_ptr;), then Sec.28 Structures 85 (*voice ptr) «Length could be used to give the length of the RECORD which was Pointed to by voice_ptx. Because this form of pointer operation occurs with structures ‘often in C, a special operator (->) was defined. Thus, voice ptr->iength is equiv. alent to (*voice_ptr) length, This shorthand is very useful when used with func: tions, since a local copy of a structure pointer is passed to the function. For example, the following function wil print the length of each record in an array of RECORD of length size: void print_record Length(RBCORD ‘rec, int size) ‘ for(i'= 0; 4 < aise ; ist) ( Print (*\nlength of record td = $",$,rec_>length); > ‘Thus, a statement like Print_record length(voices,5) ; will print the leagths stored in each ofthe five elements of the array of RECORD structures 2.8.3 Complex Numbers ‘A complex number can be defined in C by using a structure of (wo floats and a typedet as follows: typedef struct ( ‘oat real Three complex numbers, x,y, and = can be defined using the above structure as follows: COMPLEX x.y, 27 In order to perform the complex addition 2 = x + y without functions or macros, the fol- lowing two C statements are required: Z.real = x.real + y.real; zuimag = x.imag + y-imag ‘These two statements are required because the C operator + can only work withthe indi vidual parts of the complex structure and not the structure itself. In fact, a statement in volving any operator and a structure should give a compiler error. Assignment of any structure (like 2 = 3¢7) works just fine, because only data movement is involved. A sim. ple function to perform the complex addition can be defined as follows: 86 C Programming Fundamentals Chap. 2 (COMPLEX cad (COMPLEX a, COMPLE b) fi (* pass by value */ coMPux sum; /* dofine return value */ sun.real = a.real + b.real; ssum.ineg = a-inag + bing: return (sun): , “This function passes the value ofthe a and b structures, forms the sum of a and b, and then returns the complex summation (some compilers may not allow this method of pass- ing structures by value, thus requiring pointers to each ofthe structures). 
The ead func- tion may be used to set = equal tothe sum of x and y as follows: z= caddy) ‘The same complex sum can also be performed with a rather complicated single line macro defined as follows: Ade£ine CADD(a,b)\, (Ct realea.realtb.real,¢_t.imag=a.imageb-imag,_t) “This macro can be used to replace the eada function used above as follows: compu ces 2 = cADD(z.y) ‘This CADD macro works as desired because the macro expands to three operations separated ‘by commas, The one-ine macro inthis case is equivalent to the folowing three statements x.real + y.real seimag + y-reals ‘The first two operations in the macro are the two sums for the eal and imaginary pars, “The sums are followed by the variable C_t (which must be defined as COMPLEX before using the macro). The expression formed is evaluated from left to right and the whole ex pression in parentheses takes on the value of the last expression, the complex structure ‘©_t, which gets asigned to = asthe last statement above shows. The complex add macro CADD will execute faster than the ead function because the time required to pass the complex structures 2 and y tothe function, and then pass the sum back tothe calling program, is a significant part ofthe time required for the function call. Unfortunately, the complex add macro cannot be used in the same manner as the function, For example: COMPLEX a.b,c,ar 4 = cadd(a,cadatb,c) Sec.28 Common € Programming Pitfalls a will form the sum dmatb+ey as expected. However, the same format using the CAD macro would cause a compiler enor, because the macro expansion performed by the C preprocessor results in an ilegal expression. Thus, the CADD may only be used with sim- ple single-variable arguments. If speed is more important than ease of programming, then tte macro form should be used by breaking complicated expressions into simpler two- ‘operand expressions. Numerical C extensions to the C language support complex num ‘ers in an optimum way and are discussed in section 2.10.1 29 COMMON C PROGRAMMING PITFALLS ‘The following sections describe some of the more common errors made by programmers ‘when they first start coding in C and give a few suggestions how to avoid them, 2.9.1 Array indexing In C, all aray indices start wth zero rather than one. This makes the last index of aN Tong array V-1. This is very useful in digital signal processing, because many of the ex- pressions for filters, -transforms, and FFTs are easier to understand and use with the index starting at zr0 instead of one. For example, the FFT output for k=0 gives the zero frequency (DC) special component ofa discrete time signal. A typical indexing problem is illustrated in the folowing code segment, which is intended to determine the frst 10 powers of 2and store the results in an array called power’ int power2(20); int ips pel for power2(i) P= 2p; , ‘This code segment will compile well and may even run without any difficulty. The prob- lem is thatthe €ox loop index 4 stops on 4=10, and power? [101 is not a valid index to the power? array. Also, the fox loop stats withthe index 1 causing powex2 [0] {0 ‘not be initialized. This results in the first power of two (2°, which should be stored in ‘Power? (01) to be placed in power2 [1]. One way to correct this code is to change the for loop to read for(i = 0; i<10; i++), so that the index to power? starts at 0 and stops at 9. 2.9.2 Failure to Pass-by-Address ‘This problem is most often encountered when first using aean to read in a set of vati- ables. 
2.9.2 Failure to Pass-by-Address

This problem is most often encountered when first using scanf to read in a set of variables. If i is an integer (declared as int i;), then a statement like scanf("%d",i); is wrong because scanf expects the address of (or pointer to) the location to store the integer that is read by scanf. The correct statement to read in the integer i is scanf("%d",&i);, where the address-of operator (&) is used to point to the address of i and pass the address to scanf as required. On many compilers these types of errors can be detected and avoided by using function prototypes (see section 2.5.3) for all user-written functions and the appropriate include files for all C library functions. By using function prototypes, the compiler is informed what type of variable the function expects and will issue a warning if the specified type is not used in the calling program. On many UNIX systems, a C program checker called LINT can be used to perform parameter-type checking, as well as other syntax checking.
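As a minimal, self-contained illustration of the correct calling form (this fragment is not from the text and the variable names are arbitrary), note that including stdio.h supplies the scanf prototype, and many compilers will also check the format string against the argument types:

    #include <stdio.h>

    int main(void)
    {
        int i;
        scanf("%d",&i);        /* pass the address of i, as scanf requires */
        printf("i = %d\n",i);
        return 0;
    }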
2.9.3 Misusing Pointers

Because pointers are new to many programmers, the misuse of pointers in C can be particularly difficult to debug, because most C compilers will not indicate any pointer errors. Some compilers issue a warning for some pointer errors. Some pointer errors will result in the program not working correctly or, worse yet, the program may seem to work, but will not work with a certain type of data or when the program is in a certain mode of operation. On many small single-user systems (such as the IBM PC), misused pointers can easily result in writing to memory used by the operating system, often resulting in a system crash and requiring a subsequent reboot.

There are two types of pointer abuses: setting a pointer to the wrong value (or not initializing it at all) and confusing arrays with pointers. The following code segment shows both of these problems:

    char *string;
    char msg[10];
    int i;

    printf("\nEnter title");
    scanf("%s",string);
    i = 0;
    while(*string != ' ') {
        string++;
        i++;
    }
    msg = "Title = ";
    printf("%s %s %d before space",msg,string,i);

The first three statements declare that memory be allocated to a pointer variable called string, a 10-element char array called msg, and an integer called i. Next, the user is asked to enter a title into the variable called string. The while loop is intended to search for the first space in the string, and the last printf statement is intended to display the string and the number of characters before the first space.

There are three pointer problems in this program, although the program will compile with only one fatal error (and a possible warning). The fatal error message will reference the msg="Title ="; statement. This line tells the compiler to set the address of the msg array to the constant string "Title =". This is not allowed, so the error "lvalue required" (or something less useful) will be produced. The role of an array and a pointer have been confused; the msg variable should have been declared as a pointer and used to point to the constant string "Title =", which was already allocated storage by the compiler.

The next problem with the code segment is that scanf will read the string into the address specified by the argument string. Unfortunately, the value of string at execution time could be anything (some compilers will set it to zero), which will probably not point to a place where the title string could be stored. Both problems can be corrected by declaring msg as a pointer and by allocating memory for string as follows:

    char *string,*msg;
    int i;

    string = calloc(80,sizeof(char));
    printf("\nEnter title");
    scanf("%s",string);
    i = 0;
    while(*string != ' ') {
        string++;
        i++;
    }
    msg = "Title = ";
    printf("%s %s %d before space",msg,string,i);

The code will now compile and run but will not give the correct response when a title string is entered. In fact, the first characters of the title string before the first space will not be displayed, because the while loop increments string itself, so that string no longer points to the beginning of the title (or to the dynamically allocated section of memory). This pointer problem can be fixed by using another pointer (called cp) for the while loop as follows:

    char *string,*cp,*msg;
    int i;

    string = calloc(80,sizeof(char));
    printf("\nEnter title");
    scanf("%s",string);
    i = 0;
    cp = string;
    while(*cp != ' ') {
        cp++;
        i++;
    }
    msg = "Title = ";
    printf("%s %s %d before space",msg,string,i);

Another problem with this program segment is that if the string entered contains no spaces, then the while loop will continue to search through memory until it finds a space. On a PC, the program will almost always find a space (in the operating system, perhaps) and will set i to some large value. On larger multiuser systems, this may result in a fatal run-time error because the operating system must protect memory not allocated to the program. Although this programming problem is not unique to C, it does illustrate an important characteristic of pointers—pointers can and will point to any memory location without regard to what may be stored there.

2.10 NUMERICAL C EXTENSIONS

Some ANSI C compilers designed for DSP processors are now available with numerical extensions. These language extensions were developed by the ANSI NCEG (Numerical C Extensions Group), a working committee reporting to ANSI X3J11. This section gives an overview of the Numerical C language recommended by the ANSI standards committee. Numerical C has several features of interest to DSP programmers:

(1) Fewer lines of code are required to perform vector and matrix operations.

(2) Data types and operators for complex numbers (with real and imaginary components) are defined and can be optimized by the compiler for the target processor. This avoids the use of structures and macros as discussed in section 2.8.3.

(3) The compiler can perform better optimizations of programs containing iterators, which allows the target processor to complete DSP tasks in fewer instruction cycles.

2.10.1 Complex Data Types

Complex numbers are supported using the keywords complex, creal, cimag, and conj. These keywords are defined when the header file complex.h is included. There are six integer complex types and three floating-point complex types, defined as shown in the following example:

    short int complex i;
    int complex j;
    long int complex k;
    unsigned short int complex ui;
    unsigned int complex uj;
    unsigned long int complex uk;
    float complex x;
    double complex y;
    long double complex z;

The real and imaginary parts of the complex types each have the same representation as the type defined without the complex keyword. Complex constants are represented as the sum of a real constant and an imaginary constant, which is defined by using the suffix i after the imaginary part of the number. For example, initialization of complex numbers is performed as follows:

    short int complex i = 3 + 2i;
    float complex x[3] = {1.0+2.0i, 3.0i, 4.0};

The following operators are defined for complex types: & (address of), * (point to complex number), + (add), - (subtract), * (multiply), / (divide). Bitwise, relational, and logical operators are not defined. If any one of the operands is complex, the other operands will be converted to complex, and the result of the expression will be complex. The creal and cimag operators can be used in expressions to access the real or imaginary part of a complex variable or constant. The conj operator returns the complex conjugate of its complex argument. The following code segment illustrates these operators:

    float complex a,b,c;

    creal(a) = 1.0;
    cimag(a) = 2.0;
    creal(b) = 2.0*cimag(a);
    cimag(b) = 1.0;
    c = a*conj(b);
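For comparison with the structure-and-macro approach of section 2.8.3, the following short sketch (which assumes a compiler supporting these numerical C extensions; the function name is arbitrary) performs a complex sum and product directly with the built-in operators, so no cadd function or CADD macro is required:

    #include <complex.h>      /* defines complex, creal, cimag, conj */

    float complex csum_example(float complex x, float complex y)
    {
        float complex z;
        z = x + y;             /* complex sum, replacing z = cadd(x,y) */
        z = z*conj(y);         /* complex product using the conj operator */
        return z;
    }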
2.10.2 Iteration Operators

Numerical C offers iterators to make writing mathematical expressions that are computed iteratively more simple and more efficient. The two new keywords are iter and sum. Iterators are variables that expand the execution of an expression so that the iteration variable is automatically incremented from 0 to the value of the iterator. This effectively places the expression inside an efficient for loop. For example, the following three lines can be used to set the 10 elements of array ix to the integers 0 to 9:

    int ix[10];
    iter I = 10;
    ix[I] = I;

The sum operator can be used to represent the sum of values computed from values of an iterator. The argument to sum must be an expression that has a value for each of the iterated variables, and the order of the iteration cannot change the result. The following code segment illustrates the sum operator:

    float a[10],b[10],c[10],d[10][10],e[10][10],f[10][10];
    float s;
    iter I=10, J=10, K=10;

    s = sum(a[I]);                   /* computes the sum of a into s */
    b[I] = sum(a[I]);                /* sum of a calculated 10 times and
                                        stored in the elements of b */
    c[J] = sum(d[I][J]);             /* computes the sum of the column
                                        elements of d, the statement is
                                        iterated over J */
    s = sum(d[I][J]);                /* sums all the elements in d */
    e[I][J] = sum(d[I][K]*f[K][J]);  /* matrix multiply */
    c[I] = sum(d[I][K]*b[K]);        /* matrix * vector */
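For reference, the matrix multiply statement above is roughly equivalent to the following standard C function (a sketch assuming the 10 by 10 arrays declared above; the compiler is free to generate something more efficient for the target processor):

    void matmul10(float d[10][10],float f[10][10],float e[10][10])
    {
        int i,j,k;
        float acc;
        for(i = 0 ; i < 10 ; i++) {
            for(j = 0 ; j < 10 ; j++) {
                acc = 0.0;
                for(k = 0 ; k < 10 ; k++)
                    acc += d[i][k]*f[k][j];   /* e[I][J] = sum(d[I][K]*f[K][J]) */
                e[i][j] = acc;
            }
        }
    }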
2.11 COMMENTS ON PROGRAMMING STYLE

The four common measures of good DSP software are reliability, maintainability, extensibility, and efficiency.

A reliable program is one that seldom (if ever) fails. This is especially important in DSP because tremendous amounts of data are often processed using the same program. If the program fails due to one sequence of data passing through the program, it may be difficult, or impossible, to ever determine what caused the problem.

Since most programs of any size will occasionally fail, a maintainable program is one that is easy to fix. A truly maintainable program is one that can be fixed by someone other than the original programmer. It is also sometimes important to be able to maintain a program on more than one type of processor, which means that in order for a program to be truly maintainable, it must be portable.

An extensible program is one that can be easily modified when the requirements change, new functions need to be added, or new hardware features need to be exploited.

An efficient program is often the key to a successful DSP implementation of a desired function. An efficient DSP program will use the processing capabilities of the target processor to minimize implementation cost and development cost, such that inexpensive speech recognition and generation processors are now available for use by the general public.

Unfortunately, DSP programs often forsake maintainability and extensibility for efficiency. Such is the case for most currently available programmable signal processing integrated circuits. These devices are usually programmed in assembly language in such a way that it is often impossible for changes to be made by anyone but the original programmer, and after a few months even the original programmer may have to rewrite the program to add additional functions. Often a compiler is not available for the processor, or the processor's architecture is not well suited to efficient generation of code from a compiled language. The current trend in programmable signal processors appears to be toward high-level languages. In fact, many of the DSP chip manufacturers are supplying C compilers for their more advanced products.

2.11.1 Software Quality

The four measures of software quality (reliability, maintainability, extensibility, and efficiency) are rather difficult to quantify. One almost has to try to modify a program to find out if it is maintainable or extensible. A program is usually tested in a finite number of ways much smaller than the millions of possible input data conditions. This means that a program can be considered reliable only after years of bug-free use in many different environments.

Programs do not acquire these qualities by accident. It is unlikely that good programs will be intuitively created just because the programmer is clever, experienced, or uses lots of comments. Even the use of structured programming techniques (described briefly in the next section) will not assure that a program is easier to maintain or extend. It is the author's experience that the following five coding situations will often lessen the software quality of DSP programs:

(1) Functions that are too big or have several purposes
(2) A main program that does not use functions
(3) Functions that are tightly bound to the main program
(4) Programming "tricks" that are always poorly documented
(5) Lack of meaningful variable names and comments

An oversized function (item 1) might be defined as one that exceeds two pages of source listing. A function with more than one purpose lacks strength. A function with one clearly defined purpose can be used by other programs and other programmers. Functions with many purposes will find limited utility and limited acceptance by others. All of the functions described in this book and contained on the included disk were designed with this important consideration in mind. Functions that have only one purpose should rarely exceed one page. This is not to say that all functions will be smaller than this. In time-critical DSP applications, the use of in-line code can easily make a function quite long but can sometimes save precious execution time. It is generally true, however, that big programs are more difficult to understand and maintain than small ones.

A main program that does not use functions (item 2) will often result in an extremely long and hard-to-understand program. Also, because complicated operations often can be independently tested when placed in short functions, the program may be easier to debug. However, taking this rule to the extreme can result in functions that are tightly bound to the main program, violating item 3. A function that is tightly bound to the rest of the program (by using too many global variables, for example) weakens the entire program. If there are lots of tightly coupled functions in a program, maintenance becomes impossible. A change in one function can cause an undesired, unexpected change in the rest of the functions.
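As a small sketch of the difference (not an example from the text; the names are illustrative), compare a routine that reaches into global variables owned by the main program with one that receives everything it needs through its argument list. The second version can be tested and reused on its own:

    float gain;                  /* globals owned by the main program */
    float signal[1024];

    void scale_signal(void)      /* tightly bound: depends on the globals */
    {
        int i;
        for(i = 0 ; i < 1024 ; i++)
            signal[i] = gain*signal[i];
    }

    void scale_buffer(float *buf,int n,float g)   /* loosely bound */
    {
        int i;
        for(i = 0 ; i < n ; i++)
            buf[i] = g*buf[i];
    }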
Clever programming tricks (item 4) should be avoided at all costs, as they will often not be reliable and will almost always be difficult for someone else to understand (even with lots of comments). Usually, if the program timing is so close that a trick must be used, then the wrong processor was chosen for the application. Even if the programming trick solves a particular timing problem, as soon as the system requirements change (as they almost always do), a new timing problem without a solution may soon develop.

A program that does not use meaningful variables and comments (item 5) is guaranteed to be very difficult to maintain. Consider the following valid C program:

    main(){int _o_,_oo_;for(_o_=2;;_o_++)
    {for(_oo_=2;_o_%_oo_!=0;_oo_++);
    if(_oo_==_o_)printf("\n%d",_o_);}}

Even the most experienced C programmer would have difficulty determining what this three-line program does. Even after running such a poorly documented program, it may be hard to determine how the results were obtained. The following program does exactly the same operation as the above three lines but is easy to follow and modify:

    main()
    {
        int prime_test,divisor;

    /* The outer for loop trys all numbers >1 and the inner
       for loop checks the number tested for any divisors
       less than itself. */

        for(prime_test = 2 ; ; prime_test++) {
            for(divisor = 2 ; prime_test % divisor != 0 ; divisor++);
            if(divisor == prime_test) printf("\n%d",prime_test);
        }
    }

This program is easy to read, and it is clear that it prints a list of prime numbers, because the following three documentation rules were followed:

(1) Variable names that have meaning in the context of the program were used, and variables were initialized in a very obvious way, such as initializing an entire array to a constant.

(2) Comments preceded each major section of the program (the above program only has one section). Although the meaning of this short program is fairly clear without the comments, it rarely hurts to have too many comments. Adding a blank line between different parts of a program also sometimes improves the readability of a program, because the different sections of code appear separated from each other.

(3) Statements at different levels of nesting were indented to show which control structure controls the execution of the statements at a particular level. The author prefers to place the opening brace ({) with the control structure (for, while, if, etc.) and place the closing brace (}) on a separate line starting in the same column as the beginning of the corresponding control structure. The exception to this practice is in function declarations, where the opening brace is placed on a separate line after the argument declarations.

2.11.2 Structured Programming

Structured programming has developed from the notion that any algorithm, no matter how complex, can be expressed by using the programming-control structures if-else, while, and sequence. All programming languages must contain some representation of these three fundamental control structures. The development of structured programming revealed that if a program uses these three control structures, then the logic of the program can be read and understood by beginning at the first statement and continuing downward to the last. Also, all programs could be written without goto statements. Generally, structured-programming practices lead to code that is easier to read, easier to maintain, and even easier to write.

The C language has the three basic control structures as well as three additional structured-programming constructs called do-while, for, and case. The additional three control structures have been added to C and most other modern languages because they are convenient, they retain the original goals of structured programming, and their use often makes a program easier to comprehend.

The sequence control structure is used for operations that will be executed once in a function or program in a fixed sequence. This structure is often used where speed is most important and is often referred to as in-line code when the sequence of operations is identical and could be coded using one of the other structures. Extensive use of in-line code can obscure the purpose of the code segment.

The if-else control structure in C is the most common way of providing conditional execution of a sequence of operations based on the result of a logical operation. Indenting of different levels of if and else statements (as shown in the example in section 2.4.1) is not required; it is an expression of C programming style that helps the readability of the if-else control structure. Nested while and for loops should also be indented for improved readability (as illustrated in section 2.7.3).
The case control structure is a convenient way to execute one of a series of operations based upon the value of an expression (see the example in section 2.4.2). It is often used instead of a series of if-else structures when a large number of conditions are tested based upon a common expression. In C, the switch statement gives the expression to test and a series of case statements give the conditions to match the expression. A default statement can be optionally added to execute a sequence of operations if none of the listed conditions are met.

The last three control structures (while, do-while, and for) all provide for repeating a sequence of operations a fixed or variable number of times. These loop statements can make a program easy to read and maintain. The while loop provides for the iterative execution of a series of statements as long as a tested condition is true; when the condition is false, execution continues to the next statement in the program. The do-while control structure is similar to the while loop, except that the sequence of statements is executed at least once. The for control structure provides for the iteration of statements with automatic modification of a variable and a condition that terminates the iteration. For loops are more powerful in C than in most languages. C allows for any initializing statement, any iterating statement, and any terminating statement. The three statements do not need to be related, and any of them can be a null statement or multiple statements. The following three examples of while, do-while, and for loops all calculate the power of two of an integer i (assumed to be greater than 0) and set the result to k. The while loop is as follows:

    k = 2;                /* while loop */
    j = 1;                /* k = 2**i   */
    while(j < i) {
        k = 2*k;
        j++;
    }
The do-while loop is as follows:

    k = 1;
    j = 0;
    do {
        k = 2*k;
        j++;
    } while(j < i);

The for loop is as follows:

    for(k = 2, j = 1 ; j < i ; j++)
        k = 2*k;          /* for loop k = 2**i */

Which form of loop to use is a personal matter. Of the three equivalent code segments shown above, the for loop and the while loop both seem easy to understand and would probably be preferred over the do-while construction.

The C language also offers several extensions to the six structured programming control structures. Among these are break, continue, and goto (see section 2.4.5). Break and continue statements allow the orderly interruption of events that are executing inside of loops. They can often make a complicated loop very difficult to follow, because more than one condition may cause the iterations to stop. The infamous goto statement is also included in C, although it is rarely required.

The program examples in the following chapters and the programs contained on the enclosed disk were developed by using structured-programming practices. The code can be read from top to bottom, there are no goto statements, and the six accepted control structures are used. One requirement of structured programming that was not followed throughout the software in this book is that each program and function have only one entry and exit point. Although every function and program does have only one entry point (as is required in C), many of the programs and functions have multiple exit points. Typically, this is done in order to improve the readability of the program. For example, error conditions in a main program often require terminating the program prematurely after displaying an error message. Such an error-related exit is performed by calling the C library function exit(n) with a suitable error code, if desired. Similarly, many of the functions have more than one return statement, as this can make the logic in a function much easier to program and in some cases more efficient.

2.12 REFERENCES

FEUER, A.R. (1982). The C Puzzle Book. Englewood Cliffs, NJ: Prentice Hall.
KERNIGHAN, B. and PLAUGER, P. (1978). The Elements of Programming Style. New York: McGraw-Hill.
KERNIGHAN, B.W. and RITCHIE, D.M. (1988). The C Programming Language (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.
PRATA, S. (1986). Advanced C Primer++. Indianapolis, IN: Howard W. Sams and Co.
PURDUM, J. and LESLIE, T.C. (1987). C Standard Library. Indianapolis, IN: Que Co.
ROCHKIND, M.J. (1988). Advanced C Programming for Displays. Englewood Cliffs, NJ: Prentice Hall.
STEVENS, A. (1986). C Development Tools for the IBM PC. Englewood Cliffs, NJ: Prentice Hall.
STROUSTRUP, B. (1986). The C++ Programming Language. Reading, MA: Addison-Wesley.
WAITE, M., PRATA, S. and MARTIN, D. (1987). C Primer Plus. Indianapolis, IN: Howard W. Sams and Co.

CHAPTER 3

DSP MICROPROCESSORS IN EMBEDDED SYSTEMS

The term embedded system is often used to refer to a processor and associated circuits required to perform a particular function that is not the sole purpose of the overall system. For example, a keyboard controller on a computer system may be an embedded system if it has a processor that handles the keyboard activity for the computer system. In a similar fashion, digital signal processors are often embedded in larger systems to perform specialized DSP operations to allow the overall system to handle general purpose tasks. A special purpose processor used for voice processing, including analog-to-digital (A/D) and digital-to-analog (D/A) converters, is an embedded DSP system when it is part of a personal computer system. Often this type of DSP runs only one application (perhaps speech synthesis or recognition) and is not programmed by the end user. The fact that the processor is embedded in the computer system may be unknown to the end user.

A DSP's data format, either fixed-point or floating-point, determines its ability to handle signals of differing precision, dynamic range, and signal-to-noise ratios. Also, ease-of-use and software development time are often equally important when deciding between fixed-point and floating-point processors. Floating-point processors are often more expensive than similar fixed-point processors but can execute more instructions per second. Each instruction in a floating-point processor may also be more complicated, leading to fewer cycles per DSP function.
DSP microprocessors can be classified as fixed-point processors if they can only perform fixed-point multiplies and adds, or as floating-point processors if they can perform floating-point operations.

The precision of a particular class of A/D and D/A converters (classified in terms of cost or maximum sampling rate) has been slowly increasing at a rate of about one bit every two years. At the same time the speed (or maximum sampling rate) has also been increasing. The dynamic range of many algorithms is higher at the output than at the input, and intermediate results are often not constrained to any particular dynamic range. This requires that intermediate results be scaled using a shift operator when a fixed-point DSP is used, which will require more cycles for a particular algorithm in fixed-point than on an equivalent floating-point processor. Thus, as the A/D and D/A requirements for a particular application call for higher speeds and more bits, a fixed-point DSP may need to be replaced with a faster processor with more bits. Also, the fixed-point program may require extensive modification to accommodate the greater precision.

In general, floating-point DSPs are easier to use and allow a quicker time-to-market than processors that do not support floating-point formats. The extent to which this is true depends on the architecture of the floating-point processor. High-level language programmability, large address spaces, and the wide dynamic range associated with floating-point processors allow system development time to be spent on algorithms and signal processing problems rather than assembly coding, code partitioning, quantization error, and scaling. In the remainder of this chapter, floating-point digital signal processors and the software required to develop DSP algorithms are considered in more detail.

3.1 TYPICAL FLOATING-POINT DIGITAL SIGNAL PROCESSORS

This section describes the general properties of the following three floating-point DSP processor families: AT&T DSP32C and DSP3210, Analog Devices ADSP-21020 and ADSP-21060, and Texas Instruments TMS320C30 and TMS320C40. The information was obtained from the manufacturers' data books and manuals and is believed to be an accurate summary of the features of each processor and the development tools available. Detailed information should be obtained directly from manufacturers, as new features are constantly being added to their DSP products. The features of the three processors are summarized in sections 3.1.1, 3.1.2, and 3.1.3.

The execution speed of a DSP algorithm is also important when selecting a processor. Various basic building block DSP algorithms are carefully optimized in assembly language by the processor's manufacturer. The time to complete a particular algorithm is often called a benchmark. Benchmark code is always in assembly language (sometimes without the ability to be called by a C function) and can be used to give a general measure of the maximum signal processing performance that can be obtained for a particular processor. Typical benchmarks for the three floating-point processor families are shown in the following table. Times are in microseconds based on the highest speed processor available at publication time.
[Table: benchmark comparison of the DSP32C/DSP3210, ADSP-21020/ADSP-21060, and TMS320C40 — maximum instruction cycle speed (MIPS) and cycle counts with execution times (in microseconds) for a 1024-point complex FFT with bit reverse, a 35-tap FIR filter, a 2-biquad IIR filter, and a 4x4 by 4x1 matrix multiply. Cycle counts for the DSP32C and DSP3210 are clock cycles.]

3.1.1 AT&T DSP32C and DSP3210

Figure 3.1 shows a block diagram of the DSP32C microprocessor manufactured by AT&T (Allentown, PA). The following is a brief description of this processor provided by AT&T.

The DSP32C's two execution units, the control arithmetic unit (CAU) and the data arithmetic unit (DAU), are used to achieve the high throughput of 20 million instructions per second (at the maximum clock speed of 80 MHz). The CAU performs 16- or 24-bit integer arithmetic and logical operations, and provides data move and control capabilities. This unit includes 22 general purpose registers. The DAU performs 32-bit floating-point arithmetic for signal processing functions. It includes a 32-bit floating-point multiplier, a 40-bit floating-point adder, and four 40-bit accumulators. The multiplier and the adder work in parallel to perform 25 million floating-point computations per second. The DAU also incorporates special-purpose hardware for data-type conversion.

On-chip memory includes 1536 words of RAM. Up to 16 Mbytes of external memory can be directly addressed by the external memory interface, which supports wait states and bus arbitration. All memory can be addressed as 8-, 16-, or 32-bit data, with 32-bit data being accessed at the same speed as 8- or 16-bit data.

The DSP32C has three I/O ports: an external memory port, a serial port, and a 16-bit parallel port. In addition to providing access to commercially available memory, the external memory interface can be used for memory-mapped I/O. The serial port can interface to a time division multiplexed (TDM) stream, a codec, or another DSP32C. The parallel port provides an interface to an external microprocessor. In summary, some of the key features of the DSP32C follow.

FIGURE 3.1 Block diagram of DSP32C processor. (Courtesy AT&T.)
3 KEV FEATURES + 63 instrctions, 3-stage pipeline + Full 32-bit floating-point architecture for increased precision and dynamic range + Faster application evaluation and development yielding increased market leverage + Single cycle instructions, eliminating multicycle branches, + Four memory accessesinstruction for exceptional memory bandwidth + Easy-to-learn and read C-like assembly language + Serial and parallel ports with DMA for clean external interfacing + Single-instraction data conversion for IEEE P754 floating-point, 8-, 16-, and 24-it integers, paw, A-Law and linear numbers + Bully vectored interrupts with hardware context saving up to 2 million interrupts per second + Byte-addressable memory efficiently storing 8- and 16-bit data ‘+ Wait-states of 1/4 instruction increments and two independent external memory speed partitions for flexible, cost-efficient memory configurations DESCRIPTION OF ATAT HAROWARE DEVELOPMENT HARDWARE ‘+ Full speed operation of DSP32C (80 MHz) Serial YO through an AT&T T7520 High Precision Codec with mini-BNC connec- tors for analog input and output Provision for external serial VO through a 34-pi Upper 8-bits of DSP32C parallel UO port access via connector [DSP32C Simulator providing an interactive interface tothe development system. ‘Simulator-controlled software breakpoints; examining and changing memory con- tents or device registers Figure 3.2 shows a block diagram of the DSP3210 microprocessor manufactured by AT&T. The following isa brief description of this processor provided by AT&T. [AT&xT DSP2210 DIGITAL SIGNAL PROCESSOR FEATURES. + 16.6 million instructions per second (66 MHz clock) + 63 instructions, 3-stage pipeline + Two 4-kbyte RAM blocks, 1-Kbyte boot ROM + 32-bit barrel shifter and 32-bit timer + Serial VO port, DMA channels, 2 external interrupts + Pull 32-bit floating-point arithmetic for fast, efficient software development + C-like assembly language for ease of programming. + All single-cycle instructions; four memory accesses/nstruction cycle ‘+ Hardware context save and single-cycle PC relative addressing ow oof (=ae]] [exw oy Lae nr oars ferme ser renee | waiter eS poh 82) yt on404 Taroaare) | | [_ [assconer sesmem _, "eonrmar et ro “porn FLOATS Sec vero —exs oar] io ol pet see ve0 ot [xm Tear} ‘= = sua CAD Comlainmatcunt — SI Sait epwOumet $8o Slaiemncoar 5c Tmerstnwtnl at Random sss anoy GAC Eu Conta Fou Roni Guyionory™” tas Manor epee 1 FIGURE 22 Block diagram of DSPS210 processor (Courtesy ATE) 106 SP Microprocessors in Embedded Systems Chap, 3 Microprocessor bus compatibility (The DSP3210 is designed fr efficient bus mas designs. This allows the DSP3210 tobe easily incorporated into microprocesse, based designs 32-bit, byte-addressable adress space allowing the DSP3210 and a microprocesoy to share the same address space and to share pointer values. *+ Retry, elinguish/retry, and bus error support + Page mode DRAM support Direct suppor for both Motorola and Intel signaling ‘AT&T DSP3210 FAMILY HARDWARE DEVELOPMENT SYSTEM DESCRIPTION ‘The MP3210 implements one or two AT&T DSP3210 32-bit floating-point DSPs with « comprehensive mix of memory, digital VO and professional audio signal VO. Tre >MP3210 holds up to 2 Mbytes of dynamic RAM (DRAM). The DT-Connect interface en ables real-time video VO, MP3210 systems include: the processor card; C Host driveg With source code; demos, examples and utilities, with source code; User's Manual, ant (he AT&T DSP3210 Information Manual. DSP3210 is the low-cost Multimedis Processor of Choice. 
New features added tothe DSP3210 are briefly outlined below. D5P3210 FEATURES. USER BENEFITS *+ Speeds up to 33 MFLOPS. ‘The high performance and large on-chip ‘memory space enable + 2k x32 on-chip RAM fast, efficient processing of complex algo- rithms, + Pull, 32-bit, floating-point Ease of programming/higher performance. + All instructions are single eycle + Four memory accesses per Higher performance. instruction cycle * Microprocessor bus compatibility Designed for efficient bus maser. + 32-bit byte-addressable designs. This allows the DSP3210 address space to + Retry, relinguishietry easily be incorporated into uP * error Support bbus-based designs. The 32-bit, byte- Boot ROM addressable space allows the Page mode DRAM support DSP3210 and a uP to share the same Directly supports 680X0 ‘address space and to share pointer and 80X86 signaling valves as wel, 3.1.2 Analog Devices ADSP-210Xx Figure 3.3 shows a block diagram of the ADSP-21020 microprocessor manufactured by Analog Devices (Norwood, MA). The ADSP-21060 core processor is similar to the ADSP-21020. The ADSP-21060 (Figure 3.4) also includes a large omehip memory, DMA controler, serial ports, link pons, a host interface and multiprocessing features D> ms osmmasoner ut a : 3 : i i g 105 =) ‘CONTROLLER EE SS|ORE 0 Processor — a rraen | RSTRUCTION) See. 3.1 Typical Floating-Point Digital Signal Processors, 107 withthe core processor. The following isa brief description of these processors provided by Analog Devices. ‘The ADSP-210XX processors provide fast, flexible arithmetic computation units, ‘unconstrained data flow to and from the computation units, extended precision and dy- namic range in the computation units, dual address generators, and efficient program se- ‘quencing. All instructions execute in a single cycle. It provides one of the fastest cycle times available and the most complete set of arithmetic operations, including seed 1/x, rin, max, clip shift and rotate, in addition tothe traditional multiplication, addition, sub- traction, and combined additionsubtraction. It is IEEE floating-point compatible and al- ows interupts to be generated by arithmetic exceptions or latched status exception han- ing. ‘The ADSP-210XX has a modified Harvard architecture combined with a 10-port data register file. In every cycle two operands can be read or written to or from the regis- ter file, two operands can be supplied to the ALU, two operands can be supplied to the ‘multiplier, and two results can be received from the ALU and multiplier. The processor's 48-bit orthogonal instruction word supports fully parallel data transfer and arithmetic op- erations in the same instruct ‘The processor handles 32-bit IEEE floating-point format as well as 32-bit integer and fractional format. It also handles extended precision 40-bit IEEE floating-point for- ‘mats and carries extended precision throughout its computation units, limiting data trun- cation errors. "The processor has two data address generators (DAGs) that provide immediate or indirect (pre-and post-modify) addzessing. Modulus and bit-zeverse addressing opera- tions are supported with no constraints on circular data buffer placement. In addition t© zero-ovethead loops, the ADSP-210XX supports single-cyele setup and exit for loops. Loops are both nestable (six levels in hardware) and interruptable, ‘The processor sup- ports both delayed and nondelayed branches. 
In summary, some of the key features of the ADSP-210XX core processor follow: + 48-bitinsrueton, 3/40-bt data words + 80-bit MAC accumulator + 3-stage pipeline 63 instraction types + 32x 48-bit instruction cache + 10-por, 32 40-bit register file (16 registers per set, 2 sets) + level loop stack + 24-bit program, 32-bit data address spaces, memory buses + Linstraction/cycle (pipelined) + Leyele multiply (32-bit or 40-bit oatng-poit or 32-bit fixed-point) + Geycle divide (32-bit or 40-bit loatng- point) + 2eycle branch delay + Zero overhead loops + Bare shifter 108 DSP Microprocessors in Embedded Sysome Chap + Algebraic syntax assembly language *+ Multifunction instructions with 4 operations per eyele + Dual address generators + 4-eycle maximum interrupt latency 3.1.3 Texas Instruments TMS320C3X and TMS320C40 Figure 25 shows a block diagram of the TMS320C30 microprocessor and Figure 36 shows the TMS320C40, both manufactured by Texas Instruments (Houston, Ty fe TMS320C30 and TMS320C4, processors ae similar in architecture except th te TMS320C:0 provides hardwate support fr multiprocessor conigntons, The allowing is a brief description of the TMS320C30 processor as provided by Texas Instrumeng “The TMS320C30 can perform parallel multiply and ALU operations on neg og Moating-point data in a single cycle. The processor also possesses 4 general peng register file, propram cache, dedicated auxiliary register arithmetic units (ARAU), Ieee ‘al dual-access memories, one DMA. channel supporting concurrent VO, and a shor ‘machine-cyele time, High performance and ease of use are products ofthese features General-purpose applications are greatly enhanced by the large address spac, ml ‘processor interface, internally and extemally generated wait states, two external inter face ports, two timers, two serial ports, and multiple interupt structure. High-level ln, suage is more easily implemented through a regiser-based architecture, lange eddy Space, powerful addressing modes, flexible instruction set, and well-supported eating pint arithmetic. Some key features ofthe TMS320C30 are listed below. *+ 4 stage pipetine, 113 instructions + One 4K x 32-bit single-cycle dual access on-chip ROM block + Two IK > 32-bit single-cyele dual access on-chip RAM blocks + 64 x 32.bit instruction cache + 32-bit instruction and data words, 24-bit addresses + 40/32:bitfloating-poinvinteger multiplier and ALU + 32-bit barrel shifter + Multiport register file: Bight 40-bit extended precision registers (accumulators) + Two address generators with eight auxiliary registers and two arithmetic units * On-chip direct memory aceess (DMA) controller for concurrent /O + Integer, floating-point, and logical operations + Two- and three-operand instructions PParallel ALU and multiplier instructions ina single eyele Block repeat capability + Zero-overhead loops with single-cycle branches Conditional calls and returns eB g HEE pas EB trxteie on 27 an FIGURE 35 Block lagram of TMS320C30 and TMS82C31 procesor(Courtsy Tora Instrument) 109 10 SP Microprocessors in Embedded Systems Chap. 3 me th = 2. = 2 = tet Tar ‘Seaecae ee Sak FIGURE8 Block diggram of TMS320C40 processor (Courtesy Texas Instruments. 
‘Sec.3.2 Typical Programming Tools for DSP m + Interlocked instructions for multiprocessing support + Two 32-bit data buses (24- and 13-bit address) + Two serial ports + DMA controller + Two 32-bit timers 32 TYPICAL PROGRAMMING TOOLS FOR DSP ‘The manufacturers of DSP microprocessors typically provide a set of software tools de- signed to enable the user 1o develop efficient DSP algorithms for their particular proces- sors. The basic software tools provided include an assembler, linker, C compiler, and simulator. The simulator can be used to determine the detailed timing of an algorithm and then optimize the memory and register accesses. The C compilers for DSP processors will usually generate assembly source code so thatthe user can see what instructions are generated by the compiler for each line of C source code. The assembly code can then be ‘optimized by the user and then fed into the assembler and linker. Most DSP C compilers provide a method to add in-line assembly language routines to C programs (see section 3.3.2). Ths allows the programmer to write highly efficient assembly code for time-critical sections of a program. For example, the autocorrelation function of a sequence may be calculated using a fonction similar to a FIR filter where the coefficients and the data are the input Sequence. Each multiply-accumulate in this a gorithm can often be calculated in one cycle on a DSP microprocessor. The same C algo rithm may take 4 oF more eycles per multiple-accumulate. Ifthe autocorrelation calcula tion requires 90 percent of the time in a C program, then the speed ofthe program can be improved by a factor of about 3 ifthe autocorrelation portion is coded in assembly lan ‘guage and interfaced to the C program (this assumes that the assembly code is 4 times faster than the C source code). The amount of effort required by the programmer to create efficient assembly code for just the autocorrelation function is much less than the effort requited to write the entire program in assembly language, Many DSP software tools come with a library of DSP functions that provide highly ‘optimized assembly code for typical DSP functions such as FFTs and DFTs, FIR and T1R. filters, matrix operations, coreations, and adaptive filters. In addition, third partes may provide additional functions not provided by the manufacturer. Much of the DSP library ode can be used directly or with small modifications in C programs. 3.2.1 Basic C Compiler Tools AT&T DSP32C software development tools. ‘The DSP32C’s C Compiler provides a programmer with a readable, quick, and portable code generation tool com- bined with the efficiency provided by the assembly interface and extensive set of library routines. The package provides for compilation of C source code into DSP32 and 1 DSP Microprocessors in Embedded Systems Chap. 3 DSPS2C assembly code, an assembler, simulator anda numberof other useful aig, for source and object code management. The three forms of provided libraries are: + libe A subset of the Standard C Library ‘libs Math Library + libap Application Software Library, complete C-callable se of DSP routines, DSP32C support software library. This package provides assembly-eg, programming. Primary tools ae the assembler, linkerloader, a make utility that provid beter control over the assembly and link/lod tak, and a simulator for program debug, sing. 
Other uilities are: ibrary archiver, mask ROM formatter object file dumper, sy bol table lster, object file code size reporter, and EPROM programmer formatter The st, package is necessary for interface control of AT&T DSP32C Development Systems, ‘The Application Library has over seven dozen subroutines for arithmetic, mari, fier, adaptive fier, FFT, and graphicsfimaging applications. A files are asiemby source and each subroutine has an example test program. Version 2.2.1 adds four routines for sample rate conversion, AT&T DSP3210 software development tools. This package includes a ¢ language compiler, libraries of standard C functions, math functions, and digital signal processing application functions. A C code usage example is provided for each ofthe ‘math and application library routines. The C Compiler also includes all ofthe assem ber, simulator, and utility programs found in the DSP3210 ST package. Since the C libraries are only distributed as assembled and then archived "a files, customer may also find the DSP3210-AL package useful as a collection of commented assembly code examples. DSP3210 support software library. The ST package provides assembly level programming. The primary tools of the package are the assembler, linket/loadet, and a simulator for program development, testing, and debugging. A 32C to 3210 assem bly code translator assists developers. who are migrating from the DSP32C device. Additional utilities are library archiver, mask ROM formatter, object code disassemble, object file dumper, symbol table lister, and object code size reporter. The AT&T Application Software Library includes over ninety subroutines of typical operations for arithmetic, matrix, iter, adaptive filter, FFT, and graphics/imaging applications. All files are assembly source and each subroutine has an example test programm. Analog devices ADSP-210XX C tools. The C tools for the ADSP-21000 family let system developers program ADSP-210XX digital signal processors in ANSIC. Included are the following tools: the G21K C compiler, a runtime library of C functions, ‘and the CBUG C Source-Level Debugger. G2IK is Analog Devices’ port of GCC, the GNU C compiler from the Free Software Foundation, for the ADSP-21000 family of ¢ig- ital signal processors. G21K includes Numerical C, Analog Devices" numetical pres $00.32 Typical Programming Tools for DSP 3 ing extensions to the C language based on the work of the ANSI Numerical C Extensions Group (NCEG) subcommitte, ‘The C runtime library functions perform floating-point mathematics, digital signal ‘processing, and standard C operations. The functions are hand-coded in assembly lan- ‘guage for optimum runtime efficiency. The C tools eugment the ADSP-21000 family assembler tools, which include the assembler, linker, librarian, simulator, and PROM. spliter. Texas Instruments TMS320C30 C tools. ‘The TMS320 floating-point C ‘compiler isa full-featured optimizing compiler that translates standard ANSI C programs into TMS320C3x/C4x assembly language source, The compiler uses a sophisticated opt- ‘mization pass that employs several advanced techniques for generating efficient, compact. code from C source. General optimizations can be applied to any C code, and target- specific optimizations take advantage of the particular features of the TMS320C3UC4x architecture. The compiler package comes with two complete runtime libraries plus the source library. The compiler supports two memory models. The small memory model en. 
ables the compiler to efficiently access memory by restricting the global data space to a single 64K-word data page. The big memory model allows unlimited space. ‘The compiler has straightforward calling conventions, allowing the programmer to easily write assembly and C functions that call each other. The C: preprocessor is inte. rated with the parser, allowing for faster compilation. The Common Object File Format (COFF) allows the programmer to define the system's memory map at link time. This ‘maximizes performance by enabling the programmer to link C code and data objects into specific memory areas. COFF also provides rich support for source-level debugging, The compiler package includes a utility that interlists original C source statements into the as- sembly language output of the compiler. This utility provides an easy method for inspect. ing the assembly code generated for each C statement. All data sizes (char, short, int, long, float, and double) are 32 bits. This allows all types of data to take full advantage of the TMS320Cw/C4x's 32-bit integer and. floating-point capabilities. For stand-alone embedded applications, the compiler enables Tinking all code and initialization data into ROM, allowing C code to run from reset. 3.2.2 Memory Map and Memory Bandwidth Considerations. Most DSPs use a Harvard architecture with three memory buses (program and two data ‘memory paths) or a modified Harvard architecture with two memory buses (one bus is shared between program and data) in order to make filter and FFT algorithms execute ‘puch faster than standard von Neumann microprocessors. Two separate data and address busses allow access to filter coefficients and input data in the same cycle. In addition, ‘most DSPs perform multiply and addition operations in the same cycle. Thus, DSPs exe. cute FIR filler algorithms at least four times faster than a typical microprocessor with the same MIPS rating, ‘The use of a Harvard architecture in DSPs causes some difficulties in writing C Programs that utilize the full potential of the muliply-aecumulate structure and the multi 16 DSP Microprocessors in Embedded Systems Chap. ple memory busses, All three manufacturers of DSPs described here provide @ method to ‘assign separate physical memory blocks to different C variable types. For example, ants variables that are stored on the heap can be moved from internal memory to extems) memory by assigning a different address range to the heap memory segment. Inthe as, sembly language generated by the compiler the segment name fora particular C variable ‘of array can be changed to locate it in intemal merpory for faster access of to allow ity be accessed at the same time as the other operands for the multiply or accumulate opera tion. Memory maps and segment names are used by the C compilers to separate different types of data and improve the memory bus utilization. Internal memory is often used for ‘coefficients (because there are usually fewer coefficients) and external memory is used for large data arays, ‘The ADSP-210XX C compiler also supports special keywords so that any C var able or array can be placed in program memory or data memory. 
The program memory is used to store the program instructions and can also store floating-point or integer data ‘When the processor executes instructions ina loop, an instruction cache is used to allow the data in program memory (PM) and data in the data memory (DM) to flow into the ALU at full speed, The pm keyword places the variable or aray in program memory, and the dim keyword places the variable or aray in data memory. The default for static or _lobal variables is to place them in data memory. 3.2.3 Assembly Language Simulators and Emulators ‘Simulators fora particular DSP allow the user to determine the performance of a DSP al- {gorithm on 2 specific target processor before purchasing any hardware or making a majot investment in software for a particular system design. Most DSP simulator software is available for the IBM-PC, making it easy and inexpensive to evaluate and compare the performance of several different processor. In fact, itis possible to write all the DSP ap: plication software for @ particular processor before designing or purchasing any hard- ware, Simulators often provide profiling capabilities that allow the user to determine the ‘amount of time spent in one portion of a program relative to another. One way of doing this is forthe simulator to keep a count of how many times the instruction at each address ina program is execute. Emulators allow breakpoints to be set ata particular point in a program to examine registers and memory locations, to determine the results from real-time inputs. Before a breakpoint is reached, the DSP algorithm is running at full speed as ifthe emulator were not present. An in-crcuit emulator (ICE) allows the final hardware to be tested at fll speed by connecting to the user's processor in the user's real-time environment. Cycle ‘counts can be determined between breakpoints and the hardware and software timing of a system can be examined. Emulators speed up the development process by allowing the DSP algorithm to un at full speed in a real-time environment. Because simulators typically execute DSP pro- grams several hundred times slower than in real-time, the wait for a program to reach @ particular breakpoint in a simulator can be a long one. Real world signals from A/D con- verters can only be recorded and then later fed into a simulator as test data. Although the test data may test the algorithm performance (if enough test data is available, the timing Sec.3.2 Typical Programming Tools for DSP ns of the algorithm under all possible input conditions cannot be tested using a simulator. ‘Thus, in many real-time environments an emulator is required. ‘The AT&T DSP32C simulator is a line-oriented simulator that allows the user to ‘examine all of the registers and pipelines in the processor at any cycle so that small pro- _grams can be optimized before real-time constraints are imposed. A typical computer dia Jog (user input is shown in bold) using the DSP32C simulator is shown below (courtesy of AT&T): sim; sac, Sim: b end bp set at adar Oxta Sin: rum 12 | 00000 __ 46 | r000008+ —__* Joo00: 221 = ox7¢(127) ‘eoo007e* Joos: * x2 = rit 20 | 000000" * ‘+ ro0007e* 0008: a3 = +42 25 | ro00010+___* Sabasar * Joode: riot = + rt 30 | ro0001¢" +s * 0010: wor 34 | eo00018" * * Jooud: 10 = r10 + oxsfen(-127) 38 | x00001c* * + 000080" |oo18: * x3 = r10 42 | x000020"" + ro00080" |ooe: a0 = float (#73) 4 | eoo0024++ . * * |o020: ae ates 2 | 000028" + Fo0007&* F000070** |0024 fd oad torte 59 | 000020 * 000068 ro0006a*+ | 028. vest aa ters. 
63 | r000030"-wo00080"" . * Jooze: a0 = ad * a3 63 | roo0034" + + ro0006c* 0030: al = *r8 + al + a3 73 | ro00038%« » F00905e* r000088** | 0034: sre + a3 * te 79 | roovase*___* ro0007e rov0060"*|0038: a2 = *x5 + al + +22 85 | ro004o+e + + |ooae: ai = al + 20 90 | zo000da+ * ro00080" Joodo: x7 = a0 = at + #3 breakpoint at end(Ox000084) decodei*z7 = a0 = al + *r2 Sime 27. 27 = 16000000 Sims made. € manie = 16 In the above dialog the flow of data in the four different phases of the DSP32C in struction cycle are shown along with the assembly language operation being performed. ‘The eycle count is shown on the left side. Register £7 is displayed in floating-point after the breakpoint is reached and the number of wait states is also displayed. Memory reads are indicated with an x and memory writes with ave, Wait states occurring when the same ‘memory is used in two consecutive cycles are shown with [ATaKT D5P32C EMULATION SYSTEM DESCRIPTION + Real-time application development + Interactive interface to the ICE card(s) provided by the DSP32C simulator through, the high-speed parallel interface on the PC Bus Interface Card 16 DSP Microprocessors in Embedded Systems Chas 5 0c.3:3 Advanced € Software Tools for DSP 7 * Simolator-cotolled software breskpoins; examine and change memory comegy or device registers + For mult-ICE configuration selection of which ICE card is active by the simula 0 tor APY =-AR3aS),0 ens) Flat SrationR® 4 3010903 Lorn g69kasnt ts 45618000 Loree 8.00,Ri te 01000001 ADDF ine oseot? 02607400 chee Firure 3.7 shows atypia screen fom the ADSP-21020 sren-ovinted simula, Thg simalor allows the timing and debug of assembly language routines. The ver ines ofthe simulator lets you completely contol program execution and easily change nt contents of registers and memory, Breakpoints can also be set to stop execution sf rogram at & particular point The ADSP-21020 simulator creates a epresenaion the device and memory spaces as defined in a system architecture fil. Itean also ne late input and output from VO devices using simulated datafile. Program exccutn ny be observed on a cycle-by-cycle basis and changes can be made on-line to comect enaar ‘The same screen based interface is used for both the simulator andthe in-cteut ema tor. A C-source level debugger is also available with this same typeof interface (see we ‘ion 33.1), Figure 38 shows a typical screen from the TMS320C30 screen-oriented simulator ‘This simulator allows the timing of assembly language routines as well as C source coe ‘because it shows the assembly language and C source code at the same time. All of he registers can be displayed after each step of the processor. Breakpoints can also be set stop execution ofthe program ata particilar point. I ENORY ‘of Synbols loaded 2 oe0ee FIGURE 3.8 TMS220C30 simulator in mixed mode assembly language mode {Courtesy Texas instuments) File Core Memory Execution Setup Help te Memory (Fined? | play. tito 33 ADVANCED C SOFTWARE TOOLS FOR DSP is nate "Rs ston desis sme of te moe vated iret lle fring. ee fis posh miseconr, Seco ceeeapa Cea ea in. 7 Cert ti we Son 233 desir ec ents as haan 7 eae Ost reoseneieg oasctincleyponcascra eta eS the numeric C extensions to the C language usi DSP algorithms as examples (see Section 2.10 for a description of numeric C) sresza791. 
| SS en 3.3.1 Source Level Debuggers ESB, testa Ect Conmasicon Aion & Cone (CAC) ns (Alen, PA) fer degger H Sees Be KernsPat anol ngs wa Cosune daar sels Sea eee ma me: compatible with the following vendors’ DSP32C board for the AT computer under MS. 7 is er DOS al CAC bom Ach TAT DOPRCDS ee Se MS E i isnt) Ergo) Tran, Longing Sound Ines and Soy Md ee eae ae Pica: (eeeter ancenrteseees, * ‘source code ofthe drives is provided to enable the user to port either debugger toa un- — Srna nernc me Both D3EMU (assembly language only) and D3BUG (C-source and mixed assem- bly code) are sereen-oriented user-friendly symbolic debuggers and have the following FIGURE 37 ADSP.21020 simulator displaying assembly language and processor ’ ‘episters (Courtesy Ansiog Devices 18 DSP Microprocessors in Embedded Systems Chap. 3 + Single-step, multiple breakpoints, run at full speed ‘+ Accumulatorregister/mem displays updated automatically after each step * Global variables with scrolling ‘+ Stand-alone operation or callable by host application suc OnLy ‘+ C-source and assembly language mixed listing display modes + Local and watch variables. + Stack trace display. ‘+ Multiple source filesdirectores with time stamp check, Figure 3.9 shows a typical screen generated using the D3BUG source level debug. ger with the DSP32C hardware executing the program. Figure 3.9 shows the mixed as sembly-C source mode of operation with the DSP32C registers displayed. Figure 3.10 shows the C source mode with the global memory location displayed asthe entire C pro- ‘gram is executed one C source line ata time in this mode. Figure 3.11 shows a typical screen from the ADSP-21020 simulator when C source level debugging is being performed using CBUG. C language variables can be displayed and the entire C program can be executed one C source Tine at atime in this mode, This ‘same type of C source level debug can also be performed using the in-circuit emulator, acc cak cont disk goto bait Lo acm code quitcreg step vars abt 1-008 1-reI9 nie Resor Macetten ricsritcexttttes 030008 freq2 writes Oxfff034 oxttfe3e $6x030060 ore0as3t Oxtitd3e oxtt tte Oxier Ob OxbE 838 : 3908047? ‘20610008, 4 nop oreinitChreq-retiot state vat ‘rlevstate. var! arian xt eree ereHee a2: 0.0090000e+000 43: 1, 70900080+036 FIGURE 39 _0SPS2C debugger D3BUG in mixed assembly C-source mode (Courtesy Communication Automation & Control (CAC), ne (Allentown, PA.) Sec.33 Advanced C Software Tools for DSP 19 reek cont disk goto halt Ivo men code quit reg step vars mix 1-D05 T-neip ‘GLOBAL VARIABLES aataite? loseeze: 2.60sssoe-¢ S881 /» Select two frequencies «/ gese> “req = see fess) Tren est 8655 7+ Calculate the frequency ratio between the se ne re a me ost 472.0: 3 Seu’ initialize cach oxcitiator sost> “oscintttrrey ratio! state werfablesi>: 8062) oscinittrroqaratiog, state-veriebleas); e003 S264 /= Generate 128 samples for cach oscillator #/ a> “osehtstate_varianieel 128. dete feos) Szcntstatenvarlebten2.128. 8068 re Add the tuo waveforms together «/ 0063> “add_tones(datal.datezy: 1.933304¢+026 2 1.9339040+026 freq ration Joseeec: 1.9393040+026 ‘freq ratio 900107 1 .9993046+026, 2 Now compute the fft using the ATAT applicatio errtacieer?edate ind a 9501047 FIGURE 3.10 05P32C debugger D3BUG in C-source mode (Courtesy Communication ‘Automation & Control (CAC, Ine. (Allentown, PA)) File Core tenory Execution Setup Help cB (muzik exe) Continue? ext?" Finish. GBreak> p> own ctxecution..> “ . Symbols.) codes > for(i = 0: tC endl? 
ive) € Sendout (sig_out): 7s turn off LED «7 -¢ expr— pstg_out Psat3-2500 ——____cauc states — 9 No debug symbois for key down()- Stepping Into code with synbols. err: User halt. Do « CBUG Step/Mext to resume C debugging. S Ere: User hait. Do a CBUG Steprnext to resune © dcuggiog Target Waited 98516749 FRGURE 2.11 ADSP.21020 simuletorespiaying © source code (Courtesy Ansiog 120 DSP Microprocessors in Embedded Systems Chap. 3 percent pass/(200.0«ratio (200.0 ="percent pass)/(2 aeitat = fa mete = Fitter we titrratio: + Isizewratio + 1: 2 Genie 2: for(i = 6: ic ratio : 109 ¢ MELD Chto franca e Deal loc(1size,sizeot (ttoat)?: length att, deltet. abeta >: | : Ls ' = APT Seine Loading ch2 out ats ‘84 Symbols loaded 2: elk 1906 FIGURE 2.12 TMS220C20 simulator in C-zource made (Courtesy Texas Instruments Figure 3.12 shows a typical seen from the TMS320C30 simulator when C source level debugging is being performed. C language variables can be displayed and the eatie program can be executed one C source line at atime inthis mode. 3.3.2 Assembly-C Language Interfaces ‘The DSP32C/DSP3210 compiler provides a macro capability for in-line assembly lan uage and the ability to link assembly language functions to C programs. In-line assem bly is useful to control registers in the processor directly or to improve the efficiency of key portions of a C function. C variables can also be accessed using the optional operands as the following scale and clip macro illustrates: asm void scale(fit_ptr, scales ‘ 8 ureg Mt_ptr, scale f,clip; clip) a0 = scalef * seit per al = -a0 + clip a0 = ifeie (clip) Slt ptree = ad = a0 ) ‘Assembly language functions can be easily linked to C programs using sever ‘macros supplied with the DSP32C/DSP3210 compiler that define the beginning andthe ‘nd of the assembly funtion so tht it conforms tothe register usage ofthe C compl. Sec.33 Advanced C Sofware Tools for DSP wat ‘The macro @B saves the calling function's frame pointer and the retum address, The ‘macro @H0 reads the retum address off the stack, performs the stack and frame pointer adjustments, and returns to the calling function. The macros do not save registers used in the assembly language code that may also be used by the C compiler—these must be saved and restored by the assembly code. ll parameters are passed to the assembly lan. ‘guage routine on the stack and can be read off the stack using the macro param), which gives the address of the parameter being passed. ‘The ADSP-210XX compiler provides an asm() construct for in-line assembly lan guage and the ability to link assembly language functions to C programs. Inline assem bly is useful for directly accessing registers in the processor, or for improving the eff ciency of key portions of a C function. The assembly language generated by am () is embedded in the assembly language generated by the C compiler. For example, asm(*bit set imask 0x40;*) will enable one of the interrupts in one cycle. C variables can also be accessed using the optional operands as follows: aem("80=clip #1 by 82)" : “sd (result) stat x), "a" yh) where result, x and y are C language variables defined in the C function where the ‘macro is used, Note that these variables will be forced to reside in registers for maximum efficiency. ADSP-210XX assembly language functions can be easily linked to C programs using several macros that define the beginning and end of the assembly function so that it conforms to the register usage of the C compiler. 
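The clip operations used in these in-line assembly examples are also easy to express in portable C, and a host-testable reference version of the DSP32C scale-and-clip macro can be sketched as follows (the function name, argument list, and symmetric clipping limits are assumptions made for this illustration, not part of either compiler's package):

    /* Portable reference for the scale-and-clip operation shown above.
       Each sample is scaled and then limited to the range -clip..+clip.
       (Illustrative only; not part of the DSP32C or ADSP-210XX tools.) */
    void scale_clip_ref(float *flt_ptr, float scalef, float clip, int n)
    {
        int i;
        float t;

        for(i = 0 ; i < n ; i++) {
            t = scalef * flt_ptr[i];       /* scale the sample            */
            if(t > clip) t = clip;         /* limit positive excursions   */
            if(t < -clip) t = -clip;       /* limit negative excursions   */
            flt_ptr[i] = t;
        }
    }

A reference routine like this also documents the intended numerical behavior when the hand-coded assembly is later ported to a different processor.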
The macro entry saves the calling funetion’s frame pointer and the retum address. The macro exit reads the return address ‘off the stack, performs the stack and frame pointer adjustments, and returns tothe calling function. The macros do not save registers that are used in the assembly language code ‘which may also be used by the C compiler—these must be saved and restored by the as. sembly code. The first three parameters are passed to the assembly language routine in registers r4,r8, and r12 and the remaining parameters can be read off the stack using the ‘macro reads () ‘The TMS320C30 compiler provides an asm() construct for in-line assembly lan- ‘uage. In-line assembly is useful to control registers in the processor directly. The assem bly language generated by aum() is embedded in the assembly language generated by the C compiler. For example, agm(" LDI GMASK, ZE*) will unmask some ofthe in. {errupts controlled by the variable MASK. The assembly language routine must save the calling function frame pointer and return address and then restore them before retuming to the calling program. Six registers are used to pass arguments to the assembly language ‘outine and the remaining parameters can be read off the stack, 3.3.3 Numeric C Compilers ‘As discussed in section 2.10, numerical C can provide vector, matrix, and complex oper- ations using fewer lines of code than standard ANSI C. In some cases the compiler may be able to perform better optimization for a particular processor. A complex FIR filter can be implemented in ANSI C as follows: 12 DSP Microprocessors in Embedded Systems Chap. 3 typedef struct ( ‘float real, imag: > comput: COMPLE float. 11024) ,w(1024); COMPLEX #26, we, out: fout-xeal = 0.0; cut-imag = 0.0; for(i = 0; i 0; le/=2) ( for (J #0; 4< les 50 w= twptes for (P23; hem;d2i42me) ( xip = xi + Je: tenp.real = xi->real + xip->real; Sec.3.3 Advanced € Software Tools for OSP 123 temp. imag = xi-simag + xip->imag: tm.real = xi->real ~ xip->realy tm. imag = xi->inag - Xip->imag: xipreal = tm.realtu.real ~ te. imag‘. imag; xip->inag = tm.real*u.inag + tm. inag*u.real; tad = temps ) ptr = wptr + winder: d index = 2rwindes; ) ‘The following code segment shows the numeric C implementation of a complex FFT ut the bit-reversal step: void fft_nc(int n, complex float +x, complex float *w) ‘ int size, sect, deg = 1; for(sizern/2 ; size > 0; size/=2) ( for(sects0 ; sect ‘The twiddle factors (w) can be initialized using the following numeric C code void init w(int n, complex float *w) ‘ float a = 2.0°Pt/n: w(T) = cost (i*a) + 1itsing (ray; ) [Note thatthe performance ofthe 4n4t_w function is almost identical to a standard C im- plementation, because most ofthe execution time is spent inside the cosine and sine func- 124 DSP Microprocessors in Embedded Systems Chap. 9 tions. The numerical C implementation ofthe FFT als hasan almost identical execution time as the standard C version, ‘4 REAL-TIME SYSTEM DESIGN CONSIDERATIONS. Real-time systems by definition place a hard restriction on the response time to one or ‘more events In a digital signal processing system the events are usually the aval ot "new input samples or the requirement that a new output sample be generated. In this sex ‘ion several real-time DSP design considerations are discussed ‘Runtime initialization of constants in a program during the startup phase ofa DSP's ‘execution can lead to much faster real-time performance. Consider the pre-caleulation \x. which may be used in several places in a particular algorithm. 
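As a small illustration of this idea (all names here are invented for the example), the reciprocal can be computed once in an initialization routine and then reused wherever the divide would otherwise appear:

    /* Pre-calculation sketch: the divide is performed once at startup and
       the real-time code then uses only a multiply (illustrative names). */
    static float one_over_x;

    void init_constants(float x)
    {
        one_over_x = 1.0f / x;       /* lengthy divide done once      */
    }

    float scale_sample(float in)
    {
        return in * one_over_x;      /* single multiply at run time   */
    }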
A lengthy divide «each case is replaced by a single multiply ifthe constant i pre-calculated by the compiler and stored in a static memory area. Filter coefficients can also be calculated during the startup phase of execution and stored for use by the real-time code. The tradeoff that re sults is between storing the constants in memory which increases the minimum memery size requirements ofthe embedded system or calculating the constants in real-time. Also, if thousands of coefficients must be calculated, the startup time may become exceeding Jong and no longer meet the user's expectations. Most DSP software development sys ‘ems provide a method to generate the code such that the constants can be placed in ROM, $0 that they do not need to be calculated during startup and do not occupy more expen- sive RAM, 3.4.1 Physical Input/Output (Memory Mapped, Serial, Polled) ‘Many DSPs provide hardware that supports serial data transfers to and from the processor as well as extemal memory accesses. In some cases a direct memory access (DMA) con. troller is also present, which reduces the overhead of the input/output transfer by transfer. fing the data from memory to the slower UO device in the background of the real-time program. In most cases the processor is required to wait for some number of cycles when- {ever the processor accesses the same memory where the DMA process is taking place. ‘This is typically a small percentage of the processing time, unless the input oF Outpt DMA rate is close tothe MIPS rating ofthe processor. Serial ports on DSP processors typically run at a maximum of 20 to 30 Mbits/ee: ‘ond allowing approximately 2.5 to 4 Mbytes to be transferred each second. Ifthe data input and output are continuous streams of data, this works well with the typical floating: point processor MIPS rating of 12.5 to 40. Only 4 tol0 instructions could be executed be- tween each input or output leading toa situation where very litle signal processing could be performed. Parallel memory-mapped data transfers can take place at the MIPs rating of the processor, ifthe MO device can accept the data at this rate, Ths allows for rapid transfers Of data in a burst fashion. For example, 1024 input samples could be acquired from 4 Sec.34 “Time System Design Considerations 125 10 MHiz A/D converter at full speed in 100 usec, and then a FFT power spectrum calcula. tion could be performed for the next 5 msec. Thus, every 5.1 msec the A/D converter's ‘output would be used, ‘Two different methods are typically used to synchronize the microprocessor with the input or output samples. The first is polling loops and the second is interrupts which are cliscussed inthe next section. Polling loops can be highly efficient when the input and output samples occur ata fixed rate and there are a small number of inputs and outputs, ‘Consider the following example ofa single input and single ouput at the same rate forts:) ¢ While (rin_status & 1); tout = Fileer (in) ) It is assumed that the memory addresses of 4m, out, and im_etatus have been de- fined previously as global variables representing the physical addresses ofthe UO ports, ‘The data read at 4n_status is bitwise ANDed with I to isolate the least significant bit, [If this bit is 1, the whe loop will loop continuously until the bit changes to 0. This bit Could be called a “not ready flag” because it indicates that an input sample is not avail able. 
As soon as the next line of C code accesses the Si location, the hardware must set the flag again to indicate thatthe input sample has been transferred into the processor. After the £41ter function is complete, the retumed value is writen directly tothe out, Pt location because the output is assumed to be ready to accept data. IF this were not the ase, another poling loop could be added to check if the output were ready. The worst ase total time involved in the filter function and at least one time through the while Polling loop must be less than the sampling interval for this program to keep up with the ‘eal-time input. While this code is very efficient, it does not allow for any changes in the filter program execution time. If the filter function takes twice as long every 100 samples in order (o update its coefficients, the maximum sampling interval wil be limited by this larger time. This is unfortunate because the microprocessor will be spending almost half Of its time dle in the whi.1e loop. Interruptriven VO, as discussed in the next section, ‘can be used to better utilize the processor in this case. 3.42 Interrupts and Interrupt-Driven 1/0. In an interapt driven VO system, the input or output device sends a hardware inteupt to the microprocessor requesting that a sample be provided to the hardware or accepted as input from the hardware. The processor then has a shor tine to transfer the sample. The interrupt response time is the sum of the interrupt lateney of the processor, the time fe uired to save the current context of the program running before the interrupt occurred {and the time to perform the input or output operation. These operations are almost always performed in assembly language so that the interupt response time can be minimized ‘The advantage ofthe interrupt-driven method is that the processor is fre to perform other tasks, such as processing other interrupts, responding to user requests or slowly changing 126 DSP Microprocessors in Embedded Systems Chap, the parameters associated with the algorithm. The disadvantage of interrupts i the over. head associated with the interrupt latency, context save, and restore associated with the imerrupt process. : “The following C code example (file INTOUT.C on the enclosed disk) illustrates the functions required to implement one output interrupt driven process that will generate 1000 samples ofa sine wave: Hinelude finclude ‘include *redspe.h* tefine s1ze 10 int output_store[SIZe]; dint in_inx = 0; volatile int out_ime = 0: void sendout (Float x) void output_ise (ine ino): int in_fifo[10000]; nt index = 02 void main() ‘ static float fas int 4,37 setup _codec(6): for(i = 0 ; 4 < ST2E-1 ; i++) sendout (4): sntermipt (S1G_IRQ3, output_isr); for(E=0.0 ; € < 1000.0 ; £ += 0.005) ¢ sendout (asinf (£°PD)) ; Seer £1825 = 0) ( 2 = 100.0*exp(is5e-5) ; Lea > 30000.0 |} i <0) 5 , > ‘void sendout (Float x) « ‘Sec.3.4 Real-Time System Design Considerations 2 SE (in_inx == S128) nine = while(in_inx == out_in) output_store(in_inx] = (int); ) void output_isr(int ino) ‘ volatile int ‘out int *)0x40000002 At(index < 10000) $n £ifo[indexe+]-16+in_inktout_inx; tout = output_storeout_inat+] << 167 Lflout_ine "= S128) out_ine = 0; d ‘The C function output_iar is show for illustration purposes only (the code is ADSP- 210XX specific), and would usually be writen in assembly language for greatest effi- ciency. The functions sendout and outpat_isr form a software first-in first-out (FIFO) sample buffer. 
After each interrupt the output index is incremented with a circular 0-10 index. Each call 10 sendout increments the 4m_dnx variable until itis equal to the owt_inxe variable, at which time the output sample buffer is full and the while loop will continue until the interupt process causes the owt_insc to advance. Because the above example generates a new a value every 25 samples, the FIFO tends to empty dur- ing the exgp function call. The following table, obtained from measurements of the exam- ple program at a 48 KHz sampling rate, illustrates the changes in the number of samples inthe software FIFO. ‘Sample Index © 7 6 5 ‘As shown in the table, the number of samples in the FIFO drops from 10 to 5 and then is quickly increased to 10, at which point the FIFO is again ful DSP Microprocessors in Embedded Systems Chap 5 The efficiency of compiled C code varies considerably from one compile othe nex, One say to evaluat the efficiency ofa compile iso ty diferent C construct, mt ‘case statements, nested Af statements, integer versus floating-point data whi ie ‘et8us for loops and soon, Is also important o reformulate any algoritim o eng sion o eliminate time-consuming function cals sch as cll to exponential, saree or transcendental functions. The following isa brief list of optimization techngucy ae can improve the performance of compiled C code. (1) Use of arithmetic identties—multplies by 0 or 1 should be eliminated whenever possible especially in oops where one iteration has a multiply by 1 or zero. alld Vides by a constant can also be changed to multiplies. ‘Common subexpression elimination—repeated calculations of same subexpression should be avoided especially within loops or between different functions, ‘Use of intrinsic functions —use macros whenever possible to eliminate the funcion call overhead and convert the operations to in-line code. Use of register varables—force the compiler to use registers for variables which can be identified as frequently used within a function or inside a loop. Loop unrolling—duplicate statements executed in a loop in order to reduce the number of loop iterations and hence the loop overhead. In some eases the loop is ‘completely replaced by in-line code. ‘Loop jamming or loop fusion—combining two similar loops into one, thus reduc ing loop overhead, Use post-ineremented pointers to access data in arrays rather than subscript vai- ables (xearray (+1 is low, xe*pt++ is faster) @ e o) © oy In order to illustrate the efficiency of C code versus optimized assembly code, the follow. ing C code for one output from a 35 tap FIR filter will be used: float in(35},coefs{35)..71 main() ‘ register float * = in, register float out; sw = costs, cout for(i 129 Optimised € Optimised Relative Procesor Code Cyl Assembly Cees oC Code) DepHc. co iW ws (ADSP-21000 185 ra ny TMSS20C30 Bar ro 7 [7] The relative efficiency of the C code is the ratio of the assembly code cycles to the C code cycles An efficiency of 100 percent would be ideal. Not that tis code segment is ‘one of the most efficient loops for the DSP32C compiler but may not be for the other compilers, This isilustrated by the following 35-tap FIR filter code: Hloat in(35) ,coats(351.y; ain() i register int i; register float *x register float +w register float out; for(i = 074697; See) fut += tee # wets , yrouts: > This €ox-loop based FIR C code will execute on the three different processors as fol- lows: Optimized Optimied RelaveEcency Processor Code Cyl Assembly Cyeker oC Code) BsPRe 0 ia? 
33 ADSP-21020 10 ry aoa "TMSI0C. 20 s 23 Note thatthe efficiency of the ADSP-21020 processor C code is now almost equal tothe efficiency of the DSP32C C code in the previous example. ‘The complex FFT written in standard C code shown in Section 3.3.3 can be used to 4.1 REAL-TIME FIR AND IIR FILTERS CHAPTER 4 REAL-TIME FILTERING Filtering is the most commonly used signal processing technique. Filters are usually used to remove of attenuate an undesired portion of a signal's spectrum while enhancing the desired portions of the signal. Often the undesired portion of a signal is random noi with a different frequency content than the desired portion of the signal. Thus, by design ing a filter to remove some ofthe random noise, the signal-to-noise ratio can be improved in some measurable way. Filtering can be performed using analog circuits with continuous-time analog in puts or using digital circuits with discrete-time digital inputs. In systems where the input signal is digital samples (in music synthesis or digital transmission systems, for example) 1 digital filter can be used directly. Ifthe input signal is from a sensor which produces an analog voltage or curent, then an analog-to-digital converter (A/D converter is required to create the digital samples. In either case, a digital filter can be used to alter the spec ‘rum of the sampled signal, x, in order to produce an enhanced output, y- Digital filtering ‘can be performed in ether the time domain (see section 41) or the frequency domain (see section 44), with general-purpose computers using previously stored digital samples o¢ in real-time with dedicated hardware. Figure 4.1 shows a typical digital filter structure containing NV memory elements sed store the input samples and N memory elements (or delay elements) used to store the ost put sequence. As a new sample comes in, the contents of each of the input memory ele ‘meats are copied to the memory clements fo the right. As each output sample is fomed 132 Sec. 41 [Time FIR and IR Filters 133 FIGURE 4.1 Fier structure of Nth order fiter. The previous Ninput and output sam- ples stored in the delay eloments ar used to form the output sum, by accumulating the products of the coefficients and the stored values, the output mem- ‘ory elements are copied tothe left. The series of memory elements forms a digital delay line. The delayed values used to form the filter output are called caps because each output makes an intermediate connection along the delay line to provide a particular delay. This filter structure implements the following difference equatio e ry Hn) =P bgxln—9)- Yay n~ p). an ‘As discussed in Chapter 1, filters can be classified based on the duration oftheir impulse response, Fillers where the a, terms are zero ate called finite impulse response (FIR) fil fers, because the response of the filter to an impulse (or any other input signal) cannot change NV samples past the last excitation, Filters where one or more of the dy terms are nonzero are infinite impulse response (IIR) filters. Because the output of an HR filter de- pends on a sum of the N input samples as well as a sum ofthe past N output samples, the ‘output response is essentially dependent on all past inputs. Thus, the filter output re- sponse to any finite length input is infinite in length, giving the ITR filter infinite memory. Finite impulse response (FIR) filters have several properties that make them useful for a wide range of applications. 
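Before turning to those properties, it is worth noting that equation (4.1) maps almost directly into C. The following sketch computes a single output sample from stored input and output histories; the array names and the assumption that older samples sit at higher indices are for illustration only, and the optimized routines presented later in this chapter do the same work with pointer arithmetic and in-place history updates:

    /* One output of y(n) = sum( b[q]*x(n-q) ) - sum( a[p]*y(n-p) ).
       xh[q] holds the input from q samples ago (xh[0] is the newest input)
       and yh[p] holds the output from p samples ago.  The feedback
       coefficients are a[1]..a[na-1]; a[0] is not used.  Illustrative only. */
    float diff_eq_output(float *b, int nb, float *a, int na,
                         float *xh, float *yh)
    {
        int q, p;
        float y = 0.0f;

        for(q = 0 ; q < nb ; q++)
            y += b[q] * xh[q];           /* feedforward (zero) terms */
        for(p = 1 ; p < na ; p++)
            y -= a[p] * yh[p];           /* feedback (pole) terms    */
        return y;
    }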
A perfect linear phase response can easily be con- 134 Real-Time Filtering Chap ¢ structed with an FIR filter allowing a signal to be passed without phase distortion, Siers ar inherent sabe, so stability concerns donot ars nthe design or imple {ation phase of development. Eventhough FIR filters typically require a large mim tmukiplies and adds pe input sample, they cn be implemented using fast como with FET algorithms (se section 441). Also, FIR structures are simpler and ease ‘implement with standard fixed-point cgi cicuits at very high speeds. The only ee bie disadvantage of FIR fers is that hey require more mutplics fora gven regres, fesponse when compared to UR ft and, therefore, often exhibit a longer pace, delay for the input to reach the output. During the pas 20 year, many techniques have been developed forthe design ang {implementation of FIR filters. Windowing is pechaps the simplest and oldest FIR. technique (se section 4.1.2), but is quit imited in practice. The window design mene has no independent contol over the passband and stopband ripple. Also filters witi a conventional responses, sich a maltple passband fiters, cannot be designed. On tks ‘other hand, window design can be done with a pocket calculator and can come cle ‘optimal in some cases, This section discusses FIR filter design with different equirpple error inthe pass ‘bands and stopbands. This class of FIR filters is widely used primarily because of tc well-known Remez exchange algorithm developed for FIR filers by Parks and McClellan. The general Parks-McClellan program can be used to design filters with so eral passbands and stopbands, digital differentiators, and Hilbert transformers. The FIR coefficients obtained program can be used directly with the structure shown in Figure 1 (with the ay terms equal to zero). The floating: point coefficients obtained canbe diraiy ‘used with floating-point arithmetic (see section 4.1.1. ‘The Parks-McCiellan program is available on the IEEE digital signal processing {ape oF as part of many of the filter design packages available for personal computers ‘The program is also printed in several DSP texts (see Elliot, 1987, of Rabiner and Gold, 1975). The program REMEZ.C is a C language implementation of the Parks-McClellan Program and is included on the enclosed disk. An example of a filter designed using the REMEZ program is shown atthe end of section 4.1.2. A simple method to obtain FIR fi. ter coefficients based on the Kaiser window is also deseribed in section 4.1.2. Although this method is not as flexible as the Remez exchange algorithm it does provide optimal 404B From these specifications 8, = 0.01145, “The result of Equation (42) is N = 37. Greater stopband attenuation or a smaller transi- tion band can be obtained with a longer filter. The filter coefficients are obtained by mul- tiplying the Kaiser window coefficients by the ideal lowpass filter coefficients. The ideal lowpass coefficients for a very long odd length filter with a cutoff frequency of f, are sven by the following sine function: sini) a) im Note that the center coefficient is k =O and the filter has even symmetry for all coeffi- cients above k = 0. Very poor stopband attenuation would result ifthe above coefficients 138 Ri wer ey it exe ee mii esa, However, by multiplying these coefficients by the appropriate Kaiser window, a2 > bandana cn eae Teyana ee ole} @) were (8) is modified zero order Bessel funtion ofthe fist kind, isthe Kaiser vn dow paranter which determines the opand stent. 
The empirical onl p We Ag i ss thi 50 a is B= 0.58424 (Agap = 209+ OOTERSY Cys Dh a for a stopbandatenuation of 40 dB, B= 3.39535" Listing 42 shows progish Ks which can be used to calculate the coefficients of a FIR filter using te Kates ec imethod. The length of the filter must be odd and bandpass; bandtop or highpas not can also be designed. Figure 4.3) shows the frequency response of the Teeny sr Point lowpass filter, and Figure 4.3) shows the frequency response of «35 pontion pass filter designed using the Parks McClellan program. The following compar tieg shows the results obtained using the REMEZ.C program: 2 (44 RIME FXCHANGE FIR FILTER DESIGN PROGRAM BXANPLEL — LonPASS PIUTER EXAMPLE? —- BANDPASS FILTER EXAMPLES — DIFFERINTTATOR EXAMPLES — HILBERT TRANSFORMER KEYBOARD ~~ GET INPUT PARAMETERS FROW KEYBOARD selection [1 to 5] 7 5 unber of coefficients (2 co 128) 2 35 Filter types are: I-Bandpass, 2-Differentiator, 3sHiiibert Silter type [1 to 3] 72 onber of bands (1 to 101 ? 2 Now inputting edge (corner) frequencies for 4 band edges edge frequency for edge (comer) #1 [0 to 0.5) 70 ede frequency for edge (comer) #2 [0 to 0.5] 7.39 ‘edge frequency for edge (comer) # 3 (0.19 to 0.5) 2.25 @ Magritte (48) Magid (48) FIR Filter Frequency Response lw | Seana 0a Or as 0202503 035 oa 04S 0. Frequency (ts) [FIR Filter Frequency Response 10, Ee —— , 20) ao | | lll Frequency (ts) FIGURE 43. (a) Froquency response of 37 tap FIR fiter designed using the Kaiser window method. tb Frequency response of 35 tap FIR itor designed Using the Parks-MeClellan program. 0 or s 139 “0 edge Exequency tor edge (comer) # gain of band #1 {0 to 1000) 7 2 weight of band #1 (0.1 to 100) 7 2 gain of band # 2 (0 to 1000] 7 0 weight of band #2 [0.1 to 100) 7 2 tooete = 35 Type = 1 ‘foands = 2 Grid = 16 (2) = 0.00 BIZ} = 0.19 B13} = 0.25 Ela] = 0.50 ain, wel) 100 1.00 Gain, wel2] = 0.00 1-00 234567 FINITE IMPULSE RESPONSE (FIR) LINEAR PHASE DIGITAL, FILTER DEST FIUTER LNT st0" MGULSE RESPOUSE HI BU HI By HY Hy HY HY HY Hi Hi HK HC Hi HG HG HG HC RIMEZ EXCHANGE ALCORN BANDPASS FILTER a) = 6 2-77 3 7 aes 5) = -8, 6) 2-1 nes a= 2. 9) = 2. ao) = -3. a) = -8. a) 33) aa) 35) 16) 1) 18) 3 a osu 6087549560002 35 '3600960016-003 (662615827¢-005 .691265583e-003 0864145950-003 359812578¢-003 ‘oen0s0s6ee-002 16960020916-003 (017050147e-002 '756078525e-003 (003477728e-002 '9075031060-003 171576865e-002 '4108154212-002 (073291821e-002 791494030e-002 317008¢79¢-001 '402931165e-001 =H Time Fitering 4 (0.25 to 0.5) 7 5 35) 38) 33) 32) 3) 30) 29) 28) 27) 26) 25) 2a) 23) 2) ay 20) 19) 18) Chop. 4 See. 41 ‘Time FIR and IR Fiktors “4 BAND 2 am 2 ‘voweR BAND EDGE 0.000000 _0.25000000 UPPER BAND EDGE 0.19000000 _0.50000000 DESIRED VALUE 100000000 0.00000000, weIarTDNG 00000000 © 1.00000000, DEVIATION 0-00808741 000808741, DEVIATION IN DB -41.84380885 ~41.84380886, EXTREMAL FREQUENCIES (0.02562800.0520833 00815972 0,1093750 9.13 71528, (011614583 0.182297 0.190000 0.2500000 02586806, o.277777@ 0.303194 013298611 0.576389 0.3854167 (0.4131984 04427083 0.470486 0.500000 FIR coefficients written to text file COm.DAT Note thatthe Parks-MeClellan design achieved the specifications with two fewer coeffi cienls, and the stopband attenuation is 1.8 dB better than the specification. Because the stopband attenuation, passband ripple, and filter length are all specified as inputs to the Parks-MoClellan filter design program, it is often difficult to determine the filter length required for a particular filter specification. 
Guessing the filter length will eventually reach a reasonable solution but can take long time. For one stopband and one passband, the following approximation for the filter length (N) ofan optimal lowpass filter has been. developed by Kaiser: 201080 8B; -13 + as 1468 where 8, =1-10"e 8, =10-/® P= ap ~ Spas Ss Aga isthe total passband ripple (in dB) ofthe passband from 0 0 frag Ifthe maximum ‘of tie magnitude response is 0 4B, then A, is the maximum atten¥ation throughout the passband. Agop is the minimum stopband attenuation in dB) ofthe stopband from fuyy t0 4J,/2. The approximation for N is accurate within about 10 percent of the actual required filter length (usually onthe low side). The ratio of the passband error (5,) tothe stopband error (8,) is entered by choosing appropriate weights for each band. Higher weighting of stopbands will increase the minimum attenuation; higher weighting of the passband will decrease the passband ripple. ‘The coefficients for the Kaiser window design (variable name £4r_1p£37%) and the Parks-McClellan design (variable name £1r_1p£35) are contained in the include file FILTER H. /* Linear phase FIR filter coefficient computation using the Kai design method. Filter length is odd. */ include finelude 500 )( printf (*\nt** Filter length td is too large.\nt, nfilt 5 exit (0) > printf(*\n...filter length: a ...beta: tf", néitt, beta} fe = (ip + fa): himpairl = £0; AE ( filtcat == 2) nimpair) = 1 - fer pite = FI’ fer for ( ne; n< mpair; nev)( i= (mpair ~ nl; hin} = sin(i * pifey / (i * Pr); SE( €lt_eat <2) hin =~ hinls d breaks case 3: case 4: printf(*\n—> ‘Transition bands mist be equal <—*) a ( ‘eftag = 0 switch (£11t eat) ( case 3: fal = get_tlost( fal_s, 0, 0.5); Epi = get_float( fpl_s, fat, 0.5); fp2 = get_float( fp2_s, fpl, 0-5): fa2 © getfloat( Ea2Wa, £2, 0.5); break; case & {pl = get_float( fpls, 0, 0.5); falls, fpl, 0.5); faze, fal, 0.5); fpzs, fa2, 0.5); 41 © fpt - fa1; a2 = £02 - E92; Af ( fabe(al - a2} > 18-5 9¢ printf( *\n,..error...transition bands not equal\n*) flag = -1; d } white (eflag) Goltaf = di; if (file cat == 4) delta€ = -deltat; siltertength( att, deltaf, anfilt, smpair, sbeta); if( mpair > 500 )¢ print€(*\nt** Filter length #4 is too large.\n*, nfilt 1: exit(0); ) print€( *\n..filter length: 4d .. beta: ¥£*, n€ilt, beta); HL = (fat + tpl) / 2; fa = (£92 + £92) / 27 fo = (fu ~ £1); m= (fu + £1) 2 himpair] = 2 * fe; if| £ile_cat = 4) hinpairy 1-2t te LUSTING 42 (Continuech us Time Filtering Chap. 
pife = PI * for tpitm = 2 ¢ Pr * fm for (n= 0; n < maiz; nee)( A= (pair =n); h(n} = 27 sin(i * pife) * coe(é * tiem / (i * PH; Ae( fiteeat == 4) hin} = -htn}; } breaks default: printé( *\att error\n* ); exit(0); (/* Compute Kaiser window sample values */ y = beta; valizh = izero(y): for (a= 0; n <= mpair; ne) ie (a= mpair) y= bota * eqet(2 - (iL / mpaiz) * (1 / pail); win) = izeroly) / valizb: , J+ faret halé of xesponse */ for(n = 0; n <= npair; née) x(n) © win) * hinds print€(*\n—Piret half of coefficient set...renainder by symmetry—*); peinte(\n # ideal window actual"): peinté(*\n coett value filter cost") for(ns0; n <= npairy nee) peinte(*\n Md 49.6£ 89.6£ —89.6E",m, inl, lal, (m1); d ) J* Uso att to got beta (for Kaiser window function) and nfilt (always 08d valued and = 2*npair +1) using Kaiser's emirical formas */ void filter-length(double att,double deltaf, int *nfilt,int *npair,double *betal c veeta = 0; _/* value of beta if att < 21 */ Aflatt >= 50) *beta = 1102 * (att = 8.71); Af (att < 50 & att >= 21) “beta = -5642 * pow( (ate-21), 0.4) + .07886 * (ate - 21); snpair = (dnt) ( (att ~ 8) / (29 * deleat) }: vngile = 2* tnpair +1; ) J Compute Bessel function Tzero(y) using a series approximation */ double izero(deuble y) 42; as = aa + (yy) tara); saa + as, ) whitet as > return(s) USTING42 (Continued See.4.1 Real-Time FIR and IR Filters us 4.123 IIR Fitter Function Infinite impulse response (IR) filters are realized by feeding back a weighted sum of past ‘output valves and adding this to a weighted sum of the previous and current input values. {In terms of the structure shown in Figure 4.1, IR filters have nonzero values for some or all ofthe a, values. The major advantage of IR filters compared to FIR filters is that a sven order IR filter can be made much more frequency selective than the same order FIR filter. In other words, IR filters are computationally efficient. The disadvantage of the recursive realization is thet IR filters are much more difficult to design and imple rent. Stability, roundoff noise and sometimes phase nonlinearity must be considered carefully in all but the most wivial IR filter designs. ‘The direct form IIR filter realization shown in Figure 4.1, though simple in appear- ance, can have severe response sensitivity problems because of coefficient quantization, especially asthe order ofthe filter increases. To reduce these effects, the transfer function is usually decomposed into second order sections and then realized either as parallel or cascade sections (See chapter 1, section 1.3). In section 1.3.1 an IIR filter design and im- plementation method based on cascade decomposition of a transfer function into second order sections is described. The C language implementation shown in Listing 4.3 uses single-precision floating-point numbers in order to avoid coefficient quantization effects associated with fixed-point implementations that can cause instability and significant changes in the transfer function. Figure 44 shows a block diagram of the cascaded second order IR filter imple ‘mented by the 44r_#£41ter function shown in Listing 4.3. This realization is known as ‘direct form If realization because, unlike the structure shown in Figure 4.1, it has only Section 1 Section N Input ON etory 2 Ot > Bow FIGURE 4.4. Block diagram of rea:ime IR fiter structure a¢ implemented by func- tion tte titer. 4s Real-Time Fitering ering Chap. Air filter ~ Perform IIR filtering sample by sample on floats Implements cascaded direct form 12 second order sections. 
Requires arrays for history and coefficients ‘The length (n} of the filter specifies the number of sections. The size of the history array is 2+ ‘The size of the coefficient array is 4*n + 1 because the first coefficient is the overall scale factor for the filter Returns one output sample for each input sample, Hloat iix filter (float input, float ‘coef, int n,float “history nee Float input sample pointer to filter costficients ‘unber of sections in filter history array pointer float “history Returns float value giving the current output. float itr filter(float input, float teoet, int n, float “history) i Sloat’ *thist1_per,*hist2_ptr, *coof_per ‘float output new hist, history1, history2; coof_ptr = coef; (7 coetticient pointer */ history: histh_ptr + a; Jt fist history */ /* next history */ Sutput = input + (eoef_ptre+); /* overall input seale factor +/ fori dens ie ( historyi = “histl_pte; history? = vhist2_pery /* history values +/ ‘output = output ~ history + (*coet_ptr++); ew hist = output = history2 * (*coafptres); —/+ poles */ ‘output = newhist + history * (*coef_pers); OUEpUE = oUtpUt + history? * (*coaf_ptrr+); | /* zeros */ shist2 pers = thiset_per: shistlptrey = new hist. LUSTING 43 Function 44x f11ter (input cout,n,bietory). (Continued) Sec41 Re I-Time FIR and IR Fitters “7 pist1_ptrte; nista_perees , return (output): STING 43. (Continued ‘wo delay elements foreach second-order section. This ealization is canonic inthe sense thatthe structure has the fewest adds (4, multiplies (8), and delay elements (2) for each second order section. This realization should be the most efficient for a wide vanety of general purpose processors as well as many of the processors designed specifically for Aigital signal processing. TUR filtering wil be illustrated using a lowpass iter wit similar specifications as used in the FIR filter design example in section 4.1.2. The only difference i tht inthe HR fier specification, linear phase response isnot required. Thus, the passband is 01 0.2, andthe stopband is 025 fo 05 f, The passband ripple must be less than 0 dB andthe Stand satenuation must be greater than 40 4B. Because epic fers (also called Cauer fiers) ‘enerlly give the smallest transition bandwith for a given order, an elliptic design wil be ‘sed. After refering othe many elliptic iter tables, itis determined that fifth order elliptic fier will met the specifications. The elliptic filter tables in Zvere (1967) give an entry for a filter witha 0:28 dB passband ripple and 40.19 dB stopband attenuation as follows 3250 (stopband start of normalized prototype) ‘5401 (fist order real pote) 0.5401 (real part of first biguad section) 05401 (real pat of second biquad section) 1.0277 (imaginary part of frst biquad section) 1.9881 (first zero on imaginary axis) 2,=0.7617 (imaginary part of second biquad section) 4= 1.3693 (second zero on imaginary axis) As shown above, the tables in Zverev give the pole and zero location (real and imagi- nary coordinates) of each biquad section. The two second-order sections each form a conju ‘ate pole pair and the first-order section has a single pole on the real axis. Figure 4.5(a) shows the locations ofthe 5 poles and 4 zeros on the complex s-plane. By expanding the ‘complex pole pars, the s-domain transfer function ofa fifth-order filter in terms ofthe above variables can be obtained. The z-domain coefficients are then determined using the bilinear transform (see Embree and Kimble, 1991). 
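A rough sketch of that step is shown below. It ignores the frequency pre-warping and gain normalization that a complete design program performs (so it will not reproduce the exact coefficients listed next), and the names are invented for the illustration; it simply maps one conjugate pole pair at s = -sigma +/- j*omega through the bilinear transform to the denominator of a digital second-order section:

    /* Map one conjugate pole pair at s = -sigma +/- j*omega through the
       bilinear transform  s = k*(1 - z^-1)/(1 + z^-1)  to the denominator
       1 + a1*z^-1 + a2*z^-2  of a digital biquad.  Pre-warping and the
       numerator (zero) mapping are omitted; names are illustrative. */
    void bilinear_pole_pair(double sigma, double omega, double k,
                            double *a1, double *a2)
    {
        double b1 = 2.0 * sigma;                   /* s^2 + b1*s + b0 */
        double b0 = sigma * sigma + omega * omega;
        double d  = k * k + b1 * k + b0;           /* normalizing term */

        *a1 = (2.0 * b0 - 2.0 * k * k) / d;
        *a2 = (k * k - b1 * k + b0) / d;
    }

With k chosen to place the cutoff at the desired fraction of the sampling rate (the pre-warping step), the same substitution applied to the numerator factors produces the zeros of each section.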
Figure 45(b) shows the locations of the poles and 2er0s onthe complex z-plane. The resulting z-domain transfer function is as follows: 0ssxi+e) 140.7042? 10.0103 + 27 104362 1-0:5232"" —0.8627 10.6962" Figure 4.6 shows the frequency response ofthis Sth order di 3 @) FIGURE 45 Polo-zor pot of fit order eliptic HR lowes fer. a) = plane reprosentation of analog prote- {ype fifth-order alliptfter. Zeros are inicated by “0” and poles ae ind cated by “x (B) plane ropes tion of lowpass digital fiter vith ct - ‘off roquency at 0.2 f, In each case, ‘The function 44z_£42tor (shown in Listing 4.3) implements the direct form Il cascade filter structure illastrated in Figure 4.4. Any number of cascaded second order sections can be implemented with one overall input (xj) and one overall output (yj). The coett- ‘cient array for te fith order elliptic lowpass filter is as follows: See. 4.1 @ ‘Magsinde Real-Time FIR and llR Fikers TUR Fier Frequeacy Response os) oa 4] og} os oul O15 02 023 a3 035 04 as Frequency (0) [UR Fer Fequeny Response as 00s 0101s 0202503 Frequency (0s) FQURE 46 a) Lowpass fith-ordr elliptic I fier linear magnitude fe ‘quency response. {b) Lowpass fifth-order lipic IR fiter frequency ‘sponse. Log magnitude in decibels versus frequency. us 150 RealTime Fitering Chap g float Lim tp#5 (131 = ¢ 00552961603, 4363630712, 0.0000000000, 1.0000000000, ~0,5233039260, 0.8604439497, 0.7039934993 ~0.6965782046, 0.4860509932, -0.0103216320, 0000000000, 1.0000000000,, 10000000003 ‘The numberof section required fr thi iter i thee, Because the frond section i {implemented in the same way asthe second-order sections, excep thi the second terms (Uh third and fith coefficient) are zero, The coefficients shown above wen tained using the bilinear transform and are contained in the include file FILTER tf definition of this filter is, therefore, global to any module that includes FILTER. Te Adx_f£4.1ter function filters the floating-point input sequence on a sample-y. basis so that one output sample is returned each time 44 _£41ber is invoked. The Re {ory array is used to store the two history values required for each second-order seen ‘The history data (wo elements per section is allocated by the calling function, The np tial condition ofthe history variables is zero if cal Loc is used, because it sets all hoy located space to zero. Ifthe history ary is declared as static, most compilers iitaine Static space to zero. Other initial conditions canbe loaded into the filter by allocating and initializing the history aray before using the 44z_€42tex function. The coefficcos ofthe filter are stored with the overall gain constant (K) frst, followed by the denoming. {or coefficients that form the poles, and the numerator coefficients that form the zero fr cach section. The input sample is first scaled by the K value, and then each second order section is implemented. The four lines of code in the 4ix_£41ter function used ton: plement each second-order section ae as follows: ourput = output - historyl + (*coet_ptre+); new hist = outpat ~ history? * (‘coef ptr+t); + poles */ output = now hist + history * (*coef_ptr++); Output = output + history2 * (scoef_perts); /* zeros */ ‘The hi story1 and history? variables are the current history associated with the ec- tion and should be stored in floating-point registers (if available) for highest efficiency. 
‘The above code forms the new history value (the portion of the output which depends oa the past outputs) in the variable new_hisst to be stored in the history aay for use by the next call to 44z_£41ter. The history aray values ae then updated as follows snista_ptres = vateel_ptry shisth_ptrt+ = new hist; histlpte++; hist2_ptres; ‘This results in the oldest history value (*hiwt2_ptx) being lost and updated with the more recent *histi_ptr value. The mew hist value replaces the old 800. 4.1 ‘+histi_ptr value for use by the next call to 44x_£41tex. Both history pointers are incremented twice to point to the next pair of history values to be used by the next second-order section, 4.14 Real-Time Filtering Example Real-time filters ae filters that are implemented so that a continuous stream of input sam- ples can be filtered to generate a continuous stream of output samples, In many cases, real-time operation restricts the filter to operate on the input samples individually and generate one output sample for each input sample. Multiple memory accesses o previous input data are not possible, because only the current input is available o the filter at any sven instant in time. Thus, some type of history must be stored and updated with cach ‘new input sample. The management of the filter history almost always takes a portion of the processing time, thereby reducing the maximum sampling rate which can be sup- ported by a particular processor. The functions f1x_£41ter and 4ir filter are im. plemented in a form that can be used for real-time filtering, Suppose thatthe functions ‘getinput () and sendout () rctum an input sample and generate an output sample at the appropriate time required by the external hardware. The following code ean be used withthe 4ir_£42ter function to perform continuous real-time filtering: static float histitél; for(23) sendout (dir_filter (getinput() ix Ipf5,3,nisti)): In the above infinite loop for statement, the total time required to execute the 4m, Aix_tilter, and out functions must be less than the filter sampling rate in order 10 insure that output and input samples are not lost. In a similar fashion, a continuous real. time FIR filter could be implemented as follows: static float hist£i3e); For(::) sendout (fir ¢ilter(getinput (), fix 1pe35,35,hiset)); Source code for sendout () and getimput () interrupt driven input/output functions is available on the enclosed disk for several DSP processors. C code which emulates Eimput () and sendout () real-time functions using disk read and write functions is also included on the disk and is shown in Listing 44. These routines can be used to debug real-time programs using a simpler and less expensive general purpose computer environment (IBM-PC or UNIX system, for example). The functions shown in Listing 44 read and waite disk files containing floating-point numbers in an ASCH text format, The functions shown in Listings 4.5 and 4,6 read and write disk files containing fixed-point numbers in the popular WAY binary file format. The WAV file format is part of the Resource Interchange File Format (RIFF), which is popular on many multimedia platforms. (text continues on page 158) ws2 Real-Time Filtering #include J* getinput ~ get one sample fron disk to simlate real-time Jeput +) oat getinput() ‘ static PILE *fp = NULL: float x: /* open input file if not dane in previous calls */ AE (IED) ( char ${00); printf(*\ngnter input file name ? 
+ gets(sl: fp = fopen(s,*r*); sect) printf(*\nerror opening input file in GETINPUD\nt); exit(a)s ) > [* read data until end of £ile */ Lf (fscant (fp, "BE*,6x) te 1) exit (a) return(x): , J? sendout ~ send sample to disk to simlate real-time output */ void sendout (float x) ‘ static FILE *fp = NULL: (* open output file if not done in previous calle */ SEU) ( char 2(801; printf(*\nmcer output file nane ? gete(s): fp = fopents,*w"): ifuitp) printf(*\nBrror opening output file in SEDOUT\A"): exit); > ) Jt write the sample and check for errors cele ee eens ea prinf(*\ngrror writing output fle in SENDOUT\n" exited); LISTING 44. Functions eendout(outyut) and gettnput () used to emulate Fealtime inpwoutput using ASCII text dat fle (contained in GETSENDI- Chap, 4 Sec. 4.1 Real “Time FIR and UR Fi 153 sinctude finclude Hinetude (fetuae rae. Hinelude include “wavne.be include *rtdspe.h* jv code to get samples from a WAV type file format */ yr getinpat - get one sample fron disk to similate realtime input */ + input Wav format header with mull init */ ‘RAVER win, on): can oR cin + OL Ys DATASOR din = (7, OL) WRVEPOMAT wavin = (1, 1, OL, OL, 2, 8}; + global number of samples in data set */ ‘unsigned long int nunber_of samples float getinput () static FILE "fp getwav = NULL: static channel. pusber short int int data(4]; J max of 4 chamels can be read */ unsigned char byte data(é]; /* max of 4 channels can be read */ short int 3; 7" open input file if not done in previous calls */ iE fp_getwav) ( char ©(80); Printé(*\nBater input .WAV file name? *); gets(s) 2 fp_gotwav = fopan(s, xb") Lf(fp_aotwav) ( rint€(*\nfcror opening *.WAY input {lle in GETENPUT\n"); exit); ) (* read and display header information */ fr0ad(éwin, sizeof (WAVE_HDR), 1, fp_getwav) + printf ("\ntototcte" win, chunk 440] ,win-chunk @( 1) ,win.chunk_€(2),win.chunk £4131); print é("\nchunkSize = $id bytes*,win.chunk size) LUSTING 45 Function getsspue() used to emulate realtime input using WAV format binary datafile (contsined in GETWAV.C).{Continuech 154 Real-Time Fitering Af (stenicmp(win chunk 4d, “RIFE*,4) 1= 0) ( ‘printe(*\nError in RIFP header\n); exit (aly , Fread(ucin, sizeot (CHUNK HDR) 1, p_getwav) ; peinté(*\n") for(i = 0; print£(*\ntumber of channels = 44°, wavin.nchannels) printf (*\ndanple rate = 1d" wavin.nsamplesPersec}; printf (-\nBlock size of data’= t4 bytes" wavin.nBlockALign} printf (-\nBits per Sample = ta\n*,wavin.wBitsPersanple); /* check channel munber and block size are good */ if (wevin.nchannels > 4 || wavin.nBlockAlign > 8) ( printf(*\nBrror in WAVEfat header ~ Channels/BlockSize\n") exit(1); , fread (édin,sizeot (DATA HDR) ,1, fp_getwav) + prints (=\ntctorctes in.data_typo{0] ,din.data_typel} din.data_type(2] ,din.data_type(31); printf ("\nData Size = 41d bytes ,din.data size); /* e0t the number of samples (global) */ rumber_of_samples = din.data_size/wavin.nBlockAlign; printf Aftwavin.nchannels > 1) ¢ & ( printf(*\nrror Channel Number (0..¥d) - *,wavin.nchannels-1); USTING 45 (Continued) Chap. 4 "\nunber of Samples per Channel = #1d\n* unber_of_sarples!: Sec.4.1 Real-Time FIR and WR Filters 155 A= gotchet) - +0"; Aft < (09) exiten; D whéle(s <0 || i >= wavin.nchannets) ; channel number = , , (jy read data until end of file */ LE (vavin.wltsPersample == 16) ¢ Af (fread (int_data,wavin.nBlockAlign, 1, fp_getwav) = 1) ( flush(); /* flush the output when’ input runs cue */ exit); ) 4 = int_data{channel_nunber! 
, else { Af (fread (byte_ data, vavin-nBlockALign,1,fp.getwav) t= 1) ( flush); /* flush the output when input runs out */ exit it) > j= byte datalchannel_munber) ; 5 7 0x80; je; ) return ( (float) 5); UsTING 45 (Continued finelude include Hinelude Hinelude nat finelude “weave ht ‘include *rtaape.h" /" code to send samples to a WAV type file format */ /" define BITSIG if want to use 16 bit samples */ /* sendout ~ send sample to disk to similate realtine output */ Static FILE *tp_sendway = ULL; Static DWHORD samples_sent = Ob; /* used by flush for header */ LISTING 4.6 Functions sentoctoutmut) and fiuah() used to emulate feaktime output using WAV format binary data files (contained in 'SENDWAV.C) (Continuech 186 Res ime Filtering Chap. g /* wa) format header init +/ ‘static WAVEHDR wout = ( "RIFF", OL}; /* £i11 size at flush */ ‘static CHUNK HDR cout = ( TMAVEEnt * , sizeof (WAVEPORMAT) 7 Static DATA HDR dout = ( “data”, OL }; /* £11 size at flush +/ static WAVEFORMAT wavout = (1, 1, OL, OL, 1, 8} Ant ByteaPersample; short int J; 7* open output file if not done in previous calls */ TEI €>_senduav) ( char #(80); printf(*\nbnter output *.WAV file name 2°): gets(s): fp_sendav = fopen(e, *wb"): AE (fp sendwav) ( print€(*\nBrror opening output *.WAY file in SENDOUT\n*); exit: ) (J+ write out the *.WAV fie format header */ #ifdet BrTsi6 ‘wevout waitaPersample = 16; tiavout ‘nBLoekAL ign = 27 Brint#(*\nUsing 16 Bit Sanples\n"); terse wavout waiterersample = 8; tenaif ‘wavout nSamplesPersec = SAMPLE_RATE; BytesPersample = (int) ceil (wavout waitsPersample/8.0); “wavout ,vgiytesPersec = ByteaPerSample*wavout .nSanplesPerSect ferrite cwout, sizeof (HAVE_IDR) 1, fp_sendev) fwrite scout, sizeof (CHUNK HOR), 1, fp_sendwa) forrite twavout £1 z00f (WAVEFORMAT) ,1, £p_sendwav) : Furie sdout, oizeof (DATA HDR) 1, fp_sendaav) | , /* write the sample and check for exrors */ /* clip output to 16 bits */ = (shore intx: LUSTING 46 (Continued Sec. 4.1 Real-Time FIR and IR Filters 167 ee > 32767.0) 3 = 32767; else if (x < -22768.0) 5 = -32768; fifdef BITSI6 3 "= 0x8000; Af (fwrite(Gj,sizeof(short int) 1, fpsendwav) 1= 1) ( printf (*\nfirror writing 16 Bit output *.WAV Eile in SENDOU\n"); exit(1): }* elip output to 8 bite +/ yi 8 3 “= 0x80; LE(fpube (5. fp_eendav) = BOF) ( printf(*\nBrror writing output *.WAV file in SENDOUT\n*) exits tenait samples_sente+; routine for flush ~ mist call this to update the WAV header + wold flush ‘ int BytesPersample; Bytespersample = (int)ceil (vavout.wBitePerSample/#.0); out .data_size-BytesPersample*samples_sent: Wout chunk si Gout data sizessizeof (DATA HDR) +2izeot (CHUNK HOR) ¢sizeot (WAVERORIAT): (* check for an input WAV header and use the sampling rate, if valid */ if (stenicap(win.chunk id, "RIFF",4) == 0 sk wavin.nsanplesPersee != 0) ( avout nsanplesPersec = wavin.nsamplesversec; wavout navgbytesPersec = BytesPerSanple*wavout nSamplesPerSec; ) Eseek(fp_sendnav, OL, SEBSED); ‘write (wot, 6izeof (WAVE HDR), 1, fp_eendvav) ; furite (icout, sizeof (CHUNK }DR) 1, £_sendaa) furite (awavout, sizeof (WAVEFORMAT),1,fp_sendav) furite (adout, sizeof (DXTA KOR) 1, fp_sendwav) USTING 46 (Continuech 153 RealTime Fitering Chay 4 42 FILTERING TO REMOVE NOISE ols is generally unanted and can usualy be redaced by some type of fering Noe canbe highly comelated withthe signal or na completly diferent quency Ban, which cas itis uncomelated. 
Some types of nose are impulsive innate and cccar tively infequenly, wile ter types of oie appear as narrowband toes near es of interest The most common typeof nose is wideband thermal noise which ngen in the senor or the amplifying electronic circuits. Such noise can often be concn white Gaussian noise, implying that he power spectrum sft andthe distribution ‘mal. The most important considerations in deciding what ype of filer to use to eae oie are the type and characteristics ofthe noise. In many’ cases, very litle bmoes about the nose process contaminating the digital signal and itis usually costly Gn oe, ‘of time and/or money) to find out more about it. One method to study the nose pane mance of a digital system is to generate a model ofthe signal and noise and smu system performance inthis ideal condition. System nose simulation is lustated i ‘next two sections. The simulated performance ean then be compared tothe system pore ‘mance with rel data orto a theoretical mode. 4.2.1. Gaussian Noise Generation The function gaussian (shown in Listing 47) is used for noise generation and is con. tained in the FILTER.C source file, The function has no arguments and retums single random floating-point number. The standard C library function rand is called to eee ate uniformly distributed numbers. The function ana normally returns integers from 0 {0 Some maximum value (a defined constant, RAND_MAX, in ANSI implementations). As shown in Listing 4.7, the integer valves returned by and are converted to float val- es to be used by gaussian. Although the random number generator provided with ‘most C compilers gives good random numbers with uniform distributions and long pe ‘ods, if the random number generator is used in an application that requites truly random, uncorrelated sequences, the generator should be checked carefully. Ifthe rand function is in question, a standard random number generator ean be easily written in C (see Pak and Miller, 1988). The function gaussian returns a zero mean random number with 2 unit variance and a Gaussian (or normal) distribution. It uses the Box-Muller method (see Knuth, 1981; or Pres, Flannary, Teukolsky, and Vetterling, 1987) to map a pair of inde- pendent uniformly distributed random variables toa pair of Gaussian random variables, The function rand is used to generate the two uniform variables v1. and v2 from -1 +1, which are transformed using the following statements es vitvi + varva; fac = sqrt (-2.*logir)/x): gstore = vi*fac; gaus = va*facs The X variable isthe radius squared of the random point on the (wa, ¥2) plan. Inthe ‘gaussian function, the x value is tested to insure that itis always les than 1 (which it Sec.42 Filtering to Remove Noise 159 gmussian ~ generates zero mean unit variance Gaussian random nunbers Returns one zero mean unit variance Gaussian randon numbers as a double. Uses the Box-Muller transformation of two uniform random munbers te ccaussian random nunbers. 
‘Float gaussiant) c static int reaay J fag to indicated stored value */ static float gstore; [* place to store other value */ static float reonsti = (float) (2,0/RAND_YAX) + static float reonst2 = (float) (RAND_MAX/2.0): ‘float v1.¥2,, fac, gaus: J make two nusbers if none stored */ Sf (ready == 0) { et vA = (Eloat}vand() = roonst2; ¥2 = (£loat}rand() ~ reonst2; vi ‘= const; va *= consti: r=vini + vane; ) whiter > 1.08); /* make radius less than 1 */ (* remap v1 and v2 to two Gausaian munbers */ fac = sqrt (-2.0f"109(r}/2): gstore = vitfac; 7 store one */ aus = v2"fac; /* return one */ ready = 1 (* set ready flag */ , else ( ready = 0; /* reset ready flag for next pair */ gaus = gstore; /* return the stored one */ , return (gaus); LUSTING 47. Function guunsian(). usually is), 50 that the region uniformly covered by (vi, v2) is a circle and so that ‘Jog (x) is always negative and the argument for the square root is positive. The vari. ables gatore and gaus are the resulting independent Gaussian random variables, Because gatusaian must retun one value ata time, the getore variable is a static Aoating-point variable used to store the v1* ac result until the next call to gausaian, 160 Real-Time Firing Chap. g ‘The static integer variable ready is used as a flag to indicate if gatore has been stored or if two new Gaussian random numbers should be generated i 15) 4.2.2 Signal-to-Noise Ratio Improvement if One common application of digital fering is signa-t-nolse ratio enhancement. 1 tye signal has limited bandwidth and the noise has a spectrum tat is broad, then after cay be used to remove the part of the noise spectrum that does not overlap the signal spec. trum, If the filter is designed to match the signal perfectly, so that the maximum amon, ‘of noise is removed, then the filter i called a matched or Wiener filter. Wiener fering briefly discussed in section 1.7.1 of chapter I. Figure 4.7 shows a simple example of fikering a single tone with added white noise. The MKGWN program (see Listing 4.8) was used to add Gaussian white noise with a standard deviation of 0.2 toa sine wave at a 0.05 f, frequency as shown in Figue 4:1). The standard deviation of the sine wave signal alone can be easily found tobe (0.7107. Because the standard deviation ofthe added noise is 0.2, the signal-to-noise ratio @ Sample Value 05} of the noisy signal is 3.55 or 11.0 dB. Figure 4.7(b) shows the result of applying the 35. “9 0 700 150 300 350 tap lowpass FIR filter tothe noisy signal. Note that much of the noise is sill presen ba fecce nee is smaller and has predominantly low frequency components. By lowpass filering the Program MKGWN.C Ou 250 noise samples added to the sine wave separately, the signal-to-noise ratio of Figue 4.7(b) can be estimated to be 15 dB. Thus, the filtering operation improved the signal noise ratio by 4 dB. 15) 4.3 SAMPLE RATE CONVERSION Many signal processing applications require thatthe output sampling rate be diffrent than the input sampling rate, Sometimes one secon ofa system can be made mor eff- cient if the sampling rate is lower (suchas when simple FIR filter are involved o ina ‘uansmission) In other cases, the sampling rate must be increased so that the spectral de tals ofthe signal canbe easily identified. In either case, the input sampled signal must be resampled to generate a new outpat sequence with the Same spectral characteristics bu at 2 different sampling rat, Increasing the sampling rate is called interpolation or upsom- piling. 
4.3 SAMPLE RATE CONVERSION

Many signal processing applications require that the output sampling rate be different from the input sampling rate. Sometimes one section of a system can be made more efficient if the sampling rate is lower (such as when simple FIR filters are involved or in data transmission). In other cases, the sampling rate must be increased so that the spectral details of the signal can be more easily identified. In either case, the input sampled signal must be resampled to generate a new output sequence with the same spectral characteristics but at a different sampling rate. Increasing the sampling rate is called interpolation or upsampling. Reducing the sampling rate is called decimation or downsampling. Normally, the sampling rate of a band-limited signal can be interpolated or decimated by integer ratios such that the spectral content of the signal is unchanged. By cascading interpolation and decimation, the sampling rate of a signal can be changed by any rational fraction, P/M, where P is the integer interpolation ratio and M is the integer decimation ratio. Interpolation and decimation can be performed using filtering techniques (as described in this section) or by using the fast Fourier transform (see section 4.4.2).

Decimation is perhaps the simplest resampling technique, because it involves reducing the number of samples per second required to represent a signal. If the input signal is strictly band-limited such that the signal spectrum is zero for all frequencies above fs/(2M), then decimation can be performed by simply retaining every Mth sample and discarding the M - 1 samples in between. Unfortunately, the spectral content of a signal above fs/(2M) is rarely zero, and the aliasing caused by the simple decimation almost always causes trouble. Even when the desired signal is zero above fs/(2M), some amount of noise is usually present that will alias into the lower frequency signal spectrum. Aliasing due to decimation can be avoided by lowpass filtering the signal before the samples are decimated. For example, when M = 2, the 35-point lowpass FIR filter introduced in section 4.1.2 can be used to eliminate almost all spectral content above 0.25 fs (the attenuation above 0.25 fs is greater than 40 dB). A simple decimation program could then be used to reduce the sampling rate by a factor of two. An IIR lowpass filter (discussed in section 4.1.3) could also be used to eliminate the frequencies above fs/(2M) as long as linear phase response is not required.

4.3.1 FIR Interpolation

Interpolation is the process of computing new samples in the intervals between existing data points. Classical interpolation (used before calculators and computers) involves estimating the value of a function between existing data points by fitting the data to a low-order polynomial. For example, linear (first-order) or quadratic (second-order) polynomial interpolation is often used. The primary attraction of polynomial interpolation is computational simplicity; a minimal sketch of the first-order case is shown below.
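The following short function illustrates first-order (linear) interpolation between two known samples; the function name and interface are illustrative only. The same idea reappears in the WAVETAB program of section 4.5.2, which linearly interpolates between adjacent entries of its waveform table.

    /* First-order (linear) interpolation between two known samples y0 and y1.
       frac is the fractional distance from y0 toward y1 (0.0 <= frac < 1.0). */
    float linear_interp(float y0, float y1, float frac)
    {
        return(y0 + frac*(y1 - y0));
    }

For example, the value one quarter of the way between x[n] and x[n+1] is estimated by linear_interp(x[n], x[n+1], 0.25).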
The primary disadvantage is that, in signal processing, the input signal must be restricted to a very narrow band so that the output will not have a large amount of aliasing. Thus, band-limited interpolation using digital filters is usually the method of choice in digital signal processing.

Band-limited interpolation by a factor P:1 (see Figure 4.8 for an illustration of 3:1 interpolation) involves the following conceptual steps:

(1) Make an output sequence P times longer than the input sequence. Place the input sequence in the output sequence every P samples and place P - 1 zero values between each input sample. This is called zero-packing (as opposed to zero-padding). The zero values are located where the new interpolated values will appear. The effect of zero-packing on the input signal spectrum is to replicate the spectrum P times within the output spectrum. This is illustrated in Figure 4.8(a), where the output sampling rate is three times the input sampling rate.

(2) Design a lowpass filter capable of attenuating the undesired P - 1 spectra above the original input spectrum. Ideally, the passband should be from 0 to fs'/(2P) and the stopband should be from fs'/(2P) to fs'/2 (where fs' is the filter sampling rate, which is P times the input sampling rate). A more practical interpolation filter has a transition band centered about fs'/(2P). This is illustrated in Figure 4.8(b). The passband gain of this filter must be equal to P to compensate for the inserted zeros, so that the original signal amplitude is preserved.

(3) Filter the zero-packed input sequence using the interpolation filter to generate the final P:1 interpolated signal. Figure 4.8(c) shows the resulting 3:1 interpolated spectrum. Note that the two repeated spectra are attenuated by the stopband attenuation of the interpolation filter. In general, the stopband attenuation of the filter must be greater than the signal-to-noise ratio of the input signal in order for the interpolated signal to be a valid representation of the input.

4.3.2 Real-Time Interpolation Followed by Decimation

Figure 4.8(d) illustrates 2:1 decimation after the 3:1 interpolation, and shows the spectrum of the final signal, which has a sampling rate 1.5 times the input sampling rate. Because no lowpass filtering (other than the filtering by the 3:1 interpolation filter) is performed before the decimation shown, the output signal near fs''/2 has an unusually shaped power spectrum due to the aliasing of the 3:1 interpolated spectrum. If this aliasing causes a problem in the system that processes the interpolated output signal, it can be eliminated by either lowpass filtering the signal before decimation or by designing the interpolation filter to further attenuate the replicated spectra.

FIGURE 4.8 Illustration of 3:1 interpolation followed by 2:1 decimation. The aliased input spectrum in the decimated output is shown with a dashed line. (a) Example real input spectrum. (b) 3:1 interpolation filter response (fs' = 3 fs). (c) 3:1 interpolated spectrum. (d) 2:1 decimated output (fs'' = fs'/2).

The interpolation filter used to create the interpolated values can be an IIR or FIR lowpass filter. However, if an IIR filter is used, the input samples are not preserved exactly because of the nonlinear phase response of the IIR filter. FIR interpolation filters can be designed such that the input samples are preserved, which also results in some computational savings in the implementation. For this reason, only the implementation of FIR interpolation will be considered further. The FIR lowpass filter required for interpolation can be designed using the simpler windowing techniques. In this section, a Kaiser window is used to design 2:1 and 3:1 interpolators. The FIR filter length must be odd so that the filter delay is an integer number of samples and the input samples can be preserved. The passband and stopband must be specified such that the center coefficient of the filter is unity (the filter gain will be P) and every Pth coefficient on each side of the filter center is zero. This insures that the original input samples are preserved, because the result of all the multiplies in the convolution is zero, except for the center filter coefficient, which gives the input sample. The other P - 1 output samples between each original input sample are created by convolutions with the other coefficients of the filter. The following passband and stopband specifications will be used to illustrate a P:1 interpolation filter:

    Passband frequencies:    0 - 0.8 fs/(2P)
    Stopband frequencies:    1.2 fs/(2P) - 0.5 fs
    Passband gain:           P
    Passband ripple:         < 0.03 dB
    Stopband attenuation:    > 56 dB

The filter length was determined to be 16P - 1 using Equation (4.2) (rounding to the nearest odd length) and the passband and stopband specifications. Greater stopband attenuation or a smaller transition band can be obtained with a longer filter. The interpolation filter coefficients are obtained by multiplying the Kaiser window coefficients by the ideal lowpass filter coefficients. The ideal lowpass coefficients for a very long odd-length filter with a cutoff frequency of fs/(2P) are given by the following sinc function:

    ck = sin(k*pi/P) / (k*pi/P).

Note that the original input samples are preserved, because the coefficients are zero for all k = nP, where n is an integer greater than zero, and c0 = 1. Very poor stopband attenuation would result if the above coefficients were simply truncated by using the 16P - 1 coefficients where |k| < 8P. However, by multiplying these coefficients by the appropriate Kaiser window, the stopband and passband specifications can be realized. The symmetrical Kaiser window, wk, is given by the following expression:

    wk = I0( beta*sqrt(1 - (k/8P)^2) ) / I0(beta),      |k| < 8P,

where I0 is a modified zero-order Bessel function of the first kind and beta is the Kaiser window parameter that determines the stopband attenuation. Figure 4.9 shows the frequency responses of the resulting 2:1 and 3:1 interpolation filters.

FIGURE 4.9 (a) Frequency response of the 31-point FIR 2:1 interpolation filter (gain = 2, or 6.02 dB). (b) Frequency response of the 47-point FIR 3:1 interpolation filter (gain = 3, or 9.54 dB).

4.3.3 Real-Time Sample Rate Conversion

Listing 4.9 shows the example interpolation program INTERP3.C, which can be used to interpolate a signal by a factor of 3. Two coefficient arrays are initialized to hold the decimated coefficients, each with 16 coefficients. Each of the coefficient sets is then used individually with the fir_filter function to create the interpolated values to be sent to sendout(). The original input signal is copied without filtering to the output every P samples (where P is 3). Thus, compared to direct filtering using the 47-point original fil-
FIR interpolation fiers ‘ean be designed such thatthe input samples are preserved, which also reslts in som ‘computational savings in the implementation. For this reason, only the implementation of FIR interpolation willbe considered further, The FIR lowpass filter required for intepo- Iation can be designed using the simpler windowing techniques, In this section, a Kai Sec.43 Sample Rate Conversion 165 window is used to design 2:1 and 3:1 interpolators. The FIR filter length must be odd so thatthe filter delay is an integer number of samples and the input samples can be pre- served, The passband and stopband must be specified such that the center coefficient of the filter is unity (the fikter gain will be P) and P coefficients on each side of the filter ‘center are zero, This insures thatthe original input samples are preserved, because the re- Salt of all the multiples in the convolution is zero, except for the center filter coefficient that gives the input sample. The other P — 1 output samples between each original input sample are created by convolutions with the other coefficients of the filter. The folowing. passband and stopband specifications will be used to ilustrate a P-1 interpolation filter: Passband frequencies: 0-0.8f,(2P) ‘Stopband frequencies: 1.2f,/(2P)-0.5 f, Passband gain: P Passband ripple: <0.03 4B Stopband attenuation: > 56 dB ‘The filter length was determined to be 16P ~ 1 using Equation (4.2) (rounding to the ‘nearest odd length) and the passband and stopband specifications. Greater stopband atten ‘uation or a smaller transition band can be obtained with a longer filter. The interpolation filter coefficients are obtained by multiplying the Kaiser window coefficients by the ideal lowpass filter coefficients. The ideal lowpass coefficients for a very long odd length filter with a cutoff frequency of f,/2P are given by the following sinc function: sink! P) a -7Se [Note thatthe original input samples are preserved, because the coefficients are zero for all k= nP, where n is an integer greater than zero and cp = 1. Very poor stopband attenua- tion would result if the above coefficients were truncated by using the I6P — 1 coeff cients where I< 8P. However, by multiplying these coefficients by the appropriate Kaiser window, the stopband and passband specifications can be realized. The symmetsi- cal Kaiser window, w,, is given by the following expression: 2 Alea fpl-GEY JB) TM nme tr ee ein oe et te Ht er) is fon es ts 2 ete ao, m= * an ‘Magnitade (4B) oe © Magritde (48) 8 FIGURE 4 (a! Frequoncy response of 31-point FIR 2:1 interpolation fer (gain = 2 or 6 48. (8) Frequency response of 47-pont FIR 3:1 interpolation ‘er (gain = 3 or 9.64 48). 166 FIR iter Frequency Response Frequency (Oe) posve Sec. 4:3 Sample Rate Conversion 167 43.3 Real-Time Sample Rate Conversion Listing 49 shows the example interpolation program INTERP3.C, which can be used to interpolate a signal by a factor of 3. Two coefficient arrays ae initialized to have the dec- mated coefficients each with 16 coefficients. Each of the coefficient sets are then used individually with the £4x_€11ter function to create the interpolated values to be sent to sendout (). The original input signal is copied without filtering tothe output every P sample (where P is 3). Thus, compared to direct filtering using the 47-point original fi. 
ter, 15 multiplies for each input sample are saved when interpolation is performed using INTERPS, Note thatthe rate of output must be exactly thre times the rate of input for this program to work in a real-time system, tinclude finelude finelude Bnelude “redspe.ht {INTERP3.C - PROGRAM TO DIQONSTRATE 3:1 FIR FILTER INDERPOLATLOH USES (WO INTERPOLATION FILTERS AND MIVTIPLEE CALLS 10 THE REAL TIME PILTER FUNCTION fir_fi2ter() main() fi a a ‘oat signal_in; interpolation coefficients for the decimated filters */ static float coef3i (16) ,coef32 (161; Distory arrays for the decimated filters */ static float hist31{15],.hist32[15]; 3:1 interpolation coefficients, PB 0-0.133, 58 0.2-0.5 +/ static float interp3{47] = { ~0.00178662, ~0.00275941, 0., 0.00556927, 0.00749928, -0.01268113, ~0.01606336, 0., 0102482278, 003041984, ~0.04de4s86, -0.05417098, 0., 0.07917613, 0.09644332, ~0.14927754, -0.19365910, 0., 0.40682136, 062363913, 0.82363913, 0.40682136, 0, -0.19365910, -0.14927754, 9.09684322, 0.07917613, 0., ~0.08417098, -0.04e84686, 803041984, 0.02482278, 0., -0.01606336, -0.01268113, LUSTING 49 Example INTERP3.C program for 3:1. FIR interpolation. (Continued 168 Chep. 4 0.00749928, 0.00556927, 0., ~0.00275941, -0.00178662 ” for(i =0 74 <16 ie) coet3 ti} sinverp3 (3441 for(i = 0 7 1 < 16; 144) coo#32(4} = interps(3*iedl: J make three samples for cach input */ For(s:) ‘signal_in = getinput () sendout(hist31(7}); /* delayed input */ sendout (fir_filter (signal_in, coef31, 16,hist31)) Rendout (fix fiver (eignal_in, coef32_16,hist32)); LUSTING 49 (Continued) Figure 4.10 shows the result of running the INTERP3.C program on the WAVE3 DAT data file contained on the disk (the sum of frequencies 0.01, 0.02 and 0.4). Figure 4,10(a) shows the original data. The result of the 3:1 interpolation ratio is shown in Figure 4.10(0), Note thatthe definition ofthe highest frequency in the original dataset (04 fis ‘much improved, because in Figure 4.10(b) there are 7.5 samples per cycle of the highest, frequency. The startup effects and the 23 sample delay of the 47-point interpolation fier is also easy to see in Figure 4.10(b) when compared to Figure 4.10(a). | FAST FILTERING ALGORITHMS ‘The FFT is an extremely useful tool for spectral analysis. However, another important ap plication for which FFTS are often used is fast convolution. The formulas for convolution ‘were given in chapter 1. Most often a relatively short sequence 20 to 200 points i length (for example, an FIR filter) must be convolved with a number of longer input sequences ‘The input sequence length might be 1,000 samples or greater and may be changing with ‘time as new data samples are taken, ‘One method for computation given this problem is straight implementation of the time domain convolution equation as discussed extensively in chapter 4. The number of real ‘multiplies required is M * (NV ~ M+ 1), where Nis the input signal size and M isthe length of the FIR filter to be convolved with the input signal. There isan altemative to this mther lengthy computation method—the convolution theorem. The convolution theorem states that time domain convolution is equivalent to multiplication inthe frequency domain The ‘convolution equation above can be rewritten inthe frequency domain as follows ¥()= HR) XO a8) ‘Because interpolation is also a filtering operation, fast interpolation can also be pet~ formed in the frequency domain using the FFT. 
The next section describes the implemen @ ©) ‘Sample Vane ‘Sample Value Program INTERP3.C Input (WAVE3.DAT) os 08} oa 1 od] pies 13030000 SOO SwOSCOO Sample Number FIGURE 410 (0) Example of INTERP3 for 3:1 interpolation. Original WAVES DAT. (b) 3:1 interpolated WAVES DAT output geen 70 Soc. 4.4 Fast Filtering Algorithms. ™m Time Ftering Chap ‘The program RFAST (soe Listing 4.10) illustrates the use of the ££ function for fast convolution (see Listing 4.11 for a C language implementation. Note that the in- verse FFT is performed by swapping the real and imaginary pars of the input and out. put of the ££ function. The overlap and save method is used to filter the on finwous real-time input and generate a continuous output from the 1024 point FFT. The convolution problem is filtering with the 35-tap low pass FIR filter as was used in sec. tion 422. The filter is defined in the FILTER H header file (variable £42 1p£35), ‘The RFAST program can be used to generate results similar to the result shown in Figure 4.700) {ation of realtime filters using FFT fast convolution methods, ad section 4.4.2 ‘real-time implementation of frequency domain interpolation, oe 4.4.1 Fast Convolution Using FFT Methods ation (8) indicates tt if he frequency domain representations of Mi) an. ae ‘own, then Mca be calculated by simple mutipicaon, The goes be obuined by inverse Fourier tansorm. This sequence of epee oo ‘() Create the array H(k) from the impulse response h(i) using the FFT. Od mes @) Create the aray X(k) from the sequence »(n) using the FFT. 3) Multiply 4 by X point by point thereby obtaining YW) (@) Apply the inverse FFT to ¥() in order to create yn). Hinclude finelude include Hinclude (7 Teverse ££t the multiplied sequences */ ‘ft (eam, (7 Write the result out to a dsp data file */ 7 because a forvard FFT was used for the inverse FFT, ustina: 10 (Continuoch Fast Flt Chap. 4 sec. 44 19 Algorithms. 173 ‘ne output is in the imag part */ for(i = PILTERLENGT 7 i < FPLLENGTH ; it) sendout (samp(i) imag) > y+ overlap the last FIVER LENGTI-1 input data points in the next FFT */ for(i = 0; \< FILTER LENGTH ; ist) ( fsanp(i] real = input_save(): senp[il imag = 0.0; , for( 7 $< FPRLENGIH-PIUTER LENGTH; e+) ( ‘sampli} real = getinput somplil imag = 0.0; > eave the last FILTER LENGTH points for next time */ for(j = 0; 3 < FILTER LENGTH ; j++, Lev) ( fimput_savel3] = sampii] real = getinput() samp{i] imag = 0.0; UsTING 4.10 (Continued ft - in-place radix 2 decimation in time FFT requires pointer ta complex array, x and power of 2 size of FET, m (size of FFT = 2"). Places FFT output on top of input COMPLEX arrey. void fe (COMPLEX ‘x, int m) void ft (COMPLEX *x,int m) static COMPLEX tw: static int mstore = 0; static int n= 1; /* used to atore the w complex array */ /* stores m for future reference */ (* length of {ft stored for future */ Compu u, vamp, em; COMPLE Hei, nxip, tx, twtr: int 4,4, 1,1e,windex: double arg,w_real,w_imag,weecur_real, wrecur_imag,wcenp_real: LUSTING 4.11, Radix 2 FFT function eee (xm). 
(Continued ™ Real-Time Fikering itm mstare) ( /* free previously allocated storage and set new m */ ifmstore t= 0) free(w); if(m == 0) return; /* A£ m0 then done */ Jt m= 20m = ££ length +/ nelcm de = 0/2; /* allocate the storage for w */ = (COMPLEX *) calloc(1e-1, sizeof (COMPLEX) itcw) exit); > /* calculate the w values recursively */ arg = 4.0%atan(2.0)/ weecur teal = wreal /* Pr/te calculation */ cos (arg) + ~sin(arg); xj->real = (float)wrecurresl; xS->imag = (float )wrecurinag, Xie; weemp_real = wrecur_real*w_real - wrecur_tmag*w_imag; wrecurinag = wrecur_real"«_inag + wrecur_inagr. resi; wrecur_real = weemp_real; le =n; windexe for (=O; Lem; 144) ¢ les (* first iteration with no multiplies */ forts xt Leds ane ¢ STING 4.11. (Continued Sec. 4.4 Fast Filtering Algorithms 175 tenp.real = xi->real + xip-sreal, emp. dimag = 3i->imag + xip->imag? xip->real = xi-szeal ~ xip-real; xip->inag = xi—>imag ~ xip->inag? *xi = temp; d /* venaining iterations use stored w */ wpte = w + windex = 2: for = 1; 5 < le: je ¢ ws supte: for Ge js hens ieie ane ¢ xiaked xip = xi + Ie; emp.real = xd->real + xip->real; temp. nag = xi->imag + xip->inag tm.real = xi->real ~ xip->real: tm. imag = xi->imag ~ xip->inags xip->real © tm.real*u.real ~ tm.imagru,inag: xip->img = tm.real*a,inag + tm.imagru.real! oxi = temp) , wotr = wper + windex: ) windex = 24windex; ) /* rearrange data by bit reversing */ i=; for (i= 17 4 < (md); ine ¢ k= m2; whitetk <= 3) ( jei-k: kee k/2; ) Sa jek: ie Ges) ¢ xbox + $5 =x eG temp = td; oxi = xls ex = temp; LUSTING 4.11. (Continued 16 Real-Time Fikering Chap, 4 4.42 Interpolation Using the FFT In section 43.2 time domain interpolation was discussed and demonstrated using sever) short FIR filters, In this section, the same proces is demonstrated using FFT techniques ‘The steps involved in 2:1 interpolation using the FFT are as follows: (1) Perform an FFT with a power of 2 length (N) which i grater than or qual tthe Jength ofthe input sequence. (@) Zero pad the frequency domain representation of the signal (a complex ara) by inserting 1 ~ | zeros between the postive and negative half ofthe spectrum, Te Nyguist frequency sample output of the FFT (atthe index N/2) i divided by ? ang placed with the positive and negative parts ofthe spectrum, this results ina sym. metrical spectrum fora real input signal (@) Perfonm an inverse FFT with a length of 2N. (@ Multiply the interpolated result by a factor of 2 and copy the desired portion ofthe result that represent the interpolated input, this is all the inverse FFT samples if the input length was a power of 2. Listing 4.12 shows the program INTFFT2.C that performs 2:1 interpolation using the above procedure and the ££t function (shown in Listing 4.11). Note thatthe inverse FFT is performed by swapping the real and imaginary pars ofthe input and output ofthe ‘ft function, Figure 4.11 shows the result of using the INTFFT2 program on the 128 ‘samples of the WAVE3.DAT input file used in the previous examples in this chapter (hese 256 samples are shown in detail in Figure 4.10(a)). Note thatthe output length is. twice as large (512) and more ofthe sine wave nature of the waveform can be seen in the interpolated result. The INTFFT2 program can be modified to interpolate by a larger power of 2 by increasing the number of zeros added in step (2) listed above. Also, be- cause the FFT is employed, frequencies as high as the Nyquist rate can be accurately in- terpolated. 
FIR filter interpolation has a upper frequency limit because of the frequency response of the filter (see section 4.3.1) Hnclude Hinelude include O then the output will decay toward 2ero and the peak will occur at tan“ovd) (4.10) The peak value wl be pnt) = any Np = ‘A second-order difference can be used to generate a response that is an approximation of this continuous time output, The equation for a second-order discrete time oscillator is ‘based on an IIR filter and is as follows: Da Yaa + Dkye (4.12) ‘where the x input is only present for r= 0 as an intial condition to start the oscillator and Yana 6, =22"* costar) 20 ‘where ts the sampling peviod (If) and is 2x times the oscillator frequency. ‘The frequency and rate of change of the envelope of the oscillator output can be changed by modifying the values of d and « on a sample by sample basis. This is illus. twated in the OSC program shown in Listing 4.13, The output waveform grows from a ‘Peak value of 1.0 to a peak value of 16000 at sample number 5000. After sample 5000 the envelope ofthe output decays toward zero and the frequency is reduced in steps every 1000 samples. A short example output waveform is shown in Figure 4.12. 45.2 Table Generated Waveforms Listing 4.14 shows the program WAVETAB.C, which generates a fundamental fre- ‘quency ata particular musical note given by the variable key. The frequency in Hertz is ‘elated tothe integer key as follows: f= aqoez!on a1) Thus, a key value of zero will give 440 Hz, which is the musical note A above mid- de C, The WAVETAB.C program stats ata key value of -24 ((wo octaves below A) and Steps through a chromatic scale to key value 48 (4 octaves above A). Each sample output ‘value is calculated using a linear interpolation of the 300 values in the table gwave. The 300 sample values are shown in Figure 4.13 as an example waveform. The gwave array is 301 clements to make the interpolation more efficient. The first element (0) and the last ele ‘ment (300) are the same, creating 2 circular interpolated waveform. Any waveform can be Substituted to create different sounds. The amplitude ofthe output is controlled by the env Variable, and grows and decays at arate determined by tare and amp arrays. Finclude ‘nelude amen. 2 Hinelude *rtdspe.ht Float ose(float, float, int); Hloat rate, fr Heat amp + 16000; ‘void sain() C ong int ilength = 100000; J calculate the rate required to get to desired amp in 5000 somples +/ Fate = (float) exp (Logtamp) /(5000.0)); J start at 4000 He */ ‘freq = 4000.0; J first call to start up oscillator */ sendout (ose (freq, rate,-1)) J special case for first 5000 samples to increase amplitude */ for(i = 0; 1 < $000 7 iss) ‘endout (ose (freq, rate, 0)); J decay the ose 10% every 5000 samples */ rate = (£loat)exp (1030-9) /(5000.0)) for( ; 4 < length ; 4£((442000)| freq = 0.98*freqz sendout (250 (fred, rate, 1)); (J change freq every 1000 samples */ , else ( (/* pomal case */ ssendout (ose (freq, rate, 0)); ) d Flush (): ) /* Runction to generate samples from a second order oscillator rate = envelope rate of change paraneter (close to 1) cchange_flag = indicates that frequency and/or rate have changed ” float ose(float freq, float rate, int change_fag) f /* calculate this asa static eo it never happens again */ Static float tw_pi_div sample rate = (float) (2.0 * PI / SAMPLE_RATE) static float yl.y0,a,b,arg: Float out ,wose; LUSTING 4.13. Program OSC to generate 8 sine wave signal witha variable frequency and orvelope using a second order IR section. 
(Continued) /* change_flag: “I= start new sequence fran t=0 0 = no change, generate next sample in sequence 1 = change rate or frequency after start ” it(change_flag = 0) ( J assume rate and freq change every tine */ wosc = freq * two_pidiv sample rate: 0 * cos (wase) 7 At(change_flag < 0) ( /* re-start case, set state variables */ yO = 0.087 Fetum(yl = ratetsin(wose)); > ) /* wake new sample */ LUSTING 4:13. (Continued) Program OSC-C Outpt 0: & | i 300 00015003000 30003500 Sample Number ‘Sample Vale o: FIGURE 4.12 Example signal output from the OSC.C program (modified to ‘each peak amplitude in 600 samples and change frequency every 500 sam- Ble for display purposes Sec.4.5 Oscillators and Wave Form Synthesis. 183 we Hinctude AEG ee chronkated]) rate = ravee(+sell: Hnctude cnach-b> . oer ee finelude “reaape.b* (dete Leela spe value tron table */ include “awave:h + gwavet301) array + = neiphnae, finctuce ’ (son) a frac = phase ~ (float)k; + wavetable Misic Generator 4-20-94 A « sample = awave lk); , apie ts : nis delta = gwave(k+l] - sample; /* possible wave_size+] access */ Semple s=frectdeltay sine keys /* calculate eutpit snd send to DAC */ void main siguout = envveanple: t “ ° sendout (sig_out} ; sine t,toldvet /* eateulate next phase vale */ oldscl, pt hen float ampold, rate, env, wave_size,dec, phase, frac, delta, sample; ee aera ,_S#tPhace >= wave_size) phase -= wave_size; aoe , static float trel[5) ={ 0.02, 0.14, 0.6, 1.0, 0.0 3; Htee(hy static float amps(5} = { 1000.0 , 10000.0, 4000.0, 10.0, 0.0 }; ) ete oe ogee a en matteo" gmt stetam Seo NE af / | ee 7 a (eben te ed ne ne | a Sle °. 3 o2}/ ~ empold = 1.0; J* always starts at unity */ ary este ; t = trel[i}*endi; & 02 Sy eee nla 7 | See phase = 0.0; oa] eee is se SS, for(i = 0; i < andi; ies) ¢ /* calculate envelope aeplitude */ FIGURE 4.13 Exemple waveform (owre(3011 array) used by program LISTING 4.14. Program WAVETAB to generate periodic waveform at any WAVETAB. frequency. (Continuech oy Real-Time Fikering Chap 4 4.6 REFERENCES: ‘ANTONIOU, A. (1979). Digital Fiters: Analysis and Design. New York: MeGraw Hil [BRIGHAM, E.O. (1988). The Far Fourier Transform and Its Applications. Englewood Cit, ny Prentice Hall Bu CCroctene, RIE and RABINER, LR, (1983), Multrate Digital Signal Processing. Eaglewoog (Cliffs, Ni: Prentice Hal, FLiorr, DEF. (Ed). (1987), Handbook of Digital Signal Processing. San Diego, CA: Academic Press, wanes, P. and KiwsLe B. (1991). C Language Algorithms for Digital Signal Processing Englewood Cliffs, NJ Prentice Hall Guaus, MS. and LAKER, KR. (1981. Modem Filter Design: Active RC and Swiched Capacitor. Englewood Cifs, NI: Prentice Hall GEE DIGITAL SIGNAL PROCESSING COMMITTEE (EA). (1979). Programe for Digital Signal Processing. New York: IEEE Press JoHNsON, D.E, JOHNSON, LR. and MOORE, HLP. (1980). 4 Handbook of Active Filter. Englewood (Cliffs, NI: Pretie Hal JONG, MIT. (1992). Methods of Discrete Signal and System Analysis. New York: McGraw-Hill KAS JF and SCHAPER, R. W. (Feb. 1980). On the Use of the /-Sinh Window for Specram "Analysis, IEEE Transactions on Acoustics, Speech, and Signal Procesing, (ASSP-28)(). 105-107 KNUTH, DE. (1981) Seminumerical Algorithms, The Art of Computer Programming, Vol. 2 2nd ed). Reading, MA: Addison-Wesley. (MCCLELLAN, 3, PARKS, T. and RABINER, L.R. (1973). A Computer Program for Designing ‘Optimum FIR Linear Phase Digital Filters. IEEE Transactions on Audio and Electro-acoustic, AU-21. (6, 506-526, MoLER, C:, LITLE, J. and BANGERT, S. (1987). PC-MATLAB User's Guide. 
Natick, MA: The MathWorks.

MOSCHYTZ, G.S. and HORN, P. (1981). Active Filter Design Handbook. New York: John Wiley & Sons.

OPPENHEIM, A. and SCHAFER, R. (1975). Digital Signal Processing. Englewood Cliffs, NJ: Prentice Hall.

OPPENHEIM, A. and SCHAFER, R. (1989). Discrete-Time Signal Processing. Englewood Cliffs, NJ: Prentice Hall.

PAPOULIS, A. (1984). Probability, Random Variables and Stochastic Processes (2nd ed.). New York: McGraw-Hill.

PARK, S.K. and MILLER, K.W. (Oct. 1988). Random Number Generators: Good Ones Are Hard to Find. Communications of the ACM, (31)(10).

PARKS, T.W. and BURRUS, C.S. (1987). Digital Filter Design. New York: John Wiley & Sons.

PRESS, W.H., FLANNERY, B.P., TEUKOLSKY, S.A. and VETTERLING, W.T. (1987). Numerical Recipes. New York: Cambridge University Press.

RABINER, L. and GOLD, B. (1975). Theory and Application of Digital Signal Processing. Englewood Cliffs, NJ: Prentice Hall.

STEARNS, S. and DAVID, R. (1988). Signal Processing Algorithms. Englewood Cliffs, NJ: Prentice Hall.

VAN VALKENBURG, M.E. (1982). Analog Filter Design. New York: Holt, Rinehart and Winston.

ZVEREV, A.I. (1967). Handbook of Filter Synthesis. New York: John Wiley & Sons.

CHAPTER 5

REAL-TIME DSP APPLICATIONS

This chapter combines the DSP principles described in the previous chapters with the specifications of real-time systems designed to solve real-world problems, and provides complete software solutions for several DSP applications. Applications of FFT spectrum analysis are described in section 5.1. Speech and music processing are considered in sections 5.3 and 5.4. Adaptive signal processing methods are illustrated in section 5.2 (parametric signal modeling) and section 5.5 (adaptive frequency tracking).

5.1 FFT POWER SPECTRUM ESTIMATION

Signals found in most practical DSP systems do not have a constant power spectrum. The spectrum of radar signals, communication signals, and voice waveforms changes continually with time. This means that the FFT of a single set of samples is of very limited use. More often a series of spectra are required at time intervals determined by the type of signal and the information to be extracted.

Power spectral estimation using FFTs provides these power spectrum snapshots (called periodograms). The average of a series of periodograms of the signal is used as the estimate of the spectrum of the signal at a particular time. The parameters of the average periodogram spectral estimate are:

(1) Sample rate: Determines the maximum frequency to be estimated
(2) Length of FFT: Determines the resolution (smallest frequency difference detectable)
(3) Window: Determines the amount of spectral leakage and affects resolution and noise floor
(4) Amount of overlap between successive spectra: Determines the accuracy of the estimate and directly affects computation time
(5) Number of spectra averaged: Determines the maximum rate of change of the detectable spectra and directly affects the noise floor of the estimate
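The first, second, fourth, and fifth of these parameters fix the time and frequency resolution of the estimate through some simple bookkeeping; the sketch below works through the arithmetic for the 8 kHz speech sampling rate used in this chapter. The FFT length, overlap, and number of averages chosen here are illustrative values only, and the window choice, which affects leakage rather than this bookkeeping, does not appear.

    #include <stdio.h>

    /* Relates the averaged-periodogram parameters listed above for an
       8 kHz sample rate.  The values of fft_len, step, and navg are
       illustrative, not fixed by the text. */
    int main(void)
    {
        double fs      = 8000.0;    /* (1) sample rate in Hz                  */
        int    fft_len = 128;       /* (2) FFT length                         */
        int    step    = 64;        /* (4) new samples per FFT (50% overlap)  */
        int    navg    = 16;        /* (5) periodograms averaged per estimate */

        double df   = fs / fft_len;                   /* frequency resolution */
        double span = (navg - 1) * step + fft_len;    /* samples per estimate */

        printf("frequency resolution : %.1f Hz\n", df);
        printf("one FFT frame        : %.1f ms\n", 1000.0 * fft_len / fs);
        printf("one averaged estimate covers %.0f samples (%.1f ms)\n",
               span, 1000.0 * span / fs);
        return 0;
    }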
5.1.1 Speech Spectrum Analysis

One of the common application areas for power spectral estimation is speech processing. The power spectra of a voice signal give essential clues to the sound being made by the speaker. Almost all of the information in voice signals is contained in frequencies below 3,500 Hz. A common voice sampling frequency that gives some margin above the Nyquist rate is 8,000 Hz. The spectrum of a typical voice signal changes significantly every 10 msec, or 80 samples at 8,000 Hz. As a result, popular FFT sizes for speech processing are 64 and 128 points.

Included on the MS-DOS disk with this book is a file called CHKL.TXT. This is the recorded voice of the author saying the words "chicken little." These sounds were chosen because of the range of interesting spectra that they produce. By looking at a plot of the CHKL.TXT samples (see Figure 5.1), the break between words can be seen and the relative volume can be inferred from the envelope of the waveform. The frequency content is more difficult to determine from this plot.

FIGURE 5.1 Original CHKL.TXT data file consisting of the author's words sampled at 8 kHz (5000 samples are shown).

The program RTPSE (see Listing 5.1) accepts continuous input samples (using getinput()) and generates a continuous set of spectral estimates. The power spectral estimation parameters, such as FFT length, overlap, and number of spectra averaged, are set by the program to default values. The amount of overlap and averaging can be changed in real-time. RTPSE produces an output consisting of a spectral estimate every 4 input samples. Each power spectral estimate is the average spectrum of the input file over the past 128 samples (16 FFT outputs are averaged together).

Figure 5.2 shows a contour plot of the resulting spectra plotted as a frequency versus time plot with the amplitude of the spectrum indicated by the contours. The high re

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include "rtdspc.h"

/**************************************************************************

RADPROC.C - Real-Time Radar processing

This program subtracts successive complex echo signals to
remove stationary targets from a radar signal and then does
power spectral estimation on the resulting samples.  The mean
frequency is then estimated by finding the peak of the spectrum.

Requires complex input (stored real, imag) with 12 consecutive
samples representing 12 range locations from each echo.

**************************************************************************/

/* FFT length must be a power of 2 */
#define FFT_LENGTH 16
#define M 4                     /* must be log2(FFT_LENGTH) */
#define ECHO_SIZE 12

void main()
{
    int            i,j,k;
    float          tempflt,rin,iin,p1,p2;
    static float   mag[FFT_LENGTH];
    static COMPLEX echos[ECHO_SIZE][FFT_LENGTH];
    static COMPLEX last_echo[ECHO_SIZE];
The sampling rates and target speeds are not important tothe illustration ofthe pro- ‘gram. The output of the program is the peak frequency location from the 16-point FFT in bins (Oto 8 are positive frequencies and 9 to 15 are ~7 to -1 negative frequency bins). A simple (and efficient) parabolic interpolation is used to give a fractional output inthe re- sults. The output from the RADPROC program using the RADAR.DAT as input is 24 ‘consecutive numbers with a mean value of 11 and a small standard deviation duc to the added noise. The first 12 numbers are from the first set of 16 echoes and the last 12 num- bers are from the remaining echoes. 52 PARAMETRIC SPECTRAL ESTIMATION “The parametric approach to spectral estimation attempts to describe a signal as a result from a simple system model with a random process as input. The result ofthe estimator is 1 small number of parameters that completely characterize the system model, If the ‘model is a good choice, then the spectrum of the signal model and the spectrum from ‘other spectral estimators should be similar. The most common parametric spectral estima- tion models are based on AR, MA, or ARMA random process models as discussed in section 1.6.6 of chapter 1. Two simple applications of these models are presented in the next two sections, 5.2.1 ARMA Modeling of Signals. Figure 5.3 shows the block diagram of a system modeling problem that will be used to Iustrate the adaptive IIR LMS algorithm discussed in detail in section 1.7.2 of chapter 1 isting 5.3 shows the main program ARMA.C, which first filers white noise (generated using the Gaussian noise generator described in section 4.2.1 of chapter 4) using a second- ‘order IR filter, and then uses the LMS algorithm to adaptively determine the filter function. Listing 5.4 shows the function 44x biquad, which is used to filter the white noise, and Listing 5.5 shows the adaptive flier function, which implements the LMS algo- rithm in a way compatible with real-time input. Although ths is a simple ideal example ‘where exact convergence can be obtained, this type of adaptive system can also be used to model more complicated systems, such as communication channels or contol systems. ‘The white noise generator can be considered a training sequence which is known to the algorithm; the algorithm must determine the transfer function of the system. Figure 5.4 shows the error function forthe first 7000 samples of the adaptive process. The error re- duces relatively slowly due to the poles and zeros that must be determined. FIR LMS al- gorithms generally converge much faster when the system can be modeled as a MA sys- tem (see section 5.5.2 for an FIR LMS example). 
Figure 5.5 shows the path of the pole coefficients (b0, b1) as they adapt to the final result, where b0 = 0.748 and b1 = -0.272.

/*
    2 poles (2 b coefs) and 2 zeros (3 a coefs) adaptive iir biquad filter
*/

float iir_adapt_filter(float input, float d, float *a, float *b)
{
    int i;
    static float out_hist1,out_hist2;
    static float beta[2],beta_h1[2],beta_h2[2];
    static float alpha[3],alpha_h1[3],alpha_h2[3];
    static float in_hist[3];
    float output,e;

    output  = out_hist1 * b[0];
    output += out_hist2 * b[1];             /* poles */

    in_hist[0] = input;
    for(i = 0 ; i < 3 ; i++)
        output += in_hist[i] * a[i];        /* zeros */

/* calculate alpha and beta update coefficients */
    for(i = 0 ; i < 3 ; i++)
        alpha[i] = in_hist[i] + b[0]*alpha_h1[i] + b[1]*alpha_h2[i];

    beta[0] = out_hist1 + b[0]*beta_h1[0] + b[1]*beta_h2[0];
    beta[1] = out_hist2 + b[0]*beta_h1[1] + b[1]*beta_h2[1];

/* error calculation */
    e = d - output;

/* update coefficients */
    a[0] += e*0.2*alpha[0];
    a[1] += e*0.2*alpha[1];
    a[2] += e*0.06*alpha[2];

    b[0] += e*0.04*beta[0];
    b[1] += e*0.02*beta[1];

/* update history for alpha */
    for(i = 0 ; i < 3 ; i++) {
        alpha_h2[i] = alpha_h1[i];
        alpha_h1[i] = alpha[i];
    }

/* update history for beta */
    for(i = 0 ; i < 2 ; i++) {
        beta_h2[i] = beta_h1[i];
        beta_h1[i] = beta[i];
    }

/* update input/output history */
    out_hist2 = out_hist1;
    out_hist1 = output;

    in_hist[2] = in_hist[1];
    in_hist[1] = input;

    return(output);
}

LISTING 5.5 Function iir_adapt_filter(input,d,a,b), which implements an LMS adaptive second-order IIR filter (contained in ARMA.C).

FIGURE 5.4 Error signal during the IIR adaptive process, illustrated by the program ARMA.

FIGURE 5.5 Pole coefficients (b0, b1) during the IIR adaptive process, illustrated by the program ARMA.

5.2.2 AR Frequency Estimation

The frequency of a signal can be estimated in a variety of ways using spectral analysis methods (one of which is the FFT illustrated in section 5.1.2). Another parametric approach is based on modeling the signal as resulting from an AR process with a single complex pole. The angle of the pole resulting from the model is directly related to the mean frequency estimate. This model approach can easily be biased by noise or other signals, but it provides a highly efficient real-time method to obtain mean frequency information.

The first step in the AR frequency estimation process is to convert the real signal input to a complex signal. This is not required when the signal is already complex, as is the case for a radar signal. Real-to-complex conversion can be done relatively simply by using a Hilbert transform FIR filter. The output of the Hilbert transform filter gives the imaginary part of the complex signal, and the input signal is the real part of the complex signal. Listing 5.6 shows the program ARFREQ.C, which implements a 35-point Hilbert transform and the AR frequency estimation process. The AR frequency estimate determines the average frequency from the average phase differences between consecutive complex samples.
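One straightforward way to carry out the real-to-complex conversion just described is to run the input through the Hilbert transform FIR filter for the imaginary part while passing the real part through a plain delay equal to the filter's group delay. The sketch below illustrates the idea using the book's fir_filter routine; the coefficient array hilbert35 and the function name make_analytic are placeholders for this illustration (they are not taken from ARFREQ.C), and the Hilbert coefficients themselves are assumed to come from a separate design step.

    #define HILBERT_LEN 35                   /* filter length used in the text */
    #define HILBERT_DELAY (HILBERT_LEN/2)    /* group delay of 17 samples      */

    /* the book's real-time FIR filter routine */
    extern float fir_filter(float input, float *coef, int n, float *history);

    /* odd-length, antisymmetric Hilbert transform coefficients (assumed to be
       designed elsewhere; not reproduced here) */
    extern float hilbert35[HILBERT_LEN];

    /* Form one analytic (complex) sample per real input sample: the FIR
       Hilbert transformer supplies the imaginary part, and a delay line
       re-aligns the real part with the filter's group delay. */
    void make_analytic(float in, float *re_out, float *im_out)
    {
        static float hist[HILBERT_LEN - 1];        /* fir_filter history  */
        static float real_delay[HILBERT_DELAY];    /* delay for real part */
        int i;

        *im_out = fir_filter(in, hilbert35, HILBERT_LEN, hist);

        *re_out = real_delay[HILBERT_DELAY - 1];
        for(i = HILBERT_DELAY - 1 ; i > 0 ; i--)   /* shift the delay line */
            real_delay[i] = real_delay[i - 1];
        real_delay[0] = in;
    }

The consecutive (re_out, im_out) pairs produced this way are the complex samples whose phase differences drive the frequency estimate described next.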
The are tangents used to determine the phase angle ofthe complex {ult Because the calculation ofthe arc tangent is relatively slow, several simpliaions ©, = rel) arate, ‘The average frequency estimate is then Se, a ee ee Sie enn es etna flees bdo apd oe wlen is the window length (winien in ‘Program ARFREQ) and controls the number of ths ae mr ea ar RE on an am wen i Toc esa ey ome ARFRED pe ce cen 2n 3 SPEECH PROCESSING Communication channels never seem to have enough bandwidth to camry the desired Speech signals from one location to another fora reasonable cost. Speech compression at- tempts to improve this situation by sending the speech signal with as few bits per second 4s possible. The same channel can now be used to send a larger number of specch signals at & lower cost. Speech compression techniques can also be used to reduce the amount of memory needed to store digitized speech. Sec.5:3 Speech Processing 201 Frequency Estimates from ARFREQ.C 035; ‘ ‘Eximate Value (ts) 8 100120140160 Estimate Nomber FIGURE 5. Frequency stir CCHKL.DAT epeech data as put ates from program ARFREO.C, using the 5.3.1 Speech Compression ‘The simplest way to reduce the bandwidth required to transmit speech is to simply reduce the number bits per sample that are sent. I ths is done in a linear fashion, then the qual. ity of the speech (in terms of signal-to-noise rato) will degrade rapidly when less than 8 bits per sample are used. Speech signals require 13 or 14 bits with linear quantization in ‘order to produce a digital representation of the full range of speech signals encountered in telephone applications. The Intemational Telegraph and Telephone Consultative Committee (CCITT, 1988) recommendation G.71 specifies the basic pulse code modulation (PCM) algorithm, which uses a logarithmic compression curve called p-law. law (see section 1.5.1 in chapter 1) is a piecewise linear approximation of a logarithmic tansfer curve consisting of 8 linear segments. It compresses a 14-bit linear speech sample down to 8 bits, The sampling rate is 8000 Hz of the coded output, A compression ratio of 1.75:1 is achieved by this method without much computational complexity. Speech quality is not degraded significantly, but music and other audio signals would be degraded. Listing 5.7 shows the program MULAW.C, which encodes and decades a speech signal using j-law compression. The encode and decode algorithms that use tables to implement the com. 202 ime DSP Applications Chap, 5 velude yelude *redspe.h" relude *m.h* “AW.C ~ PROGRAM 70 DEMONSTRATE MU LAW SPEECH COMPRESSION “7 ina ine 4,37 for(i) 4H = (int) getinput 0: ‘encode 14 Bit Linear input to mi-Law */ 5 = abel: Af) > Oxteee) 3 = onttes 3 = Smumutad(3/21; ifs < 0) 5 [> oxa0; decode the @ bit miclaw and send out */ ‘endout ( (Eloat)mutabls)) ; ) LUSTING 5:7 Program MULAW.C, which encodes and decodes a speech signal using law compression. pression are also shown inthis li ‘include fle MU.H. tng. Because the tables are rather long, they are inthe 5.3.2 ADPCM (G.722) ‘The CCITT recommendation G.72 is a standard for digital encoding of speech and audio signals in the frequency range from 50 Hz. to 7000 Hz. The G.722 algorithm uses sub- bband adaptive differential pulse code modulation (ADPCM) to compress 14-bit, 16 KHz samples for transmission or storage at 64 Kbits/sec (a compression ratio of 35). Because the G.722 method is a wideband standard, high-quality telephone network appli- cations as well as music applications are possible. 
Ifthe sampling rate is increased, the same algoridun can be used for good quality music compression. ‘The G.722 program is organized as a set of functions to optimize memory usge ‘and make it easy to follow. This program structure is especially efficient for G.722. since ‘most ofthe functions are shared between the higher and lower sub-bands. Many of the $00.53 Speech Processing 203 functions are also shared by both the encoder and decoder of both sub-bands. All of the functions are performed using fixed-point arithmetic, because this is specified in the CCITT recommendation. A floating-point version of the G.722 C code is included on the enclosed disk. The floating-point version runs faster on the DSP32C processor, which thas limited support for he shift operator used extensively inthe fixed-point implementa- tion. Listing 5.8 shows the main program G722MAIN.C, which demonstrates the algo- rithm by encoding and decoding the stored speech signal “chicken litle,” and then oper- ‘ates on the real-time speech signal from getamput (). The output decoded signal played using sendowt () with an effective sample rate of 16 kHz (one sample is inter- ‘plated using simple linear interpolation giving an actual sample rate for this example of Hinclude Hinclude "rtdspe-h* J Main program tor g722 encode and decode dano for 210K0 */ extern int encode int, int): extern void decode (int) ; ‘extern void reset (); /* outputs of the decode function */ ‘exter int woutl,xout2; nt ehkd_coded{6000]; exer int pm chk; void main() « int 4,4,01.02 float xt © 0.0; float xf2 = 0.0; (/* reset, initialize required manory */ reset (); /* code the speech, interpolate because St was recorded at 6000 Hz */ for(i = 07 4 < 60007 itt) ( atch LT 32 (chk [Aroha (1431) ‘chil_coded {i }=encode(t3, 21; d (J interpolate output to 32 iz */ for(i = 0; i < 6000 ; Sst) ¢ LUSTING 58 The main program (G722MAIN.C), which demonstrates the "ADPCM algorithm in roatsima. (Continued) Aecode ehk_codedti}) ; xl = (float) outa; ‘Bendout (0.5°%£2+0.5¢xf1) Bendout (x£1) x2 = (float) xout2, ssendout (0.5°x£240,5¢xE1) ssendout (xf2); ) Jy similate a 16 iz sampling rate (actual 4s 32 Kis) +/ /* note: the 9722 standard calls for 16 Klis for voice operation + while(a) ( 1=0.54 (get input () sget input (0); "2e0.5* (getinpat () sgetinput ()) Srencode (ei, £2); Aocode (5); xE1 = (Eleat)xoutt; ‘sendout (0.5* (xftext2)) ; ‘Sendout (x61) x£2 = (float) xout2; ‘Bendout (0.5* (e€20x62) ); ‘Sendout (x62) USTING 5.8 (Continued 32 kiiz), Listing 5.9 shows the encode function, and Listing 5.10 shows the decode function; both are contained in G.722.C. Listing 5.11 shows the functions teen, f4itep, quanti, invaxl, logscl, scalel, upzero, uppol2, uppolt, in. Ah amd Jogsch, which ar used by the encode and decode Functions Ta Listings 59, 5:10, and 5.11, the global variable definitions and data tables have been omited fy (text continues on page 215) 722 encode function two ints in, one int out */ sneode(int xin, int xin2) int “hyper; int *tanf_per, *tqnf_ptri; long int xa, xb; STINGS: Function encode xint,xtn2) (contained in G.722.).(Continuech ie a a a n a Sec.5.3 Speech Processing int decis; int sh; (7 his comes from adaptive predictor */ int any st i, ins dint zh, sph,ph yh: int sel, spl,sl,el; encode: put input samples in xinl = first value, xin? 
= second value */ returns il and th stored together */ transmit quadrature mirror filters implenented here */ hoptr = hh (long) (*tqne_pte++) + (Ah ptres); (ong) (*eqnepert+) + (¢hoptres) main multiply accumlate loop for samples and coofficients */ forli= 0; i< 10; ie) ( (ong) (“tant peers) + (sh _peres); (ong) (*eqnepers+) * (*Roptres) Final milt/accumilate */ xa += (long) (rtm ptree) + (eh persed; (long) (*tamf_per) * (*Rper=+); update delay line tqnf */ tanf_ptrl = tant_ptr - 2, for(i = 0; i < 22; ive) segue pt = tgmt_ptrt-; Segnt_ptr~ = xint; Stqmfper = sind; x1 = (xa + 3b) >> 15; ah = (a= Hb) 55 25; fend of quadrature mirror filter code */ into regular encoder segment here */ starting with lower sub band encoder */ files ~ compute predictor output section - zero section */ stl = £iltez (delay ppl. delay _ditx) ; filtep - compute predictor output signal (pole section) */ spl = Filtep(ritt,al1,r1t2,a12); USTINGS9 (Continuec) 206 Real-Time DSP Applications Chap. § — compute the predictor output value in the lower sub_band encoder */ sl = s2l + spl: | el =x - el; | | want: quantize the difference signal */ | AL © quanti (el, det); upd: does both invgal and invgbl- computes quantized difference signal +/ for invgpl, truncate by 2 lebs, so mode = 3 + rvgal case with mode = 3 */ alt = ((long)detl*qyt_codeé_table(il >> 21) >> 15: ogacl: updates logarithmic quant. scale factor in low sub band / bl = Logse] (L2,mbl); lcalel: compute the quantizer ecale factor in the lower sub band*/ falling paraneters mbl and 8 (constant such that scalel can be scaleh) */ det] = scalel (xb, 8); varrec - simple addition to compute recontructed signal for adaptive pred +/ plt = dit ¢ sl; \pzero: update zero section predictor coefficients (sixth order)*/ falling parameters: dit, diti(eire pointer for delaying */ et, diez, ..., alt6 from alt */ wpli (Linear buffer in which all six values are delayed */ ‘eturn parans: updated bpli, delayed dltx */ upzero (Alt, delay. Alte, delay_Ppl) : ppol2- update second predictor coefficient apl2 and delay it as al2 */ alling parameters: all, al2, pit, pltl, plt2 */ 12 = uppol2 (all, al2,plt,plet,plt2); poll supdate first predictor coefficient spli and delay it as ali */ alling parameters: all, apl2, plt, plti * all = wppoll (ail, al2,pit,plen); fecons : compute recontructed signal for adaptive predictor */ rit = als alt; lone with lower subband encoder; now inplenent delays for next tinet/ LUSTING 5.9 (Continuach See. 6.3 207 ‘Speech Processing rita = leas ri = rit; plt2 = pltl: piel = pit: /* high band encode */ sth = #ltez (delay_bph, delay. die) ; foph = filtep(rht, abt, rh2, aha); /* predic: sh = sph + s2h 4/ ‘sh = gph + szh: (+ subtra: eh = xh ~ oh */ eh = xh ~ sh; (+ quanth ~ quantization of difference signal for higher sub-band */ + quanth: in-place for speed parans: eh, deth (has init. 
value) */ /* return: ih */ ig(en >= 0) ( Sh= 3: /* 2,3 are pos codes */ > else { She; /* 0,1 are neg codes */ ) deci = (S64L* (Long) doth) >> 12b; Af(abe(eh) > decis) ite; /* mih = 2 case */ + inugah: in-place compute the quintized difference signal in the higher sub-band*/ ah = ((long)deth*qq?_code2_table(ih}) >> 151 7 (J logech: update logarithmic quantizer scale factor in hi sub-band*/ ih ogc (ih, mb): /* note : acalel and scalch use same code, different parameters */ ‘doth = scalel (nibh, 10); /* parrec ~ add pole predictor output to quantized diff, signal (in place)*/ ph = dh + sah; /+ upuero: update zero section predictor coefficients (sixth order) */ /* calling parameters: dh, dhi(cire), bohi (circ) */ y* return params; updated Epi, delayed dhx */ ‘upzero (dh, delay_dhx, delay_bph) LUSTING 5:9. (Continuech eal ReaLTime SP Aplications Chaps | S0c.63 Speech Processing 209 (pol2: update second predictor coef aph? and delay as ah2 */ | dec_s2] = filtez (dec_del_bpl,dec_del_ditx); alling params: ahl, ah2, ph, phi, ph2 */ ama ~ > cur porene’ aphid */ (7 fiatep: compute predictor output stgnal for pole section +/ 2 = wppol2 ah ah2 php pha); dec_ppl = £4186 (dec_r1 deal dee142, 032) poll: update first predictor coef. ephd and delay it as ah +/ @ec_sl = dec spl + dec_szl; ‘ah = uppolt (ant, ah2,ph,pht) ; /* Anwrpel compute quantized difference signal for adaptive predic in low sb */ dec_dlt = (long) dec doti*ayt_coded_tablelilr >> 21) >> 15) ‘scons for higher subband +/ cae bos yh sh+ ayy /* Awad: compute quantized difference signal for decoder output in low ab */ 41 = ((Long)dec_detl*qy6 codes tabletiirl) 25 151 fone with higher sub-band encoder, now Delay for next time */ aa ne zh2 = zh; HL = al + doce /* logscl: quantizer scale factor adaptation in the lover sub-band */ ec_nbl = logsel (él. dee bt); ultiplaxing ih and i1 to get signals together +/ a = retum(it | (ih << 6)); /* ecalel: computes quantizer scale factor in the lower sub band */ LUSTING 5:8. (Contiouec) dec_dotl = scale (dec mb.) J parrec = add pole predictor output to quantized Giff. signal in place) */ | 7 for partially reconatructed signal */ lecode function, result in xoutt and xout? */ Gec_plt « dec_dit + dec_szl; | decode int input) /* upzero: update zero section predictor coefficients */ int xat,xa2;/* nf accumiators */ \worero (dec. Alt, dec del. alte, dec_del pl): int "per; int pm ‘ac ptr,*ac_ptrt, ad per, tad ptrt; /* wppel2: update second predictor coefficient apl2 and delay it as al2 */ int ir, ih; int xe,2 dec al2 = uppol2 (dec_al1, dec_a12,dec_plt ,dec_pltt,dee_pit2); int rLzh: int a; /* wppoll: update first predictor coef. (pole seticn) +/ Plit transmitted word from input into ile and ih */ dec ali = ppoll (dec_al1, dec_al2,dec_plt,dec_ple); Ar» input & Ox; ih = input => 6; /* xecons : conpute recontructed signal for adaptive predictor */ deceit = dec_si + dec_ait; ‘OER SuB_BAND DECODER */ 47 done with lower sub bend decoder, implenent delays for next time +/ iltez: compute predictor output for zero section */ eee LUSTING 6.10 Function decode input) (contained in 6.722.) 
(Continued) dec_z1t2 = dee_ritt; dec rie = dec rit: dec plt2 = dec plti; oc_pltl = dec ple: ‘+ xc sup-aanp pecopm */ £Sltez: compute predictor output for zero section */ oc_szh = Filter (dec_del_bph,dec_del_dlve): + eLitep: compute predictor output signal for pole section +/ dec_sgh = £42tep (dec_rhl,dec_abt dec_ch2,dee_ah2) ; ec_sh = dec_sph + dac_ezh; + Anvgah: in-place compute the quantized difference signal fin the higher subband */ dec_dh = ( (Long) dec_deth*aq2_code2_table(ih}) >> 15 ; + logsch: update logarithale quantizer scale factor in hi sub band */ ec nibh = Logsch (ih, dee nth); @oc_doth = scalel (dec nbh,10); + parrec: compute partially recontructed signal */ @ec_ph = dec_dh + dec szh; * upzero: update zero section predictor coefficients */ ‘upzero (dec_dh, dec_del_ane, dec. de1_bph); ‘uppol2: update second predictor coefficient aph2 and delay it as ah? */ {dec_ah? = uppol2 (dec_ahl,dec_ah2,dec_ph,dec_phl,dec_ph2); * uppoll: update first predictor coef. (pole sstion) */ ‘dec_ah = uppoll (dec. ahi ,dee_ah2, dee_ph,dee_phl) LUSTING 5.10 (Continued scalel: compute the quantizer scale factor in the hisher sub band */ Chap. redic:conpute the predictor cutput value in the higher sub_band decoder +/ Sec.5.3 Speech Processing an ([* recons + compute recontructed signal for adaptive predictor */ Bh = dec_sh + dec.dh; (/* done with high band decode, inplenenting delays for next time here */ /* end with receive quadrature mirror filters */ xd = rl = th; xe = rls thy /* receive quadrature mirror filters implemented here */ hoptr = h: ae_ptr = accuncy adptr = accumd; zal = (long)xd * (sh ptree): xa2 = (Gong)xs + (*hoptree): /* main multiply accumilate loop for samples and coefficients for(i = 0; 1 < 107 ie) ( yal += (long) (tac_ptre+) + (*hptree): xa += (Long) (tad_ptres) + (*hptres): , /* final mit/accumlate */ yal += (long) (tac_ptr) * (*h ptr+e) ya2 += (Leng) (tad ptr) * (*Rptre+) 7* scale by 2°14 */ outs = xa >> 14; out = x2 >> 14; (/* update delay Lines */ ‘ac_ptrl = ac ptr - 1, adptrl = adptr - 1, for(i = 0; i < 107 dee) ( tac_ptr- = tac ptrl-; sadptr = *ed.ptri-; > tacptr = xd; sadptr = xs: USTINGS.10. (Continued ” 22 Time DSP Applications Chap. § Atez - compute predictor output signal (zero saction) */ ue: bpli-6 and diti-6, output: sel #/ ‘leer (int “ppl, int sare) ane i; long int 21; ‘A= (long) (*bples) + (vatess); Oris tz i <6} is) 21 ¢= (long) (*bpl4s) * (sateen); veturn| (Int) (21 >> 10)); /* 32 here */ Ltep ~ compute predictor output signal (pole section) +/ @ut r1ti-2 and all-2, output spl =! Attep(int ried, ine alt, int r2e2, ine a2) fong int pl; 2 = (ong)aiasrier; 1 += (long)al2*rit2; feturn( (int) ipl >> 14); /+ x2 here +/ antl ~ quantize the difference signal in the lower stly-band */ antl (int e, ine det) ne rilymily (ong int w3, deci; 5 of Aitference signal */ a= abaiel) Permine mil based on decision levels and det gain */ ortmil = 0 ; mil < 30 ; miler) ( decis = (decks levi imil}*(leng)deti) >> isu; if (wd < decis) break; ‘uii1=30 then wa is less than all decision levels */ f(el >= 0) rit = quantasbe_posinil}; Ase il = quant26bt_neg{nill; eturn (rit); LUSTING 5.11, Functions used by the encode and decode algorithms of 6.722 leon twined in G.722.0). 
(Continued) Sec.53 Speech Processing 213 Jt logscl - update the logarithmic quantizer scale factor in lower sub-band */ /* note that nbl is passed and retumed */ int logsel (int 1, int nl) ‘ ong int wets wd = ((long)nbl * 127b) >> 71; /* leak factor 127/128 */ bl = (int)wd + wlcode_table{ii >> 2}; Af(mb1 <0) nhl = 0; Aftnbl > 18432) nbl = 20432; return(abl) ; ) /* scaiel: compute the quantizer scale factor in the lower or upper sub-band*/ int scalel (int nbl,ine shift constant) c Sint 2,92; wal = (abl >> 6) & 21; wad = nbl >> 11 wd3 = 2b table[wit) >> (shife_constant + 1 - wi2); Feturn(wd3 << 3); , /* Mpzero ~ inputs: alt, alti(o-s}, bLi[0-5], outputs: updated B1i{0-5) */ /* also inplenents delay of bli and update of alti fron dle */ void upzero(int alt, int “alti, int *b1i) ‘ Ane Aw Pf dit is zero, then no sum into bli */ ig(aue == 0) ( fori = 0; 1 <6; don) ( DULL] = (ne) ((25SL¢BLL(4)) >> 8b); /* Leak factor of 255/256 +/ ) d else ¢ for(i = 0; <6 7 den ¢ Af((long)dlt*alei(i) >= 0) waz = 128; else wi2 = -128; wa = (ine) ((255L*b1i(4}) >> 86); /* Leak factor of 255/256 +/ BLL] = wid + wc; > > (/* Smplenent delay Line for alt */ LUSTING 5.11. (Continued 24 Real-Time DSP Applications Chap. 5 auei(s) = aeiia) piste) ati) anei(2} arity aiti(o) sppol2 - update second predictor coefficient (pole section) */ inputs: al, al2, pit, plel, plt2, outputs: apl2 */ ‘yppol2 (int al, int ol2,int plt,int ple, int pit2) Long int wiz, wid: int aplz, wad = 4L* (long)all: ie((ong)ple*pitl >= 01) wi2 = -wa2; /* check sane sign */ wad = wz >> 7 J gain of 1/128 */ Le((ong)plesple2 >= OL) wa = wD + 126 (/* same sign case */ > else ( wed = wd = 22 > apl2 = wh + (127L*(long)al2 >> Th); /* Leak factor of 127/128 */ wll ie Limited to + S£(ap12 > 12288) api? ) apl2 = -12288; wppoll - update first predictor coefficient {pote section) */ inputs: all, apl2, plt, plti. outputs: epli */ wppoll (int a12,int apl2,int plt,int pitt) long ine wa2 ine wo, pli; wa2 = ((long)al1*255L) >> 81; /* Leak factor of 255/256 */ Le(Congyple*pier >= 01) ¢ apli = (intywa2 +192; /* sane sign case */ d else ( apil = (intywa2 ~ 192) d STING 5:11, (Continued) Soc.5.3 Speech Processing 216 J nove: was= .9375-.75 is always positive */ ‘wad = 15360 ~ apl2; 7 Umit value */ if (apLt > wi3) apll = wid; Sf tapht < “wd3) api = wad, return (ap11) ; ) (J logsch ~ update the logarithmic quantizer scale factor in higher sub-band */ 7+ note that nbh is passed and returned */ int logch (int th, int nh) ‘ int wa; wd = ((long)nbh * 1271) >> 7h: sibh = vel + wh_code_tablefihl if tabn < 0) nbn = 0; Af (bh > 22526) nbh = 22528; return (ath) /* leak factor 127/128 */ USTINGS.11, (Continued) Figure 5.7 shows block diagram of the G.722 encoder (transmitter), and Figure 5.8 shows a block diagram of the G.722 decoder (receiver). The entire algorithm has six main functional blocks, many of which use the same functions: w ‘A transmit quadrature mirror filter (QMF) that splits the frequency band into ‘wo sub-band: (2&3) A lower sub-band encoder and higher sub-band encoder that operate on the data produced by the transmit QMF. (4&5) A lower sub-band decoder and higher sub-band decoder. © ‘A receive QMF that combines the outputs of the decoder into one value. TigherSabBand reins , | Transmit -——*] ADPCM Encoder [~ 1, es Sera wx | AHS Ene hee Tower SubBand 4s nome FH] ADPCM Encoder [~ 7, | FIGURE 5:7 Block diagram of ADPCM encoder (transmiter) implemented by pro- ram 6.722. ded 216 Real-Time DSP Applications Chap. 
[Figure 5.8 shows the 64 kbit/s input split by a DMUX into a 16 kbit/s higher sub-band ADPCM decoder and a 48 kbit/s lower sub-band ADPCM decoder (with a mode indication input); the decoder outputs are combined by the receive quadrature mirror filters.]

FIGURE 5.8 Block diagram of ADPCM decoder (receiver) implemented by program G722.C.

The G722.C functions have been checked against the G.722 specification and are fully compatible with the CCITT recommendation. The functions and program variables are named according to the functional blocks of the algorithm specification whenever possible.

Quadrature mirror filters are used in the G.722 algorithm as a method of splitting the frequency band into two sub-bands (higher and lower). The QMFs also decimate the encoder input from 16 kHz to 8 kHz (transmit QMF) and interpolate the decoder output from 8 kHz to 16 kHz (receive QMF). These filters are 24-tap FIR filters whose impulse responses can be considered lowpass and highpass filters. Both the transmit and receive QMFs share the same coefficients and a delay line of the same number of taps.

Figure 5.9 shows a block diagram of the higher sub-band encoder. The lower and higher sub-band encoders operate on an estimated difference signal. The number of bits required to represent the difference is smaller than the number of bits required to represent the complete input signal. This difference signal is obtained by subtracting a predicted value from the input value:

    el = xl - sl
    eh = xh - sh

The predicted value, sl or sh, is produced by the adaptive predictor, which contains a second-order filter section to model poles, and a sixth-order filter section to model zeros in the input signal. After the predicted value is determined and subtracted from the input signal, the difference signal el (or eh) is applied to a nonlinear adaptive quantizer.

One important feature of the sub-band encoders is a feedback loop. The output of the adaptive quantizer is fed to an inverse adaptive quantizer to produce a difference signal. This difference signal is then used by the adaptive predictor to produce the estimate of the input signal and to update the adaptive predictor.

The G.722 standard specifies an auxiliary, nonencoded data channel. While the G.722 encoder always operates at an output rate of 64 kbits per second (with 14-bit, 16 kHz input samples), the decoder can accept encoded signals at 64, 56, or 48 kbits/s. The 56 and 48 kbits/s rates correspond to the use of the auxiliary data channel, which operates at either 8 or 16 kbits/s. A mode indication signal informs the decoder which mode is being used. This feature is not implemented in the G722.C program.

[Figure 5.9 shows the higher sub-band encoder: the predicted value is subtracted from the input to form the difference signal, which drives a 4-level adaptive quantizer producing the 16 kbit/s code; a quantizer adaptation block and a 4-level inverse adaptive quantizer feed the adaptive predictor, closing the feedback loop.]

FIGURE 5.9 Block diagram of higher sub-band ADPCM encoder implemented by program G722.C.

Figure 5.10 shows a block diagram of the higher sub-band decoder. In general, both the higher and lower sub-band encoders and decoders make the same function calls in almost the same order because they are similar in operation. For mode 1, a 60-level inverse adaptive quantizer is used in the lower sub-band, which gives the best speech quality. The higher sub-band uses a 4-level adaptive quantizer.

[Figure 5.10 shows the higher sub-band decoder: the 16 kbit/s code drives a 4-level inverse adaptive quantizer whose output is added to the adaptive predictor output to form the reconstructed signal; a quantizer adaptation block and the adaptive predictor complete the feedback structure.]

FIGURE 5.10 Block diagram of higher sub-band ADPCM decoder implemented by program G722.C.
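In the notation of Listings 5.10 and 5.11, the encoder and decoder steps just described reduce to the following relationships for the higher sub-band (the lower sub-band is the same with the l-suffixed variables and the 60-level quantizer):

    sh = sph + szh                            (adaptive predictor output: pole plus zero section)
    eh = xh - sh                              (difference signal formed in the encoder)
    dh = ((long)deth*qq2_code2_table[ih]) >> 15   (inverse adaptive quantizer output, invqah)
    ph = dh + szh                             (partially reconstructed signal, parrec)
    rh = sh + dh                              (reconstructed signal used to update the predictor, recons)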
5.4 MUSIC PROCESSING

Music signals require more dynamic range and a much wider bandwidth than speech signals. Professional quality music processing equipment typically uses 18 to 24 bits to represent each sample and a 48 kHz or higher sampling rate. Consumer digital audio processing (in CD players, for example) is usually done with 16-bit samples and a 44.1 kHz sampling rate. In both cases, music processing is a far greater challenge to a digital signal processor than speech processing. More MIPs are required for each operation, and quantization noise in filters becomes more important. In most cases DSP techniques are less expensive and can provide a higher level of performance than analog techniques.

5.4.1 Equalization and Noise Removal

Equalization refers to a filtering process where the frequency content of an audio signal is adjusted to make the source sound better, adjust for room acoustics, or remove noise that may be in a frequency band different from the desired signal. Most audio equalizers have a number of bandpass filters that operate on the audio signal in parallel, with the output of each filter added together to form the equalized signal. This structure is shown in Figure 5.11. The program EQUALIZ.C is shown in Listing 5.12. Each gain constant is used to adjust the relative signal amplitude of the output of each bandpass filter. The input signal is always added to the output such that if all the gain values are zero the signal is unchanged. Setting the gain values greater than zero will boost the frequency response in each band. For example, a boost of 6 dB is obtained by setting one of the gain values to 1.0. The center frequencies and number of bandpass filters in analog audio equalizers vary widely from one manufacturer to another. A seven-band equalizer with center frequencies at 60, 150, 400, 1000, 2400, 6000, and 15000 Hz is implemented by program EQUALIZ.C. The bandwidth of each filter is 60 percent of the center frequency in each case, and the sampling rate is 44100 Hz.

[Figure 5.11 shows the parallel equalizer structure: the input signal feeds a bank of bandpass filters (Band 1 through Band 7), each followed by a gain; the filter outputs and the original input are summed to form the equalized output.]

FIGURE 5.11 Block diagram of audio equalization implemented by program EQUALIZ.C.

#include <stdlib.h>
#include <math.h>
#include "rtdspc.h"

/* EQUALIZ.C - PROGRAM TO DEMONSTRATE AUDIO EQUALIZATION
               USING 7 IIR BANDPASS FILTERS */

/* gain values global so they can be changed in real-time */
/* start at flat pass through */
float gain[7] = { 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 };

void main()
{
    int i;
    float signal_in, signal_out;

/* history arrays for the filters */
    static float hist[7][2];

/* bandpass filter coefficients for a 44.1 kHz sampling rate */
/* center freqs are 60, 150, 400, 1000, 2400, 6000, 15000 Hz */
/* at other rates the center freqs are:                      */
/* at 32 kHz:   44, 109, 290, 726, 1742, 4354, 10884 Hz      */
/* at 22.1 kHz: 30, 75, 200, 500, 1200, 3000, 7500 Hz        */
    static float bpf[7][5] = {
        { 0.0025579742, -1.9948111773, 0.9948840737, 0.0, -1.0 },
        { 0.0063700872, -1.9868060350, 0.9872598052, 0.0, -1.0 },
        { 0.0168007612, -1.9632060528, 0.9663984776, 0.0, -1.0 },
        { 0.0408578217, -1.8988472425, 0.9182843566, 0.0, -1.0 },
        { 0.0914007276, -1.7119922638, 0.8171985149, 0.0, -1.0 },
        { 0.1845672876, -1.0703803566, 0.6308654547, 0.0, -1.0 },
        { 0.3760778010,  0.6695280420, 0.2478443682, 0.0, -1.0 }
    };

    for(;;) {
/* sum 7 bpf outputs + input for each new sample */
        signal_out = signal_in = getinput();
        for(i = 0 ; i < 7 ; i++)
            signal_out += gain[i]*iir_filter(signal_in, bpf[i], 1, hist[i]);
        sendout(signal_out);
    }
}

LISTING 5.12 Program EQUALIZ.C, which performs equalization on audio samples in real-time.
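Because each bandpass filter output is added to the unmodified input, a gain value g boosts the response at that band's center frequency by approximately 20*log10(1 + g) dB (assuming the bandpass filter has roughly unity gain and little phase shift at its center frequency). A small stand-alone sketch of this relationship follows; the helper name is illustrative and not part of EQUALIZ.C.

#include <math.h>

/* Sketch (assumption): approximate boost in dB at a band's center frequency
   for a given equalizer gain value g, assuming unity bandpass gain there. */
double band_boost_db(double g)
{
    return 20.0 * log10(1.0 + g);
}

/* Example: band_boost_db(1.0) is about 6.02 dB, which matches the 6 dB
   boost quoted in the text for a gain value of 1.0. */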
This gives the coefficients used in the example equalizer program EQUALIZ.C shown in Listing 5.12. The frequency response of the 7 filters is shown in Figure 5.12.

[Figure 5.12 plots the magnitude (in dB) of the seven-band equalizer filter frequency responses versus frequency (Hz) for the 44100 Hz sampling rate.]

FIGURE 5.12 Frequency response of the filters used in program EQUALIZ.C.

5.4.2 Pitch-Shifting

Changing the pitch of a recorded sound is often desired in order to allow it to mix with a new song, or for special effects where the original sound is shifted in frequency to a point where it is no longer identifiable. New sounds are often created by a series of pitch shifts and mixing processes.

Pitch shifting can be accomplished by interpolating a signal to a new sampling rate, and then playing the new samples back at the original sampling rate (see Alles, 1980; Smith and Gossett, 1984). If the pitch is shifted down (by an interpolation factor greater than one), the new sound will have a longer duration. If the pitch is shifted upward (by an interpolation factor less than one, where some decimation is occurring), the sound becomes shorter. Listing 5.13 shows the program PSHIFT.C, which can be used to pitch-shift a sound up or down by any number of semitones (12 semitones is an octave, as indicated by equation 4.13). It uses a long Kaiser window filter for interpolation of the samples, as illustrated in section 4.3.2 of chapter 4. The filter coefficients are calculated in the first part of the PSHIFT program before real-time input and output begins. The filtering is done with two FIR filter functions, which are shown in Listing 5.14. The history array is only updated when the interpolation point moves to the next input sample. This requires that the history update be removed from the fir_filter function discussed previously. The history is updated by the function fir_history_update. The coefficients are decimated into short polyphase filters. An interpolation ratio of up to 300 is performed, and the decimation ratio is determined by the amount of pitch shift selected by the integer variable key.

#include <stdlib.h>
#include <math.h>
#include "rtdspc.h"

/* determine the Kaiser window parameter (beta) and the number of filter
   coefficients from the stopband attenuation (att) and the normalized
   transition width (deltaf) */
long int kaiser_length(float att, float deltaf, float *beta)
{
    long int npair;
    *beta = 0.0;
    if(att >= 50.0) *beta = .1102 * (att - 8.71);
    if(att < 50.0 && att >= 21.0)
        *beta = .5842 * pow((att - 21.0), 0.4) + .07886 * (att - 21.0);
    npair = (long int)((att - 8.0) / (28.72 * deltaf));
    return(2*npair + 1);
}

/* Compute Bessel function Izero(y) using a series approximation */
float izero(float y)
{
    float s = 1.0, ds = 1.0, d = 0.0;
    do {
        d = d + 2;
        ds = ds * (y*y)/(d*d);
        s = s + ds;
    } while(ds > 1E-7 * s);
    return(s);
}

LISTING 5.13 Program PSHIFT.C (Continued)

/* run the fir filter and do not update the history array */
float fir_filter_no_update(float input, float *coef, int n, float *history)
{
    int i;
    float *hist_ptr, *coef_ptr;
    float output;

    hist_ptr = history;
    coef_ptr = coef + n - 1;              /* point to last coef */

/* form output accumulation */
    output = *hist_ptr++ * (*coef_ptr--);
    for(i = 2 ; i < n ; i++) {
        output += (*hist_ptr++) * (*coef_ptr--);
    }
    output += input * (*coef_ptr);        /* input tap */

    return(output);
}

/* update the fir filter history array */
void fir_history_update(float input, int n, float *history)
{
    int i;
    float *hist_ptr, *hist1_ptr;

    hist_ptr = history;
    hist1_ptr = hist_ptr;                 /* use for history update */
    hist_ptr++;
    for(i = 2 ; i < n ; i++) {
        *hist1_ptr++ = *hist_ptr++;       /* update history array */
    }
    *hist1_ptr = input;                   /* last history */
}

LISTING 5.14 Functions fir_filter_no_update and fir_history_update used by program PSHIFT.C.
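The number of semitones of shift maps to a resampling ratio through equation 4.13: a shift of s semitones corresponds to a frequency ratio of 2^(s/12), so 12 semitones is exactly one octave. A minimal sketch of how a decimation ratio could be chosen for the fixed interpolation ratio of 300 used by PSHIFT.C is shown below; the helper name and the rounding choice are assumptions for illustration, not code taken from the program.

#include <math.h>

/* Sketch (assumption): for a fixed interpolation ratio of 300, pick the integer
   decimation ratio that shifts the pitch by the requested number of semitones.
   A positive shift raises the pitch (decimation larger than interpolation). */
int decimation_for_shift(int semitones)
{
    double ratio = pow(2.0, (double)semitones / 12.0);   /* frequency ratio, eq. 4.13 */
    return (int)(300.0 * ratio + 0.5);                   /* round to nearest integer */
}

/* Example: decimation_for_shift(12) gives 600 (one octave up) and
   decimation_for_shift(-12) gives 150 (one octave down). */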
5.4.3 Music Synthesis

Music synthesis is a natural DSP application because no input signal or A/D converter is required. Music synthesis typically requires that many different sounds be generated at the same time and mixed together to form a chord or multiple instrument sounds (see Moorer, 1977). Each different sound produced from a synthesizer is referred to as a voice. The duration and starting point for each voice must be independently controlled.

Listing 5.15 shows the program MUSIC.C, which plays a sound with up to 6 voices at the same time. It uses the function note (see Listing 5.16) to generate samples from a second-order IIR oscillator using the same method as discussed in section 4.5.1 in chapter 4. The envelope of each note is specified using break points. The array trel gives the relative times when the amplitude should change to the values specified in array amps. The envelope will grow and decay to reach the amplitude values at each time specified, based on the calculated first-order constants stored in the rates array. The frequency of the second-order oscillator in note is specified in terms of the semitone note number key. A key value of 69 will give 440 Hz, which is the musical note A above middle C.

#include <stdlib.h>
#include <math.h>
#include "rtdspc.h"
#include "song.h"          /* song[108][7] array */

/* 6 Voice Music Generator */

typedef struct {
    int key, t, cindex;
    float cw, a, b;
    float y1, y0;
} NOTE_STATE;

#define MAX_VOICES 6

float note(NOTE_STATE *, int *, float *);

void main()
{
    long int n, t, told;
    int vnum, v, key;
    float ampold;
    register long int i, endi;
    register float sig_out;

    static NOTE_STATE notes[MAX_VOICES*SONG_LENGTH];
    static float trel[5] = { 0.1, 0.2 };
    static float amps[5] = { 3000.0, 5000.0, 4000.0 };
    static float rates[10];
    static int tbreaks[10];

    for(n = 0 ; n < SONG_LENGTH ; n++) {

/* number of samples per note */
        endi = 6*song[n][0];

/* calculate the rates required to get the desired amps */
        i = 0;
        told = 0;
        ampold = 1.0;                 /* always starts at unity */
        while(amps[i] > 1.0) {
            t = trel[i]*endi;
            rates[i] = exp(log(amps[i]/ampold)/(t-told));
            ampold = amps[i];
            tbreaks[i] = told = t;
            i++;
        }

/* set the key numbers for all voices to be played (vnum is how many) */
        for(v = 0 ; v < MAX_VOICES ; v++) {
            key = song[n][v+1];
            if(!key) break;
            notes[v].key = key;
            notes[v].t = 0;
        }
        vnum = v;

        for(i = 0 ; i < endi ; i++) {
            sig_out = 0.0;
            for(v = 0 ; v < vnum ; v++) {
                sig_out += note(&notes[v], tbreaks, rates);
            }
            sendout(sig_out);
        }
    }
    flush();
}

LISTING 5.15 Program MUSIC.C, which illustrates music synthesis by playing a 6-voice song.
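The note function, listed next, converts the semitone key number into the oscillator frequency using the equal-tempered relationship f = 440 * 2^((key - 69)/12), so that key 69 produces 440 Hz. A small stand-alone sketch of that conversion follows; the helper name is illustrative and not part of MUSIC.C.

#include <math.h>

/* Sketch (assumption): convert a semitone key number to frequency in Hz,
   with key 69 defined as 440 Hz (A above middle C), as in the note function. */
double key_to_hz(int key)
{
    return 440.0 * pow(2.0, (key - 69) / 12.0);
}

/* Example: key_to_hz(69) returns 440.0 and key_to_hz(81) returns 880.0
   (one octave higher). */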
#include <math.h>
#include "rtdspc.h"

/* Function to generate samples from a second order oscillator */

/* key constant is 1/12 */
#define KEY_CONSTANT 0.083333333333333

/* this sets the A above middle C reference frequency of 440 Hz */
#define TWO_PI_DIV_FS_440 (880.0 * PI / SAMPLE_RATE)

/* cw is the cosine constant required for fast changes of envelope */
/* a and b are the coefficients for the difference equation */
/* y1 and y0 are the history values */
/* t is the time index for this note */
/* cindex is the index into rate and tbreak arrays (reset when t = 0) */
typedef struct {
    int key, t, cindex;
    float cw, a, b;
    float y1, y0;
} NOTE_STATE;

/*
    note - generate the samples for one voice.

    The key number determines the semi-tone pitch to generate;
    key number 69 will give A above middle C at 440 Hz.
    The rate constants determine the decay or rise of the envelope (close to 1).
    The tbreak array determines the time index at which to change rate.
*/

/* NOTE_STATE structure, time break point array, rate parameter array */
float note(NOTE_STATE *s, int *tbreak_array, float *rate_array)
{
    register int t, ci;
    float wosc, rate, out;

    t = s->t;
/* t == 0 re-start case, set state variables */
    if(!t) {
        wosc = TWO_PI_DIV_FS_440 * pow(2.0, (s->key - 69) * KEY_CONSTANT);
        s->cw = 2.0 * cos(wosc);
        rate = rate_array[0];
        s->a = s->cw * rate;           /* rate change */
        s->b = -rate * rate;
        s->y0 = 0.0;
        out = rate*sin(wosc);
        s->cindex = 0;
    }
    else {
        ci = s->cindex;
/* rate change case */
        if(t == tbreak_array[ci]) {
            rate = rate_array[++ci];
            s->a = s->cw * rate;
            s->b = -rate * rate;
            s->cindex = ci;
        }
/* make new sample */
        out = s->a * s->y1 + s->b * s->y0;
        s->y0 = s->y1;
    }
    s->y1 = out;
    s->t = t + 1;
    return(out);
}

LISTING 5.16 Function note(state, tbreak_array, rate_array), which generates the samples for each note in the MUSIC.C program.

5.5 ADAPTIVE FILTER APPLICATIONS

Signals can be effectively improved or enhanced using adaptive methods if the signal frequency content is narrow compared to the overall bandwidth and the frequency content changes with time.

5.5.1 LMS Signal Enhancement

Figure 5.13 shows the block diagram of an LMS adaptive signal enhancement system that will be used to illustrate the basic LMS algorithm. This algorithm was described in section 1.7.2 in chapter 1. The input signal is a sine wave with added white noise, and the adaptive LMS filter is used to enhance the sinusoid relative to the noise.

[Figure 5.13 shows the LMS signal enhancement structure: the input sin(w0*k) + n(k) feeds a one-sample delay and an adaptive FIR filter whose output is compared with the undelayed input to form the error signal that drives the coefficient adaptation.]
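The adaptive filter in Figure 5.13 is adjusted with the normalized LMS update implemented by the lms function of Listing 5.18 (which follows the main program). In the notation of that listing's comment block, each new sample is processed as:

    y(k)    = b0(k)*x(k) + b1(k)*x(k-1) + ... + bL(k)*x(k-L)     (FIR filter output)
    e(k)    = d(k) - y(k)                                        (error signal)
    sig(k)  = alpha*x(k)^2 + (1 - alpha)*sig(k-1)                (input power estimate)
    bi(k+1) = bi(k) + 2*mu*e(k)*x(k-i) / ((L+1)*sig(k))          (coefficient update)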
#include <stdlib.h>
#include <math.h>
#include "rtdspc.h"

#define N 352
#define L 20               /* filter order, (length L+1) */

/* set convergence parameter */
float mu = 0.02;

void main()
{
    float lms(float, float, float *, int, float, float);
    static float d[N], b[L+1];
    float signal_amp, noise_amp, arg, x;
    int k;

/* create signal plus noise */
    signal_amp = sqrt(2.0);
    noise_amp = 0.2*sqrt(12.0);
    arg = 2.0*PI/20.0;
    for(k = 0 ; k < N ; k++) {
        d[k] = signal_amp*sin(arg*k) + noise_amp*gaussian();
    }

/* scale based on L */
    mu = 2.0*mu/(L+1);

    x = 0.0;
    for(k = 0 ; k < N ; k++) {
        sendout(lms(x, d[k], b, L, mu, 0.01));
/* delay x one sample */
        x = d[k];
    }
}

LISTING 5.17 Program LMS.C, which illustrates signal-to-noise enhancement using the LMS algorithm.

/*
    function lms(x, d, b, l, mu, alpha)

    Implements the LMS algorithm:  b(k+1) = b(k) + 2*mu*e*x(k)/((l+1)*sig)

    x      = input data
    d      = desired signal
    b[0:l] = adaptive coefficients of lth order FIR filter
    l      = order of filter (> 1)
    mu     = convergence parameter (0.0 to 2.0)
    alpha  = forgetting factor   sig(k) = alpha*(x(k)**2) + (1-alpha)*sig(k-1)
             (>= 0.0 and < 1.0)

    returns the filter output
*/

float lms(float x, float d, float *b, int l, float mu, float alpha)
{
    int ll;
    float e, mu_e, y;
    static float px[51];            /* max L = 50 */
    static float sigma = 2.0;       /* start at 2 and update internally */

    px[0] = x;

/* calculate filter output */
    y = b[0]*px[0];
    for(ll = 1 ; ll <= l ; ll++)
        y = y + b[ll]*px[ll];

/* error signal */
    e = d - y;

/* update sigma */
    sigma = alpha*(px[0]*px[0]) + (1-alpha)*sigma;
    mu_e = mu*e/sigma;

/* update coefficients */
    for(ll = 0 ; ll <= l ; ll++)
        b[ll] = b[ll] + mu_e*px[ll];

/* update history */
    for(ll = l ; ll >= 1 ; ll--)
        px[ll] = px[ll-1];

    return(y);
}

LISTING 5.18 Function lms(x,d,b,l,mu,alpha), which implements the LMS algorithm.

As shown in Listings 5.17 and 5.18, the adaptive FIR filter operates on a one-sample delayed version of the input sequence, while the desired response is the noisy input itself. Because the white noise components of the input and of its delayed version are uncorrelated (a one-sample delay is sufficient for a sine wave in white noise), the filter converges so that the sinusoid is enhanced and the noise is reduced. Although the convergence parameter can be made larger so that the filter adapts more quickly, the adaptation may not converge as smoothly. Starting from no signal (all coefficients zero), the adaptive system takes approximately 300 samples to converge, as shown in Figure 5.14(b); with mu = 0.1 it takes approximately 30 samples, as shown in the continued part of Figure 5.14.

[Figure 5.14 plots sample value versus sample number (0 to 350) for the three cases described in the captions.]

FIGURE 5.14 (a) Original noisy signal used in program LMS.C. (b) Enhanced signal obtained from program LMS.C.

FIGURE 5.14 (Continued) Enhanced signal obtained from program LMS.C with mu = 0.1.

5.5.2 Frequency Tracking with Noise

Listing 5.19 shows the INSTF.C program, which uses the lms function to determine instantaneous frequency estimates. Instead of using the output of the adaptive filter as illustrated in the last section, the INSTF program uses the filter coefficients to estimate the frequency content of the signal. A 1024-point FFT is used to determine the frequency response of the adaptive filter every 100 input samples. The same peak location finding algorithm as used in section 5.1.2 is used to determine the interpolated peak frequency response of the adaptive filter. Note that because the filter coefficients are real, only the first half of the FFT output is used for the peak search.

Figure 5.15 shows the output of the INSTF program when the 100,000 samples from the OSC.C program (see section 4.5.1 of chapter 4) are provided as an input. Figure 5.15(a) shows the result without added noise, and Figure 5.15(b) shows the result when white Gaussian noise (standard deviation = 100) is added to the signal from the OSC program. Listing 5.19 shows how the program was used to add the noise to the input signal using the gaussian() function. Note the positive bias in both results due to the finite length (128 in this example) of the adaptive FIR filter. Also, in Figure 5.15(b) the first estimates are off scale because of the low signal level in the beginning portion of the waveform generated by the OSC program (the noise dominates in the first 10 estimates).
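The interpolated peak location computed at the end of Listing 5.19 (which follows) uses the same three-point method as section 5.1.2. With k the index of the largest magnitude-squared bin, the value written to the output is

    p1 = mag[k] - mag[k-1]
    p2 = mag[k] - mag[k+1]
    f  = ( k + (p1 - p2) / (2*(p1 + p2)) ) / FFT_LENGTH

expressed as a fraction of the sampling rate; the program adds a very small constant to the denominator to avoid division by zero.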
#include <stdlib.h>
#include <math.h>
#include "rtdspc.h"

/* LMS Instantaneous Frequency Estimation Program */

#define L 127              /* filter order, L+1 coefficients */
#define LMAX 200           /* max filter order, L+1 coefficients */
#define NEST 100           /* estimate decimation ratio in output */

/* FFT length must be a power of 2 */
#define FFT_LENGTH 1024
#define M 10               /* must be log2(FFT_LENGTH) */

/* set convergence parameter */
float mu = 0.01;

void main()
{
    float lms(float, float, float *, int, float, float);

    static float b[LMAX];
    static COMPLEX samp[FFT_LENGTH];
    static float mag[FFT_LENGTH];
    float x, d, tempflt, p1, p2;
    int i, j, k;

/* scale based on L */
    mu = 2.0*mu/(L+1);

    d = 0.0;
    for(;;) {

        for(i = 0 ; i < NEST ; i++) {
/* add noise to input for this example */
            x = getinput() + 100.0*gaussian();
            lms(x, d, b, L, mu, 0.01);
/* delay d one sample */
            d = x;
        }

/* copy L+1 coefficients */
        for(i = 0 ; i <= L ; i++) {
            samp[i].real = b[i];
            samp[i].imag = 0.0;
        }
/* zero pad */
        for( ; i < FFT_LENGTH ; i++) {
            samp[i].real = 0.0;
            samp[i].imag = 0.0;
        }

        fft(samp, M);

        for(j = 0 ; j < FFT_LENGTH/2 ; j++) {
            tempflt  = samp[j].real * samp[j].real;
            tempflt += samp[j].imag * samp[j].imag;
            mag[j] = tempflt;
        }

/* find the biggest magnitude spectral bin and output */
        tempflt = mag[0];
        k = 0;
        for(j = 1 ; j < FFT_LENGTH/2 ; j++) {
            if(mag[j] > tempflt) {
                tempflt = mag[j];
                k = j;
            }
        }

/* interpolate the peak location */
        if(k == 0) {
            p1 = p2 = 0.0;
        }
        else {
            p1 = mag[k] - mag[k-1];
            p2 = mag[k] - mag[k+1];
        }
        sendout(((float)k + 0.5*((p1-p2)/(p1+p2+2e-30)))/FFT_LENGTH);
    }
}

LISTING 5.19 Program INSTF.C, which uses function lms(x,d,b,l,mu,alpha) to implement the LMS frequency tracking algorithm.

[Figure 5.15(a) and (b) plot the estimated frequency (as a fraction of the sampling rate, roughly 0.00 to 0.09) versus time for the two cases described in the caption.]

FIGURE 5.15 (a) Frequency estimates obtained from program INSTF.C (solid line) with input from OSC.C and correct frequencies (dashed line) generated by OSC.C. The INSTF.C program shown in Listing 5.19 was modified to not add noise to the input signal. (b) Frequency estimates obtained from program INSTF.C (solid line) with input from OSC.C and correct frequencies (dashed line) generated by OSC.C. Note that white Gaussian noise with a standard deviation of 100 was added to the signal from the OSC program.

5.6 REFERENCES

MOORER, J. (August 1977). Signal Processing Aspects of Computer Music: A Survey. Proceedings of the IEEE, 65, (8).
ALLES, H. (April 1980). Music Synthesis Using Real-Time Digital Techniques. Proceedings of the IEEE, 68, (4).
SMITH, J. and GOSSETT, P. (1984). A Flexible Sample Rate Conversion Method. Proceedings of ICASSP.
CROCHIERE, R. and RABINER, L. (March 1981). Interpolation and Decimation of Digital Signals: A Tutorial Review. Proceedings of the IEEE, 69, 300-331.
SKOLNIK, M. (1980). Introduction to Radar Systems (2nd ed.). New York: McGraw-Hill.
General Aspects of Digital Transmission Systems (Nov. 1988). Terminal Equipments, Recommendations G.700-G.795. International Telegraph and Telephone Consultative Committee (CCITT), 9th Plenary Assembly, Melbourne.

APPENDIX
DSP FUNCTION LIBRARY AND PROGRAMS

The enclosed disk is an IBM-PC compatible high-density disk (1.44 MBytes capacity) and contains four directories called PC, ADSP21K, DSP32C, and C30 for the specific programs that have been compiled and tested for the four platforms discussed in this book. Each directory contains a file called READ.ME, which provides additional information about the software. A short description of the platform used in testing associated with each directory is as follows:

PC       General purpose IBM-PC or workstation with any ANSI C compiler (not real-time)
ADSP21K  Analog Devices EZ-LAB with ADSP-21020/ADSP-21060
DSP32C   DSP32C board (CAC) with 16-bit A/D and D/A converters
C30      Domain Technologies DSPCard-C31

The following is a list of the C programs and functions described in detail in chapters 3, 4, and 5, giving the section number in the text where each program is described and a short description of the program. The source code for the four platform-specific versions of each program is provided in the directories described above.
3.3.3  1024 Point FFT Test Function
3.4.2  Interrupt-Driven Output Example
4.1.1  FIR Filter Function (fir_filter)
4.1.2  FIR Filter Coefficients by Kaiser Window
4.1.2  FIR Filter Coefficients by Parks-McClellan
4.1.3  IIR Filter Function (iir_filter)
4.1.4  Real-Time getinput Function (ASCII text for PC)
4.1.4  Real-Time getinput Function (WAV file format)
4.1.4  Real-Time sendout Function (ASCII text for PC)
4.1.4  Real-Time sendout Function (WAV file format)
4.2.1  Gaussian Noise Generation Function
4.2.2  Signal-to-Noise Ratio Improvement
4.3.3  Sample Rate Conversion Example
4.4.1  Fast Convolution Using FFT Methods
4.4.2  Interpolation Using the FFT
4.5.1  IIR Filters as Oscillators
4.5.2  Table-Generated Waveforms
5.1.1  Speech Spectrum Analysis
5.1.2  Doppler Radar Processing
5.2.1  ARMA Modeling of Signals
5.2.2  AR Frequency Estimation
5.3.1  Speech Compression
5.3.2  ADPCM (G.722, fixed-point)
5.3.2  ADPCM (G.722, floating-point)
5.4.1  Equalization and Noise Removal
5.4.2  Pitch-Shifting
5.4.3  Music Synthesis
5.5.1  LMS Signal Enhancement
5.5.2  Frequency Tracking with Noise

Make files (with an extension .MAK) are also included on the disk for each platform. If the user does not have a make utility available, PC batch files (with an extension .BAT) are also included with the same name as the make file. Make files are provided for the following programs described in detail in chapters 3, 4, and 5:

3.4.2  Interrupt-Driven Output Example
4.2.2  Signal-to-Noise Ratio Improvement
4.3.3  Sample Rate Conversion Example
4.4.1  Fast Convolution Using FFT Methods
4.4.2  Interpolation Using the FFT
4.5.1  IIR Filters as Oscillators
4.5.2  Table-Generated Waveforms
5.1.1  Speech Spectrum Analysis
5.1.2  Doppler Radar Processing
5.2.1  ARMA Modeling of Signals
5.2.2  AR Frequency Estimation
5.3.1  Speech Compression
5.3.2  ADPCM (G.722, fixed-point)
5.3.2  ADPCM (G.722, floating-point)
5.4.1  Equalization and Noise Removal
5.4.2  Pitch-Shifting
5.4.3  Music Synthesis
5.5.1  LMS Signal Enhancement
5.5.2  Frequency Tracking with Noise
INDEX

[Alphabetical index of terms with page references.]

C ALGORITHMS FOR REAL-TIME DSP
Paul M. Embree
