# A New Sequence in Signals and Linear Systems

Part I: ENEE 241
Department of Electrical and Computer Engineering
University of Maryland, College Park
Draft 8, 01/24/07
Contents

1 Numbers, Waveforms and Signals
1.1 Real Numbers in Theory and Practice
1.1.1 Fixed-Point Representation of Real Numbers
1.1.2 Floating-Point Representation
1.1.3 Visualization of the Fixed and Floating Point Representations
1.1.4 Finite-Precision Representations of Real Numbers
1.1.5 The IEEE Standard for Floating-Point Arithmetic
1.2 Round-off Errors
1.2.1 Error Definitions
1.2.2 Round-Off
1.2.3 Round-Off Errors in Finite-Precision Computation
1.3 Review of Complex Numbers
1.3.1 Complex Numbers as Two-Dimensional Vectors
1.3.2 The Imaginary Unit
1.3.3 Complex Multiplication and its Interpretation
1.3.4 The Complex Exponential
1.3.5 Inverses and Division
1.4 Sinusoids in Continuous Time
1.4.1 The Basic Cosine and Sine Functions
1.4.2 The General Real-Valued Sinusoid
1.4.3 Complex-Valued Sinusoids and Phasors
1.5 Sinusoids in Discrete Time
1.5.1 Discrete-Time Signals
1.5.2 Definition of the Discrete-Time Sinusoid
1.5.3 Equivalent Frequencies for Discrete-Time Sinusoids
1.5.4 Periodicity of Discrete-Time Sinusoids
1.6 Sampling of Continuous-Time Sinusoids
1.6.1 Sampling
1.6.2 Discrete-Time Sinusoids Obtained by Sampling
1.6.3 The Concept of Alias
1.6.4 The Nyquist Rate
Problems for Chapter 1

2 Matrices and Systems
2.1 Introduction to Matrix Algebra
2.1.1 Matrices and Vectors
2.1.2 Signals as Vectors
2.1.3 Linear Transformations
2.2 Matrix Multiplication
2.2.1 The Matrix of a Linear Transformation
2.2.2 The Matrix-Vector Product
2.2.3 The Matrix-Matrix Product
2.2.4 Associativity and Commutativity
2.3 More on Matrix Algebra
2.3.1 Column Selection and Permutation
2.3.2 Matrix Transpose
2.3.3 Row Selection
2.4 Matrix Inversion and Linear Independence
2.4.1 Inverse Systems and Signal Representations
2.4.2 Range of a Matrix
2.4.3 Geometrical Interpretation of the Range
2.5 Inverse of a Square Matrix
2.5.1 Nonsingular Matrices
2.5.2 Inverse of a Nonsingular Matrix
2.5.3 Inverses in Matrix Algebra
2.5.4 Nonsingularity of Triangular Matrices
2.5.5 The Inversion Problem for Matrices of Arbitrary Size
2.6 Gaussian Elimination
2.6.1 Statement of the Problem
2.6.2 Outline of Gaussian Elimination
2.6.3 Gaussian Elimination Illustrated
2.6.4 Row Operations in Forward Elimination
2.7 Factorization of Matrices in Gaussian Elimination
2.7.1 LU Factorization
2.7.2 The Matrix L
2.7.3 Applications of the LU Factorization
2.7.4 Row Operations in Backward Substitution
2.8 Pivoting in Gaussian Elimination
2.8.1 Pivots and Pivot Rows
2.8.2 Zero Pivots
2.8.3 Small Pivots
2.8.4 Row Pivoting
2.9 Further Topics in Gaussian Elimination
2.9.1 Computation of the Matrix Inverse
2.9.2 Ill-Conditioned Problems
2.10 Inner Products, Projections and Signal Approximation
2.10.1 Inner Products and Norms
2.10.2 Angles, Projections and Orthogonality
2.10.3 Formulation of the Signal Approximation Problem
2.10.4 Projection and the Signal Approximation Problem
2.10.5 Solution of the Signal Approximation Problem
2.11 Least-Squares Approximation in Practice
2.12 Orthogonality and Least-Squares Approximation
2.12.1 A Simpler Least-Squares Problem
2.12.2 Generating Orthogonal Reference Vectors
2.13 Complex-Valued Matrices and Vectors
2.13.1 Similarities to the Real-Valued Case
2.13.2 Inner Products of Complex-Valued Vectors
2.13.3 The Least-Squares Problem in the Complex Case
Problems for Chapter 2

3 The Discrete Fourier Transform
3.1 Complex Sinusoids as Reference Vectors
3.1.1 Introduction
3.1.2 Fourier Sinusoids
3.1.3 Orthogonality of Fourier Sinusoids
3.2 The Discrete Fourier Transform and its Inverse
3.2.1 Definitions
3.2.2 Symmetry Properties of the DFT and IDFT Matrices
3.2.3 Signal Structure and the Discrete Fourier Transform
3.3 Structural Properties of the DFT: Part I
3.3.1 The Spectrum of a Real-Valued Signal
3.3.2 Signal Permutations
3.4 Structural Properties of the DFT: Part II
3.4.1 Permutations of DFT and IDFT Matrices
3.4.2 Summary of Identities
3.4.3 Main Structural Properties of the DFT
3.5 Structural Properties of the DFT: Examples
3.5.1 Miscellaneous Signal Transformations
3.5.2 Exploring Symmetries
3.6 Multiplication and Circular Convolution
3.6.1 Duality of Multiplication and Circular Convolution
3.6.2 Computation of Circular Convolution
3.7 Periodic and Zero-Padded Extensions of a Signal
3.7.1 Definitions
3.7.2 Periodic Extension to an Integer Number of Periods
3.7.3 Zero-Padding to a Multiple of the Signal Length
3.8 Detection of Sinusoids Using the Discrete Fourier Transform
3.8.1 Introduction
3.8.2 The DFT of a Complex Sinusoid
3.8.3 Detection of a Real-Valued Sinusoid
3.8.4 Final Remarks
Problems for Chapter 3

4 Introduction to Linear Filtering
4.1 The Discrete-Time Fourier Transform
4.1.1 From Vectors to Sequences
4.1.2 Development of the Discrete-Time Fourier Transform
4.2 Computation of the Discrete-Time Fourier Transform
4.2.1 Signals of Finite Duration
4.2.2 The DFT as a Sampled DTFT
4.2.3 Signals of Infinite Duration
4.3 Introduction to Linear Filters
4.3.1 Linearity, Time-Invariance and Causality
4.3.2 Sinusoidal Inputs and Frequency Response
4.3.3 Amplitude and Phase Response
4.4 Linear Filtering in the Frequency Domain
4.4.1 System Function and its Relationship to Frequency Response
4.4.2 Cascaded Filters
4.4.3 The General Finite Impulse Response Filter
4.4.4 Frequency Domain Approach to Computing Filter Response
4.4.5 Response to a Periodic Input
4.5 Frequency-Selective Filters
4.5.1 Main Filter Types
4.5.2 Ideal Filters
4.5.3 Amplitude Response of Practical Filters
4.5.4 Practical FIR Filters
4.5.5 Filter Transformations
4.6 Time Domain Computation of Filter Response
4.6.1 Impulse Response
4.6.2 Convolution and Linearity
4.6.3 Computation of the Convolution Sum
4.6.4 Convolution and Polynomial Multiplication
4.6.5 Impulse Response of Cascaded Filters
4.7 Convolution in Practice
4.7.1 Linear Convolution of Two Vectors
4.7.2 Linear Convolution versus Circular Convolution
4.7.3 Circular Buffers for Real-Time FIR Filters
4.7.4 Block Convolution
Problems for Chapter 4

Epilogue
The Fourier Series
The Fourier Transform
Nyquist Sampling
Chapter 1

Numbers, Waveforms and Signals
1.1 Real Numbers in Theory and Practice
The real line R consists of a continuum of points, each point representing a different number. A real number can be

• rational, i.e., a ratio m/n of two integers, where n ≠ 0; or
• irrational; examples of irrational numbers include π, √2, etc.

Figure 1.1: The real line.
1.1.1 Fixed-Point Representation of Real Numbers
This is the most natural representation of a real number x. The sign of x (also denoted by sgn(x)) is followed by the integer part of x, the (fixed) fractional point, and the fractional part of x.

Figure 1.2: Fixed-point representation of a real number: sign, integer part, fractional point, fractional part. The fractional part may consist of infinitely many digits.
The representation uses a base (or radix) B, which is a positive integer.
All digits (in both the integer and fractional parts of x) are integers ranging
from 0 to B −1, inclusive. For example,
• in the decimal representation of x, the base B = 10 is used, and the
possible digits are 0, 1, . . . , 9;
• in the binary representation of x, the base B = 2 is used, and the
possible digits are 0 and 1, only.
The integer part of x is always a finite string of digits. The fractional part is, in general, an infinite string; in other words, non-terminating expansions are the rule rather than the exception. For example (B = 10):

π = 3.1415926...
√2 = 1.4142136...
The fractional part of x has a terminating expansion if and only if x is an exact multiple of a negative power B^{-l} of the base, where l is a positive integer.
Interestingly, terminating expansions can be expressed in two equivalent
non-terminating forms. For example (B = 10):
13.75 = 13.7500000 . . . = 13.7499999 . . .
Clearly, decimal (B = 10) representations are most common. Binary
forms (B = 2) are equally (if not more) useful in machine computations.
Conversion to binary form involves expressing the integer and fractional
parts of x in terms of nonnegative and negative powers of 2, respectively.
For example,

13 = 1·2^3 + 1·2^2 + 0·2^1 + 1·2^0 = (1101)_2
0.75 = 1·2^{-1} + 1·2^{-2} = (0.11)_2

and thus

13.75 = (1101.11)_2 = (1101.1100000...)_2 = (1101.1011111...)_2

where both non-terminating expansions are shown. Note that for a positive integer l, 2^{-l} can be expressed as 5^l · 10^{-l}. Thus, if the binary representation of x has a terminating fractional part, so does the decimal one. The converse is not, however, true; e.g., the decimal fraction 0.6 does not have a terminating binary expansion.
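The repeated-doubling procedure behind these expansions can be sketched in Python (the helper name `binary_fraction_digits` is ours, not the text's). It shows 0.75 terminating after two binary digits, while 0.6 repeats the pattern 1001 indefinitely:

```python
from fractions import Fraction

def binary_fraction_digits(x, max_digits=12):
    """Expand the fractional part of x (0 <= x < 1) in base 2 by
    repeated doubling; return the digit string and whether the
    expansion terminated within max_digits."""
    x = Fraction(x)
    digits = []
    for _ in range(max_digits):
        if x == 0:
            return "".join(digits), True
        x *= 2
        d = int(x)          # next binary digit (0 or 1)
        digits.append(str(d))
        x -= d
    return "".join(digits), x == 0

print(binary_fraction_digits(Fraction(3, 4)))   # ('11', True): 0.75 = (0.11)_2
print(binary_fraction_digits(Fraction(6, 10)))  # ('100110011001', False): repeats
```

Exact rational arithmetic (`Fraction`) is used so that the repeating pattern is not obscured by the machine's own round-off.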
1.1.2 Floating-Point Representation
A floating-point representation of x with respect to base B is usually specified as

sgn(x) · (0.d_1 d_2 d_3 ...) · B^E

where:

• the fraction 0.d_1 d_2 d_3 ... is known as the mantissa of x; and
• E is the exponent of the base B.

The integer factor B^E is also known as the characteristic of the floating-point representation. Note that the floating-point representation of x is not unique. For example, x = 80 can be written as

0.8 × 10^2 = 0.08 × 10^3 = 0.008 × 10^4 = ...

The floating-point representation can be made unique by normalization, i.e., requiring that d_1 be nonzero. Thus the normalized floating-point representation of x = 80 is x = 0.8 × 10^2. Similarly,

π = (0.31415926...) × 10^1
13.75 = 0.1375 × 10^2 = (0.110111)_2 × 2^4
0.01375 = 0.1375 × 10^{-1}

Note that in the binary (B = 2) case, normalization forces the first fractional digit d_1 to be equal to 1.
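Python's standard library exposes the normalized binary representation directly: `math.frexp` returns a mantissa in [1/2, 1), i.e., with d_1 = 1, together with the matching exponent. A small sketch (the loop and printout are ours):

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1,
# i.e., the normalized binary mantissa (first fractional digit d_1 = 1).
for x in (80.0, 13.75, 0.01375):
    m, e = math.frexp(x)
    print(f"{x} = {m} * 2**{e}")

# 13.75 yields mantissa 0.859375 = (0.110111)_2 and exponent 4,
# matching (0.110111)_2 x 2^4 above.
```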
1.1.3 Visualization of the Fixed and Floating Point Representations
Assume a fixed base B. If, in the fixed-point representation of x, we keep the sign and integer part of x fixed and vary the fractional part, the value of x will range between two consecutive integers. In other words, the integer part of x corresponds to a unit-length interval on the real line, and such intervals form a partition of the real line as shown in Figure 1.3.

Figure 1.3: Partition of the real line into intervals of unit length.
In the normalized floating-point representation of x, keeping both the sign and exponent E of x fixed and varying the mantissa also results in x ranging over an interval of values. Since d_1 ≥ 1, the range of 0.d_1 d_2 d_3 ... is the interval [1/B, 1), shown in bold in Figure 1.4.

Figure 1.4: Shown in bold are the ranges of values of the normalized mantissa of binary (left) and decimal (right) numbers.
Thus positive numbers with the same exponent E form the interval [B^{E-1}, B^E). This means that varying E results in a partition of the positive real line into intervals of different length, the ratio of interval lengths being a power of B. The binary case is illustrated in Figure 1.5.

Figure 1.5: Partition of the real line according to the exponent E in the binary floating-point representation.
Clearly, the same holds for the negative real line. Note that x = 0 is the only number that does not have a normalized representation (all the d_i's are zero).
1.1.4 Finite-Precision Representations of Real Numbers
Machine computations require finite storage and finite processing time. If K bits are allocated to storing a real variable, then that variable can take at most 2^K values. These values will be collectively termed the precise numbers. In the course of a numerical computation, every real number is approximated by a precise number. This approximation is known as round-off (more on that later).

Fixed-point computation involves, essentially, integer arithmetic. The 2^K precise numbers are uniformly spaced on an interval of length A. In a typical application, a precise number would be an exact multiple of A · 2^{-K} ranging in value from -(A/2) + A · 2^{-K} to A/2.

Although fixed-point computation is relatively simple and can be attractive for low-power applications, it can lead to large relative errors when small values are involved. Floating-point computation, on the other hand, is generally more accurate and preferable for most applications.

Precise numbers in floating-point computation are determined by assigning K_e bits to the exponent and K_m bits to the mantissa, where

1 + K_e + K_m = K

(One bit is required for encoding the sign of the number.) The 2^{K_e} possible exponents are consecutive integers; this places the positive precise numbers in 2^{K_e} adjacent intervals of the form [2^{E-1}, 2^E). Within each such interval, there are 2^{K_m} uniformly spaced precise numbers, each corresponding to a different value of the mantissa. Details of an actual implementation (e.g., in MATLAB) are described below.
1.1.5 The IEEE Standard for Floating-Point Arithmetic
MATLAB uses what is known as double-precision variables in its floating-point computations. The precise numbers and their encoding follow the IEEE 754 Standard. K = 64 bits are used for each precise number, with K_e = 11 bits allocated to the exponent and K_m = 52 bits allocated to the mantissa. The bits are arranged as shown in Figure 1.6.

Figure 1.6: Binary encoding of a double-precision floating-point number: sign (1 bit), exponent codeword (11 bits), mantissa codeword (52 bits, holding d_2 through d_53).

The value of the sign bit is 0 for positive numbers, 1 for negative ones.

Read as a binary integer with the most significant digit on the left (as usual), the exponent codeword ranges in value from C = 0 to C = 2^11 - 1 = 2047. The precise number a is then obtained as follows:

• If 1 ≤ C ≤ 2046, then a has a normalized floating-point representation with exponent E = C - 1022 and mantissa given by 0.1d_2...d_53, where d_2, ..., d_53 are read directly off the mantissa codeword in the order shown in Figure 1.6:

a = sgn(a) · (0.1d_2...d_53) · 2^{C-1022}

• If C = 0, then a has a denormalized floating-point representation:

a = sgn(a) · (0.0d_2...d_53) · 2^{-1021} = sgn(a) · (0.d_2...d_53) · 2^{-1022}

Note that d_2 = ... = d_53 = 0 gives a = 0.

• If C = 2047, the value of a is either infinite (Inf) or indeterminate (NaN). An infinite value (positive or negative) is encoded using an all-zeros mantissa. Any other mantissa word denotes an indeterminate value (such values arise from expressions such as 0/0, ∞ - ∞, etc.).
The effective range of exponents in the normalized floating-point representation is E = -1021 to E = 1024. As discussed earlier, each exponent E corresponds to a range of 2^52 precise numbers uniformly spaced in [2^{E-1}, 2^E) (on the positive real line). Denormalization also allows 2^52 uniformly spaced precise numbers in the interval [0, 2^{-1022}).

MATLAB can display the binary string corresponding to any floating-point number (after approximating it by the nearest precise number); the hexadecimal form is used (0000 through 1111 are encoded as 0 through F). The command

format hex

causes all subsequent numerical values to be shown in that form. Conversely, the command

hex2num('string')

where string consists of 16 hex digits (= 64 bits), will output the precise number corresponding to the input string. Thus:

• hex2num('0000000000000001') gives the smallest positive number, namely 2^{-1074}, which is approximately equal to 4.94 × 10^{-324};
• hex2num('7FEFFFFFFFFFFFFF') gives the largest positive number, namely 2^1024 - 2^971, which is approximately equal to 1.80 × 10^308.
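The same encoding can be inspected outside MATLAB. The following Python sketch (the helper names are ours) mirrors `format hex` and `hex2num` using the standard `struct` module, which packs a double into its raw IEEE 754 bytes:

```python
import struct

def double_to_hex(x):
    """16 hex digits of the big-endian IEEE 754 double encoding,
    mirroring what MATLAB's `format hex` displays."""
    return struct.pack(">d", x).hex().upper()

def hex_to_double(s):
    """Inverse mapping, analogous to MATLAB's hex2num."""
    return struct.unpack(">d", bytes.fromhex(s))[0]

print(double_to_hex(1.0))                 # 3FF0000000000000
print(hex_to_double("0000000000000001"))  # 5e-324, i.e. 2**-1074
print(hex_to_double("7FEFFFFFFFFFFFFF"))  # 1.7976931348623157e+308
```

The first output shows the layout directly: sign bit 0, exponent codeword 0x3FF = 1023 (so C = 1023 and E = 1), all-zero mantissa bits.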
1.2 Round-off Errors
1.2.1 Error Definitions
The error in approximating a number x by x̂ is given by the difference

x̂ - x

The relative error is given by the ratio

(x̂ - x)/x

and is often quoted as a percentage. We will use the term "absolute" (error, relative error) to designate the absolute value of the quantities defined above.

Word of Caution: The term absolute error is sometimes used for the (signed) error x̂ - x itself, to differentiate it from the relative error. We will not adopt this definition here.
1.2.2 Round-Off
Round-off is defined as the approximation of a real number by a finite-precision number. This approximation is necessary in most machine computations involving real numbers, due to storage constraints (finite word lengths for all variables involved).

The most common round-off methods are chopping and rounding.

Chopping

Chopping is the approximation of a real number x by the closest precise number whose absolute value is less than, or equal to, that of x (i.e., the approximation is in the direction of the origin). Chopping is illustrated in Figure 1.7.
Figure 1.7: Chopping (x ↦ x̂) of positive and negative numbers.
As an example, consider a decimal fixed-point machine with four-digit precision. This means that the precise numbers are all multiples of 10^{-4} that fall in a range (-A/2, A/2], where A is a large number. Assuming that x falls in the above range, chopping x is the same as keeping the first four fractional digits in the fixed-point decimal representation of x.

Thus, denoting chopping by "→", we have:

π = 3.1415926... → 3.1415
1 - √2 = -0.4142136... → -0.4142
The same procedure would apply to a decimal floating-point machine with (the same, for the sake of comparison) four-digit precision. In this case, four-digit precision refers to the decimal expansion of the mantissa. Thus, the same numbers would be chopped as follows:

π = (0.31415926...) × 10^1 → 0.3141 × 10^1
1 - √2 = (-0.4142136...) × 10^0 → -0.4142 × 10^0
The significant digits of a number begin with the leftmost nonzero digit in its fixed-point representation; that digit (known as the most significant digit) can be on either side of the fractional point. Thus, for example, the significant digits of π are

31415926...

The significant digits are the same as the fractional digits in the normalized floating-point representation.
Note that in the case of π, the floating-point machine produced one less significant digit than the fixed-point machine. This is consistent with the fact that fixed-point computation results in errors x̂ - x that are roughly of the same order of magnitude regardless of the order of magnitude of x; thus the relative error improves as the order of magnitude of x increases. Floating-point computation, on the other hand, maintains the same order of magnitude for the relative error throughout the range of values of x for which a normalized representation is possible, while the expected actual (i.e., not relative) error increases with |x|.
Rounding
Rounding is the approximation of x by the closest precise number. If x is midway between two precise numbers, then the number farther from the origin is chosen (i.e., the number with the larger absolute value). Rounding is illustrated in Figure 1.8.

In the fixed-point, four decimal-digit precision example discussed earlier, rounding (again denoted by "→") gives

π = 3.1415926... → 3.1416
1 - √2 = -0.4142136... → -0.4142;
Figure 1.8: Rounding (x ↦ x̂).
while in the floating-point case,

π = (0.31415926...) × 10^1 → 0.3142 × 10^1
1 - √2 = (-0.4142136...) × 10^0 → -0.4142 × 10^0
Note that in the case of π, rounding gave a different answer from chopping. Rounding is preferable, since the resulting absolute error |x̂ - x| is never larger than that of chopping. The error resulting from chopping is also biased: it is always negative for positive numbers and positive for negative numbers. This can lead to very large cumulative errors in situations where one sign (+ or -) dominates, e.g., adding a large number of positive variables scaled by positive coefficients.
1.2.3 Round-Off Errors in Finite-Precision Computation
Fact. Round-off errors are ubiquitous.
Example 1.2.1. Suppose we enter

x = 0.6

in MATLAB. In decimal form, this is an exact fraction (6/10), which is far easier to specify numerically than, say, π. Yet 6/10 does not have a terminating binary expansion, i.e., it is not an exact multiple of 2^{-l} for any integer l. As a result, MATLAB will round 0.6 to 53 significant binary digits (equivalent to roughly 16 decimal digits). The value x = 6/10 falls in the range [1/2, 1), where the spacing between precise numbers equals 2^{-53}. It follows that the round-off error in representing x will be less than 2^{-54} in absolute value, but it won't be zero.
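This round-off can be checked exactly in Python, whose floats are the same IEEE 754 doubles; `Fraction` recovers the stored value without introducing any further error:

```python
from fractions import Fraction

x = 0.6                               # stored as the nearest IEEE 754 double
err = Fraction(x) - Fraction(6, 10)   # exact signed round-off error
print(float(err))                     # small but nonzero

assert err != 0                       # 6/10 is not exactly representable
assert abs(err) < Fraction(1, 2**54)  # half the spacing 2**-53 in [1/2, 1)
```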
Fact. The order of computation affects the accumulation of round-off errors. As a rule of thumb, summands should be ordered by increasing absolute value in floating-point computation.
Example 1.2.2. Assume a decimal floating-point machine with four-digit precision. At every stage of the computation (including input and output), every number x is rounded to the form

sgn(x) · (0.d_1 d_2 d_3 d_4) × 10^E

where the d_i are decimal digits such that d_1 > 0 (for normalization), and E is assumed unrestricted.

Suppose that we are asked to compute

x_0 + x_1 + ... + x_10

where x_0 = 1 and x_1 = ... = x_10 = 0.0001. Note that the ratio of each of the other numbers to x_0 is 10^{-4}, which corresponds to four decimal digits. Also, each of the eleven numbers corresponds to a precise number, i.e., no round-off error is incurred in representing x_0, ..., x_10.

Clearly, the exact answer is 1.001, which is the precise four-digit floating-point number 0.1001 × 10^1. If we perform the summation in the natural order (i.e., that of the indices), then we have:

x_0 + x_1 = 1.0001 → 0.1000 × 10^1
0.1000 × 10^1 + x_2 = 1.0001 → 0.1000 × 10^1
...
0.1000 × 10^1 + x_10 = 1.0001 → 0.1000 × 10^1

Thus at each stage, four-digit precision produces the same result as adding zero to x_0. The final result is clearly unsatisfactory (even though the relative error is only of the order of 10^{-3}).

The exact answer can be obtained by summing x_1 through x_10 first, and then adding x_0:

x_1 + x_2 = 0.0002 = 0.2000 × 10^{-3}
0.2000 × 10^{-3} + x_3 = 0.0003 = 0.3000 × 10^{-3}
...
0.9000 × 10^{-3} + x_10 = 0.0010 = 0.1000 × 10^{-2}
0.1000 × 10^{-2} + x_0 = 0.1001 × 10^1

Note that no rounding was necessary at any point in the above computation.
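The effect of summation order can be reproduced with Python's `decimal` module set to four significant digits (the helper `chain_sum` is ours, and rounding half-up is an assumption about this hypothetical machine's rounding rule):

```python
from decimal import Decimal, Context, ROUND_HALF_UP

ctx = Context(prec=4, rounding=ROUND_HALF_UP)   # four significant digits

def chain_sum(terms):
    """Add terms left to right, rounding to 4 digits after every step."""
    total = Decimal(0)
    for t in terms:
        total = ctx.add(total, Decimal(t))
    return total

terms = ["1"] + ["0.0001"] * 10

print(chain_sum(terms))        # 1.000 -- the ten small terms are lost
print(chain_sum(terms[::-1]))  # 1.001 -- the exact answer
```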
Fact. Computing the difference of two quantities that are close in value may result in low precision. If higher precision is sought, a different algorithm may be required.
Example 1.2.3. Consider the computation of 1 - cos x for x = 0.02 on a floating-point machine with three-digit precision. The exact answer is

1 - cos(0.02) = 1 - 0.999800007... = (0.1999933...) × 10^{-3}

If cos(0.02) is first computed correctly and then rounded to three significant digits, the subtraction yields

0.100 × 10^1 - 0.100 × 10^1 = 0

which is clearly unsatisfactory. A different algorithm is needed here.

The Taylor series expansion for cos x gives

cos x = 1 - x^2/2! + x^4/4! - x^6/6! + ...

and thus

1 - cos x = x^2/2! - x^4/4! + x^6/6! - ...

Evaluating the first three terms in the second expansion with three-digit precision, we obtain:

x^2/2 = 0.200 × 10^{-3}
x^4/24 = 0.667 × 10^{-8}
x^6/720 = 0.889 × 10^{-13}

It is possible to estimate, using the remainder form of Taylor's theorem, the error involved in approximating 1 - cos x by the first three terms in the expansion. That estimate turns out to be much less than the smallest of the three terms. Since the ratio between consecutive terms is much larger than 10^3, summing with three-digit precision will result in an answer equal to the first term, namely 0.200 × 10^{-3}, while the error will be dominated by the second term. The relative error is thus roughly (0.667 × 10^{-8})/(0.200 × 10^{-3}) = 3.33 × 10^{-5}, which is very satisfactory considering the low precision involved in the computation.
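A rough Python model of the three-digit machine (the helper `round_sig` is ours and only approximates true three-digit arithmetic) exhibits both the cancellation and the series-based fix:

```python
import math

def round_sig(x, n=3):
    """Round x to n significant decimal digits -- a crude model of the
    three-digit floating-point machine in the example."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, n - 1 - e)

x = 0.02
naive = round_sig(1.0) - round_sig(math.cos(x))   # cos rounds to 1.00
series = round_sig(round_sig(x**2 / 2) - round_sig(x**4 / 24))

print(naive)    # 0.0 -- catastrophic cancellation
print(series)   # 0.0002, close to the exact 1.99993...e-4
```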
1.3 Review of Complex Numbers
Complex numbers are especially important in signal analysis because they lead to an efficient representation of sinusoidal signals, arguably the most important class of signals.
1.3.1 Complex Numbers as Two-Dimensional Vectors
A complex number z is a two-dimensional vector, or equivalently, a point (x, y) on the Cartesian plane.

Figure 1.9: Cartesian and polar representation of a complex number z.
The Cartesian coordinate pair (x, y) is also equivalent to the polar coordinate pair (r, θ), where r is the (nonnegative) length of the vector corresponding to (x, y), and θ is the angle of the vector relative to the positive real line. We have

x = r cos θ,  y = r sin θ,  r = √(x^2 + y^2)
As implied by the arrow in Figure 1.9, the angle θ takes values in [0, 2π). It is also acceptable to use angles outside that interval, by adding or subtracting integer multiples of 2π (this does not affect the values of the sine and cosine functions). In particular, angles in the range (π, 2π) (corresponding to points below the x-axis) are often quoted as negative angles in (-π, 0).
A formula for θ in terms of the Cartesian coordinates x and y is

θ = arctan(y/x) + (0 or π)

where π is added if and only if x is negative. Since the conventional range of the function arctan(·) is the interval [-π/2, π/2], the resulting value of θ is in the range [-π/2, 3π/2).
We also have the following terminology and notation in conjunction with the complex number z = (x, y) = (r, θ):

• x = Re{z}, the real part of z;
• y = Im{z}, the imaginary part of z;
• r = |z|, the modulus or magnitude of z;
• θ = ∠z, the phase or angle of z.

The complex conjugate of z is the complex number z* defined by

Re{z*} = Re{z} and Im{z*} = -Im{z}

Equivalently,

|z*| = |z| and ∠z* = -∠z
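These quantities map directly onto Python's built-in `complex` type, which can serve as a quick numerical check (note that `cmath.phase` returns the angle in (-π, π]):

```python
import cmath
import math

z = complex(-1/2, math.sqrt(3)/2)   # the z used in Example 1.3.1

print(abs(z))             # modulus |z| = 1 (up to round-off)
print(cmath.phase(z))     # angle = 2*pi/3
print(z.conjugate())      # the conjugate z*

assert math.isclose(abs(z), 1.0)
assert math.isclose(cmath.phase(z), 2 * math.pi / 3)
assert math.isclose(cmath.phase(z.conjugate()), -2 * math.pi / 3)
```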
Example 1.3.1. Consider the complex number

z = (-1/2, √3/2)

shown in the figure.

Figure (Example 1.3.1): the complex number z and its conjugate z* on the Cartesian plane.

We have Re{z} = -1/2, Im{z} = √3/2. The modulus of z equals

|z| = √((-1/2)^2 + (√3/2)^2) = 1

while the angle of z equals

∠z = arctan(-√3) + π = -π/3 + π = 2π/3

Also,

z* = (-1/2, -√3/2)

with

|z*| = 1 and ∠z* = -2π/3
1.3.2 The Imaginary Unit
A complex number is also represented as
z = x +jy
where j denotes the imaginary unit. The ‘+’ in the expression z = x + jy
represents the summation of the two vectors (x, 0) and (0, y), which clearly
yields (x, y) = z.
It follows easily that if c is a real-valued constant, then

cz = cx + j(cy)

Also, if z_1 = x_1 + jy_1 and z_2 = x_2 + jy_2, then

z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)

Finally, the complex conjugate of x + jy is given by

z* = x - jy

Note that

Re{z} = (z + z*)/2 and Im{z} = (z - z*)/(2j)
1.3.3 Complex Multiplication and its Interpretation
Unlike ordinary two-dimensional vectors, complex numbers can be multiplied together to yield another complex number (vector) on the Cartesian plane. This becomes possible by making the assignment

j^2 = -1

and applying the usual distributive and commutative laws of algebraic multiplication:

(a + jb)(c + jd) = ac + jad + jbc + j^2 bd = (ac - bd) + j(ad + bc)
This result becomes more interesting if we express the two complex numbers in terms of polar coordinates. Let

z_1 = r_1 (cos θ_1 + j sin θ_1) and z_2 = r_2 (cos θ_2 + j sin θ_2)

Then

z_1 z_2 = r_1 r_2 (cos θ_1 + j sin θ_1)(cos θ_2 + j sin θ_2)
        = r_1 r_2 [(cos θ_1 cos θ_2 - sin θ_1 sin θ_2) + j(cos θ_1 sin θ_2 + sin θ_1 cos θ_2)]
        = r_1 r_2 [cos(θ_1 + θ_2) + j sin(θ_1 + θ_2)]

where the last equality follows from standard trigonometric identities. Since a sine-cosine pair uniquely specifies an angle in [0, 2π), we conclude that

|z_1 z_2| = |z_1||z_2| and ∠(z_1 z_2) = ∠z_1 + ∠z_2
Thus, multiplication of two complex numbers viewed as vectors entails:
• scaling the length of either vector by the length of the other; and
• rotation of either vector through an angle equal to the angle of the
other.
If both complex numbers lie on the unit circle (given by the equation [z[ = 1),
then multiplication is a pure rotation, as shown in Figure 1.10.
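These two multiplication rules are easy to verify numerically; the following sketch (in Python rather than MATLAB, with arbitrarily chosen angles 0.7 and 1.1 rad) multiplies two unit-circle points and checks that the lengths multiply and the angles add:

```python
import cmath

# two unit-circle points with arbitrarily chosen angles
w = cmath.exp(1j * 0.7)
z = cmath.exp(1j * 1.1)

p = w * z
modulus = abs(p)        # lengths multiply: 1 * 1 = 1
angle = cmath.phase(p)  # angles add: 0.7 + 1.1 = 1.8 rad
```

Since both factors lie on the unit circle, the product is a pure rotation, exactly as in Figure 1.10.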
1.3.4 The Complex Exponential
The Taylor series expansions for sinθ and cos θ are given by
cos θ = 1 − θ^2/2! + θ^4/4! − ⋯
sin θ = θ − θ^3/3! + θ^5/5! − ⋯
A power series for cos θ+j sinθ can be obtained by combining the two Taylor
series:
cos θ + j sin θ = 1 + jθ − θ^2/2! − jθ^3/3! + θ^4/4! + jθ^5/5! − ⋯
Figure 1.10: Multiplication of two complex numbers w and z on the unit circle.
Starting with j^1, consecutive (increasing) powers of j cycle through the four values j, −1, −j and +1. Thus the power series above can be expressed as
1 + jθ + (jθ)^2/2! + (jθ)^3/3! + (jθ)^4/4! + (jθ)^5/5! + ⋯
which is recognizable as the Taylor series for e^x with the real-valued argument x replaced by the complex-valued argument jθ. This leads to the expression
e^{jθ} = cos θ + j sin θ
known as Euler’s formula. Changing the sign of θ yields the complex con-
jugate:
e^{−jθ} = cos(−θ) + j sin(−θ) = cos θ − j sin θ
By adding and subtracting the last two equations, we obtain
cos θ = (e^{jθ} + e^{−jθ})/2 and sin θ = (e^{jθ} − e^{−jθ})/(2j)
respectively.
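Euler's formula and the two identities above can be spot-checked numerically; a small Python sketch (the test angle 0.9 rad is an arbitrary choice):

```python
import cmath
import math

theta = 0.9  # arbitrary test angle

lhs = cmath.exp(1j * theta)                 # e^{j theta}
rhs = math.cos(theta) + 1j * math.sin(theta)

# the two identities derived by adding and subtracting the conjugate pair
cos_from_exp = (cmath.exp(1j*theta) + cmath.exp(-1j*theta)) / 2
sin_from_exp = (cmath.exp(1j*theta) - cmath.exp(-1j*theta)) / 2j
```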
We have therefore derived an alternative expression for the polar form
of a complex number z = (r, θ):
z = r e^{jθ} = |z| e^{j∠z}
Note that if z1 = |z1| e^{j∠z1} and z2 = |z2| e^{j∠z2}, then direct multiplication gives
z1 z2 = |z1| |z2| e^{j∠z1} e^{j∠z2} = |z1| |z2| e^{j(∠z1 + ∠z2)}
as expected.
1.3.5 Inverses and Division
The (multiplicative) inverse z^{−1} of z is defined by the relationship
z z^{−1} = 1
The inverse exists as long as z ≠ 0, and is easily obtained from the polar form of z:
z^{−1} = |z|^{−1} (e^{j∠z})^{−1} = |z|^{−1} e^{−j∠z}
Thus the inverse of z is a scaled version of its complex conjugate:
z^{−1} = |z|^{−2} z*
and in the special case where z lies on the unit circle, z^{−1} = z*.
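The conjugate formula for the inverse is easy to verify numerically; a short Python sketch (the test values 3 + 4j and 0.6 + 0.8j are arbitrary):

```python
z = 3 + 4j                            # arbitrary nonzero test value, |z| = 5
z_inv = z.conjugate() / abs(z)**2     # inverse as a scaled conjugate
check = z * z_inv                     # should equal 1

# on the unit circle the inverse reduces to the conjugate itself
u = complex(0.6, 0.8)                 # |u| = 1
u_inv = u.conjugate() / abs(u)**2
```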
For the Cartesian form of the inverse of z = x + jy, we have
z^{−1} = |z|^{−2} z* = (x − jy)/(x^2 + y^2)
As in the case of real numbers, the division of z1 by z2 can be defined in terms of a product and an inverse:
z1/z2 = z1 · z2^{−1}
Example 1.3.2. If z1 = √3 − j and z2 = 1 + j, then
z1/z2 = (√3 − j)/(1 + j) = (√3 − j)(1 − j)/(1^2 + 1^2) = (√3 − 1)/2 − j(√3 + 1)/2
For the second expression on the right-hand side, we used the Cartesian form for the inverse of z2 = 1 + j. Alternatively, one can multiply both numerator and denominator by z2* = 1 − j.
In polar form, z1 = 2 e^{j(−π/6)} and z2 = √2 e^{j(π/4)}, thus
z1/z2 = (2/√2) e^{j(−π/6−π/4)} = √2 e^{j(−5π/12)}
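Both forms of the quotient in Example 1.3.2 can be checked against direct division; a Python sketch:

```python
import cmath
import math

z1 = math.sqrt(3) - 1j
z2 = 1 + 1j

q = z1 / z2                                                # direct division
q_cart = (math.sqrt(3) - 1)/2 - 1j*(math.sqrt(3) + 1)/2    # Cartesian answer
q_polar = math.sqrt(2) * cmath.exp(-5j * math.pi / 12)     # polar answer
```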
1.4 Sinusoids in Continuous Time
1.4.1 The Basic Cosine and Sine Functions
Let θ be a real-valued angle, measured in radians (recall that π rad is the same as 180°). The basic sinusoidal functions are cos θ and sin θ, shown in
Figure 1.11.
Figure 1.11: The cosine (left) and sine (right) functions.
Our ﬁrst observation is that both cos θ and sinθ are periodic with period
equal to 2π. This means that for any positive or negative integer k,
cos(θ + 2kπ) = cos θ
sin(θ + 2kπ) = sinθ
Our second observation is that the cosine function is symmetric, or more
precisely, even-symmetric about the origin, while the sine function is anti-
symmetric (or odd-symmetric):
cos(−θ) = cos θ
sin(−θ) = −sin θ
Our third observation is that either function can be obtained from the
other by means of a shift in the variable θ. The sine function is obtained
from the cosine by a delay in θ equal to π/2 radians; equivalently, the cosine
is obtained from the sine by an advance in θ. This follows also from the
trigonometric identity
sin θ = cos(π/2 − θ)
and the fact that cos θ = cos(−θ). We thus have
sin θ = cos(θ − π/2)
cos θ = sin(θ + π/2)
1.4.2 The General Real-Valued Sinusoid
The shift properties discussed above allow us to express sinusoidal signals in
terms of either the sine or the cosine function. We obtain the general form
for the real-valued sinusoid by modifying the function cos θ in three ways:
1. Vertical Scaling. Multiplication of cos θ by a constant A yields an amplitude (peak value) equal to |A|. It is customary to use positive values for A. A negative A would result in a sign inversion, which is better described using a shift in θ:
−cos θ = cos(θ + π)
2. Horizontal Scaling. The horizontal scale can be changed by making θ a
linear function of another variable; time t is often chosen for that purpose.
We thus have, for Ω ≥ 0,
θ = θ(t) = Ωt
and the resulting sinusoid is
x(t) = Acos Ωt
The scale parameter Ω is known as the angular frequency, or radian frequency, of the signal x(t). If t is measured in seconds, then Ω is measured in radians per second (rad/sec).
A full cycle (period) of cos θ lasts 2π radians, or equivalently, 2π/Ω seconds. It follows that the period T of x(t) is
T = 2π/Ω (seconds)
The (cyclic) frequency of x(t) is the number of cycles per unit time, i.e.,
f = 1/T = Ω/2π
The unit of f is cycles/second, or Hertz (Hz). We thus have
Ω = 2πf
In Figure 1.12, Ω2 > Ω1, and thus T2 < T1.
Figure 1.12: Sinusoids of two different frequencies.
3. Horizontal Shifting. The introduction of a phase shift angle φ in the
argument of the cosine results in
x(t) = Acos(Ωt +φ)
which is the most general form for the real-valued sinusoid. A positive value
of φ represents a phase (angle) advance of φ/2π cycles, or a time advance of
φ/Ω seconds; a negative value of φ represents a delay. Clearly, x(t) can be
also expressed in terms of the sine function, using an additional phase shift
(see the discussion earlier):
sin(Ωt + φ) = cos(Ωt + φ − π/2)
cos(Ωt + φ) = sin(Ωt + φ + π/2)
Example 1.4.1. Let us sketch the signal
x(t) = 3 cos(10πt + 3π/4)
showing all features of interest such as the signal amplitude; the initial value
of the signal (t = 0); and the positions of zeros, maxima and minima.
First, we draw one cycle of the cosine function (peak to valley to peak)
without marking the time origin. The duration of that cycle is the period
T of x(t), which is obtained from Ω (the coeﬃcient of t):
Ω = 10π rad/sec ⇒ f = 5 Hz ⇒ T = 0.2 sec
The cycle drawn starts on a peak, which would be the value of the signal
at t = 0 if the phase shift were equal to 0. In this case, the phase shift
equals 3π/4, which is equivalent to 3/8 cycles. Thus the correct placement
of the time origin (t = 0) is to the right of the leftmost peak, at a distance
corresponding to 3/8 cycles, or 0.075 seconds. The value of the signal at
t = 0 is
3 cos(3π/4) = −3√2/2 = −2.121
The positions of the zeros, maxima and minima of x(t) are now marked
relative to the origin. Note that these occur at quarter-cycle intervals, i.e.,
every 0.05 seconds. Since the phase shift at t = 0 is 3/8 cycles, the ﬁrst
such point on the positive t-axis will occur 1/8 cycles later (i.e., at t = 0.025
seconds), and will be a minimum.
(Sketch for Example 1.4.1: one period T = 0.2 sec with peaks at ±3, initial value −2.121 at t = 0, and zeros/extrema marked at t = −0.075, −0.025, 0.025, 0.075 and 0.125 sec.)
More cycles of the signal can then be added (e.g., the dotted line in the
ﬁgure).
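The quantities worked out in Example 1.4.1 can be reproduced numerically; a Python sketch:

```python
import math

A, Omega, phi = 3.0, 10*math.pi, 3*math.pi/4

T = 2*math.pi/Omega             # period: 0.2 sec
x0 = A*math.cos(phi)            # initial value: about -2.121
# first zero/extremum on the positive t-axis: the argument reaches pi there,
# so it is a minimum
t_min = (math.pi - phi)/Omega   # 0.025 sec
x_min = A*math.cos(Omega*t_min + phi)
```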
1.4.3 Complex-Valued Sinusoids and Phasors
In our discussion of complex numbers, we established Euler’s formula
e^{jθ} = cos θ + j sin θ
and obtained the identities
cos θ = (e^{jθ} + e^{−jθ})/2 and sin θ = (e^{jθ} − e^{−jθ})/(2j)
It is clear from the first of the two identities that cos θ can be obtained from the two complex numbers e^{jθ} and e^{−jθ} (both on the unit circle) by projecting either number onto the real axis, or, equivalently, by averaging the two numbers. Analogous conclusions can be drawn about the sine function, using the imaginary axis and a slightly different linear combination.
The real-valued sinusoid
x(t) = Acos(Ωt +φ)
can also be obtained from the complex-valued sinusoid
z(t) = A e^{j(Ωt+φ)}
using
x(t) = Re{A e^{j(Ωt+φ)}}
or, equivalently,
x(t) = (A e^{j(Ωt+φ)} + A e^{−j(Ωt+φ)})/2 = (z(t) + z*(t))/2
To interpret the above relationships, think of z(t) and z*(t) as two points on the circle |z| = A, with initial angular positions (at t = 0) given by φ and −φ. The two points rotate in opposite directions on the circle; their angular velocities are equal in magnitude. The real-valued sinusoid x(t) is obtained by either projecting z(t) onto the real axis, or taking the midpoint of the segment joining z(t) and z*(t), as shown in Figure 1.13.
The foregoing discussion used the concept of negative angular frequency
to describe an angle which is decreasing linearly in time. Negative frequen-
cies are of no particular value in expressing real sinusoids, since
Acos(−Ωt +φ) = Acos(Ωt −φ) ,
i.e., changing the sign of Ω is no diﬀerent from changing the sign of the phase
angle φ. Using negative frequencies for complex sinusoids, on the other hand,
allows us to linearly combine such complex signals to obtain a real-valued
sinusoid. This concept will be further discussed in signal analysis.
Figure 1.13: Rotating phasor.
A complex sinusoid such as z(t) is also known as a rotating phasor. Its
initial position at t = 0 is referred to as a stationary phasor. As expected,
A e^{j(Ωt+φ)} = A e^{jφ} e^{jΩt}
i.e., the rotating phasor is obtained from the stationary phasor by rotation
through an angle Ωt.
Stationary phasors are particularly useful in depicting the relative phases
of sinusoidal signals of the same frequency, such as voltages and currents
in an AC (alternating current) circuit. One particular application involves
sums of such signals:
Fact. If x1(t), . . . , xM(t) are real sinusoids given by
xm(t) = Am cos(Ωt + φm)
then
x1(t) + ⋯ + xM(t) = A cos(Ωt + φ)
where
A e^{jφ} = Σ_{m=1}^{M} Am e^{jφm}
The fact that a sum of sinusoids of the same frequency is also sinusoidal
is not at all obvious. To prove it, note ﬁrst that
xm(t) = Re{Am e^{j(Ωt+φm)}}
so that
Σ_{m=1}^{M} xm(t) = Σ_{m=1}^{M} Re{Am e^{j(Ωt+φm)}}
Since Re{z1 + z2} = Re{z1} + Re{z2}, it follows that
Σ_{m=1}^{M} xm(t) = Re{Σ_{m=1}^{M} Am e^{j(Ωt+φm)}}
= Re{(Σ_{m=1}^{M} Am e^{jφm}) e^{jΩt}}
= Re{A e^{j(Ωt+φ)}}
= A cos(Ωt + φ)
where
A = |Σ_{m=1}^{M} Am e^{jφm}| and φ = ∠(Σ_{m=1}^{M} Am e^{jφm})
i.e.,
A e^{jφ} = Σ_{m=1}^{M} Am e^{jφm}
One way of interpreting this result is that the stationary phasor A e^{jφ} corresponding to the sum signal x(t) is just the (vector) sum of the stationary phasors of the components xm(t). The same, of course, holds for the rotating phasors.
Example 1.4.2. To express
x(t) = 3 cos(10πt + 3π/4) + 5 sin(10πt + π/6)
as a single sinusoid, we ﬁrst write the second term as
5 cos(10πt + π/6 − π/2) = 5 cos(10πt − π/3)
The stationary phasor for the sum signal x(t) is given by
3 e^{j(3π/4)} + 5 e^{−j(π/3)} = 0.3787 − j2.209 = 2.241 e^{−j1.401}
Thus
x(t) = 2.241 cos(10πt − 1.401)
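The numbers in Example 1.4.2 can be verified by computing the phasor sum directly and comparing the waveforms at a few arbitrarily chosen time instants; a Python sketch:

```python
import cmath
import math

# stationary phasors of the two terms (the sine term rewritten as a cosine)
phasor = 3*cmath.exp(3j*math.pi/4) + 5*cmath.exp(-1j*math.pi/3)
A = abs(phasor)             # about 2.241
phi = cmath.phase(phasor)   # about -1.401

# spot-check the waveform identity at a few time instants
max_err = max(
    abs(3*math.cos(10*math.pi*t + 3*math.pi/4)
        + 5*math.sin(10*math.pi*t + math.pi/6)
        - A*math.cos(10*math.pi*t + phi))
    for t in [0.0, 0.013, 0.05, 0.111]
)
```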
1.5 Sinusoids in Discrete Time
1.5.1 Discrete-Time Signals
A discrete-time, or discrete-parameter, signal is a sequence of real or complex numbers indexed by a discrete variable (or parameter) n, which takes values in a set of integers I. Examples of I include infinite sets such as Z (all integers), N (positive integers), etc., as well as finite sets such as {1, . . . , M}.
In the latter case, the signal is also called a vector.
In what follows, we will assume that the discrete (time) parameter n takes
values over all integers, i.e., n ∈ Z. The notation for a discrete-time signal
involves square brackets around n, to diﬀerentiate it from a continuous-time
signal:
s[n], n ∈ Z distinct from s(t), t ∈ R
Figure 1.14: A discrete-time signal.
One important note here is that n has no physical dimensions, i.e., it is a
pure integer number; this is true even when s[n] represents samples of a
continuous-parameter signal whose value varies in time, space, etc.
1.5.2 Deﬁnition of the Discrete-Time Sinusoid
A sinusoid in discrete time is deﬁned as in continuous time, by replacing the
continuous variable t by the discrete variable n:
x[n] = A cos(ωn + φ)
y[n] = A sin(ωn + φ)
z[n] = A e^{j(ωn+φ)} = x[n] + jy[n]
Again, we assume A > 0. We will use the lower-case letter ω for angular
frequency in discrete time because it is diﬀerent from the parameter Ω used
for continuous-time sinusoids. The diﬀerence is in the units: t is measured
in seconds, hence Ω is measured in radians per second; while n has no units,
hence ω is just an angle, measured in radians or radians per sample.
Example 1.5.1. If ω = 0, then
x[n] = A cos φ, y[n] = A sin φ, and z[n] = A e^{jφ}
i.e., all three signals are constant in value.
Example 1.5.2. If ω = π, then
z[n] = A e^{j(πn+φ)} = A (e^{jπ})^n e^{jφ} = A(−1)^n e^{jφ}
since e^{jπ} = −1. Thus also
x[n] = A(−1)^n cos φ and y[n] = A(−1)^n sin φ
The signal x[n] is shown in the ﬁgure. Note the oscillation between the
two extreme values +Acos φ and −Acos φ, which is the fastest possible
for a discrete-time sinusoid. As we shall see soon, ω = π rad/sample is,
eﬀectively, the highest possible frequency for a discrete-time sinusoid.
(Example 1.5.2: x[n] shown for ω = 0, constant at A cos φ, and for ω = π, alternating between A cos φ and −A cos φ.)
Example 1.5.3. The discrete-time sinusoid
x[n] = √3 cos(πn/3 − π/6)
is shown in the ﬁgure (on the left). The graph was generated using the stem
command in MATLAB:
n = -2:8;
x = sqrt(3) * cos((pi/3)*n -(pi/6));
stem(n,x)
Note that the signal is periodic—it repeats itself every 6 samples. The angle
in the argument of the cosine is marked on the unit circle (shown on the
right), for n = 0, ..., 5.
(Example 1.5.3: stem plot of x[n] for n = −2, . . . , 8 (left); the angles πn/3 − π/6 marked on the unit circle for n = 0, . . . , 5 (right).)
1.5.3 Equivalent Frequencies for Discrete-Time Sinusoids
As we mentioned earlier, the angular frequency parameter ω, which appears
in the argument
ωn +φ
of a discrete-time sinusoid, is an angle. It actually acts as an angle in-
crement: With each sample, the angle in the argument of the sinusoid is
incremented by ω.
Since the sinusoids sin θ, cos θ and e^{jθ} are all periodic in θ (with period
are all periodic in θ (with period
2π), it is conceivable that two diﬀerent values of ω may result in identical
sinusoidal waveforms. To see that this is indeed the case, consider
z[n] = A e^{j(ωn+φ)}
We know that e^{jθ} = e^{jψ} if and only if the angles θ and ψ correspond to the same point on the unit circle, i.e.,
ψ = θ + 2kπ
where k is an integer. If we replace ω by ω + 2kπ in the equation for z[n],
we obtain a new signal v[n] given by
v[n] = A e^{j((ω+2kπ)n+φ)} = A e^{j(ωn+φ+2knπ)} = A e^{j(ωn+φ)} = z[n]
where the angle argument was reduced by an integer (= kn) multiple of 2π
without aﬀecting the value of the sinusoid. Thus the signals z[n] and v[n]
are identical.
Definition 1.5.1. The angular frequencies ω and ω′ are equivalent for complex-valued sinusoids in discrete time if
ω′ = ω + 2kπ, (k ∈ Z)
Equivalent frequencies result in identical complex sinusoids, provided
the initial phase shifts φ are equal. Thus the eﬀective range of frequen-
cies for complex sinusoids can be taken as (0, 2π] or (−π, π] rad/sample,
corresponding to one full rotation on the unit circle.
Example 1.5.4. The three complex sinusoids
z^{(1)}[n] = exp(j(−πn/6 + φ))
z^{(2)}[n] = exp(j(11πn/6 + φ))
z^{(3)}[n] = exp(j(23πn/6 + φ))
all represent the same signal (in discrete time).
The notion of equivalent frequency also applies to the real-valued sinusoid
x[n] = A cos(ωn + φ)
since x[n] = Re{z[n]}. Thus two frequencies which differ by an integral multiple of 2π will result in identical real-valued sinusoids provided the initial phase shifts φ are the same.
Moreover, for the real-valued sinusoid above, it is fairly easy to show that using ω′ = −ω + 2kπ instead of ω will result in an identical signal provided the sign of φ is also reversed. Let
u[n] = A cos((−ω + 2kπ)n − φ)
Then, recalling the identity
cos(−θ) = cos θ
we have
u[n] = A cos((−ω + 2kπ)n − φ)
= A cos(ωn + φ − 2kπn)
= A cos(ωn + φ)
= x[n]
We also note that no sign reversal is needed for φ if ω = 0 or ω = π.
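The identity u[n] = x[n] derived above can be spot-checked numerically; a Python sketch with arbitrarily chosen values ω = 2, φ = 0.6 and k = 1:

```python
import math

omega, phi, k = 2.0, 0.6, 1   # arbitrary test values

# compare x[n] = cos(omega*n + phi) with the sign-reversed-phase sinusoid
# at the equivalent frequency -omega + 2*k*pi
max_diff = max(
    abs(math.cos(omega*n + phi)
        - math.cos((-omega + 2*k*math.pi)*n - phi))
    for n in range(-10, 11)
)
```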
Definition 1.5.2. The angular frequencies ω and ω′ are equivalent for real-valued sinusoids in discrete time if
ω′ = ±ω + 2kπ, (k ∈ Z)
This further reduces the eﬀective range of frequency for real sinusoids to
the interval [0, π]. This is illustrated in Figure 1.15.
Figure 1.15: Two equivalent frequencies for real sinusoids (left); effective range of frequencies for real sinusoids (right).
Example 1.5.5. The three real sinusoids
x^{(1)}[n] = cos(−πn/6 + φ)
x^{(2)}[n] = cos(11πn/6 + φ)
x^{(3)}[n] = cos(23πn/6 + φ)
all represent the same signal, which is also given by (note the sign reversal in the phase):
cos(πn/6 − φ) = cos(13πn/6 − φ) = cos(25πn/6 − φ)
1.5.4 Periodicity of Discrete-Time Sinusoids
While continuous-time sinusoids are always periodic with period T = 2π/Ω,
periodicity is the exception (rather than the rule) for discrete-time sinusoids.
First, a refresher on periodicity:
Deﬁnition 1.5.3. A discrete-time signal s[n] is periodic if there exists an
integer N > 0 such that for all n ∈ Z,
s[n +N] = s[n]
The smallest such N is called the (fundamental) period of s[n].
For z[n] = A e^{j(ωn+φ)}, the condition
, the condition
(∀n) z[n +N] = z[n]
holds for a given N > 0 if and only if
ω(n +N) +φ = ωn +φ + 2kπ
which reduces to
ω = (k/N) · 2π
Thus z[n] is periodic if and only if its frequency is a rational (i.e., ratio
of two integers) multiple of 2π. The period (smallest such N) is found by
cancelling out common factors between the numerator k and denominator
N.
The same condition for periodicity can be obtained for the real-valued
sinusoid x[n] = Acos(ωn +φ).
Example 1.5.6. The signal
x[n] = cos(10n + 0.2)
is not periodic, while
s[n] = cos(13πn/15 + 0.4)
is. Since
13π/15 = (13/30) · 2π
(note that there are no common factors in the numerator and the denominator of the last fraction), the period of s[n] is N = 30.
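The period found in Example 1.5.6 can be confirmed by searching for the smallest N such that s[n + N] = s[n]; a Python sketch:

```python
import math

def s(n):
    # s[n] = cos(13*pi*n/15 + 0.4); omega/(2*pi) = 13/30 in lowest terms
    return math.cos(13*math.pi*n/15 + 0.4)

def is_period(N, terms=90):
    # check s[n + N] = s[n] over a range of n
    return all(abs(s(n + N) - s(n)) < 1e-9 for n in range(terms))

period = next(N for N in range(1, 100) if is_period(N))
```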
1.6 Sampling of Continuous-Time Sinusoids
1.6.1 Sampling
Sampling is the process of recording the values taken by a continuous-time
signal at discrete sampling instants. Uniform sampling is most commonly used: the sampling instants are Ts seconds apart in time, where Ts is the sampling period. The sampling rate, or sampling frequency, fs is the reciprocal of the sampling period:
fs = 1/Ts (samples/sec)
If the continuous-time signal is x(t), the discrete-time signal resulting from sampling is
x[n] = x(nTs)
Figure 1.16: Sampling every two seconds (i.e., Ts = 2.0).
Sampling is one of two main components of analog-to-digital (A-to-D) conversion, the other being quantization. The reverse process of digital-to-analog (D-to-A) conversion involves interpolation, namely reconstructing the continuous-time signal from the discrete (and quantized) samples. The sampling rate fs determines, to a large extent, the quality of the interpolated analog signal. If the sampling rate is high enough, the samples will contain sufficient information about the sampled analog signal to allow accurate interpolation; otherwise, fine detail will be lost and the reconstructed signal will be a poor replica of the original. The minimum sampling rate fs required for accurate interpolation is known as the Nyquist rate, which depends on the range of frequencies spanned by the original signal (in terms of its sinusoidal components). Although the theory behind the Nyquist rate is beyond the scope of this chapter, the necessity of sampling at the Nyquist rate (or higher) can be seen by considering simple properties of sampled sinusoidal signals.
1.6.2 Discrete-Time Sinusoids Obtained by Sampling
Sampling a continuous-time sinusoid produces a discrete-time sinusoid. Con-
sider, for example, the real-valued sinusoid
x(t) = Acos(Ωt +φ)
which has period T = 2π/Ω and cyclical frequency f = Ω/2π. We have
x[n] = A cos(ΩnTs + φ) = A cos(ωn + φ)
where
ω = ΩTs = 2πf/fs
Note that, as expected, ω has no physical dimensions, i.e., it is just an angle.
It can also be expressed as
ω = 2π(Ts/T) = 2π(f/fs)
Thus the frequency of the discrete-time sinusoid obtained by sampling a continuous-time sinusoid is directly proportional to the sampling period Ts, or equivalently, inversely proportional to the sampling frequency fs.
Suppose Ts is small in relation to T, or equivalently, fs is high compared to f. Since many samples are taken in each period, the sample values will vary slowly, and thus the frequency of the discrete-time signal will be low. As Ts increases (i.e., fs decreases), the frequency ω of the discrete-time sinusoid also increases. Recall, however, that there is an upper limit, equal to π rad/sample, on the effective frequency of a real-valued discrete-time sinusoid. That value will be attained for
Ts = π/Ω = T/2,
(equivalently, for fs = 2f). When Ts exceeds T/2, the effective frequency of the resulting samples will decrease with Ts. This is consistent with the fact that
π + δ and π − δ = −(π + δ) + 2π
are equivalent frequencies. The effective frequency will become zero when Ts = T (or fs = f), and then it will start increasing again, etc.
Example 1.6.1. Consider the continuous-time sinusoid
x(t) = cos(400πt + 0.8)
which has period T = 5 ms and frequency f = 200 Hz. Suppose the following
three sampling rates are used: 1,000, 500, and 250 samples/sec.
In the first case, Ts = 1 ms and ω = (400π)/1000 = 2π/5. The resulting signal is
x1[n] = cos(2πn/5 + 0.8)
In the second case, Ts = 2 ms and ω = 4π/5. The resulting signal is
x2[n] = cos(4πn/5 + 0.8)
In the third case, Ts = 4 ms and ω = 8π/5. The resulting signal is
x3[n] = cos(8πn/5 + 0.8) = cos(2πn/5 − 0.8)
i.e., it is a sinusoid of the same frequency as x1[n] (but with a different phase shift).
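The third case of Example 1.6.1 can be verified by comparing the samples of x(t) taken at fs = 250 samples/sec with the equivalent low-frequency form; a Python sketch:

```python
import math

f, phi, fs = 200.0, 0.8, 250.0   # third case of Example 1.6.1

# x[n] = x(n/fs) should coincide with cos(2*pi*n/5 - 0.8)
max_diff = max(
    abs(math.cos(2*math.pi*f*(n/fs) + phi)   # sample of x(t) at t = n/fs
        - math.cos(2*math.pi*n/5 - phi))     # equivalent low-frequency form
    for n in range(40)
)
```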
Example 1.6.2. The graphs show the continuous-time sinusoid A cos(2πft) sampled at fs = 8f (left) and fs = (8f)/7 (right). The sample sequence on the right is formed by taking every seventh sample from the sequence on the left. Clearly, the two sequences are identical, and are given by the discrete-time sinusoid A cos(πn/4).
1.6.3 The Concept of Alias
The notion of equivalent frequency was implicitly used in Example 1.6.1 to
show that a given continuous-time sinusoid can be sampled at two diﬀerent
sampling rates to yield discrete sinusoids of the same frequency (diﬀering
possibly in the phase). If we were to plot the two sample sequences using an
actual time axis, i.e., t instead of n (as in the graphs of Example 1.6.2), we
would observe that one sequence of samples contains a lot more information
about the analog signal. In the case of a sinusoidal signal, most of this
information is redundant: as few as three samples may suﬃce to determine
the unknown parameters Ω, φ and A (only two samples if the frequency Ω
is known), and hence to interpolate the signal perfectly. On the other hand,
(Graphs for Example 1.6.2.)
for an arbitrary signal x(t) that is neither sinusoidal nor (more generally)
periodic, interpolation from a ﬁnite number of samples is impossible; for such
a signal, a higher sampling rate is likely to result in better (more faithful)
reconstruction.
Most analog (i.e., continuous-time) signals encountered in practice can
be formed by summing together sinusoids of diﬀerent frequencies and phase
shifts. In most cases, the frequencies involved span an entire (continuous)
interval or band, in which case integration is more appropriate than summa-
tion. The details of the sinusoidal representation of continuous-time signals
(known as the Fourier transform) are not essential at this point. To appreci-
ate a key issue which arises in sampling analog signals, it suﬃces to consider
the simple class of signals which are sums of ﬁnitely many sinusoids.
Consider again the sinusoid
x(t) = Acos(Ωt +φ)
sampled at a rate fs which is fixed (unlike the situation discussed in the previous subsection). The resulting discrete-time sinusoid x[n] has frequency
ω = ΩTs = 2π(f/fs).
Consider also a second sinusoid
x′(t) = A′ cos(Ω′t + φ′)
sampled at the same rate, resulting in x′[n] having frequency ω′ = Ω′Ts = 2π(f′/fs).
The two frequencies ω and ω′ are equivalent (allowing both x[n] and x′[n] to be expressed in terms of the same frequency) provided that
ω′ = ±ω + 2kπ, k integer
This is equivalent to
f′/fs = ±f/fs + k
or, simply,
f′ = ±f + kfs
for some integer k.
Frequencies f and f′ (in Hz) satisfying the above relationship are known as aliases of each other with respect to the sampling rate fs (in samples/second). Continuous-time sinusoids at frequencies related in that manner will, when sampled at the appropriate rate, yield discrete-time samples having the same (effective) frequency.
Example 1.6.3. Consider again Example 1.6.1, where f = 200 Hz.
If fs = 1,000 Hz, the aliases of f are at
f′ = 200 + k(1000) and f′ = −200 + k(1000) Hz
If fs = 500 Hz, the aliases of f are at
f′ = 200 + k(500) and f′ = −200 + k(500) Hz
If fs = 250 Hz, the aliases of f are at
f′ = 200 + k(250) and f′ = −200 + k(250) Hz
The aliases (in Hz) are plotted for each sampling rate. Clearly, only positive frequencies are of interest, and thus it suffices to consider k ≥ 0 in both series. In the first two cases, the lowest (positive) alias is f itself. In the third case, the lowest alias is at −200 + 250 = 50 Hz.
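The alias listings in Example 1.6.3 can be generated mechanically from f′ = ±f + kfs; a Python sketch (the helper `aliases` and the cutoff `k_max = 6` are illustrative choices, not part of the text):

```python
def aliases(f, fs, k_max=6):
    # nonnegative frequencies f' = f + k*fs and f' = -f + k*fs, k = 0..k_max
    vals = {s*f + k*fs for s in (1, -1) for k in range(k_max + 1)}
    return sorted(v for v in vals if v >= 0)

lowest_1000 = aliases(200, 1000)[0]   # f itself
lowest_500 = aliases(200, 500)[0]     # f itself
lowest_250 = aliases(200, 250)[0]     # -200 + 250 = 50 Hz
```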
Example 1.6.4. The graphs illustrate how three continuous-time sinusoids with frequencies f = 0, f = fs and f = 2fs can yield the same discrete-time sequence with ω = 0.
(Example 1.6.3: aliases of f = 200 Hz plotted for sampling rates of 1,000, 500 and 250 samples/sec. Example 1.6.4: the three sinusoids and their common samples.)
1.6.4 The Nyquist Rate
The term aliasing is used in reference to sampling continuous-time signals
which are expressible as sums of sinusoids of diﬀerent frequencies (most
signals fall in that category). Aliasing occurs in sampling a continuous-time
signal x(t) if two (or more) sinusoidal components of x(t) are aliases of each
other with respect to the sampling rate.
For example, consider
x(t) = cos Ωt + 4 cos Ω′t
where Ω and Ω′ are aliases with respect to the sampling rate fs. The resulting sequence of samples can be expressed as
x[n] = cos ωn + 4 cos ω′n
Since ω and ω′ are equivalent frequencies (and no phase shifts are involved here), we can also write
x[n] = 5 cos ωn
Reconstructing x(t) from the samples x[n] without knowing the amplitudes of the two components cos Ωt and cos Ω′t is a hopeless task: the same sequence of samples could have been obtained from, e.g.,
u(t) = 2 cos Ωt + 3 cos Ω′t or v(t) = −cos Ωt + 6 cos Ω′t
which are very diﬀerent signals from x(t). This shows convincingly that
aliasing is undesirable in sampling signals composed of sinusoids. Thus
absence of aliasing is a necessary condition for the faithful reconstruction of
the sampled signal.
As we mentioned earlier, typical analog signals consist of a continuum of
sinusoids spanning an entire interval, or band, of frequencies; these sinusoidal
components are “summed” by an integral over frequency (rather than by a
conventional sum). The eﬀect of aliasing is the same whether we sum or
integrate over frequency: components at frequencies that are aliases of each
other cannot be resolved on the basis of the samples obtained, hence the
amplitude and phase information needed to incorporate those components
into the reconstructed signal will be unavailable.
If the sinusoidal components of x(t) have frequencies which are limited to the frequency band [0, fB] Hz (where fB is the signal bandwidth), aliasing is avoided if no frequency f in that band has an alias f′ = ±f + kfs (excluding itself) in the same band. With the aid of Figure 1.17, we will determine the minimum sampling rate fs such that no aliasing occurs.
Figure 1.17: Illustration of the Nyquist condition: −fB + fs > fB, i.e., fs > 2fB.
Consider the alias relationship f′ = f + kfs first. Each k corresponds to an interval [kfs, fB + kfs] of aliases of frequencies in [0, fB] (top graph). No aliasing means that neither of the intervals [−fs, fB − fs] and [fs, fB + fs] (corresponding to k = −1 and k = 1, respectively) overlaps with [0, fB]. This is ensured if and only if fs > fB.
Next, consider the alias relationship f′ = −f + kfs, which gives aliases of [0, fB] in the interval [−fB + kfs, kfs] (bottom graph). No aliasing means that the interval [−fB + fs, fs] does not overlap with [0, fB], which is ensured if and only if fs > 2fB.
Thus aliasing is avoided if fs > 2fB. In other words, the sampling rate must be greater than twice the bandwidth of the analog signal. This minimum rate is known as the Nyquist rate. Sampling at a rate higher than the Nyquist rate is a necessary condition for the faithful reconstruction of the signal x(t) from its samples x[n].
It turns out that the condition fs > 2fB is also sufficient for the complete recovery of the analog signal from its samples. The proof of this fact requires more advanced tools in signal analysis, and will be outlined in the epilogue following Chapter 4.
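The condition fs > 2fB can be probed numerically by scanning a grid of frequencies in [0, fB] for in-band aliases; a Python sketch (the helper `has_aliasing`, the grid size and the range of k are illustrative choices):

```python
def has_aliasing(f_B, fs, grid=500, k_max=4):
    # does some f in [0, f_B] have an alias f' = +/-f + k*fs (f' != f)
    # that also lies inside [0, f_B]?
    for i in range(grid + 1):
        f = f_B * i / grid
        for k in range(-k_max, k_max + 1):
            for s in (1, -1):
                fp = s*f + k*fs
                if 0 <= fp <= f_B and abs(fp - f) > 1e-9:
                    return True
    return False

no_alias = not has_aliasing(100.0, 201.0)   # fs > 2*f_B: no aliasing
alias = has_aliasing(100.0, 150.0)          # fs < 2*f_B: aliasing occurs
```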
Problems
Section 1.1
P 1.1. What is the IEEE 754 double-precision encoding (in hexadecimal form) of the decimal numbers 0.125 and −15.75? You can use MATLAB (format hex) to verify your answer.
P 1.2. Type help round and help fix in MATLAB and read the description
of these two functions. If x is an arbitrary number and m is a positive integer,
explain in your own words the eﬀect of using the MATLAB commands
10^(-m)*round(x*10^m)
and
10^(-m)*fix(x*10^m)
P 1.3. A terminating decimal expansion may correspond to an inﬁnite (non-
terminating) binary expansion. Such is the case with the decimal num-
ber 0.1, which is rounded in MATLAB—i.e., it is represented by a dif-
ferent precise binary number a. What is the value of the error a − 0.1?
(Hint: using format hex, you can obtain the hexadecimal representation
3FB999999999999A for a. You may ﬁnd the geometric sum formula useful
here.)
Section 1.2
P 1.4. (i) Using your calculator, compute the value of
y = 1 − (sin x)/x
for x = 0.05 radians, rounding to four significant digits at the end. Denote your answer by y1.
(ii) Repeat the calculation of y, rounding to four significant digits at each stage. Denote your answer by y2.
(iii) Compute the value of y using the expansion
sin x = x − x^3/6 + x^5/120 − ⋯
rounding to four significant digits at each stage. Denote your answer by y3.
P 1.5. (i) Let x = 0.02. Using the full precision available on your calculator, compute
y = (1 + x)/√(1 + 2x) − 1
Denote your answer by y1.
(ii) Repeat the calculation of y, rounding to four significant digits at each stage. Denote your answer by y2. Also compute the error relative to y1.
(iii) Compute the value of y using the binomial expansion
(1 + a)^r = 1 + ra + (r(r − 1)/2!) a^2 + (r(r − 1)(r − 2)/3!) a^3 + ⋯
(valid for any r and |a| < 1). Include powers of x up to x^2 and use four-digit precision. Denote your answer by y3. Again, compute the error relative to y1.
P 1.6. (i) Compute ln(1 +c) +ln(1 −c) for c = 0.002 using the full precision
available on your calculator (or MATLAB).
(ii) Repeat the computation with reduced precision, rounding each loga-
rithm to six signiﬁcant digits. What is the relative error (using the answer
obtained in (i) above as the true value)?
(iii) Compute the same quantity using the Taylor series expansion
ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + ⋯
(valid for −1 < x ≤ 1) with six-digit precision. What is the relative error in
this computation?
(iv) Use the approximate form
f″(x) ≈ (f(x + ∆) − 2f(x) + f(x − ∆))/∆^2
(where f″ denotes the second derivative and ∆ is small) to obtain a new approximation to ln(1 + c) + ln(1 − c) using six-digit precision. What is the relative error in this computation?
P 1.7. Suppose N is an integer. In inﬁnite precision arithmetic, summing
together N numbers, each equal to 1/N, yields an answer exactly equal to
1. In ﬁnite precision arithmetic, that is not necessarily the case.
Try out the following MATLAB script (sequence of commands), after setting (i) N = 100,000 (= 10^5) and (ii) N = 131,072 (= 2^17). Compute the error relative to the exact answer (= 1) in each case. Explain any differences between the two cases.
y = 0;
for i = 1:N
y = y+(1/N);
end
y
Section 1.3
P 1.8. Which of the following expressions represents the complex number
2 + 3j in MATLAB?
2 + 3*j
2 + 3*i
2 + j3
2 + 3j
[2+3j]
[2 + 3j]
[2 +3j]
P 1.9. Simplify the complex fraction
(1 + j√3)(1 − j) / ((√3 + j)(1 + j))
using (i) Cartesian forms; and (ii) polar forms, throughout your calculation.
P 1.10. Let N be an arbitrary positive integer. Evaluate the product
∏_{k=1}^{N−1} [cos(kπ/N) + j sin(kπ/N)].
Section 1.4
P 1.11. Consider the continuous-time sinusoid
x(t) = 5 cos(500πt + 0.25)
where t is in seconds.
(i) What is the ﬁrst value of t greater than 0 such that x(t) = 0?
(ii) Consider the following MATLAB script which generates a discrete ap-
proximation to x(t):
t = 0 : 0.0001 : 0.01 ;
x = 5*cos(500*pi*t + 0.25) ;
For which values of n, if any, is x(n) zero?
P 1.12. The value of the continuous-time sinusoid x(t) = Acos(Ωt+φ) (where
A > 0 and 0 ≤ φ < 2π) is between −2.0 and +2.0 for 70% of its period.
(i) What is the value of A?
(ii) If it takes 300 ms for the value of x(t) to rise from −2.0 to +2.0, what
is the value of Ω?
(iii) If t = 40 ms is the first positive time for which x(t) = −2.0 and x′(t) (the first derivative) is negative, what is the value of φ?
P 1.13. The input voltage v(t) to a light-emitting diode circuit is given by
Acos(Ωt + φ), where A > 0, Ω > 0 and φ are unknown parameters. The
circuit is designed in such a way that the diode turns on at the moment the
input voltage exceeds A/2, and turns oﬀ when the voltage falls below A/2.
(i) What percentage of the time is the diode on?
(ii) Suppose the voltage v(t) is applied to the diode at time t = 0. The
diode turns on instantly, turns oﬀ at t = 1.5 ms, then turns on again at
t = 9.5 ms. Based on this information, determine Ω and φ.
P 1.14. Using stationary phasors, express each of the following sums in the
form Acos(Ωt +φ).
(i)
cos(7πt − π/6) − sin(7πt − π/6) + 3 cos(7πt + π/4)
(ii)
2.26 cos(43t + 0.11) −5.77 cos(43t + 2.08) + 0.49 cos(43t + 1.37)
Section 1.5
P 1.15. (i) For exactly one value of ω in [0, π], the discrete-time sinusoid
v[n] = Acos(ωn +φ)
is periodic with period equal to N = 4 time units. What is that value of ω?
(ii) For that value of ω, suppose the ﬁrst period of v[n] is given by
v[0] = 1, v[1] = 1, v[2] = −1 and v[3] = −1
What are the values of A > 0 and φ?
P 1.16. (i) Use the trigonometric identity
cos(α +β) = cos αcos β −sin αsin β
to show that
cos(ω(n + 1) +φ) + cos(ω(n −1) +φ) = 2 cos(ωn +φ) cos ω
(ii) Suppose
x[1] = 1.7740, x[2] = 3.1251 and x[3] = 0.4908
are three consecutive values of the discrete-time sinusoid x[n] = Acos(ωn +
φ), where A > 0, ω ∈ [0, π] and φ ∈ [0, 2π]. Use the equation derived in (i)
to evaluate ω. Then use the ratio x[2]/x[1] together with the given identity
for cos(α +β) to evaluate tanφ and hence φ. Finally, determine A.
P 1.17. (i) Use MATLAB to plot four periods of the discrete-time sinusoid
x_1[n] = cos(7πn/9 + π/6)
(ii) Show that the product
x_2[n] = x_1[n] cos(πn)
is also a (real-valued) discrete-time sinusoid. Express it in the form A cos(ωn + φ), where A > 0, ω ∈ [0, π] and φ ∈ (0, 2π).
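Before plotting, the period of x_1[n] can be checked numerically. The sketch below is a Python stand-in for the MATLAB plotting commands: since (7π/9)·18 = 14π is a multiple of 2π, the fundamental period is N = 18, and four periods span the 72 samples n = 0, …, 71.

```python
import math

def x1(n):
    """The sinusoid of part (i): cos(7*pi*n/9 + pi/6)."""
    return math.cos(7 * math.pi * n / 9 + math.pi / 6)

samples = [x1(n) for n in range(72)]          # four periods of 18 samples
period_gap = max(abs(x1(n + 18) - x1(n)) for n in range(54))
print(period_gap)                              # should be at round-off level
```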
Section 1.6
P 1.18. The continuous-time sinusoid
x(t) = cos(2πft +φ)
is sampled every T_s = 1/(12f) seconds to produce the discrete-time sinusoid
x[n] = x(nT_s)
(i) Write an expression for x[n]. What is the angular frequency ω of x[n]?
(ii) Consider the discrete-time signal
y[n] = x[n + 1] + 2x[n] +x[n −1]
Using phasors, express y[n] in the form
y[n] = Acos(ωn +ψ)
clearly showing the values of A and ψ.
P 1.19. The continuous-time sinusoid
x(t) = cos(150πt +φ)
is sampled every T_s = 3.0 ms starting at t = 0. The resulting discrete-time sinusoid is
x[n] = x(nT_s)
(i) Express x[n] in the form x[n] = cos(ωn + φ), i.e., determine the value of ω.
(ii) Is the discrete-time sinusoid x[n] periodic? If so, what is its period?
(iii) Suppose that the sampling rate f_s = 1/T_s is variable. For what values of f_s is x[n] constant for all n? For what values of f_s does x[n] alternate in value between −cos φ and cos φ?
P 1.20. Consider the continuous-time sinusoid
x(t) = 5 cos(200πt −0.8) + 2 sin(200πt)
where t is in seconds.
(i) Express x(t) in the form
x(t) = Acos(200πt +φ)
(ii) Suppose x(t) is sampled at a rate of f_s = 160 samples/second. The discrete-time signal x[n] = x(n/f_s) is expressed as
x[n] = A cos(ωn + ψ)
with ω in the interval [0, π]. What is the value of ω? What is the relationship between ψ and φ?
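The stationary-phasor method used in problems like part (i) can be checked numerically. In the Python sketch below, each term is represented by its complex amplitude, with sin(200πt) rewritten as cos(200πt − π/2); the numeric values are taken from the problem statement.

```python
import cmath
import math

# Phasor of 5*cos(200*pi*t - 0.8) is 5*exp(-j*0.8);
# phasor of 2*sin(200*pi*t) = 2*cos(200*pi*t - pi/2) is 2*exp(-j*pi/2).
phasor = 5 * cmath.exp(-0.8j) + 2 * cmath.exp(-1j * math.pi / 2)
A, phi = abs(phasor), cmath.phase(phasor)
print(A, phi)

# x(t) should equal A*cos(200*pi*t + phi); compare at an arbitrary time.
t = 2.7e-3
lhs = 5 * math.cos(200 * math.pi * t - 0.8) + 2 * math.sin(200 * math.pi * t)
rhs = A * math.cos(200 * math.pi * t + phi)
print(lhs, rhs)
```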
P 1.21. For what frequencies f in the range 0 to 3.0 kHz does the sinusoid
x(t) = cos(2πft)
yield the signal
x[n] = cos(0.4πn)
when sampled at a rate of f_s = 1/T_s = 800 samples/sec?
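Aliases of this kind can be enumerated by a short script. The Python sketch below (illustrative, not part of the text) uses the fact that cos(2πfn/f_s) = cos(0.4πn) for all n exactly when 2πf/f_s = ±0.4π + 2πk for some integer k ≥ 0:

```python
fs = 800.0
aliases = []
for k in range(5):
    for sign in (+1, -1):
        f = (k + sign * 0.2) * fs    # since 0.4*pi / (2*pi) = 0.2
        if 0 <= f <= 3000:
            aliases.append(f)
print(sorted(aliases))
```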
P 1.22. The continuous-time sinusoid
x(t) = cos(300πt)
is sampled every T_s = 2.0 ms, so that
x[n] = x(0.002n)
(i) For what other values of f in the range 0 Hz to 2.0 kHz does the sinusoid
v(t) = cos(2πft)
produce the same samples as x(t) (i.e., v[n] = x[n] for all n) when sampled at the same rate?
(ii) If we increase the sampling period T_s (or equivalently, drop the sampling rate), what is the least value of T_s greater than 2 ms for which x(t) yields the same sequence of samples (as for T_s = 2 ms)?
Chapter 2
Matrices and Systems
2.1 Introduction to Matrix Algebra
The analysis of signals and linear systems relies on a variety of mathematical
tools, diﬀerent forms of which are suitable for signals and systems of diﬀering
complexity. The simplest type of signal we consider is a discrete-time signal
of ﬁnite duration, which is the same as an n-dimensional vector consisting
of real or complex-valued entries. In representing such signals and their
transformations resulting from numerical processing, certain concepts and
methods of linear algebra are indispensable. Understanding the role played
by linear algebra in the analysis of such simple (ﬁnite-dimensional) signals
also provides a solid foundation for studying more complex models where the
signals evolve in discrete or continuous time and are inﬁnite-dimensional.
2.1.1 Matrices and Vectors
An m × n matrix is an ordered array consisting of m rows and n columns of elements:
A = [ a_11  a_12  ...  a_1n
      a_21  a_22  ...  a_2n
       ...   ...        ...
      a_m1  a_m2  ...  a_mn ]
A compact notation for the above matrix is
A = [a_ij]_{m×n} = [a_ij]
The last form is used whenever the dimensions m and n of A are known or
can be easily inferred.
The elements (also called entries) of the matrix A are real or, more
generally, complex numbers. We deﬁne
R^{m×n} = space of all (m × n)-dimensional real-valued matrices
C^{m×n} = space of all (m × n)-dimensional complex-valued matrices
where C denotes the complex plane.
A vector is a special case of a matrix where one of the dimensions (m or
n) equals unity. A row vector has the form
[ a_1  a_2  ...  a_n ]
and lies in the space R^{1×n} or, more generally, C^{1×n}.
A column vector has the form
[ a_1
  a_2
  ...
  a_m ]
and lies in the space R^{m×1} or, more generally, C^{m×1}.
Row vectors and column vectors can be used interchangeably to represent
an ordered set of (ﬁnitely many) numbers. Thus, in many ways, the spaces
R^{1×n} and R^{n×1} are no different from R^n, the space of all n-tuples of real numbers. In fact, the redundant dimension (unity) in R^{1×n} and R^{n×1} is often omitted when the orientation (row or column) of the vector can be easily deduced from the context. The same holds for C replacing R.
The usual notation for a vector in R^n or C^n is a lower-case boldface letter, e.g.,
a = (a_1, ..., a_n)
In operations involving matrices, the distinction between row and column
vectors is important. Unless otherwise stated, a lower-case boldface letter
will denote a column vector, i.e.,
a = [ a_1
      a_2
      ...
      a_m ]
To denote row vectors in expressions involving matrix operations, the transpose (T), or complex-conjugate transpose (H), operators will be used, i.e., a row vector will be denoted by a^T or a^H.
In MATLAB, matrices are entered row-by-row, where rows are separated
with a semicolon:
A = [-1 3 4; 0 1 5; 2 2 1]
Thus [-1 3 4] is a row vector, while [-1; 0; 2] is a column vector. The
transpose operator T is entered as .’ and the complex conjugate transpose
H as ’. Both produce the same result in the case of real-valued vectors:
[-1; 0; 2].’
and
[-1; 0; 2]’
are the same as
[-1 0 2]
The default orientation of a vector in MATLAB is horizontal, i.e., a row.
This means that MATLAB functions returning vector-valued answers from
scalar arguments will (usually) do so in row-vector format. An example of
this is the notation start:increment:end, which generates a row vector.
MATLAB will ignore the orientation of row and column vectors in places
where it is unimportant, e.g., in arguments of certain functions.
An error message will be generated if the orientation of the vector is
inconsistent with the matrix computation being performed. A notable ex-
ception is the addition of a scalar to a vector, which is interpreted in an
element-wise fashion, i.e.,
1 + [-1 9 7]
results in
[0 10 8]
and
1 - [-1; 9; 7]
is the same as
[2; -8; -6]
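The same element-wise behavior appears in NumPy, which mirrors MATLAB in this respect. A short Python analogue of the examples above:

```python
import numpy as np

# Scalar +/- vector is applied element-wise, and orientation is preserved.
row = np.array([-1, 9, 7])          # like [-1 9 7]
col = np.array([[-1], [9], [7]])    # like [-1; 9; 7]

print(1 + row)    # [0 10 8]
print(1 - col)    # the column [2; -8; -6]
print(row.shape, col.shape)
```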
2.1.2 Signals as Vectors
A discrete-time signal consisting of ﬁnitely many samples can be represented
as a vector
x = (x_1, ..., x_n)
where n is the total number of samples. Here, x_i is an abbreviated form of x[i] (in the earlier notation). The choice of i = 1 for the initial time is mainly for consistency with standard indexing for matrices. The initial index i = 0 will be used at a later point.
From now on, and through the end of Section 2.12, all matrices and
vectors will be assumed to be real-valued.
Definition 2.1.1. The linear combination of n-dimensional vectors x^(1), ..., x^(r) with coefficients c_1, ..., c_r in R is defined as the n-dimensional vector
x = c_1 x^(1) + ··· + c_r x^(r)  □
(Recall that multiplication of a vector by a scalar amounts to multiplying
each entry of the vector by the scalar; while addition of two or more vectors
amounts to adding their respective components.)
Any vector x in R^n can be expressed as a linear combination of the standard unit vectors e^(1), ..., e^(n), defined by
e^(i)_j = 1 if i = j;  e^(i)_j = 0 if i ≠ j.
Example 2.1.1. Consider the three-dimensional vector x = (1, −2, 3). Expressing it as a column vector (the default format for matrix operations), we have
x = [1; −2; 3] = (1)[1; 0; 0] + (−2)[0; 1; 0] + (3)[0; 0; 1] = e^(1) − 2e^(2) + 3e^(3)
The linear combination of unit vectors shown above is illustrated graph-
ically. Note that the i
th
unit vector corresponds to a single pulse of unit
height at time i.
(Figure: illustration of Example 2.1.1.)
2.1.3 Linear Transformations
The concept of linearity is crucial for the analysis of signals and systems.
The most important time-frequency transformations of signals, such as the
Laplace, Fourier and z-transforms, are linear. Linear systems form a class
which is by far the easiest to study, model and simulate. In fact, in study-
ing the behavior of many nonlinear systems, it is common to use linear
approximations and techniques from the theory of linear systems.
The simplest model for a linear system involves a finite-dimensional input signal x ∈ R^n and a finite-dimensional output signal y ∈ R^m. The system A, or more precisely
A : R^n → R^m,
acts on the input x to produce the output signal y represented by
y = A(x)
Figure 2.1: A system A with input x and output y.
Definition 2.1.2. (Linearity) We say that the transformation (or system) A is linear if, for any two input vectors x^(1) and x^(2) and coefficients c_1 and c_2,
A(c_1 x^(1) + c_2 x^(2)) = c_1 A(x^(1)) + c_2 A(x^(2))
To interpret this property in terms of an actual system, suppose at first that an input signal x^(1) is applied to the system, resulting in a response (i.e., output) y^(1). The system is then "reset" and run again with input x^(2), resulting in a response y^(2). Finally, the system is reset (once again) and run with input
x = c_1 x^(1) + c_2 x^(2)
If the system is linear, the final response of the system will be
y = c_1 y^(1) + c_2 y^(2)
This deﬁning property of linear systems is also known as the superposition
property, and can be extended to an arbitrary number of input signals:
A(c_1 x^(1) + ··· + c_r x^(r)) = c_1 A(x^(1)) + ··· + c_r A(x^(r))
As a consequence of the superposition property, knowing the response of a
linear system to a number of diﬀerent inputs allows us to determine and
predict its response to any linear combination of those inputs.
Example 2.1.2. Consider the transformation A : R^n → R^n which scales an n-dimensional vector by λ ∈ R:
A(x) = λx
Clearly,
A(c_1 x^(1) + c_2 x^(2)) = λ(c_1 x^(1) + c_2 x^(2))
                         = c_1 (λx^(1)) + c_2 (λx^(2))
                         = c_1 A(x^(1)) + c_2 A(x^(2))
and thus the transformation is linear. Scaling is one of the simplest linear transformations of n-dimensional vectors; and in the case n = 1, it is the only linear transformation.
Example 2.1.3. Consider the transformation A : R^3 → R^3 which cyclically shifts the entries of x to the right:
A(x_1, x_2, x_3) = (x_3, x_1, x_2)
We have
A(c(u_1, u_2, u_3) + d(v_1, v_2, v_3)) = A(cu_1 + dv_1, cu_2 + dv_2, cu_3 + dv_3)
                                       = (cu_3 + dv_3, cu_1 + dv_1, cu_2 + dv_2)
                                       = c(u_3, u_1, u_2) + d(v_3, v_1, v_2)
                                       = cA(u_1, u_2, u_3) + dA(v_1, v_2, v_3)
Thus a cyclical shift is also a linear transformation.
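The linearity of the cyclic shift is easy to spot-check numerically. A Python sketch using NumPy (the vectors and coefficients below are illustrative values, not from the text):

```python
import numpy as np

def shift(x):
    """Right cyclic shift: (x1, x2, x3) -> (x3, x1, x2)."""
    return np.roll(x, 1)   # np.roll moves the last entry to the front

u = np.array([1.0, -2.0, 3.0])
v = np.array([0.5, 4.0, -1.0])
c, d = 2.0, -3.0

lhs = shift(c * u + d * v)            # A applied to the linear combination
rhs = c * shift(u) + d * shift(v)     # linear combination of the responses
print(lhs, rhs)
```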
2.2 Matrix Multiplication
2.2.1 The Matrix of a Linear Transformation
The superposition property of a linear transformation A : R^n → R^m was stated as
A(c_1 x^(1) + c_2 x^(2)) = c_1 A(x^(1)) + c_2 A(x^(2))
We also saw that any n-dimensional vector x can be expressed as a linear combination of the unit vectors e^(1), ..., e^(n):
x = x_1 e^(1) + ··· + x_n e^(n)
Thus the result of applying the linear transformation A to any vector x is given by
A(x) = x_1 A(e^(1)) + ··· + x_n A(e^(n))
We conclude that knowing the effect of a linear transformation on each unit vector in R^n suffices to determine the effect of that transformation on any vector in R^n.
In systems terms, we can say that the response of a linear system to an arbitrary input can be computed from the responses of the system to each of the unit vectors in the input signal space. It follows that the m-dimensional vectors A(e^(1)), ..., A(e^(n)) completely specify the linear transformation A.
Definition 2.2.1. The matrix of a linear transformation A : R^n → R^m is the m × n matrix A whose j-th column is given by A(e^(j)).
Example 2.2.1. Suppose the transformation A : R^3 → R^2 is such that
A(e^(1)) = [2; 1], A(e^(2)) = [−1; 0] and A(e^(3)) = [4; −1]
Then the matrix of A is given by
A = [2 −1 4; 1 0 −1]
Applying A to x = (x_1, x_2, x_3) produces
A(x) = x_1 [2; 1] + x_2 [−1; 0] + x_3 [4; −1] = [2x_1 − x_2 + 4x_3; x_1 − x_3]  □
2.2.2 The Matrix-Vector Product
Definition 2.2.2. Let A be an m × n matrix representing a linear transformation A : R^n → R^m, and let x be an n-dimensional column vector. The product Ax is defined as the result of applying the transformation A to x. In other words, Ax is the linear combination of the columns of A with coefficients given by the corresponding entries in x.
Example 2.2.1. (Continued.) If
A = [2 −1 4; 1 0 −1] and x = [x_1; x_2; x_3]
then
Ax = [2x_1 − x_2 + 4x_3; x_1 − x_3]  □
A different way of obtaining the product Ax is by taking the inner (dot) product of each row of A with x. To see why this is so, let
A = [ a_11  a_12  ...  a_1n
      a_21  a_22  ...  a_2n
       ...   ...        ...
      a_m1  a_m2  ...  a_mn ]
Then, for x = (x_1, ..., x_n), we have
Ax = x_1 [a_11; a_21; ...; a_m1] + x_2 [a_12; a_22; ...; a_m2] + ··· + x_n [a_1n; a_2n; ...; a_mn]
   = [ a_11 x_1 + a_12 x_2 + ··· + a_1n x_n
       a_21 x_1 + a_22 x_2 + ··· + a_2n x_n
       ...
       a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n ]
Thus the i-th element of the resulting vector y = Ax is given by
y_i = (Ax)_i = a_i1 x_1 + a_i2 x_2 + ··· + a_in x_n = Σ_{j=1}^{n} a_ij x_j
i.e., it equals the inner product of the i-th row of A and x.
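The two equivalent views of Ax, a combination of the columns of A and a stack of row-by-row inner products, can be verified side by side. A Python sketch using the matrix of Example 2.2.1 (the input x here is an arbitrary illustrative value):

```python
import numpy as np

A = np.array([[2.0, -1.0, 4.0],
              [1.0,  0.0, -1.0]])   # the matrix of Example 2.2.1
x = np.array([1.0, 2.0, 3.0])       # illustrative input

# View 1: linear combination of the columns of A.
by_columns = sum(x[j] * A[:, j] for j in range(A.shape[1]))
# View 2: inner product of each row of A with x.
by_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])
print(by_columns, by_rows, A @ x)
```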
2.2.3 The Matrix-Matrix Product
The matrix-vector product can be generalized to the product of two matri-
ces, A and B, by considering the linear transformations represented by A
and B.
Suppose A : R^p → R^m and B : R^n → R^p are two linear transformations, and consider the composition A ∘ B defined by
(A ∘ B)(x) = A(B(x))
where x is n-dimensional. Thus B is applied to x first, to produce a p-dimensional vector B(x); then A is applied to B(x) to produce (A ∘ B)(x). In systems terms, we have two linear systems A and B connected in series (also tandem or cascade), as illustrated in Figure 2.2.
Figure 2.2: Two systems connected in series.
Definition 2.2.3. If A is the m × p matrix of the transformation A : R^p → R^m and B is the p × n matrix of the transformation B : R^n → R^p, then the product AB is defined as the m × n matrix of the transformation A ∘ B : R^n → R^m.
To obtain AB in terms of the entries of A and B, we examine the j-th column of AB, which we denote here by (AB)_·j.
We know that (AB)_·j is obtained by applying A ∘ B to the j-th unit vector e^(j) in the input space R^n, i.e.,
(AB)_·j = (A ∘ B)(e^(j)) = A(B(e^(j)))
Since B(e^(j)) is the j-th column of B (denoted by (B)_·j), we have
(AB)_·j = A((B)_·j) = A(B)_·j
Thus the j-th column of the product AB is given by the product of A and the j-th column of B:
(AB)_·j = [ a_11  a_12  ...  a_1p
            a_21  a_22  ...  a_2p
             ...   ...        ...
            a_m1  a_m2  ...  a_mp ] [b_1j; b_2j; ...; b_pj]
It can be seen that the i-th entry of (AB)_·j, which is the same as the (i, j)-th entry of AB, is given by the inner product of the i-th row of A and the j-th column of B.
Fact. If A is m × p and B is p × n, then AB is the m × n matrix whose (i, j)-th entry is given by
(AB)_ij = a_i1 b_1j + a_i2 b_2j + ··· + a_ip b_pj = Σ_{k=1}^{p} a_ik b_kj  □
Example 2.2.2. If
A = [2 −1 4; 1 0 −1] and B = [1 −2; 0 3; 1 5]
then
AB = [6 13; 0 −7]  □
Example 2.2.3. If
A = [4; 1; −2] and B = [1 −3 2]
then
AB = [4 −12 8; 1 −3 2; −2 6 −4]
Note that in this case, the (i, j)-th element of AB is simply a_i b_j; this is because the rows of A and columns of B each consist of a single element.
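Both products above can be reproduced directly. A Python sketch checking Examples 2.2.2 and 2.2.3 with NumPy:

```python
import numpy as np

# Example 2.2.2: a 2x3 matrix times a 3x2 matrix.
A = np.array([[2, -1, 4], [1, 0, -1]])
B = np.array([[1, -2], [0, 3], [1, 5]])
print(A @ B)        # [[6 13] [0 -7]]

# Example 2.2.3: a column (3x1) times a row (1x3) gives a 3x3 matrix
# whose (i, j) entry is a_i * b_j.
a = np.array([[4], [1], [-2]])
b = np.array([[1, -3, 2]])
print(a @ b)
```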
2.2.4 Associativity and Commutativity
Fact. Matrix multiplication is associative.
If the dimensions of A, B and C are such that the products AB and BC can be formed, we can also form the product ABC, which is the matrix of the cascaded transformation A ∘ B ∘ C depicted in Figure 2.3.
Grouping C and B together, we can write the output y of the cascade as A(BC)x. Grouping B and A together, we have the equivalent expression (AB)Cx. Thus
ABC = A(BC) = (AB)C
Figure 2.3: Three systems connected in series.
which is also known as the associative property of matrix multiplication. It
can be extended to the product of L matrices, where L is arbitrary. The
number of matrices in the product can be reduced to L−1 by multiplying out
any two adjacent matrices without changing their position in the product;
this procedure can then be repeated until the product is expressed as a single
matrix.
Fact. Matrix multiplication is not commutative.
We say that A and B commute if
AB = BA
Clearly, the only way the products AB and BA can both exist and have the same dimensions is if the matrices A and B are both square, i.e., n × n. Still, this is not sufficient for commutativity, as Example 2.2.4 below shows.
Example 2.2.4. Consider the two-dimensional case (n = 2). Let A represent projection of (x_1, x_2) on the horizontal (x_1) axis. The linear transformation A applied to the two unit vectors (1, 0) and (0, 1) yields
A(1, 0) = (1, 0), A(0, 1) = (0, 0)
and therefore
A = [1 0; 0 0]
Let B represent a counterclockwise rotation of (x_1, x_2) by an angle θ. We can obtain the elements of B either from geometry, or by noting that rotation by angle θ amounts to multiplication of the complex number x_1 + jx_2 by e^{jθ} = cos θ + j sin θ. The result of the multiplication is
x_1 cos θ − x_2 sin θ + j(x_1 sin θ + x_2 cos θ) = (x_1 cos θ − x_2 sin θ, x_1 sin θ + x_2 cos θ)
and thus
B = [cos θ −sin θ; sin θ cos θ]
Clearly,
AB = [cos θ −sin θ; 0 0]
while
BA = [cos θ 0; sin θ 0]
Thus the two matrices do not commute except in the trivial case where sin θ = 0, i.e., θ = 0 or θ = π. This can also be seen using a geometrical argument. Briefly, the transformation A ∘ B (whose matrix is AB) is a cascade of a rotation (first), followed by projection on the x_1-axis. The result of the transformation is therefore always on the x_1-axis, i.e., it has x_2 = 0. On the other hand, projection on the x_1-axis followed by rotation through an angle other than 0 or π will result in a vector with x_2 ≠ 0.
Example 2.2.5. If A and B are rotations by angles φ and θ, respectively,
then
AB = BA
i.e., the two matrices commute. This is because the order of the two rotations
is immaterial—both AB and BA represent a rotation by an angle θ+φ.
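Examples 2.2.4 and 2.2.5 can be confirmed in a few lines. A Python sketch with arbitrary illustrative angles:

```python
import numpy as np

def rot(t):
    """The 2x2 counterclockwise rotation matrix B of Example 2.2.4."""
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

P = np.array([[1.0, 0.0], [0.0, 0.0]])   # projection onto the x1-axis

theta, phi = 0.7, -1.2                   # illustrative angles
print(np.allclose(P @ rot(theta), rot(theta) @ P))                # False
print(np.allclose(rot(phi) @ rot(theta), rot(theta) @ rot(phi)))  # True
```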
2.3 More on Matrix Algebra
2.3.1 Column Selection and Permutation
We saw that if A is an m × n matrix and e^(j) is the j-th unit vector in R^{n×1}, then
A e^(j) = (A)_·j
where (A)_·j is the j-th column of the matrix A. Thus multiplication of a matrix A by a (column) unit vector is the same as column selection, as illustrated in Example 2.3.1.
Example 2.3.1.
[a b c; d e f; g h i] [0; 1; 0] = [b; e; h]  □
By appending one or more unit vectors to e
(j)
, it is possible to select two
or more columns from the matrix A. This is based on the following general
fact.
Fact. If
B = [B^(1) B^(2)]
where B^(1) and B^(2) are submatrices of B, each having the same number of rows as B, then
AB = [AB^(1) AB^(2)]
Similarly, if
A = [A^(1); A^(2)]
where A^(1) and A^(2) are submatrices of A, each having the same number of columns as A, then
AB = [A^(1)B; A^(2)B]
Also,
AB = [A^(1)B^(1) A^(1)B^(2); A^(2)B^(1) A^(2)B^(2)]  □
The formulas above are easily obtained by recalling that the (i, j)-th element of the product AB is the inner product of the i-th row of A and the j-th column of B. They can be extended to horizontal partitions of A and vertical partitions of B involving an arbitrary number of submatrices.
Choosing B^(1), B^(2), ... as unit vectors in R^{n×1}, we obtain a matrix consisting of columns of A. This is illustrated in Example 2.3.2 below.
Example 2.3.2.
[a b c; d e f; g h i] [0 0; 1 1; 0 0] = [b b; e e; h h]
[a b c; d e f; g h i] [0 0 1; 1 0 0; 0 1 0] = [b c a; e f d; h i g]
Note that the second product in Example 2.3.2 resulted in a matrix with the same columns as those of A, but permuted. This is because
[0 0 1; 1 0 0; 0 1 0]
consists of all the unit (column) vectors in arbitrary order. This type of matrix is known as a permutation matrix.
Definition 2.3.1. An n × n matrix P is a permutation matrix if all rows and columns of P are unit vectors in R^n.
Two equivalent conditions are:
• the rows of P are the unit vectors in R^{1×n} (in any order); and
• the columns of P are the unit vectors in R^{n×1} (in any order).
The product AP is a matrix whose columns are the same as those of A, ordered in the same way as the unit vectors are in the columns of P. Note that A can be any m × n matrix, i.e., it need not be square.
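Column selection and column permutation as matrix products can be checked with a numeric stand-in for [a b c; d e f; g h i]. A Python sketch:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)   # stand-in for [a b c; d e f; g h i]
e2 = np.array([0, 1, 0])             # the second unit vector
print(A @ e2)                        # selects the second column of A

P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])            # the permutation matrix from the text
print(A @ P)                         # columns of A reordered as in Ex. 2.3.2
```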
2.3.2 Matrix Transpose
If A is an m × n matrix, then A^T is the n × m matrix whose rows are the columns of A (equivalently, whose columns are the rows of A). A^T is known as the transpose of A, and T is known as the transpose operator.
Definition 2.3.2. If A = [a_ij], then A^T is defined by
(A^T)_ij = (A)_ji = a_ji
for every pair (i, j).
Example 2.3.3.
[a b c]^T = [a; b; c]
[a b c; d e f; g h i]^T = [a d g; b e h; c f i]
[a b c; b d e; c e f]^T = [a b c; b d e; c e f]
Note that the third transposition in Example 2.3.3 above resulted in the same (square) matrix. This was a consequence of symmetry about the main diagonal.
Definition 2.3.3. An n × n matrix A is symmetric if A = A^T. Equivalent conditions are:
• for every (i, j), a_ij = a_ji; and
• for every i, the i-th row of A equals (the transpose of) its i-th column.
Fact.
(A^T)^T = A
Fact. Assuming the product AB exists,
(AB)^T = B^T A^T
To prove this fact, note that the (i, j)-th element of (AB)^T is the (j, i)-th element of AB. This equals the inner product of the j-th row of A and the i-th column of B, which is the same as the inner product of the j-th column of A^T and the i-th row of B^T. This is just the (i, j)-th element of B^T A^T.
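A numerical spot-check of (AB)^T = B^T A^T on the matrices of Example 2.2.2:

```python
import numpy as np

A = np.array([[2, -1, 4], [1, 0, -1]])
B = np.array([[1, -2], [0, 3], [1, 5]])

print((A @ B).T)     # transpose of the product
print(B.T @ A.T)     # product of the transposes, in reverse order
```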
2.3.3 Row Selection
As usual, let A be an m × n matrix. If e^(i) is the i-th unit vector in R^{m×1}, then (e^(i))^T is also the i-th unit vector in R^{1×m}. The product
(e^(i))^T A
is a row vector whose transpose equals
((e^(i))^T A)^T = A^T e^(i)
namely the i-th column of the n × m matrix A^T. As we know, this is the same as the i-th row of A. We therefore see that left multiplication of a matrix A by a row unit vector results in selecting a row from A.
Example 2.3.4.
[0 1 0] [a b c; d e f; g h i] = [d e f]  □
Similarly, left-multiplying A by an m × m permutation matrix results in a permutation of the rows of A. That is, PA is a matrix whose rows are the same as those of A, ordered in the same way as the unit vectors are ordered in the rows of P. (Note that this ordering is not necessarily the same as that of the unit vectors in the columns of P.)
Example 2.3.5.
[0 1 0; 1 0 0; 0 0 1] [a b c; d e f; g h i] = [d e f; a b c; g h i]  □
Fact. If P is a permutation matrix, then so is P^T. Furthermore,
P^T P = I
where I_ij = 1 if i = j, and I_ij = 0 if i ≠ j.
The first statement follows directly from the definition of a permutation matrix. To prove the second statement, note that the (i, j)-th element of P^T P equals the inner product of the i-th and j-th columns of P. Since the columns of P are distinct unit vectors, it follows that the inner product equals unity if i = j, and zero otherwise.
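The identity P^T P = I is easy to verify on the permutation matrix of Example 2.3.5. A Python sketch:

```python
import numpy as np

P = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])   # the permutation matrix of Example 2.3.5

print(P.T @ P)              # the 3x3 identity matrix
```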
2.4 Matrix Inversion and Linear Independence
2.4.1 Inverse Systems and Signal Representations
We used the equation
y = A(x) = Ax
to describe the response y of a linear system A to an input signal x. In general, the matrix A is of size m × n, which means that the input vector x is n-dimensional and the output vector y is m-dimensional.
In almost all cases, the output signal y contains useful information about the input signal x, which may otherwise be "hidden" (i.e., not directly accessible). The natural question to ask is whether this information is complete; in other words, whether it is possible to recover x from y. The transformation y → x, if properly defined, would be the inverse system A^{−1}.
A careful approach to this (inversion) problem must ﬁrst address the
following two issues:
• Existence of an inverse: Given any y in the output space (R^m), can we find a vector x in the input space (R^n) such that y = A(x)? If not, for which vectors y in R^m is this possible?
• Uniqueness of the inverse: If, for a particular vector y in R^m, there exists x in R^n such that y = A(x), is x unique? Or do there exist other vectors x′ ≠ x which also satisfy y = A(x′)?
As we will soon see, the answers to the questions posed above (for a given system A) determine whether the inverse system A^{−1} can be defined in a meaningful way. How that system can be obtained practically from A is a separate problem which we will discuss later.
Inversion of linear transformations is also important in signal analysis. Consider an m × n matrix V consisting of columns v^(1), ..., v^(n):
V = [v^(1) ... v^(n)]
If the column vectors are taken as reference signals (e.g., discrete-time sinusoids in Fourier analysis), a natural question to ask is whether an arbitrary signal s in R^m can be expressed as a linear combination of these reference signals, i.e., whether there exists a vector c of coefficients c_1, ..., c_n such that
s = Σ_{r=1}^{n} c_r v^(r)
or equivalently,
s = Vc
This problem of signal representation is analogous to the system inversion
problem posed earlier, with a change of notation from y = Ax to s = Vc.
In particular, the existence and uniqueness questions formulated earlier are
also relevant in signal representation.
In some applications, the set of reference signals v^(1), ..., v^(n) may not be sufficiently large to provide a representation for every signal s in R^m, i.e., the answer to the existence question is negative. In such cases, we are often interested in the best approximation to a signal s in terms of a linear combination of the columns of V. In other words, we are interested in minimizing some function of the error vector
Vc − s
by choice of c. We will later study the solution to this problem in the case where the function chosen for that purpose is the sum of squares of the error vector entries.
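As a numerical preview of that least-squares problem, `numpy.linalg.lstsq` finds the c minimizing the sum of squared entries of Vc − s. The reference signals and the target s below are illustrative values only:

```python
import numpy as np

V = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])          # two reference signals in R^3
s = np.array([0.0, 1.0, 1.0])       # a signal not in the plane R(V)

c, *_ = np.linalg.lstsq(V, s, rcond=None)
print(c, V @ c)                     # best coefficients and the approximation
```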
2.4.2 Range of a Matrix
The range, or column space, of an m × n matrix A is defined as the set of all linear combinations of its columns, and is denoted by R(A). Thus
R(A) = {Ac : c ∈ R^{n×1}}
R(A) is also the range of the transformation A, which maps R^n into R^m. Clearly,
R(A) ⊆ R^{m×1}
The concept of range allows us to reformulate the existence and unique-
ness questions of the previous section as follows:
• Existence of an inverse: What is the range R(A)? Does it coincide with R^{m×1}, or is it a proper subset of R^{m×1}?
• Uniqueness of an inverse: For y ∈ R(A), does the set of vectors x ∈ R^{n×1} satisfying y = Ax consist of a single element? Or does it contain two or more vectors?
In what follows, we will examine the properties of the range of a square (i.e., n × n) matrix A, and argue that the answers to the existence and uniqueness questions posed above are either both affirmative or both negative. Our discussion will involve the concept of linear independence of a set of vectors, and will also allow us to draw certain conclusions about matrices of arbitrary size m × n.
2.4.3 Geometrical Interpretation of the Range
R(A) is a linear subspace of R^{m×1}. Here, "linear" means that all linear combinations of vectors in R(A) also lie in R(A). To see why this has to be so, take two vectors in R(A), say Ac^(1) and Ac^(2), and form their linear combination λAc^(1) + µAc^(2). This can be written as A(λc^(1) + µc^(2)), and therefore also lies in R(A).
The largest Euclidean space R^{m×1} which we can comfortably visualize is the three-dimensional one, i.e., R^{3×1}. It has four types of subspaces, categorized by their dimension d:
• d = 0: Only one such subspace exists, consisting of a single point, namely the origin 0 = [0 0 0]^T.
• d = 1: Straight lines through the origin.
• d = 2: Planes through the origin.
• d = 3: Only one such subspace exists, namely R^{3×1} itself.
The range of a 3 × n matrix will fall into one of the above categories, i.e., it will have dimension 0, 1, 2 or 3. To see how algebraic relationships between columns affect the dimension of the range, let us focus on the special case of a square (3 × 3) matrix A given by
A = [a^(1) a^(2) a^(3)]
The range of the first column a^(1) consists of all vectors of the form c_1 a^(1), where c_1 ∈ R. As c_1 varies, we obtain a full line through the origin unless a^(1) = 0. Thus with only one exception, the range of a single column vector is one-dimensional, as illustrated in Figure 2.4.
Clearly, the same statements can be made about the range of the second column a^(2). If we now consider the range of the matrix [a^(1) a^(2)], we observe the following: provided R(a^(1)) and R(a^(2)) are two distinct lines through the origin, linear combinations of a^(1) and a^(2) will produce a linear subspace of higher dimension, i.e., a plane. In fact, R([a^(1) a^(2)]) is the
Figure 2.4: The range of a single nonzero vector a^(1) is a straight line through the origin.
unique plane which contains the lines R(a^(1)) and R(a^(2)), as illustrated in Figure 2.5 (on the left).
It is important to note that any vector y on the plane R([a^(1) a^(2)]) has a unique representation as a linear combination of a^(1) and a^(2). This is illustrated in Figure 2.5 (on the right), where a line parallel to a^(2) is drawn through y to intersect the line containing a^(1). This results in two vectors, c_1 a^(1) and c_2 a^(2), and the unique representation
y = c_1 a^(1) + c_2 a^(2)
(1)
R([a a ])
(1) (2)
0
a
(2)
R([a a ])
(1) (2)
a
(1)
a
(2)
0
y
c a
(1)
1
c a
(2)
2
Figure 2.5: The range of [a
(1)
a
(2)
] is a plane through the origin
(left). Any point on that plane has a unique representation as a
linear combination of a
(1)
and a
(2)
(right).
As pointed out earlier, $\mathcal{R}([a^{(1)} \; a^{(2)}])$ is two-dimensional if and only if $\mathcal{R}(a^{(1)})$ and $\mathcal{R}(a^{(2)})$ are two distinct lines through the origin. This is the same as saying that neither $a^{(1)}$ nor $a^{(2)}$ is an all-zeros vector; and, in addition, the two vectors are not multiples of each other. These conditions can be concisely stated as follows: The only linear combination
$$c_1 a^{(1)} + c_2 a^{(2)}$$
which equals the all-zeros vector $\mathbf{0} = [0 \; 0 \; 0]^T$ is the one with $c_1 = c_2 = 0$.
Finally, let us consider the third column $a^{(3)}$. Arguing in the same way as before, we see that, provided $\mathcal{R}([a^{(1)} \; a^{(2)}])$ is a (two-dimensional) plane that does not contain $a^{(3)}$, linear combinations of vectors in $\mathcal{R}([a^{(1)} \; a^{(2)}])$ with $a^{(3)}$ will produce a linear subspace of higher dimension, namely $\mathbb{R}^{3\times 1}$ itself. Furthermore, any vector in $\mathbb{R}^{3\times 1}$ will have a unique representation of the form
$$y = c_1 a^{(1)} + c_2 a^{(2)} + c_3 a^{(3)}$$
This is illustrated in Figure 2.6 (on the right), where a line parallel to $a^{(3)}$ is drawn through $y$ to intersect the plane $\mathcal{R}([a^{(1)} \; a^{(2)}])$. We thus determine $c_3$, and, proceeding as before (i.e., as in the two-dimensional case), we also determine $c_1$ and $c_2$.
Figure 2.6: The range of the matrix $[a^{(1)} \; a^{(2)} \; a^{(3)}]$ is the entire three-dimensional space $\mathbb{R}^{3\times 1}$ (left). Any point in $\mathbb{R}^{3\times 1}$ has a unique representation as a linear combination of $a^{(1)}$, $a^{(2)}$ and $a^{(3)}$ (right).
In conclusion, $\mathcal{R}(A) = \mathbb{R}^{3\times 1}$ if and only if none of the columns of $A$ can be expressed as a linear combination of the others; this condition also prohibits any of the columns from being an all-zeros vector or a multiple of another column. An equivalent way of stating this is as follows: The only linear combination
$$c_1 a^{(1)} + c_2 a^{(2)} + c_3 a^{(3)}$$
which equals the all-zeros vector $\mathbf{0}$ is the one where $c_1 = c_2 = c_3 = 0$.
We have thus arrived at one of the most important concepts in linear algebra, that of linear independence. We restate it for an arbitrary set of $m$-dimensional vectors.

Definition 2.4.1. The vectors $a^{(1)}, \ldots, a^{(n)}$ in $\mathbb{R}^{m\times 1}$ are linearly independent if the only linear combination
$$c_1 a^{(1)} + \cdots + c_n a^{(n)}$$
which equals the all-zeros vector $\mathbf{0}$ is the one where $c_1 = \ldots = c_n = 0$.
An equivalent definition of linear independence in terms of the $m \times n$ matrix
$$A = \begin{bmatrix} a^{(1)} & \ldots & a^{(n)} \end{bmatrix}$$
is the following: The columns of $A$ are linearly independent if the only solution to the equation
$$Ac = \mathbf{0}$$
is the all-zeros vector $c = \mathbf{0}$ (in $\mathbb{R}^{n\times 1}$).
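The matrix test $Ac = \mathbf{0}$ can be carried out numerically: the columns of $A$ are linearly independent exactly when the rank of $A$ equals the number of columns. A minimal sketch using NumPy (the function name `columns_independent` is our own, not from the text):

```python
import numpy as np

def columns_independent(A):
    """Return True iff the columns of A are linearly independent,
    i.e., iff Ac = 0 only for c = 0."""
    A = np.asarray(A, dtype=float)
    return np.linalg.matrix_rank(A) == A.shape[1]

# Three independent columns of R^3 (the identity matrix):
A = np.eye(3)
print(columns_independent(A))   # True

# Third column = first column + 2 * second column -> dependent:
B = np.array([[ 1.0, 1.0,  3.0],
              [-1.0, 0.0, -1.0],
              [ 0.0, 1.0,  2.0]])
print(columns_independent(B))   # False
```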
2.5 Inverse of a Square Matrix

2.5.1 Nonsingular Matrices

In the previous section, we showed that linear independence of the columns of a $3 \times 3$ matrix $A$ is a necessary and sufficient condition for the range (or column space) $\mathcal{R}(A)$ of $A$ to have dimension $d = 3$, i.e., coincide with $\mathbb{R}^{3\times 1}$.
Since $\mathcal{R}(A)$ is the set of vectors $y \in \mathbb{R}^{3\times 1}$ for which the equation
$$Ax = y$$
has a solution (given by $x$), it follows that linear independence is a necessary and sufficient condition for the above equation to have a solution for every $y$ in $\mathbb{R}^{3\times 1}$.
By extension, the same argument can be applied to an arbitrary $n \times n$ matrix. In other words, linear independence of the columns of $A$ is a necessary and sufficient condition for $\mathcal{R}(A)$ to coincide with $\mathbb{R}^{n\times 1}$, and for the above equation to have a solution for every $y$ in $\mathbb{R}^{n\times 1}$. Thus linear independence provides an answer to the question of existence of an inverse.
The uniqueness of the inverse can also be addressed using the concept of linear independence. Using the argument made for the case $n = 3$, we conclude that if $\mathcal{R}(A) = \mathbb{R}^{n\times 1}$, then every vector in $\mathbb{R}^{n\times 1}$ has a unique representation as a linear combination of the columns of $A$, i.e., the above equation has a unique solution. If, on the other hand, $\mathcal{R}(A)$ has dimension $d < n$, meaning that the columns of $A$ are linearly dependent, then at least one column of $A$ lies in the subspace generated by the remaining columns and can thus be expressed as a linear combination of those columns. Any point in that subspace will have infinitely many representations. As an example, take the case $n = 3$ and suppose that
$$c_1 a^{(1)} + c_2 a^{(2)} + c_3 a^{(3)} = \mathbf{0}$$
where $c$ is such that (say) $c_3 \neq 0$. Then
$$a^{(3)} = -\frac{\lambda c_1}{c_3}\,a^{(1)} - \frac{\lambda c_2}{c_3}\,a^{(2)} + (1-\lambda)\,a^{(3)}$$
for every value of $\lambda \in \mathbb{R}$, showing that $a^{(3)}$ (and by extension, every linear combination of $a^{(1)}$, $a^{(2)}$ and $a^{(3)}$) has infinitely many representations.
In conclusion, linear independence of the columns of an $n \times n$ matrix is a necessary and sufficient condition for the existence of a solution to the equation
$$Ax = y$$
for every $y \in \mathbb{R}^{n\times 1}$, as well as for the uniqueness of that solution.

Definition 2.5.1. An $n \times n$ matrix is nonsingular if its columns are linearly independent. It is singular if its columns are linearly dependent.
Example 2.5.1. The $3 \times 3$ matrix
$$A = \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}$$
is nonsingular. Note that the first column does not equal the all-zeros vector. Also, the second column cannot be obtained from the first one by scaling (no scaling of zero can produce a nonzero entry in the second row). Similarly, the third column cannot be obtained as a linear combination of the first and second columns (no linear combination of zeros can produce a nonzero entry in the third row).
Example 2.5.2. The $3 \times 3$ matrix
$$A = \begin{bmatrix} 1 & 1 & 3 \\ -1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}$$
is singular, since
$$\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + (2)\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}$$
(A test for nonsingularity will be developed later in conjunction with Gaussian elimination.)
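Pending that test, the claim can be confirmed numerically: a square matrix is singular exactly when its rank falls below its size (equivalently, its determinant is zero). A small check of Example 2.5.2, added here for illustration:

```python
import numpy as np

# The singular matrix of Example 2.5.2:
A = np.array([[ 1.0, 1.0,  3.0],
              [-1.0, 0.0, -1.0],
              [ 0.0, 1.0,  2.0]])

print(np.linalg.matrix_rank(A))        # 2: the columns are dependent
print(abs(np.linalg.det(A)) < 1e-12)   # True: determinant is (numerically) zero

# The dependency itself: column 1 + 2 * column 2 equals column 3.
print(np.allclose(A[:, 0] + 2*A[:, 1], A[:, 2]))   # True
```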
2.5.2 Inverse of a Nonsingular Matrix

From the foregoing discussion, it follows that a nonsingular matrix $A$ defines a linear transformation $\mathcal{A} : \mathbb{R}^n \to \mathbb{R}^n$ which is a one-to-one correspondence. In particular, an inverse transformation $\mathcal{A}^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ also exists, and
$$y = \mathcal{A}(x) \quad\Leftrightarrow\quad x = \mathcal{A}^{-1}(y)$$
The transformation $\mathcal{A}^{-1}$ is also linear. To see this, let $y^{(1)}$ and $y^{(2)}$ be any two vectors in $\mathbb{R}^n$, and let
$$x^{(1)} = \mathcal{A}^{-1}(y^{(1)}) \quad\text{and}\quad x^{(2)} = \mathcal{A}^{-1}(y^{(2)})$$
Since the forward transformation $\mathcal{A}$ is linear, we have for any $c_1$ and $c_2$,
$$\mathcal{A}(c_1 x^{(1)} + c_2 x^{(2)}) = c_1 \mathcal{A}(x^{(1)}) + c_2 \mathcal{A}(x^{(2)}) = c_1 y^{(1)} + c_2 y^{(2)}$$
Therefore
$$\mathcal{A}^{-1}(c_1 y^{(1)} + c_2 y^{(2)}) = c_1 x^{(1)} + c_2 x^{(2)} = c_1 \mathcal{A}^{-1}(y^{(1)}) + c_2 \mathcal{A}^{-1}(y^{(2)})$$
which proves that $\mathcal{A}^{-1}$ is a linear transformation. As such, it has a matrix, denoted by $A^{-1}$.
Definition 2.5.2. If $A$ is a nonsingular matrix corresponding to a linear transformation $\mathcal{A}$, then $A^{-1}$ is the matrix of the inverse transformation $\mathcal{A}^{-1}$.
From the above definition, we have
$$(A^{-1})^{-1} = A$$
and we can also evaluate the product $AA^{-1}$. Indeed, $AA^{-1}$ is the matrix of the cascade $\mathcal{A} \circ \mathcal{A}^{-1}$. Since
$$\mathcal{A}(\mathcal{A}^{-1}(y)) = y$$
for every $y \in \mathbb{R}^n$, it follows that
$$\mathcal{A} \circ \mathcal{A}^{-1} = \mathcal{I}$$
namely the identity transformation. The corresponding matrix is the $n \times n$ identity matrix
$$I = \begin{bmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{bmatrix}$$
which clearly satisfies
$$Ix = x$$
for every $x \in \mathbb{R}^{n\times 1}$. We conclude that
$$AA^{-1} = I$$
Similarly, $\mathcal{A}^{-1} \circ \mathcal{A}$ is also an identity system, and therefore
$$A^{-1}A = I$$
Either of the last two equations can be used to check whether two given matrices are inverses of each other.
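As a quick numerical illustration (ours, not part of the text), both identities can be verified for the nonsingular matrix of Example 2.5.1 with, say, $a = 2$, $b = 3$, $c = 4$:

```python
import numpy as np

a, b, c = 2.0, 3.0, 4.0
A = np.array([[1, a, b],
              [0, 1, c],
              [0, 0, 1]], dtype=float)

A_inv = np.linalg.inv(A)
I = np.eye(3)

# Both products must equal the identity matrix:
print(np.allclose(A @ A_inv, I))   # True
print(np.allclose(A_inv @ A, I))   # True
```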
2.5.3 Inverses in Matrix Algebra

Fact. If $A$ and $B$ are both nonsingular, then so is the product $AB$, and
$$(AB)^{-1} = B^{-1}A^{-1}$$
This fact can be demonstrated schematically (see Figure 2.7), by noting that $AB$ is the matrix of the cascade $\mathcal{A} \circ \mathcal{B}$. This cascade can be inverted by applying the transformation $\mathcal{A}^{-1}$ to the output, followed by $\mathcal{B}^{-1}$.
Figure 2.7: Illustration of the identity $(AB)^{-1} = B^{-1}A^{-1}$.
Thus
$$(\mathcal{A} \circ \mathcal{B})^{-1} = \mathcal{B}^{-1} \circ \mathcal{A}^{-1}$$
and consequently
$$(AB)^{-1} = B^{-1}A^{-1}$$
An alternative proof is based on the identities developed in the previous section. We have
$$A^{-1}(AB) = (A^{-1}A)B = B$$
where the first equality is due to the associativity of the matrix product. It follows that
$$B^{-1}A^{-1}(AB) = B^{-1}B = I$$
and thus $AB$ and $B^{-1}A^{-1}$ are inverses of each other.
Fact. If $A$ is nonsingular, then so is $A^T$, and
$$(A^T)^{-1} = (A^{-1})^T$$
To prove this identity, recall that
$$(AB)^T = B^T A^T$$
Taking $B = A^{-1}$, we obtain
$$I^T = (A^{-1})^T A^T$$
But $I$ is symmetric: $I^T = I$. Therefore
$$(A^{-1})^T A^T = I$$
which means that
$$(A^T)^{-1} = (A^{-1})^T \qquad \blacksquare$$
2.5.4 Nonsingularity of Triangular Matrices

Definition 2.5.3. A lower triangular matrix $A$ is a square matrix defined by
$$(\forall j > i) \quad a_{ij} = 0$$
i.e., all elements above the main diagonal (also known as superdiagonal elements) are zero. Similarly, an upper triangular matrix $A$ is a square matrix whose elements below the main diagonal (the subdiagonal elements) are zero.

Triangular matrices have numerous applications in linear algebra, and are particularly useful in solving equations of the form $Ax = b$, where $A$ is square. The question of invertibility (i.e., nonsingularity) of triangular matrices is important in all such applications. As it turns out, nonsingularity can be readily determined by inspection of the matrix.
Recall how, in Example 2.5.1, we argued that the upper triangular matrix
$$\begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}$$
is nonsingular. A similar argument could be given for the lower triangular matrix
$$\begin{bmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & c & 1 \end{bmatrix}$$
but there is no need to do so: a matrix is nonsingular if and only if its transpose is. In what follows, we make a general observation about triangular matrices of arbitrary size.

Fact. A triangular matrix $A$ is nonsingular if and only if all elements on its main diagonal are nonzero, i.e., $a_{ii} \neq 0$ for all $i$.
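This fact gives an $O(n)$ nonsingularity test for triangular matrices: scan the main diagonal. A minimal sketch (the function name is ours):

```python
import numpy as np

def triangular_nonsingular(A):
    """Test nonsingularity of a triangular matrix by inspecting its
    main diagonal: nonsingular iff every a_ii is nonzero."""
    return bool(np.all(np.diag(A) != 0))

U = np.array([[1, 5, 7],
              [0, 2, 9],
              [0, 0, 3]], dtype=float)
print(triangular_nonsingular(U))   # True

V = np.array([[1, 5, 7],
              [0, 0, 9],      # zero on the main diagonal
              [0, 0, 3]], dtype=float)
print(triangular_nonsingular(V))   # False
```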
To prove the "if" statement, suppose that all (main) diagonal elements are nonzero. Arguing as in Example 2.5.1, we see immediately that the first column does not equal the all-zeros vector (since $a_{11} \neq 0$). Examining each of the following columns in turn, we see that the $j$th column cannot be expressed as a linear combination of the first $j-1$ columns, since its nonzero diagonal entry $a_{jj}$ would have to be obtained by a linear combination of zeros, which is clearly impossible. Thus the columns of $A$ are linearly independent, and $A$ is nonsingular.
To prove the "only if" statement, we argue by contradiction. Suppose that there is at least one zero on the diagonal, and that the leftmost such zero is in the $j$th column, i.e.,
$$a_{jj} = 0 \quad\text{and}\quad (\forall i < j)\ a_{ii} \neq 0$$
This means that the first $j$ columns of $A$ will be of the form
$$\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \ldots & a_{1,j-1} & a_{1j} \\
0 & a_{22} & a_{23} & \ldots & a_{2,j-1} & a_{2j} \\
0 & 0 & a_{33} & \ldots & a_{3,j-1} & a_{3j} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \ldots & a_{j-1,j-1} & a_{j-1,j} \\
0 & 0 & 0 & \ldots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \ldots & 0 & 0
\end{bmatrix}$$
Since diagonal elements $a_{11}$ through $a_{j-1,j-1}$ are all nonzero, the first $j-1$ columns will be linearly independent (see the "if" argument above). The same will be true of the column segments obtained by discarding the zeros beneath the $(j-1)$th row. Since the range of $j-1$ linearly independent vectors in $\mathbb{R}^{j-1}$ is the entire space $\mathbb{R}^{j-1}$, we can obtain the first $j-1$ entries of the $j$th column by linearly combining the corresponding segments of the first $j-1$ columns. The entire $j$th column can therefore be obtained by a linear combination of the first $j-1$ columns, and the matrix $A$ is singular.
2.5.5 The Inversion Problem for Matrices of Arbitrary Size

The concept of linear independence allows us to investigate the existence and uniqueness of the solution of the equation
$$Ax = y$$
where $y \in \mathbb{R}^{m\times 1}$ is known, $x \in \mathbb{R}^{n\times 1}$ is unknown and the matrix $A$ is of size $m \times n$, with $n \neq m$.
In short, depending on whether $m > n$ or $m < n$, and on whether the columns of $A$ are linearly dependent or independent, a solution may or may not exist for a particular $y \in \mathbb{R}^{m\times 1}$. If a solution exists for all $y \in \mathbb{R}^{m\times 1}$, then that solution cannot be unique. This is in sharp contrast to the case $m = n$, where existence of a solution for all $y \in \mathbb{R}^{m\times 1}$ is equivalent to uniqueness of that solution (and also equivalent to linear independence of the columns of $A$). It also tells us that defining an inverse matrix $A^{-1}$ in the case $n \neq m$ can be very tricky; in fact, it is not done.

In more specific terms, the equation $Ax = y$ seeks to express the $m$-dimensional column vector $y$ as a linear combination of the $n$ columns of $A$. A solution exists if and only if $y \in \mathcal{R}(A)$; and it is unique if and only if every $y \in \mathcal{R}(A)$ is given by a unique linear combination of the columns of $A$, i.e., if and only if the columns of $A$ are linearly independent. Note that since $A$ has $n$ columns, the dimension $d$ of $\mathcal{R}(A)$ must satisfy $d \leq n$. And since $\mathcal{R}(A)$ is a subset of $\mathbb{R}^{m\times 1}$, we must also have $d \leq m$. Thus $d \leq \min(m, n)$.
The overdetermined case ($n < m$). Here $d < m$, hence $\mathcal{R}(A)$ is a lower-dimensional subspace of $\mathbb{R}^{m\times 1}$. Thus there are (infinitely many) $y$'s for which the equation has no solution. For those $y$'s for which a solution exists (i.e., the $y$'s in $\mathcal{R}(A)$), the solution is unique if and only if the columns of $A$ are linearly independent, i.e., $d = n$.

The underdetermined case ($n > m$). The number of columns of $A$ exceeds $m$, the dimension of $\mathbb{R}^{m\times 1}$. We know that at most $m$ of these columns can be linearly independent, i.e., $d \leq m$. There are two possibilities:

• If $d = m$, then $\mathcal{R}(A) = \mathbb{R}^{m\times 1}$ and a solution exists for every $y \in \mathbb{R}^{m\times 1}$. That solution is not unique, since the $n - m$ redundant columns can also be used in the linear combination $Ax$ which gives $y$.

• If $d < m$, then $\mathcal{R}(A)$ is a lower-dimensional subset of $\mathbb{R}^{m\times 1}$. Then a solution cannot exist for every $y$; if it does exist, it cannot (again) be unique.

The overdetermined case will be examined later on from the viewpoint of signal approximation, i.e., solving for the $x$ which minimizes a function of the error vector $Ax - y$.
2.6 Gaussian Elimination

2.6.1 Statement of the Problem

We now focus on solving the equation
$$Ax = b$$
where $A$ is an $n \times n$ matrix, and $x$ and $b$ are both $n$-dimensional column vectors. This (matrix-vector) equation can be expanded into $n$ simultaneous linear equations in the variables $x_1, \ldots, x_n$:
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$
We know that a unique solution $x$ exists for every $b$ if and only if $A$ is nonsingular. The procedure (algorithm) known as Gaussian elimination produces the unique solution $x$ in the case where $A$ is nonsingular, and can also be used to detect singularity of $A$.
2.6.2 Outline of Gaussian Elimination

The algorithm is based on the fact that if
$$Ax = b \quad\text{and}\quad MAv = Mb$$
for two nonsingular matrices $A$ and $M$, then $x = v$. Indeed,
$$v = (MA)^{-1}Mb = A^{-1}M^{-1}Mb = A^{-1}b = x$$
where all the inverses in the above equalities are valid by the assumption of nonsingularity. Thus the equations $Ax = b$ and $MAx = Mb$ have the same solution.
In Gaussian elimination, left-multiplication of both sides of the equation by a nonsingular matrix is performed repeatedly until the matrix on the left is reduced to the $n \times n$ identity:
$$\begin{aligned}
Ax &= b \\
M^{(1)}Ax &= M^{(1)}b \\
M^{(2)}M^{(1)}Ax &= M^{(2)}M^{(1)}b \\
&\;\;\vdots \\
M^{(l)}\cdots M^{(2)}M^{(1)}Ax &= M^{(l)}\cdots M^{(2)}M^{(1)}b
\end{aligned}$$
where
$$M^{(l)}\cdots M^{(2)}M^{(1)}A = I \quad\text{or equivalently,}\quad M^{(l)}\cdots M^{(2)}M^{(1)} = A^{-1}$$
The solution is then given by the right-hand side vector:
$$x = M^{(l)}\cdots M^{(2)}M^{(1)}b$$
Recall that left multiplication of a matrix by a row vector (i.e., a vector-matrix product) produces a linear combination of the matrix rows, with coefficients given by the corresponding entries in the row vector. By considering each row of the matrix $M$ separately, we see that the product $MA$ (which has the same dimensions as $A$) consists of rows that are linear combinations of the rows of $A$. Thus Gaussian elimination proceeds by a sequence of steps, each step involving certain row operations which are "encoded" as a nonsingular matrix $M^{(j)}$. We should add that, since the same matrix $M^{(j)}$ also multiplies the right-hand side vector, we effectively take linear combinations of entire equations (not just matrix rows).
Gaussian elimination is completed in two phases:

• forward elimination (also known as forward substitution), followed by

• backward substitution (also known as backward elimination).

The forward elimination phase is completed in $n-1$ steps. During the $j$th step, scaled versions of the $j$th equation are subtracted from the equations below it so as to eliminate the variable $x_j$ from all these equations. The matrix $M^{(j)}$ is a (nonsingular) lower triangular matrix whose form we will soon examine. As we shall see later, in certain situations it may be desirable (or even necessary) to change the order of equations indexed $j$ through $n$ prior to executing the $j$th step. Such interchanges can be formally dealt with by inserting permutation matrices into the string $M^{(l)}\cdots M^{(1)}$; an easier, and more practical, way of representing such permutations will also be discussed.
The backward substitution phase is completed in $n$ steps. The $j$th step solves for variable $x_{n-j+1}$ using the previously computed values of $x_n, x_{n-1}, \ldots, x_{n-j+2}$. The matrices $M^{(j)}$ are (nonsingular) upper triangular matrices. Different equivalent forms for these matrices will be examined in Section 2.7. The solution of $Ax = b$ is thus obtained in a total of $l = 2n-1$ steps.
2.6.3 Gaussian Elimination Illustrated

Consider the $3 \times 3$ system given by
$$\begin{aligned}
2x_1 + x_2 - x_3 &= 6 \\
4x_1 - x_3 &= 6 \\
-8x_1 + 2x_2 + 3x_3 &= -10
\end{aligned}$$
Since row operations (i.e., multiplication by matrices $M^{(j)}$) are performed on both sides of the equations, it is convenient to append the vector $b$ to the matrix $A$:
$$S = \begin{bmatrix} A & b \end{bmatrix}$$
We will use the symbol $S$ to denote the initial matrix $[A \; b]$ as well as its subsequent updates; $s_{ij}$ will refer to the $(i,j)$th element in the current matrix $S$.

We display $S$ in a table, with an additional empty column on the left labeled "m" (for "multiplier"):

  m |  x1  x2  x3 |   b
    |   2   1  −1 |   6
    |   4   0  −1 |   6
    |  −8   2   3 | −10
We eliminate $x_1$ from the second row by adding a multiple of the first row (to the second row). By inspection of the first column, the value of the multiplier equals $-(4/2) = -2$. Similarly, $x_1$ can be eliminated from the third row by adding a multiple of the first row (to the third row), the value of the multiplier being $-(-8)/2 = 4$. These two values are entered in the m column, in their respective rows. The coefficient of $x_1$ in the first row is also underlined; this coefficient was used to obtain the multiplier for each subsequent row.

   m |  x1  x2  x3 |   b
     |   2   1  −1 |   6
  −2 |   4   0  −1 |   6
   4 |  −8   2   3 | −10
Upon adding the appropriate multiples of the first row to the second and third, we obtain

  m |  x1  x2  x3 |   b
    |   2   1  −1 |   6
    |   0  −2   1 |  −6
    |   0   6  −1 |  14
To eliminate $x_2$ from the third row, we add a multiple of the second row (to the third row), the value of the multiplier being $-(6/(-2)) = 3$. As before, we enter the multiplier in the m column, in the third row. The coefficient of $x_2$ in the second equation is also underlined.

  m |  x1  x2  x3 |   b
    |   2   1  −1 |   6
    |   0  −2   1 |  −6
  3 |   0   6  −1 |  14
Upon adding the multiple of the second row to the third one, we obtain

  m |  x1  x2  x3 |   b
    |   2   1  −1 |   6
    |   0  −2   1 |  −6
    |   0   0   2 |  −4
At this point we have completed the forward elimination. The resulting equations are
$$\begin{aligned}
2x_1 + x_2 - x_3 &= 6 \\
-2x_2 + x_3 &= -6 \\
2x_3 &= -4
\end{aligned}$$
In backward substitution, we solve for $x_3$ (third equation):
$$x_3 = (-4)/2 = -2\,;$$
then use the result to solve for $x_2$ (second equation):
$$x_2 = (-6 - x_3)/(-2) = 2\,;$$
and finish by solving for $x_1$ (first equation):
$$x_1 = (6 - x_2 + x_3)/2 = 1\,.$$
Thus $x = [1 \;\; 2 \; -2]^T$.
2.6.4 Row Operations in Forward Elimination

Let $S$ be the current update of $[A \; b]$, and denote by $(S)_i$ the $i$th row of $S$. As we saw in the previous subsection, the $j$th step in the forward elimination entails replacing each row $(S)_i$ below $(S)_j$ by
$$(S)_i + m_{ij}(S)_j$$
where
$$m_{ij} = -\frac{s_{ij}}{s_{jj}}$$
Rows $(S)_1, \ldots, (S)_j$ are not modified.
These row operations can be carried out using matrix multiplication, namely by updating $S$ to the product $M^{(j)}S$, where
$$M^{(j)} = \begin{bmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & & & \\
& & m_{j+1,j} & 1 & & \\
& & \vdots & & \ddots & \\
& & m_{nj} & & & 1
\end{bmatrix}$$
(only nonzero elements are shown above). Note that the lower triangular matrix $M^{(j)}$ differs from the identity only in the subdiagonal segment of the $j$th column. That column segment (given by $m_{j+1,j}, \ldots, m_{n,j}$) matches the entries of the m column in the Gaussian elimination table.
Thus, in the example of the previous subsection, we had
$$M^{(1)} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix} \quad\text{and}\quad M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}.$$
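One can check numerically that these two matrices do encode the row operations of Subsection 2.6.3 (a verification we add for illustration):

```python
import numpy as np

M1 = np.array([[ 1, 0, 0],
               [-2, 1, 0],
               [ 4, 0, 1]], dtype=float)
M2 = np.array([[1, 0, 0],
               [0, 1, 0],
               [0, 3, 1]], dtype=float)
# The initial augmented matrix S = [A b]:
S = np.array([[ 2, 1, -1,   6],
              [ 4, 0, -1,   6],
              [-8, 2,  3, -10]], dtype=float)

# Applying M1 then M2 reproduces the final forward-elimination table:
print(M2 @ M1 @ S)   # rows [2 1 -1 6], [0 -2 1 -6], [0 0 2 -4]
```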
2.7 Factorization of Matrices in Gaussian Elimination

2.7.1 LU Factorization

Consider again the $3 \times 3$ example of Subsection 2.6.3:
$$\begin{aligned}
2x_1 + x_2 - x_3 &= 6 \\
4x_1 - x_3 &= 6 \\
-8x_1 + 2x_2 + 3x_3 &= -10
\end{aligned}$$
We saw that the row operations performed during the forward elimination phase amounted to left-multiplying the matrix
$$S = \begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} 2 & 1 & -1 & 6 \\ 4 & 0 & -1 & 6 \\ -8 & 2 & 3 & -10 \end{bmatrix}$$
by the lower triangular matrices
$$M^{(1)} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix} \quad\text{and}\quad M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}$$
in succession. The update of $S$ at the end of the forward elimination phase was
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 & -1 & 6 \\ 4 & 0 & -1 & 6 \\ -8 & 2 & 3 & -10 \end{bmatrix} = \begin{bmatrix} 2 & 1 & -1 & 6 \\ 0 & -2 & 1 & -6 \\ 0 & 0 & 2 & -4 \end{bmatrix}$$
and thus
$$M^{(2)}M^{(1)}\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} U & y \end{bmatrix}$$
where the matrix
$$U = M^{(2)}M^{(1)}A = \begin{bmatrix} 2 & 1 & -1 \\ 0 & -2 & 1 \\ 0 & 0 & 2 \end{bmatrix}$$
was upper triangular. We then solved
$$Ux = y$$
was upper triangular. We then solved
Ux = y
85
by backward substitution, a procedure which, as we shall soon see, can be
also implemented by a series of matrix products.
Recall that triangular matrices are nonsingular if and only if they have
no zeros on the main diagonal. Thus both M
(1)
and M
(2)
are nonsingular
in this case, and since M
(2)
M
(1)
A = U, we have that
A = (M
(1)
)
−1
(M
(2)
)
−1
U
In general, the inverse of a (nonsingular) lower triangular matrix is also
lower triangular; and the product of two such matrices is also lower trian-
gular. Thus we can write
A = LU
where
L = (M
(1)
)
−1
(M
(2)
)
−1
is lower triangular.
The foregoing discussion is clearly applicable to any dimension n, i.e.,
forward elimination on a nonsingular n n matrix A yields a factorization
of the form A = LU, where
L = (M
(1)
)
−1
(M
(n−1)
)
−1
2.7.2 The Matrix L

It turns out that $L$ is particularly easy to compute, due to the following interesting fact.

Fact. Let $C^{(j)}$ be an $n \times n$ matrix whose entries are zero with the possible exception of the subdiagonal segment of the $j$th column. If $k \geq j$, then
$$C^{(j)}C^{(k)} = \mathbf{0}$$
(Here $\mathbf{0}$ is the all-zeros matrix in $\mathbb{R}^{n\times n}$.)
To prove this fact, note the special form of $C^{(j)}$ (elements not shown are zero):
$$C^{(j)} = \begin{bmatrix}
0 & & & & & \\
& \ddots & & & & \\
& & 0 & & & \\
& & m_{j+1,j} & 0 & & \\
& & \vdots & & \ddots & \\
& & m_{nj} & & & 0
\end{bmatrix}$$
Right multiplication of $C^{(j)}$ by $C^{(k)}$ will yield all-zero vectors in all columns of $C^{(j)}C^{(k)}$ except (possibly) the $k$th column. That column is a linear combination of the columns of $C^{(j)}$, with coefficients given by the corresponding entries in the $k$th column of $C^{(k)}$. Since the nonzero entries in the $k$th column of $C^{(k)}$ appear below the main diagonal, only columns indexed $k+1$ through $n$ of $C^{(j)}$ are included in the linear combination. But those columns are all zero, since $k + 1 > j$ by hypothesis. This completes the proof.
Now, the lower triangular matrix
$$M^{(j)} = \begin{bmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & & & \\
& & m_{j+1,j} & 1 & & \\
& & \vdots & & \ddots & \\
& & m_{nj} & & & 1
\end{bmatrix}$$
can be expressed as
$$M^{(j)} = I + C^{(j)}$$
Using the previous fact, we have
$$(I + C^{(j)})(I - C^{(j)}) = I - C^{(j)} + C^{(j)} - C^{(j)}C^{(j)} = I$$
and therefore
$$(M^{(j)})^{-1} = I - C^{(j)} = \begin{bmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & & & \\
& & -m_{j+1,j} & 1 & & \\
& & \vdots & & \ddots & \\
& & -m_{nj} & & & 1
\end{bmatrix}$$
Using the same fact, we have that
$$(M^{(1)})^{-1}(M^{(2)})^{-1} = (I - C^{(1)})(I - C^{(2)}) = I - C^{(1)} - C^{(2)} + C^{(1)}C^{(2)} = I - C^{(1)} - C^{(2)}$$
and thus the product $(M^{(1)})^{-1}(M^{(2)})^{-1}$ is obtained by "overlaying" the two matrices, i.e., adding their subdiagonal parts while keeping the same (unit) diagonal. Right multiplication by $(M^{(3)})^{-1}, \ldots, (M^{(n-1)})^{-1}$ in succession yields the final result
$$L = (M^{(1)})^{-1}(M^{(2)})^{-1}\cdots(M^{(n-1)})^{-1} = I - C^{(1)} - C^{(2)} - \cdots - C^{(n-1)}$$
or
$$L = \begin{bmatrix}
1 & & & & \\
-m_{21} & 1 & & & \\
-m_{31} & -m_{32} & \ddots & & \\
\vdots & \vdots & \ddots & \ddots & \\
-m_{n-1,1} & -m_{n-1,2} & \ldots & 1 & \\
-m_{n1} & -m_{n2} & \ldots & -m_{n,n-1} & 1
\end{bmatrix}$$
(all superdiagonal elements are zero).
In other words, the subdiagonal entries of $L$ are obtained from the m column of the Gaussian elimination table by reversing signs.
Example 2.7.1. Continuing the example of Subsection 2.7.1, we have
$$M^{(1)} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix} \quad\text{and}\quad M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}$$
Therefore
$$(M^{(1)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \quad\text{and}\quad (M^{(2)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{bmatrix}$$
Using the overlay property, we obtain
$$(M^{(1)})^{-1}(M^{(2)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & -3 & 1 \end{bmatrix}$$
Thus the LU factorization of $A$ is given by
$$\begin{bmatrix} 2 & 1 & -1 \\ 4 & 0 & -1 \\ -8 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -4 & -3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 1 & -1 \\ 0 & -2 & 1 \\ 0 & 0 & 2 \end{bmatrix} \qquad \blacksquare$$
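The factorization can be confirmed with a few lines of NumPy (a check of the example, not part of the text):

```python
import numpy as np

A = np.array([[ 2, 1, -1],
              [ 4, 0, -1],
              [-8, 2,  3]], dtype=float)
# L and U from Example 2.7.1:
L = np.array([[ 1,  0, 0],
              [ 2,  1, 0],
              [-4, -3, 1]], dtype=float)
U = np.array([[2,  1, -1],
              [0, -2,  1],
              [0,  0,  2]], dtype=float)

print(np.allclose(L @ U, A))   # True
```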
Note: The overlay property cannot be used to compute the product
$$L^{-1} = M^{(n-1)}\cdots M^{(2)}M^{(1)}$$
since the order of the nonzero column segments in the $M^{(j)}$'s is reversed. Thus in Example 2.7.1,
$$L^{-1} = M^{(2)}M^{(1)} \neq M^{(1)}M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 4 & 3 & 1 \end{bmatrix}$$
2.7.3 Applications of the LU Factorization

In many practical situations, the equation
$$Ax = b$$
is solved for the same matrix $A$ (which could represent a known system or a set of standard reference signals), but for many different vectors $b$. The LU factorization of $A$ allows us to store the results of the forward elimination in a way that minimizes the computational effort involved in solving the equation for a new value of $b$. Indeed, forward elimination on $A$ can be performed ahead of time (i.e., off-line). It is interesting to note that the information contained in the pair $(L, U)$ (ignoring the known zero and unit elements) amounts to $n^2$ real numbers, which is exactly the same amount as in the original matrix $A$.
For a given $b$, the equation
$$Ax = b$$
is equivalent to
$$LUx = b$$
which has solution
$$x = A^{-1}b = U^{-1}L^{-1}b$$
We saw that forward elimination transforms $A$ to $L^{-1}A = U$, and that backward substitution transforms $U$ to $U^{-1}U = I$. Focusing on the updates of the right-hand side vector $b$, we note that forward elimination transforms $b$ to
$$y = L^{-1}b$$
and backward substitution transforms $y$ to
$$x = U^{-1}y$$
Thus in effect, forward elimination solves the equation
$$Ly = b$$
while backward substitution solves
$$Ux = y$$
The two systems above can be solved using the same principle, namely substitution of known values into the equations below (in the former case) or above (in the latter case). Thus the terms substitution and elimination can be used interchangeably in both the forward and the backward phase of Gaussian elimination. The similarity between the two procedures becomes more prominent when the factors $L$ and $U$ are known ahead of time, in which case there is no need to derive $L$ by working on $A$.
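With $L$ and $U$ precomputed, each new $b$ costs only two triangular solves. A sketch of the two substitution loops (function names are ours; the factors are those of Example 2.7.1):

```python
import numpy as np

def forward_solve(L, b):
    """Solve Ly = b for lower triangular L (substitution downward)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_solve(U, y):
    """Solve Ux = y for upper triangular U (substitution upward)."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

L = np.array([[ 1,  0, 0], [2,  1, 0], [-4, -3, 1]], dtype=float)
U = np.array([[ 2,  1, -1], [0, -2, 1], [ 0,  0, 2]], dtype=float)
b = np.array([6, 6, -10], dtype=float)

y = forward_solve(L, b)    # y = L^{-1} b
x = backward_solve(U, y)   # x = U^{-1} y
print(x)                   # solution of Ax = b: [1, 2, -2]
```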
2.7.4 Row Operations in Backward Substitution

The analogy between the lower triangular system $Ly = b$ and the upper triangular system $Ux = y$ suggests that row operations in backward substitution may also be implemented by a series of matrix multiplications, similar to those used in forward elimination. This is indeed true, and is illustrated in Example 2.7.2.
Example 2.7.2. The forward elimination phase in Subsection 2.6.3 resulted in

  m |  x1  x2  x3 |   b
    |   2   1  −1 |   6
    |   0  −2   1 |  −6
    |   0   0   2 |  −4

Dividing each row by its (main) diagonal element, we obtain

  m |  x1   x2    x3 |   b
    |   1  1/2  −1/2 |   3
    |   0    1  −1/2 |   3
    |   0    0     1 |  −2

We now enter multipliers 1/2 and 1/2 for the first and second rows, respectively. These are the sign-inverted entries from the third column (corresponding to $x_3$).

    m |  x1   x2    x3 |   b
  1/2 |   1  1/2  −1/2 |   3
  1/2 |   0    1  −1/2 |   3
      |   0    0     1 |  −2
Scaling the third row by the multiplier shown and adding it to the corresponding row results in eliminating $x_3$ from both the first and second equations:

     m |  x1   x2  x3 |   b
  −1/2 |   1  1/2   0 |   2
       |   0    1   0 |   2
       |   0    0   1 |  −2

Finally, we use the multiplier $-1/2$ (shown above) to scale the second row prior to adding it to the first one, which results in elimination of $x_2$ from the first equation:

  m |  x1  x2  x3 |   b
    |   1   0   0 |   1
    |   0   1   0 |   2
    |   0   0   1 |  −2
We have thus reduced $U$ to the identity $I$, and obtained the solution $x$ in the last column.

In terms of matrix multiplications, we have
$$I = M^{(5)}M^{(4)}M^{(3)}U$$
where
$$M^{(3)} = \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & -1/2 & 0 \\ 0 & 0 & 1/2 \end{bmatrix}, \quad M^{(4)} = \begin{bmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1/2 \\ 0 & 0 & 1 \end{bmatrix}$$
and
$$M^{(5)} = \begin{bmatrix} 1 & -1/2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \blacksquare$$
We make the following additional observations on Example 2.7.2.

• $M^{(3)}$ is actually diagonal, with inverse
$$D = (M^{(3)})^{-1} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
The result of left-multiplying $U$ by $M^{(3)} = D^{-1}$ is
$$V = D^{-1}U = \begin{bmatrix} 1 & 1/2 & -1/2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1 \end{bmatrix}$$
which, like $L$, has diagonal entries normalized to unity. We thus have an equivalent normalized factorization
$$A = LDV$$
• If the diagonal entries are not normalized prior to backward substitution, then the multipliers will be of the form $-s_{ij}/s_{jj}$, as was the case in forward elimination. At the end, $U$ will be reduced to a diagonal (not necessarily identity) matrix.
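The normalized factorization can again be checked numerically for the running example (our own verification, using the $L$, $D$ and $V$ obtained above):

```python
import numpy as np

L = np.array([[ 1,  0, 0], [2, 1, 0], [-4, -3, 1]], dtype=float)
D = np.diag([2.0, -2.0, 2.0])
V = np.array([[1, 0.5, -0.5],
              [0, 1.0, -0.5],
              [0, 0.0,  1.0]])
A = np.array([[ 2, 1, -1],
              [ 4, 0, -1],
              [-8, 2,  3]], dtype=float)

# D @ V rebuilds U, and L @ D @ V rebuilds A:
print(np.allclose(L @ D @ V, A))   # True
```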
2.8 Pivoting in Gaussian Elimination

2.8.1 Pivots and Pivot Rows

The $j$th step in the forward elimination process aims at eliminating $x_j$ from all equations below the $j$th one. This is accomplished by subtracting multiples of the $j$th row of the updated matrix
$$S = M^{(j-1)}\cdots M^{(2)}M^{(1)}\begin{bmatrix} A & b \end{bmatrix}$$
from rows indexed $j+1$ through $n$. Thus for $i \geq j+1$ (only), row $(S)_i$ is replaced by
$$(S)_i + m_{ij}(S)_j$$
This is equivalent to left-multiplying $S$ by
$$M^{(j)} = \begin{bmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & & & \\
& & m_{j+1,j} & 1 & & \\
& & \vdots & & \ddots & \\
& & m_{nj} & & & 1
\end{bmatrix}$$
The multipliers are given by
$$m_{ij} = -\frac{s_{ij}}{s_{jj}}$$
The denominator $s_{jj}$ in the above expression is known as the pivotal element, or simply pivot, for $x_j$. The $j$th row, multiples of which are subtracted from the remaining $n-j$ rows (below it), is known as the pivot row for $x_j$. Clearly, the pivot row contains $s_{jj}$ in its $j$th position (column).
2.8.2 Zero Pivots

If a zero pivot $s_{jj}$ is encountered in the forward elimination process, then the $j$th row cannot be used for eliminating $x_j$ from the rows below it: the multipliers $m_{ij}$ would be infinite for all $i$ such that $s_{ij} \neq 0$.

Assuming that such an $s_{ij} \neq 0$ exists (for $i > j$), it makes sense to interchange the $i$th and $j$th rows and use the nonzero value (formerly $s_{ij}$, now $s_{jj}$) as the pivot. This is a simple concept which will be illustrated later.
An interesting question is what happens when $s_{jj}$, as well as all elements below it, are zero. To answer this question, suppose that $s_{jj}$ is the first zero pivot encountered. We will then have
$$M^{(j-1)}\cdots M^{(2)}M^{(1)}A = \begin{bmatrix}
s_{11} & s_{12} & s_{13} & \ldots & s_{1,j-1} & s_{1j} & \ldots \\
0 & s_{22} & s_{23} & \ldots & s_{2,j-1} & s_{2j} & \ldots \\
0 & 0 & s_{33} & \ldots & s_{3,j-1} & s_{3j} & \ldots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \\
0 & 0 & 0 & \ldots & s_{j-1,j-1} & s_{j-1,j} & \ldots \\
0 & 0 & 0 & \ldots & 0 & 0 & \ldots \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \\
0 & 0 & 0 & \ldots & 0 & 0 & \ldots
\end{bmatrix}$$
where the diagonal entries $s_{11}, \ldots, s_{j-1,j-1}$ are previously used pivots and are therefore nonzero. As we argued earlier in Subsection 2.5.4, the $j$th column can be obtained as a linear combination of the columns to its left, and thus the matrix is singular. Since
$$M^{(j-1)}\cdots M^{(2)}M^{(1)}A$$
is singular, $A$ must be singular also. (This can be argued by contradiction: if $A$ were nonsingular, then $M^{(j-1)}\cdots M^{(2)}M^{(1)}A$ would be the product of two nonsingular matrices, and would thus be nonsingular.)

Singularity of $A$ will therefore result in a zero pivot during the forward phase of the Gaussian elimination. Recall that singularity implies that
$$Ax = b$$
will have a solution only for certain vectors $b$, namely those in the range of $A$ (which has dimension smaller than $n$). If the solution exists, it will not be unique. The course of action when singularity of $A$ is detected depends very much on the application in question.
2.8.3 Small Pivots

We saw that row interchanges are necessary when zero pivots are encountered (assuming $A$ is nonsingular). Row interchanges are also highly desirable when small pivots are likely to cause problems in finite-precision calculations.
Consider the simple $2 \times 2$ system given by
$$\begin{aligned}
\epsilon x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2
\end{aligned}$$
where $a_{12}$, $a_{21}$, $a_{22}$, $b_1$ and $b_2$ are of the same order of magnitude, and are all much larger than $\epsilon$. The exact solution is easy to obtain in closed form; if we ignore small contributions due to $\epsilon$, we have the approximations
$$x_1 \approx \frac{a_{12}b_2 - a_{22}b_1}{a_{12}a_{21}} \quad\text{and}\quad x_2 \approx \frac{b_1}{a_{12}}$$
Neither $x_1$ nor $x_2$ is particularly large if the constants involved (except $\epsilon$) are of the same order of magnitude.
Clearly, the same solution is obtained regardless of the pivoting order.
But if precision is an issue, then the order shown above can lead to large
errors in the variable x_1, which is obtained from the computed value of x_2
(by back-substitution):

    x_1 = (b_1 - a_12 x_2)/ε

We know that the value of x_1 is not particularly large; therefore the
numerator cannot be that much larger than the denominator. Yet both b_1 and
a_12 x_2 are (absolutely) much larger than ε. This means that the difference
b_1 - a_12 x_2 is very small compared to either term, and is likely to be
poorly approximated if the precision of the computation is not high enough.
The following numerical example further illustrates these difficulties.
Example 2.8.1. Consider the system

    0.003x_1 + 6x_2 = 4.001
    x_1 + x_2 = 1

which has exact solution x_1 = 1/3, x_2 = 2/3. Let us solve the system using
four-digit precision throughout.

Treating the equations in their original order, we obtain a multiplier

    m_21 = -(1.000)/(0.003) = -333.3

and thus the second equation becomes

    (1 - (333.3)(6))x_2 = 1 - (333.3)(4.001)

The resulting value of x_2 is

    x_2 = 0.6668

Although the relative error in x_2 is only 0.02%, the effect of this error
is much more severe on x_1: using that value for x_2 in the first equation,
we have

    x_1 = (4.001 - (6)(0.6668))/0.003

Rounding the product (6)(0.6668) to four digits results in 4.001, and thus
x_1 = 0.000. This is clearly unsatisfactory.

Interchanging the equations and repeating the calculation, we obtain
x_2 = 0.6667 and x_1 = 0.3333, i.e., the correct values (rounded to four
significant digits).
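The precision effects in Example 2.8.1 are easy to reproduce by simulating
four-significant-digit arithmetic. The sketch below (Python; the helper
round_sig is introduced here for illustration and is not part of the text)
solves the system in both orders:

```python
from math import floor, log10

def round_sig(x, digits=4):
    # Round x to the given number of significant digits.
    if x == 0.0:
        return 0.0
    return round(x, -int(floor(log10(abs(x)))) + digits - 1)

def solve_2x2(a11, a12, b1, a21, a22, b2):
    # Gaussian elimination on a 2x2 system, rounding every intermediate
    # result to four significant digits (simulated low precision).
    m = round_sig(-a21 / a11)                    # multiplier m_21
    a22p = round_sig(a22 + round_sig(m * a12))   # new (2,2) entry
    b2p = round_sig(b2 + round_sig(m * b1))      # new right-hand side
    x2 = round_sig(b2p / a22p)
    x1 = round_sig(round_sig(b1 - round_sig(a12 * x2)) / a11)
    return x1, x2

# Original order: the tiny pivot 0.003 destroys x_1.
print(solve_2x2(0.003, 6.0, 4.001, 1.0, 1.0, 1.0))   # (0.0, 0.6668)
# Interchanged order: the pivot 1 gives the correct answer.
print(solve_2x2(1.0, 1.0, 1.0, 0.003, 6.0, 4.001))   # (0.3333, 0.6667)
```

The two calls reproduce the values computed in the example above.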
2.8.4 Row Pivoting

Row pivoting is a simple strategy for avoiding small pivots, which can be
summarized as follows.

1. At the beginning of the j-th step, let I_j be the set of indices i
   corresponding to rows that have not yet been used as pivot rows.

2. Compare the values of |s_ij| for all i in I_j. If i = i(j) yields the
   largest such value, take i(j) as the pivot row for the j-th step.

3. Remove index i(j) from the set I_j.

4. For each i in I_j, compute m_ij = -s_ij / s_i(j),j.

5. For each i in I_j, add the appropriate multiple of row i(j) to row i.

6. Increment j by 1.

In implementing row pivoting, we use an additional column ("p" for "pivot
order") to indicate the first, second, third, etc., pivot row.
Example 2.8.2. Consider the system

    -x_2 + 3x_3 = 6
    3x_1 + 2x_2 + x_3 = 15
    2x_1 + 4x_2 - 5x_3 = 1

Clearly, pivoting is required for the first step since a_11 = 0. We will
follow the row pivoting algorithm given above. We initialize the table:

    p     m    | x_1   x_2    x_3  |   b
               |  0    -1      3   |   6
               |  3     2      1   |  15
               |  2     4     -5   |   1

To eliminate x_1, we look for the (absolutely) largest value in the first
(x_1) column, which equals 3. The corresponding row is chosen as the first
pivot row, shown as "1" in the pivot column. Multipliers are also computed
for the remaining rows:

    p     m    | x_1   x_2    x_3  |   b
          0    |  0    -1      3   |   6
    1          |  3     2      1   |  15
        -2/3   |  2     4     -5   |   1

After eliminating x_1 from rows 1 (trivially) and 3, we compare the second
(x_2) column entries for those rows. The pivot is 8/3, and row 3 is the
pivot row. A "2" is marked for that row in the pivot column, and the
multiplier is computed for row 1:

    p     m    | x_1   x_2    x_3   |   b
         3/8   |  0    -1      3    |   6
    1          |  3     2      1    |  15
    2          |  0    8/3   -17/3  |  -9

After eliminating x_2 from row 1, we mark a "3" for that row in the pivot
column (even though it is not used for pivoting subsequently):

    p     m    | x_1   x_2    x_3   |   b
    3          |  0     0     7/8   | 21/8
    1          |  3     2      1    |  15
    2          |  0    8/3   -17/3  |  -9

Backward substitution can be carried out by handling the equations in
pivoting order (i.e., 1, 2, 3, from the p column):

    3x_1 + 2x_2 + x_3 = 15
    (8/3)x_2 - (17/3)x_3 = -9
    (7/8)x_3 = 21/8

Solving in the usual fashion, we obtain x_3 = 3, x_2 = 3 and x_1 = 2.
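Steps 1-6 of the algorithm translate directly into code. The following
Python sketch (illustrative, not from the text) applies row pivoting to the
system of Example 2.8.2:

```python
import numpy as np

def solve_with_row_pivoting(A, b):
    # Forward elimination with row pivoting (steps 1-6 in the text),
    # followed by back-substitution in pivot order.
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    unused = list(range(n))          # I_j: rows not yet used as pivots
    pivot_order = []                 # the "p" column
    for j in range(n):
        # Choose the unused row with the largest |s_ij| in column j.
        p = max(unused, key=lambda i: abs(A[i, j]))
        unused.remove(p)
        pivot_order.append(p)
        for i in unused:
            m = -A[i, j] / A[p, j]   # multiplier m_ij
            A[i, :] += m * A[p, :]
            b[i] += m * b[p]
    # Back-substitution, taking the equations in reverse pivot order.
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        r = pivot_order[k]
        x[k] = (b[r] - A[r, k+1:] @ x[k+1:]) / A[r, k]
    return x

A = np.array([[0., -1., 3.], [3., 2., 1.], [2., 4., -5.]])
b = np.array([6., 15., 1.])
print(solve_with_row_pivoting(A, b))   # close to [2. 3. 3.]
```

The pivot_order list ends up as the sequence of rows used for pivoting,
exactly as recorded in the p column of the tables above.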
Row pivoting results in a so-called permuted LU factorization of the
matrix A:

    LU = PA

where L and U are lower and upper triangular, respectively, and P is a
permutation matrix which rearranges the rows of A in the order in which they
were used for pivoting. Thus in Example 2.8.2,

    P [1 2 3]^T = [2 3 1]^T

where the vector on the right lists the rows in the order in which they were
used as pivots (i.e., entry k gives the row marked "k" in the final p
column). Thus

    P = [ 0  1  0 ]        and        PA = [ 3   2    1 ]
        [ 0  0  1 ]                        [ 2   4   -5 ]
        [ 1  0  0 ]                        [ 0  -1    3 ]

The matrices L and U are obtained by applying the same permutation to the
multiplier vectors and S (at the end of forward elimination). Thus in the
same example,

    L = I - P [  0    3/8   0 ]  =  [  1     0    0 ]
              [  0     0    0 ]     [ 2/3    1    0 ]
              [ -2/3   0    0 ]     [  0   -3/8   1 ]

and

    U = P [ 0   0    7/8  ]  =  [ 3   2     1   ]
          [ 3   2     1   ]     [ 0  8/3  -17/3 ]
          [ 0  8/3  -17/3 ]     [ 0   0    7/8  ]

If the permuted LU factorization of A is known, then the equation Ax = b can
be solved using the forward elimination and backward substitution equations

    Ly = Pb    and    Ux = y .
2.9 Further Topics in Gaussian Elimination

2.9.1 Computation of the Matrix Inverse

In most computations, the inverse A^{-1} of a nonsingular n × n matrix A
appears in a product such as A^{-1}b or, more generally, A^{-1}B. As long as
A is available, it is best to compute that product via Gaussian elimination,
i.e., by solving

    Ax = b

or, if B has more than one column,

    AX = B

(here X has the same number of columns as B). This approach is faster (at
least by a factor of two), and produces a much smaller average error in the
resulting entries of Ax - b, than computing the inverse A^{-1} followed by
the product A^{-1}b.

If Ax = b needs to be solved for many different vectors b, having A^{-1}
computed and stored off-line offers no particular advantage over the
standard LU factorization of A. This is because the product A^{-1}b requires
the same number (≈ 2n^2) of floating-point operations as the two triangular
systems Ly = b and Ux = y combined.

If we are asked to compute A^{-1} explicitly, we can do so by solving the
n^2 simultaneous equations in

    AX = I

where X = [x^(1) . . . x^(n)]. This is equivalent to solving n systems of n
simultaneous equations each:

    Ax^(1) = e^(1) ,   Ax^(2) = e^(2) ,   . . . ,   Ax^(n) = e^(n)

We illustrate this procedure, also known as Gauss-Jordan elimination, by an
example.
Example 2.9.1. To invert

    A = [  1  -1   2 ]
        [ -2   1   1 ]
        [ -1   2   1 ]

we construct a table with the elements of S = [A I]:

     m  | x_1   x_2   x_3  |  b_1   b_2   b_3
        |  1    -1     2   |   1     0     0
     2  | -2     1     1   |   0     1     0
     1  | -1     2     1   |   0     0     1

Eliminating x_1 from rows 2 and 3 (using the pivot and multipliers shown
above), we obtain

     m  | x_1   x_2   x_3  |  b_1   b_2   b_3
        |  1    -1     2   |   1     0     0
        |  0    -1     5   |   2     1     0
     1  |  0     1     3   |   1     0     1

Eliminating x_2 from row 3 (using the pivot and multiplier shown above), we
obtain

     m  | x_1   x_2   x_3  |  b_1   b_2   b_3
        |  1    -1     2   |   1     0     0
        |  0    -1     5   |   2     1     0
        |  0     0     8   |   3     1     1

i.e., S = [U L^{-1}] at this point. Scaling the rows of U by the diagonal
elements, we obtain

     m  | x_1   x_2   x_3  |  b_1   b_2    b_3
    -2  |  1    -1     2   |   1     0      0
     5  |  0     1    -5   |  -2    -1      0
        |  0     0     1   |  3/8   1/8    1/8

Substituting x_3 back into rows 2 and 1 (using the multipliers shown above),
we obtain

     m  | x_1   x_2   x_3  |  b_1    b_2    b_3
     1  |  1    -1     0   |  2/8   -2/8   -2/8
        |  0     1     0   | -1/8   -3/8    5/8
        |  0     0     1   |  3/8    1/8    1/8

Substituting x_2 back into row 1 (using the multiplier shown above), we
obtain

     m  | x_1   x_2   x_3  |  b_1    b_2    b_3
        |  1     0     0   |  1/8   -5/8    3/8
        |  0     1     0   | -1/8   -3/8    5/8
        |  0     0     1   |  3/8    1/8    1/8

i.e., S = [I A^{-1}] at the end. Thus

    A^{-1} = [  1/8  -5/8   3/8 ]
             [ -1/8  -3/8   5/8 ]
             [  3/8   1/8   1/8 ]
□
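In NumPy terms, the procedure above amounts to solving AX = I one column at
a time. The following sketch (illustrative; numpy.linalg.solve stands in for
the elimination steps) checks the result of Example 2.9.1:

```python
import numpy as np

# Compute A^{-1} by solving the n systems A x^(k) = e^(k), i.e. AX = I,
# rather than by a dedicated inversion routine.
A = np.array([[1., -1., 2.], [-2., 1., 1.], [-1., 2., 1.]])
n = A.shape[0]
X = np.column_stack([np.linalg.solve(A, e) for e in np.eye(n)])
print(X * 8)   # entries of A^{-1}, scaled by 8 for readability
```

The printed matrix agrees with the table computed by hand above.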
In cases where zero pivots are encountered, pivoting is necessary in the
above procedure. The final table is of the form

    [ P^{-1}   P^{-1}A^{-1} ]

where P is the permutation matrix in the (permuted) LU factorization of A:

    LU = PA

Note that P^{-1} = P^T by the fact given in Subsection 2.3.2.
2.9.2 Ill-Conditioned Problems

In Section 2.8, we saw how small pivots can cause large errors in
finite-precision computation, and how simple techniques such as row pivoting
can essentially eliminate these errors. There are, nevertheless, problems
where errors due to finite precision are unavoidable, whether pivoting is
used or not. Such problems are referred to as ill-conditioned, primarily in
regard to the matrix A.

We can think of an ill-conditioned matrix A as one that is "nearly singular"
in terms of the precision used in solving Ax = b. This concept can be
quantified in terms of the so-called condition number of A, which can be
computed in MATLAB using the function COND. A well-conditioned matrix is one
whose condition number is small (the minimum possible value is 1). An
ill-conditioned matrix is one whose condition number is of the same order of
magnitude as 10^K, where K is the number of significant decimal digits used
in the computation. (The same definition can be given in terms of binary
digits by changing the base to 2.)
Ill-conditioned problems are characterized by large variations in the
solution vector x caused by small perturbations of the parameter values
(entries of A and b). They are thus extremely sensitive to rounding. The
following example demonstrates this sensitivity.

Example 2.9.2. Consider the system

    1.001x_1 + x_2 = 2.001
    x_1 + 1.001x_2 = 2.001

which has exact solution

    x_1 = x_2 = 1.

Note how the two columns of A are nearly equal, and thus A is nearly
singular. The condition number of A equals 2.001 × 10^3. Based on the
foregoing discussion, large errors are possible when rounding to three
significant digits. In fact, we will show that this is the case even with
four-digit precision.

Suppose we change the right-hand vector to

    b_1 = 2.0006 ,   b_2 = 2.0014

Both these values are rounded to 2.001 using four-digit precision, with an
error of 0.02%. Ignoring any further round-off errors in the solution of the
system (i.e., solving it exactly), we have the exact solution

    x_1 = 0.6000 ,   x_2 = 1.400

Note the 40% error in each entry relative to the earlier solution.
We give a graphical illustration of the difficulties encountered in this
problem. The two equations represent two nearly parallel straight lines in
the (x_1, x_2) plane. A small parallel displacement of either line (as a
result of changing the value of an intercept b_i) will cause a large
movement of the solution point.

    [Figure: the two nearly parallel lines of Example 2.9.2, plotted in the
    (x_1, x_2) plane.]
Ill-conditioned problems force us to reconsider the physical parameters and
models involved. For example, if the signal representation problem s = Vc is
ill-conditioned, we may want to choose a different set of reference signals
in the matrix V. In a systems context, where, for example, y = Ax is an
observation signal for a hidden state vector x, an ill-conditioned matrix A
might force us to seek a different measuring device or observation method.
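The condition number and the sensitivity of Example 2.9.2 can be checked
numerically. A short sketch (numpy.linalg.cond is the counterpart of
MATLAB's COND):

```python
import numpy as np

# The nearly singular matrix of Example 2.9.2.
A = np.array([[1.001, 1.0], [1.0, 1.001]])
print(np.linalg.cond(A))   # about 2.001e+03

# Perturbing b by roughly 0.02% moves the solution by tens of percent.
x_a = np.linalg.solve(A, np.array([2.001, 2.001]))     # [1, 1]
x_b = np.linalg.solve(A, np.array([2.0006, 2.0014]))   # [0.6, 1.4]
print(x_a, x_b)
```

The large relative change in x for a tiny change in b is exactly the
behavior the condition number predicts.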
2.10 Inner Products, Projections and Signal Approximation

2.10.1 Inner Products and Norms

The inner product of two vectors a and b in R^{m×1} is defined by

    ⟨a, b⟩ = Σ_{i=1}^{m} a_i b_i

Clearly, the inner product is symmetric in its two arguments and can be
expressed as the product of a row vector and a column vector:

    ⟨a, b⟩ = a^T b = b^T a = ⟨b, a⟩

Also, the inner product is linear in one of its arguments (i.e., vectors)
when the other argument is kept constant:

    ⟨c_1 a^(1) + c_2 a^(2), b⟩ = Σ_{i=1}^{m} (c_1 a_i^(1) + c_2 a_i^(2)) b_i
                               = c_1 ⟨a^(1), b⟩ + c_2 ⟨a^(2), b⟩

The norm of a vector a in R^{m×1} is defined as the square root of the inner
product of a with itself:

    ‖a‖ = ⟨a, a⟩^{1/2} = ( Σ_{i=1}^{m} a_i^2 )^{1/2}

or equivalently,

    ‖a‖^2 = ⟨a, a⟩ = Σ_{i=1}^{m} a_i^2

By the Pythagorean theorem, ‖a‖ is also the length of a. Clearly,

    ‖a‖ = 0  ⇔  a = 0

i.e., the all-zeros vector is the only vector of zero length. Also, if c is
a (real) scaling factor,

    ‖ca‖ = ( Σ_{i=1}^{m} c^2 a_i^2 )^{1/2} = |c| ‖a‖
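These properties are easy to spot-check numerically. A minimal sketch (the
random vectors are arbitrary illustrations, not from the text):

```python
import numpy as np

# Spot check of symmetry, linearity and the scaling property of the norm.
rng = np.random.default_rng(0)
a, b = rng.standard_normal(5), rng.standard_normal(5)
c = -2.5

assert np.isclose(a @ b, b @ a)                      # <a,b> = <b,a>
assert np.isclose((c * a) @ b, c * (a @ b))          # linearity in a
assert np.isclose(np.linalg.norm(c * a), abs(c) * np.linalg.norm(a))
print("all properties hold")
```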
    [Figure 2.8: The geometry of vectors a, b and b - a, with angle θ
    between a and b.]
2.10.2 Angles, Projections and Orthogonality

In general, two nonzero vectors a and b in R^{m×1} define a two-dimensional
subspace (i.e., plane) through the origin. By convention, the angle θ
between a and b takes value in [0, π].

Applying the cosine rule to the triangle formed by a, b and the dotted
vector (parallel to) b - a in Figure 2.8, we obtain

    ‖b - a‖^2 = ‖a‖^2 + ‖b‖^2 - 2 ‖a‖ ‖b‖ cos θ

Using the linearity property stated earlier, the left-hand side can be also
expressed as

    ‖b - a‖^2 = ⟨b - a, b - a⟩
              = ⟨a, a⟩ + ⟨b, b⟩ - 2⟨a, b⟩
              = ‖a‖^2 + ‖b‖^2 - 2⟨a, b⟩

Thus

    ‖a‖^2 + ‖b‖^2 - 2⟨a, b⟩ = ‖a‖^2 + ‖b‖^2 - 2 ‖a‖ ‖b‖ cos θ

and consequently

    cos θ = ⟨a, b⟩ / (‖a‖ ‖b‖)

Another feature of the geometry of two vectors in R^{m×1} is the projection
of one vector onto the other. Figure 2.9 shows the projection f of b onto a,
which is of the form

    f = λa

where λ is a scaling factor having the same sign as cos θ. To determine the
value of λ, note that the length of f equals

    ‖f‖ = ‖b‖ |cos θ| = |⟨a, b⟩| / ‖a‖
    [Figure 2.9: Projection f of vector b onto vector a.]

The unit vector parallel to a is given by

    e^(a) = (1/‖a‖) a

and f is obtained by scaling e^(a) by the "signed" length of f:

    f = (⟨a, b⟩/‖a‖) e^(a) = (⟨a, b⟩/‖a‖^2) a

Therefore λ = ⟨a, b⟩/‖a‖^2.
Example 2.10.1. Let a = [-1 1 -1]^T and b = [2 -5 1]^T in R^{3×1}. We then
have

    ‖a‖^2 = 3  ⇒  ‖a‖ = √3

and

    ‖b‖^2 = 30  ⇒  ‖b‖ = √30

Also,

    ⟨a, b⟩ = -2 - 5 - 1 = -8  ⇒  cos θ = -8/(3√10)

The projection of b onto a is given by

    f = -(8/3) a

while the projection of a onto b is given by

    g = -(8/30) b = -(4/15) b
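The quantities in Example 2.10.1 follow directly from the formulas above; a
quick sketch in NumPy:

```python
import numpy as np

# Angle and mutual projections of the vectors in Example 2.10.1.
a = np.array([-1., 1., -1.])
b = np.array([2., -5., 1.])

cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
f = (a @ b) / (a @ a) * a        # projection of b onto a: -(8/3) a
g = (a @ b) / (b @ b) * b        # projection of a onto b: -(4/15) b
print(cos_theta)                 # -8 / (3 sqrt(10))
print(f, g)
```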
Definition 2.10.1. We say that a and b are orthogonal, and denote that
relationship by

    a ⊥ b

if the angle between a and b equals π/2, or equivalently (since
cos(π/2) = 0),

    ⟨a, b⟩ = 0

Clearly, if a and b are orthogonal, then the projection of either vector
onto the other is the all-zeros vector 0.
2.10.3 Formulation of the Signal Approximation Problem

Our discussion of matrix inversion was partly motivated by the following
signal approximation problem. Given n reference signals v^(1), . . . , v^(n)
in R^{m×1}, how can we best approximate an arbitrary signal vector s in
R^{m×1} by a linear combination of these signals? The approximation would be
of the form

    ŝ = Σ_{r=1}^{n} c_r v^(r) = Vc

where the m × n matrix V has the n reference signals as its columns:

    V = [ v^(1) . . . v^(n) ]

Since the approximation ŝ is a linear combination of reference signals, it
makes little sense to include in V a signal which is expressible as a linear
combination of other such (reference) signals; that signal would be
redundant. We therefore assume the following:

Assumption. The columns of V, namely v^(1), . . . , v^(n), will be assumed
linearly independent for the remainder of this chapter.

The above assumption immediately implies that n ≤ m, i.e., the number of
reference signals is no larger than the signal dimension (i.e., vector
size). As we saw earlier, any set of m linearly independent vectors in
R^{m×1} can be used to obtain any vector in that space (by means of linear
combinations). It is therefore impossible to have a set of m + 1 (or more)
linearly independent vectors in R^{m×1}.

The case n = m has already been considered in the context of matrix
inversion. If the columns of the (square) matrix V are linearly independent,
then V is nonsingular and the equation

    Vc = s
has a unique solution c given by c = V^{-1}s. Thus there is no need to
approximate here; we have an exact representation of s as a linear
combination of the n = m reference signals.

Thus the only real case of interest is n < m, i.e., where the range R(V) of
V is a linear subspace of R^{m×1} of dimension d = n < m. Any signal s that
does not belong to R(V) will need to be approximated by an ŝ in R(V). To
properly formulate this problem, we will evaluate each approximation ŝ based
on its distance from s. Thus the best approximation is one that minimizes
the length ‖ŝ - s‖ of the error vector ŝ - s over all possible choices of ŝ
in R(V). Clearly, that approximation will also minimize the square of that
length:

    ‖ŝ - s‖^2 = Σ_{i=1}^{m} (ŝ_i - s_i)^2

By virtue of the expression on the right-hand side (i.e., sum of squares of
entries of the error vector), this solution is known as the least-squares
(linear) approximation of s based on v^(1), . . . , v^(n).
2.10.4 Projection and the Signal Approximation Problem

In the case where there is only one reference vector v^(1) (i.e.,
m > n = 1), the solution of the signal approximation problem can be obtained
using two-dimensional geometry. If, in Figure 2.9, we replace b by s and a
by v^(1), we see that the point closest to s on the line generated by v^(1)
is none other than the projection f of s onto v^(1). Thus the least-squares
approximation of s based on v^(1) is given by ŝ = f. It is also interesting
to note that the error vector ŝ - s is orthogonal to v^(1), i.e.,

    ŝ - s ⊥ v^(1)

Similarly, three-dimensional geometry provides the solution to the
approximation problem when two reference vectors v^(1) and v^(2) are given
(i.e., m > n = 2). Again, the solution ŝ is obtained by projecting s on the
plane R(V). The error vector satisfies

    ŝ - s ⊥ v^(1)    and    ŝ - s ⊥ v^(2)

i.e., it is orthogonal to both v^(1) and v^(2). Figure 2.10 clearly shows
that ŝ - s is also orthogonal to every vector on the plane R(V).
Our intuition thus far suggests that the solution to the signal
approximation problem for any number n < m of reference vectors might be
obtained by projecting s onto the subspace R(V) generated by
v^(1), . . . , v^(n), where the projection ŝ ∈ R(V) satisfies

    ŝ - s ⊥ v^(j)    for every 1 ≤ j ≤ n.

    [Figure 2.10: The projection ŝ of s on the plane generated by v^(1)
    and v^(2).]

Before showing that this is indeed the case, we note the following:

• If ŝ - s ⊥ v^(j) for every vector v^(j), then ŝ - s ⊥ a for every a in
  R(V). This is because the inner product is linear in one of its arguments
  when the other argument is fixed; thus if a is a linear combination of
  v^(1), . . . , v^(n) and ⟨ŝ - s, v^(j)⟩ = 0 for every j, then
  ⟨ŝ - s, a⟩ = 0 also.

• The n conditions ⟨ŝ - s, v^(i)⟩ = (v^(i))^T (ŝ - s) = 0 can be expressed
  in terms of a single matrix-vector product as

      V^T (ŝ - s) = 0

  where the vector 0 is n-dimensional; or equivalently, as

      V^T ŝ = V^T s

We prove our conjecture by showing that if there exists ŝ in R(V)
satisfying V^T ŝ = V^T s, its distance from s can be no larger than that of
any other vector y in R(V); i.e.,

    ‖y - s‖^2 ≥ ‖ŝ - s‖^2
Indeed, we rewrite the left-hand side as

    ‖y - ŝ + ŝ - s‖^2 = ⟨y - ŝ + ŝ - s, y - ŝ + ŝ - s⟩

Treating y - ŝ and ŝ - s as single vectors, we obtain (as in the derivation
of the formula for cos θ in Subsection 2.10.2)

    ‖y - s‖^2 = ‖ŝ - s‖^2 + ‖y - ŝ‖^2 + 2⟨y - ŝ, ŝ - s⟩

The quantity ‖y - ŝ‖^2 is nonnegative (and is zero if and only if y = ŝ).
The inner product ⟨y - ŝ, ŝ - s⟩ equals zero since y - ŝ belongs to R(V)
and ŝ - s is orthogonal to every vector in R(V). We therefore conclude that

    ‖y - s‖^2 ≥ ‖ŝ - s‖^2

with equality if and only if y = ŝ. Although the existence of ŝ in R(V)
satisfying

    V^T ŝ = V^T s

has not yet been shown, it follows from the "if and only if" statement
(above) that there can only be one solution ŝ to the signal approximation
problem.
2.10.5 Solution of the Signal Approximation Problem

It remains to show that

    V^T ŝ = V^T s

has a solution ŝ in R(V); this means that ŝ = Vc for some c to be
determined. We rewrite the equation as

    V^T Vc = V^T s

i.e.,

    Ac = b

where:

• A = V^T V is an n × n symmetric matrix whose (i, j)-th entry equals the
  inner product ⟨v^(i), v^(j)⟩ (hence the symmetry); and

• b = V^T s is an n × 1 vector whose i-th entry equals ⟨v^(i), s⟩.

Note that once the inner products in V^T V and V^T s have been computed, the
signal dimension m altogether drops out of the picture, i.e., the size of
the problem is solely determined by the number n of reference vectors.

A unique solution c exists for every s provided the n × n matrix V^T V is
nonsingular, i.e.,

    x ≠ 0  ⇒  (V^T V)x ≠ 0

To see why the above implication is indeed true, recall our earlier
assumption that the columns of the m × n matrix V are linearly independent.
Thus

    x ≠ 0  ⇒  Vx ≠ 0  ⇔  ‖Vx‖^2 > 0

where the squared norm in the last expression can be also expressed as

    ‖Vx‖^2 = (Vx)^T (Vx) = x^T V^T Vx

Thus

    x ≠ 0  ⇒  x^T (V^T V)x > 0

and hence (V^T V)x cannot be the all-zeros vector 0; if it were, then the
scalar x^T (V^T V)x would equal zero also. This establishes that V^T V is
nonsingular.

We therefore obtain the solution to the least squares approximation problem
as

    ŝ = Vc

where

    c = (V^T V)^{-1} V^T s

A single expression for ŝ is

    ŝ = V(V^T V)^{-1} V^T s

(Note that the m × m matrix V(V^T V)^{-1} V^T is also symmetric.)
Example 2.10.2. Consider the case m = 3 and n = 2, with
v^(1) = [-1 1 -1]^T and v^(2) = [2 -5 1]^T (same vectors as in Example
2.10.1). Thus

    V = [ -1   2 ]
        [  1  -5 ]
        [ -1   1 ]

and

    V^T V = [ ‖v^(1)‖^2        ⟨v^(1), v^(2)⟩ ]  =  [  3  -8 ]
            [ ⟨v^(2), v^(1)⟩   ‖v^(2)‖^2      ]     [ -8  30 ]

Let us determine the projection ŝ of s = [2 -1 -1]^T onto the plane defined
by v^(1) and v^(2). We have

    V^T s = [ ⟨v^(1), s⟩ ]  =  [ -2 ]
            [ ⟨v^(2), s⟩ ]     [  8 ]

and thus ŝ = Vc, where

    c = [  3  -8 ]^{-1} [ -2 ]  =  (1/13) [ 2 ]
        [ -8  30 ]      [  8 ]            [ 4 ]

Therefore

    ŝ = (2/13) v^(1) + (4/13) v^(2) = (2/13) [  3 ]
                                             [ -9 ]
                                             [  1 ]
□
As a final observation, note that the formulas

    c = (V^T V)^{-1} V^T s

and

    ŝ = V(V^T V)^{-1} V^T s

which we derived for the case n < m under the assumption of linear
independence of the columns of V, also give us the correct answers when
n = m and V^{-1} exists:

    c = (V^T V)^{-1} V^T s = V^{-1}(V^T)^{-1} V^T s = V^{-1}s

and

    ŝ = V(V^T V)^{-1} V^T s = VV^{-1}s = s
2.11 Least-Squares Approximation in Practice

The basic technique of least-squares approximation developed in the previous
section has numerous applications across the engineering disciplines and in
most scientific fields where data analysis is important. Broadly speaking,
least-squares methods allow us to estimate parameters for a variety of
models that have linear structure. To illustrate the adaptability of the
least-squares technique, we consider two rather different examples.

Example 2.11.1. Curve fitting. We are given the following vector s
consisting of ten measurements taken at regular time intervals (for
simplicity, every second):

    t    1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0   10.0
    s   -0.7  -0.2   0.6   1.1   1.7   2.5   3.1   3.8   4.5    5.0

Suppose we are interested in fitting a straight line

    ŝ(t) = at + b

through the discrete plot {(t_i, s_i), 1 ≤ i ≤ 10} so as to minimize

    Σ_{i=1}^{10} (ŝ(t_i) - s_i)^2

Letting ŝ(t_i) = ŝ_i, we see that the vector ŝ is given by

    ŝ = at + b1

where 1 is a column vector of unit entries. We can then restate our problem
as follows: choose coefficients a and b for t and 1, respectively, so as to
minimize

    ‖ŝ - s‖^2

We thus have a least-squares problem where

• the length of the data vector is m = 10; and

• the number of reference vectors (i.e., t and 1) is n = 2.
Again, we begin by computing the matrix of inner products of the reference
vectors, i.e., V^T V, where V = [t 1]. We have

    t^T t = Σ_{i=1}^{10} i^2 = 385
    t^T 1 = Σ_{i=1}^{10} i = 55
    1^T 1 = Σ_{i=1}^{10} 1 = 10

and therefore

    V^T V = [ t^T t   t^T 1 ]  =  [ 385  55 ]
            [ t^T 1   1^T 1 ]     [  55  10 ]

Also,

    t^T s = Σ_{i=1}^{10} i s_i = 171.2
    1^T s = Σ_{i=1}^{10} s_i = 21.4

and thus

    V^T s = [ t^T s ]  =  [ 171.2 ]
            [ 1^T s ]     [  21.4 ]

The least squares solution is then given by

    [ a ]  =  [ 385  55 ]^{-1} [ 171.2 ]  =  [  0.6485 ]
    [ b ]     [  55  10 ]      [  21.4 ]     [ -1.4267 ]

and the resulting straight line ŝ(t) = at + b is plotted together with the
discrete data points.

It can be seen that the straight-line fit is very good in this case. The
mean (i.e., average) square error is given by

    (1/10) ‖ŝ - s‖^2 = 0.00501

Taking the square root

    ( (1/10) ‖ŝ - s‖^2 )^{1/2} = 0.0708
    [Figure: data points (t_i, s_i) and the fitted line of Example 2.11.1.]
we obtain the root mean square (r.m.s.) error. The mean absolute error is

    (1/10) Σ_{i=1}^{10} |ŝ_i - s_i| = 0.0648

Remark. The same technique can be used for fitting any linear combination of
functions through a discrete data set. For example, by expanding the set of
reference vectors to include the squares, cubes, etc., of the entries of the
abscissa (time in this case) vector t, we can determine the optimal (in the
least-squares sense) fit by a polynomial of any degree we choose.
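The numbers of Example 2.11.1 can be reproduced with a short NumPy sketch
(illustrative, not part of the text):

```python
import numpy as np

# Straight-line fit s_hat(t) = a t + b to the data of Example 2.11.1.
t = np.arange(1.0, 11.0)
s = np.array([-0.7, -0.2, 0.6, 1.1, 1.7, 2.5, 3.1, 3.8, 4.5, 5.0])

V = np.column_stack([t, np.ones_like(t)])        # reference vectors t and 1
a, bcoef = np.linalg.solve(V.T @ V, V.T @ s)     # normal equations
print(a, bcoef)                                  # about 0.6485, -1.4267

mse = np.mean((V @ [a, bcoef] - s) ** 2)
print(mse)                                       # about 0.00501
```

Adding columns t**2, t**3, . . . to V would carry out the polynomial fit
mentioned in the remark.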
Example 2.11.2. Range estimation. In this example, we assume that we have
two signal sources, each emitting a pure sinusoidal waveform. The signals
have the same amplitude (hence also power), but different frequencies and
phase shifts. The sources are mobile, and their location at any time is
unknown.

A receiver, who knows the two source frequencies Ω_1 and Ω_2, also knows
that the signal power follows an inverse-power law, i.e., it decays as
R^{-γ}, where R is the distance from the source and γ is a known positive
constant. Based on this information, the receiver attempts to recover the
range ratio R_1/R_2 (i.e., the receiver's relative distance from the two
sources) from the received signal
    s(t) = σ R_1^{-γ/2} cos(Ω_1 t + φ_1) + σ R_2^{-γ/2} cos(Ω_2 t + φ_2) + z(t)

Here, z(t) represents interference, or noise, that has no particular
structure (at least none that can be used to improve the model for s(t)).
Use of the common parameter σ here is consistent with the assumption that
both sources emit at the same power.

Suppose that the receiver records m samples of s(t) corresponding to times
t_1, . . . , t_m. Since the distances R_1 and R_2 both appear in the
amplitudes of the received sinusoidal components, it makes sense to estimate
those amplitudes. Note that neither of the two components in their
respective form shown above can be used as a reference signal for the
least-squares solution: the unknown phase shifts must somehow be removed.
To that end, we write

    cos(Ω_k t + φ_k) = cos φ_k cos(Ω_k t) - sin φ_k sin(Ω_k t)

and use two reference signals for each Ω_k, namely cos(Ω_k t) and
sin(Ω_k t). In their discrete form, these signals are given by vectors

    u^(1) = [ cos(Ω_1 t_i) ]_{i=1}^{m}      w^(1) = [ sin(Ω_1 t_i) ]_{i=1}^{m}
    u^(2) = [ cos(Ω_2 t_i) ]_{i=1}^{m}      w^(2) = [ sin(Ω_2 t_i) ]_{i=1}^{m}

We thus seek coefficients a_1, b_1, a_2 and b_2 such that

    ŝ = a_1 u^(1) + b_1 w^(1) + a_2 u^(2) + b_2 w^(2)

is the least-squares approximation (among all such linear combinations) to
the vector of received samples s. The solution is obtained in the usual
fashion: if

    V = [ u^(1)  w^(1)  u^(2)  w^(2) ]

then the optimal coefficients are given by

    [ a_1 ]
    [ b_1 ]  =  (V^T V)^{-1} V^T s
    [ a_2 ]
    [ b_2 ]
Since, for each k, the sum a_k^2 + b_k^2 is approximately equal to
σ^2 R_k^{-γ} (cos^2 φ_k + sin^2 φ_k) = σ^2 R_k^{-γ}, the resulting estimate
of the ratio R_1/R_2 is

    ( (a_1^2 + b_1^2) / (a_2^2 + b_2^2) )^{-1/γ}
□
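The whole procedure can be simulated end to end. The sketch below uses
arbitrary illustrative values for the frequencies, phases, ranges and sample
times (none of them from the text) and recovers the range ratio from noisy
samples:

```python
import numpy as np

# Simulated range estimation: two sinusoids with amplitudes sigma R_k^{-gamma/2}
# plus weak interference; the ranges themselves are unknown to the estimator.
rng = np.random.default_rng(1)
m, gamma, sigma = 400, 2.0, 1.0
R1, R2 = 2.0, 5.0                       # true ranges (to be "recovered")
om1, om2, phi1, phi2 = 0.7, 1.9, 0.4, -1.1
t = np.arange(m) * 0.1

s = (sigma * R1 ** (-gamma / 2) * np.cos(om1 * t + phi1)
     + sigma * R2 ** (-gamma / 2) * np.cos(om2 * t + phi2)
     + 0.001 * rng.standard_normal(m))  # interference z(t)

# Reference vectors cos(Omega_k t) and sin(Omega_k t), k = 1, 2.
V = np.column_stack([np.cos(om1 * t), np.sin(om1 * t),
                     np.cos(om2 * t), np.sin(om2 * t)])
a1, b1, a2, b2 = np.linalg.solve(V.T @ V, V.T @ s)

ratio = ((a1**2 + b1**2) / (a2**2 + b2**2)) ** (-1 / gamma)
print(ratio)   # close to R1/R2 = 0.4
```

Note that the unknown phases drop out: only the sums a_k^2 + b_k^2 enter
the ratio, exactly as in the derivation above.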
2.12 Orthogonality and Least-Squares Approximation

2.12.1 A Simpler Least-Squares Problem

The solution of the projection, or least-squares, problem is greatly
simplified when the reference vectors v^(1), . . . , v^(n) in V are
(pairwise) orthogonal. This means that v^(i) ⊥ v^(j) for any i ≠ j, or
equivalently,

    ⟨v^(i), v^(j)⟩ = ‖v^(i)‖^2   if i = j;
                     0           if i ≠ j.

Recall that ⟨v^(i), v^(j)⟩ is the (i, j)-th element of the inner product
matrix V^T V. Thus if the columns of V are orthogonal, then

    V^T V = [ ‖v^(1)‖^2      0        . . .      0     ]
            [     0      ‖v^(2)‖^2    . . .      0     ]
            [     :          :                   :     ]
            [     0          0        . . .  ‖v^(n)‖^2 ]

and hence V^T V is a diagonal matrix. Since none of v^(1), . . . , v^(n) is
an all-zeros vector (by the assumption of linear independence), it follows
that all the squared norms on the diagonal are strictly positive, and thus

    (V^T V)^{-1} = [ ‖v^(1)‖^{-2}      0          . . .       0       ]
                   [      0       ‖v^(2)‖^{-2}    . . .       0       ]
                   [      :            :                      :       ]
                   [      0            0          . . .  ‖v^(n)‖^{-2} ]

is well-defined. The projection ŝ of s on R(V) is then given by ŝ = Vc,
where

    c = (V^T V)^{-1} V^T s = [ ‖v^(1)‖^{-2} ⟨v^(1), s⟩ ]
                             [ ‖v^(2)‖^{-2} ⟨v^(2), s⟩ ]
                             [            :            ]
                             [ ‖v^(n)‖^{-2} ⟨v^(n), s⟩ ]

i.e., for every i,

    c_i = ⟨v^(i), s⟩ / ‖v^(i)‖^2
As a result,

    ŝ = Vc = Σ_{i=1}^{n} ( ⟨v^(i), s⟩ / ‖v^(i)‖^2 ) v^(i)

The generic term in the above sum is the projection of s onto v^(i) (as we
saw in Subsection 2.10.2). Thus if the columns of V are orthogonal, then the
projection of any vector s on R(V) is given by the (vector) sum of the
projections of s onto each of the columns.

This result is in agreement with three-dimensional geometry, as shown in
Figure 2.11. It is also used extensively, without elaboration, when dealing
with projections on subspaces generated by two or more of the standard
orthogonal unit vectors e^(1), . . . , e^(m) in R^{m×1}. For example, the
projection of

    s = 3e^(1) + 4e^(2) + 7e^(3)

on the plane generated by e^(1) and e^(2) equals

    ŝ = 3e^(1) + 4e^(2) ,

which is the sum of the projections of s on e^(1) (given by 3e^(1)) and on
e^(2) (given by 4e^(2)).
    [Figure 2.11: Projection on a plane generated by orthogonal vectors
    v^(1) and v^(2).]
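The projection-as-a-sum formula is easy to verify numerically. A minimal
sketch, with arbitrary orthogonal vectors chosen for illustration:

```python
import numpy as np

# With orthogonal columns, the projection of s on R(V) equals the sum of
# the one-dimensional projections of s onto the individual columns.
v1 = np.array([1., 1., 0., 0.])
v2 = np.array([1., -1., 2., 0.])         # orthogonal to v1
s = np.array([3., 1., 4., 5.])

s_hat_sum = sum((v @ s) / (v @ v) * v for v in (v1, v2))

V = np.column_stack([v1, v2])
s_hat = V @ np.linalg.solve(V.T @ V, V.T @ s)   # general LS solution
print(np.allclose(s_hat, s_hat_sum))            # True
```

No simultaneous equations are needed in the orthogonal case: each
coefficient c_i = ⟨v^(i), s⟩/‖v^(i)‖^2 is computed independently.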
2.12.2 Generating Orthogonal Reference Vectors

The result of the previous subsection clearly demonstrates the advantage of
working with orthogonal reference vectors v^(1), . . . , v^(n): least
squares approximations can be obtained without solving simultaneous
equations.

As it turns out, we can always work with orthogonal reference vectors, if we
so choose:

Fact. Any linearly independent set {v^(1), . . . , v^(n)} of vectors in
R^{m×1} (where n ≤ m) is equivalent to a linearly independent and orthogonal
set {w^(1), . . . , w^(n)}, where equivalence means that both sets produce
the same subspace of linear combinations.

In matrix terms, if

    V = [ v^(1) . . . v^(n) ] ,

we claim that there exists

    W = [ w^(1) . . . w^(n) ]

such that W^T W is diagonal and

    R(V) = R(W)

The last condition means that each v^(i) is expressible as a linear
combination of w^(j)'s and vice versa (with v and w interchanged). In other
words, there exists a nonsingular n × n matrix B such that

    V = WB

and (equivalently)

    W = VB^{-1}

We will now show how, given any V with linearly independent columns, we can
obtain such a matrix B. The construction is based on the LU factorization of
V^T V, a matrix known to be

• nonsingular;

• symmetric; and

• positive definite, meaning that x^T V^T Vx > 0 for all x ≠ 0.

The three properties listed above have the following implications (proofs
are omitted):

• Nonsingularity of V^T V implies that a unique normalized LU factorization
  of the form

      V^T V = LDU

  exists, where both L and U have unit diagonal elements and D has nonzero
  diagonal elements.

• Symmetry of V^T V in conjunction with the uniqueness of the above
  factorization implies

      L = U^T

• Finally, positive definiteness of V^T V implies that D has (strictly)
  positive diagonal elements.

The following example illustrates the above-mentioned features of the LU
factorization of V^T V.
Example 2.12.1. Consider the matrix

    V = [ v^(1) v^(2) v^(3) ] = [ -1   3  -3 ]
                                [  0   1  -2 ]
                                [  1   1  -2 ]
                                [ -1   1  -1 ]
                                [ -1   1   0 ]

Gaussian elimination on

    V^T V = [  4   -4    2 ]
            [ -4   13  -14 ]
            [  2  -14   18 ]

yields

      m   | x_1   x_2   x_3
          |  4    -4     2
      1   | -4    13   -14
    -1/2  |  2   -14    18
          |  4    -4     2
          |  0     9   -12
     4/3  |  0   -12    17
          |  4    -4     2
          |  0     9   -12
          |  0     0     1

Therefore

    V^T V = [  1     0    0 ] [ 4  -4    2 ]
            [ -1     1    0 ] [ 0   9  -12 ]
            [ 1/2  -4/3   1 ] [ 0   0    1 ]

          = [  1     0    0 ] [ 4  0  0 ] [ 1  -1   1/2 ]
            [ -1     1    0 ] [ 0  9  0 ] [ 0   1  -4/3 ]
            [ 1/2  -4/3   1 ] [ 0  0  1 ] [ 0   0    1  ]

demonstrating that V^T V = U^T DU, where D has strictly positive diagonal
entries.
Using the above-mentioned properties of V^T V, we will now show that the
matrix

    W = VU^{-1}

has n nonzero orthogonal columns, thereby proving the claim made earlier
(with B = U). Indeed,

    W^T W = (VU^{-1})^T VU^{-1}
          = (U^T)^{-1} V^T VU^{-1}
          = (U^T)^{-1} U^T DUU^{-1}
          = D

and thus W has n orthogonal columns w^(1), . . . , w^(n) with norms given by

    ‖w^(i)‖ = √(d_ii) > 0

for all i.
Example 2.12.1. (Continued.) There is no need to compute U^{-1} explicitly
in order to determine W = VU^{-1}. We have, equivalently,

    WU = V  ⇔  U^T W^T = V^T

The last equation can be written as (note that U^T = L)

    [  1     0    0 ] [ w^(1)T ]   [ v^(1)T ]
    [ -1     1    0 ] [ w^(2)T ] = [ v^(2)T ]
    [ 1/2  -4/3   1 ] [ w^(3)T ]   [ v^(3)T ]

We solve this lower triangular system using standard forward elimination,
treating the row vectors w^(·)T and v^(·)T as scalar variables and omitting
the transpose throughout. Thus

    w^(1) = v^(1)
    w^(2) = v^(1) + v^(2)
    w^(3) = -(1/2)v^(1) + (4/3)(v^(1) + v^(2)) + v^(3)
          = (5/6)v^(1) + (4/3)v^(2) + v^(3)
We have obtained three orthogonal vectors with norms 2, 3 and 1,
respectively. In extensive form,

    W = [ w^(1) w^(2) w^(3) ] = [ -1   2   1/6 ]
                                [  0   1  -2/3 ]
                                [  1   2   1/6 ]
                                [ -1   0  -1/2 ]
                                [ -1   0   1/2 ]
□
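The construction W = VU^{-1} can also be carried out numerically. The
sketch below (illustrative, not part of the text) recovers U from the
Cholesky factor of V^T V: numpy.linalg.cholesky returns the lower factor G
with GG^T = V^T V, and since G = U^T D^{1/2}, dividing each column of G by
its diagonal entry yields L = U^T.

```python
import numpy as np

# Orthogonalize the columns of V from Example 2.12.1 via W = V U^{-1}.
V = np.array([[-1.,  3., -3.],
              [ 0.,  1., -2.],
              [ 1.,  1., -2.],
              [-1.,  1., -1.],
              [-1.,  1.,  0.]])

G = np.linalg.cholesky(V.T @ V)     # lower triangular, G G^T = V^T V
U = (G / np.diag(G)).T              # unit-diagonal upper factor, U = L^T

W = V @ np.linalg.inv(U)            # W = V U^{-1}
print(np.round(W.T @ W, 10))        # diagonal: diag(4, 9, 1) = D
print(np.linalg.norm(W, axis=0))    # column norms 2, 3, 1
```

The diagonal of W^T W reproduces D = diag(4, 9, 1), and the column norms
are √d_ii = 2, 3, 1, as computed by hand above.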
We make the following final remarks on this topic.
• Dividing each w^(i) in the above transformation by its norm ‖w^(i)‖ = √(d_ii), we obtain a new set of orthogonal vectors, each having unit norm. A set of vectors having these two properties (orthogonality and unit norm) is known as orthonormal. The new matrix W is related to V via

    V = W D^{1/2} U   and   W = V U^{-1} D^{-1/2}

where D^{1/2} and D^{-1/2} are diagonal matrices obtained by taking the same (respective) powers of the diagonal elements of D.
• The upper triangular matrix D^{1/2} U above can be computed in MATLAB using the function CHOL, with argument given by V^T V. This is the same as producing the Cholesky form of the LU factorization of a positive definite symmetric matrix A, where the lower and upper triangular parts are transposes of each other:

    A = (D^{1/2} U)^T (D^{1/2} U)
2.13 Complex-Valued Matrices and Vectors
2.13.1 Similarities to the Real-Valued Case
Thus far we have considered matrices and vectors with real-valued entries
only. From an applications perspective, this restriction is justiﬁable, since
real-world signals are real-valued—it therefore makes sense to represent or
approximate them using real-valued reference vectors. One class of refer-
ence vectors of particular importance in signal analysis comprises segments
of sinusoidal signals in discrete time. As the analysis of sinusoids is simpli-
ﬁed by considering complex sinusoids (or phasors), it pays to generalize the
tools and concepts developed so far to include complex-valued matrices and
vectors.
An m × n complex-valued matrix takes values in C^{m×n}, where C denotes
the complex plane. Since each entry of the matrix consists of a real and an
imaginary part, a complex-valued matrix stores twice as much information as
a real-valued matrix of the same size. This fact has interesting implications
about the representation of real-valued signals by complex-valued reference
vectors.
Partitioning a matrix A ∈ C^{m×n} into its columns, we see that each column of A is a vector z in C^{m×1}, which can be expressed as

    z = z_1 e^(1) + ··· + z_m e^(m)

Here, e^(1), ..., e^(m) are the standard (real-valued) unit vectors, and z_1, ..., z_m are the complex-valued elements of z:

    z = [ z_1 ]
        [ z_2 ]
        [  ⋮  ]
        [ z_m ]
Note that the "information-doubling" effect mentioned in the previous paragraph is due to the fact that each z_i is complex; the vector z itself has dimension m, as does any real-valued vector x (which is also a linear combination of e^(1), ..., e^(m), but with real-valued coefficients x_1, ..., x_m).
Complex-valued matrices and vectors behave in exactly the same manner
as real-valued ones. Since all matrix computations are based on scalar addi-
tion and multiplication, one only needs to ensure that in the complex case,
these (scalar) operations are replaced by complex addition and multiplica-
tion. We thus have the following straightforward extensions of properties
and concepts deﬁned earlier:
• Matrix multiplication: Deﬁned as before (using complex addition and
multiplication).
• Matrix transposition (T): Deﬁned as before. No algebraic operations
are needed here. A modiﬁed transpose will be deﬁned in the next
subsection.
• Linear Independence: Deﬁned as before, allowing complex coeﬃcients
in linear combinations. Thus linear independence of the columns of A
means that z = 0 is the only solution of Az = 0.
• Matrix inverse: As before, a square matrix is nonsingular (and thus
has an inverse) provided its columns are linearly independent.
• Solution of Az = b: Gaussian elimination is still applicable. In terms of computational effort, this is a more intensive problem, since each complex multiplication involves four real multiplications and three real additions. One can also write out the above matrix equation using real constants and unknowns; twice as many variables are needed, since each complex variable z_i has a real and an imaginary part.
2.13.2 Inner Products of Complex-Valued Vectors
The inner product of v and w in C^{m×1} is defined by

    ⟨v, w⟩ = Σ_{i=1}^{m} v_i^* w_i = (v^*)^T w

where, as usual, the superscript * denotes complex conjugate. Note that if the two vectors are real-valued (i.e., have zero imaginary parts), the complex conjugate has no effect on the value of v, and the above definition becomes that of the inner product of two real-valued vectors.
The combination of the complex conjugate (*) and transpose (T) operators (in either order) is known as the conjugate transpose, or Hermitian, operator H. Thus

    ⟨v, w⟩ = v^H w
The following identities involving complex conjugates of scalars are useful in manipulating inner products and conjugate transposes of vectors:

    (z^*)^* = z
    (z_1 + z_2)^* = z_1^* + z_2^*
    (z_1 z_2)^* = z_1^* z_2^*
    |z|^2 = z z^*
We immediately obtain:

    ⟨w, v⟩ = Σ_{i=1}^{m} w_i^* v_i = ( Σ_{i=1}^{m} v_i^* w_i )^* = ⟨v, w⟩^*

or equivalently,

    w^H v = (v^H w)^*

Also, if c is a complex-valued scalar,

    ⟨v, cw⟩ = c ⟨v, w⟩
    ⟨cv, w⟩ = c^* ⟨v, w⟩
    ⟨cv, cw⟩ = |c|^2 ⟨v, w⟩
The norm (or length) of v in C^{m×1} is defined as

    ‖v‖ = ( Σ_{i=1}^{m} |v_i|^2 )^{1/2}

where |·| denotes the magnitude of the complex element v_i:

    |v_i|^2 = v_i^* v_i = Re^2{v_i} + Im^2{v_i}

We thus see that ‖v‖^2 is given by the sum of squares of all real and imaginary parts contained in the vector v. We also have

    ‖v‖ = ( Σ_{i=1}^{m} v_i^* v_i )^{1/2} = ⟨v, v⟩^{1/2}

which is the same relationship between norm and inner product as in the real-valued case. This is a key reason for introducing the complex conjugate in the definition of the inner product.
Example 2.13.1. Let

    v = [1+j  -j  5-2j]^T   and   w = [3  1-j  1+2j]^T

Then

    ⟨v, w⟩ = v^H w = (1-j)·3 + j·(1-j) + (5+2j)·(1+2j) = 5 + 10j
    ⟨w, v⟩ = w^H v = 5 - 10j
    ‖v‖^2 = v^H v = (1 + 1) + (1) + (25 + 4) = 32
    ‖w‖^2 = w^H w = 9 + (1 + 1) + (1 + 4) = 16
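These computations can be reproduced with NumPy's vdot, which conjugates its first argument and thus matches the convention ⟨v, w⟩ = v^H w (a Python sketch, for illustration):

```python
import numpy as np

# The vectors of Example 2.13.1.
v = np.array([1+1j, -1j, 5-2j])
w = np.array([3, 1-1j, 1+2j])

# np.vdot conjugates its FIRST argument, matching <v, w> = v^H w.
assert np.vdot(v, w) == 5 + 10j
assert np.vdot(w, v) == 5 - 10j
assert np.vdot(v, v).real == 32      # ||v||^2
assert np.vdot(w, w).real == 16      # ||w||^2
```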
Note that by separating complex vectors into their real and imaginary parts, i.e., v = a + jb and w = c + jd, we can also express inner products and norms of complex vectors in terms of inner products of real vectors, e.g.,

    ⟨v, w⟩ = (a^T - j b^T)(c + j d) = a^T c + b^T d + j(a^T d - b^T c)
The following properties of the conjugate transpose H are identical to those of the transpose T:

    (A^H)^H = A
    (AB)^H = B^H A^H
    (A^{-1})^H = (A^H)^{-1}
As a final reminder, in MATLAB:
• .' denotes T and ' denotes H.
• In computations involving real matrices (exclusively), the same result will be obtained by either transpose.
• Where complex matrices are involved, the conjugate transpose (H or ') is usually needed.
• For changing the (row-column) orientation of complex vectors, the ordinary transpose .' must be used; use of ' will result in an error (due to the conjugation).
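The same distinction exists in NumPy, where .T is the plain transpose (MATLAB's .') and .conj().T is the conjugate transpose (MATLAB's '). A brief sketch (for illustration):

```python
import numpy as np

z = np.array([[1+2j, 3-1j]])     # a 1x2 complex row vector

# .T transposes without conjugating; .conj().T also conjugates.
zt = z.T            # 2x1, entries unchanged
zh = z.conj().T     # 2x1, entries conjugated

assert zt[0, 0] == 1+2j
assert zh[0, 0] == 1-2j
```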
2.13.3 The Least-Squares Problem in the Complex Case
Orthogonality of complex-valued vectors is defined in terms of their inner product:

    v ⊥ w  ⇔  ⟨v, w⟩ = 0  ⇔  ⟨w, v⟩ = 0

or equivalently:

    v ⊥ w  ⇔  v^H w = 0  ⇔  w^H v = 0

This is essentially the same definition as in the real-valued case.
Remark. Note that in geometric terms, it is considerably more difficult to visualize C^{m×1} than R^{m×1}, and even simple concepts such as orthogonality can lead to intuitive pitfalls. For example, the scalars v = 1 and w = j are at right angles to each other on the complex plane; yet viewed as vectors v and w in C^{1×1}, they are not orthogonal, since ⟨v, w⟩ = j.
With the given definitions of the inner product and orthogonality for complex vectors, the formulation and solution of the least-squares approximation problem in the complex case turns out to be the same as in the real case. Assuming that v^(1), ..., v^(n) are linearly independent vectors in C^{m×1} (where n ≤ m), we seek to approximate s ∈ C^{m×1} by Vc so that the squared norm

    ‖Vc - s‖^2

(i.e., the sum of squares of all real and imaginary parts contained in the error vector Vc - s) is minimized.
Following the same argument as in the real-valued case, we define the projection ŝ of s on R(V) by the following n relationships:

    ŝ - s ⊥ v^(i)  ⇔  (v^(i))^H ŝ = (v^(i))^H s

where i = 1, ..., n. Equivalently, in matrix form we have

    V^H ŝ = V^H s

and letting ŝ = Vc as before, we obtain

    V^H V c = V^H s

Since H is the extension of T to the complex case, we have, in effect, the same formula for computing projections as before.
The proof that the projection ŝ on R(V) solves the least-squares approximation problem for complex vectors is identical to the proof given in Subsection 2.10.4 for real vectors. Care should be exercised in writing out one equation: if y is the competing approximation, then

    ‖y - s‖^2 = ‖ŝ - s‖^2 + ‖y - ŝ‖^2 + ⟨y - ŝ, ŝ - s⟩ + ⟨ŝ - s, y - ŝ⟩

The last two terms on the right-hand side (which are conjugates of each other) are, again, equal to zero by the assumption of orthogonality of ŝ - s and any vector in R(V).
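A quick numerical sketch of the complex normal equations V^H V c = V^H s and the resulting orthogonality of the error vector (Python/NumPy, with randomly generated complex data purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3
# A random complex matrix V (columns linearly independent with prob. 1)
# and a random complex target vector s.
V = rng.normal(size=(m, n)) + 1j*rng.normal(size=(m, n))
s = rng.normal(size=m) + 1j*rng.normal(size=m)

# Normal equations V^H V c = V^H s, then the projection s_hat = V c.
c = np.linalg.solve(V.conj().T @ V, V.conj().T @ s)
s_hat = V @ c

# The error s_hat - s is orthogonal to every column of V.
assert np.allclose(V.conj().T @ (s_hat - s), 0)
# Least squares (lstsq) yields the same coefficients.
assert np.allclose(c, np.linalg.lstsq(V, s, rcond=None)[0])
```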
Problems
Section 2.1
P 2.1. (MATLAB) Enter the matrix
A = [1 2 3 4; 5 6 7 8; 9 10 11 12]
(i) Find two-element arrays I and J such that A(I,J) is a 2 × 2 matrix consisting of the corner elements of A.
(ii) Suppose that B=A initially. Find two-element arrays K and L such that
B(:,K) = B(:,L)
swaps the first and fourth columns of B.
(iii) Explain the result of
C = A(:)
(iv) Study the function RESHAPE. Use it together with the transpose operator .' in a single (one-line) command to generate the matrix

    [ 1  2  3   4   5   6 ]
    [ 7  8  9  10  11  12 ]

from A.
P 2.2. (MATLAB) Study the functions FLIPUD, HILB and TOEPLITZ.
Let N be an integer. How would you define the vector r so that the following command string returns an all-zero N-by-N matrix?
R = 1./r;
A = toeplitz(R);
B = flipud(A);
B(1:N,1:N) - hilb(N)
Section 2.2
P 2.3. If

    B [1 0]^T = [3 -2 -1]^T   and   B [-1 1]^T = [4 7 2]^T,

determine

    B [2 -1]^T
P 2.4. Let G be an m × 2 matrix such that

    G [-1 1]^T = u   and   G [2 1]^T = v

Express

    G [3 3]^T

in terms of u and v.
P 2.5. (i) The transformation A : R^3 → R^3 is such that

    A [1 0 0]^T = [4 -1 2]^T ,   A [0 1 0]^T = [-2 3 -1]^T ,   A [0 0 1]^T = [1 0 3]^T

Write out the matrix A and compute Ax, where

    x = [2 5 -1]^T
(ii) The transformation B : R^3 → R^2 is such that

    B [1 0 0]^T = [2 -4]^T ,   B [1 1 0]^T = [3 -2]^T ,   B [1 1 1]^T = [-2 1]^T

Determine the matrix B.
P 2.6. Compute by hand the matrix product AB in the following cases:

    A = [ 2 -1 ]
        [ 3 -2 ]  ,   B = [ -2 -3  3 ]
        [ 1  5 ]          [  4  0  7 ]  ;

    A = [  2  0  0 ]      [ 2 -2  3 ]
        [ -2  4  0 ]  , B = [ 0  4 -5 ]
        [  3 -5  1 ]      [ 0  0  1 ]
P 2.7. If

    A = [ a b c ]
        [ d e f ]
        [ g h i ]

find vectors u and v such that

    u^T A v = b
P 2.8. (MATLAB) Generate a 60 × 60 matrix A as follows:
c = zeros(1,60);
c(1) = -1; c(2) = 1;
r = c;
A = toeplitz(c,r);
(i) Write a command that displays the top left 6 × 6 block of A.
(ii) Generate four sinusoidal vectors x1, x2, x3 and x4 as follows:
n = 1:60;
x1 = cos(0*n)';
x2 = cos(pi*n/6)';
x3 = cos(pi*n/3)';
x4 = cos(pi*n/2)';
and compute the products
y1 = A*x1; y2 = A*x2; y3 = A*x3; y4 = A*x4;
(iii) Use the SUBPLOT feature to display bar graphs of all eight vectors (against n) on the same figure window.
(iv) One of the eight vectors consists mostly of zero values. Explain mathematically why this is so.
Section 2.3
P 2.9. Consider the matrices

    A = [ a b c ]
        [ d e f ]
        [ g h i ]

and

    B = [ i g h ]
        [ f d e ]
        [ c a b ]

Express B as P^(r) A P^(c), where P^(r) and P^(c) are permutation matrices.
P 2.10. Let

    x = [a b c d]^T

If P and Q are 4 × 4 matrices such that

    Px = [c a b d]^T   and   Qx = [a c d b]^T

determine the product PQ.
P 2.11. Matrices A and B are generated in MATLAB using the commands
A = [1:3; 4:6; 7:9] ;
B = [9:-1:7; 6:-1:4; 3:-1:1] ;
Find matrices P and Q such that B = PAQ.
P 2.12. (i) Find an (n × n) matrix P such that if

    x = [x_1  x_2  x_3  ...  x_n]^T,

then

    Px = [x_n  x_{n-1}  x_{n-2}  ...  x_1]^T.

(ii) Without performing any calculations, determine the r-th power P^r for any integer r.
(iii) Let the vector x be given by

    x[k] = cos(ωk + φ),   (1 ≤ k ≤ n)

Under what conditions on ω, φ and n is the relationship Px = x true?
P 2.13. Express each of the matrices

    A = [  a    b    c    d  ]
        [  2a   2b   2c   2d ]
        [ -a   -b   -c   -d  ]
        [ -2a  -2b  -2c  -2d ]

and

    B = [ 1       t       t^2     t^3 ]
        [ t^{-1}  1       t       t^2 ]
        [ t^{-2}  t^{-1}  1       t   ]
        [ t^{-3}  t^{-2}  t^{-1}  1   ]

as a product of a column vector and a row vector.
Sections 2.4–2.5
P 2.14. Consider the matrix

    A = [ cos θ  -sin θ ]
        [ sin θ   cos θ ]

which represents a counterclockwise rotation by an angle θ.
(i) Without performing any matrix operations, derive A^{-1}.
(ii) Verify that AA^{-1} = I by computing the product on the left-hand side of the equation.
Section 2.6
P 2.15. Solve the simultaneous equations

(i)
    x_1 + x_2/2 + x_3/3 = 14
    x_1/2 + x_2/3 + x_3/4 = 1
    x_1/3 + x_2/4 + x_3/5 = -2

(ii)
    3x_1 - 5x_2 - 5x_3 = -37
    5x_1 - 5x_2 - 2x_3 = -17
    2x_1 + 3x_2 + 4x_3 = 32

using Gaussian elimination.
Section 2.7
P 2.16. (i) Determine the LU and LDV factorizations of

    A = [ 4  8  4   0 ]
        [ 1  5  4  -3 ]
        [ 1  4  7   2 ]
        [ 1  3  0  -2 ]

(ii) Solve Ax = b, where b = [28 13 23 4]^T, by means of the two triangular systems

    Ly = b   and   Ux = y
P 2.17. Repeat P 2.16 for

    A = [  1  1  0  3 ]
        [  2  1 -1  1 ]
        [  3 -1 -1  2 ]  ,   b = [4 1 -3 4]^T
        [ -1  2  6 -1 ]
Section 2.8
P 2.18. (i) Use row pivoting to solve the simultaneous equations Ax = b, where

    A = [ 1   2   4  1 ]
        [ 2   8   6  6 ]
        [ 3  10   8  8 ]  ,   b = [21 52 79 82]^T
        [ 4  12  10  6 ]

(ii) Determine the permuted LU factorization LU = PA.
P 2.19. Consider the simultaneous equations Ax = b, where

    A = [ 0.002  3.420 ]
        [ 2.897  3.412 ]  ,   b = [3.422 6.309]^T

The exact solution is x_1 = 1, x_2 = 1. Solve using finite-precision Gaussian elimination (a) without row pivoting; (b) with row pivoting. Round to four significant digits after each computation.
Section 2.9
P 2.20. Use Gaussian elimination to determine the inverses of

    A = [  3  2  1 ]
        [ -1  5  2 ]
        [  2  3  1 ]

and

    B = [ 1   2   4  1 ]
        [ 2   8   6  6 ]
        [ 3  10   8  8 ]
        [ 4  12  10  6 ]
P 2.21. Consider the following MATLAB script:
A = [1 2 -4; -1 -1 5; 2 7 -3];
I = [1 0 0; 0 1 0; 0 0 1];
X = A\I
Using Gaussian elimination, determine (by hand) the answer X.
P 2.22. Consider the simultaneous equations Ax = b, where

    A = [  3.000  -4.000   2.000 ]
        [ -1.000   3.000  -2.000 ]  ,   b = [1.000 1.000 3.000]^T
        [  1.001   1.999  -2.001 ]

Assuming that the entries of A are precise, examine how the solution x is affected by rounding the entries of b to four significant digits. To that end, let each entry of b vary by ±.0005. For each of the 2^3 = 8 extremal values, compute the solution x using the \ (backslash) operator in MATLAB, and determine the maximum distance from the original solution (i.e., compute norm(x-A\b)).
P 2.23. The n × n Hilbert matrix A, whose entries are given by

    a_ij = 1/(i + j - 1),

is a classic example of an ill-conditioned matrix. This problem illustrates the effect of perturbing each entry of A by a small random amount.
(i) Use the MATLAB function HILB to display the 5 × 5 Hilbert matrix. The command format rat will display it in rational form. Also, use the function INVHILB to display its inverse, which has integer entries.
(ii) Enter
format long
n = 5;
A = hilb(n);
b = ones(n,1)
c = b + 0.001*rand(n,1)
Compare the values
max(abs(b-c))
and
max(abs((A\c)./(A\b)-1))
Repeat for n=10. What do you observe?
Section 2.10
P 2.24. Consider the four-dimensional vectors a = [-1 7 2 4]^T and b = [3 0 -1 -5]^T.
(i) Compute ‖a‖, ‖b‖ and ‖b - a‖.
(ii) Compute the angle θ between a and b (where 0 ≤ θ ≤ π).
(iii) Let f be the projection of b on a, and g be the projection of a on b. Express f and g as λa and µb, respectively (λ and µ are scalars).
(iv) Verify that b - f ⊥ a and a - g ⊥ b.
P 2.25. Consider the three-dimensional vectors v^(1) = [-1 1 1]^T, v^(2) = [2 -1 3]^T, and s = [1 2 3]^T.
(i) Determine the projection ŝ of s on the plane defined by v^(1) and v^(2).
(ii) Show that the projection of ŝ on v^(1) is the same as the projection of s on v^(1). (Is this result expected from three-dimensional geometry?)
P 2.26. Let a^(1), a^(2), a^(3) and a^(4) be the columns of the matrix

    A = [ 1  1/2  1/4  1/8 ]
        [ 0   1   1/2  1/4 ]
        [ 0   0    1   1/2 ]
        [ 0   0    0    1  ]

Determine the least squares approximation

    p = c_1 a^(1) + c_2 a^(2) + c_3 a^(3)

of a^(4) based on a^(1), a^(2) and a^(3). Also determine the relative (root mean square) error

    ‖p - a^(4)‖ / ‖a^(4)‖
Section 2.11
P 2.27. Consider the data set

    t  0.10   0.15   0.22   0.34   0.44   0.51   0.62   0.75   0.89   1.00
    s  0.055  0.358  0.637  1.073  1.492  1.677  2.181  2.299  2.862  3.184

Find the least squares approximation to the data set shown above in terms of
(i) a straight line f_1(t) = a_1 t + a_0;
(ii) a parabola f_2(t) = b_2 t^2 + b_1 t + b_0.
Plot two graphs, each showing the discrete data set and the approximating function.
P 2.28. Consider the data set

    t  -2    -1     0    1    2
    s  -0.2  -0.2   0.8  1.6  3.1

Let f(t) = c_1 t^2 + c_2 t + c_3 be the parabola that yields the least squares fit to the above data set, and let c = [c_1 c_2 c_3]^T.
(i) Determine a matrix A and a vector b such that Ac = b.
(ii) Is it possible to deduce the value of c_2 independently of the other variables c_1 and c_3 (i.e., without having to solve the entire system of equations)? If so, what is the value of c_2?
P 2.29. Five measurements x_i taken at times t_i are shown below.

    t_i  0.1   0.2   0.3   0.4   0.5
    x_i  2.88  3.48  4.29  5.00  6.25

It is believed that the data follow an exponential law of the form

    x(t) = a e^{bt}

or equivalently,

    ln x(t) = ln a + bt

Determine the values of a and b in the least squares approximation of the vector [s_i] = [ln x_i] by the vector [ln a + b t_i] (where i = 1, ..., 5).
P 2.30. Five measurements x_i taken at times t_i are shown below.

    t_i  10    20    30    40    50
    x_i  14.0  20.9  26.7  32.2  36.8

It is believed that the data obey a power law of the form

    x(t) = a t^r

or equivalently,

    ln x(t) = ln a + r ln t

Determine the values of a and r in the least squares approximation of the vector [s_i] = [ln x_i] by the vector [ln a + r ln t_i] (where i = 1, ..., 5).
P 2.31. The transient response of a passive circuit has the form

    y(t) = a_1 e^{-1.5t} + a_2 e^{-1.2t}   (t ≥ 0)

where t is in seconds. The constants a_1 and a_2 depend on the initial state (at t = 0) of the circuit.
The data vector s in s1.txt contains noisy measurements of y(t) taken every 200 milliseconds starting at t = 0. Find the values a_1 and a_2 which provide the least squares approximation ŝ to the data s. Plot s and ŝ on the same graph.
P 2.32. The data vector s can be found in s2.txt. It consists of noisy samples of the sinusoid

    x(t) = a_1 cos(200πt) + a_2 sin(200πt) + b_1 cos(250πt) + b_2 sin(250πt) + c

taken every 100 microseconds (starting at t = 0). Find the values of a_1, a_2, b_1, b_2 and c which provide the least squares approximation ŝ to the data s. Plot s and ŝ on the same graph.
Section 2.12
P 2.33. Let v^(1), v^(2) and v^(3) be the columns of the matrix

    V = [  2 -1  3 ]
        [  4  1 -1 ]
        [ -1  2  2 ]

(i) Show that v^(1), v^(2) and v^(3) (and thus also V itself) are orthogonal. Display V^T V.
(ii) Without performing Gaussian elimination, solve

    Vc = s

for s = [7 2 -5]^T.
(iii) Determine the projection ŝ of s on the plane defined by v^(1) and v^(2). What is the relationship between the error vector ŝ - s and the projection of s onto v^(3)?
(iv) Scale the columns of V so as to obtain an orthonormal matrix.
P 2.34. Consider the matrix

    V = [ 1  1  1  0 ]
        [ 1  1 -1  0 ]
        [ 1 -1  0  1 ]
        [ 1 -1  0 -1 ]

(i) Compute the inner product matrix V^T V.
(ii) If

    x = [0 1 -2 3]^T

determine a vector c such that Vc = x.
P 2.35. Consider the 4 × 3 matrix

    V = [ 1  1/2  1/4 ]
        [ 0   1   1/3 ]
        [ 0   0    1  ]
        [ 0   0    0  ]

(i) The 3 × 3 inner product matrix V^T V has an obvious normalized LU factorization of the form

    V^T V = LDU

where the diagonal entries of L and U are unity and U = L^T. What are L, D and U in that factorization?
(ii) Show how the orthonormal matrix

    W = [ 1 0 0 ]
        [ 0 1 0 ]
        [ 0 0 1 ]
        [ 0 0 0 ]

can be formed by taking linear combinations of the columns of V.
P 2.36. Consider the symmetric matrix

    A = [  1 -3  2 ]
        [ -3  7 -4 ]
        [  2 -4  5 ]

(i) Using Gaussian elimination, find matrices U and D such that

    A = U^T D U

where U is an upper triangular matrix whose main diagonal consists of 1's only, and D is a diagonal matrix.
(ii) Does there exist an m × 3 matrix V (where m is arbitrary) such that V^T V = A?
P 2.37. Consider the matrix

    V = [  1  4 -1 ]
        [ -2  1  5 ]
        [  1 -3  3 ]
        [  0  1  1 ]

Find a 3 × 3 nonsingular matrix B such that

    W = V B^{-1}

is an orthogonal matrix (i.e., its columns are orthogonal and have nonzero norm).
P 2.38. Consider the 5 × 3 matrix

    V = [ v^(1)  v^(2)  v^(3) ] =
        [ 2  3   2 ]
        [ 1  4   2 ]
        [ 1  1  -1 ]
        [ 1  3   4 ]
        [ 1  2  -1 ]

(i) Express the inner product matrix V^T V in the form LDL^T.
(ii) Using U = L^T, obtain a 5 × 3 matrix W such that R(W) = R(V) and the columns of W are orthogonal.
(iii) Determine the projection of s = [-6 -1 3 -1 8]^T on R(V) (which is the same as R(W)).
P 2.39. Consider an n × 3 real-valued matrix

    V = [ v^(1)  v^(2)  v^(3) ]

which is such that

    V^T V = [  16  -8  -12 ]
            [  -8   8    8 ]
            [ -12   8   11 ]

(i) Determine matrices D (diagonal with positive entries) and U (upper triangular with unit diagonal entries) such that

    V^T V = U^T D U

(ii) Determine coefficients c_ij that result in orthogonal vectors w^(1), w^(2) and w^(3), where:

    w^(1) = v^(1)
    w^(2) = c_12 v^(1) + v^(2)
    w^(3) = c_13 v^(1) + c_23 v^(2) + v^(3)
P 2.40. (MATLAB) Consider the discrete-time signals v1, v2, v3 and v4
deﬁned by
t = ((0:0.01:1)*(pi/4))';
v1 = sin(t);
v2 = sin(2*t);
v3 = sin(3*t);
v4 = sin(4*t);
(i) Plot all four signals on the same graph.
(ii) Use the function CHOL to obtain four orthogonal signals w1, w2, w3 and
w4 which are linearly equivalent to the above signals. Verify that the signals
obtained are indeed orthogonal.
(iii) Plot w1, w2, w3 and w4 on the same graph.
P 2.41. (i) Use MATLAB to verify that the six sinusoidal vectors in R^6 defined below are orthogonal:
n = (0:5).’;
r1 = cos(0*n);
r2 = cos(pi*n/3);
r3 = sin(pi*n/3);
r4 = cos(2*pi*n/3);
r5 = sin(2*pi*n/3);
r6 = cos(pi*n);
(ii) Verify that the vector s = [0 -3 1 0 -1 3]^T can be represented as a linear combination of r3 and r5, i.e., in terms of sines only.
(iii) Without performing any additional computations, express

    x = [3 0 4 3 2 6]^T

as a linear combination of the six given sinusoidal vectors.
P 2.42. Fourier sinusoids are not the only class of orthogonal reference vectors
used in signal approximation/representation. In wavelet analysis, reference
vectors reﬂect the frequency content of the signal which is being analyzed,
as well as its “localized” behavior at diﬀerent times. This is accomplished
by taking as reference vectors scaled and shifted versions of a basic ﬁnite-
duration pulse (or wavelet).
As a simple example, consider the problem of approximating an (m = 256)-dimensional signal s using n = 64 reference vectors derived from the basic Haar wavelet, which is a succession of two square pulses of opposite amplitudes. The 64 columns of the matrix V are constructed as follows:
• The first column is an all-ones vector, and provides resolution of order r = 0.
• The third column is a half-length Haar wavelet: a pulse of 64 +1’s
followed by 64 −1’s, followed by 128 zeros. In the fourth column, the
pulse is shifted to the right by 128 positions so that it does not overlap
the pulse in the third column. These two columns together provide
resolution of order r = 2.
• We continue in this fashion to obtain resolutions of order up to r = 6. Resolution of order r is provided by columns 2^{r-1} + 1 through 2^r of V. Each of these columns contains a scaled and shifted Haar wavelet of duration 2^{9-r}. The number of columns at that resolution is the maximum possible under the constraint that pulses across columns do not overlap in time (row index).
The matrix V is generated in MATLAB using the following script (also
found in haar1.txt):
p = 8; % vector size = m = 2^p
rmax = 6; % max. resolution; also, n = 2^rmax
a = ones(2^p,1);
V = a; % first column (r=0) is all-ones
for r = 1:rmax
v = [a(1:2^(p-r)); -a(1:2^(p-r)); zeros(2^p-2^(p-r+1),1)];
for k = 0:2^(r-1)-1
V = [V v(1+mod(-k*2^(p-r+1):-k*2^(p-r+1)+2^p-1, 2^p),:)];
end
end
(i) Compute V^T V and hence verify that the columns of V are orthogonal.
You can view V using a ribbon plot, where each column (i.e., reference
vector) is plotted as a separate ribbon:
ribbon(V)
% Maximize the graph window for a better view
view(15,45)
pause
view(45,45)
(ii) Plot the following two signals:
s1 = [exp((0:255)/256)]';
s2 = [exp((0:115)/256) -0.5+exp((116:255)/256)].';
(iii) Using the command
y = A*((A'*A)\(A'*s));
where A and s are chosen appropriately, determine and plot the least-squares
approximations of s1 and s2 based on the entire matrix V. Repeat, using
columns 33–64 only, which provide the highest resolution (r = 6).
(Note that projecting s1 on the highest-resolution columns results in a vec-
tor of small values compared to the amplitude of s1; this is because s1 looks
rather smooth at high resolution. On the other hand, s2 has a sharp discon-
tinuity which is captured in the highest-resolution projection. The order of
the resolution can be increased to r = 8, in which case the matrix V has 256
columns and can provide a complete representation of any 256-point signal.
At the highest resolution, each column of V consists of 254 zeros, one +1
and one −1.)
Section 2.13
P 2.43. Consider the equation Az = b, where

    A = [ -1+j  -2-3j ]
        [  1+j    3   ]  ,   b = [4  2-j]^T

and z ∈ C^{2×1}.
(i) Find a real-valued 4 × 4 matrix F and a real-valued 4 × 1 vector c such that the above equation is equivalent to

    F [x_1  y_1  x_2  y_2]^T = c

where z_1 = x_1 + j y_1 and z_2 = x_2 + j y_2.
(ii) Use MATLAB to verify that A\b and F\c are indeed equal.
(ii) Use MATLAB to verify that A\b and F\c are indeed equal.
Chapter 3
The Discrete Fourier Transform
3.1 Complex Sinusoids as Reference Vectors
3.1.1 Introduction
We have seen how a discrete-time signal consisting of ﬁnitely many samples
can be represented as a vector, and how a linear transformation of that
vector can be described by a matrix. In what follows, we will consider
signal vectors x of length N, i.e., containing N samples or points. The time
index n will range from n = 0 to n = N −1. Thus
    x = [ x[0]  x[1]  ...  x[N-1] ]^T

Note the shift relative to our earlier indexing scheme: n = 0, ..., N - 1 instead of n = 1, ..., N.
Figure 3.1: An N-point signal vector x.
The sample values in x provide an immediate representation of the signal in terms of unit pulses at times n = 0, ..., N - 1:

    x = x[0] e^(0) + x[1] e^(1) + ··· + x[N-1] e^(N-1)
The most important feature of this representation is localization in time:
each entry x[n] of the vector x gives us the exact value of the signal at time
n.
In signal analysis, we seek to interpret the information present in a signal.
In particular, we are interested in how signal values at diﬀerent times depend
on each other. In general, this sort of information cannot be obtained by
inspection of the vector x. For example, if the samples in x follow a pattern
closely described by a linear, exponential or sinusoidal function of time, that
pattern cannot be easily discerned by looking at individual numbers in x.
And while a trained eye could detect such a simple pattern by looking at
a plot of x, it could miss a more complicated pattern such as a sum of ﬁve
sinusoids of diﬀerent frequencies.
Fourier analysis is the most common approach to signal analysis, and is
based on representing or approximating a signal by a linear combination of
sinusoids. The importance of sinusoids is due to (among other factors):
• their immunity to distortion when propagating through a linear medium,
or when processed by a linear ﬁlter;
• our perception of real-world signals, e.g., our ability to detect and
separate diﬀerent frequencies in audio signals, as well as our ability to
group harmonics into a single note.
In Fourier analysis, we represent x by a vector X of the same length,
where each entry in X provides the (complex) amplitude of a complex si-
nusoid. The two representations are equivalent. In contrast to the vector
x itself, whose values provide localization in time, the vector X provides
localization in frequency.
3.1.2 Fourier Sinusoids
A standard complex sinusoid x of length N and frequency ω is given by

    x[n] = e^{jωn},   n = 0, ..., N - 1

The designation "standard" means that the complex-valued amplitude is unity—equivalently, the real-valued amplitude is unity and the phase shift is zero. Note that regardless of the value of ω,

    x[0] = 1

Recall from Subsection 1.5.3 that frequencies ω_1 and ω_2 related by

    ω_2 = ω_1 + 2kπ   (k ∈ Z)

yield the same values for x and are thus equivalent. This enables us to limit the effective frequency range to a single interval of angles of length 2π. In what follows, we will usually assume that

    0 ≤ ω < 2π

If we extend the discrete time axis to Z (i.e., n = -∞ to n = ∞), then

    x̃[n] = e^{jωn},   n ∈ Z
defines a complex sinusoid having infinite duration. The sinusoid x̃[·] will repeat itself every N samples provided

    (∀n)  e^{jω(n+N)} = e^{jωn}

which reduces to

    e^{jωN} = 1

and thus

    ω = k(2π/N)

for some integer k. (This expression was also obtained in Subsection 1.5.4.) Since the effective range of ω is [0, 2π), it follows that k can be limited to the integers 0, ..., N - 1. Thus there are exactly N distinct frequencies ω for which x̃[n] = e^{jωn} is periodic with (fundamental) period equal to N or a submultiple of N. These frequencies are given by

    ω = 0, 2π/N, 4π/N, ..., (N-1)2π/N
Definition 3.1.1. The Fourier frequencies for an N-point vector are given by

    ω = k(2π/N)

where k = 0, ..., N - 1. The corresponding Fourier sinusoids v^(0), v^(1), ..., v^(N-1) are the N-point vectors given by

    v^(k)[n] = e^{j(2π/N)kn},   n = 0, ..., N - 1
We note the following:
• All Fourier frequencies are multiples of 2π/N.
• ω = 0 (which results in a constant signal of value +1) is always a Fourier frequency, corresponding to k = 0 in the above definition.
• ω = π (which results in a signal whose value alternates between +1 and -1) is a Fourier frequency only when N is even.
• If ω ∈ (0, π) is a Fourier frequency, so is 2π - ω ∈ (π, 2π). Put differently, if ω is the k-th Fourier frequency, then 2π - ω is the (N-k)-th Fourier frequency. The corresponding Fourier sinusoids are complex conjugates of each other:

    e^{j(2π-ω)n} = e^{-jωn} = (e^{jωn})^*
148
1 π/4 1
2π/7
Figure 3.2: Fourier frequencies for N = 7 (left) and N = 8 (right),
represented by asterisks on the unit circle.
The Fourier frequencies ω for N = 7 (odd) and N = 8 (even) are illus-
trated in Figure 3.2.
For the remainder of this chapter, Fourier sinusoids of length N will be
used as reference vectors for signals in R
N×1
, and more generally, C
N×1
.
The notation v
(k)
for the k
th
Fourier sinusoid is consistent with our earlier
notation for reference vectors. In particular, we can arrange the Fourier
sinusoids as columns of a N N matrix V:
V =
_
v
(0)
v
(1)
. . . v
(N−1)
¸
For V, the row index n = 0, . . . , N − 1 represents time, while the column
index k = 0, . . . , N−1 represents frequency. The (n, k)
th
entry of V is given
by
V
nk
= v
(k)
[n] = e
j(2π/N)kn
= cos
_
2πkn
N
_
+j sin
_
2πkn
N
_
Note in particular that V
nk
= V
kn
, i.e., V is always symmetric.
The Fourier frequencies and matrices V for N = 1, 2, 3 and 4 are shown in Figure 3.3. Note that the n = 0th row and the k = 0th column equal unity for all N. The k-th column is generated by moving counterclockwise on the unit circle in increments of ω = k(2π/N), starting from z = 1.
3.1.3 Orthogonality of Fourier Sinusoids
Fourier sinusoids of the same length N are orthogonal. As a result, they are well-suited as reference vectors. (Recall that the least squares approximation of a signal by a set of reference vectors is the vector sum of the projections of the signal on each reference vector.)
Proof of the orthogonality of Fourier sinusoids involves the summation of geometric series, which we review below.
    V (N = 1):  [ 1 ]

    V (N = 2):  [ 1   1 ]
                [ 1  -1 ]

    V (N = 3):  [ 1       1               1       ]
                [ 1  -1/2 + j√3/2   -1/2 - j√3/2  ]
                [ 1  -1/2 - j√3/2   -1/2 + j√3/2  ]

    V (N = 4):  [ 1   1   1   1 ]
                [ 1   j  -1  -j ]
                [ 1  -1   1  -1 ]
                [ 1  -j  -1   j ]

Figure 3.3: Fourier sinusoids for N = 1, 2, 3 and 4. The entries of each matrix are marked on the corresponding unit circle.
Fact. If z is complex, then
\[
G_n(z) \stackrel{\mathrm{def}}{=} 1 + z + z^2 + \dots + z^n =
\begin{cases}
(1-z^{n+1})/(1-z), & z \neq 1; \\
n+1, & z = 1.
\end{cases}
\]
The limit of G_n(z) as n → ∞ exists if and only if |z| < 1, and is given by
\[
G_\infty(z) = \frac{1}{1-z}
\]
The above results are from elementary calculus, and are easily obtained by noting that:
• If z = 1, then all terms in the geometric series are equal to 1.
• G_n(z) − zG_n(z) = 1 − z^{n+1}.
• If |z| > 1, then |z|^{n+1} = |z^{n+1}| diverges to infinity.
• If |z| = 1 and z ≠ 1, then z^{n+1} constantly rotates on the unit circle (hence it does not converge).
• If |z| < 1, then |z|^{n+1} converges to zero.
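The closed form above is easy to check numerically. A minimal Python sketch (illustrative, not part of the original text):

```python
import numpy as np

def G(n, z):
    """Partial geometric sum G_n(z) = 1 + z + ... + z^n."""
    if z == 1:
        return n + 1
    return (1 - z ** (n + 1)) / (1 - z)

# Closed form agrees with direct summation
z = 0.5 + 0.3j
assert np.isclose(G(10, z), sum(z ** k for k in range(11)))
assert G(10, 1) == 11

# For |z| < 1, G_n(z) converges to 1/(1-z)
assert np.isclose(G(2000, z), 1 / (1 - z))
```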
To establish the orthogonality of the N-point Fourier sinusoids v^{(k)} and v^{(ℓ)} having frequencies ω = k(2π/N) and ω = ℓ(2π/N), respectively, we need to compute the inner product of v^{(k)} and v^{(ℓ)}:
\[
\langle v^{(k)}, v^{(\ell)} \rangle
= \sum_{n=0}^{N-1} \left(e^{j(2\pi/N)kn}\right)^* e^{j(2\pi/N)\ell n}
= \sum_{n=0}^{N-1} e^{j(2\pi/N)(\ell-k)n}
= \sum_{n=0}^{N-1} z^n
= G_{N-1}(z)
\]
where
\[
z = e^{j(2\pi/N)(\ell-k)}
\]
If k = ℓ, then z = 1 and
\[
\langle v^{(k)}, v^{(k)} \rangle = \|v^{(k)}\|^2 = N
\]
(Note that each entry of v^{(k)} is on the unit circle, hence its squared magnitude equals unity.) Of course, this is also consistent with G_{N−1}(1) = N.
If 0 ≤ k ≠ ℓ ≤ N − 1, then ℓ − k cannot be a multiple of N, and thus z ≠ 1. Using the formula for G_{N−1}(z), we have
\[
\langle v^{(k)}, v^{(\ell)} \rangle = \frac{1-z^N}{1-z}
\]
But
\[
z^N = e^{j(2\pi/N)(\ell-k)N} = e^{j2\pi(\ell-k)} = 1
\]
and hence
\[
\langle v^{(k)}, v^{(\ell)} \rangle = 0
\]
as needed. We have thus obtained the following result.
Fact. The N-point complex Fourier sinusoids comprising the columns of the matrix
\[
V = \begin{bmatrix} v^{(0)} & v^{(1)} & \dots & v^{(N-1)} \end{bmatrix}
\]
are orthogonal, each having squared norm equal to N.
In terms of the inner product matrix V^H V, we have
\[
V^H V = N I
\]
Since V is a square matrix, the above relationship tells us that it is also nonsingular and has inverse
\[
V^{-1} = \frac{1}{N} V^H
\]
It follows that the N complex Fourier sinusoids v^{(0)}, v^{(1)}, ..., v^{(N−1)} are also linearly independent.
In the context of signal approximation and representation, the above facts imply the following:
• The least-squares approximation of any real or complex-valued signal s based on any subset of the above N complex Fourier sinusoids is given by the sum of the projections of s onto each of the sinusoids, where the projection of s onto v^{(k)} is given by the standard expression
\[
\frac{\langle v^{(k)}, s \rangle}{N} \, v^{(k)}
\]
• The least-squares approximation of s based on all N complex Fourier sinusoids is an exact representation of s, and is given by
\[
s = Vc \quad \Leftrightarrow \quad c = V^{-1} s = \frac{1}{N} V^H s
\]
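The orthogonality and representation facts above can be verified numerically. A short NumPy sketch (illustrative, not part of the original text; the name `fourier_matrix` is chosen here):

```python
import numpy as np

def fourier_matrix(N):
    """IDFT matrix V with V[n, k] = exp(j*2*pi*k*n/N)."""
    n = np.arange(N).reshape(-1, 1)   # time index (rows)
    k = np.arange(N).reshape(1, -1)   # frequency index (columns)
    return np.exp(2j * np.pi * k * n / N)

N = 8
V = fourier_matrix(N)

# Orthogonality: V^H V = N I, hence V^{-1} = (1/N) V^H
assert np.allclose(V.conj().T @ V, N * np.eye(N))

# Exact representation of an arbitrary signal s: s = V c with c = (1/N) V^H s
rng = np.random.default_rng(0)
s = rng.standard_normal(N)
c = V.conj().T @ s / N
assert np.allclose(V @ c, s)
```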
Example 3.1.1. Let N = 3, and consider the signals
\[
s = \begin{bmatrix} 2 & -1 & -1 \end{bmatrix}^T
\quad \text{and} \quad
x = \begin{bmatrix} 2 & -1 & 0 \end{bmatrix}^T
\]
The Fourier exponentials are given by the columns of
\[
V = \begin{bmatrix}
1 & 1 & 1 \\
1 & -\tfrac{1}{2}+j\tfrac{\sqrt{3}}{2} & -\tfrac{1}{2}-j\tfrac{\sqrt{3}}{2} \\
1 & -\tfrac{1}{2}-j\tfrac{\sqrt{3}}{2} & -\tfrac{1}{2}+j\tfrac{\sqrt{3}}{2}
\end{bmatrix}
\]
We have
\[
s = Vc \quad \text{and} \quad x = Vd
\]
where
\[
c_0 = \tfrac{1}{3}\langle v^{(0)}, s \rangle = 0, \qquad
c_1 = \tfrac{1}{3}\langle v^{(1)}, s \rangle = 1, \qquad
c_2 = \tfrac{1}{3}\langle v^{(2)}, s \rangle = 1
\]
and
\[
d_0 = \tfrac{1}{3}\langle v^{(0)}, x \rangle = \tfrac{1}{3}, \qquad
d_1 = \tfrac{1}{3}\langle v^{(1)}, x \rangle = \tfrac{5}{6}+j\tfrac{\sqrt{3}}{6}, \qquad
d_2 = \tfrac{1}{3}\langle v^{(2)}, x \rangle = \tfrac{5}{6}-j\tfrac{\sqrt{3}}{6}
\]
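The coefficients in Example 3.1.1 can be checked with NumPy: `np.fft.fft` uses the kernel e^{−j2πkn/N}, so it computes exactly V^H applied to the signal, and c = fft(s)/N. A quick check (not part of the original text):

```python
import numpy as np

s = np.array([2.0, -1.0, -1.0])
x = np.array([2.0, -1.0, 0.0])

# c = (1/N) V^H s, and np.fft.fft computes V^H s
c = np.fft.fft(s) / 3
d = np.fft.fft(x) / 3

assert np.allclose(c, [0.0, 1.0, 1.0])
assert np.allclose(d, [1/3, 5/6 + 1j*np.sqrt(3)/6, 5/6 - 1j*np.sqrt(3)/6])
```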
3.2 The Discrete Fourier Transform and its Inverse
3.2.1 Definitions
We begin our in-depth discussion of the representation of an N-point signal vector s by the Fourier sinusoids in V with a simple change of notation: we replace the coefficient vector c in the equations
\[
s = Vc \quad \Leftrightarrow \quad c = \frac{1}{N} V^H s
\]
by S/N, i.e., we define S = Nc. We thus obtain
\[
s = \frac{1}{N} V S \quad \Leftrightarrow \quad S = V^H s
\]
Recall that the matrix V is symmetric:
\[
V_{nk} = e^{j(2\pi/N)kn} = V_{kn}
\]
Thus
\[
V^H = (V^T)^* = V^*
\]
and
\[
(V^H)_{nk} = e^{-j(2\pi/N)kn}
\]
Definition 3.2.1. The discrete Fourier transform (DFT) of the N-point vector s is the N-point vector S defined by
\[
S = V^H s
\]
or equivalently,
\[
S[k] = \sum_{n=0}^{N-1} s[n] e^{-j(2\pi/N)kn}, \qquad k = 0, \dots, N-1
\]
S will also be referred to as the (complex) spectrum of s.
For the remainder of this chapter, the upper-case boldface symbols S, X and Y will be reserved for the discrete Fourier transforms of the signal vectors s, x and y, respectively.
The equations in the definition of the DFT are also known as the analysis equations. They yield the coefficients needed in order to express the signal s as a sum of Fourier sinusoids. Since these sinusoids are the components of s, computation of the coefficients S[0], ..., S[N − 1] amounts to analyzing the signal into N distinct frequencies.
Definition 3.2.2. The inverse discrete Fourier transform (IDFT) of the N-point vector X is the N-point vector x defined by
\[
x = \frac{1}{N} V X
\]
or equivalently,
\[
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{j(2\pi/N)kn}, \qquad n = 0, \dots, N-1
\]
The equations in the definition of the IDFT are known as the synthesis equations, since they produce a signal vector x by summing together complex sinusoids with (complex) amplitudes given by the entries of (1/N)X.
Clearly,
• if X is the DFT of x, then x is the IDFT of X; and
• the analysis and synthesis equations are equivalent.
A signal x and its DFT X form a DFT pair, denoted by
\[
x \stackrel{\mathrm{DFT}}{\longleftrightarrow} X \quad \text{or simply} \quad x \longleftrightarrow X
\]
Throughout our discussion, the lower-case symbol x will denote a signal in the time domain, i.e., evolving in time n. The DFT (or spectrum) X will be regarded as a signal in the frequency domain, since its values are indexed by the frequency parameter k (corresponding to a radian frequency ω = k(2π/N)).
The two signals x and X have the same physical dimensions and units. Either signal can be assigned arbitrary values in C^{N×1}; such assignment automatically determines the value of the other signal (through the synthesis or analysis equations). Thus any signal in the time domain is also perfectly valid as a signal in the frequency domain (i.e., as the spectrum of a different time-domain signal), and vice versa. Even though time and frequency have a distinctly different physical meaning, the two entities are treated very similarly in the discrete Fourier transform and its inverse. This similarity is quite evident in the analysis and synthesis equations, which differ only in terms of a complex conjugate and a scaling factor.
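The analysis and synthesis equations can be transcribed directly into code. A minimal sketch (the function names `dft` and `idft` are chosen here for illustration, not part of the text):

```python
import numpy as np

def dft(s):
    """Analysis equation: S[k] = sum_n s[n] exp(-j*2*pi*k*n/N)."""
    N = len(s)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return (np.asarray(s) * np.exp(-2j * np.pi * k * n / N)).sum(axis=1)

def idft(S):
    """Synthesis equation: x[n] = (1/N) sum_k X[k] exp(j*2*pi*k*n/N)."""
    N = len(S)
    k = np.arange(N)
    n = k.reshape(-1, 1)
    return (np.asarray(S) * np.exp(2j * np.pi * k * n / N)).sum(axis=1) / N

x = np.array([1.0, 2.0, 3.0, 4.0])
X = dft(x)
assert np.allclose(X, np.fft.fft(x))   # agrees with NumPy's FFT convention
assert np.allclose(idft(X), x)         # round trip recovers x
```

Note the symmetry between the two functions: they differ only in the sign of the exponent and the 1/N scaling, exactly as in the analysis and synthesis equations.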
3.2.2 Symmetry Properties of the DFT and IDFT Matrices
Recall that the N × N matrix
\[
V = \begin{bmatrix} v^{(0)} & v^{(1)} & \dots & v^{(N-1)} \end{bmatrix}
\]
is given by
\[
V_{nk} = e^{j(2\pi/N)kn}
\]
where the row index n = 0, ..., N−1 corresponds to time and the column index k = 0, ..., N−1 corresponds to frequency. V will be referred to as the IDFT matrix for an N-point vector, since it appears (without conjugation) in the synthesis equation. Whenever the value of N is not clear from the context, we will use V_N.
Letting
\[
v = v_N \stackrel{\mathrm{def}}{=} e^{j(2\pi/N)} = \cos\left(\frac{2\pi}{N}\right) + j\sin\left(\frac{2\pi}{N}\right)
\]
we obtain the simple expression
\[
V_{nk} = v^{kn}
\]
and can thus write
\[
V = \begin{bmatrix}
1 & 1 & 1 & \dots & 1 \\
1 & v & v^2 & \dots & v^{N-1} \\
1 & v^2 & v^4 & \dots & v^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & v^{N-1} & v^{2(N-1)} & \dots & v^{(N-1)^2}
\end{bmatrix}
\]
As noted earlier, the entries of V are symmetric about the main diagonal since V_{nk} = v^{kn} = v^{nk}. For other structural properties of V, we turn to the k = 1st column
\[
v^{(1)} = \begin{bmatrix} 1 & v & v^2 & \dots & v^{N-1} \end{bmatrix}^T
\]
and note that the elements of that column represent equally spaced points on the unit circle. This is illustrated in Figure 3.4.
Note that
\[
v^N = e^{j(2\pi/N)N} = e^{j2\pi} = 1
\]
and thus for any integer r,
\[
v^{n+rN} = v^n
\]
Every integer power of v thus equals one of 1, v, ..., v^{N−1}, which means that all columns of V can be formed using entries taken from the k = 1st column v^{(1)}.
Figure 3.4: The entries of the first Fourier sinusoid marked on the unit circle for N = 7 (left, v = e^{j2π/7}, v^7 = 1) and N = 8 (right, v = e^{jπ/4}, v^8 = 1, v^4 = −1).
Example 3.2.1. For N = 6, the entries of V = V_6 can be expressed in terms of 1, v, v^2, v^3, v^4 and v^5, where v = e^{jπ/3}. Note that v^3 = e^{jπ} = −1. We have
\[
V = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
1 & e^{j\pi/3} & e^{j2\pi/3} & -1 & e^{j4\pi/3} & e^{j5\pi/3} \\
1 & e^{j2\pi/3} & e^{j4\pi/3} & 1 & e^{j2\pi/3} & e^{j4\pi/3} \\
1 & -1 & 1 & -1 & 1 & -1 \\
1 & e^{j4\pi/3} & e^{j2\pi/3} & 1 & e^{j4\pi/3} & e^{j2\pi/3} \\
1 & e^{j5\pi/3} & e^{j4\pi/3} & -1 & e^{j2\pi/3} & e^{j\pi/3}
\end{bmatrix}
\]
or equivalently (since e^{j(2π−θ)} = e^{−jθ}):
\[
V = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
1 & e^{j\pi/3} & e^{j2\pi/3} & -1 & e^{-j2\pi/3} & e^{-j\pi/3} \\
1 & e^{j2\pi/3} & e^{-j2\pi/3} & 1 & e^{j2\pi/3} & e^{-j2\pi/3} \\
1 & -1 & 1 & -1 & 1 & -1 \\
1 & e^{-j2\pi/3} & e^{j2\pi/3} & 1 & e^{-j2\pi/3} & e^{j2\pi/3} \\
1 & e^{-j\pi/3} & e^{-j2\pi/3} & -1 & e^{j2\pi/3} & e^{j\pi/3}
\end{bmatrix}
\]
□
The symmetric nature of V = V_6 is evident in Example 3.2.1. We note that in the general case (arbitrary N), the nth and (N−n)th entries of v^{(k)} are complex conjugates of each other:
\[
v^{k(N-n)} = v^{-kn} = \left(v^{kn}\right)^*
\]
This means that the elements of the k = 1st column exhibit a form of conjugate symmetry. The center of symmetry is the row index N/2, which is at equal distance from n and N − n:
\[
n = \frac{N}{2} - \left(\frac{N}{2} - n\right)
\quad \text{and} \quad
N - n = \frac{N}{2} + \left(\frac{N}{2} - n\right)
\]
The value N/2 is an actual row index if N is even, and is midway between two row indices if N is odd. We note also that the n = 0th entry is not part of a conjugate symmetric pair, since the highest row index is n = N − 1, not n = N.
Clearly, the same conjugate symmetry arises in the columns of V, with center of symmetry given by the column index k = N/2. This follows easily from the symmetry of V about its diagonal, i.e., from V = V^T.
We finally note that since
\[
v^{(N-k)(N-n)} = v^{N^2 - kN - nN + kn} = v^{kn}
\]
we also have radial symmetry (without conjugation) about the point (n, k) = (N/2, N/2):
\[
V_{N-n,N-k} = V_{nk}
\]
In particular, radial symmetry implies that entries n = 1, ..., N−1 in the (N−k)th column can be obtained by reversing the order of the same entries in the kth column.
The symmetry properties of V are summarized in Figure 3.5.
Definition 3.2.3. The DFT matrix W = W_N for an N-point vector is defined by
\[
W = V^H = V^*
\]
Specifically, the (k, n)th entry of W is given by
\[
W_{kn} = w^{kn}
\]
where
\[
w = w_N \stackrel{\mathrm{def}}{=} v_N^{-1} = e^{-j2\pi/N}
\]
W is known as the DFT matrix because it appears without conjugation in the analysis equation:
\[
X = V^H x = W x
\]
Note that in the case of W, the frequency index k is the row index, while the time index n is the column index. This is also consistent with the indexing scheme used in the sum form of the analysis equation:
\[
X[k] = \sum_{n=0}^{N-1} x[n] e^{-j(2\pi/N)kn}
\]
Figure 3.5: Symmetry properties of the IDFT matrix V. Equal values are denoted by “=” and conjugate values by “*=”.
We note the following:
• The introduction of W allows us to write the equation
\[
V^H V = V V^H = N I
\]
in a variety of equivalent forms, using
\[
V^H = V^* = W \quad \text{and} \quad V = W^H = W^*
\]
• Since W is just the complex conjugate of V, it has the same symmetry properties as V.
3.2.3 Signal Structure and the Discrete Fourier Transform
The structure of a signal determines the structure of its spectrum. In order
to understand and apply Fourier analysis, one needs to know how certain
key features of a signal in the time domain are reﬂected in its spectrum in
the frequency domain.
A fundamental property of the discrete Fourier transform is its linearity,
which is due to the matrix-vector form of the DFT and IDFT transforma-
tions.
DFT 1. (Linearity) If x ←→ X and y ←→ Y, then for any real or complex scalars α and β,
\[
s = \alpha x + \beta y \quad \longleftrightarrow \quad S = \alpha X + \beta Y
\]
This can be proved using either the analysis or the synthesis equation. Using the former,
\[
S = W s = W(\alpha x + \beta y) = \alpha W x + \beta W y = \alpha X + \beta Y
\]
as needed.
Another fundamental property of the Fourier transform is the duality of the DFT and IDFT transformations. Duality stems from the fact that the two transformations are obtained using matrices W = V^* and (1/N)V, which differ only by a scaling factor and a complex conjugate. As a result, if x ←→ X is a DFT pair, then the time-domain signal y = X has a spectrum Y whose structure is very similar to that of the original time-domain signal x. The precise statement of the duality property will be given in Section 3.4.
Before continuing with our systematic development of DFT properties, we consider four simple signals and compute their spectra.
Example 3.2.2. Let x be the 0th unit vector e^{(0)}, namely a single unit pulse at time n = 0:
\[
x = e^{(0)} = \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \end{bmatrix}^T
\]
The DFT X is given by
\[
X = W e^{(0)}
\]
Right-multiplying W by a unit vector amounts to column selection. In this case, the 0th column of W is selected:
\[
X = \begin{bmatrix} 1 & 1 & 1 & \dots & 1 \end{bmatrix}^T = \mathbf{1}
\]
i.e., X is the all-ones vector. The same result is obtained using the analysis equation in its sum form:
\[
X[k] = \sum_{n=0}^{N-1} x[n] w^{kn} = 1 \cdot w^{k \cdot 0} = 1
\]
for all k = 0, ..., N − 1.
(Figure for Example 3.2.2: the unit pulse x and its constant spectrum X = 1.)
Example 3.2.3. We now delay the unit pulse by taking x = e^{(m)}, or equivalently,
\[
x[n] = \begin{cases} 1, & n = m; \\ 0, & n \neq m, \end{cases}
\]
where 0 ≤ m ≤ N − 1. The DFT becomes
\[
X = W e^{(m)}
\]
i.e., X is the mth column of W, which is the same as the complex conjugate of v^{(m)} (the mth column of V). Thus X is a Fourier exponential of frequency (N − m)(2π/N), or equivalently, −m(2π/N):
\[
X[k] = w^{km} = v^{-km} = e^{-j(2\pi/N)km}
\]
Of course, the same result is obtained using the analysis equation in its sum form:
\[
X[k] = \sum_{n=0}^{N-1} x[n] w^{kn} = 1 \cdot w^{km}
\]
Note that in this example, the spectrum is complex-valued. It is purely real-valued (for all k) if and only if m = 0 (i.e., there is no delay) or m = N/2. The latter value of m is an actual time index only when the number of samples is even, in which case the resulting spectrum is given by X[k] = (−1)^k.
Example 3.2.4. Building upon Example 3.2.3, we add a second pulse of unit height at time n = N − m (where m ≠ 0). The two pulses are now symmetric to each other relative to the time instant n = N/2. Denoting the new signal by s, we have
\[
s = e^{(m)} + e^{(N-m)}
\]
and thus by linearity of the DFT we obtain
\[
S = W e^{(m)} + W e^{(N-m)}
\]
This is the sum of the DFT’s obtained in Example 3.2.3 for delays m and N − m. In particular, we have
\[
S[k] = e^{-j(2\pi/N)km} + e^{-j(2\pi/N)k(N-m)}
= e^{-j(2\pi/N)km} + e^{j(2\pi/N)km}
= 2\cos\left(\frac{2\pi mk}{N}\right)
\]
The spectrum is now purely real-valued. The figure illustrates the case N = 8 and m = 3.
(Figure for Example 3.2.4, with N = 8 and m = 3: pulses at n = 3 and n = 5 = N − m, and the real-valued spectrum S.)
Example 3.2.5. Finally, we introduce unit pulses at every time instant, resulting in the constant signal y = 1. The analysis equation gives
\[
Y = W \mathbf{1}
\]
i.e., Y is the (componentwise) sum of all columns of W. The sum form of the equation gives
\[
Y[k] = \sum_{n=0}^{N-1} 1 \cdot w^{kn}
\]
which is the same as the geometric sum G_{N−1}(w^k). It is also the inner product ⟨v^{(k)}, v^{(0)}⟩, which equals zero if k ≠ 0 and N if k = 0. The resulting spectrum is
\[
Y = N e^{(0)}
\]
For a simpler way of deriving Y, note that
\[
y = \mathbf{1} = v^{(0)}
\]
i.e., y consists of a single Fourier sinusoid of frequency zero (corresponding to k = 0) and unit amplitude. This means that in the synthesis equation
\[
y = \frac{1}{N} V Y,
\]
the DFT vector Y selects the k = 0th column of V and multiplies it by N. Hence Y = Ne^{(0)}.
(Figure for Example 3.2.5: the constant signal y = 1 and its spectrum Y = Ne^{(0)}.)
The graphs for Examples 3.2.2 and 3.2.5 are similar: essentially, time and frequency domains are interchanged with one of the signals undergoing scaling by N. This similarity is a consequence of the duality property of the DFT, which will be discussed formally in Section 3.4.
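The spectra derived in Examples 3.2.2 through 3.2.5 can be verified with NumPy's FFT, which implements the analysis equation with the same sign convention. An illustrative check (not part of the original text) for N = 8 and m = 3:

```python
import numpy as np

N = 8
m = 3
k = np.arange(N)

e0 = np.zeros(N); e0[0] = 1.0              # unit pulse at n = 0    (Example 3.2.2)
em = np.zeros(N); em[m] = 1.0              # delayed pulse at n = m (Example 3.2.3)
s = np.zeros(N); s[m] = 1.0; s[N - m] = 1.0  # symmetric pulse pair (Example 3.2.4)
y = np.ones(N)                             # constant signal        (Example 3.2.5)

assert np.allclose(np.fft.fft(e0), np.ones(N))                       # X = 1
assert np.allclose(np.fft.fft(em), np.exp(-2j * np.pi * k * m / N))  # X[k] = w^{km}
assert np.allclose(np.fft.fft(s), 2 * np.cos(2 * np.pi * m * k / N)) # real-valued
assert np.allclose(np.fft.fft(y), N * e0)                            # Y = N e^{(0)}
```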
3.3 Structural Properties of the DFT: Part I
3.3.1 The Spectrum of a Real-Valued Signal
As we saw in the previous section, any complex-valued vector can be viewed as either
• a time-domain signal; or
• a frequency-domain signal, i.e., the DFT (or spectrum) of a time-domain signal.
Of course, all time-domain signals encountered in practice are real-valued; the generalization to complex-valued signals is necessary in order to include complex sinusoids (and, as we shall see later, complex exponentials) in our analysis. A natural question to ask is whether the spectrum X of a real-valued signal x has special properties that distinguish it from the spectrum of an arbitrary complex-valued signal in the time domain. The answer is affirmative, and is summarized below.
DFT 2. (DFT of a Real-Valued Signal) If x is a real-valued N-point signal, then X satisfies
\[
X[0] = X^*[0]
\]
and
\[
X[k] = X^*[N-k]
\]
for k = 1, ..., N − 1.
To prove this property, we note first that
\[
X[0] = \sum_{n=0}^{N-1} x[n] w^{0 \cdot n} = \sum_{n=0}^{N-1} x[n]
\]
which is real-valued since x has real-valued entries. For k = 1, ..., N − 1, we have
\[
X[k] = \sum_{n=0}^{N-1} x[n] w^{kn}
\]
and
\[
X[N-k] = \sum_{n=0}^{N-1} x[n] w^{(N-k)n} = \sum_{n=0}^{N-1} x[n] w^{-kn}
\]
Taking complex conjugates across the second equation, we obtain
\[
X^*[N-k]
= \left( \sum_{n=0}^{N-1} x[n] w^{-kn} \right)^*
= \sum_{n=0}^{N-1} \left( x[n] w^{-kn} \right)^*
= \sum_{n=0}^{N-1} x^*[n] w^{kn}
\]
Since x is real-valued, we have x^*[·] = x[·] and therefore
\[
X[k] = X^*[N-k]
\]
as needed.
DFT 2 tells us that the DFT (or spectrum) of a real-valued signal exhibits the same kind of conjugate symmetry as was seen in the rows and columns of W and V. This symmetry will be explored further in this chapter.
Expressing X[k] in polar form, i.e.,
\[
X[k] = |X[k]| \, e^{j\angle X[k]}
\]
we obtain two new frequency-domain vectors indexed by k = 0, ..., N − 1. These are:
• The amplitude spectrum, given by |X[·]|. If x is real-valued, then DFT 2 implies that
\[
|X[k]| = |X[N-k]|, \qquad k = 1, \dots, N-1
\]
i.e., the amplitude spectrum is symmetric in k with center of symmetry at k = N/2.
• The phase spectrum, given by the angle ∠X[·] quoted in the interval [−π, π]. If x is real-valued, then DFT 2 implies that
\[
\angle X[0] = 0 \ \text{or} \ \pm\pi
\]
and
\[
\angle X[k] = -\angle X[N-k], \qquad k = 1, \dots, N-1
\]
i.e., the phase spectrum is antisymmetric in k.
Example 3.3.1. In Example 3.1.1, we considered the vector
\[
x = \begin{bmatrix} 2 & -1 & 0 \end{bmatrix}^T
\]
and evaluated its DFT as
\[
X = \begin{bmatrix} 1 & \tfrac{5}{2}+j\tfrac{\sqrt{3}}{2} & \tfrac{5}{2}-j\tfrac{\sqrt{3}}{2} \end{bmatrix}^T
\]
(note the scaling X = 3d, where d is the coefficient vector computed for x in that example). There is only one pair of conjugate symmetric entries here:
\[
X[1] = X^*[2]
\]
The amplitude spectrum is given by
\[
\begin{bmatrix} 1 & 2.6458 & 2.6458 \end{bmatrix}^T
\]
while the phase spectrum is given by
\[
\begin{bmatrix} 0 & 0.3335 & -0.3335 \end{bmatrix}^T
\]
□
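The amplitude and phase spectra in this example can be reproduced numerically (the values in the text are rounded to four decimals). A short check, not part of the original text:

```python
import numpy as np

x = np.array([2.0, -1.0, 0.0])
X = np.fft.fft(x)

amplitude = np.abs(X)    # amplitude spectrum |X[k]|
phase = np.angle(X)      # phase spectrum, in [-pi, pi]

assert np.allclose(amplitude, [1.0, 2.6458, 2.6458], atol=1e-4)
assert np.allclose(phase, [0.0, 0.3335, -0.3335], atol=1e-4)
```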
It is always possible to express a real-valued signal vector x as a linear combination of real-valued sinusoids at the Fourier frequencies. Indeed, the conjugate symmetry of the spectrum X allows us to combine complex conjugate terms
\[
X[k] e^{j(2\pi/N)kn}
\quad \text{and} \quad
X[N-k] e^{j(2\pi/N)(N-k)n} = X^*[k] e^{-j(2\pi/N)kn}
\]
into a single real-valued sinusoid:
\[
2\,\mathrm{Re}\left\{ X[k] e^{j(2\pi/N)kn} \right\}
= 2|X[k]| \cos\left(\frac{2\pi kn}{N} + \angle X[k]\right)
\]
(Note that such pairs occur for values of k other than 0 or N/2.) The resulting real-valued form of the synthesis equation involves Fourier frequencies in the range [0, π] only:
\[
x[n] = \frac{1}{N} X[0] + \frac{1}{N} X[N/2] (-1)^n
+ \frac{2}{N} \sum_{0<k<N/2} |X[k]| \cos\left(\frac{2\pi kn}{N} + \angle X[k]\right)
\]
The second term (corresponding to frequency ω = π) is present only when N/2 is an integer, i.e., when N is even.
Example 3.3.1. (Continued.) The representation of
\[
x = \begin{bmatrix} 2 & -1 & 0 \end{bmatrix}^T
\]
using real sinusoids involves a constant term (corresponding to k = 0) together with a sinusoid of frequency 2π/3 (corresponding to k = 1 and k = 2). Substituting the values of X[k] computed earlier into the last equation, we obtain
\[
x[n] = 0.3333 + 1.7639 \cos\left(\frac{2\pi n}{3} + 0.3335\right)
\]
for n = 0, 1, 2.
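The real-valued synthesis formula can be checked numerically for the running example. A short sketch (not part of the original text; here N = 3 is odd, so the X[N/2] term is absent):

```python
import numpy as np

x = np.array([2.0, -1.0, 0.0])
X = np.fft.fft(x)
N = len(x)
n = np.arange(N)

# x[n] = (1/N) X[0] + (2/N) |X[1]| cos(2*pi*n/N + angle(X[1]))
x_rebuilt = (X[0].real / N
             + (2 / N) * np.abs(X[1]) * np.cos(2 * np.pi * n / N + np.angle(X[1])))

assert np.allclose(x_rebuilt, x)
```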
3.3.2 Signal Permutations
Certain important properties of the DFT involve signal transformations in both time and frequency domains. These transformations are simple permutations of the elements of a vector and, as such, can be described by permutation matrices. Recall from Section 2.3 that the columns of an N × N permutation matrix are the unit vectors e^{(0)}, ..., e^{(N−1)} listed in arbitrary order.
We introduce three permutations which are particularly important in our study of the DFT.
Circular (or Cyclic) Shift P: This permutation is defined by the relationship
\[
P(x[0], x[1], \dots, x[N-2], x[N-1]) = (x[N-1], x[0], \dots, x[N-3], x[N-2])
\]
and can be illustrated by placing the N entries of x on a “time wheel”, in the same way as the Fourier frequencies appear on the unit circle; namely counterclockwise, starting at angle zero. A circular shift amounts to rotating the wheel counterclockwise by one notch (i.e., by an angle of 2π/N), so that x[N−1] appears at angle zero. The rotation is illustrated in Figure 3.6 for N = 7.
If x is a column vector, a circular shift on x produces the vector Px, where P is a permutation matrix given by
\[
P = \begin{bmatrix}
0 & 0 & \dots & 0 & 1 \\
1 & 0 & \dots & 0 & 0 \\
0 & 1 & \dots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \dots & 1 & 0
\end{bmatrix}
\]
Figure 3.6: Vector x (left) and its circular shift Px (right).
Note that removal of the first row and last column of P yields an (N−1) × (N−1) identity matrix.
A circular shift is also known as a rotation. The term periodic is also used instead of circular, and delay is used instead of shift. Clearly, P^m represents a circular shift by m positions. For m = N, the time wheel undergoes one full rotation, i.e.,
\[
x = P^N x \quad \text{or} \quad P^N = I
\]
Index Reversal Q: Also known as linear index reversal to distinguish it from the circular reversal discussed below, it is defined by
\[
Q(x[0], x[1], \dots, x[N-2], x[N-1]) = (x[N-1], x[N-2], \dots, x[1], x[0])
\]
and described in terms of the permutation matrix
\[
Q = \begin{bmatrix}
0 & 0 & \dots & 0 & 1 \\
0 & 0 & \dots & 1 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 1 & \dots & 0 & 0 \\
1 & 0 & \dots & 0 & 0
\end{bmatrix}
\]
Note that Q is an identity matrix flipped left-to-right.
Circular Index Reversal R: Also known as periodic index reversal, this transformation differs from its linear counterpart in that reversal takes place among entries x[1], ..., x[N−1] only, with x[0] kept in the same position:
\[
R(x[0], x[1], \dots, x[N-2], x[N-1]) = (x[0], x[N-1], \dots, x[2], x[1])
\]
This can be illustrated using again a time wheel with the entries of x arranged in the usual fashion. A circular index reversal amounts to turning the wheel upside down. The 0th entry always remains in the same position, as does the (N/2)th entry when N is even (as depicted in Figure 3.7 for the case N = 8).
Figure 3.7: Vector x (left) and its circular time-reverse Rx (right).
The permutation matrix R for circular reversal is given by
\[
R = \begin{bmatrix}
1 & 0 & \dots & 0 & 0 \\
0 & 0 & \dots & 0 & 1 \\
0 & 0 & \dots & 1 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 1 & \dots & 0 & 0
\end{bmatrix}
\]
Note that
\[
R = PQ
\]
i.e., circular reversal of a column vector can be implemented by linear reversal followed by circular shift.
Recall from Section 2.3 that a permutation matrix Π satisfies
\[
\Pi^T \Pi = I \quad \Leftrightarrow \quad \Pi^{-1} = \Pi^T
\]
and is thus orthonormal. Since Q and R are symmetric, it follows that
\[
Q^{-1} = Q \quad \text{and} \quad R^{-1} = R
\]
(Note, on the other hand, that P is not symmetric.)
We note one final property of any permutation matrix Π:
Fact. Π acting on a column vector and Π^{−1} = Π^T acting on a row vector both produce the same permutation of indices.
This is established by noting that
\[
(\Pi x)^T = x^T \Pi^T = x^T \Pi^{-1}
\]
□
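The three permutations are easy to construct and test numerically. A NumPy sketch (illustrative, not part of the original text; 0-based indexing is assumed and the helper names are chosen here):

```python
import numpy as np

def P(N):
    """Circular shift: (Px)[n] = x[(n-1) mod N]."""
    return np.roll(np.eye(N), 1, axis=0)

def Q(N):
    """Linear index reversal: the identity matrix flipped left-to-right."""
    return np.fliplr(np.eye(N))

def R(N):
    """Circular index reversal: x[0] fixed, x[1..N-1] reversed. R = PQ."""
    return P(N) @ Q(N)

N = 6
x = np.array([1.0, -2.0, 5.0, 3.0, 4.0, -1.0])

assert np.allclose(P(N) @ x, [-1, 1, -2, 5, 3, 4])              # rotate by one notch
assert np.allclose(R(N) @ x, [1, -1, 4, 3, 5, -2])              # circular reversal
assert np.allclose(np.linalg.matrix_power(P(N), N), np.eye(N))  # P^N = I
assert np.allclose(R(N) @ R(N), np.eye(N))                      # R^{-1} = R
```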
The following example illustrates some of the signal transformations discussed so far.
Example 3.3.2. Let
\[
x = \begin{bmatrix} 1 & -2 & 5 & 3 & 4 & -1 \end{bmatrix}^T
\]
Also, let
\[
v^{(3)} = \begin{bmatrix} 1 & -1 & 1 & -1 & 1 & -1 \end{bmatrix}^T
\]
be the third Fourier sinusoid for N = 6, corresponding to ω = π. The following signals can be expressed compactly in terms of x, v^{(3)} and the permutation matrices P and R:
• x^{(1)} = [4 −1 1 −2 5 3]^T
• x^{(2)} = [1 −1 4 3 5 −2]^T
• x^{(3)} = [2 −3 9 6 9 −3]^T
• x^{(4)} = [0 −1 1 0 −1 1]^T
• x^{(5)} = [1 2 5 −3 4 1]^T
• x^{(6)} = [0 4 0 −6 0 2]^T
Indeed, we have:
• x^{(1)} = P²x
• x^{(2)} = Rx
• x^{(3)} = x + Rx
• x^{(4)} = x − Rx
• For every k, x^{(5)}[k] = x[k]v^{(3)}[k]
• For every k, x^{(6)}[k] = x[k](v^{(3)}[k] − 1)
□
3.4 Structural Properties of the DFT: Part II
3.4.1 Permutations of DFT and IDFT Matrices
In Subsection 3.2.2, we observed certain symmetries in the entries of V and W. The key symmetry properties are:
• Symmetry about the main diagonal, i.e., V = V^T and W = W^T;
• Conjugate symmetry with respect to row index N/2:
\[
V_{N-n,k} = V^*_{nk} \quad \text{and} \quad W_{N-k,n} = W^*_{kn}
\]
(and similarly with respect to column index N/2).
The conjugate symmetry property can be also expressed using the circular reversal matrix R. Applied to either V or W as a row permutation, R leaves the m = 0th row in the same position, while interchanging rows m and N − m for 0 < m < N/2. Since these two rows are complex conjugates of each other, and the zeroth row is real-valued, the resulting matrix is the complex conjugate of the original one. Of course, the same is true about column permutations using R, since both V and W are symmetric. We thus have
\[
RV = VR = V^* = W
\]
and
\[
RW = WR = W^* = V
\]
The effect of circular shift on the rows and columns of V and W can be explained better by introducing a diagonal matrix F whose diagonal elements are the entries of the k = 1st Fourier sinusoid:
\[
F \stackrel{\mathrm{def}}{=} \begin{bmatrix}
1 & 0 & 0 & \dots & 0 \\
0 & v & 0 & \dots & 0 \\
0 & 0 & v^2 & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & v^{N-1}
\end{bmatrix}
\]
Since the kth power of F is obtained by raising each diagonal element to that power, it follows that
\[
v^{(k)} = F^k \mathbf{1}
\]
where, as before, 1 is the all-ones column vector (same as v^{(0)}). Thus all N Fourier sinusoids can be expressed using powers of F:
\[
V = \begin{bmatrix} \mathbf{1} & F\mathbf{1} & F^2\mathbf{1} & \dots & F^{N-1}\mathbf{1} \end{bmatrix}
\]
A circular shift on the columns of V is obtained by right-multiplying it by P^T = P^{−1} (not by P; see the last fact in Subsection 3.3.2). The result is
\[
V P^{-1}
= \begin{bmatrix} F^{N-1}\mathbf{1} & \mathbf{1} & F\mathbf{1} & \dots & F^{N-2}\mathbf{1} \end{bmatrix}
= \begin{bmatrix} F^{-1}\mathbf{1} & \mathbf{1} & F\mathbf{1} & \dots & F^{N-2}\mathbf{1} \end{bmatrix}
\]
and thus
\[
V P^{-1} = F^{-1} V
\]
This can be easily generalized to any power m of P:
\[
V P^{m} = F^{m} V
\]
Taking transposes of both sides yields
\[
P^{-m} V = V F^{m}
\]
Taking complex conjugates across the last two equations, and noting that
\[
V^* = W \quad \text{and} \quad F^* = F^{-1}
\]
we obtain
\[
W P^{m} = F^{-m} W
\]
and
\[
P^{-m} W = W F^{-m}
\]
In conclusion, a circular shift on the rows or columns of V (or W) is equivalent to right- or left-multiplication by a diagonal matrix whose diagonal is given by a suitable Fourier sinusoid.
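The matrix identities above can be spot-checked numerically. An illustrative NumPy sketch (not part of the original text) for N = 7 and m = 3:

```python
import numpy as np

N, m = 7, 3
n = np.arange(N)
v = np.exp(2j * np.pi / N)

V = v ** np.outer(n, n)                  # IDFT matrix, V[n, k] = v^{kn}
W = V.conj()                             # DFT matrix
F = np.diag(v ** n)                      # diagonal matrix of the first sinusoid
P = np.roll(np.eye(N), 1, axis=0)        # circular shift
R = P @ np.fliplr(np.eye(N))             # circular reversal (R = PQ)

Pm = np.linalg.matrix_power(P, m)
Fm = np.linalg.matrix_power(F, m)

assert np.allclose(R @ V, W)                          # RV = W
assert np.allclose(V @ Pm, Fm @ V)                    # VP^m = F^m V
assert np.allclose(Pm @ V, V @ np.linalg.inv(Fm))     # P^m V = V F^{-m}
assert np.allclose(W @ Pm, np.linalg.inv(Fm) @ W)     # WP^m = F^{-m} W
assert np.allclose(Pm @ W, W @ Fm)                    # P^m W = W F^m
```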
3.4.2 Summary of Identities
We have defined the following N × N matrices with the aid of v = e^{j2π/N} and w = v^{−1} = v^*:
V : IDFT matrix, defined by V_{nk} = v^{kn}
W : DFT matrix, defined by W_{kn} = w^{kn}
F : diagonal matrix defined by F_{nn} = v^n
P : circular shift matrix
R : circular reversal matrix
We have shown that
\[
\begin{aligned}
P^{-1} &= P^T & (3.1) \\
R^{-1} &= R^T = R & (3.2) \\
RV &= VR = V^* = W & (3.3) \\
RW &= WR = W^* = V & (3.4) \\
VP^m &= F^m V & (3.5) \\
P^m V &= V F^{-m} & (3.6) \\
WP^m &= F^{-m} W & (3.7) \\
P^m W &= W F^m & (3.8)
\end{aligned}
\]
The identities listed above will be used to derive structural properties of the DFT.
3.4.3 Main Structural Properties of the DFT
In what follows, we will consider the DFT pair
\[
x \longleftrightarrow X
\]
where, again, the time-domain signal appears on the left, and the frequency-domain signal on the right, of the arrow. The two signals are related via the analysis and synthesis equations:
\[
X = Wx \tag{3.9}
\]
\[
x = \frac{1}{N} V X = \frac{1}{N} W^* X \tag{3.10}
\]
We will investigate how certain systematic operations on the time-domain signal x affect its spectrum X, and vice versa.
DFT 3. (Complex Conjugation) Complex conjugation in the time domain is equivalent to complex conjugation together with circular time reversal in the frequency domain:
\[
y = x^* \quad \longleftrightarrow \quad Y = R X^* \tag{3.11}
\]
Proof. We have
\[
Y = Wy = Wx^* = R W^* x^* = R X^*
\]
where the third equality follows from (3.4).
Remark. Using DFT 3, we can easily deduce DFT 2. Indeed, if x is real-valued, then
\[
x = x^*
\]
and taking DFT’s of both sides, we obtain
\[
X = R X^*
\]
which is precisely the statement of DFT 2. This relationship expresses a type of conjugate symmetry with respect to circular reversal, which will be henceforth referred to as circular conjugate symmetry.
DFT 4. (Circular Time Reversal) Circular reversal of the entries of x is equivalent to circular reversal of the entries of X:
\[
y = Rx \quad \longleftrightarrow \quad Y = RX \tag{3.12}
\]
Proof. We have
\[
Y = Wy = WRx = RWx = RX
\]
where the third equality follows from (3.4).
Remark. If we (circularly) time-reverse a real-valued signal x, then the resulting signal y = Rx has DFT
\[
Y = RX = X^*
\]
where the second equality is due to the circular conjugate symmetry of the DFT of a real-valued signal (i.e., DFT 2). In particular, the amplitude and phase spectra of the two signals x and y = Rx are related by
\[
|Y[k]| = |X[k]|, \qquad \angle Y[k] = -\angle X[k]
\]
for all values of k. The first equation tells us that the two signals contain exactly the same amounts (in terms of amplitudes) of each Fourier frequency. The second equation implies that the relative positions (in time) of these sinusoids will be very different in the two signals. This difference can have a drastic effect on the perceived signal; for example, music played backward bears little resemblance to the original sound. Thus in general, the phase spectrum can be as important as the amplitude spectrum and cannot be ignored in applications such as audio signal compression.
DFT 5. (Circular Time Delay) Circular time shift of the entries of x by m positions is equivalent to multiplication of the entries of X by the corresponding entries of the (N − m)th Fourier sinusoid:
\[
y = P^m x \quad \longleftrightarrow \quad Y = F^{-m} X \tag{3.13}
\]
i.e.,
\[
Y[k] = v^{k(N-m)} X[k] = v^{-km} X[k] = w^{km} X[k]
\]
for k = 0, ..., N − 1.
Proof. We have
\[
Y = W P^m x = F^{-m} W x = F^{-m} X
\]
where the second equality is due to (3.7).
DFT 6. (Multiplication by a Fourier Sinusoid) Entry-by-entry multiplication of x by the mth Fourier sinusoid is equivalent to a circular shift of the entries of X by m frequency indices:
\[
y = F^m x \quad \longleftrightarrow \quad Y = P^m X \tag{3.14}
\]
Proof. We have
\[
Y = W F^m x = P^m W x = P^m X
\]
where the second equality is due to (3.8).
Remark. Multiplication of a signal by a sinusoid is also known as amplitude modulation (abbreviated as AM). In real-world communication systems, amplitude modulation enables a baseband signal, i.e., one consisting of (relatively) low frequencies, to be transmitted over a frequency band centered at a much higher carrier frequency. A sinusoid (known as the carrier) of that frequency is multiplied by the baseband signal (e.g., an audio signal), resulting in the AM signal. The spectrum of the baseband signal and that of the AM signal are basically related by a shift (in frequency), similarly to what DFT 6 implies. For that reason, DFT 6 is also referred to as the modulation property.
The similarity between DFT 5 and DFT 6 cannot be overlooked: iden-
tical signal operations in two diﬀerent domains result in very similar op-
erations in the opposite domains (the only diﬀerence being a complex con-
jugate). This similarity is a recurrent theme in Fourier analysis, and is
particularly prominent in the case of the DFT: basically, the analysis and
synthesis equations are identical with the exception of a scaling factor and
a complex conjugate. As a result, computation of any DFT pair yields an-
other DFT pair as a by-product. This is known as the duality property of
the DFT, and is stated below.
DFT 7. (Duality) If x ←→ X, then
\[
y = X \quad \longleftrightarrow \quad Y = N R x \tag{3.15}
\]
Proof. We have
\[
Y = Wy = WX = RVX = NRx
\]
where the third and fourth equalities are due to (3.3) and (3.10) (the synthesis equation), respectively.
Remark. We saw an instance of the duality property in Examples 3.2.2 and 3.2.5, where we showed that
\[
e^{(0)} \longleftrightarrow \mathbf{1}
\quad \text{and} \quad
\mathbf{1} \longleftrightarrow N e^{(0)}
\]
The second DFT pair can be obtained from the first one, and vice versa, by application of DFT 7. In this particular case, both signals are circularly symmetric, and thus circular reversal has no effect:
\[
R e^{(0)} = e^{(0)} \quad \text{and} \quad R\mathbf{1} = \mathbf{1}
\]
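Properties DFT 3 through DFT 7 can all be verified against NumPy's FFT, which follows the same sign convention as the analysis equation. A sketch (not part of the original text; the helper `circ_reverse` implements R):

```python
import numpy as np

N, m = 8, 3
rng = np.random.default_rng(1)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
X = np.fft.fft(x)
k = np.arange(N)

def circ_reverse(a):
    """R: keep entry 0, reverse entries 1..N-1."""
    return np.concatenate(([a[0]], a[:0:-1]))

# DFT 3: conjugation <-> conjugation plus circular reversal
assert np.allclose(np.fft.fft(x.conj()), circ_reverse(X).conj())
# DFT 4: circular reversal in time <-> circular reversal in frequency
assert np.allclose(np.fft.fft(circ_reverse(x)), circ_reverse(X))
# DFT 5: circular delay by m <-> multiplication by w^{km}
assert np.allclose(np.fft.fft(np.roll(x, m)), np.exp(-2j * np.pi * k * m / N) * X)
# DFT 6: multiplication by the m-th Fourier sinusoid <-> circular shift in frequency
assert np.allclose(np.fft.fft(np.exp(2j * np.pi * k * m / N) * x), np.roll(X, m))
# DFT 7: duality
assert np.allclose(np.fft.fft(X), N * circ_reverse(x))
```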
3.5 Structural Properties of the DFT: Examples
3.5.1 Miscellaneous Signal Transformations
Example 3.5.1. Consider the 4-point real-valued signal
\[
x = \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}^T
\]
We will first compute the DFT X of x. Using X, we will then derive the DFT’s of the following signals:
• x^{(1)} = [1 4 3 2]^T
• x^{(2)} = [4 1 2 3]^T
• x^{(3)} = [3 4 1 2]^T
• x^{(4)} = [1 −2 3 −4]^T
• x^{(5)} = [5 −1+j −1 −1−j]^T
• x^{(6)} = [5 −1 −1 −1]^T
The Fourier frequencies in this case are 0, π/2, π and 3π/2. We have
\[
X = Wx = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & -j & -1 & j \\
1 & -1 & 1 & -1 \\
1 & j & -1 & -j
\end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}
= \begin{bmatrix} 10 \\ -2+2j \\ -2 \\ -2-2j \end{bmatrix}
\]
Note that X exhibits the conjugate symmetry common to the spectra of all real-valued signals (DFT 2).
Signals x
(1)
through x
(6)
are derived from either x or X. Their spectra
can be computed using known structural properties of the DFT.
• x
(1)
=
_
1 4 3 2
¸
T
is the circular time-reverse of x, i.e., x
(1)
=
Rx. By DFT 4,
X
(1)
= RX =
_
10 −2 −2j −2 −2 + 2j
¸
T
• x
(2)
=
_
4 1 2 3
¸
T
is the circular delay of x by one time unit, i.e.,
x
(2)
= Px. By DFT 5, the k
th
entry of X is multiplied by w
k
= (−j)
k
,
for each value of k:
X
(2)
= F
−1
X =
_
10 2 + 2j 2 2 −2j
¸
T
177
• x^(3) = [ 3  4  1  2 ]^T is the circular delay of x by two time units,
  i.e., x^(3) = P²x. Again by DFT 5, the k-th entry of X is multiplied by
  w^{2k} = (−1)^k, for each value of k:

      X^(3) = F^{−2}X = [ 10  2−2j  −2  2+2j ]^T

• x^(4) = [ 1  −2  3  −4 ]^T is obtained by multiplying the entries of
  x by those of the Fourier sinusoid (−1)^n = v^{2n}, i.e., x^(4) = F²x. By
  DFT 6, the spectrum is shifted by m = 2 frequency indices, or, in
  terms of actual frequencies, by π radians. The resulting spectrum is

      X^(4) = P²X = [ −2  −2−2j  10  −2+2j ]^T

• x^(5) = [ 5  −1+j  −1  −1−j ]^T equals X/2. By DFT 7 (duality),
  we have that

      X^(5) = 4R(x/2) = 2Rx = [ 2  8  6  4 ]^T

• x^(6) = [ 5  −1  −1  −1 ]^T equals the real part of x^(5), i.e.,

      x^(6) = ( x^(5) + (x^(5))^* ) / 2

  By DFT 3, complex conjugation in the time domain is equivalent to
  complex conjugation together with circular reversal in the frequency
  domain. Thus

      X^(6) = (1/2)[ 2  8  6  4 ]^T + (1/2)[ 2  4  6  8 ]^T = [ 2  6  6  6 ]^T    □
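The DFT pairs derived in Example 3.5.1 can be checked numerically. The text works in MATLAB; the sketch below is an equivalent NumPy translation (ours), relying on the fact that np.fft.fft implements the same analysis equation as the matrix W:

```python
import numpy as np

# Signals of Example 3.5.1; np.fft.fft uses the same DFT convention as W.
x = np.array([1.0, 2.0, 3.0, 4.0])
X = np.fft.fft(x)                              # [10, -2+2j, -2, -2-2j]

# DFT 4: the circular time-reverse Rx has spectrum RX
x1 = x[(-np.arange(4)) % 4]                    # x^(1) = [1, 4, 3, 2]
assert np.allclose(np.fft.fft(x1), X[(-np.arange(4)) % 4])

# DFT 5: the circular delay Px multiplies the k-th DFT entry by w^k = (-j)^k
x2 = np.roll(x, 1)                             # x^(2) = [4, 1, 2, 3]
assert np.allclose(np.fft.fft(x2), (-1j) ** np.arange(4) * X)

# DFT 7 (duality): the DFT of x^(5) = X/2 is N*R(x/2) = 2*Rx = [2, 8, 6, 4]
X5 = np.fft.fft(X / 2)
assert np.allclose(X5, 2 * x1)
```

Each assertion mirrors one of the bullet items above; the remaining pairs can be verified the same way.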
3.5.2 Exploring Symmetries
The spectrum of a real-valued signal is always circularly conjugate-symmetric;
this was established in DFT 2 and was illustrated in Examples 3.3.1 and
3.5.1. It turns out that this property also holds with the time and frequency
domains interchanged, i.e., a signal which exhibits circular conjugate sym-
metry in the time domain has a real-valued spectrum. This can be seen by
taking DFT's on both sides of the identity

    x = Rx^*

and then applying DFT 3 and DFT 4 to obtain

    X = RRX^* = X^*

This means that X is real-valued.
It follows that if a signal x happens to be both real-valued and circularly
symmetric, then its spectrum X will have the same properties. (Conjugate
symmetry and symmetry are equivalent notions for a real-valued signal,
which always equals its complex conjugate.)
We summarize the above observations as follows.
Fact. If a signal is real-valued in either domain (time or frequency), then it
exhibits circular conjugate symmetry in the other domain. A signal exhibit-
ing both properties in one domain also exhibits both properties in the other
domain. The terms symmetry and conjugate symmetry are equivalent
for real-valued signals.
The following example illustrates operations on circularly symmetric,
real-valued signals that preserve both properties.
Example 3.5.2. Consider the 16-point signal x whose spectrum X is given by

    X = [ 4  3  2  1  0  0  0  0  0  0  0  0  0  1  2  3 ]^T

X is shown in Figure E.3.5.2 (i). Clearly, X is both real-valued and circularly
symmetric. By the foregoing discussion, the time-domain signal x has the
same properties (and can be computed quite easily by issuing the command
x = ifft(X) in MATLAB).

[Figure E.3.5.2 (i): stem plot of the spectrum X.]
We are interested in determining whether the two fundamental time-
domain operations:

• circular delay by m time units
• multiplication by the m-th Fourier sinusoid

preserve either or both properties of x (i.e., real values and circular sym-
metry).

A circular time delay of m units in x clearly preserves the real values in
x, and thus also preserves the circular symmetry in X. To see how circular
symmetry of x is affected by this operation, we consider the spectrum X,
which undergoes multiplication by the m-th Fourier sinusoid: each X[k] is
multiplied by e^{−j(π/8)km}. Unless m = 0 (no delay) or m = 8, the result-
ing spectrum will contain complex values, which means that the (delayed)
time-domain signal will no longer be circularly symmetric. Thus the only
nontrivial value of m that preserves circular symmetry is m = 8. The de-
layed signal is given by

    x^(1) = P⁸x

and its spectrum equals

    X^(1)[k] = e^{−jπk} X[k] = (−1)^k X[k] ,    k = 0, . . . , 15

The spectrum X^(1) is plotted in Figure E.3.5.2 (ii).

[Figure E.3.5.2 (ii): stem plot of the spectrum X^(1).]
We note also that summing together two versions of x that have been
delayed by complementary amounts, i.e., m and −m (or 16−m) time units,
will also preserve the circular symmetry in x regardless of the value of m.
This is because

    e^{−j(π/8)km} + e^{j(π/8)km} = 2 cos(πkm/8)

and therefore

    x^(2) = P^m x + P^{−m} x  ←→  X^(2)[k] = 2 cos(πkm/8) X[k]

Since the spectrum X^(2) is real-valued, x^(2) is circularly symmetric. Figure
E.3.5.2 (iii) illustrates X^(2) for the case m = 4. The graph was generated
by multiplying each entry of X by the corresponding entry of

    [ 2  0  −2  0  2  0  −2  0  2  0  −2  0  2  0  −2  0 ]^T

[Figure E.3.5.2 (iii): stem plot of the spectrum X^(2) for m = 4.]
Multiplication of x by a complex Fourier sinusoid of frequency mπ/8 will
result in certain entries of x taking complex values, unless m = 0 or m = 8.
Correspondingly, the spectrum will undergo circular shift by m frequency
indices, and circular symmetry will be preserved only in the cases m = 0
and m = 8. Thus the only nontrivial value of m that preserves real values
in x is m = 8, for which

    x^(3)[n] = e^{jπn} x[n] = (−1)^n x[n] ,    n = 0, . . . , 15

and

    X^(3) = P⁸X

The spectrum X^(3) is plotted in Figure E.3.5.2 (iv).

[Figure E.3.5.2 (iv): stem plot of the spectrum X^(3).]

We also note that since the Fourier sinusoids are circularly conjugate-
symmetric, the element-wise product of x and v^(m) will also be circularly
conjugate-symmetric, whether it is real or complex.

Finally, we see that if we multiply x by the sum of two complex Fourier
sinusoids which are conjugates of each other, i.e., have frequency indices m
and 16−m (or simply −m), the resulting signal will be real-valued, since

    e^{−j(π/8)mn} + e^{j(π/8)mn} = 2 cos(πmn/8)

(same identity as encountered earlier). Also,

    x^(4)[n] = 2 cos(πmn/8) x[n]  ←→  X^(4) = P^m X + P^{−m} X
[Figure E.3.5.2 (v): stem plot of the spectrum X^(4) for m = 4.]

Figure E.3.5.2 (v) illustrates X^(4) in the case m = 4. Note the circular
symmetry in X^(4), as well as the relationship between X^(4) and the original
spectrum X. In this case, each entry of x (in the time domain) is multiplied
by the corresponding entry of

    [ 2  0  −2  0  2  0  −2  0  2  0  −2  0  2  0  −2  0 ]^T    □
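The symmetry-preserving operations of Example 3.5.2 can also be checked numerically. Again the text's command is MATLAB (x = ifft(X)); the NumPy translation below is ours:

```python
import numpy as np

# Spectrum from Example 3.5.2: real-valued and circularly symmetric
X = np.array([4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3], dtype=float)
x = np.fft.ifft(X)                                # time-domain signal
assert np.allclose(x.imag, 0)                     # x is real-valued
assert np.allclose(x, x[(-np.arange(16)) % 16])   # and circularly symmetric

# Delay by m = 8 preserves both properties: X^(1)[k] = (-1)^k X[k]
x1 = np.roll(x.real, 8)
assert np.allclose(np.fft.fft(x1).real, (-1.0) ** np.arange(16) * X)

# So does the complementary-delay sum x^(2) = P^m x + P^{-m} x (here m = 4)
x2 = np.roll(x.real, 4) + np.roll(x.real, -4)
X2 = np.fft.fft(x2)
assert np.allclose(X2, 2 * np.cos(np.pi * np.arange(16) * 4 / 8) * X)
```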
3.6 Multiplication and Circular Convolution

The structural properties of the discrete Fourier transform considered thus
far involved operations on a single signal vector of fixed length. In this
section, we examine two specific ways of combining two (or more) arbitrary
signals in the time domain, and study the spectra of the resulting signals.

3.6.1 Duality of Multiplication and Circular Convolution

We have seen (in DFT 1) that the DFT is a linear transformation, i.e., the
DFT of a linear combination of two signals is the linear combination of their
DFT's:

    αx + βy  ←→  αX + βY

There are other important ways in which two signal vectors x and y can be
combined in both time and frequency domains. We will examine two such
operations.

Definition 3.6.1. The (element-wise) product of N-point vectors x and y
is the vector s defined by

    s[n] = x[n]y[n] ,    n = 0, . . . , N−1

A special case of the product was encountered earlier, where y = v^(m)
(the m-th Fourier sinusoid).
Let us examine the DFT S of the product signal s, which is given by the
analysis equation:

    S[k] = Σ_{n=0}^{N−1} x[n]y[n]w^{kn} ,    k = 0, . . . , N−1

(We note that in this case, we do not have a compact expression for the
entire vector S as a product of the vectors x, y and the matrix W.) Noting
that w^{kn} is the (n, n)-th entry of the diagonal matrix F^{−k}, we can write

    S[k] = [ x[0]  x[1]  . . .  x[N−1] ] [ 1  0    . . .  0          ] [ y[0]   ]
                                         [ 0  w^k  . . .  0          ] [ y[1]   ]
                                         [ .  .     .     .          ] [   .    ]
                                         [ 0  0    . . .  w^{k(N−1)} ] [ y[N−1] ]

i.e.,

    S[k] = x^T F^{−k} y
We seek an expression for S[k] in terms of the spectra X and Y. To that
end, we use the matrix forms of the analysis and synthesis equations, as well
as the known identities

    V = V^T ,    VF^{−k} = P^k V    and    V = RW

We obtain

    S[k] = ( (1/N) VX )^T F^{−k} y
         = (1/N) X^T V F^{−k} y
         = (1/N) X^T P^k V y
         = (1/N) X^T P^k R W y
         = (1/N) X^T P^k R Y

The operation involving X, Y and the permutation matrices R (circular
reversal) and P (circular shift) in the last expression is known as the circular
convolution of the vectors X and Y. It is defined for any two vectors of the
same length.

Definition 3.6.2. The circular convolution of the N-point vectors a and b
is the N-point vector denoted by

    a ⊛ b

and given by

    (a ⊛ b)[n] = a^T P^n R b ,    n = 0, . . . , N−1

We can now write our result as follows.

DFT 8. (Multiplication of Two Signals) If x ←→ X and y ←→ Y, then

    x[n]y[n]  ←→  (1/N)(X ⊛ Y)

Fact. Since x[n]y[n] = y[n]x[n], we also have X ⊛ Y = Y ⊛ X, i.e., circular
convolution is symmetric in its two arguments.
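DFT 8 can be verified numerically straight from Definition 3.6.2. The helper below (the function name cconv is ours) builds (a ⊛ b)[n] = aᵀPⁿRb exactly as defined, and checks it against the DFT of the element-wise product:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
x = rng.standard_normal(N)
y = rng.standard_normal(N)

X, Y = np.fft.fft(x), np.fft.fft(y)

def cconv(a, b):
    """Circular convolution per Definition 3.6.2: (a ⊛ b)[n] = a^T P^n R b."""
    Rb = b[(-np.arange(len(b))) % len(b)]          # circular reversal R b
    return np.array([a @ np.roll(Rb, n) for n in range(len(a))])

# DFT 8: the DFT of the element-wise product equals (1/N) X ⊛ Y
assert np.allclose(np.fft.fft(x * y), cconv(X, Y) / N)
```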
Multiplication in the time domain has many applications in communi-
cations (e.g., amplitude modulation, signal spreading) and signal processing
(e.g., windowing in spectral analysis and filter design). By DFT 8, multipli-
cation in the time domain is equivalent to (scaled) circular convolution in
the frequency domain.

As it turns out, convolution is equally (if not more) important as a
time-domain operation, since it can be used to compute the response of
a linear system to an arbitrary input vector; this feature will be explored
further in the following chapter. In the meantime, we note that DFT 8 has
a dual property: circular convolution in the time domain corresponds to
multiplication in the frequency domain.

DFT 9. (Circular Convolution of Two Signals) If x ←→ X and y ←→ Y,
then

    x ⊛ y  ←→  X[k]Y[k]
Proof. We follow an argument parallel to the proof of DFT 8. We consider
the product signal S[k] = X[k]Y[k] (now in the frequency domain), and
use the synthesis equation (which differs from the analysis equation by a
complex conjugate and the factor N) to express the time-domain signal s as

    s[n] = (1/N) X^T F^n Y

We need to show that s[n] is the n-th entry of x ⊛ y. Indeed,

    s[n] = ( (1/N) Wx )^T F^n Y
         = (1/N) x^T W F^n Y
         = (1/N) x^T P^n W Y
         = (1/N) x^T P^n R (VY)
         = x^T P^n R y
         = (x ⊛ y)[n]

as needed. Alternatively, this result can be proved using DFT 7 (duality)
on the pairs x ←→ X, y ←→ Y and x[n]y[n] ←→ N^{−1}(X ⊛ Y).
3.6.2 Computation of Circular Convolution

The computation of the circular convolution x ⊛ y for two N-point vectors
x and y can be summarized as follows:

• Circularly reverse y to obtain Ry.
• For each n = 0, . . . , N−1, shift Ry circularly by n indices to obtain P^n Ry.
• Compute (x ⊛ y)[n] as x^T P^n Ry, i.e., as the (unconjugated) dot prod-
  uct of x and P^n Ry.

By symmetry of circular convolution, x and y can be interchanged in
the above computation.
Example 3.6.1. Consider the four-point vectors

    x = [ a  b  c  d ]^T    and    y = [ 0  −1  0  1 ]^T

and let

    s = x ⊛ y

The vectors x, y and P^n Ry for n = 0, 1, 2, 3 are depicted using four-point
time wheels.

[Figure, Example 3.6.1: four-point time wheels depicting x, y and P^n Ry
for n = 0, 1, 2, 3.]

We have

    P⁰Ry = [ 0  1  0  −1 ]^T
    P¹Ry = [ −1  0  1  0 ]^T
    P²Ry = [ 0  −1  0  1 ]^T
    P³Ry = [ 1  0  −1  0 ]^T

and thus

    s[0] = x^T P⁰Ry = b − d
    s[1] = x^T P¹Ry = c − a
    s[2] = x^T P²Ry = d − b
    s[3] = x^T P³Ry = a − c

Therefore

    s = [ b−d  c−a  d−b  a−c ]^T    □
Example 3.6.2. We verify DFT 9 for

    x = [ 2  −3  5  −3 ]^T    and    y = [ 0  −1  0  1 ]^T

The signal y is the same as in Example 3.6.1, thus

    s = x ⊛ y = [ 0  3  0  −3 ]^T

The DFT's are easily computed as:

    X = [ 1  −3  13  −3 ]^T
    Y = [ 0  2j  0  −2j ]^T
    S = [ 0  −6j  0  6j ]^T

Indeed,

    S[k] = X[k]Y[k]

for all values of k.    □
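A quick numerical confirmation of Example 3.6.2 (in NumPy rather than the text's MATLAB): by DFT 9, x ⊛ y is the inverse DFT of the element-wise product of the spectra.

```python
import numpy as np

# Vectors from Example 3.6.2
x = np.array([2.0, -3.0, 5.0, -3.0])
y = np.array([0.0, -1.0, 0.0, 1.0])

X, Y = np.fft.fft(x), np.fft.fft(y)

# DFT 9: x ⊛ y has spectrum X[k]Y[k], so it is the inverse DFT of X*Y
s = np.fft.ifft(X * Y).real
assert np.allclose(s, [0.0, 3.0, 0.0, -3.0])
assert np.allclose(X, [1, -3, 13, -3])
assert np.allclose(Y, [0, 2j, 0, -2j])
```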
3.7 Periodic and Zero-Padded Extensions of a Signal

3.7.1 Definitions

We now shift our focus to signals derived from a basic signal vector s by
extending its length. In what follows, we will assume that s has length L,
and will derive two distinct types of signals of variable length N ≥ L based
on s.

Definition 3.7.1. The N-point periodic extension of s is the signal x defined by

    x[n] = { s[n],      0 ≤ n ≤ L−1;
           { x[n−L],    L ≤ n ≤ N−1.

Definition 3.7.2. The N-point zero-padded extension of s is the signal y
defined by

    y[n] = { s[n],    0 ≤ n ≤ L−1;
           { 0,       L ≤ n ≤ N−1.

[Figure 3.8: The 10-point signal s, its 36-point periodic extension x and
its 36-point zero-padded extension y.]

The two types of extensions are illustrated in Figure 3.8, where L = 10
and N = 36. Note that the signals x and y are fundamentally different. In
the case of x, the same activity is observed every L time units; y, on the
other hand, exhibits no activity after time n = L−1.
Since the extensions x and y are formed by either replicating the basic
signal s or appending zeros to it, both DFT's X and Y are computed using
elements of the vector s. However, the relationships between X, Y and S
are not at all apparent. This is because the Fourier frequencies for X and Y
depend on N, and so do the associated Fourier sinusoids. Since the elements
of X and Y are inner products involving such Fourier sinusoids, there are
no identifiable correspondences or similarities between X, Y and S in the
general case (i.e., where N is arbitrary). One important exception is the
special case where N is an integer multiple of L.

Fact. If N = ML, where M is an integer, then the set of Fourier frequencies
for an L-point signal is formed by taking every M-th Fourier frequency for
an N-point signal, starting with the zeroth frequency (ω = 0).

This fact is easily established by noting that the k-th Fourier frequency
for an L-point signal is given by

    k(2π/L) = kM(2π/ML) = kM(2π/N)

and thus equals the (kM)-th frequency for an N-point signal.

Figure 3.9 illustrates this fact in the case where L = 4, M = 3 and
N = 12.
3.7.2 Periodic Extension to an Integer Number of Periods

Assuming that N = ML, we have the following relationship between X and S.

Fact. If N = ML, then the DFT of the N-point periodic extension x of an
L-point signal s is given by

    X[r] = { MS[r/M],    if r/M is an integer;
           { 0,          otherwise.

Proof. The assumption N = ML implies that x contains an exact number
of copies of s. We know that

    s = (1/L) V_L S
[Figure 3.9: Fourier frequencies for a 4-point signal (◦) and the
remaining Fourier frequencies for a 12-point signal (•).]
where V_L is the matrix of Fourier sinusoids for an L-point signal. By repli-
cating the above equation M times, we obtain

    [ s ]           [ V_L ]
    [ s ]  =  (1/L) [ V_L ]  S        (3.16)
    [ . ]           [  .  ]
    [ s ]           [ V_L ]

The N × 1 vector on the left-hand side of the equation is simply x. The k-th
column of the N × L matrix on the right-hand side is the N-point periodic
extension of

    e^{j(2π/L)kn} ,    n = 0, . . . , L−1

which is the k-th Fourier sinusoid for an L-point signal. Since for every
integer p,

    e^{j(2π/L)kn} = e^{j(2π/L)k(n+pL)}

it follows that the k-th column of the N × L matrix is simply given by

    e^{j(2π/L)kn} = e^{j(2π/N)kMn} ,    n = 0, . . . , N−1

But this is just the (kM)-th Fourier sinusoid for an N-point signal, i.e., the
(kM)-th column of the matrix V_N. Thus (3.16) implies that x is a linear
combination of no more than L Fourier sinusoids. Not surprisingly, these
sinusoids have the same frequencies as the Fourier components of s.
We have therefore established that X[r] = 0 if r is not a multiple of M.
To determine the value of X[kM] for k = 0, . . . , L−1, we compare (3.16)
with the synthesis equation

    x = (1/N) V_N X

and obtain the identity

          [ V_L ]
    (1/L) [ V_L ] S  =  (1/N) V_N X
          [  .  ]
          [ V_L ]

This can only be true if X[kM] = (N/L)S[k] = MS[k] for k = 0, . . . , L−1.

Example 3.7.1. If

    s = [ a  b  c  d ]^T

has DFT

    S = [ A  B  C  D ]^T ,

then

    x = [ a  b  c  d  a  b  c  d  a  b  c  d ]^T

has DFT

    X = [ 3A  0  0  3B  0  0  3C  0  0  3D  0  0 ]^T    □
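The Fact above is easy to confirm numerically for any L and M; the sketch below (our NumPy translation, with a random s) mirrors Example 3.7.1 with L = 4 and M = 3:

```python
import numpy as np

rng = np.random.default_rng(1)
L, M = 4, 3
N = M * L
s = rng.standard_normal(L)

# N-point periodic extension: M copies of s back to back
x = np.tile(s, M)

S = np.fft.fft(s)
X = np.fft.fft(x)

# X[r] = M*S[r/M] when M divides r, and X[r] = 0 otherwise
expected = np.zeros(N, dtype=complex)
expected[::M] = M * S
assert np.allclose(X, expected)
```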
In summary, the DFT allows us to express an L-point time-domain signal
s as a linear combination of L sinusoids at frequencies which are multiples
of 2π/L. The same L sinusoids, with the same coefficients, could be used to
represent the N-point periodic extension x of s. This representation would
not, in general, be consistent with the one provided by Fourier analysis (i.e.,
the DFT) of x; this is because the Fourier frequencies for s may not form a
subset of the Fourier frequencies for x. In the special case where x consists of
an integer number of copies of s, all the original frequencies for s will be
Fourier frequencies for x, and x can be represented as a linear combination
of L Fourier sinusoids only; none of the remaining N−L Fourier sinusoids
will appear in x.
3.7.3 Zero-Padding to a Multiple of the Signal Length

We now turn to the zero-padded extension y of s in the case where N = ML.
We have the following relationship between the spectra Y and S.

Fact. If N = ML, then the DFT of an L-point signal s can be obtained
from the DFT of its N-point zero-padded extension y by sampling:

    S[k] = Y[kM] ,    k = 0, . . . , L−1

Proof. As we noted earlier, the frequency ω = k(2π/L) is both the k-th
Fourier frequency for s and the (kM)-th Fourier frequency for y; the corre-
sponding DFT values are S[k] and Y[kM], respectively. We have

    Y[kM] = Σ_{n=0}^{N−1} y[n] e^{−j(2π/N)kMn} = Σ_{n=0}^{L−1} s[n] e^{−j(2π/L)kn} = S[k]

where the first equality follows from the definition of y (i.e., the fact that y[n]
coincides with s[n] for the first L time indices and equals zero thereafter).
Example 3.7.2. If

    y = [ a  b  c  d  0  0  0  0  0  0  0  0 ]^T

has DFT

    Y = [ Y_0  Y_1  Y_2  Y_3  Y_4  Y_5  Y_6  Y_7  Y_8  Y_9  Y_10  Y_11 ]^T

then

    s = [ a  b  c  d ]^T

has DFT

    S = [ Y_0  Y_3  Y_6  Y_9 ]^T    □

The DFT of a finite-length signal vector can thus be obtained by sam-
pling the DFT of its zero-padded extension, provided the number of ap-
pended zeros is an integer multiple of the (original) signal length.
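The sampling relationship S[k] = Y[kM] is equally easy to verify; the short NumPy check below (ours) uses the same L = 4, M = 3 setup as Example 3.7.2:

```python
import numpy as np

rng = np.random.default_rng(2)
L, M = 4, 3
N = M * L
s = rng.standard_normal(L)

# N-point zero-padded extension of s
y = np.concatenate([s, np.zeros(N - L)])

# S[k] = Y[kM]: the L-point DFT samples every M-th entry of the N-point DFT
S, Y = np.fft.fft(s), np.fft.fft(y)
assert np.allclose(S, Y[::M])
```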
3.8 Detection of Sinusoids Using the Discrete Fourier Transform

3.8.1 Introduction

An important application of signal analysis is the identification of different
components present in a signal. For example, given a musical recording, we
may be interested in identifying all instruments being played at a particular
time instant. If the recording contains vocals, we may also want to count
and, if possible, identify the different voices that are being heard. With
minimal training, the human ear can perform most of the above-mentioned
tasks; a trained ear can also extract more detailed information about the
content of the signal: it can identify specific notes (i.e., frequencies), and
can detect whether an instrument is out of tune.
Of course, the human auditory system cannot yield quantitative mea-
sures of signal parameters, nor can it process anything other than acoustic
signals in the audible frequency band (from approximately 20 Hz to 20
kHz). (Analogous statements can be made about the human eye, which is
particularly adept at identifying visual patterns.) Signal processing, which
is the quantitative analysis of signals, allows us to identify and separate es-
sential components of a signal based on its numerical samples (rather than
its perceptual attributes). In many cases, these tasks can be carried out
automatically, i.e., with little or no human intervention.
Sinusoids comprise a particularly important class of signals, since they
arise in a diverse class of physical systems which exhibit linear properties.
Such systems were introduced in Chapter 2 and will be the focus of our
discussion in Chapter 4. In many applications, it is desirable to identify
and isolate certain sinusoidal components present in a signal. For example,
in restoring a musical recording that has been corrupted by static noise or
deterioration of the recording medium, one would be interested in identifying
the note being played during a noisy segment of the piece. A note consists of
sinusoids at multiples of a certain fundamental frequency; identifying those
sinusoids would be an essential step towards removing the noise from the
signal. Sinusoids could also represent unwanted components of a signal, e.g.,
interference from a power supply at 50 Hz.
The discrete Fourier transform is well-suited for the detection of sinusoids
in a discrete-time signal, since it provides a decomposition of the signal into
sinusoidal components. For an N-point signal, these components are at the
so-called Fourier frequencies, which are the integer multiples of 2π/N.
First, let us consider a vector consisting of N uniform samples of a real-
valued sinusoid of frequency Ω₀ rad/sec, obtained at a rate of f_s samples
per second. If ω₀ = Ω₀/f_s happens to be a Fourier frequency for sample size
N, then the N-point DFT will contain exactly two nonzero entries, at
frequencies ω₀ and 2π−ω₀, the remaining entries being zero. The frequency
ω₀ (hence also Ω₀) can be determined by inspection of the DFT, provided
that no aliasing has occurred. Also, the amplitude and phase of the sinusoid
can be easily obtained from the corresponding entries of the DFT vector,
and the continuous-time signal can be reconstructed perfectly.
Example 3.8.1. The continuous-time sinusoid

    x(t) = 5 cos(600πt + 3π/4) ,    t ∈ R

is sampled at a rate of 1,000 samples/sec to yield the discrete-time sinusoid

    x[n] = 5 cos(0.6πn + 3π/4) ,    n ∈ N

If s = x[0 : N−1] (using MATLAB notation for indices), then ω₀ = 0.6π
(i.e., the frequency of x[·]) is a Fourier frequency for s provided

    k(2π/N) = 0.6π

or equivalently,

    N = 10k/3

for some integer k. For example, if N = 20, then ω₀ is the (k = 6)-th Fourier
frequency for s. The resulting amplitude and phase spectra are shown in
the figure.

[Figure, Example 3.8.1: amplitude spectrum |S[k]| with peaks of height 50
at k = 6 and k = 14; phase spectrum ∠S[k] equal to 3π/4 and −3π/4 at
those indices.]
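The spectra of Example 3.8.1 can be reproduced numerically; the NumPy sketch below (ours, in place of the text's MATLAB) confirms the two nonzero DFT entries and their amplitude and phase:

```python
import numpy as np

# Example 3.8.1 with N = 20: omega_0 = 0.6*pi is the k = 6 Fourier frequency
N = 20
n = np.arange(N)
s = 5 * np.cos(0.6 * np.pi * n + 3 * np.pi / 4)

S = np.fft.fft(s)
amp = np.abs(S)

# exactly two nonzero entries, of height A*N/2 = 50, at k = 6 and k = 14
assert np.allclose(amp[[6, 14]], 50)
mask = np.ones(N, dtype=bool)
mask[[6, 14]] = False
assert np.allclose(amp[mask], 0, atol=1e-9)
assert np.allclose(np.angle(S[6]), 3 * np.pi / 4)    # phase of the sinusoid
```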
In general, the frequency ω₀ will not be an exact Fourier frequency.
In that case, the DFT will analyze the sinusoidal signal into N complex
sinusoidal components at frequencies unrelated to ω₀, and as a result, it will
not exhibit the features seen in the graphs of Example 3.8.1. The question
is whether we can still use a DFT-based method to estimate the frequency
ω₀.
3.8.2 The DFT of a Complex Sinusoid

Let s be the L-point complex sinusoid given by

    s[n] = e^{jω₀n} ,    n = 0, . . . , L−1
where ω₀ is a fixed frequency. Let v represent a complex sinusoid of variable
frequency ω:

    v[n] = e^{jωn} ,    n = 0, . . . , L−1

The inner product ⟨v, s⟩ is a function of ω. Its value at ω = 2kπ/L (where
k is an integer) can be computed as the k-th entry in the DFT of s:

    ⟨v, s⟩ = Σ_{n=0}^{L−1} v^*[n]s[n] = Σ_{n=0}^{L−1} s[n]e^{−j(2π/L)kn} = S[k]

Similarly, the inner product ⟨v, s⟩ at ω = 2kπ/N, where N > L, can be
computed via the DFT of s zero-padded to total length N. As N increases,
the spacing 2π/N between consecutive Fourier frequencies decreases to zero,
and the resulting DFT provides a very dense sampling of ⟨v, s⟩ over the entire
frequency range [0, 2π).
Let us evaluate ⟨v, s⟩ at frequency ω. (This was done in Subsection 3.1.3
for the special case where both ω and ω₀ are Fourier frequencies, in order
to establish the orthogonality of the complex sinusoids.)

For ω = ω₀, we have v = s and thus

    ⟨v, s⟩ = ‖s‖² = L

For ω ≠ ω₀, we have

    ⟨v, s⟩ = Σ_{n=0}^{L−1} v^*[n]s[n]
           = Σ_{n=0}^{L−1} e^{−jωn} e^{jω₀n}
           = Σ_{n=0}^{L−1} e^{j(ω₀−ω)n}
           = (1 − e^{jL(ω₀−ω)}) / (1 − e^{j(ω₀−ω)})

where the last equality is the familiar expression for the geometric sum.
(Note that the denominator cannot be equal to zero, since ω ≠ ω₀ and both
frequencies are assumed to be in [0, 2π).) The inner product is therefore
complex. It can be written as the product of a real term and a complex
term of unit magnitude by first applying the identity 1 − z² = z(z⁻¹ − z) to
both numerator and denominator:
    (1 − e^{jL(ω₀−ω)}) / (1 − e^{j(ω₀−ω)})
        = ( e^{jL(ω₀−ω)/2} / e^{j(ω₀−ω)/2} ) ·
          ( e^{−jL(ω₀−ω)/2} − e^{jL(ω₀−ω)/2} ) / ( e^{−j(ω₀−ω)/2} − e^{j(ω₀−ω)/2} )

and then recalling that e^{jθ} − e^{−jθ} = 2j sin θ. The resulting expression is

    ⟨v, s⟩ = e^{−j(L−1)(ω−ω₀)/2} · sin(L(ω − ω₀)/2) / sin((ω − ω₀)/2)
           = e^{−j(L−1)(ω−ω₀)/2} · T_L(ω − ω₀)        (3.17)

where the function T_L(·) is defined by

    T_L(θ) = sin(Lθ/2) / sin(θ/2)
Since |e^{−j(L−1)(ω−ω₀)/2}| = 1, we have that

    |⟨v, s⟩| = |T_L(ω − ω₀)| = | sin(L(ω − ω₀)/2) / sin((ω − ω₀)/2) |        (3.18)

Evaluated at ω = 2kπ/L, the above expression gives the k-th entry in
the amplitude spectrum of s, i.e., |S[k]|. Evaluated at ω = 2kπ/N (where
N > L), it yields the k-th entry in the amplitude spectrum of the N-point
zero-padded extension of s.

The absolute value of the function T_L(θ) is shown in Figure 3.10 for
L = 6 and L = 9, in each case for θ varying over (−2π, 2π) (which includes
the range of values of ω − ω₀ in (3.17)).

We make the following key observations about the function T_L(θ):
[Figure 3.10: The function |T_L(θ)| for L = 6 (left) and L = 9 (right),
plotted against θ/2π.]
• At θ = 0, both the numerator and the denominator in the definition
  of T_L(θ) equal 0. T_L(0) is then defined as the limit of T_L(θ) as θ
  approaches zero, which equals L. This also gives the correct result for
  the inner product when ω = ω₀.

• T_L(θ) is periodic with period 2π.

• T_L(θ) = 0 for all values of θ which are integer multiples of 2π/L,
  except those which are also integer multiples of 2π (where T_L(2πk) = L).

• In each period, the graph of T_L(θ) contains one main lobe of width
  4π/L and height L, and L − 2 side lobes of width 2π/L and varying
  height. The first side lobe is the tallest one (in absolute value), its
  height being approximately 2/(3π) times that of the main one.
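These observations can be checked directly; the small NumPy sketch below (the function name T is ours) evaluates the kernel, taking the limit value where the denominator vanishes:

```python
import numpy as np

def T(theta, L):
    """T_L(θ) = sin(Lθ/2)/sin(θ/2), using the limit L·cos(Lθ/2)/cos(θ/2)
    where the denominator vanishes (θ a multiple of 2π)."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    out = np.empty_like(theta)
    small = np.abs(np.sin(theta / 2)) < 1e-12
    out[~small] = np.sin(L * theta[~small] / 2) / np.sin(theta[~small] / 2)
    out[small] = L * np.cos(L * theta[small] / 2) / np.cos(theta[small] / 2)
    return out

L = 6
# T_L vanishes at nonzero multiples of 2*pi/L and equals L at theta = 0
assert np.allclose(T(2 * np.pi * np.arange(1, L) / L, L), 0, atol=1e-9)
assert np.allclose(T(np.array([0.0]), L), L)

# the first side lobe (between 2*pi/L and 4*pi/L) is roughly 2/(3*pi)
# of the main-lobe height L
theta = np.linspace(2 * np.pi / L + 1e-6, 4 * np.pi / L - 1e-6, 2001)
ratio = np.max(np.abs(T(theta, L))) / L
assert 0.15 < ratio < 0.30
```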
We conclude this lengthy discussion with the following observation.

Fact. Let ω₀ be a fixed frequency in [0, 2π) and ω be a variable frequency in
the same interval. Then the magnitude of the inner product ⟨v, s⟩, where

    s[n] = e^{jω₀n}    and    v[n] = e^{jωn} ,    n = 0, . . . , L−1

achieves its unique maximum (as ω varies) when ω = ω₀. Thus if the
frequency ω₀ of a given complex sinusoid s is not known, it can be estimated
with arbitrary accuracy by locating the maximum value of |⟨v, s⟩|, computed
for a sufficiently dense set of frequencies ω in [0, 2π).

This fact follows from (3.18):

    |⟨v, s⟩| = |T_L(ω − ω₀)|

For fixed ω₀, the difference θ = ω − ω₀ lies in the interval [−ω₀, 2π − ω₀).
This is a subinterval of (−2π, 2π) which includes the origin. The maximum
value of |T_L(θ)| over that interval will be achieved at θ = 0, as illustrated
in Figure 3.10. Thus the maximum value of |⟨v, s⟩| = |T_L(ω − ω₀)| occurs
at ω = ω₀, and equals L.
3.8.3 Detection of a Real-Valued Sinusoid

As we showed in the previous subsection, the frequency of an L-point
complex-valued sinusoid

    s[n] = e^{jω₀n} ,    n = 0, . . . , L−1

can be determined by zero-padding the signal to obtain a reasonably dense
set of Fourier frequencies in [0, 2π), then locating the maximum of the am-
plitude spectrum.

A real-valued sinusoid is the sum of two complex-valued sinusoids at
conjugate frequencies:

    A cos(ω₀n + φ) = (A/2) e^{j(ω₀n+φ)} + (A/2) e^{−j(ω₀n+φ)}

We assume that ω₀ lies in [0, π], which is the effective frequency range for
real sinusoids. In most practical situations, the observed signal vector will
have other components as well. We thus model it as

    x[n] = A cos(ω₀n + φ) + r[n] ,    n = 0, . . . , L−1

where the remaining components of x are collectively represented by r. We
then have

    x = (A/2) e^{jφ} s + (A/2) e^{−jφ} s^* + r

We argue that a satisfactory estimate of the frequency ω₀ can be obtained
from x using the method developed earlier for the complex exponential s.
Again, let

    v[n] = e^{jωn} ,    n = 0, . . . , L−1

be the variable-frequency sinusoid used in the computation of the DFT, and
consider the inner product

    ⟨v, x⟩ = (A/2) e^{jφ} ⟨v, s⟩ + (A/2) e^{−jφ} ⟨v, s^*⟩ + ⟨v, r⟩
which yields the DFT of x and its zero-padded extensions by appropriate
choice of ω.

We have already studied the behavior of the first two terms on the right-
hand side. In particular, we know that

    |⟨v, s⟩| = |T_L(ω − ω₀)| = | sin(L(ω − ω₀)/2) / sin((ω − ω₀)/2) |

and

    |⟨v, s^*⟩| = |T_L(ω + ω₀)| = | sin(L(ω + ω₀)/2) / sin((ω + ω₀)/2) |

As long as ω₀ is not too close to 0 or π, the side lobes of |⟨v, s^*⟩| will not
interfere significantly with the main lobe of |⟨v, s⟩|, and the location of the
maximum of

    |⟨v, s⟩ + ⟨v, s^*⟩|

over the range [0, π] will be negligibly different from ω₀.

If, similarly, |⟨v, r⟩| does not vary significantly near ω₀, where "signifi-
cant" is understood relative to the height and curvature of the main lobe
of (A/2)|⟨v, s⟩|, then the maximum of |⟨v, x⟩| over the range [0, π] will occur
very close to the unknown frequency ω₀.
Example 3.8.2. Consider the 16-point signal

    x[n] = 5.12 cos(2π(0.3221)n + 1.39) + r[n]

Here r[n] is white Gaussian noise with mean zero and standard deviation 0.85
(roughly a sixth of the amplitude of the sinusoid). In the current notation,
we have A = 5.12, ω₀ = 2π(0.3221) and φ = 1.39.

The figure shows plots of (A/2)|⟨v, s⟩|, (A/2)|⟨v, s^*⟩|, |⟨v, r⟩| and |⟨v, x⟩|,
all computed using 256-point DFT's, against cyclic frequency f = ω/2π
(cycles/sample). Note that |⟨v, r⟩| and |⟨v, x⟩| are symmetric about ω = π
(or f = 1/2), since x and r are real-valued vectors. Both |⟨v, s⟩| and |⟨v, x⟩|
achieve their maximum over the range 0 ≤ ω ≤ π at ω = 2π(0.3202), which
corresponds to frequency index k = 82 in the 256-point DFT. Additional
zero-padding would give an estimate closer to the true value 2π(0.3221).

Conclusion. If a signal vector x has a single strong sinusoidal component,
then the frequency of that component can be estimated reasonably accu-
rately from the position of the maximum in the left half of the amplitude
spectrum, where the DFT is computed after padding the signal with suffi-
ciently many zeros.

[Figure, Example 3.8.2: the four magnitudes (A/2)|⟨v, s⟩|, (A/2)|⟨v, s^*⟩|,
|⟨v, r⟩| and |⟨v, x⟩|, plotted against f = ω/2π.]
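The estimate of Example 3.8.2 is easy to reproduce; the NumPy sketch below (ours) uses the same parameters but a noise realization of our own, so the peak index need not be exactly the k = 82 reported in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters of Example 3.8.2 (the noise realization is ours, not the book's)
L, A, f0, phi = 16, 5.12, 0.3221, 1.39
n = np.arange(L)
x = A * np.cos(2 * np.pi * f0 * n + phi) + 0.85 * rng.standard_normal(L)

# Zero-pad to 256 points and locate the peak over the left half of the spectrum
N = 256
amp = np.abs(np.fft.fft(x, n=N))
k_hat = np.argmax(amp[: N // 2 + 1])
f_hat = k_hat / N
assert abs(f_hat - f0) < 0.04      # estimate lands near the true frequency
```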
3.8.4 Final Remarks
The methodology discussed above can be extended to a signal containing
two or more strong sinusoidal components. Simply put, as long as these
components are well-separated on the frequency axis, their (unknown) fre-
quencies can be estimated well by locating maxima on the left half of the
amplitude spectrum. Good separation in frequency typically means that the
main lobes corresponding to diﬀerent components are suﬃciently far apart.
Since the main lobe width is 4π/L, the only way to obtain sharper peaks is
to increase L, i.e., take more samples of the signal. Increasing the number
of points N in the DFT does not solve the problem, since lobe width does
not depend on N.
Finally, we should note that it is possible to improve the detection and
frequency estimation technique outlined above (in the case where multiple
sinusoids may be present) by scaling the values of the data vector x using
certain standard time-domain functions (known as windows) before comput-
ing the DFT. As a result of this preprocessing, the height ratio between
the main lobe and the side lobes is increased by orders of magnitude, at the
expense of only a moderate increase in main lobe width. Such techniques
provide better resolution for sinusoids that are close in frequency.
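The effect of windowing can be illustrated with a short sketch (ours; the text does not name a specific window, so the Hann window below is one standard choice):

```python
import numpy as np

# Two real sinusoids, one strong and one weak, at nearby frequencies.
# A Hann window lowers the strong component's side lobes so the weak
# component's peak is no longer buried under them.
L, N = 64, 4096
n = np.arange(L)
x = np.cos(2 * np.pi * 0.20 * n) + 0.05 * np.cos(2 * np.pi * 0.30 * n)

plain = np.abs(np.fft.fft(x, n=N))
windowed = np.abs(np.fft.fft(np.hanning(L) * x, n=N))

# After windowing, the largest value in the band around f = 0.30 sits at
# the weak component's frequency rather than on a side lobe of the strong one.
band = slice(int(0.25 * N), int(0.35 * N))
k_best = band.start + int(np.argmax(windowed[band]))
assert abs(k_best / N - 0.30) < 0.01
```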
Problems
Section 3.1
P 3.1. Let

    α = 1/2    and    β = √3/2

(i) Determine a complex number z such that the vector

    v = [ 1  α+jβ  −α+jβ  −1  −α−jβ  α−jβ ]^T

equals

    [ 1  z  z²  z³  z⁴  z⁵ ]^T

(ii) If

    s = [ 3  2  −1  0  −1  2 ]^T

determine the least-squares approximation ˆs of s in the form of a linear
combination of 1 (i.e., the all-ones vector), v and v^*. Clearly show the
numerical values of the elements of ˆs.
P 3.2. Let v^(0), v^(1) and v^(7) denote the complex Fourier sinusoids of length
N = 8 at frequencies ω = 0, ω = π/4 and ω = 7π/4, respectively.
Determine the least-squares approximation ˆs of

    s = [ 4  3  2  1  0  1  2  3 ]^T

based on v^(0), v^(1) and v^(7). Compute the squared approximation error
‖ˆs − s‖².
P 3.3. Consider the eight-point vector

    x = [ 2√2−1   1   −2√2−1   5   −2√2−1   1   2√2−1   −3 ]^T

(i) Show that x contains only three Fourier frequencies, i.e., it is the linear
combination of exactly three Fourier sinusoids v^(k). Determine the coeffi-
cients of these sinusoids in the expression for x.

(ii) Find an equivalent expression for x in terms of real-valued sinusoids of
the form cos(ωn + φ), where 0 ≤ ω ≤ π.
P 3.4. The columns of the matrix

    V = [ 1   1   1   1 ]
        [ 1   j  −1  −j ]
        [ 1  −1   1  −1 ]
        [ 1  −j  −1   j ]

are the complex Fourier sinusoids of length N = 4.
Express the vector

    s = [ 1  4  −2  5 ]^T

as a linear combination of the above sinusoids. In other words, find a vector
c = [ c₀  c₁  c₂  c₃ ]^T such that s = Vc.
Section 3.2
P 3.5. Let u = e^{j(2π/9)} and z = e^{j(π/5)}.

(i) Write out the entries of the DFT matrix W₉ (corresponding to a nine-
point signal) using the real number 1 and complex numbers u^m, where m is
a nonzero integer between −4 and 4.

(ii) Write out the entries of the IDFT matrix V₁₀ (corresponding to a ten-
point signal) using the real numbers 1, −1 and complex numbers z^m, where
m is a nonzero integer between −4 and 4.
P 3.6. (i) Sketch the six-point vectors x and y defined by
x[n] = { 1, n = 2, 4;  0, otherwise };     y[n] = cos(2πn/3)
(ii) Prove that
y = ( v^(2) + v^(4) ) / 2
where v^(k) is the k-th Fourier sinusoid for N = 6.
(iii) Compute the DFT’s X and Y.
Observe the similarities between x and Y, as well as between y and X.
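The similarity noted in part (iii) can be confirmed numerically. The Python/NumPy sketch below (for illustration only) computes both DFT's and checks the resulting duality for these particular six-point signals:

```python
import numpy as np

# Sketch of the duality observed in P 3.6 (N = 6): the DFT of x
# (impulses at n = 2 and 4) is proportional to y, and the DFT of y
# is proportional to x.
N = 6
n = np.arange(N)
x = np.where((n == 2) | (n == 4), 1.0, 0.0)
y = np.cos(2 * np.pi * n / 3)

X = np.fft.fft(x)
Y = np.fft.fft(y)

assert np.allclose(X, 2 * y)   # X[k] = 2*cos(2*pi*k/3)
assert np.allclose(Y, 3 * x)   # Y has lines of height 3 at k = 2, 4
```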
Section 3.3
P 3.7. A five-point real-valued signal x has DFT given by
X = [ 4   1+j   3−j   z₁   z₂ ]^T
(i) Compute x[0] + x[1] + x[2] + x[3] + x[4] using one entry of X only.
(ii) Determine the values of z₁ and z₂.
(iii) Compute the amplitude and phase spectra of x, displaying each as a vector.
(iv) Express x[n] as a linear combination of three real-valued sinusoids.
P 3.8. Consider the real-valued signal x given by
x[n] = 3(−1)^n + 7 cos(πn/4 + 1.2) + 2 cos(πn/2 − 0.8) ,   n = 0, . . . , 7
(i) Which Fourier frequencies (for an eight-point sample) are present in the signal x?
(ii) Determine the amplitude spectrum of x, displaying your answer in the form
[ A₀  A₁  A₂  A₃  A₄  A₅  A₆  A₇ ]^T
(iii) Determine the phase spectrum of x, displaying your answer in the form
[ φ₀  φ₁  φ₂  φ₃  φ₄  φ₅  φ₆  φ₇ ]^T
P 3.9. Consider the N-point time-domain signal x given by
x[n] = r^n ,   n = 0, . . . , N−1
where r is a real number other than −1 or +1.
(i) Using the formula for the geometric sum, show that the DFT X of x is given by
X[k] = (1 − r^N) / (1 − r w^k) ,   k = 0, . . . , N−1
where, as usual, w = e^{−j(2π/N)}.
(ii) Using the formula |z|² = z z*, show that the squared amplitude spectrum is given by
|X[k]|² = (1 − r^N)² / (1 + r² − 2r cos(2kπ/N)) ,   k = 0, . . . , N−1
(iii) For the case r = 0.7 and N = 16, use MATLAB to produce a (discrete) plot of the squared amplitude spectrum in the equation above. Compare your plot with the output of the following script:
N=16; r = 0.7;
n=(0:N-1)’;
x = r.^n;
A = abs(fft(x));
bar(n,A.^2)
Section 3.4
P 3.10. Let
x = [ 2  1  −1  −2  −3  3 ]^T
Display (as vectors) and sketch the following signals:
• x^(1) = Px
• x^(2) = P⁵x
• x^(3) = Px + P⁵x
• x^(4) = x + Rx (Note the symmetry.)
• x^(5) = x − Rx (Note the symmetry.)
• x^(6) = F³x
• x^(7) = (I − F³)x
P 3.11. The time-domain signal
x = [ a  b  c  d  e  f ]^T
has DFT
X = [ A  B  C  D  E  F ]^T
Using the given parameters, and defining
λ = 1/2 and µ = √3/2
for convenience, write out the components of the DFT vector X^(r) for each of the following time-domain signals x^(r).
x^(1) = [ a  −b  c  −d  e  −f ]^T
x^(2) = [ a  0  c  0  e  0 ]^T
x^(3) = [ a  f  e  d  c  b ]^T
x^(4) = [ d  e  f  a  b  c ]^T
x^(5) = [ b  a  f  e  d  c ]^T
x^(6) = [ f+b  a+c  b+d  c+e  d+f  e+a ]^T
x^(7) = [ 0  µb  µc  0  −µe  −µf ]^T
x^(8) = [ A  B  C  D  E  F ]^T
P 3.12. A real-valued eight-point signal vector
x = [ x₀  x₁  x₂  x₃  x₄  x₅  x₆  x₇ ]^T
has DFT X given by
X = [ 4   5−j   −1+3j   −2   −7   S₅   S₆   S₇ ]^T
Without inverting X, compute the numerical values in the DFT Y of
y = [ 2x₀   x₁+x₇   x₂+x₆   x₃+x₅   2x₄   x₅+x₃   x₆+x₂   x₇+x₁ ]^T
P 3.13. An eight-point signal x has DFT
X = [ 0  0  0  −1  2  −1  0  0 ]^T
Without inverting X, compute the DFT Y of y, which is given by the equation
y[n] = x[n] cos(πn/4) ,   n = 0, . . . , 7
Section 3.5
P 3.14. Consider the signal x shown in the figure (on the left). Its spectrum is given by
X = [ A  B  C  D  E  F  G  H ]^T
(i) The DFT vector shown above contains duplicate values. What are those values?
(ii) Express the DFT Y of the signal y (shown on the right) in terms of the entries of X.
[Figure for Problem P 3.14: the eight-point signals x (left) and y (right).]
P 3.15. Consider the twelve-point vectors x, y and s shown in the figure. If the DFT X is given by
X = [ X₀  X₁  X₂  X₃  X₄  X₅  X₆  X₇  X₈  X₉  X₁₀  X₁₁ ]^T
express the DFT's Y and S in terms of the entries of X.
[Figure for Problem P 3.15: the twelve-point signals x, y and s.]
P 3.16. Run the MATLAB script
n = (0:63)’;
X =[ones(11,1); zeros(43,1); ones(10,1)];
bar(X), axis tight
max(imag(ifft(X))) % See (i) below
x = real(ifft(X));
bar(x)
cs = cos(3*pi*n/4); % See (ii) below
y = x.*cs;
bar(y);
max(imag(fft(y))) % See (iii) below
Y = real(fft(y));
bar(Y) % See (iv) below
(i) Why is this value so small?
(ii) Is this a Fourier sinusoid for this problem?
(iii) Why is this value so small?
(iv) Derive the answer for Y analytically, i.e., based on known properties of
the DFT.
P 3.17. The energy spectrum of the signal vector x is defined as the square of the amplitude spectrum, namely |X[k]|² (for k = 0, . . . , N−1).
Show that the total energy of the signal vector x equals the sum of the entries in the energy spectrum divided by N, i.e.,
Σ_{n=0}^{N−1} |x[n]|² = (1/N) Σ_{k=0}^{N−1} |X[k]|²
This relationship, known as Parseval's identity, can also be written as
‖x‖² = (1/N) ‖X‖²
(and is easier to prove in this form). In geometric terms, this result is consistent with the fact that the length of a vector can be computed (via the Pythagorean theorem) using the projections of that vector on any orthonormal set of reference vectors.
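Parseval's identity is easy to confirm numerically. A minimal Python/NumPy sketch (illustrative only; the signal is an arbitrary random vector):

```python
import numpy as np

# Numerical illustration of Parseval's identity for a random N-point signal:
# sum |x[n]|^2  ==  (1/N) * sum |X[k]|^2.
rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
X = np.fft.fft(x)

energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(X) ** 2) / N
assert np.isclose(energy_time, energy_freq)
```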
Section 3.6
P 3.18. Determine the circular convolution s = x ⊛ y, where
x = [ 1  2  3  4 ]^T and y = [ a  b  c  d ]^T
Also, express s as My, where M is a 4 × 4 matrix of numerical values.
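As a sketch of the two computations involved (the circulant matrix M, and the DFT route), the following Python/NumPy fragment uses numerical stand-in values for a, b, c, d; the symbolic answer is what the problem asks for:

```python
import numpy as np

# Sketch: circular convolution s = x (*) y computed two ways -- via the
# circulant matrix M built from x (so that s = M y), and via the DFT
# (element-wise product in the frequency domain).  The values of y are
# numerical stand-ins for the symbolic a, b, c, d.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])   # stand-ins for a, b, c, d
N = len(x)

# M[n, k] = x[(n - k) mod N]: column k of M is x circularly shifted by k.
M = np.array([[x[(n - k) % N] for k in range(N)] for n in range(N)])
s_matrix = M @ y

s_dft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))
assert np.allclose(s_matrix, s_dft)
```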
P 3.19. Consider two time-domain vectors x and y whose DFT's are given by
X = [ 1  −2  3  −4 ]^T and Y = [ A  B  C  D ]^T
Without explicitly computing x or y, determine the DFT of their element-wise product
s[n] = x[n] y[n] ,   n = 0, 1, 2, 3
P 3.20. The time-domain signals x and y have DFT's
X = [ 1  0  1  −1 ]^T and Y = [ 3  5  8  −4 ]^T
(i) Is either x or y real-valued?
(ii) Does either x or y have circular conjugate symmetry?
(iii) Without inverting X or Y, determine the DFT of the signal s^(1) defined by
s^(1)[n] = x[n] y[n] ,   n = 0, 1, 2, 3
(iv) Without inverting X or Y, determine the DFT of the signal s^(2) defined by
s^(2) = x ⊛ y
P 3.21. The time-domain signals
x = [ 2  0  1  3 ]^T and y = [ 1  −1  2  −4 ]^T
have DFT's X and Y given by
X = [ X₀  X₁  X₂  X₃ ]^T and Y = [ Y₀  Y₁  Y₂  Y₃ ]^T
Determine the time-domain signal s whose DFT is given by
S = [ X₀Y₂   X₁Y₃   X₂Y₀   X₃Y₁ ]^T
P 3.22. The circular cross-correlation of two N-point vectors x and y is the N-point vector s defined by
s[n] = ⟨Pⁿy, x⟩ = xᵀ(Pⁿy)* ,   n = 0, . . . , N−1
(i) Show that s also equals x ⊛ Ry*.
(ii) Using (i) above, show that the DFT S of s is given by
S[k] = X[k] Y*[k] ,   k = 0, . . . , N−1
(iii) Show that in the special case where x = y, S is just the energy spectrum of x, i.e.,
S[k] = |X[k]|² ,   k = 0, . . . , N−1
What symmetry properties does s have in this case?
(iv) Compute s and S for
x = y = [ 1+j  2  1−j  0 ]^T
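Property (ii) can be spot-checked numerically. In the Python/NumPy sketch below, the circular shift Pⁿy is realized with np.roll (an assumption consistent with P being the one-step circular delay):

```python
import numpy as np

# Sketch verifying part (ii): the DFT of the circular cross-correlation
# s[n] = <P^n y, x> equals X[k] * conj(Y[k]).  Here P^n y is taken to be
# y circularly delayed by n samples (np.roll).
rng = np.random.default_rng(1)
N = 8
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# s[n] = x^T (P^n y)^* , where (P^n y)[i] = y[(i - n) mod N].
s = np.array([np.sum(x * np.conj(np.roll(y, n))) for n in range(N)])

S = np.fft.fft(s)
assert np.allclose(S, np.fft.fft(x) * np.conj(np.fft.fft(y)))
```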
Section 3.7
P 3.23. The signal
x = [ a  b  c  d  0  0  0  0 ]^T
has DFT X given by
X = [ A  B  C  D  E  F  G  H ]^T
Express the following DFT's in terms of the entries of X:
(i) The DFT Y of
y = [ 0  0  0  0  a  b  c  d ]^T
(ii) The DFT S of
s = [ a  b  c  d  a  b  c  d ]^T
P 3.24. The DFT of the signal
x = [ 1  −1  1  −1  0  0  0  0 ]^T
is given by
X = [ X₀  X₁  X₂  X₃  X₄  X₅  X₆  X₇ ]^T
(i) What are the values of X₀, X₂, X₄ and X₆?
(ii) Display the time-domain signal y whose DFT is given by
Y = [ X₀  0  0  X₂  0  0  X₄  0  0  X₆  0  0 ]^T
P 3.25. (i) The signal x has spectrum (DFT)
X = [ A  0  0  B  0  0  C  0  0  D  0  0 ]^T
What special properties does x have?
(ii) The signal y has spectrum
Y = [ A  B  C  D  A  B  C  D  A  B  C  D ]^T
What special properties does y have?
P 3.26. The twelve-point signal
x = [ a  b  c  0  0  0  0  0  0  0  0  0 ]^T
has DFT X given by
X = [ X₀  X₁  X₂  X₃  X₄  X₅  X₆  X₇  X₈  X₉  X₁₀  X₁₁ ]^T
Express the following DFT's in terms of the entries of X:
(i) The DFT Y of
y = [ a  b  c  0  0  0 ]^T
(ii) The DFT S of
s = [ a  b  c  a  b  c  a  b  c  a  b  c ]^T
P 3.27. In MATLAB notation, consider the 4-point vector s and its 16-point zero-padded extension x:
s = [a b c d].'
x = [s ; zeros(12,1)]
Let X denote the DFT of x. Express the DFT's of the following vectors using the entries of X:
x1 = s
x2 = [ s ; s ]
x3 = [ s ; s ; s ; s ; s ] % length=20
x4 = [ s ; z4 ]
x5 = [ z4 ; s ]
x6 = [ s ; z4 ; s ; z4 ]
x7 = [ s ; s ; z4 ; z4 ]
where z4 = zeros(4,1).
P 3.28. Consider a real-valued vector s of length L, and define a vector x of length 2L + 1 by
x = [ a ; s ; Qs ]   (in MATLAB-style stacking)
where Q is the linear reversal matrix and a ∈ R.
Let y be the vector obtained by padding s with L+1 zeros. Show that the DFT X of x has the following expression in terms of the DFT Y of y and the complex number w = e^{−j2π/(2L+1)}:
X[k] = a + 2 Re{ w^k Y[k] } ,   k = 0, . . . , 2L
Section 3.8
P 3.29. A continuous-time signal consists of two sinusoids at frequencies f₁ and f₂ (Hz). The signal is sampled at a rate of 500 samples/sec (assumed to be greater than the Nyquist rate), and 32 consecutive samples are recorded. The figure shows the graph of the magnitude of the 32-point DFT (i.e., without zero-padding) as a function of the frequency index k.
[Figure for Problem P 3.29: DFT magnitude versus k = 0, . . . , 31, with peaks of heights 126 and 84.]
(i) What are the frequencies f₁ and f₂ (in Hz)?
(ii) Is it possible to write an equation for the continuous-time signal based on the information given? If so, write that equation. If not, explain why.
P 3.30. A continuous-time signal is given by
x(t) = A₁ cos(2πf₁t + φ₁) + A₂ cos(2πf₂t + φ₂) + z(t)
where z(t) is noise. The signal x(t) is sampled every 1.25 ms for a total of 80 samples. The figure shows the graph of the DFT of the 80-point signal as a function of the frequency index k = 0, . . . , 79.
Based on the given graph, what are your estimates of f₁ and f₂?
(You may assume that no aliasing has occurred, i.e., the sampling rate of 800 samples/sec is no less than twice each of f₁ and f₂.)
[Figure for Problem P 3.30: DFT magnitude versus k = 0, . . . , 79.]
P 3.31. A continuous-time signal is a sum of two sinusoids at frequencies 164 Hz and 182 Hz. 200 samples of the signal are obtained at the rate of 640 samples/sec, and the DFT of the samples is computed.
(i) What frequencies ω₁ = 2πf₁ and ω₂ = 2πf₂ are present in the discrete-time signal obtained by sampling at the above rate?
(ii) Of the 200 Fourier frequencies in the DFT, which two are closest to ω₁ and ω₂?
(iii) If the 200 samples are padded with M zeros and the (M + 200)-point DFT is computed, what would be the least value of M for which both ω₁ and ω₂ are Fourier frequencies?
P 3.32. A musical note is a pure sinusoid of a speciﬁc frequency known as
fundamental frequency. Played on an instrument, notes are “colored” by
the introduction of harmonics, namely sinusoids having frequencies which
are exact multiples of the fundamental.
An instrument playing a 330 Hz note is recorded digitally over a time interval of duration 50 ms at a sampling rate of 46,200 samples/sec, yielding 2,310 samples. It is assumed that the sampling rate exceeds the Nyquist rate; this means that there are no harmonics at frequencies k(330) Hz for k ≥ 70.
(i) Does the fundamental frequency of 330 Hz correspond to a Fourier frequency for the 2,310-point sample vector? (Note that if it is a Fourier frequency, its harmonics will be Fourier frequencies also.) If it is not a Fourier frequency, what is the largest value of N less than or equal to 2,310 that would make it a Fourier frequency?
(ii) (MATLAB) The vector s3 contains the 2,310 samples of the note, where each sample is distorted by a small amount of noise. Take the first N entries of that vector, where N was the answer to part (i), and compute their DFT. Determine the total number of harmonics (positive frequencies only, and excluding the fundamental) which are within 40 dB of the fundamental, i.e., the ratio of their amplitude to that of the fundamental is no less than 1%.
P 3.33. The 64-point vector s4 was generated in MATLAB using
n = (0:63).’;
s4 = A*cos(2*pi*f1*n+q) + z;
The parameters A, f1 (between 0 and 1/2) and q were previously speciﬁed.
The vector z, which represents noise, was also generated earlier.
(i) Compute the DFT of s4 extended by zero-padding to N = 1000 points:
X = fft(s4,1000)
Obtain an estimate of f1 by locating the maximum value of abs(X). Plot
abs(X) against cyclic frequency f. (Note: f is related to the Fourier fre-
quency index k by f = k/N.)
(ii) Obtain an estimate of the phase q using the estimate f1 and the follow-
ing script:
p = [];
for r = 2*pi*(0:0.002:1)
s = cos(2*pi*f1*n+r);
p = [p; s4’*s];
end
(The maximum of the inner product s4’*s occurs for a value of r close
to the true value of the phase shift q.) Finally, obtain an estimate of the
amplitude A.
P 3.34. The 128-point vector s5 consists of three sinusoids of cyclic frequencies f₁, f₂ and f₃ (where 0 ≤ f₁ < f₂ < f₃ ≤ 1/2) plus a small amount of noise. Plot s5 against time. Use zero-padding to N = 2048 points and follow the same technique as in part (i) of Problem 3.33 to obtain estimates of f₁, f₂ and f₃.
Chapter 4
Introduction to Linear Filtering
4.1 The Discrete-Time Fourier Transform
4.1.1 From Vectors to Sequences
We will now discuss how the DFT tools developed in the previous chapter
can be adapted for the analysis of discrete-time signals of inﬁnite length,
also known as sequences. The index set (i.e., time axis) for such a sequence
x will consist of all integers:
x = { x[n], n ∈ Z }
As in MATLAB, we will use the notation
x[n₁ : n₂] = [ x[n₁]  . . .  x[n₂] ]^T
to denote a segment of the sequence x corresponding to time indices n₁ through n₂. The length of the segment is finite provided both n₁ and n₂ are finite.
In order to express x as a linear combination of sinusoids of inﬁnite
length, it is important to understand how the DFT of a signal vector s
behaves as the length N of s approaches inﬁnity. To that end, we note the
following:
• The analysis equation
S[k] = Σ_{n=0}^{N−1} s[n] e^{−j(2π/N)kn}
involves a sum over time indices 0 through N−1. This clearly becomes an infinite sum in the limit as N → ∞. A similar expression for the infinite sequence x would have to include both positive and negative time indices, which would range from n = −∞ to n = +∞.
• The synthesis equation
s[n] = (1/N) Σ_{k=0}^{N−1} S[k] e^{j(2π/N)kn}
is a sum over frequency indices k = 0 through k = N − 1. The corresponding radian frequencies are ω = 0 through ω = 2π − (2π/N) in increments of 2π/N. As N → ∞, the spacing between consecutive frequencies shrinks to zero, and thus every frequency ω in [0, 2π) becomes a Fourier frequency. The continuous-index version of a sum is an integral; thus as N → ∞, one expects the synthesis equation to involve an integral over all frequencies in [0, 2π), i.e.,
Σ_{k=0}^{N−1} ( · )   −→   ∫_{ω=0}^{ω=2π} ( · ) dω
In brief, both the analysis and synthesis equations are expected to take on
diﬀerent forms as N →∞.
4.1.2 Development of the Discrete-Time Fourier Transform
To develop the exact form of the analysis and synthesis equations for the
sequence x, consider the segment
x^(L) = x[−L : L] = [ x[−L]  . . .  x[L] ]^T
which has length N = 2L + 1. The infinite sequence x is obtained by letting L → ∞, as shown in Figure 4.1.
Figure 4.1: The DTFT of a sequence x is obtained from the DFT of the vector x^(L) by letting L → ∞.
The standard DFT of x^(L) provides us with the coefficients needed for representing x^(L) as a linear combination of N sinusoidal vectors v^(k) of length N, each having a different Fourier frequency k(2π/N). The topmost element in each v^(k) (which would correspond to time n = −L in this case) equals e^{j0} = 1. For the task at hand, a more suitable choice for the k-th Fourier sinusoid is
ṽ^(k) = [ e^{j(2π/N)kn} ]_{n=−L}^{n=L}
whose middle entry (corresponding to time n = 0) equals 1. This circular shift has no effect on the orthogonality of the Fourier sinusoids, nor does it alter their norm. Thus, as in Subsection 3.1.3,
⟨ṽ^(k), ṽ^(ℓ)⟩ = { N, k = ℓ;  0, k ≠ ℓ }
The resulting modified analysis and synthesis equations are
X^(L)[k] = Σ_{n=−L}^{L} x[n] e^{−j(2π/N)kn} ,   k = 0, . . . , 2L     (4.1)
and
x[n] = (1/N) Σ_{k=0}^{N−1} X^(L)[k] e^{j(2π/N)kn} ,   n = −L, . . . , L     (4.2)
where, again, N = 2L + 1. (Incidentally, we note that the frequency index k can also range from −L to L, in which case the Fourier frequencies are taken in (−π, π).)
As we noted earlier, the set of Fourier frequencies k(2π/N) becomes the entire interval [0, 2π) as N → ∞. This prompts us to rewrite the sum in (4.1) as a function of ω instead of k, and to also take the limit as L → ∞ (i.e., N → ∞). The resulting expression
Σ_{n=−∞}^{∞} x[n] e^{−jωn}
is a power series in e^{−jω}. Since e^{−jω} is a periodic function of ω with period 2π, so is the resulting sum.
Definition 4.1.1. The discrete-time Fourier transform (DTFT) of the sequence x = { x[n], n ∈ Z } is defined by
X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}
provided the infinite sum converges.

This definition serves as the analysis equation in the case of an infinite-length sequence, and provides us with a continuous set of coefficients X(e^{jω}) for frequencies in [0, 2π). It remains to derive a synthesis equation, i.e., to show how the sequence x can be reconstructed by linearly combining sinusoids e^{jωn} with the above coefficients. As suggested earlier, this linear combination takes the form of an integral, which is constructed as follows.
For large values of L, we can write an approximation for the sum in (4.2) by replacing each coefficient X^(L)[k] by the value of X(e^{jω}) at the same frequency:
x[n] ≈ (1/N) Σ_{k=0}^{N−1} X(e^{j(2π/N)k}) e^{j(2π/N)kn}
Multiplying and dividing the right-hand side by 2π results in
x[n] ≈ (1/2π) Σ_{k=0}^{N−1} X(e^{j(2π/N)k}) e^{j(2π/N)kn} (2π/N)
The sum consists of N equally spaced samples of the function X(e^{jω}) e^{jωn} over the interval [0, 2π), each multiplied by the spacing 2π/N (in frequency). As N → ∞, the sum converges to the integral of the function over the same interval. The approximation (≈) also becomes exact in the limit, and therefore
x[n] = (1/2π) ∫₀^{2π} X(e^{jω}) e^{jωn} dω ,   n ∈ Z
In summary, we have obtained the DTFT pair
x[n]  ←DTFT→  X(e^{jω})
defined by the equivalent equations:
• Analysis:
X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}     (4.3)
• Synthesis:
x[n] = (1/2π) ∫₀^{2π} X(e^{jω}) e^{jωn} dω ,   n ∈ Z     (4.4)
The synthesis equation is an example of a linear combination of a continuously indexed set of signals (the continuous index being frequency ω in this case). Each signal has its own coefficient (as was the case with the DFT), and the signals are combined at each time instant n by integrating over the range of the continuous index.
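The analysis-synthesis pair can be illustrated numerically: for a finite-duration sequence, a Riemann sum over a dense frequency grid recovers x[n] from X(e^{jω}). A Python/NumPy sketch (illustrative only):

```python
import numpy as np

# Sketch: the synthesis integral (4.4) recovered as the limit of a
# Riemann sum.  For the finite-duration sequence x[n] = {1, 2, 3}
# (n = 0, 1, 2), evaluate X(e^{jw}) on a dense frequency grid and
# approximate (1/2pi) * integral of X(e^{jw}) e^{jwn} dw.
x = np.array([1.0, 2.0, 3.0])
M = 4096                                  # grid points on [0, 2pi)
w = 2 * np.pi * np.arange(M) / M

# Analysis equation (4.3): X(e^{jw}) = sum_n x[n] e^{-jwn}.
X = sum(x[n] * np.exp(-1j * w * n) for n in range(3))

# Synthesis equation (4.4), approximated by a Riemann sum with spacing 2pi/M.
for n in range(3):
    xn = np.sum(X * np.exp(1j * w * n)) / M
    assert np.isclose(xn.real, x[n]) and abs(xn.imag) < 1e-9
```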
4.2 Computation of the Discrete-Time Fourier Transform
4.2.1 Signals of Finite Duration
From a computational viewpoint, there are essential diﬀerences between the
DFT of a ﬁnite-length vector and the DTFT of a sequence of inﬁnite length.
The DFT is a ﬁnite vector of complex values obtained using a ﬁnite number
of ﬂoating-point operations. The DTFT, on the other hand, is an inﬁnite
sum computed over a continuum of frequencies, which
• is rarely expressible in a simple closed form; and
• involves, as a rule, an inﬁnite number of ﬂoating-point operations at
each frequency.
An obvious exception to the second statement is the class of signal se-
quences which contain only ﬁnitely many nonzero values.
Definition 4.2.1. The signal sequence x has finite duration if there exist finite time indices n₁ and n₂ such that
n < n₁ or n > n₂   ⇒   x[n] = 0
It has infinite duration otherwise.
The DTFT of a ﬁnite-duration sequence is computed by summing to-
gether ﬁnitely many terms. We begin by considering three simple examples
of such sequences and their DTFT’s.
Example 4.2.1. The unit impulse sequence (also known as the unit sample sequence) is defined by
x[n] = δ[n] = { 1, n = 0;  0, n ≠ 0 }
Its DTFT is given by
X(e^{jω}) = 1 · e^{−jω·0} = 1
Using the synthesis equation (4.4), we see that the unit impulse combines all sinusoids in the frequency range [0, 2π) with equal coefficients:
δ[n] = (1/2π) ∫₀^{2π} e^{jωn} dω
[Figure for Example 4.2.1: the unit impulse x[n] = δ[n] and its constant DTFT X(e^{jω}) = 1.]
Example 4.2.2. The delayed impulse
x[n] = δ[n − m] = { 1, n = m;  0, otherwise }
is shown in the figure.
[Figure for Example 4.2.2: the delayed impulse x[n] = δ[n−m].]
Its DTFT is given by
X(e^{jω}) = e^{−jωm} = cos ωm − j sin ωm
and is complex-valued except for m = 0.
Example 4.2.3. The symmetric impulse pair
x[n] = δ[n + m] + δ[n − m]
has DTFT given by
X(e^{jω}) = e^{−jω(−m)} + e^{−jωm} = 2 cos mω
This is a real sinusoid in ω having period 2π/m. It is plotted in the figure for the case m = 2.
[Figure for Example 4.2.3: the impulse pair x[n] = δ[n+m] + δ[n−m] and its DTFT X(e^{jω}) = 2 cos mω for m = 2.]
In Example 4.2.3, the time-domain signal was real-valued and symmetric:
x[n] = x[−n]
The DTFT X(e^{jω}) was also real-valued and symmetric about ω = π. A similar relationship between signals in the time and frequency domains was encountered earlier in our discussion of the DFT. There is, in fact, a direct correspondence between the structural properties of the two transforms, and one set of properties can be derived from the other using standard substitution rules. In the case of the DTFT, symmetry in the time domain is about n = 0 (which is the middle index of the modified DFT X^(L)[k] introduced in Subsection 4.1.2). Also, time-index reversal (involved in the definition of symmetry) is understood in linear terms, i.e., n → −n; circular time reversal is impossible on an infinite time axis. The same is true for time delays of sequences, i.e., they are linear instead of circular.
We know that delaying a vector in time results in multiplying its spec-
trum by a complex sinusoid. As it turns out, the same is true for sequences:
Fact. (Time Delay Property of the DTFT) If y[n] = x[n − m], then Y(e^{jω}) = e^{−jωm} X(e^{jω}).

Proof. We use the analysis equation (4.3) with a change of summation index (n′ = n − m):
Y(e^{jω}) = Σ_{n=−∞}^{∞} y[n] e^{−jωn} = Σ_{n=−∞}^{∞} x[n − m] e^{−jωn}
          = Σ_{n′=−∞}^{∞} x[n′] e^{−jω(n′+m)}
          = e^{−jωm} Σ_{n′=−∞}^{∞} x[n′] e^{−jωn′}
          = e^{−jωm} X(e^{jω})
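The time delay property is easy to verify on a frequency grid for a finite-duration sequence; a Python/NumPy sketch (illustrative only):

```python
import numpy as np

# Sketch verifying the time-delay property on a frequency grid: if
# y[n] = x[n - m], then Y(e^{jw}) = e^{-jwm} X(e^{jw}).  Here x is a
# finite-duration sequence supported on n = 0..3 and m = 2.
x = np.array([1.0, -2.0, 3.0, 0.5])
m = 2
w = 2 * np.pi * np.arange(256) / 256

X = sum(x[n] * np.exp(-1j * w * n) for n in range(4))
# y[n] = x[n - m] is supported on n = m..m+3.
Y = sum(x[n] * np.exp(-1j * w * (n + m)) for n in range(4))

assert np.allclose(Y, np.exp(-1j * w * m) * X)
```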
4.2.2 The DFT as a Sampled DTFT
We now turn to an important connection between the DTFT of a ﬁnite-
duration signal x and the DFT of a ﬁnite segment which contains all the
nonzero values in x. Let x be given by
x[n] = { s[n], 0 ≤ n ≤ L − 1;  0, otherwise }
Figure 4.2: Vector s and its two-sided infinite zero-padded extension x.
We see that x is obtained from the L-point vector s by padding s with infinitely many zeros on both sides (as shown in Figure 4.2). The DFT S of s is given by
S[k] = Σ_{n=0}^{L−1} s[n] e^{−j(2π/L)kn} = Σ_{n=0}^{L−1} x[n] e^{−j(2π/L)kn} = X(e^{j(2π/L)k})
i.e., it is obtained by sampling the DTFT X(e^{jω}) of x at the Fourier frequencies for an L-point sample. Similarly, the DFT of the N-point zero-padded extension of s is obtained by sampling X(e^{jω}) at the Fourier frequencies for an N-point sample. As N increases, the set of Fourier frequencies becomes denser (since the spacing equals 2π/N), and the DFT of the zero-padded vector provides uniformly spaced samples of X(e^{jω}) at a higher resolution.
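This relationship (the DFT of a zero-padded vector sampling the DTFT) can be checked directly. In the Python/NumPy sketch below, np.fft.fft(s, N) plays the role of MATLAB's fft(s,N):

```python
import numpy as np

# Sketch: the N-point DFT of the zero-padded extension of s samples the
# DTFT X(e^{jw}) of s at the N Fourier frequencies w = 2*pi*k/N.
s = np.array([1.0, 2.0, 0.0, -1.0, 3.0])     # L = 5
L = len(s)

def dtft(x, w):
    """Evaluate X(e^{jw}) = sum_n x[n] e^{-jwn} for a finite-duration x."""
    return sum(x[n] * np.exp(-1j * w * n) for n in range(len(x)))

for N in (L, 16, 100):
    w = 2 * np.pi * np.arange(N) / N
    assert np.allclose(np.fft.fft(s, N), dtft(s, w))
```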
If the vector s appears in the sequence x with a delay of m time units, i.e.,
x[n] = { s[n − m], m ≤ n ≤ m + L − 1;  0, otherwise }
then by the time delay property established earlier, the DFT S is obtained by sampling e^{jωm} X(e^{jω}) at frequencies ω = k(2π/L). Also, a dense plot of the DTFT X(e^{jω}) can be obtained from the DFT of a zero-padded extension of s by multiplying each entry (of the DFT vector) by the corresponding sample of e^{−jωm}.
Example 4.2.4. Each of the finite-duration sequences x, y and u is a two-sided zero-padded extension of an eight-point vector s delayed by a different amount. Specifically, let
x[0 : 7] = y[3 : 10] = u[−5 : 2] = s
as illustrated in the figure.
[Figure for Example 4.2.4: the vector s and the sequences x, y and u.]
Suppose we are interested in computing the DTFT's X(e^{jω}), Y(e^{jω}) and U(e^{jω}) with a frequency resolution of 0.01 cycle/sample, i.e., for ω equal to multiples of 2π/100. For that purpose, it suffices to compute 100-point DFT's. Using the standard MATLAB syntax fft(s,N) for the DFT of the N-point zero-padded extension of s, we have:
k = (0:99).' ;              % s also assumed column vector
v = exp(j*2*pi*k/100) ;
X = fft(s,100) ;
Y = X.*(v.^(-3)) ;
U = X.*(v.^5) ;
(Alternatively, each of Y and U is the DFT of a circular shift of the zero-padded extension of s.)
4.2.3 Signals of Inﬁnite Duration
An infinite-duration signal sequence x has the property that x[n] ≠ 0 for infinitely many values of n. As a result, the analysis equation
X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}
involves an infinite sum of complex-valued terms. The expression is meaningful only when the sum converges absolutely; in other words, when the magnitudes of the summands add up to a finite value:
Σ_{n=−∞}^{∞} |x[n]| |e^{−jωn}| = Σ_{n=−∞}^{∞} |x[n]| < ∞     (4.5)
Two important types of inﬁnite-duration signals which violate the above
condition are:
• sinusoids (real or complex) of any frequency; and
• periodic signals.
Fortunately, the analysis equation is not needed in order to derive the spec-
trum of either type of signal.
A real or complex sinusoid requires no further frequency analysis, since every ω in [0, 2π) is a valid Fourier frequency for the purpose of representing a sequence. Thus, for example,
x[n] = A cos(ω₀n + φ) = (A/2) e^{jφ} e^{jω₀n} + (A/2) e^{−jφ} e^{−jω₀n}
is a sum of two complex sinusoids at frequencies ω₀ and 2π − ω₀. We say that x[n] has a discrete spectrum with components, or lines, at these two frequencies. To plot a discrete spectrum against frequency, we use a vertical line for each sinusoidal component, e.g., as shown in Figure 4.3.
226
0 ω π 2π−ω 2π
X(e )

(A/2)e
-jφ jφ
(A/2)e
0 0
Figure 4.3: The spectrum of the real-valued sinusoid x[n] =
Acos(ω
0
n + φ) is shown for ω ∈ [0, 2π). Note that (A/2)e

and
(A/2)e
−jφ
are real-valued only when φ = 0 or φ = π.
Remark. Even though the sum in the analysis equation fails to converge to a valid function X(e^{jω}), it is still possible to express a discrete spectrum mathematically using a special type of function (of ω), so that the integral in the synthesis equation provides the correct time-domain signal. This representation is beyond the scope of the present discussion.

Periodic sequences can also be represented in terms of sinusoids without recourse to the analysis equation. This is because a sequence x of period L is the infinite two-sided periodic extension of the L-point vector
s = x[0 : L − 1] .
This is illustrated in Figure 4.4.
Figure 4.4: An infinite periodic sequence as the two-sided periodic extension of its first period.
From our discussion in Subsection 3.7.2, we know that if S is the (L-point) DFT of s, then the synthesis equation
s[n] = (1/L) Σ_{k=0}^{L−1} S[k] e^{j(2π/L)kn} ,   n = 0, . . . , L − 1
also returns x[n] for values of n outside the range 0 : L − 1. Thus
x[n] = (1/L) Σ_{k=0}^{L−1} S[k] e^{j(2π/L)kn} ,   n ∈ Z
and we have established the following.

Fact. A periodic sequence x of period L is the sum of L sinusoidal components at frequencies which are multiples of 2π/L. The coefficients of these sinusoids are given by the DFT of x[0 : L − 1] scaled by 1/L. The spectrum X(e^{jω}) also consists of L lines at the above-mentioned frequencies.
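The fact above can be checked numerically: the synthesis sum, evaluated at integers outside 0 : L−1, reproduces the periodic extension. A Python/NumPy sketch (illustrative only):

```python
import numpy as np

# Sketch: for a periodic sequence of period L, the DFT synthesis formula
# x[n] = (1/L) * sum_k S[k] e^{j(2pi/L)kn} returns the correct value for
# every integer n, not just n = 0..L-1.
s = np.array([2.0, -1.0, 0.0, 4.0])          # one period, L = 4
L = len(s)
S = np.fft.fft(s)

def synth(n):
    """Evaluate the synthesis sum at an arbitrary integer n."""
    k = np.arange(L)
    return np.sum(S * np.exp(2j * np.pi * k * n / L)) / L

for n in (-5, -1, 0, 3, 7, 10):
    # The periodic extension takes the value s[n mod L].
    assert np.isclose(synth(n).real, s[n % L]) and abs(synth(n).imag) < 1e-9
```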
The spectrum of a periodic sequence is illustrated in Figure 4.5.
Figure 4.5: The spectrum of a periodic signal of period L, shown for ω ∈ [0, 2π); the lines are spaced 2π/L apart.
We conclude our discussion with an example of an inﬁnite-duration signal
which satisﬁes the convergence condition (4.5).
Example 4.2.5. Let x be a decaying exponential in positive time:
x[n] = { aⁿ, n ≥ 0;  0, n < 0 }
where |a| < 1. This signal is shown (for 0 < a < 1) together with the cases a = 1 and a > 1, both of which violate the convergence condition (4.5).
For |a| < 1, we have
X(e^{jω}) = Σ_{n=0}^{∞} aⁿ e^{−jωn} = Σ_{n=0}^{∞} (a e^{−jω})ⁿ = 1 / (1 − a e^{−jω})
where the last expression is the formula for the infinite geometric sum from Subsection 3.1.3. The condition |a e^{−jω}| = |a| < 1 guarantees convergence of the infinite sum.
[Figure for Example 4.2.5: the sequence x[n] for 0 < a < 1, a = 1 and a > 1.]
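The closed form can be compared against a long partial sum of the series; a Python/NumPy sketch (the truncation point 200 is an arbitrary choice that makes the tail negligible for a = 0.8):

```python
import numpy as np

# Sketch for Example 4.2.5: for |a| < 1, a long partial sum of
# sum_n a^n e^{-jwn} approaches the closed form 1 / (1 - a e^{-jw}).
a = 0.8
w = 2 * np.pi * np.arange(64) / 64
n = np.arange(200)                            # 0.8^200 is negligibly small

partial = np.array([np.sum(a ** n * np.exp(-1j * wk * n)) for wk in w])
closed = 1.0 / (1.0 - a * np.exp(-1j * w))
assert np.allclose(partial, closed)
```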
4.3 Introduction to Linear Filters
4.3.1 Linearity, Time-Invariance and Causality
In our earlier discussion of the DFT and its applications, we saw how the
main sinusoidal components of a signal manifest themselves in the DFT of
a suitably long segment of that signal. Identifying these components in the
DFT allows us to reconstitute the signal (via an inverse DFT) in a more
selective manner, e.g., by retaining desirable frequencies while eliminating
undesirable ones. This type of signal transformation is known as frequency-
selective ﬁltering and can be applied to any segment of the signal.
Frequency-selective ﬁltering can be performed more directly and eﬃ-
ciently without use of DFT’s, once the frequencies of interest have been
determined. In many applications, these frequencies are known ahead of
time, e.g., radar return signals in a particular frequency band, or interfer-
ence signals from a known source (e.g., a power supply at 50 Hz). In such
cases, it is possible to use a so-called linear ﬁlter to generate the ﬁltered
signal in real time—i.e., at the rate at which the original signal is being
sampled or recorded, and with minimal delay. The remainder of this chap-
ter is devoted to basic concepts needed for the analysis of linear ﬁlters and
their operation.
A linear ﬁlter is a special type of a linear transformation of an input
signal sequence x to an output sequence y. We call y the response of the
ﬁlter to the input x. Since the time index for a signal sequence varies
from n = −∞ to n = +∞, we implicitly assume that the linear ﬁlter is
active at all times—this is despite the fact that all signals encountered in
practice have ﬁnite duration. The term linear system, used earlier for a
linear transformation, is also applicable to a linear ﬁlter. Thus a linear ﬁlter
H is a linear system whose input and output are related by
y = H(x)
Figure 4.6: A linear filter H with input x and output y.
Since the objective of ﬁltering is to retain certain sinusoidal components
while eliminating others, an essential property of the transformation H is
the relationship between the spectra (i.e., DTFT’s) of the sequences x and
y. We begin our discussion of ﬁlters by exploring that relationship in a very
simple example.
Consider the transformation H described by the input-output relation-
ship
y[n] = x[n] −x[n −1] +x[n −2] , n ∈ Z (4.6)
Recall our earlier definition of a linear transformation:
if y^(1) = H(x^(1)) and y^(2) = H(x^(2)), then
c₁ y^(1) + c₂ y^(2) = H( c₁ x^(1) + c₂ x^(2) )
It is straightforward to show that the transformation H deﬁned by (4.6)
satisﬁes the above deﬁnition of linearity. In addition, H has the following
properties:
• Time-Invariance. In the context of the above example, this means that the relationship between the output sample y[n] and the input samples x[n], x[n − 1] and x[n − 2] does not change with the time index n, i.e., the same coefficients (1, −1 and 1) are used to combine the three most recent input samples at any time n. Writing (4.6) in the form
y[ℓ] = x[ℓ] − x[ℓ − 1] + x[ℓ − 2]
better illustrates this feature.
• Causality. This means that the current output y[n] depends only on
present and past values of the input sequence—in this particular case,
x[n], x[n−1] and x[n−2]. This is an essential constraint on all systems
operating in real time, where future values of the input are unavailable.
If, on the other hand, n represents a parameter other than time (e.g.,
a spatial index in an image or a pointer in an array of recorded data),
causality is not a crucial constraint.
The deﬁning properties of a linear ﬁlter are linearity and time invariance.
Causality is an option which becomes necessary for ﬁlters operating in real
time.
Before proceeding with the analysis of the ﬁlter described by (4.6), we
illustrate the operation of the ﬁlter in Figure 4.7. The output y[n] is a
linear combination of x[n], x[n − 1] and x[n − 2] with coeﬃcients 1, −1
and 1, respectively. As n increases, the time window containing the three
most recent values of the input slides to the right, while the values of the
coeﬃcients remain the same.
Figure 4.7: Illustration of the operation of the filter y[n] = x[n] − x[n − 1] + x[n − 2].
4.3.2 Sinusoidal Inputs and Frequency Response
In order to understand how the time-domain equation (4.6) determines the
relationship between the spectra of the input and output sequences x and
y, consider a purely sinusoidal input at frequency ω:
x[n] = e
jωn
, n ∈ Z
The output sequence is then given by
y[n] = e
jωn
−e
jω(n−1)
+e
jω(n−2)
= (1 −e
−jω
+e
−j2ω
) e
jωn
= (1 −e
−jω
+e
−j2ω
) x[n]
for every value of n. Note that y is obtained by scaling each sample in x
by the same (i.e., independent of time) amount. Since a complex scaling
factor is equivalent to a change of amplitude and a shift in phase, we have
illustrated the following important property of linear ﬁlters.
Fact. When processed by a linear ﬁlter, pure sinusoids undergo no distortion
other than a change in amplitude and a shift in phase.
Remark. Nonlinear ﬁlters distort pure sinusoids by introducing components
at other frequencies, notably multiples of the input frequency; this eﬀect is
known as harmonic distortion.
The fact stated above can be proved for (almost) any linear filter using a
similar factorization of the output y[n] in terms of the input x[n] = e^{jωn}
and a frequency-dependent complex scaling factor H(e^{jω}). The function
H(e^{jω}) is known as the frequency response of the filter. In this case,

H(e^{jω}) = 1 − e^{−jω} + e^{−j2ω}

We can thus write

x[n] = e^{jωn} , n ∈ Z  ⇒  y[n] = H(e^{jω}) e^{jωn} , n ∈ Z        (4.7)
We are now in a position to explore the relationship between an arbi-
trary input sequence x and the corresponding output sequence y = H(x)
by shifting our focus to the frequency domain. We may assume that the
sequence x can be expressed as a sum of sinusoidal components, where the
summation is either (i) discrete or (ii) continuous.
In the ﬁrst case, which includes all periodic sequences, we have
x[n] = Σ_k X_k e^{jω_k n}

where the sum is over a discrete set of frequencies ω_k. The coefficients X_k
are, in general, complex-valued. Since the filter is linear, the response y will
be the sum of the responses to each of the sinusoids on the right-hand side.
From (4.7), we know that the response to e^{jω_k n} is given by H(e^{jω_k}) e^{jω_k n}.
We thus obtain

y[n] = Σ_k H(e^{jω_k}) X_k e^{jω_k n}

We conclude that the output sequence y also has a discrete spectrum consist-
ing of lines at the same frequencies (i.e., the ω_k's) as for the input sequence
x, but with coefficients scaled by the corresponding value of the frequency
response H(e^{jω_k}). In other words,

y[n] = Σ_k Y_k e^{jω_k n}

where

Y_k = H(e^{jω_k}) X_k
In the second case, we have (by the synthesis equation for the inverse
DTFT) a representation of the filter input as

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω

i.e., x is a linear combination of complex sinusoids whose frequencies range
over the continuous interval [0, 2π). Each of these sinusoids is scaled by the
corresponding value of the frequency response H(e^{jω}) when processed by
the filter, and linearity of the filter implies that

y[n] = (1/2π) ∫_0^{2π} H(e^{jω}) X(e^{jω}) e^{jωn} dω

Comparing this expression for y[n] with the one provided by the synthesis
equation

y[n] = (1/2π) ∫_0^{2π} Y(e^{jω}) e^{jωn} dω

we conclude that Y(e^{jω}) is, in fact, given by H(e^{jω})X(e^{jω}). This important
result is stated below.
Fact. For a linear filter, the complex spectrum of the output signal is given
by the product of the complex spectrum of the input signal and the filter
frequency response:

Y(e^{jω}) = H(e^{jω}) X(e^{jω})

If the input spectrum is discrete, then so is the output spectrum, and the
same relationship holds with X(e^{jω}) and Y(e^{jω}) replaced by the coefficients
in the two spectra.
Figure 4.8 illustrates this relationship between the input and output
spectra.
4.3.3 Amplitude and Phase Response

We return to the filter

y[n] = x[n] − x[n − 1] + x[n − 2] ,  n ∈ Z

and compute the magnitude and angle of the frequency response

H(e^{jω}) = 1 − e^{−jω} + e^{−j2ω}

Figure 4.8: Input-output relationship in the frequency domain.
For the magnitude |H(e^{jω})| (also known as the amplitude response of
the filter), we have

|H(e^{jω})|^2 = H*(e^{jω}) H(e^{jω})
             = (1 − e^{−jω} + e^{−j2ω})(1 − e^{jω} + e^{j2ω})
             = 3 − 2(e^{jω} + e^{−jω}) + (e^{j2ω} + e^{−j2ω})
             = 3 − 4 cos ω + 2 cos 2ω

and thus

|H(e^{jω})| = (3 − 4 cos ω + 2 cos 2ω)^{1/2}

Since cos(nπ + θ) = cos(nπ − θ), we see that |H(e^{jω})| is symmetric about
ω = π. This is due to the fact that H(e^{jω}) is the DTFT of a real-valued
sequence, namely

δ[n] − δ[n − 1] + δ[n − 2]

and as such, it exhibits the same kind of conjugate symmetry as does the
DFT of a real-valued vector.
Note that in this case, the amplitude response achieves its maximum at
ω = π, and has two zeros at ω = π/3 and ω = 5π/3. This implies that:

• among all possible complex sinusoidal inputs, (−1)^n will undergo the
most amplification (or least attenuation);

• sinusoidal inputs such as e^{j(π/3)n}, e^{j(5π/3)n} and cos(πn/3 + φ) will all
be nulled out, i.e., will result in y[n] = 0 for all n.
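These two claims are easy to check numerically (a NumPy sketch of ours, using linear convolution with the coefficient vector (1, −1, 1)):

```python
import numpy as np

n = np.arange(50)
h = np.array([1.0, -1.0, 1.0])   # impulse response of y[n] = x[n] - x[n-1] + x[n-2]

# Input at the null frequency omega = pi/3: should be zeroed out.
x_null = np.exp(1j * (np.pi / 3) * n)
y_null = np.convolve(x_null, h)[2:len(n)]   # keep steady-state samples only

# Input at omega = pi, i.e. (-1)^n: amplified by |H(e^{j pi})| = 3.
x_max = (-1.0) ** n
y_max = np.convolve(x_max, h)[2:len(n)]

print(np.max(np.abs(y_null)))   # ~0
print(y_max[:4])                # 3 * (-1)^n
```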
The symmetry of the filter coefficients also allows us to obtain |H(e^{jω})|
in a more direct way:

H(e^{jω}) = e^{−jω} (e^{jω} − 1 + e^{−jω})
          = e^{−jω} (2 cos ω − 1)

Since |e^{−jω}| = 1, we obtain

|H(e^{jω})| = |2 cos ω − 1|

which is equivalent to the earlier expression (as can be shown by using the
identity cos 2ω = 2 cos^2 ω − 1).
The angle ∠H(e^{jω}) is known as the phase response of the filter. In this
case, we have

∠H(e^{jω}) = −ω + ∠(2 cos ω − 1)

Since 2 cos ω − 1 is a real number, the second term equals 0 (when the number
is positive) or π (when the number is negative). We thus have

∠H(e^{jω}) = { −ω,       0 ≤ ω < π/3;
            { −ω + π,   π/3 ≤ ω ≤ 5π/3;
            { −ω,       5π/3 < ω < 2π.
The amplitude and phase responses are plotted in Figure 4.9, against
cyclic frequency ω/2π. Note that the phase response is antisymmetric about
ω = π, i.e., ∠H(e^{j(2π−ω)}) = −∠H(e^{jω}), which is also due to the fact that
the filter coefficients are real-valued.

Figure 4.9: Amplitude response (left) and phase response (right)
of the filter y[n] = x[n] − x[n−1] + x[n−2]. Frequencies and phases
are normalized by 2π and π, respectively.
4.4 Linear Filtering in the Frequency Domain
4.4.1 System Function and its Relationship to Frequency Re-
sponse
In the previous section, we saw that the response of the linear filter H to
the complex sinusoidal input

x[n] = e^{jωn} ,  n ∈ Z

equals

y[n] = H(e^{jω}) e^{jωn} ,  n ∈ Z

In other words, the input and output signal sequences differ by a complex
scaling factor which depends on the frequency ω:

y = H(e^{jω}) x

This factor was called the frequency response of the filter.
This scaling (or proportionality) relationship between the input and out-
put sequences can be generalized to a class of signals known as complex
exponentials:

x[n] = z^n ,  n ∈ Z

where z is any complex number other than z = 0 (which would give z^n = ∞
for n < 0). Writing z in the standard polar form
for n < 0). Writing z in the standard polar form
z = re

we have
x[n] = r
n
e
jωn
Thus a complex sinusoid is a special case of a complex exponential where
r = 1, i.e., z lies on the unit circle. Also,
[x[n][ = r
n

¸
¸
e
jωn
¸
¸
= r
n
and thus the magnitude of x[n]
• increases geometrically (in n) if [z[ > 1;
• decreases geometrically if [z[ < 1;
• is constant if [z[ = 1,
Figure 4.10: Three choices of z (left) and the corresponding com-
plex exponentials in magnitude (right).
as illustrated in Figure 4.10.
Consider the ﬁlter introduced in Section 4.3, namely
y[n] = x[n] −x[n −1] +x[n −2] , n ∈ Z
The response of the filter to x[n] = z^n equals

y[n] = z^n − z^{n−1} + z^{n−2}
     = (1 − z^{−1} + z^{−2}) z^n
     = (1 − z^{−1} + z^{−2}) x[n]

Letting

H(z) = 1 − z^{−1} + z^{−2}

we can restate the above result as follows:

x[n] = z^n , n ∈ Z  ⇒  y[n] = H(z) z^n , n ∈ Z        (4.8)
The fact established in Subsection 4.3.2, namely that

x[n] = e^{jωn} , n ∈ Z  ⇒  y[n] = H(e^{jω}) e^{jωn} , n ∈ Z

is a special case of (4.8) where z = e^{jω}.
The function H(z), where z is complex, is known as the system function,
or transfer function, of the filter H. The frequency response of the filter is
obtained from H(z) by taking z on the unit circle, i.e., z = e^{jω}.
4.4.2 Cascade Connection of Filters

The cascade (also referred to as series or tandem) connection of two filters
H_1 and H_2 is illustrated in Figure 4.11. At each time instant, the output of
H_1 serves as the input to H_2.

Figure 4.11: Cascade connection of two filters.
In Chapter 2, we showed that the cascade (A ∘ B)(·) of two linear trans-
formations A(·) and B(·) is represented by the matrix product AB (where
A and B are the matrices corresponding to A(·) and B(·), respectively). A
linear filter is a special type of linear transformation which is invariant in
time. The cascade of two such filters is also a linear filter H whose sys-
tem function H(z) bears a particularly simple relationship to the system
functions H_1(z) and H_2(z) of the constituent filters.
From (4.8), we know that H(z) is the scaling factor between the input
and output sequences x and y when the former (input) is given by x[n] = z^n.
In this case, x is also the input to the first filter H_1, which results in an
output

y^{(1)}[n] = H_1(z) z^n

This scaled complex exponential is the input to the second filter H_2, and
thus

y[n] = y^{(2)}[n] = H_1(z) H_2(z) z^n
We have established the following.
Fact. The cascade connection H of two filters H_1 and H_2 has system func-
tion given by

H(z) = H_1(z) H_2(z)

Similarly, the frequency response of H is given by

H(e^{jω}) = H_1(e^{jω}) H_2(e^{jω})

Note that the order in which the two filters are cascaded is immaterial;
the same system function is obtained if H_2 is followed by H_1.
Example 4.4.1. Consider a cascade of two identical filters, whose input-
output relationship is given by (4.6). Thus

y^{(1)}[n] = x[n] − x[n − 1] + x[n − 2]
y[n] = y^{(1)}[n] − y^{(1)}[n − 1] + y^{(1)}[n − 2]

In this case, H_1(z) = H_2(z) = 1 − z^{−1} + z^{−2}, and thus the system function
of the cascade is given by

H(z) = H_1(z) H_2(z) = 1 − 2z^{−1} + 3z^{−2} − 2z^{−3} + z^{−4}

Similarly,

H(e^{jω}) = H_1(e^{jω}) H_2(e^{jω})
          = 1 − 2e^{−jω} + 3e^{−j2ω} − 2e^{−j3ω} + e^{−j4ω}
          = e^{−j2ω} (3 − 4 cos ω + 2 cos 2ω)
          = e^{−j2ω} (1 − 2 cos ω)^2
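Since cascading FIR filters multiplies their system functions, the coefficient vector of the cascade is the convolution of the individual coefficient vectors; a quick NumPy check (ours, not part of the text):

```python
import numpy as np

# Coefficient vector of the filter y[n] = x[n] - x[n-1] + x[n-2].
b = np.array([1.0, -1.0, 1.0])

# Multiplying two polynomials in z^{-1} is the same as convolving
# their coefficient vectors.
b_cascade = np.convolve(b, b)

print(b_cascade)   # [ 1. -2.  3. -2.  1.]
```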
4.4.3 The General Finite Impulse Response Filter
We introduced linear ﬁlters by considering a particular example where the
ﬁlter output at any particular instant is formed by linearly combining the
three most recent values of the input, with coeﬃcients which are constant
in time. There is nothing special about choosing three input samples; we
can draw the same conclusions about the ﬁlter given by the equation
y[n] = b_0 x[n] + b_1 x[n − 1] + ··· + b_M x[n − M] ,  n ∈ Z        (4.9)

where the M + 1 most recent values of the input are combined to form the
output at any given time.
Definition 4.4.1. A filter defined by (4.9) is known as a finite impulse
response (FIR) filter of order M.
The qualifier finite in the above definition reflects the fact that a finite
number of input samples are combined to form a particular output sample.
The term impulse response will be defined shortly.
The system function of the FIR filter defined by (4.9) is given by

H(z) = b_0 + b_1 z^{−1} + ··· + b_M z^{−M}
and the frequency response is obtained by letting z = e^{jω}:

H(e^{jω}) = b_0 + b_1 e^{−jω} + ··· + b_M e^{−jωM}

Comparing the above equation for H(e^{jω}) with the definition of the
DTFT, we see that H(e^{jω}) is, in fact, the DTFT of the finite-duration
sequence h defined by

h[n] = { b_n, 0 ≤ n ≤ M;
       { 0,   otherwise.

The sequence h is known as the impulse response of the filter. As we shall
see soon, it is the response of the filter to a unit impulse, i.e., to the signal
x[n] = δ[n].
The coefficient vector b and the impulse response h for the example of
Section 4.3 are shown in Figure 4.12.

Figure 4.12: The coefficient vector (left) and the impulse re-
sponse sequence (right) of the FIR filter y[n] = x[n] − x[n − 1] +
x[n − 2].
The relationship between b and the frequency response H(e^{jω}) suggests
a convenient way of computing H(e^{jω}) based on the DFT of b. To evaluate
H(e^{jω}) at N ≥ M + 1 evenly spaced frequencies in [0, 2π), all we need to do
is compute the DFT of the (M + 1)-point vector b zero-padded to length
N. Thus for the filter considered in Section 4.3, the frequency response at
N = 256 evenly spaced frequencies can be computed in MATLAB using the
command

H = fft([1;-1;1],256)

The amplitude and phase responses in Figure 4.9 in the previous section are
the graphs of abs(H) and angle(H), respectively.
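An equivalent computation in NumPy (our sketch, cross-checked against the closed form H(e^{jω}) = e^{−jω}(2 cos ω − 1) derived earlier):

```python
import numpy as np

# Frequency response of y[n] = x[n] - x[n-1] + x[n-2] at N evenly spaced
# frequencies, obtained as the DFT of the zero-padded coefficient vector.
N = 256
b = np.array([1.0, -1.0, 1.0])
H = np.fft.fft(b, N)                      # zero-pads b to length N

# Cross-check against the closed form H(e^{jw}) = e^{-jw} (2 cos w - 1).
w = 2 * np.pi * np.arange(N) / N
H_closed = np.exp(-1j * w) * (2 * np.cos(w) - 1)

print(np.max(np.abs(H - H_closed)))       # ~0
```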
4.4.4 Frequency Domain Approach to Computing Filter Re-
sponse
The input-output relationship

y[n] = b_0 x[n] + b_1 x[n − 1] + ··· + b_M x[n − M]

provides us with a direct way of implementing an FIR filter. At every instant
n, a new value of the input x[n] is read into a buffer, an old value (namely
x[n − M − 1]) is shifted out of the same buffer, and the output y[n] is
computed using M + 1 multiplications and M additions.
In applications where a single processor is used to perform many signal
processing tasks in parallel, computational efficiency is a key consideration.
It therefore pays to examine how a simple component such as an FIR filter
(given by the above equation) can be implemented efficiently. In this sub-
section, we provide a partial answer to this question for two special types of
inputs, both of which are readily expressed in terms of complex exponentials
or complex sinusoids. A third class of such input signals will be considered
in the following subsection.
The first type of input we consider is a real-valued sinusoid, i.e.,

x[n] = A cos(ω_0 n + φ) ,  n ∈ Z
Using the identity e^{jθ} + e^{−jθ} = 2 cos θ, we can write the signal in terms of
two complex sinusoids as

x[n] = (A/2) e^{jφ} e^{jω_0 n} + (A/2) e^{−jφ} e^{−jω_0 n}

(In other words, x has a discrete spectrum with lines at frequencies ω_0 and
2π − ω_0.) By linearity of the filter, the output is given by

y[n] = (A/2) e^{jφ} H(e^{jω_0}) e^{jω_0 n} + (A/2) e^{−jφ} H(e^{−jω_0}) e^{−jω_0 n}
At this point, we note that

H(e^{−jω}) = b_0 + b_1 e^{jω} + ··· + b_M e^{jωM}
           = ( b_0 + b_1 e^{−jω} + ··· + b_M e^{−jωM} )*
           = H*(e^{jω})

where, for the last equality, we also used the fact that the filter coefficients
b_i are real. We thus have

y[n] = (A/2) e^{jφ} H(e^{jω_0}) e^{jω_0 n} + (A/2) e^{−jφ} H*(e^{jω_0}) e^{−jω_0 n}
The two terms on the right-hand side are complex conjugates of each other,
and thus the identity z + z* = 2 Re{z} gives

y[n] = Re{ A H(e^{jω_0}) e^{j(ω_0 n + φ)} }

An alternative expression can be written using the amplitude and phase
responses |H(e^{jω})| and ∠H(e^{jω}):

H(e^{jω}) = |H(e^{jω})| e^{j∠H(e^{jω})}
and thus

y[n] = Re{ A |H(e^{jω_0})| e^{j(ω_0 n + φ + ∠H(e^{jω_0}))} }
     = A |H(e^{jω_0})| cos( ω_0 n + φ + ∠H(e^{jω_0}) )
Comparing the resulting expression with x[n] = A cos(ω_0 n + φ), we con-
clude the following.
Fact. The response of a linear filter to a real-valued sinusoid of frequency
ω_0 is a sinusoid of the same frequency. The gain (ratio of output to input
amplitude) is given by the filter amplitude response at ω_0, while the phase
shift (of the output relative to the input) is given by the phase response at
the same frequency.
Example 4.4.2. Let

x[n] = cos( 2πn/3 − π/4 ) ,  n ∈ Z

be the input to the second-order FIR filter

y[n] = x[n] − x[n − 1] + x[n − 2] ,  n ∈ Z

introduced in Section 4.3. We have

H(e^{j(2π/3)}) = e^{−j(2π/3)} (2 cos(2π/3) − 1)
              = −2 e^{−j(2π/3)}
              = 2 e^{j(π/3)}

and thus |H(e^{j(2π/3)})| = 2 and ∠H(e^{j(2π/3)}) = π/3. Using the fact established
above, we obtain

y[n] = 2 cos( 2πn/3 − π/4 + π/3 ) = 2 cos( 2πn/3 + π/12 ) ,  n ∈ Z
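The result of Example 4.4.2 can be verified numerically (a NumPy sketch of ours):

```python
import numpy as np

# Filter x[n] = cos(2*pi*n/3 - pi/4) with y[n] = x[n] - x[n-1] + x[n-2]
# and compare against the predicted output 2*cos(2*pi*n/3 + pi/12).
n = np.arange(200)
x = np.cos(2 * np.pi * n / 3 - np.pi / 4)

y = np.convolve(x, [1.0, -1.0, 1.0])[2:len(n)]   # steady-state samples y[2], y[3], ...
y_pred = 2 * np.cos(2 * np.pi * n[2:] / 3 + np.pi / 12)

print(np.max(np.abs(y - y_pred)))   # ~0
```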
The foregoing analysis can also be extended to the real exponential

x[n] = A r^n ,  n ∈ Z

and the real oscillating exponential

x[n] = A r^n cos(ω_0 n + φ) ,  n ∈ Z

where A and r are real-valued. In each case, the output can be computed
in terms of the system function H(z) as follows:

x[n] = A r^n  ⇒  y[n] = A H(r) r^n

x[n] = A r^n cos(ω_0 n + φ)  ⇒  y[n] = A |H(re^{jω_0})| r^n cos( ω_0 n + φ + ∠H(re^{jω_0}) )
Example 4.4.3. Consider the same filter as in Section 4.3, with two differ-
ent input sequences:

x^{(1)}[n] = 3^n
x^{(2)}[n] = 3^n cos( 2πn/3 )

The filter system function is given by

H(z) = 1 − z^{−1} + z^{−2}

and thus

H(3) = 7/9
H(3e^{j(2π/3)}) = 1.111 + j0.385 = 1.176 e^{j0.333}

It follows that

y^{(1)}[n] = (7/9) 3^n
y^{(2)}[n] = 1.176 · 3^n cos( 2πn/3 + 0.333 )
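The two values of H can be checked numerically (a NumPy sketch of ours):

```python
import numpy as np

# Evaluate H(z) = 1 - z^{-1} + z^{-2} at z = 3 and z = 3 e^{j 2 pi / 3}.
def H(z):
    return 1 - z ** -1 + z ** -2

H3 = H(3.0)
Hz = H(3.0 * np.exp(2j * np.pi / 3))

print(H3)                           # 7/9 ~ 0.7778
print(Hz, abs(Hz), np.angle(Hz))    # ~ 1.111 + 0.385j, 1.176, 0.333
```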
4.4.5 Response to a Periodic Input
The third type of input x for which the filter response y can be readily
computed using a frequency domain-based approach (i.e., based on H(e^{jω}))
is a periodic sequence of period (say) L:

x[n + L] = x[n] ,  n ∈ Z

Before we explain how H(e^{jω}) can be used to compute y, we note that

y[n + L] = b_0 x[n + L] + b_1 x[n + L − 1] + ··· + b_M x[n + L − M]
         = b_0 x[n] + b_1 x[n − 1] + ··· + b_M x[n − M]
         = y[n]

and thus the response is also periodic with period L. It therefore suffices to
compute y[0 : L − 1], which will be referred to as the first period of y.
The frequency domain approach to computing y[0 : L − 1] exploits the
fact that the periodic input signal x is the sum of L sinusoids at frequencies
which are multiples of 2π/L. As was discussed in Subsection 4.2.3, if the
first period x[0 : L − 1] of x has DFT A[·], then the equation

x[n] = (1/L) Σ_{k=0}^{L−1} A[k] e^{j(2π/L)kn}

holds for every n, i.e., it describes the entire sequence x. By linearity of the
filter, the output sequence y is given by

y[n] = (1/L) Σ_{k=0}^{L−1} H( e^{j(2π/L)k} ) A[k] e^{j(2π/L)kn}

Since y is also periodic with period L, we have (similarly to x)

y[n] = (1/L) Σ_{k=0}^{L−1} Y[k] e^{j(2π/L)kn}

where Y[·] is the DFT of y[0 : L − 1]. Comparing the two expressions for
y[n], we see that

Y[k] = H( e^{j(2π/L)k} ) A[k]

In other words, the DFT of y[0 : L − 1] is obtained by multiplying each
entry of the DFT of x[0 : L − 1] by the frequency response of the filter at the
corresponding Fourier frequency.
This observation leads to the following algorithm for computing y[0 :
L − 1] given x[0 : L − 1] and the coefficient vector b:

• Compute the DFT of x[0 : L − 1].

• Compute H(e^{jω}) at the same Fourier frequencies, e.g., using the DFT
of b after zero-padding to a multiple of L.

• Multiply the two vectors element-by-element.

• Invert the DFT of the resulting vector to obtain y[0 : L − 1].
Example 4.4.4. Consider the second-order filter introduced in Section 4.3.
Suppose that the input sequence x is periodic with period L = 7, and such
that

x[0 : 6] = [ 2  5  3  4  −7  −4  −1 ]^T

The first period of the output sequence is computed in MATLAB as follows:

x = [2 5 3 4 -7 -4 -1].' ;
b = [1 -1 1].';  % filter coefficient vector
X = fft(x);
H = fft(b,7);
Y = H.*X;
y = ifft(Y)

resulting in

y[0 : 6] = [ −1  2  0  6  −8  7  −4 ]^T
In the example above, M + 1 < L, i.e., the coefficient vector was shorter
than the first input period. By zero-padding the coefficient vector b to length
L and taking its DFT, we obtained precisely those values of the frequency
response H(e^{jω}) which were needed in order to compute the product
Y[k] = H(e^{j(2π/L)k}) A[k]. Recalling that element-wise multiplication of
two DFT's of the same length is equivalent to circular convolution in the
time domain, we see that the first period of the output signal is, in effect,
computed by circularly convolving the first period of the input signal with
the zero-padded (to length L) coefficient vector. This fact can be established
independently by examining the time-domain expression

y[n] = Σ_{k=0}^{M} b_k x[n − k]

in the special case where x is periodic with period L. For this reason, circular
convolution is also known as periodic convolution.
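The same DFT-based computation in NumPy (our sketch, reproducing Example 4.4.4):

```python
import numpy as np

# DFT-based computation of one period of the response of the FIR filter
# y[n] = x[n] - x[n-1] + x[n-2] to a periodic input (Example 4.4.4).
x = np.array([2.0, 5.0, 3.0, 4.0, -7.0, -4.0, -1.0])   # first period, L = 7
b = np.array([1.0, -1.0, 1.0])                         # filter coefficients

X = np.fft.fft(x)
H = np.fft.fft(b, len(x))            # zero-pads b to length L
y = np.fft.ifft(H * X).real          # imaginary parts are round-off only

print(np.round(y))   # [-1.  2.  0.  6. -8.  7. -4.]
```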
If the length of the coefficient vector is greater than the input period
L, then the algorithm outlined in Example 4.4.4 needs to be modified—
otherwise the command H = fft(b,L) would lead to truncation of b, re-
sulting in an error. Two possible modifications, assuming that M + 1
satisfies

(J − 1)L < M + 1 ≤ JL

for some integer J > 1, are:
• Zero-pad b to length JL, then extract every Jth entry of the resulting
DFT. Or,

• Use the first J periods of x (in conjunction with zero-padding b to
length JL), in which case the answer will consist of the first J periods
of y. This approach is also equivalent to a circular convolution in the
time domain (with vectors of length JL).
4.5 Frequency-Selective Filters
4.5.1 Main Filter Types
We motivated our discussion of linear ﬁlters using the concept of frequency
selection. A frequency-selective ﬁlter reconstructs at its output certain desir-
able sinusoidal components present in the input signal, while attenuating or
eliminating others. The range of frequencies which are preserved is known
as the ﬁlter passband, while those that are rejected are referred to as the
stopband.
For filters with real-valued coefficients operating in discrete time, it is
customary to specify frequency bands as subintervals of [0, π]. This is be-
cause the filter frequency response H(e^{jω}) then has conjugate symmetry
about ω = π:

H(e^{j(2π−ω)}) = H(e^{−jω}) = H*(e^{jω})        (4.10)

(This is equivalent to symmetry in the amplitude response and antisymmetry
in the phase response.) Thus it suffices to specify H(e^{jω}) over [0, π], which
is also the effective range of frequencies for real-valued sinusoids.
Remark. Any frequency band (i.e., interval) in [0, π] has a symmetric band in
[π, 2π]. Equation (4.10) also implies that H(e^{jω}) has conjugate symmetry
about ω = 0; thus any frequency band in [0, π] has a symmetric band in
[−π, 0], as well. This is illustrated in Figure 4.13.

Figure 4.13: A frequency band in [0, π] (dark) and its symmetric
bands (light), shown on the unit circle (left) and the frequency axis
(right).
The four main types of frequency selective ﬁlters are: lowpass, highpass,
bandpass and bandstop. The frequency characteristics (i.e., passband and
stopband) of each type are illustrated in Figure 4.14.
Figure 4.14: Passbands and stopbands of frequency-selective ﬁl-
ters.
4.5.2 Ideal Filters
A (zero-delay) ideal filter has the following properties:
• the passband and stopband are complements of each other (with re-
spect to [0, π]);
• the frequency response is constant over the passband; and
• the frequency response is zero over the stopband.
We illustrate some general properties of ideal ﬁlters by considering the
ideal lowpass ﬁlter. The ﬁlter frequency response is shown in Figure 4.15,
plotted over a full cycle (−π, π]. The edge of the passband is known as the
cutoff frequency ω_c.
Figure 4.15: Frequency response of an ideal lowpass ﬁlter.
The impulse response h of the ideal lowpass filter is given by the inverse
DTFT of its frequency response H(e^{jω}). We thus have

h[n] = (1/2π) ∫_0^{2π} H(e^{jω}) e^{jωn} dω
     = (1/2π) ∫_{−π}^{π} H(e^{jω}) e^{jωn} dω
     = (A/2π) ∫_{−ω_c}^{ω_c} e^{jωn} dω
     = (A/2π) [ e^{jωn} / (jn) ]_{−ω_c}^{ω_c}
     = (A/2π) ( e^{jω_c n} − e^{−jω_c n} ) / (jn)
     = A sin(ω_c n) / (πn)
The impulse response h is plotted in Figure 4.16 for the case ω_c = π/4
(with A = 1). For any cutoff frequency ω_c < π, h is a sequence of infinite
duration, extending from n = −∞ to n = +∞. By examining the input-
output relationship

y[n] = Σ_{k=−∞}^{∞} h[k] x[n − k] ,

we see that infinitely many future values of the input signal x are needed
in order to compute the output at any time n. This is clearly infeasible
in practice (i.e., the filter is “irrecoverably” noncausal). Practical lowpass
filters have frequency responses that necessarily deviate from the ideal form
shown in Figure 4.15.
Similar conclusions can be drawn for other ideal frequency-selective fil-
ters (highpass, bandpass and bandstop). In short, filters with perfectly flat
response over the passband, zero response over the stopband, and infinitely
steep edges at the cutoff frequencies are not practically realizable.
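For reference, the truncated impulse response plotted in Figure 4.16 can be generated as follows (a NumPy sketch of ours; the n = 0 sample is the limit Aω_c/π):

```python
import numpy as np

# Ideal lowpass impulse response h[n] = A sin(w_c n) / (pi n), with
# h[0] = A w_c / pi, truncated to |n| <= 15 for w_c = pi/4 and A = 1.
wc, A = np.pi / 4, 1.0
n = np.arange(-15, 16)

h = np.where(n == 0, A * wc / np.pi,
             A * np.sin(wc * n) / (np.pi * np.where(n == 0, 1, n)))

print(h[n == 0])   # peak value w_c / pi = 0.25
```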
4.5.3 Amplitude Response of Practical Filters
Figure 4.17 shows the main features of the amplitude response |H(e^{jω})| of
a lowpass filter used in practice.

Figure 4.16: Impulse response h[n] of an ideal lowpass filter with
cutoff frequency ω_c = π/4, plotted for n = −15 to n = 15.

Figure 4.17: Amplitude response of a practical lowpass filter.

• The passband is the interval of frequencies over which the amplitude
response takes high values. These values range from A(1 − δ) to A(1 + δ)
(as shown in Figure 4.17), and the fluctuation in value is known as
(passband) ripple. The edge (endpoint) of the passband ω_p is the
highest frequency for which |H(e^{jω})| = A(1 − δ).
• The stopband is the interval of frequencies over which the amplitude re-
sponse takes low values; these range from 0 to Aε in Figure 4.17. Again,
the fluctuation in the value of |H(e^{jω})| is referred to as (stopband)
ripple. The ratio

(midpoint of amplitude range over passband) / (maximum of amplitude
over stopband) = A / (Aε) = 1/ε

is known as the stopband attenuation. The edge (endpoint) of the
stopband ω_s is the lowest frequency for which |H(e^{jω})| = Aε.

• The transition band is the interval (ω_p, ω_s) separating the passband
from the stopband.

(Note that an ideal filter has δ = 0, ε = 0, ω_p = ω_s = ω_c, and an empty
transition band.)
The parameters of other ﬁlter types (highpass, bandpass and bandstop)
are deﬁned similarly.
4.5.4 Practical FIR Filters
We began our discussion of FIR filters in Section 4.3 by examining the
second-order filter

y[n] = x[n] − x[n − 1] + x[n − 2] ,  n ∈ Z

whose frequency response

H(e^{jω}) = e^{−jω} (2 cos ω − 1)

crudely approximates that of a bandstop filter. To obtain filters with good
frequency responses, it is necessary to combine more values of the input
signal, i.e., use a higher filter order M. There are several design techniques
for frequency-selective FIR filters, the details of which are beyond the scope
of our discussion. In the case where M is even-valued, the resulting impulse
response sequence

h[n] = { b_n, 0 ≤ n ≤ M;
       { 0,   otherwise.

resembles the impulse response sequence of an ideal filter between n = −M/2
and n = M/2, delayed in time by M/2 units. Odd values of M are also used
in practice.
The coefficient vectors of FIR filters used for frequency selection typically
have symmetry about n = M/2, i.e.,

b_n = b_{M−n}        (4.11)

(This feature was also present in the second-order filter introduced in Section
4.3.) As a result, their impulse response sequences are also symmetric about
n = M/2:

h[n] = h[M − n]

Figure 4.18 shows the impulse response of a lowpass filter of order M = 17,
which is symmetric about n = 17/2.

Figure 4.18: Impulse response of a lowpass filter of order M = 17.
The frequency response of an FIR filter satisfying (4.11) can always be
written in the form

H(e^{jω}) = e^{−j(Mω/2)} F(ω)        (4.12)

where F(ω) is a real-valued function expressible as a sum of cosines. We
illustrate this property using two examples.
Example 4.5.1. Consider the FIR filter with coefficient vector

b = [ 1  −2  3  −2  1 ]^T
Here M = 4 (even). The filter frequency response is given by

H(e^{jω}) = 1 − 2e^{−jω} + 3e^{−j2ω} − 2e^{−j3ω} + e^{−j4ω}
          = e^{−j2ω} ( e^{j2ω} − 2e^{jω} + 3 − 2e^{−jω} + e^{−j2ω} )
          = e^{−j2ω} ( 3 − 2(e^{jω} + e^{−jω}) + (e^{j2ω} + e^{−j2ω}) )
          = e^{−j2ω} ( 3 − 4 cos ω + 2 cos 2ω )

and thus

F(ω) = 3 − 4 cos ω + 2 cos 2ω
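The factorization (4.12) can be checked numerically for this example (a NumPy sketch of ours):

```python
import numpy as np

# Check of (4.12) for the symmetric filter of Example 4.5.1:
# H(e^{jw}) should equal e^{-j2w} F(w) with F(w) = 3 - 4 cos w + 2 cos 2w.
b = np.array([1.0, -2.0, 3.0, -2.0, 1.0])
N = 512
w = 2 * np.pi * np.arange(N) / N

H = np.fft.fft(b, N)
F = 3 - 4 * np.cos(w) + 2 * np.cos(2 * w)

print(np.max(np.abs(H - np.exp(-2j * w) * F)))   # ~0
```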
Example 4.5.2. In this case, M = 5 (odd). The coefficient vector is given
by

b = [ 1  −1  2  2  −1  1 ]^T

and the filter frequency response is given by

H(e^{jω}) = 1 − e^{−jω} + 2e^{−j2ω} + 2e^{−j3ω} − e^{−j4ω} + e^{−j5ω}
          = e^{−j(5ω/2)} ( e^{j(5ω/2)} − e^{j(3ω/2)} + 2e^{j(ω/2)}
                          + 2e^{−j(ω/2)} − e^{−j(3ω/2)} + e^{−j(5ω/2)} )
          = e^{−j(5ω/2)} ( 4 cos(ω/2) − 2 cos(3ω/2) + 2 cos(5ω/2) )

and thus

F(ω) = 4 cos(ω/2) − 2 cos(3ω/2) + 2 cos(5ω/2)
Equation (4.12) allows us to express the amplitude and phase responses
as

|H(e^{jω})| = |F(ω)|   and   ∠H(e^{jω}) = −Mω/2 + { 0, F(ω) ≥ 0;
                                                  { π, F(ω) < 0.

Note that the phase response is a linear function of ω with jumps of π
occurring wherever F(ω) changes sign. Since F(ω) has no zeros over the
passband, no such discontinuities occur there; thus the phase response is
exactly linear. By the time delay property of the DTFT (Subsection 4.2.3),
adding a linear function of frequency to the phase spectrum is equivalent to
a delay in the time domain. Thus all frequency components of interest (i.e.,
those in the passband) are reconstructed at the output with the same delay,
equal to M/2 time units. This is a very appealing property of FIR filters
with symmetric coefficients. The other important class of filters (known
as IIR, or infinite impulse response) exhibits a nonlinear phase response,
resulting in sinusoidal components reappearing at the output with variable
delays. This effect, known as phase distortion, can severely change the shape
of a pulse and can be particularly noticeable in images.
4.5.5 Filter Transformations
Multiplication of a sequence by a complex sinusoid in the time domain results
in a shift of its DTFT in frequency.
Fact. (Multiplication by a Complex Sinusoid in Time Domain) If y[n] =
e^{jω_0 n} x[n], then Y(e^{jω}) = X(e^{j(ω−ω_0)}).
This follows from the analysis equation:

Y(e^{jω}) = Σ_{n=−∞}^{∞} y[n] e^{−jnω}
          = Σ_{n=−∞}^{∞} x[n] e^{jω_0 n} e^{−jnω}
          = Σ_{n=−∞}^{∞} x[n] e^{−jn(ω−ω_0)}
          = X(e^{j(ω−ω_0)})
This property, applied to the impulse response sequence h and the fre-
quency response H(e^{jω}), allows us to make certain straightforward trans-
formations between frequency-selective filters of different types (lowpass,
highpass, bandpass and bandstop). For example:

• Multiplication of h[n] by e^{jπn} = (−1)^n results in H(e^{jω}) being shifted
in frequency by π radians—this transforms a lowpass filter into a high-
pass one, and vice versa.

• Multiplication of h[n] by

cos(ω_0 n) = ( e^{jω_0 n} + e^{−jω_0 n} ) / 2

produces a new frequency response equal to

( H(e^{j(ω−ω_0)}) + H(e^{j(ω+ω_0)}) ) / 2

If the original filter is lowpass with passband and stopband edges at
ω = ω_p and ω = ω_s, respectively, then the above transformation will
produce a bandpass filter provided ω_s < ω_0 < π − ω_s. The center of
the passband will be at ω = ω_0 and its width will be about 2ω_p.

These transformations are illustrated in Figure 4.19.
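The first transformation can be demonstrated numerically (a NumPy sketch of ours, using a truncated ideal lowpass impulse response; the truncation length 41 is an arbitrary choice):

```python
import numpy as np

# Multiplying a (truncated) ideal lowpass impulse response by (-1)^n
# shifts its frequency response by pi, yielding a highpass filter.
wc = np.pi / 4
n = np.arange(-20, 21)
h_lp = np.where(n == 0, wc / np.pi,
                np.sin(wc * n) / (np.pi * np.where(n == 0, 1, n)))

h_hp = ((-1.0) ** n) * h_lp          # modulation by e^{j pi n}

H_lp = np.abs(np.fft.fft(h_lp, 512))
H_hp = np.abs(np.fft.fft(h_hp, 512))

# The highpass response at w = pi matches the lowpass response at w = 0.
print(H_lp[0], H_hp[256])            # approximately equal
```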
Figure 4.19: Multiplication of the impulse response by a sinusoid
results in ﬁlter type transformation: lowpass (i) to highpass (ii);
and lowpass (i) to bandpass (iii).
4.6 Time Domain Computation of Filter Response
4.6.1 Impulse Response
The input-output relationship

y[n] = b_0 x[n] + b_1 x[n − 1] + ··· + b_M x[n − M]        (4.13)
of a ﬁnite impulse response ﬁlter allows us to directly compute the response
y of the ﬁlter to an arbitrary input sequence x. We have seen that for cer-
tain classes of input signals (notably sinusoids, exponentials and periodic
signals), the output sequence can be computed eﬃciently using frequency-
domain properties of the ﬁlter such as the system function H(z) and the
frequency response H(e

). The frequency domain-based approach is gener-
ally recommended for input signals whose spectra are discrete, or else have
“nice” closed forms which enable us to invert the equation
Y (e

) = H(e

)X(e

)
with ease. In most practical applications, the signals involved are not as
structured, and it becomes necessary to use the equation (4.13) directly.
In Section 4.3, we described the operation of the FIR ﬁlter deﬁned by
(4.13) as follows: At each time n, the time-reverse of the vector b is aligned
with a window of the input signal spanning time indices n −M through n.
The products b_k x[n − k] are computed, then added together to produce the
output sample y[n]. The time-reverse of b is then shifted to the right by
one time index, a new window is formed, and the computation is repeated
to yield y[n+1], etc. This procedure is illustrated graphically in Figure 4.20
(i).
In our discussion of the FIR frequency response and system function, we
deﬁned the so-called impulse response of the ﬁlter as the sequence h formed
by padding the vector b with inﬁnitely many zeros on both sides. To see
why that sequence was named so, consider an input signal consisting of a
unit impulse at time zero:
x[n] = δ[n]
This signal is plotted in the top graph of Figure 4.20 (ii). If n < 0, then
indices n − M through n are all (strictly) negative, and thus x[n − M] =
. . . = x[n] = 0. This implies that
y[n] = b_0 x[n] + b_1 x[n − 1] + ··· + b_M x[n − M] = 0
Figure 4.20: (i) Filter response at time n formed by linear com-
bination of input signal values in a sliding time window; (ii) Filter
response to a unit impulse.
A zero value for y[n] is also obtained for n > M, in which case indices
n −M through n are all (strictly) positive and the corresponding values of
the input signal are all zero.
For 0 ≤ n ≤ M, the window of input samples which are aligned with
the (time-reversed) vector b includes the only nonzero value of the input,
namely x[0] = 1. Then
y[n] = b
n
x[0] = b
n
as illustrated in the bottom graph of Figure 4.20 (ii).
Thus the response of the ﬁlter to a unit impulse is given by
h[n] = { b_n, 0 ≤ n ≤ M;  0, otherwise.    (4.14)
Clearly, the designation ﬁnite impulse response (FIR) is appropriate for
this ﬁlter, since the duration of the impulse response sequence is ﬁnite (it
begins at time 0 and ends at time M).
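The defining computation is easy to check numerically. The sketch below is in Python with numpy (the text's own snippets use MATLAB); the helper name fir_filter is ours, not from the text. Feeding a unit impulse through the filter recovers the coefficient vector b, padded with zeros, exactly as in (4.14).

```python
import numpy as np

def fir_filter(b, x):
    """Apply the FIR filter y[n] = sum_k b[k] x[n-k] to a finite input x.
    Returns the full output over 0 <= n <= len(x) + len(b) - 2.
    (Illustrative helper; the name is ours, not from the text.)"""
    M = len(b) - 1
    y = np.zeros(len(x) + M)
    for n in range(len(y)):
        for k in range(M + 1):
            if 0 <= n - k < len(x):
                y[n] += b[k] * x[n - k]
    return y

b = np.array([1.0, -2.0, 3.0])
delta = np.array([1.0, 0.0, 0.0])   # unit impulse delta[n] for n = 0, 1, 2
h = fir_filter(b, delta)            # impulse response: b followed by zeros
print(h)                            # the first M+1 samples equal b itself
```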
4.6.2 Convolution and Linearity
The deﬁnition of h in (4.14) allows us to rewrite the input-output relation-
ship (4.13) as follows:
y[n] = Σ_{k=0}^{M} b_k x[n−k]
     = Σ_{k=0}^{M} h[k] x[n−k]
     = Σ_{k=−∞}^{∞} h[k] x[n−k]    (4.15)
where the conversion to an inﬁnite sum was possible because h[k] = 0 for
k < 0 and k > M. Although only a ﬁnite number (i.e., M +1) of summands
are involved here, the inﬁnite sum in (4.15) gives us a general form for
the input-output relationship of a linear ﬁlter which would also hold if the
impulse response had inﬁnite duration. (Such ﬁlters are beyond the scope
of the present discussion.)
Deﬁnition 4.6.1. The (linear) convolution of sequences h and x is the
sequence y denoted by
y = h ∗ x
and deﬁned by
y[n] = Σ_{k=−∞}^{∞} h[k] x[n−k]
In the above sum, k is a (dummy) variable of summation. At a given
time n, the output is formed by summing together all products of the form
h[k]x[n−k] (note that the time indices in the two arguments always add up
to n).
The concept of convolution was encountered earlier in its circular form,
which involved two vectors of the same (ﬁnite) length. The relationship
between linear and circular convolution will be explored in the next section.
In the meantime, we note that linear convolution is symmetric:
h ∗ x = x ∗ h
or equivalently,
Σ_{k=−∞}^{∞} h[k] x[n−k] = Σ_{k=−∞}^{∞} x[k] h[n−k]
This can be shown using a change of variable from k to k′ = n−k in (4.15). As k ranges over all integers (for fixed n), so does k′ (in reverse order), and thus
Σ_{k=−∞}^{∞} h[k] x[n−k] = Σ_{k′=−∞}^{∞} h[n−k′] x[k′] = Σ_{k=−∞}^{∞} x[k] h[n−k]
The last expression for the convolution sum, namely
y[n] = Σ_{k=−∞}^{∞} x[k] h[n−k] ,    (4.16)
can be also obtained directly from the linearity of the ﬁlter. We saw earlier
that the response of the FIR ﬁlter to the input x[n] = δ[n] is given by h[n],
deﬁned in (4.14) above. By a similar argument, the response to a delayed
impulse
x[n] = δ[n−k] ,   n ∈ Z
is given by h[n] delayed by k time instants, which is the signal h[n −k].
We can express any input x as a linear combination of delayed impulses:
x[n] = Σ_{k=−∞}^{∞} x[k] δ[n−k]
By linearity of the ﬁlter, the response y to input x will be the linear com-
bination of the responses to each of the delayed impulses in the sum, i.e.,
y[n] = Σ_{k=−∞}^{∞} x[k] h[n−k]
This is the same equation as (4.16) above.
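The symmetry h ∗ x = x ∗ h is also easy to verify numerically. In the Python/numpy sketch below (the text uses MATLAB; numpy's np.convolve implements exactly the finite-duration convolution sum), the two orderings produce identical outputs:

```python
import numpy as np

# Symmetry of linear convolution: h * x == x * h.
# np.convolve computes the finite-duration convolution sum directly.
h = np.array([1, -2, -2, 1])
x = np.array([1, 1, -3, 1, 1])
hx = np.convolve(h, x)
xh = np.convolve(x, h)
print(np.array_equal(hx, xh))   # True
```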
4.6.3 Computation of the Convolution Sum
We now illustrate the computation of (4.15) in its simplest and most fre-
quently encountered form, where both signal sequences h and x have ﬁnite
duration. We will assume that x[n] = 0 for n < 0 or n ≥ L; and simi-
larly, h[n] = 0 for n < 0 or n ≥ P (thus the order of the FIR ﬁlter equals
M = P −1).
As we noted previously, in order to compute y[n], we need to form the
products h[k]x[n−k] (where k ranges over all time indices), then sum them
together. The easiest way to form x[n −k] is to reverse x in time:
x̃[k] = x[−k] ;
then delay x̃ by n time instants:
x̃[k−n] = x[n−k]
This procedure is depicted in Figure 4.21, where x[n − k] is plotted (as
a function of k) for diﬀerent values of n. We note the following:
Figure 4.21: Illustration of the convolution of two finite-length sequences h and x.
• For n < 0, the nonzero values of the sequence x[n−k] do not overlap (in time) with those of h[k]. As a result, h[k]x[n−k] = 0 for all k, and y[n] = 0. (This is expected since the impulse response is causal and the input is zero for negative time indices.)
• Similarly, no overlap occurs when the left edge of x[n −k] follows the
right edge of h[k]. This happens when n −(L−1) > P −1, i.e., when
n > P +L −2.
• For 0 ≤ n ≤ P + L − 2, the nonzero segments of x[n − k] and h[k]
overlap in varying degrees, and the resulting value of y[n] is, in general,
nonzero.
We summarize our conclusions as follows.
Fact. The convolution y = h∗x of two ﬁnite-duration sequences h and x is
also a ﬁnite-duration sequence. If the nonzero segments of h and x begin at
time n = 0 and have length P and L, respectively, then the nonzero segment
of y also begins at time n = 0 and has length P +L −1.
Example 4.6.1. Consider the two sequences h and x given by
h[n] = −δ[n] + 3δ[n −1] −3δ[n −2] +δ[n −3]
and
x[n] = δ[n] + 2δ[n −1] + 3δ[n −2]
(Figure for Example 4.6.1: stem plots of the sequences h[k] and x[k].)
Based on our earlier discussion, the nonzero segment of y = h∗ x begins
at time n = 0 and has length 4 +3 −1 = 6. Thus we only need to compute
y[n] for n = 0, . . . , 5.
We follow the technique outlined earlier and illustrated in Figure 4.21.
Instead of plotting, we tabulate (in rows) the values of h[k] along with those
x[n − k] for each time index n of interest (i.e., n = 0 : 5). The products
h[k]x[n −k] are then computed and summed together for every value of n,
and the resulting output value y[n] is entered in the last column.
  k      | −2 −1  0  1  2  3  4  5 | y[n]
  h[k]   |       −1  3 −3  1       |
  x[−k]  |  3  2  1                |  −1
  x[1−k] |     3  2  1             |   1
  x[2−k] |        3  2  1          |   0
  x[3−k] |           3  2  1       |   4
  x[4−k] |              3  2  1    |  −7
  x[5−k] |                 3  2  1 |   3
We thus obtain
y[0 : 5] = [ −1 1 0 4 −7 3 ]^T
Remark. Since convolution is symmetric, the same result would have been
obtained in the previous example by interchanging the signals x and h.
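The tabulated computation above can be cross-checked numerically. The following Python/numpy sketch (the text's own code is MATLAB) reproduces Example 4.6.1:

```python
import numpy as np

# Example 4.6.1 checked numerically: h = [-1, 3, -3, 1], x = [1, 2, 3].
h = np.array([-1, 3, -3, 1])
x = np.array([1, 2, 3])
y = np.convolve(h, x)   # length 4 + 3 - 1 = 6
print(y)                # the six output samples are -1, 1, 0, 4, -7, 3
```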
4.6.4 Convolution and Polynomial Multiplication
The z-transform of a sequence x is a function of the complex variable z ∈ C
deﬁned by
X(z) = Σ_{k=−∞}^{∞} x[k] z^{−k}
provided, of course, that the inﬁnite sum converges. If x has ﬁnite duration,
then the sum consists of a ﬁnite number of nonzero terms and convergence
is not an issue (except at z = 0). If x[n] = 0 for n < 0 and n ≥ L (as was
the case in the previous section), then
X(z) = Σ_{k=0}^{L−1} x[k] z^{−k}
i.e., X(z) is a polynomial of degree L−1 in the variable z^{−1} = 1/z.
Note that the z-transform was encountered earlier in Section 4.4: if h is the impulse response of an FIR filter of order M = P − 1, then
H(z) = Σ_{k=0}^{P−1} h[k] z^{−k}
is the filter system function, which reduces to the filter frequency response when z = e^{jω}.
The convolution y = h∗ x mirrors the multiplication of the polynomials
X(z) and H(z). To see this, write
H(z)X(z) = ( Σ_{k=0}^{P−1} h[k] z^{−k} ) ( Σ_{k=0}^{L−1} x[k] z^{−k} )
         = ( h[0] + h[1]z^{−1} + ··· + h[P−1]z^{−(P−1)} )
           · ( x[0] + x[1]z^{−1} + ··· + x[L−1]z^{−(L−1)} )
and consider the z^{−n} term (where 0 ≤ n ≤ L+P−2) in the resulting product. This term is formed by adding together products of the form
h[k]z^{−k} · x[n−k]z^{−(n−k)} = h[k]x[n−k]z^{−n}
Its coefficient is thus given by the sum
h[0]x[n] + h[1]x[n−1] + ··· + h[n−1]x[1] + h[n]x[0]
which in turn equals
(h ∗ x)[n] = y[n]
We therefore have
Y(z) = Σ_{n=0}^{L+P−2} y[n] z^{−n} = H(z)X(z)
This result can be simply stated as follows.
Fact. The z-transform of the ﬁlter response is given by the product of the
ﬁlter system function and the z-transform of the input sequence.
A special case of this result is the input-output relationship
Y(e^{jω}) = H(e^{jω}) X(e^{jω})
in the frequency domain, which was encountered earlier. We note that
Y (z) = H(z)X(z)
also holds for inﬁnite-duration sequences, provided the inﬁnite sums involved
converge.
Conclusion. Linear convolution of two sequences in the time domain is
equivalent to multiplication (of their DTFT’s or, more generally, their z-
transforms) in the frequency domain. Thus linear convolution of sequences
is analogous to circular convolution of vectors (which also corresponds to
multiplication of DFT’s in the frequency domain).
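The correspondence between convolution and polynomial multiplication can be illustrated directly. In the Python/numpy sketch below, np.polymul multiplies two polynomials given by their coefficient arrays, so its output should agree term by term with np.convolve:

```python
import numpy as np

# Convolution mirrors polynomial multiplication: the coefficients of the
# product H(z)X(z), viewed as polynomials in z^{-1}, are exactly h * x.
h = np.array([-1, 3, -3, 1])
x = np.array([1, 2, 3])
print(np.array_equal(np.polymul(h, x), np.convolve(h, x)))   # True
```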
4.6.5 Impulse Response of Cascaded Filters
Consider two filters with impulse response sequences h^{(1)} and h^{(2)}, and system functions H_1(z) and H_2(z), respectively. We have seen in Subsection 4.4.2 that the system function H(z) of the cascade of two filters is given by
H(z) = H_1(z) H_2(z)
Thus we must also have
h = h^{(1)} ∗ h^{(2)}
i.e., the impulse response sequence of the cascade is given by the convolution of the two impulse response sequences.
The same result can be obtained using a time domain-based argument.
Briefly, if the input to the cascade is a unit impulse, then the first filter will produce an output equal to h^{(1)}. The second filter will act on h^{(1)} to produce a response equal to h^{(1)} ∗ h^{(2)}, which is also the impulse response of the cascade. This is illustrated graphically in Figure 4.22.
Figure 4.22: Impulse response of a cascade.
Example 4.6.2. Consider two FIR ﬁlters with impulse responses
h^{(1)}[n] = δ[n] − 2δ[n−1] − 2δ[n−2] + δ[n−3]
and
h^{(2)}[n] = δ[n] + δ[n−1] − 3δ[n−2] + δ[n−3] + δ[n−4]
The impulse response h of the cascade can be computed directly in the time
domain using the formula
h = h^{(1)} ∗ h^{(2)}
and the convolution technique explained in Subsection 4.6.3. Alternatively,
we have
H_1(e^{jω}) = 1 − 2e^{−jω} − 2e^{−j2ω} + e^{−j3ω}
H_2(e^{jω}) = 1 + e^{−jω} − 3e^{−j2ω} + e^{−j3ω} + e^{−j4ω}
and therefore
H(e^{jω}) = H_1(e^{jω}) H_2(e^{jω})
          = 1 − e^{−jω} − 7e^{−j2ω} + 6e^{−j3ω} + 6e^{−j4ω} − 7e^{−j5ω} − e^{−j6ω} + e^{−j7ω}
The impulse response h of the cascade is read off the coefficients of H(e^{jω}):
h[n] = δ[n] − δ[n−1] − 7δ[n−2] + 6δ[n−3] + 6δ[n−4] − 7δ[n−5] − δ[n−6] + δ[n−7]
In particular, the cascade is an FIR filter of order M = 7 with coefficient vector
b = h[0 : 7] = [ 1 −1 −7 6 6 −7 −1 1 ]^T
As a closing remark, we note that the time domain and frequency domain
formulas for cascaded ﬁlters developed here and in Subsection 4.4.2 can be
generalized to an arbitrary number of ﬁlters connected in cascade.
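Example 4.6.2 can be verified numerically; the following Python/numpy sketch (the text's own code is MATLAB) convolves the two impulse responses:

```python
import numpy as np

# Example 4.6.2 checked numerically: the impulse response of the cascade
# is the convolution of the two impulse responses.
h1 = np.array([1, -2, -2, 1])
h2 = np.array([1, 1, -3, 1, 1])
h = np.convolve(h1, h2)
print(h)   # the coefficients 1, -1, -7, 6, 6, -7, -1, 1
```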
4.7 Convolution in Practice
4.7.1 Linear Convolution of Two Vectors
In the previous section, we saw that the convolution of two ﬁnite-duration
sequences also results in a sequence of ﬁnite duration. This allows us to
extend the concept of (linear) convolution to vectors, by converting them to
sequences of ﬁnite duration.
Suppose b and s are vectors of length P and L, respectively. Let
h[n] = { b[n], 0 ≤ n ≤ P−1;  0, otherwise }
and
x[n] = { s[n], 0 ≤ n ≤ L−1;  0, otherwise }
be the sequences obtained by padding b and s with inﬁnitely many zeros on
both sides. As we saw earlier,
y = h ∗ x
is a sequence of total duration P +L −1 time instants:
y[n] = { c[n], 0 ≤ n ≤ P+L−2;  0, otherwise }
The convolution of b and s is defined by
b ∗ s = c = y[0 : P+L−2]
Example 4.7.1. In Example 4.6.1,
b = [ −1 3 −3 1 ]^T
s = [ 1 2 3 ]^T
and
c = b ∗ s = [ −1 1 0 4 −7 3 ]^T
The vectors b and s often (though not always) have nonzero ﬁrst and
last entries. In such cases, the ﬁrst and last entries of c = b ∗ s are also
nonzero, since
c[0] = b[0]s[0] and c[P +L −2] = b[P −1]s[L −1]
If any single endpoint (i.e., b[0], s[0], b[P − 1] or s[L − 1]) is replaced by a
zero, then the corresponding endpoint of c (i.e., c[0] or c[P + L − 2]) also
becomes a zero. This leads to the following useful property:
Fact. If 0_i represents an all-zeros vector of length i, then (using the semicolon notation as in MATLAB)
[0_i ; b] ∗ s = [0_i ; b ∗ s]
and
b ∗ [s ; 0_i] = [b ∗ s ; 0_i]
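This zero-padding property is easily checked numerically. The Python/numpy sketch below uses the vectors of Example 4.6.1:

```python
import numpy as np

# Prepending i zeros to b prepends i zeros to b * s (the first identity
# in the Fact above); appending zeros to s works analogously.
b = np.array([-1, 3, -3, 1])
s = np.array([1, 2, 3])
i = 2
lhs = np.convolve(np.concatenate([np.zeros(i, dtype=int), b]), s)
rhs = np.concatenate([np.zeros(i, dtype=int), np.convolve(b, s)])
print(np.array_equal(lhs, rhs))   # True
```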
4.7.2 Linear Convolution versus Circular Convolution
In our discussion of the DFT, we introduced the circular convolution (⊛) of
two vectors of the same length N, and showed that circular convolution in
the time domain is equivalent to (element-wise) multiplication of DFT’s in
the frequency domain. It turns out that linear convolution of two vectors
can be also implemented by circular convolution, after zero-padding the two
vectors to the appropriate output length (which is known in advance).
Fact. If b = b[0 : P−1] and s = s[0 : L−1], then
b ∗ s = [b ; 0_{L−1}] ⊛ [s ; 0_{P−1}]
This result is demonstrated in the arrays below for the case P = 6 and
L = 9, where the output vector has length P + L − 1 = 14. Note that the
same product terms b[k]s[n −k] will be generated in corresponding rows of
the two arrays. As before, zero entries are omitted.
Linear Convolution:

                        b0 b1 b2 b3 b4 b5
s8 s7 s6 s5 s4 s3 s2 s1 s0                               → c0
   s8 s7 s6 s5 s4 s3 s2 s1 s0                            → c1
      s8 s7 s6 s5 s4 s3 s2 s1 s0                         → c2
         s8 s7 s6 s5 s4 s3 s2 s1 s0                      → c3
            s8 s7 s6 s5 s4 s3 s2 s1 s0                   → c4
               s8 s7 s6 s5 s4 s3 s2 s1 s0                → c5
                  s8 s7 s6 s5 s4 s3 s2 s1 s0             → c6
                     s8 s7 s6 s5 s4 s3 s2 s1 s0          → c7
                        s8 s7 s6 s5 s4 s3 s2 s1 s0       → c8
                           s8 s7 s6 s5 s4 s3 s2 s1 s0    → c9
                              s8 s7 s6 s5 s4 s3 s2 s1 s0 → c10
                                 s8 s7 s6 s5 s4 s3 s2 s1 s0 → c11
                                    s8 s7 s6 s5 s4 s3 s2 s1 s0 → c12
                                       s8 s7 s6 s5 s4 s3 s2 s1 s0 → c13
Circular Convolution:

b0 b1 b2 b3 b4 b5
s0                s8 s7 s6 s5 s4 s3 s2 s1 → c0
s1 s0                s8 s7 s6 s5 s4 s3 s2 → c1
s2 s1 s0                s8 s7 s6 s5 s4 s3 → c2
s3 s2 s1 s0                s8 s7 s6 s5 s4 → c3
s4 s3 s2 s1 s0                s8 s7 s6 s5 → c4
s5 s4 s3 s2 s1 s0                s8 s7 s6 → c5
s6 s5 s4 s3 s2 s1 s0                s8 s7 → c6
s7 s6 s5 s4 s3 s2 s1 s0                s8 → c7
s8 s7 s6 s5 s4 s3 s2 s1 s0                → c8
   s8 s7 s6 s5 s4 s3 s2 s1 s0             → c9
      s8 s7 s6 s5 s4 s3 s2 s1 s0          → c10
         s8 s7 s6 s5 s4 s3 s2 s1 s0       → c11
            s8 s7 s6 s5 s4 s3 s2 s1 s0    → c12
               s8 s7 s6 s5 s4 s3 s2 s1 s0 → c13
The formal proof of this fact using the time-domain deﬁnitions of the
linear and circular convolution is neither diﬃcult nor particularly insightful.
For a more interesting proof, we turn to the frequency domain. If, as before, h, x and y are the finite-duration sequences obtained by padding b, s and
c = b ∗ s
(respectively) with infinitely many zeros, then
y = h ∗ x
The DTFT's of the three sequences in the above equation are related by
Y(e^{jω}) = H(e^{jω}) X(e^{jω})    (4.17)
We have
H(e^{jω}) = Σ_{n=0}^{P−1} b[n] e^{−jωn}
X(e^{jω}) = Σ_{n=0}^{L−1} s[n] e^{−jωn}
Y(e^{jω}) = Σ_{n=0}^{P+L−2} c[n] e^{−jωn}
In Subsection 4.2.2, we saw that sampling the DTFT of a ﬁnite-duration
sequence at frequencies
ω_k = 2πk/N ,   k = 0, . . . , N−1
yields the DFT of s zero-padded to length N, provided, of course, that N is
greater than or equal to the length of s. Taking N = P +L −1 (the size of
the longest vector in this case), we have
H(e^{j(2π/N)k}) = kth entry in the DFT of [b ; 0_{L−1}]
X(e^{j(2π/N)k}) = kth entry in the DFT of [s ; 0_{P−1}]
Y(e^{j(2π/N)k}) = kth entry in the DFT of c
From (4.17), we conclude that the DFT of c is given by the element-wise product of the DFT of [b ; 0_{L−1}] and that of [s ; 0_{P−1}]. From DFT 9, it follows that the time-domain vectors must satisfy
c = [b ; 0_{L−1}] ⊛ [s ; 0_{P−1}]
Remark. If b and s are zero-padded to a total length N > P +L −1, then
circular convolution will result in c being padded with N−(P +L−1) zeros.
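The Remark can be checked numerically. In the Python/numpy sketch below, the vectors of Example 4.6.1 are zero-padded to N = 9 > P + L − 1 = 6 before the DFT-based circular convolution; the result is the linear convolution followed by three extra zeros:

```python
import numpy as np

# Circular convolution via DFTs with N > P + L - 1 pads the linear
# convolution with N - (P + L - 1) trailing zeros.
b = np.array([-1.0, 3.0, -3.0, 1.0])   # P = 4
s = np.array([1.0, 2.0, 3.0])          # L = 3
N = 9                                  # > P + L - 1 = 6
c = np.real(np.fft.ifft(np.fft.fft(b, N) * np.fft.fft(s, N)))
print(np.round(c).astype(int))         # -1, 1, 0, 4, -7, 3, then 0, 0, 0
```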
The equivalence of linear and circular convolution for ﬁnite vectors gives
us an option of performing this operation in the frequency domain, by com-
puting the DFT’s of the two vectors involved (zero-padded to the appro-
priate length), then inverting the element-wise product of the DFT’s. This
approach was illustrated in Section 4.4, where we computed the response
of an FIR ﬁlter to a periodic input using (in eﬀect) a circular convolution
implemented via DFT’s. As it turns out, implementation via DFT’s can
be greatly advantageous in practice. This is because the number of floating-point operations required to convolve two vectors of length N in the time domain is of the order of N²; the DFT and its inverse, on the other hand, can be implemented using fast Fourier transform (FFT) algorithms, for which the number of such operations is only of the order of N log₂ N. For large values of N, the frequency-domain approach can be much faster.
In MATLAB, the standard command for convolving two vectors b and
s is
c = conv(b,s)
which is implemented entirely in the time domain. An equivalent frequency-
domain implementation using FFT’s is
N = length(b) + length(s) - 1;
B = fft(b,N);
S = fft(s,N);
C = B.*S;
c = ifft(C)
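An equivalent sketch in Python/numpy, where np.fft.fft and np.fft.ifft play the roles of MATLAB's fft and ifft, is the following (the tiny imaginary parts left by round-off are discarded by taking the real part):

```python
import numpy as np

def fft_conv(b, s):
    """Frequency-domain convolution, mirroring the MATLAB snippet above."""
    N = len(b) + len(s) - 1
    C = np.fft.fft(b, N) * np.fft.fft(s, N)   # fft(., N) zero-pads to N
    return np.real(np.fft.ifft(C))            # imaginary parts are round-off

b = np.array([2.0, -3.0, 4.0])
s = np.array([1.0, 0.0, -1.0, 2.0])
print(np.allclose(fft_conv(b, s), np.convolve(b, s)))   # True
```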
4.7.3 Circular Buﬀers for Real-Time FIR Filters
In many practical applications, a linear ﬁlter will operate over long periods
of time, processing input signals which are much longer than the ﬁlter’s
impulse response. If the ﬁlter is part of a system operating in real time
(e.g., a digital speech encoder used during a telephone conversation), the
delay in generating output samples must be kept short.
The current output y[n] can be computed as soon as the current input x[n]
becomes available using the input-output relationship
y[n] = Σ_{k=0}^{M} b_k x[n−k]    (4.18)
Clearly, the M previous input values must also be available at any given
time, since computation of y[n] requires knowledge of the vector x[n−M : n].
The subsequent filter output y[n+1] is computed using the vector x[n−M+1 : n+1]. Both x[n−M : n] and x[n−M+1 : n+1] contain the subvector x[n−M+1 : n], and differ only in one element:
x[n−M : n] = [ x[n−M] ; x[n−M+1 : n] ]
while
x[n−M+1 : n+1] = [ x[n−M+1 : n] ; x[n+1] ]
Using signal permutations, x[n−M+1 : n+1] can be generated by applying a left circular shift to x[n−M : n]:
P^{−1} x[n−M : n] = [ x[n−M+1 : n] ; x[n−M] ]
followed by replacing the last entry x[n−M] by x[n+1].
The use of a circular shift in updating the vector of input values suggests a computationally efficient implementation of an FIR filter in terms of a so-called circular buffer. The circular buffer is a vector of size M+1 which at any particular time n holds P^r x[n−M : n], i.e., a circularly shifted version of x[n−M : n]. A pointer i_now gives the position of x[n] in the circular buffer; it is not difficult to see that if the buffer is indexed by 1 : M+1, then r = i_now. At time n + 1, the buffer is updated as follows:
• the pointer i_now is incremented circularly by one (position); and
• the input x[n−M] at that position is replaced by the current input x[n+1].
After the update, the buffer holds P^{r+1} x[n−M+1 : n+1].
Figure 4.23 illustrates the sequential updating of the circular buﬀer in
the case M = 6, assuming all inputs before n = 0 were identically equal
to zero. The initial position of x[0] (obtained by initialization of i_now) is arbitrary.
n = 0:  x[0]   0     0     0     0     0     0
n = 1:  x[0]  x[1]   0     0     0     0     0
n = 2:  x[0]  x[1]  x[2]   0     0     0     0
n = 6:  x[0]  x[1]  x[2]  x[3]  x[4]  x[5]  x[6]
n = 7:  x[7]  x[1]  x[2]  x[3]  x[4]  x[5]  x[6]
n = 8:  x[7]  x[8]  x[2]  x[3]  x[4]  x[5]  x[6]
. . .
Figure 4.23: A circular buffer of size P = M + 1 = 7 used for computing y[0 : 8]. For each value of n, the arrow (same as the pointer i_now) indicates the current input x[n].
The ﬁlter output (4.18) at time n is the (unconjugated) inner product
of the coeﬃcient vector b with the contents of the circular buﬀer read in
reverse circular order starting with the current input sample x[n].
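A minimal sketch of such a circular-buffer filter is given below, in Python/numpy with 0-based indexing (the function and variable names are ours, not from the text). Each new sample overwrites the oldest entry, and the output is formed by reading the buffer in reverse circular order from the current sample:

```python
import numpy as np

def fir_stream(b, inputs):
    """Stream an input through an FIR filter using a circular buffer."""
    M = len(b) - 1
    buf = np.zeros(M + 1)       # circular buffer of size M + 1
    i_now = 0                   # position of the current input x[n]
    outputs = []
    for x in inputs:
        buf[i_now] = x          # overwrite the oldest stored sample
        # inner product of b with the buffer, read in reverse circular
        # order starting from the current input sample
        y = sum(b[k] * buf[(i_now - k) % (M + 1)] for k in range(M + 1))
        outputs.append(y)
        i_now = (i_now + 1) % (M + 1)
    return np.array(outputs)

b = np.array([1.0, -2.0, 3.0])
x = np.array([1.0, 0.0, 2.0, -1.0])
print(np.allclose(fir_stream(b, x), np.convolve(b, x)[:len(x)]))   # True
```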
4.7.4 Block Convolution
In applications where longer delays between input and output can be tol-
erated, it may be advantageous to ﬁlter the input signal in a block-wise
fashion, i.e., by convolving consecutive blocks (segments) of the input signal
with the ﬁlter impulse response. Convolving long vectors using frequency
domain-based algorithms such as the FFT is generally more eﬃcient than
following the sliding window approach (in the time domain) discussed in the
previous subsection.
To develop a model for the block-wise implementation of convolution,
consider an FIR ﬁlter of order M = P −1 > 0 with coeﬃcient vector b. The
ﬁlter acts on an input vector s, which has been partitioned into J blocks of
length L:
s = [ s^{(0)} ; s^{(1)} ; . . . ; s^{(J−1)} ]
The block length L is a design parameter chosen according to factors such
as processing speed and acceptable delay between input and output. Here
we will assume that L ≥ P. Note that we have also taken the length of s
as an exact multiple of L (zero-padding may be used for the last block if
necessary).
By linearity and time-invariance of the filter, the output
c = b ∗ s
is the superposition (i.e., sum) of the responses to each of the input blocks s^{(j)}, where each response is delayed in time by the appropriate index, namely jL. Since the response to s^{(j)} has a total duration of L+P−1 > L time units, it follows that the responses overlap in time. Specifically, for 0 < j ≤ J−1, the vector c[jL : jL+P−2] is formed by summing together parts of two vectors, namely the time-shifted responses to s^{(j−1)} and s^{(j)}.
The underlying concept of superposition is illustrated graphically in Figure 4.24 for the case J = 2.
Figure 4.24: Linearity and time-invariance applied to block convolution.
Example 4.7.2. Consider the FIR filter with coefficient vector
b = [ 1 −3 −3 1 ]^T
driven by the finite-duration input sequence x, where
x[0 : 11] = s = [ 1 −2 3 4 4 −2 7 5 1 3 −1 −4 ]^T
and x[n] = 0 for n < 0 and n > 11.
Suppose we wish to compute the ﬁlter response y to x using block con-
volution with block length L = 6. We express s as
s = [ s^{(1)} ; s^{(2)} ] = [ s^{(1)} ; 0_6 ] + [ 0_6 ; s^{(2)} ]
where
s^{(1)} = [ 1 −2 3 4 4 −2 ]^T
and
s^{(2)} = [ 7 5 1 3 −1 −4 ]^T
Then
b ∗ [ s^{(1)} ; 0_6 ] = [ b ∗ s^{(1)} ; 0_6 ]
                      = [ 1 −5 6 2 −19 −23 −2 10 −2 0 0 0 0 0 0 ]^T
and
b ∗ [ 0_6 ; s^{(2)} ] = [ 0_6 ; b ∗ s^{(2)} ]
                      = [ 0 0 0 0 0 0 7 −16 −35 −8 −8 −9 18 11 −4 ]^T
We thus have
b ∗ s = b ∗ [ s^{(1)} ; 0_6 ] + b ∗ [ 0_6 ; s^{(2)} ]
      = [ 1 −5 6 2 −19 −23 5 −6 −37 −8 −8 −9 18 11 −4 ]^T
and the filter response is given by
y[n] = { (b ∗ s)[n], 0 ≤ n ≤ 14;  0, otherwise }
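The block-convolution computation of Example 4.7.2 can be checked numerically. The Python/numpy sketch below sums the time-shifted block responses (the overlap-add idea described above):

```python
import numpy as np

# Example 4.7.2 via block convolution: convolve b with each length-L block
# of s, then add the responses after shifting each by j*L (overlap-add).
b = np.array([1, -3, -3, 1])
s = np.array([1, -2, 3, 4, 4, -2, 7, 5, 1, 3, -1, -4])
L = 6
out = np.zeros(len(s) + len(b) - 1, dtype=int)
for j in range(len(s) // L):
    block = np.convolve(b, s[j*L:(j+1)*L])   # length L + P - 1
    out[j*L : j*L + len(block)] += block     # overlapping addition
print(out)   # matches the vector b * s computed in the example
```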
Problems
Sections 4.1–4.2
P 4.1. (i) Sketch the signal sequence
x[n] = δ[n + 2] −δ[n + 1] + 5δ[n] −δ[n −1] +δ[n −2]
and write an expression for its discrete-time Fourier transform X(e^{jω}). Show that X(e^{jω}) can be written as a sum of cosines with real coefficients.
(ii) Generalize the result of part (i) to a symmetric signal sequence of ﬁnite
duration (2L + 1 samples): If
x[n] = x[−n] = { β^{|n|}, |n| ≤ L;  0, |n| > L }
express X(e^{jω}) as a sum of L cosines plus a constant.
P 4.2. (i) Sketch the signal sequence
x[n] = { 1, 0 ≤ n ≤ M;  0, otherwise }
and express its Fourier transform X(e^{jω}) in the form
e^{jKω} F(ω)
where F(ω) is a real-valued function which is symmetric about ω = 0. (Hint:
Consult Subsection 3.8.2.)
(ii) Let M = 23. How would you deﬁne an array x in MATLAB, so that
the command
X = fft(x,500)
computes X(e^{jω}) at 500 equally spaced frequencies ω ranging from 0 to 1.996π, inclusive?
P 4.3. Consider the signal sequence x deﬁned by
x[n] = cos(3πn/14 − 1.8) + 2 cos(18πn/35 − 0.7) + 6 cos(17πn/24 + 2.0)
(i) Is the sequence periodic, and if so, what is its period?
(ii) Sketch the amplitude and phase spectra of x (both of which are line
spectra).
Sections 4.3–4.4
P 4.4. Consider the FIR ﬁlter whose input x and output y are related by
y[n] = x[n] −x[n −1] −x[n −2] +x[n −3]
(i) Write out an expression for the system function H(z).
(ii) Express |H(e^{jω})|² in terms of cosines only. Plot |H(e^{jω})| as a function of ω.
(iii) Determine the output y[n] when the input sequence x is given by each
of the following expressions (where n ∈ Z):
• x[n] = 1
• x[n] = (−1)^n
• x[n] = e^{jπn/4}
• x[n] = cos(πn/4 + φ)
• x[n] = 2^{−n}
• x[n] = 2^{−n} cos(πn/4)
(In all cases except the third, your answer should involve real-valued terms
only.)
P 4.5. Consider the FIR ﬁlter
y[n] = x[n] −3x[n −1] +x[n −2] +x[n −3] −3x[n −4] +x[n −5]
(i) Write MATLAB code which includes the function fft, and which com-
putes the amplitude and phase response of the ﬁlter at 256 equally spaced
frequencies between 0 and 2π(1 −256
−1
).
(ii) Express the frequency response of the filter in the form
e^{−jαω} F(ω)
where F(ω) is a real-valued sum of cosines.
(iii) Determine the response y[n] of the ﬁlter to the exponential input se-
quence
x[n] = (1/2)^n ,   n ∈ Z
P 4.6. The MATLAB code
a = [ 1 -3 5 -3 1 ].’ ;
H = fft(a,500);
A = abs(H);
q = angle(H);
computes the amplitude response A and phase response q of a FIR ﬁlter over
500 equally spaced frequencies in the interval [0, 2π).
(i) If x and y are (respectively) the input and output sequences of that
ﬁlter, write an expression for y[n] in terms of values of x.
(ii) Determine the output y of the ﬁlter when the input x is given by
x[n] = (1/3)^n ,   n ∈ Z
(iii) Express the frequency response of the filter in the form
e^{−jαω} F(ω)
where F(ω) is a real-valued sum of cosines.
P 4.7. Consider a FIR ﬁlter whose input x and output y are related by
y[n] = Σ_{k=0}^{8} b_k x[n−k] ,
where the coefficient vector b is given by
b = [ 1 2 −2 −1 4 −1 −2 2 1 ]^T
Let the input x be an inﬁnite periodic sequence with period L = 7, and such
that
x[0 : 6] = [ 1 −1 0 3 1 −2 0 ]^T
Using DFT’s (and MATLAB), determine the ﬁrst period y[0 : 6] of the
output sequence y.
P 4.8. Consider two FIR ﬁlters with coeﬃcient vectors b and c, where
b = [ 3 2 1 2 3 ]^T
and
c = [ 1 −2 2 −1 ]^T
(i) Determine the system function H(z) of the cascade. Is the cascade also
a FIR ﬁlter? If so, determine its coeﬃcient vector.
(ii) Express the amplitude response of the cascade as a sum of sines or
cosines (as appropriate) with real-valued coeﬃcients.
P 4.9. Consider the FIR ﬁlter with coeﬃcient vector
b = [ 1 1 1 1 ]^T
Two copies of this ﬁlter are connected in series (cascade).
(i) Determine the system function H(z) of the cascade. Is the cascade also
a FIR ﬁlter? If so, determine its coeﬃcient vector.
(ii) Determine the response y[n] of the cascade to the sinusoidal input se-
quence
x[n] = cos(πn/2) ,   n ∈ Z
Section 4.5
P 4.10. The text ﬁle s6.txt contains the coeﬃcient vector of a FIR ﬁlter of
order M = 40.
(i) Plot the impulse response of the ﬁlter (i.e., the vector in s6.txt) using
the STEM or BAR functions in MATLAB. What kind of symmetry does the
impulse response exhibit?
(ii) Plot |H(e^{jω})|, i.e., the magnitude of the frequency response of the filter.
What type of ﬁlter (lowpass, highpass, bandpass or bandstop) do we have
here? Determine the edges of the passband and stopband.
(iii) If
A_1 = maximum value of |H(e^{jω})| over the passband
A_2 = minimum value of |H(e^{jω})| over the passband
A_3 = maximum value of |H(e^{jω})| over the stopband
compute the ratios A_1/A_2 and A_2/A_3.
(iv) How would you convert this ﬁlter to a bandpass ﬁlter whose passband is
centered around ω = π/2? What would be the resulting width of the passband?
P 4.11. Consider the signal sequence x given by
x[n] = 2 cos(3.0n −1.7) + cos(1.4n + 0.8) , n ∈ Z
(i) Is x periodic? If so, what is its period?
(ii) If x is the input to the linear ﬁlter with frequency response described
by
H(e^{jω}) = { 2e^{−j3ω}, 2π/3 ≤ ω ≤ 4π/3;  0, all other ω in [0, 2π), }
determine the output sequence y.
P 4.12. Consider the signal
x[n] = 3 cos(πn/15 − π/3) + 5 cos(πn/4 + π/2)
(i) Is the signal periodic, and if so, what is its period?
(ii) If x[n] is the input to a ﬁlter with amplitude and phase responses as
shown in ﬁgure, determine the resulting output y[n].
(Figure for Problem P 4.12: plots of the amplitude response |H(e^{jω})| and the phase response ∠H(e^{jω}).)
Section 4.6
P 4.13. Consider the FIR ﬁlter with impulse response h given by
h[n] = δ[n] −2δ[n −1] + 3δ[n −3] −2δ[n −4]
(i) Without using convolution, determine the response y of the ﬁlter to the
input signal x given by
x[n] = δ[n + 1] −δ[n −1]
(Hint: Express y in terms of h.)
(ii) Now let the input signal x be given by
x[n] = δ[n] + 2δ[n −1] + 3δ[n −2] −δ[n −3]
Using convolution, determine the output signal y.
(iii) Obtain the answer to (ii) by direct multiplication of the z-transforms
H(z) and X(z).
P 4.14. Consider the signal sequence
x[n] = { 1, 0 ≤ n ≤ M;  0, otherwise }
of Problem P 4.2.
(i) Determine the convolution
y = x ∗ x
For what values of n is y[n] nonzero?
(ii) Using the results of Problem P 4.2, write an equation for the DTFT Y(e^{jω}) of y.
(iii) By how many instants should y be delayed (or advanced) in time so
that the resulting signal is symmetric about n = 0? What is the DTFT of
that signal?
P 4.15. Deﬁne x by
x[n] = δ[n] −δ[n −1] +δ[n −2] −δ[n −3]
and let h be the (inﬁnite impulse response) sequence given by
h[n] = { α^n, n ≥ 0;  0, n < 0. }
Compute y = x ∗ h. Sketch your answer in the case α = 2.
P 4.16. Consider the ﬁlter with impulse response given by
h[n] = { 1, 0 ≤ n ≤ 8;  0, otherwise }
(i) Determine the response of the ﬁlter to the input
x^{(1)}[n] = cos(πn/2 + π/8) ,   n ∈ Z
(ii) Determine the output of the ﬁlter to the input
x^{(2)}[n] = δ[n] − δ[n−1] + δ[n−2]
P 4.17. Two FIR filters whose impulse response sequences h^{(1)} and h^{(2)} are given by
h^{(1)}[n] = δ[n] + δ[n−1] + δ[n−2] + δ[n−3] ,   n ∈ Z
and
h^{(2)}[n] = δ[n] − δ[n−2] ,   n ∈ Z
are connected in series (cascade) to form a single ﬁlter with impulse response
h.
(i) Determine the corresponding system functions H^{(1)}(z), H^{(2)}(z) and H(z).
(ii) Determine h (i.e., h[n] for every n ∈ Z).
(iii) Sketch the output y of the cascade when the input x is given by
x[n] = δ[n + 1] , n ∈ Z
Section 4.7
P 4.18. Consider the convolution of b = b[0 : P − 1] and s = s[0 : L − 1],
where b represents a FIR coeﬃcient vector and s represents the ﬁlter input.
(i) Construct a (P + L − 1) × L matrix B such that
b ∗ s = Bs
(In eﬀect, every FIR ﬁlter can be viewed as a linear transformation of the
input signal sequence. The matrix B describes this transformation in the
case where the input signal has ﬁnite duration L. See also Problem P 2.8.)
(ii) Verify that the following MATLAB script generates the required matrix
B from a column vector b:
c = [b ; zeros(L-1,1)];
r = [b(1) zeros(1,L-1)];
B = toeplitz(c,r);
Compare B*s and conv(b,s) for random choices of s.
P 4.19. A circular buﬀer for an FIR ﬁlter of order M = 16 is initialized at
time n = 0 by placing x[0] in the ﬁrst (leftmost or topmost) position within
the buﬀer. Express the contents of the buﬀer at time n = 139 in the form
[ x[n_1 : n_2] ; x[n_3 : n_4] ]
P 4.20. Consider the following incomplete MATLAB code, where P is a known (previously defined) integer and b is a known column vector of length P.
iupdate = [2:P 1];
ireverse = toeplitz(1:P, [1 P:-1:2]);
xcirc = zeros(P,1);
inow = P;
xvec = [];
yvec = [];
while 1
x = randn(1);
xvec = [xvec ; x];
%
% computation of scalar variable y
%
yvec = [yvec ; y];
end
This code simulates the continuous operation (note the while loop) of a FIR
ﬁlter with coeﬃcient vector b, driven by a sequence xvec of Gaussian noise.
The scalar variables x and y denote the current input and output, and the
output sequence is given by yvec. Complete the code using three commands
(in the commented-out section) that contain no additional variables (i.e.,
other than previously deﬁned), numbers or loops. Test your code against
the results obtained using the function CONV.
P 4.21. Consider the FIR ﬁlter
y[n] = x[n] −2x[n−1] −x[n−2] +4x[n−3] −x[n−4] −2x[n−5] +x[n−6]
The response of the filter to the input
x[0 : 3] = [ a_0 a_1 a_2 a_3 ]^T
(x[n] = 0 otherwise) is given by
y[0 : 9] = [ 1 −1 0 −7 8 13 −20 −1 11 −4 ]^T
(y[n] = 0 otherwise); while the response to
x̃[0 : 3] = [ c_0 c_1 c_2 c_3 ]^T
(x̃[n] = 0 otherwise) is given by
ỹ[0 : 9] = [ 1 −2 −2 10 −8 −10 18 −2 −9 4 ]^T
(ỹ[n] = 0 otherwise). Determine the response of the filter to
x̂[0 : 7] = [ a_0 a_1 a_2 a_3 2c_0 2c_1 2c_2 2c_3 ]^T
(x̂[n] = 0 otherwise).
P 4.22. Consider the two vectors
b = [ 2 −3 4 −5 −1 −1 ]^T
and
s = [ 2 −3 0 7 −3 4 1 5 ]^T
(i) Determine the convolution b ∗ s using the function CONV in MATLAB.
(ii) Repeat using the functions FFT and IFFT instead of CONV.
P 4.23. Consider the following MATLAB code:
a = [ 1 -2 3 -4 ].’ ;
b = [ 1 2 -1 2 1].’ ;
A = fft(a,8);
B = fft(b,8);
C = A.*B;
c = ifft(C);
(i) Without using MATLAB, determine the vector c.
(ii) Let a = x[0 : 3], where x is a sequence of period L = 4. The signal x
is the input to a FIR ﬁlter with coeﬃcient vector b, resulting in an output
sequence y. How would you modify the code shown above so that c =
y[0 : 3]?
P 4.24. Consider the following MATLAB code:
a = [ 2 -1 0 -1 2 ].’ ;
b = [ 3 -2 5 1 -1].’ ;
A = fft(a,9);
B = fft(b,9);
C = A.*B;
c = ifft(C);
(i) Without using MATLAB, determine the vector c.
(ii) Without performing a new convolution, determine the vector e obtained
by the following code:
a = [ 2 -1 0 -1 2 ].’ ;
d = [-3 2 -5 -1 1 3 -2 5 1 -1].’ ;
A = fft(a,14);
D = fft(d,14);
E = A.*D;
e = ifft(E);
P 4.25. The data ﬁle s7.txt contains a vector s of length L = 64. Let
b = [ 2  −1  −3  1  3  1  −3  −1  2 ]^T
(i) Use the CONV function in MATLAB to compute the convolution c = b∗s.
What is the length of c?
(ii) Compute c using three convolutions of output length 2^5 = 32, followed
by a summation. Your answer should be the same as for part (i).
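Part (ii) is an instance of the overlap-add method: split s into blocks, convolve each block with b, and add the short results at the appropriate offsets. A sketch in Python/NumPy (s7.txt is not reproduced here, so a random length-64 vector stands in for s):

```python
import numpy as np

rng = np.random.default_rng(0)
b = np.array([2.0, -1.0, -3.0, 1.0, 3.0, 1.0, -3.0, -1.0, 2.0])
s = rng.integers(-5, 6, size=64).astype(float)   # stand-in for s7.txt

block = 24                                        # 24 + 9 - 1 = 32 outputs per block
c = np.zeros(len(b) + len(s) - 1)                 # total output length 72
for start in range(0, len(s), block):
    seg = np.convolve(b, s[start:start + block])  # one short convolution
    c[start:start + len(seg)] += seg              # add at the block's offset

assert np.allclose(c, np.convolve(b, s))
```

Because the block outputs overlap by len(b) − 1 samples, the summation step is essential; simple concatenation would be wrong.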
Epilogue
Our overview of signals took us from analog waveforms to discrete-time
sequences (obtained by sampling) to ﬁnite-dimensional vectors. With the
aid of basic tools and concepts from linear algebra—notably linear indepen-
dence, Gaussian elimination, least-squares approximation (projection) and
orthogonality—we developed the representation of an N-dimensional signal
vector in terms of N sinusoidal components known as the discrete Fourier
transform (DFT). The extension of vectors to sequences (by taking N →∞)
brought the continuum of frequency into play and gave rise to the discrete-
time Fourier transform (DTFT). In the second part of this book, we will
develop similar tools for the analysis of analog waveforms and examine the
implications of sampling from a frequency-domain viewpoint.
The traditional exposition of signal analysis begins with the sinusoidal
representation of periodic analog waveforms known as the Fourier series.
This representation is then extended to aperiodic waveforms (by taking a
limit as the signal period tends to inﬁnity), resulting in the so-called Fourier
transform. Sequences and vectors are treated next, and the DTFT and DFT
are developed using some of the ideas and results from Fourier series and
the Fourier transform. Our approach in this book clearly follows a diﬀerent
order.
While both the Fourier series and the Fourier transform for analog wave-
forms will be developed in the second part of this book, the motivated reader
might be interested in a preview of that development, which, as it turns out,
is quite accessible at this point. It is also possible to provide here a deﬁni-
tive answer to a question raised in Section 1.6, namely on the suﬃciency of
Nyquist-rate sampling for the faithful reconstruction of bandlimited analog
waveforms.
The Fourier Series
A continuous-time signal {s(t), t ∈ R} is periodic with period T_0 if

s(t + T_0) = s(t) ,   t ∈ R

Associated with the period T_0 are two familiar parameters, the cyclic frequency f_0 = 1/T_0 (Hz) and the angular frequency Ω_0 = 2π/T_0 (rad/sec).
The continuous-time sinusoids

cos Ω_0 t ,   sin Ω_0 t ,   and   e^{jΩ_0 t}

encountered in Chapter 1 are all periodic with period T_0. It follows that for any nonzero integer k, the sinusoids

cos kΩ_0 t ,   sin kΩ_0 t ,   and   e^{jkΩ_0 t}

are periodic with period T_0/|k|; and since T_0 is an integer multiple of T_0/|k|, they are also periodic with period T_0. These sinusoids are also referred to as harmonics of the basic sinusoid (real or complex) of frequency Ω_0.
The key result in Fourier series is that a "well-behaved" periodic signal s(t) with period T_0 can be expressed as a linear combination of the sinusoids e^{jkΩ_0 t}, all of which are periodic with period T_0. In other words,

s(t) = ∑_{k=−∞}^{∞} S_k e^{jkΩ_0 t} ,   t ∈ R

The coefficients S_k can be obtained from s(t) by a suitable integral over a time interval of length equal to one period, i.e., T_0. Together with the corresponding frequencies kΩ_0, these coefficients yield a discrete (line) spectrum {(kΩ_0, S_k), k ∈ Z} for the periodic signal s(t), as shown in Figure E.1. Note that the frequency Ω is in radians per second (as opposed to per sample), and that there is no periodicity in the spectrum (as was the case with discrete-time sequences).
As it turns out, this result can be established by considering the usual DTFT analysis and synthesis formulas for discrete-time signals. We know that the spectrum of a time-domain sequence is a periodic function of the frequency ω (measured in radians per sample), with period 2π. Thus given a spectrum X(e^{jω}), the time-domain sequence {x[n], n ∈ Z} can be recovered from X(e^{jω}) using the integral (synthesis equation)

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω ;
while X(e^{jω}) can be obtained from {x[n], n ∈ Z} using the sum (analysis equation)

X(e^{jω}) = ∑_{n=−∞}^{∞} x[n] e^{−jωn}
The trick here is to associate the continuous frequency parameter ω with continuous time. To that end, we define

t := ω/Ω_0

(which has the correct units, namely seconds) and use the periodic frequency-domain signal X(e^{jω}) to create a periodic time-domain signal s(t):

s(t) := X(e^{jΩ_0 t}) = X(e^{jω})
Since X(e^{jω}) is an arbitrary signal of period 2π (in ω), s(t) is an arbitrary signal of period 2π/Ω_0 = T_0 (in t). From the synthesis equation, we see that s(t) can be associated with a sequence {S_k, k ∈ Z} defined by

S_k := x[−k]
and such that

S_k = (1/2π) ∫_0^{2π} X(e^{jω}) e^{−jωk} dω
    = (Ω_0/2π) ∫_0^{2π/Ω_0} s(t) e^{−jkΩ_0 t} dt
    = (1/T_0) ∫_0^{T_0} s(t) e^{−jkΩ_0 t} dt
From the analysis equation, we then obtain

s(t) = X(e^{jΩ_0 t}) = ∑_{k=−∞}^{∞} x[k] e^{−jkΩ_0 t} = ∑_{k=−∞}^{∞} S_k e^{jkΩ_0 t}
We have therefore established that the periodic signal s(t) can be expressed as a Fourier series, i.e., a linear combination of complex sinusoids whose frequencies are multiples of Ω_0. (A Fourier series can also be regarded as a power series in e^{jΩ_0 t}.) The k-th coefficient S_k is obtained by integrating, over one period, the product of s(t) and the complex conjugate of the k-th sinusoid, and dividing by the period T_0.
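The analysis and synthesis formulas above can be checked numerically. The sketch below (Python/NumPy; the periodic test signal is an arbitrary choice, not from the text) discretizes the coefficient integral as a mean over one period and then reconstructs s(t) from a truncated synthesis sum:

```python
import numpy as np

T0 = 2.0                                  # period
W0 = 2 * np.pi / T0                       # Omega_0 = 2*pi/T0
t = np.linspace(0, T0, 4096, endpoint=False)
s = np.cos(W0 * t) + 0.5 * np.sin(3 * W0 * t)   # a simple periodic signal

def S(k):
    # S_k = (1/T0) * integral over one period of s(t) exp(-j k W0 t) dt,
    # discretized as a mean over uniformly spaced points
    return np.mean(s * np.exp(-1j * k * W0 * t))

# cos(W0 t) contributes S_{+-1} = 1/2; 0.5 sin(3 W0 t) gives S_3 = -j/4
assert abs(S(1) - 0.5) < 1e-9
assert abs(S(3) - (-0.25j)) < 1e-9

# the truncated synthesis sum over k = -5..5 recovers s(t)
recon = sum(S(k) * np.exp(1j * k * W0 * t) for k in range(-5, 6))
assert np.allclose(recon.real, s)
```

For a trigonometric polynomial such as this test signal, the discretized integral is exact up to round-off, which is why the tolerances can be so tight.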
In using the DTFT to derive the Fourier series, we exploited the duality
of time and frequency, which is a central theme in Fourier analysis. Duality
was encountered earlier in the discussion of the DFT, where it was shown
that the time and frequency domains had the same dimension (namely vector
size N), and that the DFT and IDFT operations were essentially the same
(with the exception of a scaling factor and a complex conjugate). In the
case of the DTFT, the two domains are quite diﬀerent: time n is a discrete
parameter taking values in Z, while frequency ω is a continuous parameter
taking values in an interval of length 2π. Also, the time-domain signal is a sequence, while the frequency-domain signal is a continuous-parameter periodic signal. In effect, the Fourier series is the dual of the DTFT where
the two domains are interchanged: the time-domain signal is a continuous-
parameter periodic signal, while the frequency-domain signal is a discrete
sequence of coeﬃcients (i.e., a line spectrum). Both the Fourier series and
the DTFT use (essentially) the same sum and integral for transitioning
between the two domains (discrete-to-continuous and continuous-to-discrete,
respectively). Figure E.1 is a graphical illustration of the duality of the
Fourier series and the DTFT.
The Fourier Transform
The Fourier transform is a frequency-domain representation for certain classes of aperiodic continuous-time signals. These signals are typically either absolutely integrable or square-integrable, i.e., satisfy

∫_{−∞}^{∞} |s(t)| dt < ∞   or   ∫_{−∞}^{∞} |s(t)|² dt < ∞

(or both). Either type of integrability implies that the signal s(t) decays sufficiently fast as t → ∞ or t → −∞; and that if s(t) is locally unbounded, it cannot approach infinity too fast.
Consider, for example, the three signals defined, for all t ∈ R, by

e^{−|t|} ,   e^{t} ,   and   cos t + cos πt
The ﬁrst signal satisﬁes both integrability conditions and has a Fourier trans-
form, while the second signal violates both integrability conditions and does
not have a Fourier transform. The third signal (which is aperiodic) violates
both integrability conditions and does not have a Fourier transform in the
conventional sense; yet it has an obvious frequency-domain representation
in terms of a line spectrum (at frequencies Ω = ±1 and Ω = ±π rad/sec).
Figure E.1: A periodic waveform s(t) (top left) and its Fourier series coefficients S_k (top right); a sample sequence x[n] (bottom left) and its discrete-time Fourier transform X(e^{jω}) (bottom right). If the continuous-parameter graphs are scaled versions of each other, then the discrete-parameter graphs are also scaled and time-reversed versions of each other (duality of the Fourier series and the DTFT).
It is possible to obtain the Fourier transform of an aperiodic signal {s(t), t ∈ R} by considering the Fourier series of the periodic extension of the time-truncated signal {s(t), −T/2 < t < T/2} and then taking a suitable limit as T → ∞. Here we will follow a different approach based on the DTFT of the sampled signal

s_Δ[n] := s(nΔ) ,   n ∈ Z

(where Δ > 0 is the sampling period). The idea behind the proof is as follows: if s_Δ[·] has a frequency-domain representation which converges, as Δ → 0, to a single "spectrum-like" function, then that function must be valid as a frequency-domain representation for the continuous-time signal {s(t), t ∈ R} also, since that signal can be asymptotically obtained in this fashion, i.e., by reducing the sampling period to zero.
To simplify the analysis, we will let Δ → 0 by taking

Δ = T, T/2, T/4, . . . , T/2^r, . . .
in turn, where T > 0 is fixed. This ensures that the sampling instants for different values of Δ are nested; in particular, any time t = (m2^{−r})T, where r ∈ N and m ∈ Z, is a sampling instant for all sufficiently small Δ. By varying m and r, we obtain a dense set of points t on the time axis, which are sufficient for the specification of the entire signal s(·).
If t is such a point, then t/Δ is an integer for sufficiently small Δ, and thus t is the (t/Δ)-th sampling instant for the sequence s_Δ[·]. The synthesis equation (inverse DTFT) gives
s(t) = s_Δ[t/Δ]
     = (1/2π) ∫_{−π}^{π} S_Δ(e^{jω}) e^{jω(t/Δ)} dω
     = (1/2π) ∫_{−π/Δ}^{π/Δ} { Δ S_Δ(e^{jΩΔ}) } e^{jΩt} dΩ

where the change of variables Ω = ω/Δ was used for the last integral. Note that Ω is now measured in radians per second, which is appropriate for a continuous-time signal.
Viewed as a function of Ω, the expression ΔS_Δ(e^{jΩΔ}) is periodic with period 2π/Δ. As we will soon see, this expression converges (as Δ → 0) for each Ω ∈ R, leading to a new function

S(Ω) := lim_{Δ→0} ΔS_Δ(e^{jΩΔ}) ,   Ω ∈ R
The function S(Ω) cannot be periodic. Assuming that it decays sufficiently fast as Ω → ∞ and Ω → −∞ (which it does, by virtue of the integrability conditions on s(·)), the last integral will converge, as Δ → 0, to

(1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ

Thus also

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ
Remark. It is interesting to note that the equivalent expression

s(t) = (1/2π) ∫_0^{2π/Δ} ΔS_Δ(e^{jΩΔ}) e^{jΩt} dΩ

will, upon replacing ΔS_Δ(e^{jΩΔ}) by S(Ω) and 2π/Δ by its limiting value ∞, result in the incorrect integral

(1/2π) ∫_0^{∞} S(Ω) e^{jΩt} dΩ
for s(t). This paradox can be resolved by observing that, no matter how small Δ is, the integral for s(t) is taken over an entire period of the function ΔS_Δ(e^{jΩΔ}). As Δ → 0, the period 2π/Δ approaches infinity, and the limiting function S(Ω) is aperiodic; in other words, a period of S(Ω) takes up the entire real axis. Thus the correct limits for the integral should be −∞ to ∞.
To show that ΔS_Δ(e^{jΩΔ}) indeed converges, we use the analysis equation:

ΔS_Δ(e^{jΩΔ}) = Δ ∑_{n=−∞}^{∞} s_Δ[n] e^{−jnΩΔ} = ∑_{n=−∞}^{∞} s(nΔ) e^{−jnΩΔ} Δ

The last sum consists of equally spaced samples (over the entire time axis) of the function s(t)e^{−jΩt}, each multiplied by the spacing, or sampling period, Δ. As Δ → 0, the sum converges to an integral, and thus

S(Ω) = lim_{Δ→0} ΔS_Δ(e^{jΩΔ}) = ∫_{−∞}^{∞} s(t) e^{−jΩt} dt
The function S(Ω) is the Fourier transform, or spectrum, of the continuous-time signal s(t). Here, Ω (in rad/sec) is a continuous frequency parameter ranging from −∞ to ∞. Similarly, s(t) is the inverse Fourier transform of S(Ω). Note that in this case, the two domains, time and frequency, have the same dimension (that of the continuum). This gives rise to interesting duality properties, which are apparent from the similarities between the analysis and synthesis equations:

S(Ω) = ∫_{−∞}^{∞} s(t) e^{−jΩt} dt

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ
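The limiting argument can be checked numerically. For s(t) = e^{−|t|} the Fourier transform is S(Ω) = 2/(1 + Ω²), and the truncated sum Δ ∑ s(nΔ)e^{−jnΩΔ} should approach this value as Δ shrinks. A sketch in Python/NumPy (truncating the infinite sum is harmless here because s decays exponentially):

```python
import numpy as np

def dtft_sum(Omega, Delta, N=100000):
    # Delta * sum_{n=-N}^{N} s(n Delta) exp(-j n Omega Delta)
    # for s(t) = exp(-|t|)
    n = np.arange(-N, N + 1)
    return Delta * np.sum(np.exp(-np.abs(n * Delta))
                          * np.exp(-1j * n * Omega * Delta))

Omega = 1.5
exact = 2.0 / (1.0 + Omega ** 2)       # Fourier transform of exp(-|t|)

errors = [abs(dtft_sum(Omega, D) - exact) for D in (0.5, 0.25, 0.125)]
assert errors[0] > errors[1] > errors[2]   # error shrinks as Delta -> 0
```

The residual error at each Δ is exactly the aliasing contribution ∑_{k≠0} S(Ω + 2πk/Δ), which vanishes as the period 2π/Δ grows.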
Nyquist Sampling
In Section 1.6, we saw that an analog signal made up of a continuum of sinusoidal components with frequencies in the range [0, f_B) (Hz) cannot be sampled at a rate less than 2f_B without loss of information due to aliasing. We will show how sampling at a rate greater than, or equal to, 2f_B (samples/second) allows for perfect reconstruction of the continuous-time signal.
Let us assume then that the continuous-time signal {s(t), t ∈ R} has a Fourier transform S(Ω) which vanishes outside the interval (−Ω_B, Ω_B) (where Ω_B = 2πf_B). We thus have

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ = (1/2π) ∫_{−Ω_B}^{Ω_B} S(Ω) e^{jΩt} dΩ   (E.1)
Sampling at the Nyquist rate yields a sequence of samples at times t = nT_s, where

T_s := 1/(2f_B) = π/Ω_B
From (E.1), we obtain an expression for each of these samples:

s(nT_s) = (1/2π) ∫_{−Ω_B}^{Ω_B} S(Ω) e^{jnπ(Ω/Ω_B)} dΩ   (E.2)
Figure E.2: Shown in bold is the Fourier transform (spectrum) S(Ω) of a bandlimited continuous-time signal.
The signal S(Ω) has finite duration in the frequency domain. As such, it can be periodically extended outside the interval (−Ω_B, Ω_B) (see Figure E.2) and the resulting periodic extension can be expressed in terms of a Fourier series in Ω. Clearly, the Fourier series will return S(Ω) for Ω in the interval (−Ω_B, Ω_B). Using the Fourier series formulas developed earlier (with the period parameter equal to 2Ω_B), we have

S(Ω) = ∑_{n=−∞}^{∞} A_n e^{jnπ(Ω/Ω_B)}   (E.3)
where

A_n = (1/2Ω_B) ∫_{−Ω_B}^{Ω_B} S(Ω) e^{−jnπ(Ω/Ω_B)} dΩ   (E.4)
Comparing (E.2) and (E.4), we immediately obtain

A_{−n} = (π/Ω_B) s(nT_s) = T_s s(nT_s)   (E.5)
Thus the sequence of samples of s(·) obtained at the Nyquist rate 2f_B completely determines the spectrum S(Ω) through its Fourier series expansion (E.3). Since S(Ω) uniquely determines s(t) (via the synthesis equation (E.1)), it follows that sampling at the Nyquist rate is sufficient for the faithful reconstruction of the bandlimited signal s(t).
As it turns out, we can obtain an explicit formula for s(t) at any time t in terms of the samples {s(nT_s)} by combining (E.1), (E.3) and (E.5) as follows:

s(t) = (1/2π) ∫_{−Ω_B}^{Ω_B} [ ∑_{n=−∞}^{∞} A_{−n} e^{−jnπ(Ω/Ω_B)} ] e^{jΩt} dΩ
     = (1/2Ω_B) ∑_{n=−∞}^{∞} s(nT_s) ∫_{−Ω_B}^{Ω_B} e^{jΩ(t−nT_s)} dΩ
The inner integral equals

[ e^{jΩ_B(t−nT_s)} − e^{−jΩ_B(t−nT_s)} ] / [ j(t − nT_s) ] = 2 sin(Ω_B(t − nT_s)) / (t − nT_s)
and thus the final expression for s(t) is

s(t) = ∑_{n=−∞}^{∞} s(nT_s) · sin(Ω_B(t − nT_s)) / (Ω_B(t − nT_s))
Clearly, s(t) is a weighted sum of all samples s(nT_s), with coefficients provided by the interpolation function

I_{Ω_B}(τ) := sin(Ω_B τ) / (Ω_B τ)

evaluated at τ = t − nT_s. Writing

s(t) = ∑_{n=−∞}^{∞} s(nT_s) I_{Ω_B}(t − nT_s)

we can also see that s(·) is a linear combination of the signals I_{Ω_B}(· − nT_s), which are time-delayed versions of I_{Ω_B}(·) centered at the sampling instants nT_s.
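The interpolation formula above can be verified numerically for a bandlimited test signal. A sketch in Python/NumPy (the signal below is an arbitrary choice within the band, and the doubly infinite sum is truncated, so agreement is only up to truncation error):

```python
import numpy as np

fB = 4.0                          # band edge (Hz); spectrum vanishes beyond it
Ts = 1.0 / (2.0 * fB)             # Nyquist-rate sampling period
OmegaB = np.pi / Ts               # Omega_B = 2*pi*fB

def s(t):
    # components at 1 Hz and 3 Hz, both inside the band (-fB, fB)
    return np.cos(2 * np.pi * t + 0.3) + 0.5 * np.sin(6 * np.pi * t)

n = np.arange(-2000, 2001)        # truncation of the infinite sample train
samples = s(n * Ts)

def reconstruct(t):
    tau = t - n * Ts
    # np.sinc(x) = sin(pi x)/(pi x), so sin(OmegaB tau)/(OmegaB tau)
    # equals np.sinc(OmegaB * tau / np.pi)
    return np.sum(samples * np.sinc(OmegaB * tau / np.pi))

for t in (0.123, 0.777, 1.456):
    assert abs(reconstruct(t) - s(t)) < 1e-2   # truncation error only
```

With more samples retained on each side, the reconstruction error at these interior points shrinks further, consistent with the formula being exact for the full infinite sum.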
Contents
1 Numbers, Waveforms and Signals 1.1 Real Numbers in Theory and Practice . . . . . . . . . . . . . 1.1.1 Fixed-Point Representation of Real Numbers . . . . . 1.1.2 Floating-Point Representation . . . . . . . . . . . . . . 1.1.3 Visualization of the Fixed and Floating Point Representations . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.4 Finite-Precision Representations of Real Numbers . . 1.1.5 The IEEE Standard for Floating-Point Arithmetic . . 1.2 Round-oﬀ Errors . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 Error Deﬁnitions . . . . . . . . . . . . . . . . . . . . . 1.2.2 Round-Oﬀ . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.3 Round-Oﬀ Errors in Finite-Precision Computation . . 1.3 Review of Complex Numbers . . . . . . . . . . . . . . . . . . 1.3.1 Complex Numbers as Two-Dimensional Vectors . . . 1.3.2 The Imaginary Unit . . . . . . . . . . . . . . . . . . . 1.3.3 Complex Multiplication and its Interpretation . . . . . 1.3.4 The Complex Exponential . . . . . . . . . . . . . . . . 1.3.5 Inverses and Division . . . . . . . . . . . . . . . . . . . 1.4 Sinusoids in Continuous Time . . . . . . . . . . . . . . . . . . 1.4.1 The Basic Cosine and Sine Functions . . . . . . . . . . 1.4.2 The General Real-Valued Sinusoid . . . . . . . . . . . 1.4.3 Complex-Valued Sinusoids and Phasors . . . . . . . . 1.5 Sinusoids in Discrete Time . . . . . . . . . . . . . . . . . . . . 1.5.1 Discrete-Time Signals . . . . . . . . . . . . . . . . . . 1.5.2 Deﬁnition of the Discrete-Time Sinusoid . . . . . . . . 1.5.3 Equivalent Frequencies for Discrete-Time Sinusoids . . 1.5.4 Periodicity of Discrete-Time Sinusoids . . . . . . . . . 1.6 Sampling of Continuous-Time Sinusoids . . . . . . . . . . . . 1.6.1 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . ii 1 2 2 3 4 5 6 8 8 8 10 13 13 15 15 16 18 20 20 21 23 27 27 27 29 32 34 34

iii 1.6.2 Discrete-Time Sinusoids Obtained by Sampling 1.6.3 The Concept of Alias . . . . . . . . . . . . . . 1.6.4 The Nyquist Rate . . . . . . . . . . . . . . . . Problems for Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 36 39 42

2 Matrices and Systems 49 2.1 Introduction to Matrix Algebra . . . . . . . . . . . . . . . . . 50 2.1.1 Matrices and Vectors . . . . . . . . . . . . . . . . . . . 50 2.1.2 Signals as Vectors . . . . . . . . . . . . . . . . . . . . 52 2.1.3 Linear Transformations . . . . . . . . . . . . . . . . . 54 2.2 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . 56 2.2.1 The Matrix of a Linear Transformation . . . . . . . . 56 2.2.2 The Matrix-Vector Product . . . . . . . . . . . . . . . 57 2.2.3 The Matrix-Matrix Product . . . . . . . . . . . . . . . 58 2.2.4 Associativity and Commutativity . . . . . . . . . . . . 59 2.3 More on Matrix Algebra . . . . . . . . . . . . . . . . . . . . . 62 2.3.1 Column Selection and Permutation . . . . . . . . . . . 62 2.3.2 Matrix Transpose . . . . . . . . . . . . . . . . . . . . . 63 2.3.3 Row Selection . . . . . . . . . . . . . . . . . . . . . . . 65 2.4 Matrix Inversion and Linear Independence . . . . . . . . . . . 66 2.4.1 Inverse Systems and Signal Representations . . . . . . 66 2.4.2 Range of a Matrix . . . . . . . . . . . . . . . . . . . . 67 2.4.3 Geometrical Interpretation of the Range . . . . . . . . 68 2.5 Inverse of a Square Matrix . . . . . . . . . . . . . . . . . . . . 72 2.5.1 Nonsingular Matrices . . . . . . . . . . . . . . . . . . 72 2.5.2 Inverse of a Nonsingular Matrix . . . . . . . . . . . . . 73 2.5.3 Inverses in Matrix Algebra . . . . . . . . . . . . . . . 75 2.5.4 Nonsingularity of Triangular Matrices . . . . . . . . . 76 2.5.5 The Inversion Problem for Matrices of Arbitrary Size 77 2.6 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . 79 2.6.1 Statement of the Problem . . . . . . . . . . . . . . . . 79 2.6.2 Outline of Gaussian Elimination . . . . . . . . . . . . 79 2.6.3 Gaussian Elimination Illustrated . . . . . . . . . . . . 81 2.6.4 Row Operations in Forward Elimination . . . . . . . . 83 2.7 Factorization of Matrices in Gaussian Elimination . . . . . . 84 2.7.1 LU Factorization . . 
. . . . . . . . . . . . . . . . . . 84 2.7.2 The Matrix L . . . . . . . . . . . . . . . . . . . . . . . 85 2.7.3 Applications of the LU Factorization . . . . . . . . . . 88 2.7.4 Row Operations in Backward Substitution . . . . . . . 89 2.8 Pivoting in Gaussian Elimination . . . . . . . . . . . . . . . . 92

iv 2.8.1 Pivots and Pivot Rows . . . . . . . . . . . . . . . . 2.8.2 Zero Pivots . . . . . . . . . . . . . . . . . . . . . . 2.8.3 Small Pivots . . . . . . . . . . . . . . . . . . . . . 2.8.4 Row Pivoting . . . . . . . . . . . . . . . . . . . . . 2.9 Further Topics in Gaussian Elimination . . . . . . . . . . 2.9.1 Computation of the Matrix Inverse . . . . . . . . . 2.9.2 Ill-Conditioned Problems . . . . . . . . . . . . . . 2.10 Inner Products, Projections and Signal Approximation . . 2.10.1 Inner Products and Norms . . . . . . . . . . . . . 2.10.2 Angles, Projections and Orthogonality . . . . . . . 2.10.3 Formulation of the Signal Approximation Problem 2.10.4 Projection and the Signal Approximation Problem 2.10.5 Solution of the Signal Approximation Problem . . 2.11 Least-Squares Approximation in Practice . . . . . . . . . 2.12 Orthogonality and Least-Squares Approximation . . . . . 2.12.1 A Simpler Least-Squares Problem . . . . . . . . . 2.12.2 Generating Orthogonal Reference Vectors . . . . . 2.13 Complex-Valued Matrices and Vectors . . . . . . . . . . . 2.13.1 Similarities to the Real-Valued Case . . . . . . . . 2.13.2 Inner Products of Complex-Valued Vectors . . . . 2.13.3 The Least-Squares Problem in the Complex Case . Problems for Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 92 93 95 98 98 100 102 102 103 105 106 108 111 116 116 117 122 122 123 126 128

3 The Discrete Fourier Transform 144 3.1 Complex Sinusoids as Reference Vectors . . . . . . . . . . . . 145 3.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 145 3.1.2 Fourier Sinusoids . . . . . . . . . . . . . . . . . . . . . 146 3.1.3 Orthogonality of Fourier Sinusoids . . . . . . . . . . . 148 3.2 The Discrete Fourier Transform and its Inverse . . . . . . . . 153 3.2.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . 153 3.2.2 Symmetry Properties of the DFT and IDFT Matrices 155 3.2.3 Signal Structure and the Discrete Fourier Transform . 158 3.3 Structural Properties of the DFT: Part I . . . . . . . . . . . . 163 3.3.1 The Spectrum of a Real-Valued Signal . . . . . . . . . 163 3.3.2 Signal Permutations . . . . . . . . . . . . . . . . . . . 166 3.4 Structural Properties of the DFT: Part II . . . . . . . . . . . 170 3.4.1 Permutations of DFT and IDFT Matrices . . . . . . . 170 3.4.2 Summary of Identities . . . . . . . . . . . . . . . . . . 171 3.4.3 Main Structural Properties of the DFT . . . . . . . . 172 3.5 Structural Properties of the DFT: Examples . . . . . . . . . . 176

v 3.5.1 Miscellaneous Signal Transformations . . . . . . . . 3.5.2 Exploring Symmetries . . . . . . . . . . . . . . . . . 3.6 Multiplication and Circular Convolution . . . . . . . . . . . 3.6.1 Duality of Multiplication and Circular Convolution . 3.6.2 Computation of Circular Convolution . . . . . . . . 3.7 Periodic and Zero-Padded Extensions of a Signal . . . . . . 3.7.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . 3.7.2 Periodic Extension to an Integer Number of Periods 3.7.3 Zero-Padding to a Multiple of the Signal Length . . 3.8 Detection of Sinusoids Using the Discrete Fourier Transform 3.8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . 3.8.2 The DFT of a Complex Sinusoid . . . . . . . . . . . 3.8.3 Detection of a Real-Valued Sinusoid . . . . . . . . . 3.8.4 Final Remarks . . . . . . . . . . . . . . . . . . . . . Problems for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 177 182 182 185 187 187 188 191 192 192 193 197 199 201

4 Introduction to Linear Filtering 215 4.1 The Discrete-Time Fourier Transform . . . . . . . . . . . . . 216 4.1.1 From Vectors to Sequences . . . . . . . . . . . . . . . 216 4.1.2 Development of the Discrete-Time Fourier Transform 217 4.2 Computation of the Discrete-Time Fourier Transform . . . . 220 4.2.1 Signals of Finite Duration . . . . . . . . . . . . . . . . 220 4.2.2 The DFT as a Sampled DTFT . . . . . . . . . . . . . 223 4.2.3 Signals of Inﬁnite Duration . . . . . . . . . . . . . . . 225 4.3 Introduction to Linear Filters . . . . . . . . . . . . . . . . . . 229 4.3.1 Linearity, Time-Invariance and Causality . . . . . . . 229 4.3.2 Sinusoidal Inputs and Frequency Response . . . . . . 231 4.3.3 Amplitude and Phase Response . . . . . . . . . . . . . 233 4.4 Linear Filtering in the Frequency Domain . . . . . . . . . . . 236 4.4.1 System Function and its Relationship to Frequency Response . . . . . . . . . . . . . . . . . . . . . . . . . 236 4.4.2 Cascaded Filters . . . . . . . . . . . . . . . . . . . . . 238 4.4.3 The General Finite Impulse Response Filter . . . . . . 239 4.4.4 Frequency Domain Approach to Computing Filter Response . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 4.4.5 Response to a Periodic Input . . . . . . . . . . . . . . 243 4.5 Frequency-Selective Filters . . . . . . . . . . . . . . . . . . . . 247 4.5.1 Main Filter Types . . . . . . . . . . . . . . . . . . . . 247 4.5.2 Ideal Filters . . . . . . . . . . . . . . . . . . . . . . . . 248 4.5.3 Amplitude Response of Practical Filters . . . . . . . . 249

. . . . . . . . . . . . . . . . . . . . . . . . . . .7 Convolution in Practice . . . . . . . . . . . . . . . 4.3 Circular Buﬀers for Real-Time FIR Filters . . . . . . . . . 4. . . . . . . . . 4. 4.6 Time Domain Computation of Filter Response . .4 Practical FIR Filters . . . . . 4. . . . . . .4 Block Convolution .4 Convolution and Polynomial Multiplication . . . . . . . . . . . . .5 Impulse Response of Cascaded Filters . . . . .7. . . . . . . . . . . . . . . . . . . . . . . . . .5. . . .6. . . . . . . . . . . . . . . . . .2 Linear Convolution versus Circular Convolution . .vi 4. . . . .7. . . . .5. . . . . . . . . . . . . 4. . . . . . . . . 290 . . 4. . .7. . . . . . .6. . 251 254 256 256 258 259 262 264 266 266 267 270 271 274 Epilogue 284 The Fourier Series . . . . . . . .5 Filter Transformations . . . . . . . . . . 287 Nyquist Sampling . 4. . . . . . . . . . . . . . . . . . 4. 4. . . . .1 Linear Convolution of Two Vectors .7. . Problems for Chapter 4 . . . . . . . . . . . . . . . . . .2 Convolution and Linearity . . . 4. . . . . . .6. . . . . . . . . . .1 Impulse Response . . . .6. . . . . . . . 285 The Fourier Transform . . . 4.6. .3 Computation of the Convolution Sum . .

Waveforms and Signals 1 .Chapter 1 Numbers.

or √ • irrational . . 1. the (ﬁxed) fractional point.. For example (B = 10): √ π = 3. The fractional part is. the base B = 10 is used. All digits (in both the integer and fractional parts of x) are integers ranging from 0 to B − 1. . non-terminating expansions are the rule rather than the exception. 2. in general.2: Fixed-point representation of a real number. i. • in the decimal representation of x. . A real number can be • rational. the base B = 2 is used. and the possible digits are 0. • in the binary representation of x. each point representing a diﬀerent number. .1: The real line. +/- Integer Part . which is a positive integer. For example. inclusive.2 1. examples of irrational numbers include π. . The representation uses a base (or radix ) B. 2 = 1. .e. The sign of x (also denoted by sgn(x)) is followed by the integer part of x.1. Fractional Part Figure 1. and the possible digits are 0 and 1.4142136 . .1 Fixed-Point Representation of Real Numbers This is the most natural representation of a real number x. etc. and the fractional part of x. x R 1. . 9. Fractional part may consist of inﬁnitely many digits. . The integer part of x is always a ﬁnite string of digits. in other words. a ratio m/n of two integers.1 Real Numbers in Theory and Practice The real line R consists of a continuum of points. where (n = 0). only. an inﬁnite string. 0 Figure 1.1415926 .

08 × 103 = 0.11)2 = (1101.7499999 . Note that the ﬂoating-point representation of x is not unique. Clearly.d1 d2 d3 .1. . terminating expansions can be expressed in two equivalent non-terminating forms. where l is integer). x = 80 can be written as 0. .75 = (1101. . The integer factor B E is also known as the characteristic of the ﬂoatingpoint representation. if the binary representation of x has a terminating fractional part.7500000 .11)2 and thus 13.g. = 13.008 × 104 = .75 = 1 × 2−1 + 1 × 2−2 = (0. Thus.75 = 13. 2−l can be expressed as 5l × 10−l .)2 where both non-terminating expansions are shown. The converse is not. . Interestingly.. decimal (B = 10) representations are most common. however. is known as the mantissa of x. 13 = 1 × 23 + 1 × 22 + 0 × 21 + 1 × 20 = (1101)2 0.e. For example.8 × 102 = 0. and • E is the exponent of the base B. .) × B E where: • the fraction 0.1100000 . . For example (B = 10): 13.6 does not have a terminating binary expansion. 1. e. respectively. it equals B −l .2 Floating-Point Representation A ﬂoating-point representation of x with respect to base B is usually speciﬁed as sgn(x) × (0.. . .d1 d2 d3 .3 The rational part of x has a terminating expansion if and only if x is an exact multiple of a negative power of B (i.)2 = (1101. For example. . . true. .1011111 . so does the decimal one. . Binary forms (B = 2) are equally (if not more) useful in machine computations. the decimal fraction 0. . Conversion to binary form involves expressing the integer and fractional parts of x in terms of nonnegative and negative powers of 2. . . Note that for a positive integer l.

1375 × 10−1 Note that in the binary (B = 2) case. the integer part of x corresponds to a unit-length interval on the real line. If. Thus the normalized ﬂoating-point representation of x = 80 is x = 0.e. and such intervals form a partition of the real line as shown in Figure 1. we keep the sign and integer part of x ﬁxed and vary the fractional part.31415926 . B E ]. the range of 0. Since d1 ≥ 1. −4 −3 −2 −1 0 1 2 3 4 Figure 1. 0 1/2 B=2 1 0 1/10 B=10 1 Figure 1. Thus positive numbers with the same exponent E form the interval [B E−1 . In the normalized ﬂoating-point representation of x.. This means that varying E results in a partition of the positive .4 The ﬂoating point representation can be made unique by normalization. requiring that d1 be nonzero.d1 d2 d3 . .01375 = 0.1375 × 102 = (0.4.8 × 102 . π = (0. shown in bold in Figure 1. In other words.3. keeping both the sign and exponent E of x ﬁxed and varying the mantissa also results in x ranging over an interval of values. . is the interval [1/B.4: Shown in bold are the ranges of values of the normalized mantissa of binary (left) and decimal (right) numbers. i. . 1].) × 101 13.1. normalization forces the ﬁrst fractional digit d1 to be equal to 1. the value of x will range between two consecutive integers. Similarly. .110111)2 × 24 0.3 Visualization of the Fixed and Floating Point Representations Assume a ﬁxed base B.3: Partition of the real line into intervals of unit length.75 = 0. 1. in the ﬁxed-point representation of x.

) The 2Ke possible exponents are consecutive integers. Note that x = 0 is the only number that does not have a normalized representation (all the di ’s are zero).1. the same holds for the negative real line. These values will be collectively termed as the precise numbers. The binary case is illustrated in Figure 1. it can lead to large relative errors when small values are involved.5 E=−1 E=0 E=1 E=2 0 . a precise number would be an exact multiple of A · 2−K ranging in value from −(A/2) + A · 2−K to A/2. 2E ). Fixed-point computation involves. is generally more accurate and preferable for most applications. The K precise numbers are uniformly spaced on an interval of length A. 1. integer arithmetic. Within each such interval. Although ﬁxed-point computation is relatively simple and can be attractive for low-power applications.4 Finite-Precision Representations of Real Numbers Machine computations require ﬁnite storage and ﬁnite processing time.25 . Precise numbers in ﬂoating-point computation are determined by assigning Ke bits to the exponent and Km bits to the mantissa. In a 2 typical application. Floating-point computation.5: Partition of the real line according to the exponent E in the binary ﬂoating-point representation. This approximation is known as roundoﬀ (more on that later).5. every real number is approximated by a precise number. real line into intervals of diﬀerent length—the ratio of interval lengths being a power of B. Clearly.5 1 2 4 Figure 1. In the course of a numerical computation. If K bits are allocated to storing a real variable. there are 2Km uniformly spaced precise numbers. then that variable can take at most 2K values. essentially. this places the positive precise numbers in Ke adjacent intervals of the form [2E−1 . where 1 + Ke + Km = K (One bit is required for encoding the sign of the number. each corresponding to a . on the other hand.

.1. An inﬁnite value (positive or negative) is encoded using an allzeros mantissa. d53 ) × 2−1021 = sgn(a) × (0. then a has a denormalized ﬂoating-point representation: a = sgn(a) × (0. . .d2 . The precise numbers and their encoding follow the IEEE 754 Standard. . 1 for negative ones. . d53 ) × 2C−1022 • If C = 0.6 diﬀerent value of the mantissa. .1d2 . etc. d53 ) × 2−1022 Note that d2 = . The precise number a is then obtained as follows: • If 1 ≤ C ≤ 2046. = d53 = 0 gives a = 0. .g. d53 are read directly oﬀ the mantissa codeword in the order shown in Figure 1.). . K = 64 bits are used for each precise number.1d2 . d53 . the exponent codeword ranges in value from C = 0 to C = 211 − 1 = 2047. where d2 . .0d2 . . the value of a is either inﬁnite (Inf) or indeterminate (NaN). a = sgn(a) × (0. Any other mantissa word denotes an indeterminate value (such values arise from expressions such as 0/0.. • If C = 2047. . Details of an actual implementation (e. . . . Read as a binary integer with the most signiﬁcant digit on the left (as usual).6. The value of the sign bit is 0 for positive numbers.6: Binary encoding of a double-precision ﬂoating-point number. then a has a normalized ﬂoating-point representation with exponent E = C −1022 and mantissa given by 0. The bits are arranged as shown in Figure 1.5 The IEEE Standard for Floating-Point Arithmetic MATLAB uses what is known as double-precision variables in its ﬂoatingpoint computations. . in MATLAB) are described below.6: Sign (1 bit) Exponent Codeword (11 bits) d2 Mantissa Codeword (52 bits) d53 Figure 1. with Ke = 11 bits allocated to the exponent and Km = 52 bits allocated to the mantissa. 1. ∞ − ∞.

The effective range of exponents in the normalized floating-point representation is E = -1021 to E = 1024. As discussed earlier, each exponent E corresponds to a range of 2^52 precise numbers uniformly spaced in [2^(E-1), 2^E) (on the positive real line). Denormalization also allows 2^52 uniformly spaced precise numbers in the interval [0, 2^(-1022)).

MATLAB can display the binary string corresponding to any floating-point number (after approximating it by the nearest precise number). For brevity, the hexadecimal form is used (0000 through 1111 are encoded as 0 through F). The command format hex causes all subsequent numerical values to be shown in that form. Conversely, the command

hex2num('string')

where string consists of 16 hex digits (= 64 bits), will output the precise number corresponding to the input string. Thus:

- hex2num('0000000000000001') gives the smallest positive number, namely 2^(-1074), which is approximately equal to 4.94 x 10^(-324).

- hex2num('7FEFFFFFFFFFFFFF') gives the largest positive number, namely 2^1024 - 2^971, which is approximately equal to 1.80 x 10^308.
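The two MATLAB calls above can be mimicked in Python (an illustrative aside; the `hex2num` stand-in below is our own):

```python
import struct
import sys

def hex2num(s: str) -> float:
    """Python stand-in for MATLAB's hex2num: 16 hex digits -> a double."""
    return struct.unpack('>d', bytes.fromhex(s))[0]

print(hex2num('0000000000000001'))   # 5e-324, i.e. 2**-1074 (smallest positive denormal)
print(hex2num('7FEFFFFFFFFFFFFF'))   # 1.7976931348623157e+308 (largest finite double)
print(hex2num('7FEFFFFFFFFFFFFF') == sys.float_info.max)   # True
```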

1.2 Round-off Errors

1.2.1 Error Definitions

The error in approximating a number x by x^ is given by the difference x^ - x. The relative error is given by the ratio (x^ - x)/x, and is often quoted as a percentage.

Word of Caution: The term absolute error is sometimes used for the (signed) error x^ - x itself. We will not adopt this definition here; we will use the term "absolute" (error, relative error) to designate the absolute value of the quantities defined above.

1.2.2 Round-Off

Round-off is defined as the approximation of a real number by a finite-precision number. This approximation is necessary in most machine computations involving real numbers, due to storage constraints (finite word lengths for all variables involved). The most common round-off methods are chopping and rounding.

Chopping

Chopping is the approximation of a real number x by the closest precise number whose absolute value is less than, or equal to, that of x (i.e., the approximation is in the direction of the origin). Chopping is illustrated in Figure 1.7.

[Figure 1.7: Chopping of positive and negative numbers toward the origin.]

As an example, consider a decimal fixed-point machine with four-digit precision. This means that the precise numbers are all multiples of 10^(-4) that fall in a range (-A/2, A/2], where A is a large number. Assuming that x falls in the above range, chopping x is the same as keeping the first four fractional digits in the fixed-point decimal representation of x. Thus, denoting chopping by "->", we have:

pi = 3.1415926... -> 3.1415
1 - sqrt(2) = -0.4142136... -> -0.4142

The same procedure would apply to a decimal floating-point machine with (the same, for the sake of comparison) four-digit precision. In that case, four-digit precision refers to the decimal expansion of the mantissa, and the same numbers would be chopped as follows:

pi = (0.31415926...) x 10^1 -> 0.3141 x 10^1
1 - sqrt(2) = (-0.4142136...) x 10^0 -> -0.4142 x 10^0

The significant digits of a number begin with the leftmost nonzero digit in its fixed-point representation; that digit (known as the most significant digit) can be on either side of the fractional point. Thus the significant digits of pi are 31415926... The significant digits are the same as the fractional digits in the normalized floating-point representation.

Note that in the case of pi, the floating-point machine produced one less significant digit than the fixed-point machine. This is consistent with the fact that fixed-point computation results in errors x^ - x that are roughly of the same order of magnitude regardless of the order of magnitude of x; thus the relative error improves as the order of magnitude of x increases. Floating-point computation, on the other hand, maintains the same order of magnitude for the relative error throughout the range of values of x for which a normalized representation is possible, while the expected actual (i.e., not relative) error increases with |x|.

Rounding

Rounding is the approximation of x by the closest precise number. If x is midway between two precise numbers, then the number farther from the origin is chosen (i.e., the number with the larger absolute value). Rounding is illustrated in Figure 1.8.

In the fixed-point, four decimal-digit precision example discussed earlier, rounding (again denoted by "->") gives:

pi = 3.1415926... -> 3.1416
1 - sqrt(2) = -0.4142136... -> -0.4142
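As an aside, the two round-off methods (in their floating-point, significant-digit form) are easy to emulate; the Python helpers below are our own sketch, not part of the notes:

```python
import math

def chop(x: float, digits: int = 4) -> float:
    """Chop x toward the origin, keeping `digits` significant decimal digits
    (i.e., digits of the normalized mantissa 0.d1d2...)."""
    if x == 0.0:
        return 0.0
    E = math.floor(math.log10(abs(x))) + 1      # normalized exponent: 0.1 <= |mantissa| < 1
    scale = 10.0 ** (digits - E)
    return math.trunc(x * scale) / scale

def round_sig(x: float, digits: int = 4) -> float:
    """Round x to `digits` significant decimal digits, ties away from the origin."""
    if x == 0.0:
        return 0.0
    E = math.floor(math.log10(abs(x))) + 1
    scale = 10.0 ** (digits - E)
    return math.copysign(math.floor(abs(x) * scale + 0.5), x) / scale

print(chop(math.pi), round_sig(math.pi))                       # 3.141 3.142
print(chop(1 - math.sqrt(2)), round_sig(1 - math.sqrt(2)))     # -0.4142 -0.4142
```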

[Figure 1.8: Rounding of positive and negative numbers to the nearest precise number.]

On the four-digit floating-point machine, the same numbers would be rounded as follows:

pi = (0.31415926...) x 10^1 -> 0.3142 x 10^1
1 - sqrt(2) = (-0.4142136...) x 10^0 -> -0.4142 x 10^0

Note that in the case of pi, rounding gave a different answer from chopping. Rounding is preferable, since it always results in a lower absolute error value |x^ - x| than does chopping. The error resulting from chopping is also biased: it is always negative for positive numbers and positive for negative numbers. This can lead to very large cumulative errors in situations where one sign (+ or -) dominates, e.g., when adding a large number of positive variables scaled by positive coefficients.

1.2.3 Round-Off Errors in Finite-Precision Computation

Fact. Round-off errors are ubiquitous.

Example 1.2.1. Suppose we enter x = 0.6 in MATLAB. In decimal form, this is an exact fraction (6/10), which is far easier to specify numerically than, say, pi. Yet 6/10 does not have a terminating binary expansion, i.e., it is not an exact multiple of 2^(-l) for any integer l. MATLAB will round 0.6 to 53 significant binary digits (equivalent to roughly 16 decimal digits). The value x = 6/10 falls in the range [1/2, 1), where the spacing between precise numbers equals 2^(-53). It follows that the round-off error in representing x will be less than 2^(-54) in absolute value, but it won't be zero.

Fact. The order of computation affects the accumulation of round-off errors. As a rule of thumb, summands should be ordered by increasing absolute value in floating-point computation.
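The claim in Example 1.2.1 can be checked exactly, since Python's `Fraction` can recover the exact binary value stored for 0.6 (an aside, not part of the notes):

```python
from fractions import Fraction

x = 0.6                                 # stored as the nearest double
err = Fraction(x) - Fraction(6, 10)     # exact round-off error of the stored value
print(err == 0)                         # False: 6/10 has no terminating binary expansion
print(abs(err) < Fraction(1, 2 ** 54))  # True: |error| is below half the spacing 2**-53
```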

Example 1.2.2. Suppose that we are asked to compute x0 + x1 + ... + x10, where x0 = 1 and x1 = ... = x10 = 0.0001. Assume a decimal floating-point machine with four-digit precision: at every stage of the computation (including input and output), every number x is rounded to the form

sgn(x) x (0.d1 d2 d3 d4) x 10^E

where the di are decimal digits such that d1 > 0 (for normalization), and E is assumed unrestricted.

Clearly, each of the eleven numbers corresponds to a precise number, so no round-off error is incurred in representing x0, ..., x10. Note that the ratio of each of the other numbers to x0 is 10^(-4). The exact answer is 1.001, which is the precise four-digit floating-point number 0.1001 x 10^1.

If we perform the summation in the natural order (i.e., that of the indices), then we have:

x0 + x1 = 1.0001 -> 0.1000 x 10^1
+ x2 = 1.0001 -> 0.1000 x 10^1
...
+ x10 = 1.0001 -> 0.1000 x 10^1

Thus at each stage, rounding to four-digit precision produces the same result as adding zero to x0. The final result, 0.1000 x 10^1, is clearly unsatisfactory (even though the relative error is only about 10^(-3)).

The exact answer can be obtained by summing x1 through x10 first, and then adding x0:

x1 + x2 = 0.0002 = 0.2000 x 10^(-3)
+ x3 = 0.0003 = 0.3000 x 10^(-3)
...
+ x10 = 0.0010 = 0.1000 x 10^(-2)
+ x0 = 1.0010 = 0.1001 x 10^1

Note that no rounding was necessary at any point in the above computation.
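The four-digit decimal machine of Example 1.2.2 can be simulated with Python's `decimal` module by setting the context precision to 4 significant digits (a sketch, not part of the notes):

```python
from decimal import Decimal, getcontext

getcontext().prec = 4                  # a decimal floating-point machine with 4 digits

xs = [Decimal('0.0001')] * 10

s = Decimal(1)                         # natural order: each partial sum 1.0001
for x in xs:                           # rounds back to 1.000
    s += x

t = Decimal(0)                         # small terms first: 0.0010 is exact,
for x in xs:                           # and 1 + 0.001 = 1.001 needs no rounding
    t += x
t = Decimal(1) + t

print(s, t)                            # 1.000 1.001
```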

Fact. Computing the difference of two quantities that are close in value may result in low precision. If higher precision is sought, a different algorithm may be required.

Example 1.2.3. Consider the computation of 1 - cos x for x = 0.02 on a floating-point machine with three-digit precision. The exact answer is

1 - cos(0.02) = 1 - 0.999800007... = (0.1999933...) x 10^(-3)

If cos(0.02) is first computed correctly and then rounded to three significant digits, the resulting answer is

0.100 x 10^1 - 0.100 x 10^1 = 0

which is clearly unsatisfactory. A different algorithm is needed here.

The Taylor series expansion for cos x gives

cos x = 1 - x^2/2! + x^4/4! - x^6/6! + ...

and thus

1 - cos x = x^2/2! - x^4/4! + x^6/6! - ...

Evaluating the first three terms in the second expansion with three-digit precision, we obtain:

x^2/2 = 0.200 x 10^(-3)
x^4/24 = 0.667 x 10^(-8)
x^6/720 = 0.889 x 10^(-13)

Since the ratio between consecutive terms is much larger than 10^3, summing with three-digit precision will result in an answer equal to the first term, namely 0.200 x 10^(-3), while the error will be dominated by the second term. The relative error is thus roughly

(0.667 x 10^(-8))/(0.200 x 10^(-3)) = 3.33 x 10^(-5)

which is very satisfactory considering the low precision involved in the computation.

It is possible to estimate, using the remainder form of Taylor's theorem, the error involved in approximating 1 - cos x by the first three terms in the expansion. That estimate turns out to be much less than the smallest of the three terms.
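The catastrophic cancellation in Example 1.2.3 can be reproduced by emulating the three-digit machine in Python (our own sketch; the helper `sig3` rounds to 3 significant decimal digits):

```python
import math

def sig3(v: float) -> float:
    """Round v to 3 significant decimal digits (the 'three-digit machine')."""
    if v == 0.0:
        return 0.0
    E = math.floor(math.log10(abs(v))) + 1
    scale = 10.0 ** (3 - E)
    return round(v * scale) / scale

x = 0.02
# Direct subtraction: cos(0.02) = 0.9998... rounds to 1.00, and the answer collapses
print(sig3(1.0 - sig3(math.cos(x))))                       # 0.0
# The series 1 - cos x = x^2/2 - x^4/24 + ... avoids the cancellation
print(sig3(sig3(x ** 2 / 2) - sig3(x ** 4 / 24)))          # 0.0002
```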

1.3 Review of Complex Numbers

Complex numbers are especially important in signal analysis because they lead to an efficient representation of sinusoidal signals, arguably the most important class of signals.

1.3.1 Complex Numbers as Two-Dimensional Vectors

A complex number z is a two-dimensional vector, or equivalently, a point (x, y) on the Cartesian plane:

[Figure 1.9: Cartesian and polar representation of a complex number z = (x, y), with length r and angle theta.]

The Cartesian coordinate pair (x, y) is also equivalent to the polar coordinate pair (r, theta), where r is the (nonnegative) length of the vector corresponding to (x, y), and theta is the angle of the vector relative to the positive real line. We have

x = r cos(theta),  y = r sin(theta),  r = sqrt(x^2 + y^2)

As implied by the arrow in Figure 1.9, the angle theta takes values in [0, 2pi). It is also acceptable to use angles outside that interval, obtained by adding or subtracting integer multiples of 2pi (this does not affect the values of the sine and cosine functions). In particular, angles in the range (pi, 2pi) (corresponding to points below the x-axis) are often quoted as negative angles in (-pi, 0).

A formula for theta in terms of the Cartesian coordinates x and y is

theta = arctan(y/x) + (0 or pi)

where pi is added if and only if x is negative. Since the conventional range of the function arctan(.) is the interval [-pi/2, pi/2], the resulting value of theta is in the range [-pi/2, 3pi/2).

We also have the following terminology and notation in conjunction with the complex number z = (x, y) = (r, theta):

- x = Re{z}, the real part of z;
- y = Im{z}, the imaginary part of z;
- r = |z|, the modulus or magnitude of z;
- theta = angle(z), the phase or angle of z.

The complex conjugate of z is the complex number z* defined by

Re{z*} = Re{z}  and  Im{z*} = -Im{z}

Equivalently,

|z*| = |z|  and  angle(z*) = -angle(z)

Example 1.3.1. Consider the complex number z = (-1/2, sqrt(3)/2), shown in the figure. We have Re{z} = -1/2 and Im{z} = sqrt(3)/2. The modulus of z equals

|z| = sqrt((1/2)^2 + (sqrt(3)/2)^2) = 1

[Figure for Example 1.3.1: z = (-1/2, sqrt(3)/2) on the unit circle, together with its conjugate z*.]

while the angle of z equals

angle(z) = arctan(-sqrt(3)) + pi = -pi/3 + pi = 2pi/3

Also, z* = (-1/2, -sqrt(3)/2), with |z*| = 1 and angle(z*) = -2pi/3.

1.3.2 The Imaginary Unit

A complex number is also represented as z = x + jy, where j denotes the imaginary unit. The '+' in the expression z = x + jy represents the summation of the two vectors (x, 0) and (0, y), which clearly yields (x, y) = z.

It follows easily that if c is a real-valued constant, then cz = cx + j(cy). Also, if z1 = x1 + jy1 and z2 = x2 + jy2, then

z1 + z2 = (x1 + x2) + j(y1 + y2)

Finally, the complex conjugate of x + jy is given by z* = x - jy. Note that

Re{z} = (z + z*)/2  and  Im{z} = (z - z*)/(2j)

1.3.3 Complex Multiplication and its Interpretation

Unlike ordinary two-dimensional vectors, complex numbers can be multiplied together to yield another complex number (vector) on the Cartesian plane. This becomes possible by making the assignment

j^2 = -1
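The quantities in Example 1.3.1 can be checked with Python's built-in complex type (an aside; the notes themselves use MATLAB):

```python
import cmath
import math

z = complex(-0.5, math.sqrt(3) / 2)      # the z of Example 1.3.1
print(abs(z))                            # modulus: 1.0
print(cmath.phase(z))                    # angle: 2*pi/3, about 2.0944 rad
print(z.conjugate())                     # (-0.5 - 0.866...j): |z*| = |z|, angle reversed
```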

and applying the usual distributive and commutative laws of algebraic multiplication:

(a + jb)(c + jd) = ac + jad + jbc + j^2 bd = (ac - bd) + j(ad + bc)

This result becomes more interesting if we express the two complex numbers in terms of polar coordinates. Let

z1 = r1 (cos(theta1) + j sin(theta1))  and  z2 = r2 (cos(theta2) + j sin(theta2))

Then

z1 z2 = r1 r2 (cos(theta1) + j sin(theta1))(cos(theta2) + j sin(theta2))
      = r1 r2 [(cos(theta1) cos(theta2) - sin(theta1) sin(theta2)) + j(cos(theta1) sin(theta2) + sin(theta1) cos(theta2))]
      = r1 r2 [cos(theta1 + theta2) + j sin(theta1 + theta2)]

where the last equality follows from standard trigonometric identities. Since a sine-cosine pair uniquely specifies an angle in [0, 2pi), we conclude that

|z1 z2| = |z1||z2|  and  angle(z1 z2) = angle(z1) + angle(z2)

Thus, multiplication of two complex numbers viewed as vectors entails:

- scaling the length of either vector by the length of the other; and
- rotation of either vector through an angle equal to the angle of the other.

If both complex numbers lie on the unit circle (given by the equation |z| = 1), then multiplication is a pure rotation, as shown in Figure 1.10.

1.3.4 The Complex Exponential

The Taylor series expansions for cos(theta) and sin(theta) are given by

cos(theta) = 1 - theta^2/2! + theta^4/4! - ...
sin(theta) = theta - theta^3/3! + theta^5/5! - ...

A power series for cos(theta) + j sin(theta) can be obtained by combining the two Taylor series:

cos(theta) + j sin(theta) = 1 + j theta - theta^2/2! - j theta^3/3! + theta^4/4! + j theta^5/5! - ...

Starting with j^1 = j, consecutive (increasing) powers of j cycle through the four values j, -1, -j and +1. Thus the power series above can be expressed as

1 + j theta + (j theta)^2/2! + (j theta)^3/3! + (j theta)^4/4! + (j theta)^5/5! + ...

which is recognizable as the Taylor series for e^x with the real-valued argument x replaced by the complex-valued argument j theta. This leads to the expression

e^(j theta) = cos(theta) + j sin(theta)

known as Euler's formula.

[Figure 1.10: Multiplication of two complex numbers w and z on the unit circle: wz lies at angle theta + phi, where theta = angle(z) and phi = angle(w).]

Changing the sign of theta yields the complex conjugate:

e^(-j theta) = cos(-theta) + j sin(-theta) = cos(theta) - j sin(theta)

By adding and subtracting the last two equations, we obtain

cos(theta) = (e^(j theta) + e^(-j theta))/2  and  sin(theta) = (e^(j theta) - e^(-j theta))/(2j)

respectively. We have therefore derived an alternative expression for the polar form of a complex number z = (r, theta):

z = r e^(j theta) = |z| e^(j angle(z))
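Euler's formula and the two identities above are easy to verify numerically (a Python aside, not part of the notes):

```python
import cmath
import math

theta = 0.7
# Euler's formula: e^(j*theta) = cos(theta) + j*sin(theta)
print(abs(cmath.exp(1j * theta) - complex(math.cos(theta), math.sin(theta))))  # ~0

# cos and sin recovered from the two complex exponentials
c = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2
s = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j
print(c.real, s.real)          # approximately cos(0.7) and sin(0.7)
```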

Note that if z1 = |z1| e^(j angle(z1)) and z2 = |z2| e^(j angle(z2)), then direct multiplication gives

z1 z2 = |z1||z2| e^(j angle(z1)) e^(j angle(z2)) = |z1||z2| e^(j(angle(z1) + angle(z2)))

as expected.

1.3.5 Inverses and Division

The (multiplicative) inverse z^(-1) of z is defined by the relationship

z z^(-1) = 1

The inverse exists as long as z != 0, and is easily obtained from the polar form of z:

z^(-1) = (|z| e^(j angle(z)))^(-1) = |z|^(-1) e^(-j angle(z))

Thus the inverse of z is a scaled version of its complex conjugate:

z^(-1) = |z|^(-2) z*

and in the special case where z lies on the unit circle, z^(-1) = z*. For the Cartesian form of the inverse of z = x + jy, we have

z^(-1) = |z|^(-2) z* = (x - jy)/(x^2 + y^2)

As in the case of real numbers, the division of z1 by z2 can be defined in terms of a product and an inverse:

z1/z2 = z1 . (1/z2)

Example 1.3.2. If z1 = sqrt(3) - j and z2 = 1 + j, then

z1/z2 = (sqrt(3) - j)/(1 + j) = (sqrt(3) - j)(1 - j)/(1^2 + 1^2) = (sqrt(3) - 1)/2 - j(sqrt(3) + 1)/2

For the second expression on the right-hand side, we used the Cartesian form of the inverse of z2 = 1 + j. Alternatively, one can multiply both numerator and denominator by z2* = 1 - j.

In polar form, z1 = 2 e^(j(-pi/6)) and z2 = sqrt(2) e^(j(pi/4)), thus

z1/z2 = (2/sqrt(2)) e^(j(-pi/6 - pi/4)) = sqrt(2) e^(j(-5pi/12))
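Both forms of the quotient in Example 1.3.2 can be confirmed with Python's complex arithmetic (an aside, not part of the notes):

```python
import cmath
import math

z1 = complex(math.sqrt(3), -1)     # 2 e^(-j*pi/6)
z2 = complex(1, 1)                 # sqrt(2) e^(j*pi/4)
q = z1 / z2
print(q)                           # about (0.366 - 1.366j), i.e. ((sqrt(3)-1)/2, -(sqrt(3)+1)/2)
print(abs(q), cmath.phase(q))      # about 1.414 (= sqrt(2)) and -1.309 (= -5*pi/12)
```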

1.4 Sinusoids in Continuous Time

1.4.1 The Basic Cosine and Sine Functions

Let theta be a real-valued angle, measured in radians (recall that pi rad is the same as 180 degrees). The basic sinusoidal functions are cos(theta) and sin(theta), shown in Figure 1.11.

[Figure 1.11: The cosine (left) and sine (right) functions, plotted for theta from 0 to 4pi, with values between -1 and 1.]

Our first observation is that both cos(theta) and sin(theta) are periodic with period equal to 2pi. This means that for any positive or negative integer k,

cos(theta + 2kpi) = cos(theta)
sin(theta + 2kpi) = sin(theta)

Our second observation is that the cosine function is symmetric (or, more precisely, even-symmetric) about the origin, while the sine function is antisymmetric (or odd-symmetric):

cos(-theta) = cos(theta)
sin(-theta) = -sin(theta)

Our third observation is that either function can be obtained from the other by means of a shift in the variable theta: the sine is obtained from the cosine by a delay in theta equal to pi/2 radians, or equivalently, the cosine is obtained from the sine by an advance in theta. This follows from the trigonometric identity

sin(theta) = cos(pi/2 - theta)

and the fact that cos(theta) = cos(-theta). We thus have

sin(theta) = cos(theta - pi/2)
cos(theta) = sin(theta + pi/2)

1.4.2 The General Real-Valued Sinusoid

The shift properties discussed above allow us to express sinusoidal signals in terms of either the sine or the cosine function. We obtain the general form for the real-valued sinusoid by modifying the function cos(theta) in three ways.

1. Vertical Scaling. Multiplication of cos(theta) by a constant A yields an amplitude (peak value) equal to |A|. It is customary to use positive values for A; a negative A would result in a sign inversion, which is better described using a shift in theta:

-cos(theta) = cos(theta + pi)

2. Horizontal Scaling. The horizontal scale can be changed by making theta a linear function of another variable; time t is often chosen for that purpose. Thus

theta = theta(t) = Omega*t

and the resulting sinusoid is

x(t) = A cos(Omega*t)

The scale parameter Omega is known as the angular frequency, or radian frequency, of the signal x(t). If t is measured in seconds, then Omega is measured in radians per second (rad/sec). A full cycle (period) of the cosine lasts 2pi radians, or equivalently, 2pi/Omega seconds. It follows that the period T of x(t) is

T = 2pi/Omega (seconds)

The (cyclic) frequency of x(t) is the number of cycles per unit time, i.e.,

f = 1/T = Omega/2pi

The unit of f is cycles/second, or Hertz (Hz). We thus have

1 Hz = 2pi rad/sec

In Figure 1.12 (where Omega >= 0), Omega2 > Omega1, and thus T2 < T1.

[Figure 1.12: Sinusoids cos(Omega1*t) and cos(Omega2*t) of two different frequencies, with periods T1 and T2.]

3. Horizontal Shifting. The introduction of a phase shift angle phi in the argument of the cosine results in

x(t) = A cos(Omega*t + phi)

which is the most general form for the real-valued sinusoid. A positive value of phi represents a phase (angle) advance of phi/2pi cycles, or a time advance of phi/Omega seconds; a negative value of phi represents a delay. Clearly, x(t) can also be expressed in terms of the sine function, using an additional phase shift (see the discussion earlier):

sin(Omega*t + phi) = cos(Omega*t + phi - pi/2)
cos(Omega*t + phi) = sin(Omega*t + phi + pi/2)

Example 1.4.1. Let us sketch the signal

x(t) = 3 cos(10pi*t + 3pi/4)

showing all features of interest such as the signal amplitude, the initial value of the signal (at t = 0), and the positions of zeros, maxima and minima.

First, we draw one cycle of the cosine function (peak to valley to peak) without marking the time origin. The duration of that cycle is the period T of x(t), which is obtained from Omega (the coefficient of t):

Omega = 10pi rad/sec  =>  f = 5 Hz  =>  T = 0.2 sec

In this case, the phase shift equals 3pi/4, which is equivalent to 3/8 cycles, or 0.075 seconds. Thus the correct placement of the time origin (t = 0) is to the right of the leftmost peak, at a distance corresponding to 3/8 cycles. The value of the signal at t = 0 is

3 cos(3pi/4) = -3*sqrt(2)/2 = -2.121

The positions of the zeros, maxima and minima of x(t) are now marked relative to the origin. Note that these occur at quarter-cycle intervals, i.e., every 0.05 seconds. Since the phase shift at t = 0 is 3/8 cycles, the first such point on the positive t-axis will occur 1/8 cycles later (i.e., at t = 0.025 seconds), and will be a minimum. More cycles of the signal can then be added (e.g., the dotted line in the figure).

[Figure for Example 1.4.1: one period (T = 0.2 sec) of x(t), with peaks of +3 at t = -0.075 and t = 0.125, a minimum of -3 at t = 0.025, and initial value -2.121.]

1.4.3 Complex-Valued Sinusoids and Phasors

In our discussion of complex numbers, we established Euler's formula

e^(j theta) = cos(theta) + j sin(theta)
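The features computed in Example 1.4.1 can be checked with a few lines of Python (an aside, not part of the notes):

```python
import math

A, Omega, phi = 3.0, 10 * math.pi, 3 * math.pi / 4

T = 2 * math.pi / Omega             # period: 0.2 s (f = 5 Hz)
x0 = A * math.cos(phi)              # initial value: -3*sqrt(2)/2, about -2.121
t_min = (math.pi - phi) / Omega     # argument first reaches pi at t = 0.025 s (a minimum)
print(T, x0, t_min)
```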

and obtained the identities

cos(theta) = (e^(j theta) + e^(-j theta))/2  and  sin(theta) = (e^(j theta) - e^(-j theta))/(2j)

It is clear from the first of the two identities that cos(theta) can be obtained from the two complex numbers e^(j theta) and e^(-j theta) (both on the unit circle) by projecting either number onto the real axis, or equivalently, by averaging the two numbers, i.e., taking the midpoint of the segment joining them. Analogous conclusions can be drawn about the sine function, using the imaginary axis and a slightly different linear combination.

The real-valued sinusoid x(t) = A cos(Omega*t + phi) can thus be obtained from the complex-valued sinusoid z(t) = A e^(j(Omega*t + phi)) using

x(t) = Re{A e^(j(Omega*t + phi))}

or, equivalently,

x(t) = (z(t) + z*(t))/2 = (A e^(j(Omega*t + phi)) + A e^(-j(Omega*t + phi)))/2

To interpret the above relationships, think of z(t) and z*(t) as two points on the circle |z| = A, with initial angular positions (at t = 0) given by phi and -phi. The two points rotate in opposite directions on the circle; their angular velocities are equal in magnitude. The real-valued sinusoid x(t) is obtained by either projecting z(t) onto the real axis, or taking the midpoint of the segment joining z(t) and z*(t), as shown in Figure 1.13.

The foregoing discussion used the concept of negative angular frequency to describe an angle which is decreasing linearly in time. Negative frequencies are of no particular value in expressing real sinusoids, since

A cos(-Omega*t + phi) = A cos(Omega*t - phi)

i.e., changing the sign of Omega is no different from changing the sign of the phase angle phi. Using negative frequencies for complex sinusoids, on the other hand, allows us to linearly combine such complex signals to obtain a real-valued sinusoid. This concept will be further discussed in signal analysis.

[Figure 1.13: Rotating phasor z(t) at angle Omega*t + phi on the circle |z| = A, its conjugate z*(t) at angle -Omega*t - phi, and their midpoint x(t) on the real axis.]

A complex sinusoid such as z(t) is also known as a rotating phasor. Note that

A e^(j(Omega*t + phi)) = A e^(j phi) . e^(j Omega*t)

i.e., the rotating phasor is obtained from its initial position at t = 0 by rotation through an angle Omega*t. The initial position A e^(j phi) is referred to as a stationary phasor.

Stationary phasors are particularly useful in depicting the relative phases of sinusoidal signals of the same frequency, such as voltages and currents in an AC (alternating current) circuit. One particular application involves sums of such signals:

Fact. If x1(t), ..., xM(t) are real sinusoids given by

xm(t) = Am cos(Omega*t + phi_m)

then

x1(t) + ... + xM(t) = A cos(Omega*t + phi)

where

A e^(j phi) = sum_{m=1}^{M} Am e^(j phi_m)

The fact that a sum of sinusoids of the same frequency is also sinusoidal is not at all obvious. To prove it, note first that

xm(t) = Re{Am e^(j(Omega*t + phi_m))}

so that

sum_{m=1}^{M} xm(t) = sum_{m=1}^{M} Re{Am e^(j(Omega*t + phi_m))}

Since Re{z1 + z2} = Re{z1} + Re{z2}, it follows that

sum_{m=1}^{M} xm(t) = Re{ sum_{m=1}^{M} Am e^(j(Omega*t + phi_m)) }
                    = Re{ ( sum_{m=1}^{M} Am e^(j phi_m) ) . e^(j Omega*t) }
                    = Re{ A e^(j(Omega*t + phi)) }
                    = A cos(Omega*t + phi)

where

A = | sum_{m=1}^{M} Am e^(j phi_m) |  and  phi = angle( sum_{m=1}^{M} Am e^(j phi_m) )

One way of interpreting this result is that the stationary phasor A e^(j phi) corresponding to the sum signal x(t) is just the (vector) sum of the stationary phasors of each of the components xm(t). The same, of course, holds for the rotating phasors.

Example 1.4.2. To express

x(t) = 3 cos(10pi*t + 3pi/4) + 5 sin(10pi*t + pi/6)

as a single sinusoid, we first write the second term as

5 cos(10pi*t + pi/6 - pi/2) = 5 cos(10pi*t - pi/3)

The stationary phasor for the sum signal x(t) is given by

3 e^(j 3pi/4) + 5 e^(-j pi/3) = 0.3787 - j2.209 = 2.241 e^(-j1.401)

Thus

x(t) = 2.241 cos(10pi*t - 1.401)
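The phasor addition in Example 1.4.2 takes one line of complex arithmetic (a Python aside, not part of the notes):

```python
import cmath
import math

# Stationary phasors of 3cos(10*pi*t + 3*pi/4) and 5cos(10*pi*t - pi/3)
p = 3 * cmath.exp(1j * 3 * math.pi / 4) + 5 * cmath.exp(-1j * math.pi / 3)
A, phi = abs(p), cmath.phase(p)
print(A, phi)    # about 2.241 and -1.401: x(t) = 2.241 cos(10*pi*t - 1.401)
```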

1.5 Sinusoids in Discrete Time

1.5.1 Discrete-Time Signals

A discrete-time, or discrete-parameter, signal is a sequence of real or complex numbers indexed by a discrete variable (or parameter) n, which takes values in a set of integers I. Examples of I include infinite sets such as Z (all integers), N (positive integers), etc., as well as finite sets such as {1, ..., M}; in the latter case, the signal is also called a vector. In what follows, we will assume that the discrete (time) parameter n takes values over all integers, i.e., n in Z.

The notation for a discrete-time signal involves square brackets around n, to differentiate it from a continuous-time signal:

s[n], n in Z,  as distinct from  s(t), t in R

[Figure 1.14: A discrete-time signal x[n], shown as a stem plot with the sample x[0] marked.]

One important note here is that n has no physical dimensions; it is a pure integer number. This is true even when s[n] represents samples of a continuous-parameter signal whose value varies in time, space, etc.

1.5.2 Definition of the Discrete-Time Sinusoid

A sinusoid in discrete time is defined as in continuous time, by replacing the continuous variable t by the discrete variable n:

x[n] = A cos(omega*n + phi)
y[n] = A sin(omega*n + phi)
z[n] = A e^(j(omega*n + phi)) = x[n] + j y[n]

Again, we assume A > 0. We will use the lower-case letter omega for angular frequency in discrete time because it is different from the parameter Omega used for continuous-time sinusoids. The difference is in the units: t is measured in seconds, hence Omega is measured in radians per second; n has no units, hence omega is just an angle, measured in radians or radians per sample.

Example 1.5.1. If omega = 0, then x[n] = A cos(phi), y[n] = A sin(phi) and z[n] = A e^(j phi), i.e., all three signals are constant in value.

Example 1.5.2. If omega = pi, then

z[n] = A e^(j(pi*n + phi)) = A (e^(j pi))^n . e^(j phi) = A (-1)^n . e^(j phi)

since e^(j pi) = -1. Thus also

x[n] = A (-1)^n cos(phi)  and  y[n] = A (-1)^n sin(phi)

The signal x[n] is shown in the figure. Note the oscillation between the two extreme values +A cos(phi) and -A cos(phi), which is the fastest possible for a discrete-time sinusoid. As we shall see soon, omega = pi rad/sample is, effectively, the highest possible frequency for a discrete-time sinusoid.

[Figures for Examples 1.5.1 and 1.5.2: x[n] for omega = 0, constant at A cos(phi) (left); x[n] for omega = pi, alternating between A cos(phi) and -A cos(phi) (right).]

Example 1.5.3. The discrete-time sinusoid

x[n] = sqrt(3) cos(pi*n/3 - pi/6)

is shown in the figure (on the left). The graph was generated using the stem command in MATLAB:

n = -2:8;
x = sqrt(3) * cos((pi/3)*n - (pi/6));
stem(n,x)

Note that the signal is periodic: it repeats itself every 6 samples. The angle in the argument of the cosine is marked on the unit circle (shown on the right) for n = 0, 1, ..., 5.

[Figure for Example 1.5.3: stem plot of x[n] for n = -2, ..., 8 (left); the six distinct argument angles marked on the unit circle for n = 0, ..., 5 (right).]

1.5.3 Equivalent Frequencies for Discrete-Time Sinusoids

As we mentioned earlier, the angular frequency parameter omega, which appears in the argument omega*n + phi of a discrete-time sinusoid, is an angle. It actually acts as an angle increment: with each sample, the angle in the argument of the sinusoid is incremented by omega.

Since the sinusoids sin(theta), cos(theta) and e^(j theta) are all periodic in theta (with period 2pi), it is conceivable that two different values of omega may result in identical sinusoidal waveforms. To see that this is indeed the case, consider

z[n] = A e^(j(omega*n + phi))

We know that e^(j psi) = e^(j theta) if and only if the angles theta and psi correspond to the same point on the unit circle, i.e.,

psi = theta + 2kpi
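The MATLAB snippet above can be mirrored in Python to confirm the 6-sample periodicity (an aside; `stem` plotting is omitted, only the sample values are checked):

```python
import math

x = [math.sqrt(3) * math.cos(math.pi / 3 * n - math.pi / 6) for n in range(-2, 9)]
# omega = pi/3 = (1/6)*2*pi, so the signal repeats every 6 samples:
print(all(abs(x[i + 6] - x[i]) < 1e-12 for i in range(len(x) - 6)))   # True
```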

where k is an integer; the term 2kpi corresponds to k full rotations on the unit circle. If we replace omega by omega + 2kpi in the equation for z[n], we obtain a new signal v[n] given by

v[n] = A e^(j((omega + 2kpi)n + phi)) = A e^(j(omega*n + phi + 2knpi)) = A e^(j(omega*n + phi)) = z[n]

where the angle argument was reduced by an integer (= kn) multiple of 2pi without affecting the value of the sinusoid. Thus the signals z[n] and v[n] are identical.

Definition 1.5.1. The angular frequencies omega and omega' are equivalent for complex-valued sinusoids in discrete time if

omega' = omega + 2kpi, (k in Z)

Equivalent frequencies result in identical complex sinusoids, provided the initial phase shifts phi are equal. Thus the effective range of frequencies for complex sinusoids can be taken as (0, 2pi] or (-pi, pi] rad/sample.

Example 1.5.4. The three complex sinusoids

z^(1)[n] = exp{j(-pi*n/6 + phi)}
z^(2)[n] = exp{j(11pi*n/6 + phi)}
z^(3)[n] = exp{j(23pi*n/6 + phi)}

all represent the same signal (in discrete time).

The notion of equivalent frequency also applies to the real-valued sinusoid x[n] = A cos(omega*n + phi), since x[n] = Re{z[n]}. Thus two frequencies which differ by an integer multiple of 2pi will result in identical real-valued sinusoids, provided the initial phase shifts phi are the same.
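The identity z[n] = v[n] above can be spot-checked numerically (a Python aside; the values of A, phi and omega below are arbitrary):

```python
import cmath
import math

A, phi, w = 1.5, 0.4, math.pi / 6
for n in range(-8, 9):
    z = A * cmath.exp(1j * (w * n + phi))
    v = A * cmath.exp(1j * ((w + 2 * math.pi) * n + phi))
    assert abs(z - v) < 1e-9       # identical up to round-off
print("omega and omega + 2*pi generate the same complex sinusoid")
```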

Moreover, for the real-valued sinusoid above, it is fairly easy to show that using omega' = -omega + 2kpi instead of omega will result in an identical signal, provided the sign of phi is also reversed. Let

u[n] = A cos((-omega + 2kpi)n - phi)

Then, recalling the identity cos(-theta) = cos(theta), we have

u[n] = A cos((-omega + 2kpi)n - phi)
     = A cos(-(omega*n + phi) + 2kpi*n)
     = A cos(omega*n + phi - 2kpi*n)
     = A cos(omega*n + phi) = x[n]

We also note that no sign reversal is needed for phi if omega = 0 or omega = pi. This is illustrated in Figure 1.15.

Definition 1.5.2. The angular frequencies omega and omega' are equivalent for real-valued sinusoids in discrete time if

omega' = +/-omega + 2kpi, (k in Z)

This further reduces the effective range of frequencies for real sinusoids to the interval [0, pi].

[Figure 1.15: The angles omega*n + phi and -omega*n - phi give the same value of cos(.) (left); the effective range [0, pi] of frequencies for real sinusoids (right).]

Example 1.5.5. The three real sinusoids

x^(1)[n] = cos(-pi*n/6 + phi)
x^(2)[n] = cos(11pi*n/6 + phi)
x^(3)[n] = cos(23pi*n/6 + phi)

all represent the same signal, which is also given by (note the sign reversal in the phase):

cos(pi*n/6 - phi) = cos(13pi*n/6 - phi) = cos(25pi*n/6 - phi)

1.5.4 Periodicity of Discrete-Time Sinusoids

While continuous-time sinusoids are always periodic with period T = 2pi/Omega, periodicity is the exception (rather than the rule) for discrete-time sinusoids. First, a refresher on periodicity:

Definition 1.5.3. A discrete-time signal s[n] is periodic if there exists an integer N > 0 such that for all n in Z,

s[n + N] = s[n]

The smallest such N is called the (fundamental) period of s[n].

For z[n] = A e^(j(omega*n + phi)), the condition

(for all n) z[n + N] = z[n]

holds for a given N > 0 if and only if

omega*(n + N) + phi = omega*n + phi + 2kpi

for some integer k, which reduces to

omega = (k/N) . 2pi

Thus z[n] is periodic if and only if its frequency is a rational (i.e., ratio of two integers) multiple of 2pi. The period (smallest such N) is found by cancelling out common factors between the numerator k and denominator N. The same condition for periodicity can be obtained for the real-valued sinusoid x[n] = A cos(omega*n + phi).

6. while s[n] = cos is.5.2) is not periodic. . The signal x[n] = cos(10n + 0. Since 13πn + 0.4 15 13π 13 = · 2π 15 30 (note that there are no common factors in the numerator and the denominator of the last fraction). the period of s[n] is N = 30.33 Example 1.

1.6 Sampling of Continuous-Time Sinusoids

1.6.1 Sampling

Sampling is the process of recording the values taken by a continuous-time signal at discrete sampling instants. Uniform sampling is most commonly used: the sampling instants are Ts seconds apart in time, where Ts is the sampling period. The sampling rate, or sampling frequency, fs is the reciprocal of the sampling period:

fs = 1/Ts (samples/sec)

If the continuous-time signal is x(t), the discrete-time signal resulting from sampling is

x[n] = x(nTs)

[Figure 1.16: Sampling x(t) every two seconds (Ts = 2.0); the samples ..., x[-1], x[0], x[1], ... are taken at t = ..., -2.0, 0, 2.0, ...]

Sampling is one of the two main components of analog-to-digital (A-to-D) conversion, the other component being quantization. The reverse process of digital-to-analog (D-to-A) conversion involves interpolation, namely reconstructing the continuous-time signal from the discrete (and quantized) samples.

The sampling rate fs determines, to a large extent, the quality of the interpolated analog signal. If the sampling rate is high enough, the samples will contain sufficient information about the sampled analog signal to allow accurate interpolation; otherwise, fine detail will be lost and the reconstructed signal will be a poor replica of the original. The minimum sampling rate fs required for accurate interpolation is known as the Nyquist rate, and depends on the range of frequencies spanned by the original signal (in terms of its sinusoidal components). Although the theory behind the Nyquist rate is beyond the scope of this chapter, the necessity of sampling at the Nyquist rate (or higher) can be seen by considering simple properties of sampled sinusoidal signals.

1.6.2 Discrete-Time Sinusoids Obtained by Sampling

Sampling a continuous-time sinusoid produces a discrete-time sinusoid. Consider, for example, the real-valued sinusoid

x(t) = A cos(Ωt + φ)

which has period T = 2π/Ω and cyclical frequency f = Ω/2π. We have

x[n] = A cos(ΩnTs + φ) = A cos(ωn + φ)

where

ω = ΩTs = Ω/fs

Note that, as expected, ω has no physical dimensions; it is just an angle. It can also be expressed as

ω = 2πf Ts = 2π(Ts/T) = 2π(f/fs)

Thus the frequency of the discrete-time sinusoid obtained by sampling a continuous-time sinusoid is directly proportional to the sampling period Ts, or equivalently, inversely proportional to the sampling frequency fs.

Suppose Ts is small in relation to T, i.e., fs is high compared to f. Since many samples are taken in each period, the sample values will vary slowly, and thus the frequency of the discrete-time signal will be low. As Ts increases (i.e., fs decreases), the frequency ω of the discrete-time sinusoid also increases. Recall, however, that there is an upper limit, equal to π rad/sample, on the effective frequency of a real-valued discrete-time sinusoid. That value will be attained for

Ts = T/2 = π/Ω

(equivalently, for fs = 2f). When Ts exceeds T/2, the effective frequency of the resulting samples will decrease with Ts. The effective frequency will become zero when Ts = T (or fs = f), and then it will start increasing again. This is consistent with the fact that π + δ and π − δ = −(π + δ) + 2π are equivalent frequencies.
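The folding of the effective frequency into [0, π] described above can be checked numerically. A Python sketch (the text itself uses MATLAB; `effective_frequency` is a name introduced here for illustration):

```python
import math

def effective_frequency(f, fs):
    """Effective frequency (rad/sample) of A*cos(2*pi*f*t) sampled at rate fs.

    omega = 2*pi*f/fs, reduced to [0, pi] using the equivalence of
    omega and -omega + 2*k*pi for real-valued sinusoids.
    """
    omega = (2 * math.pi * f / fs) % (2 * math.pi)   # equivalent frequency in [0, 2*pi)
    return min(omega, 2 * math.pi - omega)           # fold into [0, pi]

f = 100.0
# The effective frequency rises to pi at fs = 2f, then falls back to 0 at fs = f
assert abs(effective_frequency(f, 8 * f) - math.pi / 4) < 1e-12
assert abs(effective_frequency(f, 2 * f) - math.pi) < 1e-12
assert abs(effective_frequency(f, f)) < 1e-12
```
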

Example 1.6.1. Consider the continuous-time sinusoid

x(t) = cos(400πt + 0.8)

which has period T = 5 ms and frequency f = 200 Hz. Suppose the following three sampling rates are used: 1,000, 500, and 250 samples/sec.

In the first case, Ts = 1 ms and ω = (400π)/1000 = 2π/5. The resulting signal is

x1[n] = cos(2πn/5 + 0.8)

In the second case, Ts = 2 ms and ω = 4π/5. The resulting signal is

x2[n] = cos(4πn/5 + 0.8)

In the third case, Ts = 4 ms and ω = 8π/5. The resulting signal is

x3[n] = cos(8πn/5 + 0.8) = cos(2πn/5 − 0.8)

i.e., it is a sinusoid of the same frequency as x1[n] (but with a different phase shift).

Example 1.6.2. The graphs show the continuous-time sinusoid A cos(2πf t) sampled at fs = 8f (left) and fs = (8f)/7 (right). In both cases, the samples are given by the discrete-time sinusoid A cos(πn/4). The sample sequence on the right is formed by taking every seventh sample from the sequence on the left. Clearly, the two sequences are identical.

If we were to plot the two sample sequences using an actual time axis, i.e., t instead of n (as in the graphs of Example 1.6.2), we would observe that one sequence of samples contains a lot more information about the analog signal. On the other hand, most of this information is redundant: as few as three samples may suffice to determine the unknown parameters Ω, φ and A (only two samples if the frequency Ω is known), and hence to interpolate the signal perfectly.

1.6.3 The Concept of Alias

The notion of equivalent frequency was implicitly used in Example 1.6.1 to show that a given continuous-time sinusoid can be sampled at two different sampling rates to yield discrete-time sinusoids of the same frequency (differing possibly in the phase).

[Figure (Example 1.6.2): the sinusoid A cos(2πf t) sampled at fs = 8f (left) and fs = (8f)/7 (right).]

Most analog (i.e., continuous-time) signals encountered in practice can be formed by summing together sinusoids of different frequencies and phase shifts. In most cases, the frequencies involved span an entire (continuous) interval, or band, in which case integration is more appropriate than summation. The details of the sinusoidal representation of continuous-time signals (known as the Fourier transform) are not essential at this point. For an arbitrary signal x(t) that is neither sinusoidal nor (more generally) periodic, interpolation from a finite number of samples is impossible; in such cases, a higher sampling rate is likely to result in better (more faithful) reconstruction.

To appreciate a key issue which arises in sampling analog signals, it suffices to consider the simple class of signals which are sums of finitely many sinusoids. Consider again the sinusoid

x(t) = A cos(Ωt + φ)

sampled at a rate fs which is fixed (unlike the situation discussed in the previous subsection). The resulting discrete-time sinusoid x[n] has frequency ω = ΩTs = 2π(f/fs). Consider also a second sinusoid

x'(t) = A' cos(Ω't + φ')

sampled at the same rate, resulting in x'[n] having frequency ω' = Ω'Ts = 2π(f'/fs).

The two frequencies ω and ω' are equivalent (allowing both x[n] and x'[n] to be expressed in terms of the same frequency) provided that

ω' = ±ω + 2kπ, k integer

This is equivalent to

f'/fs = ±(f/fs) + k

or, simply,

f' = ±f + kfs

Frequencies f and f' (in Hz) satisfying the above relationship are known as aliases of each other with respect to the sampling rate fs (in samples/second). Continuous-time sinusoids at frequencies related in that manner will, when sampled at the appropriate rate, yield discrete-time samples having the same (effective) frequency. Clearly, only positive frequencies are of interest, and thus it suffices to consider k ≥ 0 in both series.

Example 1.6.3. The graphs illustrate how three continuous-time sinusoids with frequencies f = 0, f = fs and f = 2fs can yield the same discrete-time sequence with ω = 0.

Example 1.6.4. Consider again Example 1.6.1, where f = 200 Hz.

If fs = 1,000 Hz, the aliases of f are at

f' = 200 + k(1000) and f' = −200 + k(1000) Hz (k integer)

If fs = 500 Hz, the aliases of f are at

f' = 200 + k(500) and f' = −200 + k(500) Hz

If fs = 250 Hz, the aliases of f are at

f' = 200 + k(250) and f' = −200 + k(250) Hz

The aliases (in Hz) are plotted for each sampling rate. In the first two cases, the lowest (positive) alias is f itself. In the third case, the lowest alias is at −200 + 250 = 50 Hz.
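The two alias series f' = f + kfs and f' = −f + kfs can be enumerated directly. A Python sketch (helper name introduced here; the text's computations are done by hand or in MATLAB):

```python
def aliases(f, fs, f_max):
    """All nonnegative aliases f' = +/-f + k*fs of frequency f (Hz)
    with respect to sampling rate fs, up to f_max (Hz)."""
    out = set()
    k = 0
    while k * fs - f <= f_max:          # beyond this k, both series exceed f_max
        for cand in (f + k * fs, -f + k * fs):
            if 0 <= cand <= f_max:
                out.add(cand)
        k += 1
    return sorted(out)

# f = 200 Hz, as in Example 1.6.4
print(aliases(200, 1000, 2000))  # [200, 800, 1200, 1800]
print(aliases(200, 250, 1000))   # [50, 200, 300, 450, 550, 700, 800, 950]
```

Note that for fs = 250 Hz the lowest alias is 50 Hz, matching the hand calculation −200 + 250 = 50 in the example.
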

[Figure (Example 1.6.4): aliases of f = 200 Hz plotted for each sampling rate.
Sampling Rate = 1,000 samples/sec: −200, 0, 200, 800, 1,200, . . .
Sampling Rate = 500 samples/sec: −200, 0, 200, 300, 700, 800, 1,200, . . .
Sampling Rate = 250 samples/sec: −200, 0, 50, 200, 300, 450, 550, 700, 800, 950, 1,050, 1,200, . . .]

1.6.4 The Nyquist Rate

The term aliasing is used in reference to sampling continuous-time signals which are expressible as sums of sinusoids of different frequencies (most signals fall in that category). Aliasing occurs in sampling a continuous-time signal x(t) if two (or more) sinusoidal components of x(t) are aliases of each other with respect to the sampling rate. For example, consider

x(t) = cos Ωt + 4 cos Ω't

where Ω and Ω' are aliases with respect to the sampling rate fs. The resulting sequence of samples can be expressed as

x[n] = cos ωn + 4 cos ω'n

Since ω and ω' are equivalent frequencies (and no phase shifts are involved here), we can also write

x[n] = 5 cos ωn
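That two aliased components merge into a single sinusoid after sampling can be verified numerically. A Python sketch, under the particular assumption Ω' = Ω + 2πfs (i.e., f' = f + fs):

```python
import math

# Two continuous-time components whose frequencies are aliases (f' = f + fs)
# become indistinguishable after sampling at rate fs.
fs = 1000.0
f, fp = 200.0, 200.0 + fs
x = lambda t: math.cos(2 * math.pi * f * t) + 4 * math.cos(2 * math.pi * fp * t)

omega = 2 * math.pi * f / fs        # common effective frequency of both components
for n in range(50):
    assert abs(x(n / fs) - 5 * math.cos(omega * n)) < 1e-9
```
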

Reconstructing x(t) from the samples x[n] without knowing the amplitudes of the two components cos Ωt and 4 cos Ω't is a hopeless task: the same sequence of samples could have been obtained from, e.g.,

u(t) = 2 cos Ωt + 3 cos Ω't or v(t) = −cos Ωt + 6 cos Ω't

which are very different signals from x(t). This shows convincingly that aliasing is undesirable in sampling signals composed of sinusoids. As we mentioned earlier, typical analog signals consist of a continuum of sinusoids spanning an entire interval, or band, of frequencies; these sinusoidal components are "summed" by an integral over frequency (rather than by a conventional sum). The effect of aliasing is the same whether we sum or integrate over frequency: components at frequencies that are aliases of each other cannot be resolved on the basis of the samples obtained, hence the amplitude and phase information needed to incorporate those components into the reconstructed signal will be unavailable. Thus absence of aliasing is a necessary condition for the faithful reconstruction of the sampled signal.

If the sinusoidal components of x(t) have frequencies which are limited to the frequency band [0, fB] Hz (where fB is the signal bandwidth), aliasing is avoided if no frequency f in that band has an alias f' = ±f + kfs (excluding itself) in the same band. With the aid of Figure 1.17, we will determine the minimum sampling rate fs such that no aliasing occurs.

Consider the alias relationship f' = f + kfs first. Each k corresponds to an interval [kfs, fB + kfs] of aliases of frequencies in [0, fB]. No aliasing means that neither of the intervals [−fs, fB − fs] and [fs, fB + fs] (corresponding to k = −1 and k = 1, respectively) overlaps with [0, fB] (top graph). This is ensured if and only if fs > fB.

Figure 1.17: Illustration of the Nyquist condition −fB + fs > fB, i.e., fs > 2fB. (Top graph: the intervals [−fs, fB − fs], [0, fB] and [fs, fB + fs]; bottom graph: the intervals [−fB, 0] and [−fB + fs, fs], with −fB + 2fs also marked.)

Next, consider the alias relationship f' = −f + kfs, which gives aliases of [0, fB] in the interval [−fB + kfs, kfs] (bottom graph). No aliasing means that the interval [−fB + fs, fs] does not overlap with [0, fB], which is ensured if and only if

−fB + fs > fB, i.e., fs > 2fB

Thus aliasing is avoided if fs > 2fB. In other words, the sampling rate must be greater than twice the bandwidth of the analog signal. This minimum rate is known as the Nyquist rate. Sampling at a rate higher than the Nyquist rate is a necessary condition for the faithful reconstruction of the signal x(t) from its samples x[n]. It turns out that the condition fs > 2fB is also sufficient for the complete recovery of the analog signal from its samples. The proof of this fact requires more advanced tools in signal analysis, and will be outlined in the epilogue following Chapter 4.
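The condition fs > 2fB can be sanity-checked by brute force over a grid of frequencies in the band. A rough Python sketch (the grid only approximates the continuous band, so this is a numerical illustration, not a proof):

```python
def has_aliasing(fs, fB, step=0.5, kmax=10):
    """Brute-force check: does some f in [0, fB] have an alias
    f' = +/-f + k*fs (f' != f) that also lies in [0, fB]?"""
    f = 0.0
    while f <= fB:
        for k in range(-kmax, kmax + 1):
            for fp in (f + k * fs, -f + k * fs):
                if 0 <= fp <= fB and abs(fp - f) > 1e-9:
                    return True
        f += step
    return False

fB = 100.0
assert has_aliasing(fs=150.0, fB=fB)        # fs < 2*fB: aliasing occurs
assert not has_aliasing(fs=250.0, fB=fB)    # fs > 2*fB: no aliasing
```
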

Let N be an arbitrary positive integer, and consider the following MATLAB script:

y = 0;
for i = 1:N
    y = y + (1/N);
end
y

Section 1.3

P 1.8. Simplify the complex fraction

(1 + j√3)(1 − j) / ((√3 + j)(1 + j))

using (i) Cartesian forms and (ii) polar forms throughout your calculation, expressing your answer in Cartesian form.

P 1.9. Which of the following expressions represents the complex number 2 + 3j in MATLAB?

2 + 3*j
2 + 3*i
2 + j3
2 + 3j
[2+3j]
[2 + 3j]
[2 +3j]

P 1.10. Let N be an arbitrary positive integer. Evaluate the product

∏_{k=1}^{N−1} [cos(kπ/N) + j sin(kπ/N)]

Section 1.4

P 1.11. Consider the continuous-time sinusoid

x(t) = 5 cos(500πt + 0.25)

where t is in seconds.

(i) What is the first value of t greater than 0 such that x(t) = 0?

(ii) Consider the following MATLAB script, which generates a discrete approximation to x(t):

t = 0 : 0.0001 : 0.01;
x = 5*cos(500*pi*t + 0.25);

For which values of n, if any, is x(n) zero?

P 1.12. The input voltage v(t) to a light-emitting diode circuit is given by A cos(Ωt + φ), where A > 0, Ω > 0 and φ are unknown parameters. The circuit is designed in such a way that the diode turns on at the moment the input voltage exceeds A/2, and turns off when the voltage falls below A/2.

(i) What percentage of the time is the diode on?

(ii) Suppose the voltage v(t) is applied to the diode at time t = 0. The diode turns on instantly, turns off at t = 1.5 ms, then turns on again at t = 9.5 ms. Based on this information, determine Ω and φ.

P 1.13. The value of the continuous-time sinusoid x(t) = A cos(Ωt + φ) (where A > 0 and 0 ≤ φ < 2π) is between −2.0 and +2.0 for 70% of its period.

(i) What is the value of A?

(ii) If it takes 300 ms for the value of x(t) to rise from −2.0 to +2.0, what is the value of Ω?

(iii) If t = 40 ms is the first positive time for which x(t) = −2.0 and x'(t) (the first derivative) is negative, what is the value of φ?

P 1.14. Using stationary phasors, express each of the following sums in the form A cos(Ωt + φ):

(i) cos(7πt − π/6) − sin(7πt − π/6) + 3 cos(7πt + π/4)

(ii) 2.26 cos(43t + 0.11) − 5.77 cos(43t + 2.08) + 0.49 cos(43t + 1.37)

Section 1.5

P 1.15. (i) Use MATLAB to plot four periods of the discrete-time sinusoid

x1[n] = cos(7πn/9 + π/6)

(ii) Show that the product x2[n] = x1[n] · cos(πn) is also a (real-valued) discrete-time sinusoid. Express it in the form A cos(ωn + φ), where A > 0, ω ∈ [0, π] and φ ∈ [0, 2π).

P 1.16. For exactly one value of ω in [0, π], the discrete-time sinusoid v[n] = A cos(ωn + φ) is periodic with period equal to N = 4 time units.

(i) What is that value of ω?

(ii) For that value of ω, suppose the first period of v[n] is given by

v[0] = 1, v[1] = 1, v[2] = −1 and v[3] = −1

What are the values of A > 0 and φ ∈ [0, 2π)?

P 1.17. (i) Use the trigonometric identity

cos(α + β) = cos α cos β − sin α sin β

to show that

cos(ω(n + 1) + φ) + cos(ω(n − 1) + φ) = 2 cos(ωn + φ) cos ω

(ii) Suppose x[1] = 1.7740, x[2] = 3.1251 and x[3] = 0.4908 are three consecutive values of the discrete-time sinusoid x[n] = A cos(ωn + φ), where A > 0, ω ∈ [0, π] and φ ∈ (0, π]. Use the equation derived in (i) to evaluate ω. Then use the ratio x[2]/x[1] together with the given identity for cos(α + β) to evaluate tan φ and hence φ. Finally, determine A.

Section 1.6

P 1.18. The continuous-time sinusoid x(t) = cos(150πt + φ) is sampled every Ts = 3.0 ms starting at t = 0. The resulting discrete-time sinusoid is

x[n] = x(nTs)

(i) Express x[n] in the form x[n] = cos(ωn + φ), i.e., determine the value of ω. What is the angular frequency ω of x[n], in radians?

(ii) Is the discrete-time sinusoid x[n] periodic? If so, what is its period?

(iii) Suppose that the sampling rate fs = 1/Ts is variable. For what values of fs is x[n] constant for all n? For what values of fs does x[n] alternate in value between −cos φ and cos φ?

P 1.19. The continuous-time sinusoid x(t) = cos(2πf t + φ) is sampled every Ts = 1/(12f) seconds to produce the discrete-time sinusoid x[n] = x(nTs).

(i) Write an expression for x[n].

(ii) Consider the discrete-time signal

y[n] = x[n + 1] + 2x[n] + x[n − 1]

Using phasors, express y[n] in the form y[n] = A cos(ωn + ψ), clearly showing the values of A and ψ. What is the value of ω? What is the relationship between ψ and φ?

P 1.20. Consider the continuous-time sinusoid

x(t) = 5 cos(200πt − 0.8) + 2 sin(200πt)

where t is in seconds.

(i) Express x(t) in the form x(t) = A cos(200πt + φ).

(ii) Suppose x(t) is sampled at a rate of fs = 160 samples/second. The discrete-time signal x[n] = x(n/fs) is expressed as

x[n] = A cos(ωn + ψ)

with ω in the interval [0, π]. What is the value of ω? What is the relationship between ψ and φ?

P 1.21. For what frequencies f in the range 0 to 3.0 KHz does the sinusoid x(t) = cos(2πf t) yield the signal x[n] = cos(0.4πn) when sampled at a rate of fs = 1/Ts = 800 samples/sec?

P 1.22. The continuous-time sinusoid x(t) = cos(300πt) is sampled every Ts = 2.0 ms, so that

x[n] = x(0.002n)

(i) For what other values of f in the range 0 Hz to 2.0 KHz does the sinusoid v(t) = cos(2πf t) produce the same samples as x(t) (i.e., v[·] = x[·]) when sampled at the same rate?

(ii) If we increase the sampling period Ts (or equivalently, drop the sampling rate), what is the least value of Ts greater than 2 ms for which x(t) yields the same sequence of samples (as for Ts = 2 ms)?

Chapter 2

Matrices and Systems

2.1 Introduction to Matrix Algebra

The analysis of signals and linear systems relies on a variety of mathematical tools, different forms of which are suitable for signals and systems of differing complexity. The simplest type of signal we consider is a discrete-time signal of finite duration, which is the same as an n-dimensional vector consisting of real or complex-valued entries. In representing such signals and their transformations resulting from numerical processing, certain concepts and methods of linear algebra are indispensable. Understanding the role played by linear algebra in the analysis of such simple (finite-dimensional) signals also provides a solid foundation for studying more complex models where the signals evolve in discrete or continuous time and are infinite-dimensional.

2.1.1 Matrices and Vectors

An m × n matrix is an ordered array consisting of m rows and n columns of elements:

A =
[ a11  a12  ...  a1n ]
[ a21  a22  ...  a2n ]
[  .    .    .    .  ]
[ am1  am2  ...  amn ]

A compact notation for the above matrix is

A = [aij]m×n = [aij]

The last form is used whenever the dimensions m and n of A are known or can be easily inferred. The elements (also called entries) of the matrix A are real or, more generally, complex numbers. We define

Rm×n = space of all (m × n)-dimensional real-valued matrices
Cm×n = space of all (m × n)-dimensional complex-valued matrices

where C denotes the complex plane.

A vector is a special case of a matrix where one of the dimensions (m or n) equals unity. A row vector has the form

[ a1  a2  ...  an ]

and lies in the space R1×n or, more generally, C1×n.

A column vector has the form

[ a1 ]
[ a2 ]
[ .. ]
[ am ]

and lies in the space Rm×1 or, more generally, Cm×1.

Row vectors and column vectors can be used interchangeably to represent an ordered set of (finitely many) numbers. Thus, in many ways, the spaces R1×n and Rn×1 are no different from Rn, the space of all n-tuples of real numbers. In fact, the redundant dimension (unity) in R1×n and Rn×1 is often omitted when the orientation (row or column) of the vector can be easily deduced from the context. The same holds for C replacing R.

In operations involving matrices, the distinction between row and column vectors is important. The usual notation for a vector in Rn or Cn is a lower-case boldface letter, e.g.,

a = (a1, . . . , an)

Unless otherwise stated, a lower-case boldface letter will denote a column vector. To denote row vectors in expressions involving matrix operations, the transpose (T), or complex-conjugate transpose (H), operators will be used. Thus a row vector will be denoted by a^T or a^H.

In MATLAB, matrices are entered row-by-row, where rows are separated with a semicolon:

A = [-1 3 4; 0 1 5; 2 2 1]

Thus [-1 3 4] is a row vector, while [-1; 0; 2] is a column vector. The transpose operator T is entered as .' and the complex-conjugate transpose H as '. Both produce the same result in the case of real-valued vectors:

[-1; 0; 2].' and [-1; 0; 2]' are the same as [-1 0 2]

The default orientation of a vector in MATLAB is horizontal. This means that MATLAB functions returning vector-valued answers from scalar arguments will (usually) do so in row-vector format. An example of this is the notation start:increment:end, which generates a row vector. MATLAB will ignore the orientation of row and column vectors in places where it is unimportant, e.g., in arguments of certain functions. An error message will be generated if the orientation of the vector is inconsistent with the matrix computation being performed. A notable exception is the addition of a scalar to a vector, which is interpreted in an element-wise fashion, e.g.,

1 + [-1 9 7] results in [0 10 8]

and

1 - [-1 9 7] is the same as [2 -8 -6]

2.1.2 Signals as Vectors

A discrete-time signal consisting of finitely many samples can be represented as a vector

x = (x1, . . . , xn)

where n is the total number of samples. Here, xi is an abbreviated form of x[i] (in the earlier notation). The choice of i = 1 for the initial time is mainly for consistency with standard indexing for matrices; the initial index i = 0 will be used at a later point. From now on, and through the end of Section 2.12, all matrices and vectors will be assumed to be real-valued.

Definition 2.1.1. The linear combination of n-dimensional vectors x(1), . . . , x(r) with coefficients c1, . . . , cr in R is defined as the n-dimensional vector

x = c1x(1) + · · · + crx(r)

(Recall that multiplication of a vector by a scalar amounts to multiplying each entry of the vector by the scalar, while addition of two or more vectors amounts to adding their respective components.)

Any vector x in Rn can be expressed as a linear combination of the standard unit vectors e(1), . . . , e(n), defined by

(e(i))j = 1 if i = j, and 0 if i ≠ j

Note that the ith unit vector corresponds to a single pulse of unit height at time i.

Example 2.1.1. Consider the three-dimensional vector x = (1, −2, 3). Expressing it as a column vector (the default format for matrix operations), we have

x = [1; −2; 3] = (1)[1; 0; 0] + (−2)[0; 1; 0] + (3)[0; 0; 1] = e(1) − 2e(2) + 3e(3)

The linear combination of unit vectors shown above is illustrated graphically.
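The decomposition of a vector into scaled unit vectors can be sketched directly in code. A short Python illustration (the text's numerical examples use MATLAB; the helper names here are introduced for illustration):

```python
def unit_vector(i, n):
    """Standard unit vector e(i) in R^n (1-based index i)."""
    return [1.0 if j == i else 0.0 for j in range(1, n + 1)]

def linear_combination(coeffs, vectors):
    """c1*x(1) + ... + cr*x(r), computed entry by entry."""
    n = len(vectors[0])
    return [sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n)]

# x = (1, -2, 3) = e(1) - 2*e(2) + 3*e(3), as in Example 2.1.1
x = linear_combination([1, -2, 3], [unit_vector(i, 3) for i in (1, 2, 3)])
assert x == [1.0, -2.0, 3.0]
```
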

2.1.3 Linear Transformations

The concept of linearity is crucial for the analysis of signals and systems. Linear systems form a class which is by far the easiest to study, model and simulate. In fact, in studying the behavior of many nonlinear systems, it is common to use linear approximations and techniques from the theory of linear systems. The most important time-frequency transformations of signals, such as the Laplace, Fourier and z-transforms, are linear.

The simplest model for a linear system involves a finite-dimensional input signal x ∈ Rn and a finite-dimensional output signal y ∈ Rm. The system A, or more precisely, A : Rn → Rm, acts on the input x to produce the output signal y represented by

y = A(x)

Figure 2.1: A system A with input x and output y.

Definition 2.1.2. (Linearity) We say that the transformation (or system) A is linear if, for any two input vectors x(1) and x(2) and coefficients c1 and c2,

A(c1x(1) + c2x(2)) = c1A(x(1)) + c2A(x(2))

To interpret this property in terms of an actual system, suppose at first that an input signal x(1) is applied to such a system, resulting in a response (i.e., output) y(1). The system is then "reset" and is run again with input x(2), resulting in a response y(2). Finally, the system is reset (once again) and run with input

x = c1x(1) + c2x(2)

If the system is linear, the final response of the system will be

y = c1y(1) + c2y(2)

This defining property of linear systems is also known as the superposition property, and can be extended to an arbitrary number of input signals:

A(c1x(1) + · · · + crx(r)) = c1A(x(1)) + · · · + crA(x(r))

As a consequence of the superposition property, knowing the response of a linear system to a number of different inputs allows us to determine and predict its response to any linear combination of those inputs.

Example 2.1.2. Consider the transformation A : Rn → Rn which scales an n-dimensional vector by λ ∈ R:

A(x) = λx

Clearly,

A(c1x(1) + c2x(2)) = λ(c1x(1) + c2x(2)) = c1(λx(1)) + c2(λx(2)) = c1A(x(1)) + c2A(x(2))

and thus the transformation is linear. Scaling is one of the simplest linear transformations of n-dimensional vectors, and in the case n = 1, it is the only linear transformation.

Example 2.1.3. Consider the transformation A : R3 → R3 which cyclically shifts the entries of x to the right:

A(x1, x2, x3) = (x3, x1, x2)

We have

A(c · (u1, u2, u3) + d · (v1, v2, v3)) = A(cu1 + dv1, cu2 + dv2, cu3 + dv3)
= (cu3 + dv3, cu1 + dv1, cu2 + dv2)
= c · (u3, u1, u2) + d · (v3, v1, v2)
= cA(u1, u2, u3) + dA(v1, v2, v3)

Thus a cyclical shift is also a linear transformation.
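The superposition check carried out symbolically for the cyclic shift can also be run numerically. A minimal Python sketch (helper names introduced here; the text works these examples by hand):

```python
def cyclic_shift(x):
    """A(x1, x2, x3) = (x3, x1, x2): cyclic shift of the entries to the right."""
    return (x[2], x[0], x[1])

def lin_comb(c, u, d, v):
    """c*u + d*v, entrywise."""
    return tuple(c * ui + d * vi for ui, vi in zip(u, v))

# Superposition: A(c*u + d*v) == c*A(u) + d*A(v)
u, v, c, d = (1.0, -2.0, 3.0), (4.0, 0.0, -1.0), 2.0, -3.0
lhs = cyclic_shift(lin_comb(c, u, d, v))
rhs = lin_comb(c, cyclic_shift(u), d, cyclic_shift(v))
assert lhs == rhs
```
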

2.2 Matrix Multiplication

2.2.1 The Matrix of a Linear Transformation

The superposition property of a linear transformation A : Rn → Rm was stated as

A(c1x(1) + c2x(2)) = c1A(x(1)) + c2A(x(2))

We also saw that any n-dimensional vector x can be expressed as a linear combination of the unit vectors e(1), . . . , e(n):

x = x1e(1) + · · · + xne(n)

Thus the result of applying the linear transformation A to any vector x is given by

A(x) = x1A(e(1)) + · · · + xnA(e(n))

We conclude that knowing the effect of a linear transformation on each unit vector in Rn suffices to determine the effect of that transformation on any vector in Rn. In systems terms, we can say that the response of a linear system to an arbitrary input can be computed from the responses of the system to each of the unit vectors in the input signal space. It follows that the m-dimensional vectors A(e(1)), . . . , A(e(n)) completely specify the linear transformation A.

Definition 2.2.1. The matrix of a linear transformation A : Rn → Rm is the m × n matrix A whose jth column is given by A(e(j)).

Example 2.2.1. Suppose the transformation A : R3 → R2 is such that

A(e(1)) = [2; 1], A(e(2)) = [−1; 0] and A(e(3)) = [4; −1]

Then the matrix of A is given by

A = [ 2 −1 4 ; 1 0 −1 ]

Applying A to x = (x1, x2, x3) produces

A(x) = x1[2; 1] + x2[−1; 0] + x3[4; −1] = [2x1 − x2 + 4x3 ; x1 − x3]

2.2.2 The Matrix-Vector Product

Definition 2.2.2. Let A be an m × n matrix representing a linear transformation A : Rn → Rm, and let x be an n-dimensional column vector. The product Ax is defined as the result of applying the transformation A to x.

In other words, Ax is the linear combination of the columns of A with coefficients given by the corresponding entries in x. To see why this is so, let A = [aij]m×n. Then, for x = (x1, . . . , xn), we have

Ax = x1[a11; a21; . . . ; am1] + x2[a12; a22; . . . ; am2] + · · · + xn[a1n; a2n; . . . ; amn]
   = [ a11x1 + a12x2 + · · · + a1nxn ;
       a21x1 + a22x2 + · · · + a2nxn ;
       . . . ;
       am1x1 + am2x2 + · · · + amnxn ]

Thus the ith element of the resulting vector y = Ax is given by

yi = (Ax)i = ai1x1 + ai2x2 + · · · + ainxn = Σ_{j=1}^{n} aij xj

i.e., it equals the inner product of the ith row of A and x. A different way of obtaining the product Ax is therefore by taking the inner (dot) product of each row of A with x.

Example 2.2.1. (Continued.) If

A = [ 2 −1 4 ; 1 0 −1 ] and x = [x1; x2; x3]

then

Ax = [ 2x1 − x2 + 4x3 ; x1 − x3 ]
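The two descriptions of Ax, as a linear combination of columns and as row-by-row inner products, can be compared directly in code. A Python sketch (the text's computations use MATLAB; function names here are introduced for illustration):

```python
def matvec_columns(A, x):
    """Ax as a linear combination of the columns of A."""
    m, n = len(A), len(A[0])
    y = [0.0] * m
    for j in range(n):
        for i in range(m):
            y[i] += x[j] * A[i][j]
    return y

def matvec_rows(A, x):
    """Ax as inner products of the rows of A with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[2, -1, 4],
     [1, 0, -1]]
x = [1.0, 2.0, 3.0]
# 2*1 - 1*2 + 4*3 = 12  and  1*1 + 0*2 - 1*3 = -2
assert matvec_columns(A, x) == matvec_rows(A, x) == [12.0, -2.0]
```
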

.2: Two systems connected in series. . Suppose A : Rp → Rm and B : Rn → Rp are two linear transformations. .3. . A and B. we examine the j th column of AB. We know that (AB)·j is obtained by applying A◦B to the j th unit vector (j) in the input space Rn . . x B B(x) A y=A(B(x)) Figure 2.2. then the product AB is deﬁned as the m × n matrix of the transformation A ◦ B : Rn → Rm . .. . .. a2p   b2j     (AB)·j =  . amp . .58 2. then A is applied to B(x) to produce (A ◦ B)(x).  . we have (AB)·j = A((B)·j ) = A(B)·j Thus the j th column of the product AB is given by the product of A and the j th column of B:    b1j a11 a12 .   . . as illustrated in Figure 2. bpj am1 am2 . If A is the m × p matrix of the transformation A : Rp → Rm and B is the p × n matrix of the transformation B : Rn → Rp . to produce a pdimensional vector B(x). Thus B is applied to x ﬁrst. To obtain AB in terms of the entries of A and B.2.  . i. .3 The Matrix-Matrix Product The matrix-vector product can be generalized to the product of two matrices. which we denote here by (AB)·j . . by considering the linear transformations represented by A and B.  . we have two linear systems A and B connected in series (also tandem or cascade). . e (AB)·j = (A ◦ B)(e(j) ) = A(B(e(j) )) Since B(e(j) ) is the j th column of B (denoted by (B)·j ). and consider the composition A ◦ B deﬁned by (A ◦ B)(x) = A(B(x)) where x is n-dimensional. . Deﬁnition 2. .e.2. In systems terms. a1p  a21 a22 .

2.3.3. we can write the output y of the cascade as A(BC)x.59 It can be seen that the ith entry of (AB)·j . j)th entry is given by p (AB)ij = ai1 b1j + ai2 b2j + · · · + aip bpj = k=1 aik bkj Example 2. which is the matrix of the cascaded transformation A ◦ B ◦ C depicted in Figure 2. then AB is the m × n matrix whose (i. 2. If  4 A= 1  −2 then   and B= 1 −3 2 2 −1 4 1 0 −1 and  1 −2 3  B= 0 1 5  6 13 0 −7  4 −12 8 2  AB =  1 −3 −2 6 −4 Note that in this case.2. which is the same as the (i.2.4 Associativity and Commutativity Fact. is given by the inner product of the ith row of A and the j th column of B. Fact. the (i. Grouping C and B together. If A= then AB = Example 2. B and C are such that the products AB and BC can be formed. Matrix multiplication is associative. Thus ABC = A(BC) = (AB)C . j)th element of AB is simply ai bj . this is because the rows of A and columns of B each consist of a single element. we have the equivalent expression (AB)Cx. j)th entry of AB. we can also form the product ABC. If the dimensions of A. If A is m × p and B is p × n.2. Grouping B and A together.

Figure 2.3: Three systems connected in series: x → C → C(x) → B → B(C(x)) → A → y = A(B(C(x))).

This is also known as the associative property of matrix multiplication. It can be extended to the product of L matrices, where L is arbitrary: the number of matrices in the product can be reduced to L − 1 by multiplying out any two adjacent matrices without changing their position in the product; this procedure can then be repeated until the product is expressed as a single matrix.

Fact. Matrix multiplication is not commutative.

We say that A and B commute if

AB = BA

Clearly, the only way the products AB and BA can both exist and have the same dimensions is if the matrices A and B are both square, i.e., n × n. Still, this is not sufficient for commutativity, as Example 2.2.4 below shows.

Example 2.2.4. Consider the two-dimensional case (n = 2). Let A represent projection of (x1, x2) on the horizontal (x1) axis. The linear transformation A applied to the two unit vectors (1, 0) and (0, 1) yields

A(1, 0) = (1, 0) and A(0, 1) = (0, 0)

and therefore

A = [ 1 0 ; 0 0 ]

Let B represent a counterclockwise rotation of (x1, x2) by an angle θ. We can obtain the elements of B either from geometry, or by noting that rotation by angle θ amounts to multiplication of the complex number x1 + jx2 by e^{jθ} = cos θ + j sin θ. The result of the multiplication is

x1 cos θ − x2 sin θ + j(x1 sin θ + x2 cos θ) = (x1 cos θ − x2 sin θ, x1 sin θ + x2 cos θ)

and thus

B = [ cos θ −sin θ ; sin θ cos θ ]

We then have

AB = [ cos θ −sin θ ; 0 0 ] while BA = [ cos θ 0 ; sin θ 0 ]

Thus the two matrices do not commute, except in the trivial case where sin θ = 0, i.e., θ = 0 or θ = π.

This can also be seen using a geometrical argument. Briefly, the transformation A ◦ B (whose matrix is AB) is a cascade of a rotation (first), followed by projection on the x1-axis. The result of the transformation is therefore always on the x1-axis, i.e., it has x2 = 0. On the other hand, projection on the x1-axis followed by rotation through an angle other than 0 or π will result in a vector with x2 ≠ 0 (whenever the projection is nonzero).

Example 2.2.5. If A and B are rotations by angles φ and θ, respectively, then

AB = BA

i.e., the two matrices commute. This is because the order of the two rotations is immaterial: both AB and BA represent a rotation by an angle θ + φ.
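The projection/rotation pair and the rotation/rotation pair can both be checked numerically. A short Python sketch (helper names introduced here; the text argues these cases geometrically):

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

P = [[1.0, 0.0], [0.0, 0.0]]        # projection on the x1-axis
R = rotation(math.pi / 3)

close = lambda X, Y: all(abs(X[i][j] - Y[i][j]) < 1e-12
                         for i in range(2) for j in range(2))

assert not close(matmul(P, R), matmul(R, P))        # projection and rotation do not commute
assert close(matmul(rotation(0.7), rotation(1.1)),  # two rotations commute
             matmul(rotation(1.1), rotation(0.7)))
```
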

2.3 More on Matrix Algebra

2.3.1 Column Selection and Permutation

We saw that if A is an m × n matrix and e(j) is the jth unit vector in Rn×1, then

Ae(j) = (A)·j

where (A)·j is the jth column of the matrix A. Thus multiplication of a matrix A by a (column) unit vector is the same as column selection, as illustrated in Example 2.3.1.

Example 2.3.1.

[ a b c ; d e f ; g h i ] [ 0 ; 1 ; 0 ] = [ b ; e ; h ]

By appending one or more unit vectors to e(j), it is possible to select two or more columns from the matrix A. This is based on the following general fact.

Fact. If B = [ B(1) B(2) ], where B(1) and B(2) are submatrices of B, each having the same number of rows as B, then

AB = [ AB(1) AB(2) ]

Similarly, if A = [ A(1) ; A(2) ] (stacked vertically), where A(1) and A(2) are submatrices of A, each having the same number of columns as A, then

AB = [ A(1)B ; A(2)B ]

Also,

AB = [ A(1)B(1) A(1)B(2) ; A(2)B(1) A(2)B(2) ]

The formulas above are easily obtained by recalling that the (i, j)th element of the product AB is the inner product of the ith row of A and the jth column of B. They can be extended to horizontal partitions of A and vertical partitions of B involving an arbitrary number of submatrices.

Choosing B(1), B(2), . . . , as unit vectors in Rn×1, we obtain a matrix consisting of columns of A. This is illustrated in Example 2.3.2 below.

Example 2.3.2.

[ a b c ; d e f ; g h i ] [ 0 0 ; 1 1 ; 0 0 ] = [ b b ; e e ; h h ]

[ a b c ; d e f ; g h i ] [ 0 0 1 ; 1 0 0 ; 0 1 0 ] = [ b c a ; e f d ; h i g ]

Note that the second product in Example 2.3.2 resulted in a matrix with the same columns as those of A, but permuted. This is because

[ 0 0 1 ; 1 0 0 ; 0 1 0 ]

consists of all the unit (column) vectors in arbitrary order. This type of matrix is known as a permutation matrix.

Definition 2.3.1. An n × n matrix P is a permutation matrix if all rows and columns of P are unit vectors in Rn. Two equivalent conditions are:

• the rows of P are the unit vectors in R1×n (in any order); and
• the columns of P are the unit vectors in Rn×1 (in any order).

The product AP is a matrix whose columns are the same as those of A, ordered in the same way as the unit vectors are ordered in the columns of P. Note that A can be any m × n matrix; it need not be square.

2.3.2 Matrix Transpose

If A is an m × n matrix, then A^T is the n × m matrix whose rows are the columns of A (equivalently, whose columns are the rows of A). A^T is known as the transpose of A, and T is known as the transpose operator.
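Column selection by a permutation matrix can be sketched without any numeric arithmetic, since each column of AP is just a column of A picked out by a unit vector. A Python illustration (the helper name is introduced here, not from the text):

```python
def permute_columns(A, P):
    """AP for a permutation matrix P: column j of AP is the column of A
    selected by the unit vector sitting in column j of P."""
    n = len(P)
    # column j of P is the unit vector e(k) with P[k][j] == 1
    sel = [next(k for k in range(n) if P[k][j] == 1) for j in range(n)]
    return [[row[k] for k in sel] for row in A]

A = [["a", "b", "c"],
     ["d", "e", "f"],
     ["g", "h", "i"]]
P = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
# Same permutation as the second product in Example 2.3.2
assert permute_columns(A, P) == [["b", "c", "a"],
                                 ["e", "f", "d"],
                                 ["h", "i", "g"]]
```
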

If A = [aij], then A^T is defined by

    (A^T)ij = (A)ji = aji

for every pair (i, j).

Example 2.3.3.

    [ a b c ]^T = [ a ]
                  [ b ]
                  [ c ]

    [ a b c ]^T   [ a d g ]
    [ d e f ]   = [ b e h ]
    [ g h i ]     [ c f i ]

    [ a b c ]^T   [ a b c ]
    [ b d e ]   = [ b d e ]
    [ c e f ]     [ c e f ]

Note that the third transposition in Example 2.3.3 resulted in the same (square) matrix. This was a consequence of symmetry about the leading diagonal.

Definition 2.3.3. An n × n matrix A is symmetric if A = A^T. Equivalent conditions are:

• for every (i, j), aij = aji; and
• for every i, the i-th row of A equals (the transpose of) its i-th column.

Fact. (A^T)^T = A

Fact. Assuming the product AB exists,

    (AB)^T = B^T A^T

To prove this fact, note that the (i, j)-th element of (AB)^T is the (j, i)-th element of AB. This equals the inner product of the j-th row of A and the i-th column of B, which is the same as the inner product of the j-th column of A^T and the i-th row of B^T. This is just the (i, j)-th element of B^T A^T.
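The two transposition facts can be spot-checked numerically. A sketch in Python with NumPy (the random matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Transposing twice recovers the original matrix.
assert np.allclose(A.T.T, A)

# The transpose of a product reverses the order of the factors.
assert np.allclose((A @ B).T, B.T @ A.T)

# A matrix of the form A A^T is always symmetric.
S = A @ A.T
assert np.allclose(S, S.T)
```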

2.3.3 Row Selection

As usual, let A be an m × n matrix. If e^(i) is the i-th unit vector in R^(m×1), then (e^(i))^T is the i-th unit vector in R^(1×m). The product (e^(i))^T A is a row vector whose transpose equals

    ((e^(i))^T A)^T = A^T e^(i)

namely the i-th column of the n × m matrix A^T. As we know, this is the same as the i-th row of A. We therefore see that left multiplication of a matrix A by a row unit vector results in selecting a row from A.

Example 2.3.4.

            [ a b c ]
    [ 0 1 0 ] [ d e f ] = [ d e f ]
            [ g h i ]

Similarly, left-multiplying A by an m × m permutation matrix results in a permutation of the rows of A. That is, PA is a matrix whose rows are the same as those of A, ordered in the same way as the unit vectors are ordered in the rows of P. (Note that this ordering is not necessarily the same as that of the unit vectors in the columns of P.)

Example 2.3.5.

    [ 0 1 0 ] [ a b c ]   [ d e f ]
    [ 1 0 0 ] [ d e f ] = [ a b c ]
    [ 0 0 1 ] [ g h i ]   [ g h i ]

Fact. If P is a permutation matrix, then so is P^T. Furthermore,

    P^T P = I

where Iij = 1 if i = j, and 0 otherwise.

The first statement follows directly from the definition of a permutation matrix. To prove the second statement, note that the (i, j)-th element of P^T P equals the inner product of the i-th and j-th columns of P. Since the columns of P are distinct unit vectors, it follows that the inner product equals unity if i = j, and zero otherwise.
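Row selection and the identity P^T P = I can be checked the same way, in Python with NumPy (arbitrary entries):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

# Left multiplication by a row unit vector selects a row.
e2_row = np.array([0., 1., 0.])
assert np.allclose(e2_row @ A, A[1, :])

# Left multiplication by a permutation matrix permutes the rows.
P = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 1.]])
assert np.allclose(P @ A, A[[1, 0, 2], :])

# P^T P = I holds for any permutation matrix.
assert np.allclose(P.T @ P, np.eye(3))
```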

2.4 Matrix Inversion and Linear Independence

2.4.1 Inverse Systems and Signal Representations

We used the equation

    y = A(x) = Ax

to describe the response y of a linear system A to an input signal x. In general, the matrix A is of size m × n, which means that the input vector x is n-dimensional and the output vector y is m-dimensional.

In almost all cases, the output signal y contains useful information about the input signal x, which may otherwise be “hidden” (i.e., not directly accessible). The natural question to ask is whether this information is complete; in other words, whether it is possible to recover x from y. A careful approach to this (inversion) problem must first address the following two issues:

• Existence of an inverse: Given any y in the output space (R^m), can we find a vector x in the input space (R^n) such that y = A(x)? If not, for which vectors y in R^m is this possible?

• Uniqueness of the inverse: If, for a particular vector y in R^m, there exists x in R^n such that y = A(x), is x unique? Or do there exist other vectors x′ ≠ x which also satisfy y = A(x′)?

As we will soon see, the answers to the questions posed above (for a given system A) determine whether the inverse system A^(−1) can be defined in a meaningful way. The transformation y → x, if properly defined, would be the inverse system A^(−1). How that system can be obtained practically from A is a separate problem which we will discuss later.

Inversion of linear transformations is also important in signal analysis. Consider an m × n matrix V consisting of columns v^(1), . . . , v^(n):

    V = [ v^(1) · · · v^(n) ]

If the column vectors are taken as reference signals (e.g., discrete-time sinusoids in Fourier analysis), a natural question to ask is whether an arbitrary signal s in R^m can be expressed as a linear combination of these reference signals, i.e., whether there exists a vector c of coefficients c1, . . . , cn such that

    s = Σ_{r=1}^{n} cr v^(r)

or equivalently,

    s = Vc

This problem of signal representation is analogous to the system inversion problem posed earlier, with a change of notation from y = Ax to s = Vc. Clearly, the existence and uniqueness questions formulated earlier are also relevant in signal representation.

In some applications, the set of reference signals v^(1), . . . , v^(n) may not be sufficiently large to provide a representation for every signal s in R^m, i.e., the answer to the existence question is negative. In such cases, we are often interested in the best approximation to a signal s in terms of a linear combination of the columns of V. In other words, we are interested in minimizing some function of the error vector Vc − s by choice of c. We will later study the solution to this problem in the case where the function chosen for that purpose is the sum of squares of the error vector entries.

2.4.2 Range of a Matrix

The range, or column space, of an m × n matrix A is defined as the set of all linear combinations of its columns, and is denoted by R(A). Thus

    R(A) = { Ac : c ∈ R^(n×1) }

R(A) is also the range of the transformation A, which maps R^n into R^m. Clearly,

    R(A) ⊂ R^(m×1)

The concept of range allows us to reformulate the existence and uniqueness questions of the previous section as follows:

• Existence of an inverse: What is the range R(A)? Does it coincide with R^(m×1), or is it a proper subset of R^(m×1)?

• Uniqueness of an inverse: For y ∈ R(A), does the set of vectors x ∈ R^(n×1) satisfying y = Ax consist of a single element? Or does it contain two or more vectors?

In what follows, we will examine the properties of the range of a square (i.e., n × n) matrix A, and argue that the answers to the existence and

uniqueness questions posed above are either both affirmative or both negative. Our discussion will involve the concept of linear independence of a set of vectors, and will also allow us to draw certain conclusions about matrices of arbitrary size m × n.

2.4.3 Geometrical Interpretation of the Range

R(A) is a linear subspace of R^(m×1). Here, “linear” means that all linear combinations of vectors in R(A) also lie in R(A). To see why this has to be so, take two vectors in R(A), say Ac^(1) and Ac^(2), and form their linear combination λAc^(1) + µAc^(2). This can be written as A(λc^(1) + µc^(2)), and therefore also lies in R(A).

The largest Euclidean space R^(m×1) which we can comfortably visualize is the three-dimensional one, R^(3×1). It has four types of subspaces, categorized by their dimension d:

• d = 0: Only one such subspace exists, namely the origin 0 = [0 0 0]^T.
• d = 1: Straight lines through the origin.
• d = 2: Planes through the origin.
• d = 3: Only one such subspace exists, namely R^(3×1) itself.

The range of a 3 × n matrix will fall into one of the above categories, i.e., it will have dimension 0, 1, 2 or 3.

To see how algebraic relationships between columns affect the dimension of the range, let us focus on the special case of a square (3 × 3) matrix A given by

    A = [ a^(1) a^(2) a^(3) ]

The range of the first column a^(1) consists of all vectors of the form c1 a^(1), where c1 ∈ R. As c1 varies, we obtain a full line through the origin unless a^(1) = 0, in which case the range consists of a single point, the origin. Thus with only one exception, the range of a single column vector is one-dimensional, as illustrated in Figure 2.4. Clearly, the same statements can be made about the range of the second column a^(2).

If we now consider the range of the matrix [a^(1) a^(2)], we observe the following: provided R(a^(1)) and R(a^(2)) are two distinct lines through the origin, linear combinations of a^(1) and a^(2) will produce a linear subspace of higher dimension. In fact, R([a^(1) a^(2)]) is the

unique plane which contains the lines R(a^(1)) and R(a^(2)).

[Figure 2.4: The range of a single nonzero vector a^(1) is a straight line through the origin.]

Thus R([a^(1) a^(2)]) is two-dimensional if and only if R(a^(1)) and R(a^(2)) are two distinct lines through the origin. This is illustrated in Figure 2.5 (on the left).

It is important to note that any vector y on the plane R([a^(1) a^(2)]) has a unique representation as a linear combination of a^(1) and a^(2). This is illustrated in Figure 2.5 (on the right), where a line parallel to a^(2) is drawn through y to intersect the line containing a^(1). This results in two vectors, c1 a^(1) and c2 a^(2), and the unique representation

    y = c1 a^(1) + c2 a^(2)

[Figure 2.5: The range of [a^(1) a^(2)] is a plane through the origin (left). Any point on that plane has a unique representation as a linear combination of a^(1) and a^(2) (right).]

As pointed out earlier, R([a^(1) a^(2)]) is two-dimensional if and only if R(a^(1)) and R(a^(2)) are two distinct lines through the origin. This is the

same as saying that neither a^(1) nor a^(2) is an all-zeros vector, and, in addition, that the two vectors are not multiples of each other. These conditions can be concisely stated as follows:

    The only linear combination c1 a^(1) + c2 a^(2) which equals the all-zeros vector 0 = [0 0 0]^T is the one with c1 = c2 = 0.

Finally, let us consider the third column a^(3). Arguing in the same way as before, we see that, provided R([a^(1) a^(2)]) is a (two-dimensional) plane that does not contain a^(3), linear combinations of vectors in R([a^(1) a^(2)]) with a^(3) will produce a linear subspace of higher dimension, namely R^(3×1) itself. Furthermore, any vector in R^(3×1) will have a unique representation of the form

    y = c1 a^(1) + c2 a^(2) + c3 a^(3)

This is illustrated in Figure 2.6 (on the right), where a line parallel to a^(3) is drawn through y to intersect the plane R([a^(1) a^(2)]). We thus determine c3, and, proceeding as before (i.e., as in the two-dimensional case), we also determine c1 and c2.

[Figure 2.6: The range of the matrix [a^(1) a^(2) a^(3)] is the entire three-dimensional space R^(3×1) (left). Any point in R^(3×1) has a unique representation as a linear combination of a^(1), a^(2) and a^(3) (right).]

In conclusion, R(A) = R^(3×1) if and only if none of the columns of A can be expressed as a linear combination of the others. In particular, this condition also prohibits any of the columns from being an all-zeros vector or a multiple of another column. An equivalent way of stating this is as follows:

    The only linear combination c1 a^(1) + c2 a^(2) + c3 a^(3)

which equals the all-zeros vector 0 is the one where c1 = c2 = c3 = 0.

We have thus arrived at one of the most important concepts in linear algebra, that of linear independence. We restate it for an arbitrary set of m-dimensional vectors.

Definition 2.4.1. The vectors a^(1), . . . , a^(n) in R^(m×1) are linearly independent if the only linear combination

    c1 a^(1) + · · · + cn a^(n)

which equals the all-zeros vector 0 is the one where c1 = . . . = cn = 0.

An equivalent definition of linear independence in terms of the m × n matrix A = [a^(1) · · · a^(n)] is the following:

    The columns of A are linearly independent if the only solution to the equation Ac = 0 is the all-zeros vector c = 0 (in R^(n×1)).
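Numerically, linear independence of the columns of A can be tested by checking whether the rank of A equals the number of columns. A sketch in Python with NumPy (the matrices are arbitrary examples):

```python
import numpy as np

# The columns of A1 are linearly independent.
A1 = np.array([[1., 1., 3.],
               [0., 1., 2.],
               [0., 0., 1.]])
assert np.linalg.matrix_rank(A1) == 3

# The columns of A2 are linearly dependent: col1 + 2*col2 - col3 = 0.
A2 = np.array([[1., 1., 3.],
               [-1., 0., -1.],
               [0., 1., 2.]])
assert np.linalg.matrix_rank(A2) == 2

# A nonzero c with A2 c = 0 exhibits the dependence.
c = np.array([1., 2., -1.])
assert np.allclose(A2 @ c, 0.)
```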

2.5 Inverse of a Square Matrix

2.5.1 Nonsingular Matrices

In the previous section, we showed that linear independence of the columns of a 3 × 3 matrix A is a necessary and sufficient condition for the range (or column space) R(A) of A to have dimension d = 3, i.e., to coincide with R^(3×1). Since R(A) is the set of vectors y ∈ R^(3×1) for which the equation

    Ax = y

has a solution (given by x), it follows that linear independence is a necessary and sufficient condition for the above equation to have a solution for every y in R^(3×1). Thus linear independence provides an answer to the question of existence of an inverse. By extension, the same argument can be applied to an arbitrary n × n matrix: linear independence of the columns of A is a necessary and sufficient condition for R(A) to coincide with R^(n×1), and for the above equation to have a solution for every y in R^(n×1).

The uniqueness of the inverse can be also addressed using the concept of linear independence. Using the argument made for the case n = 3, we conclude that if R(A) = R^(n×1), then every vector in R^(n×1) has a unique representation as a linear combination of the columns of A, i.e., the above equation has a unique solution. If, on the other hand, R(A) has dimension d < n, then at least one column of A lies on the subspace generated by the remaining columns and can thus be expressed as a linear combination of those columns, meaning that the columns of A are linearly dependent. Any point in that subspace will have infinitely many representations. As an example, take the case n = 3 and suppose that

    c1 a^(1) + c2 a^(2) + c3 a^(3) = 0

where c is such that (say) c3 ≠ 0. Then

    a^(3) = −(λc1/c3) a^(1) − (λc2/c3) a^(2) + (1 − λ) a^(3)

for every value of λ ∈ R, showing that a^(3) (and, by extension, every linear combination of a^(1), a^(2) and a^(3)) has infinitely many representations.

In conclusion, linear independence of the columns of an n × n matrix is a necessary and sufficient condition for the existence of a solution to the equation Ax = y

for every y ∈ R^(n×1), as well as for the uniqueness of that solution. In other words, an inverse transformation A^(−1) : R^n → R^n also exists.

Definition 2.5.1. An n × n matrix is nonsingular if its columns are linearly independent. It is singular if its columns are linearly dependent.

Example 2.5.1. The 3 × 3 matrix

    A = [ 1 a b ]
        [ 0 1 c ]
        [ 0 0 1 ]

is nonsingular. Note that the first column does not equal the all-zeros vector. Also, the second column cannot be obtained from the first one by scaling (no scaling of zero can produce a nonzero entry in the second row). Similarly, the third column cannot be obtained as a linear combination of the first and second columns (no linear combination of zeros can produce a nonzero entry in the third row).

Example 2.5.2. The 3 × 3 matrix

    A = [  1 1  3 ]
        [ −1 0 −1 ]
        [  0 1  2 ]

is singular, since

    [  1 ]       [ 1 ]   [  3 ]
    [ −1 ] + (2) [ 0 ] = [ −1 ]
    [  0 ]       [ 1 ]   [  2 ]

(A test for nonsingularity will be developed later in conjunction with Gaussian elimination.)

2.5.2 Inverse of a Nonsingular Matrix

From the foregoing discussion, it follows that a nonsingular matrix A defines a linear transformation A : R^n → R^n which is a one-to-one correspondence. In particular, an inverse transformation A^(−1) : R^n → R^n exists, and

    y = A(x)  ⇔  x = A^(−1)(y)

The transformation A^(−1) is also linear. To see this, let y^(1) and y^(2) be any two vectors in R^n, and let x^(1) = A^(−1)(y^(1)) and x^(2) = A^(−1)(y^(2)).

Since the forward transformation A is linear, we have, for any c1 and c2,

    A(c1 x^(1) + c2 x^(2)) = c1 A(x^(1)) + c2 A(x^(2)) = c1 y^(1) + c2 y^(2)

Therefore

    A^(−1)(c1 y^(1) + c2 y^(2)) = c1 x^(1) + c2 x^(2) = c1 A^(−1)(y^(1)) + c2 A^(−1)(y^(2))

which proves that A^(−1) is a linear transformation. As such, it has a matrix, denoted by A^(−1).

Definition 2.5.2. If A is a nonsingular matrix corresponding to a linear transformation A, then A^(−1) is the matrix of the inverse transformation A^(−1).

From the above definition, we have

    (A^(−1))^(−1) = A

Since A^(−1)(A(x)) = x for every x ∈ R^n, it follows that

    A^(−1) ◦ A = I

namely the identity transformation. The corresponding matrix is the n × n identity matrix

    I = [ 1 0 . . . 0 ]
        [ 0 1 . . . 0 ]
        [ . .  .    . ]
        [ 0 0 . . . 1 ]

which clearly satisfies Ix = x for every x ∈ R^(n×1). We conclude that

    A^(−1) A = I

We can also evaluate the product AA^(−1): AA^(−1) is the matrix of the cascade A ◦ A^(−1), which is also an identity system, and therefore

    A A^(−1) = I

Either of the last two equations can be used to check whether two given matrices are inverses of each other.
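That check is immediate numerically. A sketch in Python with NumPy (the matrix is an arbitrary nonsingular example):

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [4., 0., -1.],
              [-8., 2., 3.]])

Ainv = np.linalg.inv(A)

# Either product with the inverse yields the identity matrix.
assert np.allclose(A @ Ainv, np.eye(3))
assert np.allclose(Ainv @ A, np.eye(3))
```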

2.5.3 Inverses in Matrix Algebra

Fact. If A and B are both nonsingular, then so is the product AB, and

    (AB)^(−1) = B^(−1) A^(−1)

This fact can be demonstrated schematically (see Figure 2.7), by noting that AB is the matrix of the cascade A ◦ B. This cascade can be inverted by applying the transformation A^(−1) to the output, followed by B^(−1). Thus

    (A ◦ B)^(−1) = B^(−1) ◦ A^(−1)

and consequently (AB)^(−1) = B^(−1) A^(−1).

[Figure 2.7: Illustration of the identity (AB)^(−1) = B^(−1) A^(−1).]

An alternative proof is based on the identities developed in the previous section. We have

    A^(−1)(AB) = (A^(−1) A)B = B

where the first equality is due to the associativity of the matrix product. It follows that

    B^(−1) A^(−1)(AB) = B^(−1) B = I

and thus AB and B^(−1) A^(−1) are inverses of each other.

Fact. If A is nonsingular, then so is A^T, and

    (A^T)^(−1) = (A^(−1))^T

To prove this identity, recall that (AB)^T = B^T A^T.

Taking B = A^(−1), we obtain

    I^T = (A^(−1))^T A^T

But I is symmetric: I^T = I. Therefore

    (A^(−1))^T A^T = I

which means that

    (A^T)^(−1) = (A^(−1))^T

2.5.4 Nonsingularity of Triangular Matrices

Definition 2.5.3. A lower triangular matrix A is a square matrix defined by

    (∀ j > i)  aij = 0

i.e., all elements above the main diagonal (also known as superdiagonal elements) are zero. Similarly, an upper triangular matrix A is a square matrix whose elements below the main diagonal (the subdiagonal elements) are zero.

Triangular matrices have numerous applications in linear algebra, and are particularly useful in solving equations of the form Ax = b, where A is square. The question of invertibility (i.e., nonsingularity) of triangular matrices is important in all such applications. As it turns out, nonsingularity can be readily determined by inspection of the matrix.

Recall how, in Example 2.5.1, we argued that the upper triangular matrix

    [ 1 a b ]
    [ 0 1 c ]
    [ 0 0 1 ]

is nonsingular. A similar argument could be given for the lower triangular matrix

    [ 1 0 0 ]
    [ a 1 0 ]
    [ b c 1 ]

but there is no need to do so: a matrix is nonsingular if and only if its transpose is. In what follows, we make a general observation about triangular matrices of arbitrary size.

Fact. A triangular matrix A is nonsingular if and only if all elements on its main diagonal are nonzero, i.e., aii ≠ 0 for all i.
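The diagonal test is easy to confirm numerically. A sketch in Python with NumPy (entries arbitrary); the rank computation serves as an independent check of nonsingularity:

```python
import numpy as np

# Upper triangular, all diagonal elements nonzero: nonsingular.
T1 = np.array([[1., 5., 7.],
               [0., 2., 9.],
               [0., 0., 3.]])
assert np.linalg.matrix_rank(T1) == 3

# The same matrix with a zero placed on the diagonal: singular.
T2 = np.array([[1., 5., 7.],
               [0., 0., 9.],
               [0., 0., 3.]])
assert np.linalg.matrix_rank(T2) < 3
```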

To prove the “if” statement, suppose that all (main) diagonal elements are nonzero. Arguing as in Example 2.5.1, we see immediately that the first column does not equal the all-zeros vector (since a11 ≠ 0). Examining each of the following columns in turn, we see that the j-th column cannot be expressed as a linear combination of the first j − 1 columns, since its nonzero diagonal entry ajj would have to be obtained by a linear combination of zeros, which is clearly impossible. Thus the columns of A are linearly independent, and A is nonsingular.

To prove the “only if” statement, we argue by contradiction. Suppose that there is at least one zero on the diagonal, and that the leftmost such zero is in the j-th column, i.e.,

    ajj = 0   and   (∀ i < j)  aii ≠ 0

This means that the first j columns of A will be of the form

    [ a11  a12  a13  . . .  a_{1,j−1}    a1j      ]
    [ 0    a22  a23  . . .  a_{2,j−1}    a2j      ]
    [ 0    0    a33  . . .  a_{3,j−1}    a3j      ]
    [ .    .    .           .            .        ]
    [ 0    0    0    . . .  a_{j−1,j−1}  a_{j−1,j} ]
    [ 0    0    0    . . .  0            0        ]
    [ .    .    .           .            .        ]
    [ 0    0    0    . . .  0            0        ]

Since the diagonal elements a11 through a_{j−1,j−1} are all nonzero, the first j − 1 columns will be linearly independent (see the “if” argument above). The same will be true of the column segments obtained by discarding the zeros beneath the (j − 1)-th row. Since the range of j − 1 linearly independent vectors in R^(j−1) is the entire space R^(j−1), we can obtain the first j − 1 entries of the j-th column by linearly combining the corresponding segments of the first j − 1 columns. The entire j-th column can therefore be obtained by a linear combination of the first j − 1 columns. Thus the columns of A are linearly dependent, and the matrix A is singular.

2.5.5 The Inversion Problem for Matrices of Arbitrary Size

The concept of linear independence allows us to investigate the existence and uniqueness of the solution of the equation

    Ax = y

where y ∈ R^(m×1) is known, x ∈ R^(n×1) is unknown, and the matrix A is of size m × n, with n ≠ m. In short, depending on whether m > n or m < n, and on whether the columns of A are linearly dependent or independent, a solution may or may not exist for a particular y ∈ R^(m×1); and if a solution exists for all y ∈ R^(m×1), then that solution cannot be unique. This is in sharp contrast to the case m = n, where existence of a solution for all y ∈ R^(m×1) is equivalent to uniqueness of that solution (and also equivalent to linear independence of the columns of A). It also tells us that defining an inverse matrix A^(−1) in the case n ≠ m can be very tricky; in fact, it is not done.

In more specific terms, the equation Ax = y seeks to express the m-dimensional column vector y as a linear combination of the n columns of A. A solution exists if and only if y ∈ R(A); and it is unique if and only if every y ∈ R(A) is given by a unique linear combination of columns of A, i.e., if and only if the columns of A are linearly independent. Note that since A has n columns, the dimension d of R(A) must satisfy d ≤ n. And since R(A) is a subset of R^(m×1), we must also have d ≤ m. Thus d ≤ min(m, n).

The overdetermined case (n < m). Here d < m, hence R(A) is a lower-dimensional subspace of R^(m×1). Thus there are (infinitely many) y’s for which the equation has no solution. For those y’s for which a solution exists (i.e., the y’s in R(A)), the solution is unique if and only if the columns of A are linearly independent, i.e., d = n.

The underdetermined case (n > m). The number of columns of A exceeds m, the dimension of R^(m×1). We know that at most m of these columns can be linearly independent, i.e., d ≤ m. There are two possibilities:

• If d = m, then R(A) = R^(m×1) and a solution exists for every y ∈ R^(m×1). That solution is not unique, since the n − m redundant columns can also be used in the linear combination Ax which gives y.

• If d < m, then R(A) is a lower-dimensional subset of R^(m×1). Then a solution cannot exist for every y; and if it does exist, it cannot (again) be unique.

The overdetermined case will be examined later on from the viewpoint of signal approximation, i.e., solving for the x which minimizes a function of the error vector Ax − y.
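The overdetermined case is where the least-squares approximation mentioned above enters. A sketch in Python with NumPy: np.linalg.lstsq minimizes the sum of squared entries of Ax − y, which is exactly the criterion to be studied later (the matrix and vectors are arbitrary examples):

```python
import numpy as np

# Overdetermined: 3 equations, 2 unknowns; R(A) is a plane in R^(3x1).
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])

y_in = np.array([2., 3., 5.])    # lies in R(A): y_in = A @ [2, 3]
x_in = np.linalg.lstsq(A, y_in, rcond=None)[0]
assert np.allclose(A @ x_in, y_in)          # an exact solution exists

y_out = np.array([1., 0., 0.])   # does not lie in R(A)
x_out = np.linalg.lstsq(A, y_out, rcond=None)[0]
# No exact solution; A @ x_out is the closest point of R(A) to y_out
# in the sum-of-squares sense.
assert not np.allclose(A @ x_out, y_out)
```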


2.6 Gaussian Elimination

2.6.1 Statement of the Problem

We now focus on solving the equation

    Ax = b

where A is an n × n matrix, and x and b are both n-dimensional column vectors. This (matrix-vector) equation can be expanded into n simultaneous linear equations in the variables x1, . . . , xn:

    a11 x1 + a12 x2 + · · · + a1n xn = b1
    a21 x1 + a22 x2 + · · · + a2n xn = b2
       .        .                .     .
    an1 x1 + an2 x2 + · · · + ann xn = bn

We know that a unique solution x exists for every b if and only if A is nonsingular. The procedure (algorithm) known as Gaussian elimination produces the unique solution x in the case where A is nonsingular, and can also be used to detect singularity of A.
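For reference, a library routine that performs this task is np.linalg.solve (Python with NumPy). The system below is the 3 × 3 example solved by hand in Subsection 2.6.3:

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [4., 0., -1.],
              [-8., 2., 3.]])
b = np.array([6., 6., -10.])

x = np.linalg.solve(A, b)       # raises LinAlgError if A is singular
assert np.allclose(A @ x, b)
assert np.allclose(x, [1., 2., -2.])
```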

2.6.2 Outline of Gaussian Elimination

The algorithm is based on the fact that if Ax = b and MAv = Mb for two nonsingular matrices A and M, then x = v. Indeed,

    v = (MA)^(−1) Mb = A^(−1) M^(−1) Mb = A^(−1) b = x

where all the inverses in the above equalities are valid by the assumption of nonsingularity. Thus the equations Ax = b and MAx = Mb have the same solution. In Gaussian elimination, left-multiplication of both sides of the equation by a nonsingular matrix is performed repeatedly until the matrix on the left

is reduced to the n × n identity:

    Ax = b
    M^(1) Ax = M^(1) b
    M^(2) M^(1) Ax = M^(2) M^(1) b
        .         .         .
    M^(l) · · · M^(2) M^(1) Ax = M^(l) · · · M^(2) M^(1) b

where

    M^(l) · · · M^(2) M^(1) A = I

or equivalently,

    M^(l) · · · M^(2) M^(1) = A^(−1)

The solution is then given by the right-hand side vector:

    x = M^(l) · · · M^(2) M^(1) b

Recall that left multiplication of a matrix by a row vector (i.e., a vector-matrix product) produces a linear combination of the matrix rows, with coefficients given by the corresponding entries in the row vector. By considering each row of the matrix M separately, we see that the product MA (which has the same dimensions as A) consists of rows that are linear combinations of the rows of A. Thus Gaussian elimination proceeds by a sequence of steps, each step involving certain row operations which are “encoded” as a nonsingular matrix M^(j). We should add that, since the same matrix M^(j) also multiplies the right-hand side vector, we effectively take linear combinations of entire equations (not just matrix rows).

Gaussian elimination is completed in two phases:

• forward elimination (also known as forward substitution), followed by
• backward substitution (also known as backward elimination).

The forward elimination phase is completed in n − 1 steps. During the j-th step, scaled versions of the j-th equation are subtracted from the equations below it so as to eliminate the variable xj from all these equations. The matrix M^(j) is a (nonsingular) lower triangular matrix whose form we will soon examine. As we shall see later, in certain situations it may be desirable (or even necessary) to change the order of equations indexed j through n prior to executing the j-th step. Such interchanges can be formally dealt with by inserting permutation matrices into the string M^(l) · · · M^(1); an easier, and more practical, way of representing such permutations will also be discussed.

The backward substitution phase is completed in n steps. The j-th step solves for the variable x_{n−j+1} using the previously computed values of xn, x_{n−1}, . . . , x_{n−j+2}. The matrices M^(j) are (nonsingular) upper triangular matrices. Different equivalent forms for these matrices will be examined in Section 2.7.

The solution of Ax = b is thus obtained in a total of l = 2n − 1 steps.

2.6.3 Gaussian Elimination Illustrated

Consider the 3 × 3 equation given by

     2x1 +  x2 −  x3 =   6
     4x1       −  x3 =   6
    −8x1 + 2x2 + 3x3 = −10

Since row operations (i.e., multiplication by matrices M^(j)) are performed on both sides of the equations, it is convenient to append the vector b to the matrix A:

    S = [ A  b ]

We will use the symbol S to denote the initial matrix [A b] as well as its updates; thus sij will refer to the (i, j)-th element in the current matrix S. We display S in a table, with an additional empty column on the left labeled “m” (for “multiplier”).

    m |  x1  x2  x3 |   b
      |   2   1  −1 |   6
      |   4   0  −1 |   6
      |  −8   2   3 | −10

We eliminate x1 from the second row by adding a multiple of the first row (to the second row). By inspection of the first column, the value of the multiplier equals −(4/2) = −2. Similarly, x1 can be eliminated from the third row by adding a multiple of the first row (to the third row), the value of the multiplier being −(−8)/2 = 4. These two values are entered in the m column, in their respective rows. The coefficient of x1 in the first row is also underlined; this coefficient was used to obtain the multiplier for each subsequent row.

    m  |  x1  x2  x3 |   b
       |   2   1  −1 |   6
    −2 |   4   0  −1 |   6
     4 |  −8   2   3 | −10

Upon adding the appropriate multiples of the first row to the second and third, we obtain

    m |  x1  x2  x3 |  b
      |   2   1  −1 |  6
      |   0  −2   1 | −6
      |   0   6  −1 | 14

To eliminate x2 from the third row, we add a multiple of the second row (to the third row), the value of the multiplier being −(6/(−2)) = 3. As before, we enter the multiplier in the m column, in the third row. The coefficient of x2 in the second equation is also underlined.

    m |  x1  x2  x3 |  b
      |   2   1  −1 |  6
      |   0  −2   1 | −6
    3 |   0   6  −1 | 14

Upon adding the multiple of the second row to the third one, we obtain

    m |  x1  x2  x3 |  b
      |   2   1  −1 |  6
      |   0  −2   1 | −6
      |   0   0   2 | −4

At this point we have completed the forward elimination. The resulting equations are

    2x1 + x2 − x3 =  6
        − 2x2 + x3 = −6
               2x3 = −4

In backward substitution, we solve for x3 (third equation):

    x3 = (−4)/2 = −2 ;

then use the result to solve for x2 (second equation):

    x2 = (−6 − x3)/(−2) = 2 ;

and finish by solving for x1 (first equation):

    x1 = (6 − x2 + x3)/2 = 1 .

Thus x = [1 2 −2]^T.
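The complete procedure can be expressed compactly in code. The following is a minimal sketch in Python with NumPy (no pivoting, so it assumes the diagonal entries encountered during elimination are nonzero); it reproduces the hand computation above:

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b by forward elimination and backward substitution.
    Minimal sketch: assumes A is nonsingular and that no row
    interchanges (pivoting) are needed."""
    n = len(b)
    S = np.hstack([A.astype(float), np.asarray(b, dtype=float).reshape(-1, 1)])
    # Forward elimination: zero out entries below the diagonal,
    # one column at a time, exactly as in the tables above.
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = -S[i, j] / S[j, j]      # the multiplier from the "m" column
            S[i] = S[i] + m * S[j]
    # Backward substitution: solve for x_n, x_(n-1), ..., x_1 in turn.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (S[i, n] - S[i, i+1:n] @ x[i+1:]) / S[i, i]
    return x

A = np.array([[2., 1., -1.],
              [4., 0., -1.],
              [-8., 2., 3.]])
b = np.array([6., 6., -10.])

x = gaussian_solve(A, b)        # reproduces the hand computation
assert np.allclose(x, [1., 2., -2.])
```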

2.6.4 Row Operations in Forward Elimination

Let S be the current update of [A b], and denote by (S)i· the i-th row of S. As we saw in the previous subsection, the j-th step in the forward elimination entails replacing each row (S)i· below (S)j· by

    (S)i· + mij (S)j·     where   mij = − sij / sjj

Rows (S)1·, . . . , (S)j· are not modified. These row operations can be carried out using matrix multiplication, namely by updating S to the product M^(j) S, where

    M^(j) = [ 1                          ]
            [    .                       ]
            [      1                     ]
            [      m_{j+1,j}  1          ]
            [         .           .      ]
            [      m_{n,j}           1   ]

(only nonzero elements are shown above). Note that the lower triangular matrix M^(j) differs from the identity only in the subdiagonal segment of the j-th column. That column segment (given by (m_{j+1,j}, . . . , m_{n,j})) is read off the m column in the Gaussian elimination table. Thus, in the example of the previous subsection, we had

    M^(1) = [  1 0 0 ]        M^(2) = [ 1 0 0 ]
            [ −2 1 0 ]   and          [ 0 1 0 ]
            [  4 0 1 ]                [ 0 3 1 ]
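We can verify numerically that these two matrices carry out the forward elimination of the example (Python with NumPy):

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [4., 0., -1.],
              [-8., 2., 3.]])

M1 = np.array([[1., 0., 0.],
               [-2., 1., 0.],
               [4., 0., 1.]])
M2 = np.array([[1., 0., 0.],
               [0., 1., 0.],
               [0., 3., 1.]])

U = M2 @ M1 @ A     # the two elimination steps as matrix products
# The result is upper triangular, as expected.
assert np.allclose(U, np.triu(U))
assert np.allclose(U, [[2., 1., -1.],
                       [0., -2., 1.],
                       [0., 0., 2.]])
```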

2.7 Factorization of Matrices in Gaussian Elimination

2.7.1 LU Factorization

Consider again the 3 × 3 example of Subsection 2.6.3:

     2x1 +  x2 −  x3 =   6
     4x1       −  x3 =   6
    −8x1 + 2x2 + 3x3 = −10

We saw that the row operations performed during the forward elimination phase amounted to left-multiplying the matrix

    S = [ A  b ] = [  2 1 −1   6 ]
                   [  4 0 −1   6 ]
                   [ −8 2  3 −10 ]

by the lower triangular matrices

    M^(1) = [  1 0 0 ]        M^(2) = [ 1 0 0 ]
            [ −2 1 0 ]   and          [ 0 1 0 ]
            [  4 0 1 ]                [ 0 3 1 ]

in succession. The update of S at the end of the forward elimination phase was

    [ 1 0 0 ] [  1 0 0 ] [  2 1 −1   6 ]   [ 2  1 −1  6 ]
    [ 0 1 0 ] [ −2 1 0 ] [  4 0 −1   6 ] = [ 0 −2  1 −6 ]
    [ 0 3 1 ] [  4 0 1 ] [ −8 2  3 −10 ]   [ 0  0  2 −4 ]

and thus

    M^(2) M^(1) [ A  b ] = [ U  y ]

where the matrix

    U = M^(2) M^(1) A = [ 2  1 −1 ]
                        [ 0 −2  1 ]
                        [ 0  0  2 ]

was upper triangular. We then solved

    Ux = y

by backward substitution, a procedure which, as we shall soon see, can also be implemented by a series of matrix products.

Recall that triangular matrices are nonsingular if and only if they have no zeros on the main diagonal. Thus both M^(1) and M^(2) are nonsingular in this case, and since M^(2) M^(1) A = U, we have that

    A = (M^(1))^(−1) (M^(2))^(−1) U

The inverse of a (nonsingular) lower triangular matrix is also lower triangular, and the product of two such matrices is also lower triangular. Thus we can write

    A = LU

where L = (M^(1))^(−1) (M^(2))^(−1) is lower triangular. The foregoing discussion is clearly applicable to any dimension n: in general, forward elimination on a nonsingular n × n matrix A yields a factorization of the form A = LU, where

    L = (M^(1))^(−1) · · · (M^(n−1))^(−1)

2.7.2 The Matrix L

It turns out that L is particularly easy to compute, due to the following interesting fact.

Fact. Let C^(j) be an n × n matrix whose entries are zero, with the possible exception of the subdiagonal segment of the j-th column. If k ≥ j, then

    C^(j) C^(k) = 0

(Here 0 is the all-zeros matrix in R^(n×n).)

To prove this fact, note the special form of C^(j) (elements not shown are zero):

    C^(j) = [ 0                          ]
            [    .                       ]
            [      0                     ]
            [      m_{j+1,j}  0          ]
            [         .           .      ]
            [      m_{n,j}           0   ]

Right multiplication of C^(j) by C^(k) will yield all-zero vectors in all columns of C^(j) C^(k) except (possibly) the k-th column. That column is a linear combination of the columns of C^(j), with coefficients given by the corresponding entries in the k-th column of C^(k). Since the nonzero entries in the k-th column of C^(k) appear below the main diagonal, only columns indexed k + 1 through n of C^(j) are included in the linear combination. But those columns are all zero, since k + 1 > j by hypothesis. This completes the proof.

Now, the lower triangular matrix M^(j) can be expressed as

    M^(j) = I + C^(j)

Using the previous fact, we have

    (I + C^(j))(I − C^(j)) = I − C^(j) + C^(j) − C^(j) C^(j) = I

and therefore

    (M^(j))^(−1) = I − C^(j)

i.e., (M^(j))^(−1) is obtained from M^(j) by negating the subdiagonal entries m_{j+1,j}, . . . , m_{n,j} of the j-th column. Using the same fact, we have that

    (M^(1))^(−1) (M^(2))^(−1) = (I − C^(1))(I − C^(2))
                              = I − C^(1) − C^(2) + C^(1) C^(2)
                              = I − C^(1) − C^(2)

and thus the product (M^(1))^(−1) (M^(2))^(−1) is obtained by “overlaying” the two matrices, i.e., adding their subdiagonal parts while keeping the same (unit) diagonal. Right multiplication by (M^(3))^(−1), . . . , (M^(n−1))^(−1) in succession yields the final result

    L = (M^(1))^(−1) (M^(2))^(−1) · · · (M^(n−1))^(−1) = I − C^(1) − C^(2) − · · · − C^(n−1)

or

    L = [ 1                                              ]
        [ −m21          1                                ]
        [ −m31         −m32        1                     ]
        [   .             .            .                 ]
        [ −m_{n−1,1}      .  . . .  −m_{n−1,n−2}   1     ]
        [ −m_{n,1}     −m_{n,2}  . . .  −m_{n,n−1}    1  ]

(all superdiagonal elements are zero). In other words, the subdiagonal entries of L are obtained from the m column of the Gaussian elimination table by a simple sign inversion.

Example 2.7.1. Continuing the example of Subsection 2.6.3, we have

    M^(1) = [  1 0 0 ]        M^(2) = [ 1 0 0 ]
            [ −2 1 0 ]   and          [ 0 1 0 ]
            [  4 0 1 ]                [ 0 3 1 ]

Therefore

    (M^(1))^(−1) = [  1 0 0 ]        (M^(2))^(−1) = [ 1  0 0 ]
                   [  2 1 0 ]   and                 [ 0  1 0 ]
                   [ −4 0 1 ]                       [ 0 −3 1 ]

Using the overlay property, we obtain

    L = (M^(1))^(−1) (M^(2))^(−1) = [  1  0 0 ]
                                    [  2  1 0 ]
                                    [ −4 −3 1 ]

Thus the LU factorization of A is given by

    [  2 1 −1 ]   [  1  0 0 ] [ 2  1 −1 ]
    [  4 0 −1 ] = [  2  1 0 ] [ 0 −2  1 ]
    [ −8 2  3 ]   [ −4 −3 1 ] [ 0  0  2 ]
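The overlay construction of L is easily verified for the example above (Python with NumPy):

```python
import numpy as np

# Sign-inverted multipliers from the m column, below a unit diagonal:
L = np.array([[1., 0., 0.],
              [2., 1., 0.],
              [-4., -3., 1.]])
U = np.array([[2., 1., -1.],
              [0., -2., 1.],
              [0., 0., 2.]])
A = np.array([[2., 1., -1.],
              [4., 0., -1.],
              [-8., 2., 3.]])

assert np.allclose(L @ U, A)     # A = LU
```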

Note: The overlay property cannot be used to compute the product L^(-1) = M^(n-1) ... M^(2) M^(1), since the order of the nonzero column segments in the M^(j)'s is reversed. Thus in Example 2.7.1, L^(-1) = M^(2)M^(1) is not the same as

    M^(1) M^(2) = [ 1 0 0 ; -2 1 0 ; 4 3 1 ]

(only the latter product can be formed by overlaying).

2.7.3 Applications of the LU Factorization

In many practical situations, the equation Ax = b is solved for the same matrix A (which could represent a known system or a set of standard reference signals), but for many different vectors b. The LU factorization of A allows us to store the results of the forward elimination in a way that minimizes the computational effort involved in solving the equation for a new value of b. Indeed, forward elimination on A can be performed ahead of time (i.e., off-line). It is interesting to note that the information contained in the pair (L, U) (ignoring the known zero and unit elements) amounts to n^2 real numbers, which is exactly the same amount as in the original matrix A.

For a given b, the equation Ax = b is equivalent to

    LUx = b

which has solution

    x = A^(-1) b = U^(-1) L^(-1) b

We saw that forward elimination transforms A to L^(-1)A = U, and that backward substitution transforms U to U^(-1)U = I. Focusing on the updates of the right-hand side vector b, we note that forward elimination transforms b to y = L^(-1)b, and backward substitution transforms y to x = U^(-1)y.
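The two triangular solves can be sketched in a few lines. The function names below are ours; the factors L and U are those of Example 2.7.1, and each new right-hand side b costs only about 2n^2 flops, with no new elimination.

```python
def forward_sub(L, b):
    """Solve L y = b for unit lower triangular L, top row down."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U (nonzero diagonal), bottom row up."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

L = [[1, 0, 0], [2, 1, 0], [-4, -3, 1]]
U = [[2, 1, -1], [0, -2, 1], [0, 0, 2]]

b = [2, 3, -3]              # chosen so that A x = b has solution x = (1, 1, 1)
y = forward_sub(L, b)       # forward elimination applied to b
x = back_sub(U, y)          # backward substitution
```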

Thus in effect, forward elimination solves the equation

    Ly = b

while backward substitution solves

    Ux = y

The two systems above can be solved using the same principle, namely substitution of known values into equations below (in the former case) or above (in the latter case). Thus the terms substitution and elimination can be used interchangeably in both the forward and the backward phase of Gaussian elimination. The similarity between the two procedures becomes more prominent when the factors L and U are known ahead of time, in which case there is no need to derive L by working on A.

2.7.4 Row Operations in Backward Substitution

The analogy between the lower triangular system Ly = b and the upper triangular system Ux = y suggests that row operations in backward substitution may also be implemented by a series of matrix multiplications similar to those used in forward elimination. This is indeed true, and is illustrated in Example 2.7.2.

Example 2.7.2. The forward elimination phase considered earlier resulted in

    m | x1  x2  x3 |  b
      |  2   1  -1 |  6
      |  0  -2   1 | -6
      |  0   0   2 | -4

Dividing each row by its (main) diagonal element, we obtain

    m | x1  x2   x3  |  b
      |  1  1/2 -1/2 |  3
      |  0   1  -1/2 |  3
      |  0   0    1  | -2

We now enter multipliers 1/2 and 1/2 for the first and second rows, respectively. These are sign-inverted entries from the third column (corresponding to x3):

     m  | x1  x2   x3  |  b
    1/2 |  1  1/2 -1/2 |  3
    1/2 |  0   1  -1/2 |  3
        |  0   0    1  | -2

Scaling the third row by the multiplier shown and adding it to the corresponding row results in eliminating x3 from both the first and second equations:

      m  | x1  x2  x3 |  b
    -1/2 |  1  1/2  0 |  2
         |  0   1   0 |  2
         |  0   0   1 | -2

Finally, we use the multiplier -1/2 (shown above) to scale the second row prior to adding it to the first one, which results in elimination of x2 from the first equation:

    m | x1 x2 x3 |  b
      |  1  0  0 |  1
      |  0  1  0 |  2
      |  0  0  1 | -2

We have thus reduced U to the identity I, and obtained the solution x in the last column. In terms of matrix multiplications, we have

    I = M^(5) M^(4) M^(3) U

where

    M^(3) = [ 1/2 0 0 ; 0 -1/2 0 ; 0 0 1/2 ]
    M^(4) = [ 1 0 1/2 ; 0 1 1/2 ; 0 0 1 ]
    M^(5) = [ 1 -1/2 0 ; 0 1 0 ; 0 0 1 ]

We make the following additional observations on Example 2.7.2.

- M^(3) is actually diagonal, with inverse

    D = (M^(3))^(-1) = [ 2 0 0 ; 0 -2 0 ; 0 0 2 ]

The result of left-multiplying U by M^(3) = D^(-1) is

    V = D^(-1) U = [ 1 1/2 -1/2 ; 0 1 -1/2 ; 0 0 1 ]

which, like L, has diagonal entries normalized to unity. We thus have an equivalent normalized factorization

    A = LDV

- If the diagonal entries are not normalized prior to backward substitution, then the multipliers will be of the form -sij/sjj, as was the case in forward elimination. At the end, U will be reduced to a diagonal (not necessarily identity) matrix.

 ..8. The denominator sjj in the above expression is known as the pivotal element. Assuming that such sij = 0 exists (for i > j)..8. .   1 M(j) =   1   mj+1.j   . then the j th row cannot be used for eliminating xj from the rows below it—the multipliers mij will equal inﬁnity for all i such that sij = 0.1 Pivoting in Gaussian Elimination Pivots and Pivot Rows The j th step in the forward elimination process aims at eliminating xj from all equations below the j th one. the pivot row contains sjj in its j th position (column). is known as the pivot row for xj . it makes sense to interchange the ith and j th rows and use the nonzero value—formerly sij and now sjj —as pivot. Clearly. 2. The j th row.92 2. for xj . .8 2. .2 Zero Pivots If a zero pivot sjj is encountered in the forward elimination process. mnj The multipliers are given by mij = − sij sjj               1 1 . Thus for i ≥ j + 1 (only). row (S)i· is replaced by (S)i· + mij (S)j· This is equivalent to left-multiplying S by  1  1   .  . or simply pivot. This is a simple concept which will be illustrated later. This is accomplished by subtracting multiples of the j th row of the updated matrix S = M(j−1) · · · M(2) M(1) A b from rows indexed j + 1 through n. multiples of which are subtracted from the remaining n−j rows (below it).

An interesting question is what happens when sjj, as well as all elements below it, are zero. We will then have

    M^(j-1) ... M^(2) M^(1) A =
        [ s11 s12 s13 ...   s1,j-1      s1j    ... ]
        [  0  s22 s23 ...   s2,j-1      s2j    ... ]
        [  0   0  s33 ...   s3,j-1      s3j    ... ]
        [               ...                        ]
        [  0   0   0  ... s_{j-1,j-1} s_{j-1,j} ... ]
        [  0   0   0  ...     0          0     ... ]
        [               ...                        ]
        [  0   0   0  ...     0          0     ... ]

where the diagonal entries s11, ..., s_{j-1,j-1} are previously used pivots and are therefore nonzero. As we argued earlier, the j-th column can then be obtained as a linear combination of the columns to its left, and thus the matrix is singular. Since M^(j-1) ... M^(2) M^(1) is nonsingular, A must be singular also. (This can be argued by contradiction: if A were nonsingular, then M^(j-1) ... M^(2) M^(1) A would be the product of two nonsingular matrices, and would thus be nonsingular.) Singularity of A will therefore result in a zero pivot during the forward phase of the Gaussian elimination.

Recall that singularity implies that Ax = b will have a solution only for certain vectors b, namely those in the range of A (which has dimension smaller than n). If the solution exists, it will not be unique. The course of action when singularity in A is detected depends very much on the application in question.

2.8.3 Small Pivots

We saw that row interchanges are necessary when zero pivots are encountered (assuming A is nonsingular). Row interchanges are also highly desirable when small pivots are likely to cause problems in finite-precision calculations.

Consider the simple 2x2 system given by

    ε x1 + a12 x2 = b1
    a21 x1 + a22 x2 = b2

where a12, a21, a22, b1 and b2 are of the same order of magnitude, and are all much larger than ε. The exact solution is easy to obtain in closed form; if we ignore small contributions due to ε, we have the approximations

    x1 ≈ (a12 b2 - a22 b1)/(a12 a21)    and    x2 ≈ b1/a12

Neither x1 nor x2 is particularly large if the constants involved (except ε) are of the same order of magnitude. Clearly, the same solution is obtained regardless of the pivoting order. But if precision is an issue, then the order shown above can lead to large errors in the variable x1, which is obtained from the computed value of x2 (by back-substitution):

    x1 = (b1 - a12 x2)/ε

We know that the value of x1 is not particularly large, therefore the numerator cannot be that much larger than the denominator. Yet both b1 and a12 x2 are (absolutely) much larger than ε. This means that the difference b1 - a12 x2 is very small compared to either term, and is likely to be poorly approximated if the precision of the computation is not high enough. The following numerical example further illustrates these difficulties.

Example 2.8.1. Consider the system

    0.003 x1 + 6 x2 = 4.001
          x1 +   x2 = 1

which has exact solution x1 = 1/3, x2 = 2/3. Let us solve the system using four-digit precision throughout. Treating the equations in their original order, we obtain a multiplier

    m21 = -(1.000)/(0.003) = -333.3

and thus the second equation becomes

    (1 - (333.3)(6)) x2 = 1 - (333.3)(4.001)
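Example 2.8.1 can be replayed mechanically. In the sketch below, `fl` is a hypothetical helper (not from the text) that rounds every intermediate result to four significant decimal digits, mimicking the four-digit arithmetic of the example; part (a) uses the original equation order and part (b) interchanges the equations first.

```python
from math import floor, log10

def fl(x, digits=4):
    """Round x to the given number of significant decimal digits."""
    if x == 0:
        return 0.0
    e = floor(log10(abs(x)))
    return round(x, digits - 1 - e)

# System: 0.003 x1 + 6 x2 = 4.001 ;  x1 + x2 = 1   (exact: x1 = 1/3, x2 = 2/3)

# (a) Original order: pivot on the small coefficient 0.003.
m21 = fl(-1 / 0.003)                    # -333.3
a22 = fl(1 + fl(m21 * 6))               # 1 - 2000 = -1999
b2  = fl(1 + fl(m21 * 4.001))           # 1 - 1334 = -1333
x2  = fl(b2 / a22)                      # 0.6668 (only 0.02% off)
x1  = fl(fl(4.001 - fl(6 * x2)) / 0.003)   # catastrophic cancellation

# (b) Interchanged order: pivot on 1.
m21b = fl(-0.003 / 1)
a22b = fl(6 + fl(m21b * 1))             # 5.997
b2b  = fl(4.001 + fl(m21b * 1))         # 3.998
x2b  = fl(b2b / a22b)                   # 0.6667
x1b  = fl(1 - x2b)                      # 0.3333
```

With the small pivot, the computed x1 collapses to 0.000; after the interchange both unknowns match the correctly rounded values.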

The resulting value of x2 is

    x2 = 0.6668

Although the relative error in x2 is only 0.02%, the effect of this error is much more severe on x1: using that value for x2 in the first equation, we have

    x1 = (4.001 - (6)(0.6668))/0.003

Rounding the product (6)(0.6668) to four digits results in 4.001, and thus the answer is x1 = 0.000. This is clearly unsatisfactory. Interchanging the equations and repeating the calculation, we obtain x2 = 0.6667 and x1 = 0.3333, i.e., the correct values (rounded to four significant digits).

2.8.4 Row Pivoting

Row pivoting is a simple strategy for avoiding small pivots, which can be summarized as follows.

1. At the beginning of the j-th step, let Ij be the set of indices i corresponding to rows that have not yet been used as pivot rows.
2. Compare the values of |sij| for all i in Ij. If i = i(j) yields the largest such value, take i(j) as the pivot row for the j-th step.
3. For each remaining i in Ij, compute mij = -sij/s_{i(j),j}.
4. For each remaining i in Ij, add the appropriate multiple of row i(j) to row i.
5. Remove index i(j) from the set Ij.
6. Increment j by 1.

In implementing row pivoting, we use an additional column ("p" for "pivot order") to indicate the first, second, third, etc., pivot row.

Example 2.8.2. Consider the system

    -x2 + 3x3 = 6
    3x1 + 2x2 + x3 = 15
    2x1 + 4x2 - 5x3 = 1
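The steps above can be sketched in code. The text's variant keeps rows in place and records the pivot order; for brevity this sketch (function name is ours) swaps rows instead, which performs the same arithmetic.

```python
def solve_with_row_pivoting(A, b):
    """Gaussian elimination with row (partial) pivoting; returns x."""
    n = len(b)
    S = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented table [A b]
    for j in range(n):
        # Step 2: pick the row with the largest |entry| in column j.
        p = max(range(j, n), key=lambda i: abs(S[i][j]))
        S[j], S[p] = S[p], S[j]
        # Steps 3-4: compute multipliers and update the remaining rows.
        for i in range(j + 1, n):
            m = -S[i][j] / S[j][j]
            S[i] = [S[i][k] + m * S[j][k] for k in range(n + 1)]
    # Backward substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(S[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (S[i][n] - s) / S[i][i]
    return x

# Example 2.8.2's system; the expected solution is x1 = 2, x2 = 3, x3 = 3.
x = solve_with_row_pivoting([[0, -1, 3], [3, 2, 1], [2, 4, -5]], [6, 15, 1])
```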

Clearly, pivoting is required for the first step since a11 = 0. We will follow the row pivoting algorithm given above, initializing the table:

    p |  m   | x1 x2 x3 |  b
      |      |  0 -1  3 |  6
      |      |  3  2  1 | 15
      |      |  2  4 -5 |  1

To eliminate x1, we look for the (absolutely) largest value in the first (x1) column, which equals 3. The corresponding row (row 2) is chosen as the first pivot row, shown as "1" in the pivot column. Multipliers are also computed for the remaining rows:

    p |  m   | x1 x2 x3 |  b
      |  0   |  0 -1  3 |  6
    1 |      |  3  2  1 | 15
      | -2/3 |  2  4 -5 |  1

After eliminating x1 from rows 1 (trivially) and 3, we compare the second (x2) column entries for the rows not yet used for pivoting (rows 1 and 3). The pivot is 8/3, and row 3 is the pivot row. A "2" is marked for that row in the pivot column, and the multiplier is computed for row 1:

    p |  m  | x1  x2    x3  |  b
      | 3/8 |  0  -1     3  |  6
    1 |     |  3   2     1  | 15
    2 |     |  0  8/3 -17/3 | -9

After eliminating x2 from row 1, we mark a "3" for that row in the pivot column (even though it is not used for pivoting subsequently):

    p | m | x1  x2    x3  |   b
    3 |   |  0   0   7/8  | 21/8
    1 |   |  3   2    1   |  15
    2 |   |  0  8/3 -17/3 |  -9

Backward substitution can be carried out by handling the equations in pivoting order (i.e., from the p column):

    3x1 + 2x2 + x3 = 15
    (8/3)x2 - (17/3)x3 = -9
    (7/8)x3 = 21/8

Solving in the usual fashion, we obtain x3 = 3, x2 = 3 and x1 = 2.

Row pivoting results in a so-called permuted LU factorization of the matrix A:

    LU = PA

where L and U are lower and upper triangular, respectively, and P is a permutation matrix which rearranges the rows of A in the order in which they were used for pivoting. Thus in Example 2.8.2,

    P [ 1 ; 2 ; 3 ] = [ 2 ; 3 ; 1 ]

where the vector on the right lists the rows in the order in which they served as pivot rows, as recorded in the p column at the end of the forward elimination. Thus

    P = [ 0 1 0 ; 0 0 1 ; 1 0 0 ]    and    PA = [ 3 2 1 ; 2 4 -5 ; 0 -1 3 ]

The matrices L and U are obtained by applying the same permutation to the multiplier vectors and to S (at the end of forward elimination):

    L = I - P [ 0 3/8 0 ; 0 0 0 ; -2/3 0 0 ] = [ 1 0 0 ; 2/3 1 0 ; 0 -3/8 1 ]

and

    U = P [ 0 0 7/8 ; 3 2 1 ; 0 8/3 -17/3 ] = [ 3 2 1 ; 0 8/3 -17/3 ; 0 0 7/8 ]

If the permuted LU factorization of A is known, then the equation Ax = b can be solved using the forward elimination and backward substitution equations

    Ly = Pb    and    Ux = y
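The factorization just obtained can be verified numerically. This sketch (helper name ours) forms both sides of LU = PA for the matrices of Example 2.8.2 and compares them entrywise.

```python
def mat_mul(X, Y):
    """Plain-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[0, -1, 3], [3, 2, 1], [2, 4, -5]]
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]          # picks rows 2, 3, 1 of A
L = [[1, 0, 0], [2/3, 1, 0], [0, -3/8, 1]]
U = [[3, 2, 1], [0, 8/3, -17/3], [0, 0, 7/8]]

PA = mat_mul(P, A)
LU = mat_mul(L, U)
max_diff = max(abs(PA[i][j] - LU[i][j]) for i in range(3) for j in range(3))
```

PA is exact (a permutation of integer rows), while LU agrees with it up to floating-point rounding in the fractions 2/3, 8/3, etc.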

2.9 Further Topics in Gaussian Elimination

2.9.1 Computation of the Matrix Inverse

In most computations, the inverse A^(-1) of a nonsingular n x n matrix A appears in a product such as A^(-1)b or, more generally, A^(-1)B. As long as A is available, it is best to compute that product via Gaussian elimination, i.e., by solving Ax = b or, if B has more than one column, AX = B (here X has the same number of columns as B). This approach is faster (at least by a factor of two) than computing the inverse A^(-1) followed by the product A^(-1)b, and produces a much smaller average error in the resulting entries of Ax - b. This is because the product A^(-1)b requires the same number (approximately 2n^2) of floating-point operations as the two triangular systems Ly = b and Ux = y combined. Thus if Ax = b needs to be solved for many different vectors b, having A^(-1) computed and stored off-line offers no particular advantage over the standard LU factorization of A.

If we are asked to compute A^(-1) explicitly, we can do so by solving the n^2 simultaneous equations in

    AX = I

where X = [ x^(1) ... x^(n) ]. This is equivalent to solving n systems of n simultaneous equations each:

    Ax^(1) = e^(1),  Ax^(2) = e^(2),  ...,  Ax^(n) = e^(n)

We illustrate this procedure, also known as Gauss-Jordan elimination, by an example.

Example 2.9.1. To invert

    A = [ 1 -1 2 ; -2 1 1 ; -1 2 1 ]

we construct a table with the elements of S = [ A  I ]:

    m | x1 x2 x3 | b1 b2 b3
      |  1 -1  2 |  1  0  0
    2 | -2  1  1 |  0  1  0
    1 | -1  2  1 |  0  0  1

Eliminating x1 from rows 2 and 3 (using the pivot and multipliers shown above), we obtain

    m | x1 x2 x3 | b1 b2 b3
      |  1 -1  2 |  1  0  0
      |  0 -1  5 |  2  1  0
    1 |  0  1  3 |  1  0  1

Eliminating x2 from row 3 (using the pivot and multiplier shown above), we obtain

    m | x1 x2 x3 | b1 b2 b3
      |  1 -1  2 |  1  0  0
      |  0 -1  5 |  2  1  0
      |  0  0  8 |  3  1  1

i.e., S = [ U  L^(-1) ] at this point. Scaling the rows of U by the diagonal elements, we obtain

     m | x1 x2 x3 |  b1   b2   b3
    -2 |  1 -1  2 |  1    0    0
     5 |  0  1 -5 | -2   -1    0
       |  0  0  1 | 3/8  1/8  1/8

Substituting x3 back into rows 2 and 1 (using the multipliers shown above), we obtain

    m | x1 x2 x3 |  b1    b2    b3
    1 |  1 -1  0 |  2/8  -2/8  -2/8
      |  0  1  0 | -1/8  -3/8   5/8
      |  0  0  1 |  3/8   1/8   1/8

Substituting x2 back into row 1 (using the multiplier shown above), we obtain

    m | x1 x2 x3 |  b1    b2    b3
      |  1  0  0 |  1/8  -5/8   3/8
      |  0  1  0 | -1/8  -3/8   5/8
      |  0  0  1 |  3/8   1/8   1/8

i.e., S = [ I  A^(-1) ] at the end. Thus

    A^(-1) = (1/8) [ 1 -5 3 ; -1 -3 5 ; 3 1 1 ]
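A compact Gauss-Jordan sketch (our own helper; it normalizes each pivot row as it goes, rather than in the separate scaling pass used above, but the final table is the same) reproduces the inverse of Example 2.9.1.

```python
def invert(A):
    """Gauss-Jordan elimination on [A | I]; assumes no zero pivot arises."""
    n = len(A)
    S = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for j in range(n):
        piv = S[j][j]
        S[j] = [v / piv for v in S[j]]        # normalize the pivot row
        for i in range(n):
            if i != j:                         # clear column j everywhere else
                m = S[i][j]
                S[i] = [vi - m * vj for vi, vj in zip(S[i], S[j])]
    return [row[n:] for row in S]

A = [[1, -1, 2], [-2, 1, 1], [-1, 2, 1]]
Ainv = invert(A)
# Expected: (1/8) * [[1, -5, 3], [-1, -3, 5], [3, 1, 1]]
```

Every intermediate value here is an integer or a multiple of 1/8, both exactly representable in binary, so the result matches the hand computation exactly.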

In case zero pivots are encountered, pivoting is necessary in the above procedure. The final table is then of the form

    [ P^(-1)   P^(-1) A^(-1) ]

where P is the permutation matrix in the (permuted) LU factorization of A: LU = PA. Note that P^(-1) = P^T, by the fact noted earlier for permutation matrices.

2.9.2 Ill-Conditioned Problems

In Section 2.8, we saw how small pivots can cause large errors in finite-precision computation, and how simple techniques such as row pivoting can essentially eliminate these errors. There are, nevertheless, problems where errors due to finite precision are unavoidable, whether pivoting is used or not. Such problems are referred to as ill-conditioned, primarily in regard to the matrix A. Ill-conditioned problems are characterized by large variations in the solution vector x caused by small perturbations of the parameter values (entries of A and b). They are thus extremely sensitive to rounding. The following example demonstrates this sensitivity.

Example 2.9.2. Consider the system

    1.001 x1 + x2 = 2.001
    x1 + 1.001 x2 = 2.001

which has exact solution x1 = x2 = 1. Note how the two columns of A are nearly equal, and thus A is nearly singular. We can think of an ill-conditioned matrix A as one that is "nearly singular" in terms of the precision used in solving Ax = b. This concept can be quantified in terms of the so-called condition number of A, which can be computed in MATLAB using the function COND. A well-conditioned matrix is one whose condition number is small (the minimum possible value is 1). An ill-conditioned matrix is one whose condition number is of the same order of magnitude as 10^K, where K is the number of significant decimal digits used in the computation; the same definition can be given in terms of binary digits by changing the base to 2. The condition number of A here equals 2.001 x 10^3. Based on the foregoing

discussion, large errors are possible when rounding to three significant digits; in fact, this is the case even with four-digit precision.

Suppose we change the right-hand vector to

    b1 = 2.0006,    b2 = 2.0014

Both these values are rounded to 2.001 using four-digit precision, with an error of 0.02%. Ignoring any further round-off errors in the solution of the system (i.e., solving it exactly), we have the exact solution

    x1 = 0.6,    x2 = 1.4

Note the 40% error in each entry relative to the earlier solution.

We give a graphical illustration of the difficulties encountered in this problem. The two equations represent two nearly parallel straight lines on the (x1, x2) plane. A small parallel displacement of either line (as a result of changing the value of an intercept bi) will cause a large movement of the solution point.

[Figure: the two nearly parallel lines of Example 2.9.2 on the (x1, x2) plane, axes 0 to 2.5.]

Ill-conditioned problems force us to reconsider the physical parameters and models involved. In a systems context, where, for example, y = Ax is an observation signal for a hidden state vector x, an ill-conditioned matrix A might force us to seek a different measuring device or observation method. Similarly, if the signal representation problem s = Vc is ill-conditioned, we may want to choose a different set of reference signals in the matrix V.
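The sensitivity of Example 2.9.2 is easy to reproduce. The sketch below (helper name ours) solves the 2x2 system exactly by Cramer's rule for the original right-hand side and for the perturbed one; a change of less than 0.02% in b moves the solution by about 40% per entry.

```python
def solve2(a11, a12, a21, a22, b1, b2):
    """Cramer's rule for a 2x2 system."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det,
            (a11 * b2 - a21 * b1) / det)

# Original system: solution (1, 1).
x = solve2(1.001, 1, 1, 1.001, 2.001, 2.001)

# Perturbed right-hand side: solution (0.6, 1.4).
xp = solve2(1.001, 1, 1, 1.001, 2.0006, 2.0014)
```

The determinant 1.001^2 - 1 = 0.002001 is tiny compared to the matrix entries, which is exactly why the solution point slides so far along the nearly parallel lines.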

2.10 Inner Products, Projections and Signal Approximation

2.10.1 Inner Products and Norms

The inner product of two vectors a and b in R^{m x 1} is defined by

    <a, b> = Σ_{i=1}^{m} ai bi

Clearly, the inner product is symmetric in its two arguments and can be expressed as the product of a row vector and a column vector:

    <a, b> = a^T b = b^T a = <b, a>

Also, the inner product is linear in one of its arguments (i.e., vectors) when the other argument is kept constant:

    <c1 a^(1) + c2 a^(2), b> = Σ_{i=1}^{m} (c1 a^(1)_i + c2 a^(2)_i) bi = c1 <a^(1), b> + c2 <a^(2), b>

The norm of a vector a in R^{m x 1} is defined as the square root of the inner product of a with itself:

    ||a|| = <a, a>^{1/2} = ( Σ_{i=1}^{m} ai^2 )^{1/2}

or equivalently,

    ||a||^2 = <a, a> = Σ_{i=1}^{m} ai^2

By the Pythagorean theorem, ||a|| is also the length of a. Clearly,

    ||a|| = 0  ⇔  a = 0

i.e., the all-zeros vector is the only vector of zero length. Also, if c is a (real) scaling factor,

    ||ca|| = ( Σ_{i=1}^{m} c^2 ai^2 )^{1/2} = |c| · ||a||

[Figure 2.8: The geometry of vectors a, b and b - a.]

2.10.2 Angles, Projections and Orthogonality

In general, two nonzero vectors a and b in R^{m x 1} define a two-dimensional subspace (i.e., plane) through the origin. By convention, the angle θ between a and b takes value in [0, π]. Applying the cosine rule to the triangle formed by a, b and the dotted vector (parallel to) b - a in Figure 2.8, we obtain

    ||b - a||^2 = ||a||^2 + ||b||^2 - 2 ||a|| · ||b|| · cos θ

Using the linearity property stated earlier, the left-hand side can be also expressed as

    ||b - a||^2 = <b - a, b - a> = <a, a> + <b, b> - 2 <a, b> = ||a||^2 + ||b||^2 - 2 <a, b>

and consequently

    cos θ = <a, b> / ( ||a|| · ||b|| )

Another feature of the geometry of two vectors in R^{m x 1} is the projection of one vector onto the other. Figure 2.9 shows the projection f of b onto a, which is of the form

    f = λa

where λ is a scaling factor having the same sign as cos θ. To determine the value of λ, note that the length of f equals

    ||f|| = ||b|| · |cos θ| = |<a, b>| / ||a||

[Figure 2.9: Projection of vector b onto vector a.]

The unit vector parallel to a is given by

    e^(a) = a / ||a||

and f is obtained by scaling e^(a) by the "signed" length of f:

    f = ( <a, b> / ||a|| ) e^(a) = ( <a, b> / ||a||^2 ) a

Therefore λ = <a, b> / ||a||^2.

Example 2.10.1. Let a = [-1 1 -1]^T and b = [2 -5 1]^T in R^{3 x 1}. We then have

    ||a||^2 = 3  ⇒  ||a|| = √3    and    ||b||^2 = 30  ⇒  ||b|| = √30

Also,

    <a, b> = -2 - 5 - 1 = -8    ⇒    cos θ = -8 / (3√10)

The projection of b onto a is given by

    f = -(8/3) a

while the projection of a onto b is given by

    g = -(8/30) b = -(4/15) b
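The quantities of Example 2.10.1 take only a few lines to compute. The helper names below are ours; the final check confirms the defining property of the projection, namely that the residual b - f is orthogonal to a.

```python
from math import sqrt

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return sqrt(inner(a, a))

a = [-1, 1, -1]
b = [2, -5, 1]

ab = inner(a, b)                        # -8
cos_theta = ab / (norm(a) * norm(b))    # -8 / (3 * sqrt(10))
f = [ab / inner(a, a) * x for x in a]   # projection of b onto a: -(8/3) a
g = [ab / inner(b, b) * x for x in b]   # projection of a onto b: -(4/15) b
```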

Definition 2.10.1. We say that a and b are orthogonal, and denote that relationship by

    a ⊥ b

if the angle between a and b equals π/2, or equivalently (since cos(π/2) = 0),

    <a, b> = 0

Clearly, if a and b are orthogonal, then the projection of either vector onto the other is the all-zeros vector 0.

2.10.3 Formulation of the Signal Approximation Problem

Our discussion of matrix inversion was partly motivated by the following signal approximation problem. Given n reference signals v^(1), ..., v^(n) in R^{m x 1}, how can we best approximate an arbitrary signal vector s in R^{m x 1} by a linear combination of these signals? The approximation would be of the form

    ŝ = Σ_{r=1}^{n} cr v^(r) = Vc

where the m x n matrix V has the n reference signals as its columns:

    V = [ v^(1) ... v^(n) ]

Since the approximation ŝ is a linear combination of reference signals, it makes little sense to include in V a signal which is expressible as a linear combination of other such (reference) signals; that signal would be redundant. We therefore assume the following:

Assumption. The columns of V, namely v^(1), ..., v^(n), will be assumed linearly independent for the remainder of this chapter.

As we saw earlier, any set of m linearly independent vectors in R^{m x 1} can be used to obtain any vector in that space (by means of linear combinations); it is therefore impossible to have a set of m+1 (or more) linearly independent vectors in R^{m x 1}. The above assumption thus immediately implies that n ≤ m. The case n = m has already been considered in the context of matrix inversion. If the columns of the (square) matrix V are linearly independent, then V is nonsingular and the equation

    Vc = s

has a unique solution c given by c = V^(-1)s. In this case we have an exact representation of s as a linear combination of the n = m reference signals, and there is no need to approximate. Thus the only real case of interest is n < m, where the range R(V) of V is a linear subspace of R^{m x 1} of dimension d = n < m. Any signal s that does not belong to R(V) will need to be approximated by a ŝ in R(V).

To properly formulate this problem, we will evaluate each approximation ŝ based on its distance from s. Thus the best approximation is one that minimizes the length ||ŝ - s|| of the error vector ŝ - s over all possible choices of ŝ in R(V). Clearly, that approximation will also minimize the square of that length:

    ||ŝ - s||^2 = Σ_{i=1}^{m} (ŝi - si)^2

By virtue of the expression on the right-hand side (i.e., sum of squares of entries of the error vector), this solution is known as the least-squares (linear) approximation of s based on v^(1), ..., v^(n).

2.10.4 Projection and the Signal Approximation Problem

In the case where there is only one reference vector v^(1) (i.e., m > n = 1), the solution of the signal approximation problem can be obtained using two-dimensional geometry. If, in Figure 2.9, we replace b by s and a by v^(1), we see that the point closest to s on the line generated by v^(1) is none other than the projection f of s onto v^(1). Thus the least-squares approximation of s based on v^(1) is given by ŝ = f. It is also interesting to note that the error vector ŝ - s is orthogonal to v^(1):

    ŝ - s ⊥ v^(1)

Similarly, three-dimensional geometry provides the solution to the approximation problem when two reference vectors v^(1) and v^(2) are given (i.e., m > n = 2). Here the solution ŝ is obtained by projecting s on the plane R(V). The error vector satisfies

    ŝ - s ⊥ v^(1)    and    ŝ - s ⊥ v^(2)

i.e., it is orthogonal to both v^(1) and v^(2). Figure 2.10 clearly shows that ŝ - s is also orthogonal to every vector on the plane R(V).

Our intuition thus far suggests that the solution to the signal approximation problem for any number n < m of reference vectors might be obtained

[Figure 2.10: The projection of s on the plane generated by v^(1) and v^(2).]

by projecting s onto the subspace R(V) generated by v^(1), ..., v^(n), where the projection ŝ ∈ R(V) satisfies

    ŝ - s ⊥ v^(j)    for every 1 ≤ j ≤ n

Before showing that this is indeed the case, we note the following:

- If ŝ - s ⊥ v^(j) for every vector v^(j), then ŝ - s ⊥ a for every a in R(V). This is because the inner product is linear in one of its arguments when the other argument is fixed; thus if a is a linear combination of v^(1), ..., v^(n) and <ŝ - s, v^(j)> = 0 for every j, then <ŝ - s, a> = 0 also.

- The n conditions <ŝ - s, v^(i)> = (v^(i))^T (ŝ - s) = 0 can be expressed in terms of a single matrix-vector product as

    V^T (ŝ - s) = 0

where the vector 0 is n-dimensional, or equivalently, as

    V^T ŝ = V^T s

We prove our conjecture by showing that if there exists ŝ in R(V) satisfying V^T ŝ = V^T s, then its distance from s can be no larger than that of any other vector y in R(V), i.e.,

    ||y - s||^2 ≥ ||ŝ - s||^2

Indeed, we rewrite the left-hand side as

    ||y - s||^2 = ||y - ŝ + ŝ - s||^2 = <y - ŝ + ŝ - s, y - ŝ + ŝ - s>

Treating y - ŝ and ŝ - s as single vectors, we obtain (as in the derivation of the formula for cos θ in Subsection 2.10.2)

    ||y - s||^2 = ||ŝ - s||^2 + ||y - ŝ||^2 + 2 <y - ŝ, ŝ - s>

The inner product <y - ŝ, ŝ - s> equals zero, since y - ŝ belongs to R(V) and ŝ - s is orthogonal to every vector in R(V). The quantity ||y - ŝ||^2 is nonnegative (and is zero if and only if y = ŝ). We therefore conclude that

    ||y - s||^2 ≥ ||ŝ - s||^2

with equality if and only if y = ŝ. Although the existence of ŝ in R(V) satisfying V^T ŝ = V^T s has not yet been shown, it follows from the "if and only if" statement (above) that there can only be one solution ŝ to the signal approximation problem.

2.10.5 Solution of the Signal Approximation Problem

It remains to show that

    V^T ŝ = V^T s

has a solution ŝ in R(V); this means that ŝ = Vc for some c to be determined. We rewrite the equation as

    V^T V c = V^T s

i.e., Ac = b, where:

- A = V^T V is an n x n symmetric matrix whose (i, j)-th entry equals the inner product <v^(i), v^(j)> (hence the symmetry); and
- b = V^T s is an n x 1 vector whose i-th entry equals <v^(i), s>.

Note that once the inner products in V^T V and V^T s have been computed, the signal dimension m altogether drops out of the picture, i.e., the size of the problem is solely determined by the number n of reference vectors.

A unique solution c exists for every s provided the n x n matrix V^T V is nonsingular, i.e.,

    x ≠ 0  ⇒  (V^T V)x ≠ 0

To see why the above implication is indeed true, recall our earlier assumption that the columns of the m x n matrix V are linearly independent. Thus

    x ≠ 0  ⇒  Vx ≠ 0  ⇔  ||Vx||^2 > 0

where the squared norm in the last expression can be also expressed as

    ||Vx||^2 = (Vx)^T (Vx) = x^T V^T V x

Thus

    x ≠ 0  ⇒  x^T (V^T V) x > 0

and hence (V^T V)x cannot be the all-zeros vector 0; if it were, then the scalar x^T (V^T V)x would equal zero also. This establishes that V^T V is nonsingular. We therefore obtain the solution to the least squares approximation problem as ŝ = Vc, where

    c = (V^T V)^(-1) V^T s

A single expression for ŝ is

    ŝ = V (V^T V)^(-1) V^T s

(Note that the m x m matrix V(V^T V)^(-1)V^T is also symmetric.)

Example 2.10.2. Consider the case m = 3 and n = 2, with v^(1) = [-1 1 -1]^T and v^(2) = [2 -5 1]^T (same vectors as in Example 2.10.1). Thus

    V = [ -1 2 ; 1 -5 ; -1 1 ]

and

    V^T V = [ ||v^(1)||^2   <v^(1), v^(2)> ; <v^(2), v^(1)>   ||v^(2)||^2 ] = [ 3 -8 ; -8 30 ]

Let us determine the projection ŝ of s = [2 -1 -1]^T onto the plane defined by v^(1) and v^(2). We have

    V^T s = [ <v^(1), s> ; <v^(2), s> ] = [ -2 ; 8 ]

and thus ŝ = Vc, where

    c = [ 3 -8 ; -8 30 ]^(-1) [ -2 ; 8 ] = (1/13) [ 2 ; 4 ]

Therefore

    ŝ = (2/13) v^(1) + (4/13) v^(2) = (2/13) [ 3 ; -9 ; 1 ]

As a final observation, note that the formulas

    c = (V^T V)^(-1) V^T s    and    ŝ = V (V^T V)^(-1) V^T s

which we derived for the case n < m under the assumption of linear independence of the columns of V, also give us the correct answers when n = m and V^(-1) exists:

    c = (V^T V)^(-1) V^T s = V^(-1) (V^T)^(-1) V^T s = V^(-1) s

and

    ŝ = V (V^T V)^(-1) V^T s = V V^(-1) s = s
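The normal-equations computation of Example 2.10.2 can be checked with exact rational arithmetic. This sketch (helper names ours) solves the 2x2 system V^T V c = V^T s by the 2x2 inverse formula and verifies both the coefficients and the orthogonality of the error vector.

```python
from fractions import Fraction as F

v1 = [F(-1), F(1), F(-1)]
v2 = [F(2), F(-5), F(1)]
s  = [F(2), F(-1), F(-1)]

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

# Entries of V^T V and V^T s.
g11, g12, g22 = inner(v1, v1), inner(v1, v2), inner(v2, v2)   # 3, -8, 30
b1, b2 = inner(v1, s), inner(v2, s)                           # -2, 8

det = g11 * g22 - g12 * g12                                   # 26
c1 = (g22 * b1 - g12 * b2) / det                              # 2/13
c2 = (g11 * b2 - g12 * b1) / det                              # 4/13

s_hat = [c1 * x + c2 * y for x, y in zip(v1, v2)]             # (2/13)[3, -9, 1]
```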

2.11 Least-Squares Approximation in Practice

The basic technique of least-squares approximation developed in the previous section has numerous applications across the engineering disciplines and in most scientific fields where data analysis is important. Broadly speaking, least-squares methods allow us to estimate parameters for a variety of models that have linear structure. To illustrate the adaptability of the least-squares technique, we consider two rather different examples.

Example 2.11.1. Curve fitting. We are given the following vector s consisting of ten measurements taken at regular time intervals (for simplicity, every second):

    t:  1.0   2.0   3.0  4.0  5.0  6.0  7.0  8.0  9.0  10.0
    s: -0.7  -0.2   0.6  1.1  1.7  2.5  3.1  3.8  4.5   5.0

Suppose we are interested in fitting a straight line

    ŝ(t) = at + b

through the discrete plot {(ti, si), 1 ≤ i ≤ 10} so as to minimize

    Σ_{i=1}^{10} (ŝ(ti) - si)^2

Letting ŝ(ti) = ŝi, we see that the vector ŝ is given by

    ŝ = at + b1

where 1 is a column vector of unit entries. We can then restate our problem as follows: choose coefficients a and b for t and 1, respectively, so as to minimize

    ||ŝ - s||^2

We thus have a least-squares problem where

- the length of the data vector is m = 10; and
- the number of reference vectors (i.e., t and 1) is n = 2.

Again, we begin by computing the matrix of inner products of the reference vectors, i.e., V^T V, where V = [t 1]. We have

    t^T t = Σ_{i=1}^{10} i^2 = 385
    t^T 1 = Σ_{i=1}^{10} i = 55
    1^T 1 = Σ_{i=1}^{10} 1 = 10

and therefore

    V^T V = [ t^T t   t^T 1 ; t^T 1   1^T 1 ] = [ 385 55 ; 55 10 ]

Also,

    t^T s = Σ_{i=1}^{10} i si = 171.2
    1^T s = Σ_{i=1}^{10} si = 21.4

and thus

    V^T s = [ t^T s ; 1^T s ] = [ 171.2 ; 21.4 ]

The least squares solution is then given by

    [ a ; b ] = [ 385 55 ; 55 10 ]^(-1) [ 171.2 ; 21.4 ] = [ 0.6485 ; -1.4267 ]

and the resulting straight line ŝ(t) = at + b is plotted together with the discrete data points. It can be seen that the straight-line fit is very good in this case. The mean (i.e., average) square error is given by

    (1/10) ||ŝ - s||^2 = 0.00501

Taking the square root

    ( (1/10) ||ŝ - s||^2 )^{1/2} = 0.0708

[Figure: the fitted line ŝ(t) = at + b of Example 2.11.1 plotted against the ten data points.]

we obtain the root mean square (r.m.s.) error. The mean absolute error is

    (1/10) Σ_{i=1}^{10} |ŝi - si| = 0.0648
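The whole fit of Example 2.11.1 can be reproduced from the scalar sums above, since for two reference vectors the normal equations reduce to a 2x2 system (variable names below are ours).

```python
t = list(range(1, 11))
s = [-0.7, -0.2, 0.6, 1.1, 1.7, 2.5, 3.1, 3.8, 4.5, 5.0]

tt = sum(ti * ti for ti in t)                 # t^T t = 385
t1 = sum(t)                                   # t^T 1 = 55
one1 = len(t)                                 # 1^T 1 = 10
ts = sum(ti * si for ti, si in zip(t, s))     # t^T s = 171.2
s1 = sum(s)                                   # 1^T s = 21.4

# Solve [tt t1; t1 one1] [a; b] = [ts; s1] by the 2x2 inverse formula.
det = tt * one1 - t1 * t1                     # 825
a = (one1 * ts - t1 * s1) / det
b = (tt * s1 - t1 * ts) / det

mse = sum((a * ti + b - si) ** 2 for ti, si in zip(t, s)) / len(t)
```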

Remark. The same technique can be used for ﬁtting any linear combination of functions through a discrete data set. For example, by expanding the set of reference vectors to include the squares, cubes, etc., of the entries of the abscissa (time in this case) vector t, we can determine the optimal (in the least-squares sense) ﬁt by a polynomial of any degree we choose. Example 2.11.2. Range estimation. In this example, we assume that we have two signal sources, each emitting a pure sinusoidal waveform. The signals have the same amplitude (hence also power), but diﬀerent frequencies and phase shifts. The sources are mobile, and their location at any time is unknown. A receiver, who knows the two source frequencies Ω1 and Ω2 , also knows that the signal power follows an inverse-power law, i.e., it decays as R−γ , where R is the distance from the source and γ is a known positive constant. Based on this information, the receiver attempts to recover the range ratio R1 /R2 (i.e., the receiver’s relative distance from the two sources) from

or noise. z(t) represents interference. In their discrete form.114 samples of the received signal s(t) = σR1 −γ/2 cos(Ω1 t + φ1 ) + σR2 −γ/2 cos(Ω2 t + φ2 ) + z(t) Here. b1 . namely cos(Ωk t) and sin(Ωk t). . . tm . that has no particular structure (at least none that can be used to improve the model for s(t)). . it makes sense to estimate those amplitudes. Suppose that the receiver records m samples of s(t) corresponding to times t1 . a2 and b2 such that ˆ = a1 u(1) + b1 w(1) + a2 u(2) + b2 w(2) s is the least-squares approximation (among all such linear combinations) to the vector of received samples s. we write cos(Ωk t + φk ) = cos φk cos(Ωk t) − sin φk sin(Ωk t) and use two reference signals for each Ωk . . The solution is obtained in the usual fashion: if V = u(1) w(1) u(2) w(2) then the optimal coeﬃcients are given by   a1  b1  T −1 T    a2  = (V V) V s b2 . Since the distances R1 and R2 both appear in the amplitudes of the received sinusoidal components. these signals are given by vectors u(1) = w (1) (2) cos(Ω1 ti ) sin(Ω1 ti ) cos(Ω2 ti ) sin(Ω2 ti ) m i=1 m i=1 m i=1 m i=1 = = u w(2) = We thus seek coeﬃcients a1 . Use of the common parameter σ here is consistent with the assumption that both sources emit at the same power. To that end. Note that neither of the two components in their respective form shown above can be used as a reference signal for the least-squares solution—the unknown phase shifts must somehow be removed.

Since, for each k, the sum ak^2 + bk^2 is approximately equal to σ^2 Rk^(−γ) (cos^2 φk + sin^2 φk) = σ^2 Rk^(−γ), the resulting estimate of the ratio R1/R2 is

    ( (a1^2 + b1^2) / (a2^2 + b2^2) )^(−1/γ)
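The estimation procedure above is easy to check numerically. The chapter's worked examples use MATLAB; the following NumPy sketch uses entirely made-up frequencies, ranges, phases and noise (none of these numbers come from the text) to verify that the recipe recovers R1/R2:

```python
import numpy as np

# Hypothetical simulation parameters (not from the text)
rng = np.random.default_rng(0)
sigma, gamma = 1.0, 2.0                  # common power parameter, decay exponent
R1, R2 = 2.0, 3.0                        # true ranges (unknown to the receiver)
W1, W2 = 2 * np.pi * 3, 2 * np.pi * 5    # known source frequencies (rad/s)
t = np.arange(200) * 0.01                # m = 200 sample times

# Received samples: two attenuated sinusoids plus weak noise
s = (sigma * R1 ** (-gamma / 2) * np.cos(W1 * t + 0.7)
     + sigma * R2 ** (-gamma / 2) * np.cos(W2 * t - 1.1)
     + 0.01 * rng.standard_normal(t.size))

# Reference vectors u(k) = cos(Wk t), w(k) = sin(Wk t); solve V^T V c = V^T s
V = np.column_stack([np.cos(W1 * t), np.sin(W1 * t),
                     np.cos(W2 * t), np.sin(W2 * t)])
a1, b1, a2, b2 = np.linalg.solve(V.T @ V, V.T @ s)

# a_k^2 + b_k^2 ~ sigma^2 R_k^(-gamma), so the range ratio is recovered as:
ratio = ((a1**2 + b1**2) / (a2**2 + b2**2)) ** (-1 / gamma)
print(ratio)   # close to R1/R2 = 2/3
```

With these (arbitrary) choices the two frequencies complete whole periods over the record, so the four reference vectors are exactly orthogonal and the estimate is accurate to within the noise level.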

2.12 Orthogonality and Least-Squares Approximation

2.12.1 A Simpler Least-Squares Problem

The solution of the projection, or least-squares, problem is greatly simplified when the reference vectors v(1), ..., v(n) in V are (pairwise) orthogonal. This means that v(i) ⊥ v(j) for any i ≠ j, or equivalently, ⟨v(i), v(j)⟩ = 0 for i ≠ j. Recall that ⟨v(i), v(j)⟩ is the (i, j)th element of the inner product matrix V^T V; also, ⟨v(i), v(i)⟩ = ‖v(i)‖^2 for every i. Thus if the columns of V are orthogonal, then

    V^T V = diag( ‖v(1)‖^2, ‖v(2)‖^2, ..., ‖v(n)‖^2 )

and hence V^T V is a diagonal matrix. Since none of v(1), ..., v(n) is an all-zeros vector (by the assumption of linear independence), all the squared norms on the diagonal are strictly positive, and thus

    (V^T V)^(−1) = diag( ‖v(1)‖^(−2), ‖v(2)‖^(−2), ..., ‖v(n)‖^(−2) )

is well-defined. The projection ŝ of s on R(V) is then given by ŝ = Vc, where

    c = (V^T V)^(−1) V^T s ,    i.e.,    ci = ⟨v(i), s⟩ / ‖v(i)‖^2

As a result,

    ŝ = Vc = Σ (i = 1 to n)  ( ⟨v(i), s⟩ / ‖v(i)‖^2 ) v(i)

as shown in Figure 2.11. The generic term in the above sum is the projection of s onto v(i) (as we saw in Subsection 2.10.2). Thus if the columns of V are orthogonal, then the projection of any vector s on R(V) is given by the (vector) sum of the projections of s onto each of the columns.

Figure 2.11: Projection on a plane generated by orthogonal vectors v(1) and v(2).

This result is in agreement with three-dimensional geometry. It is also used extensively, without elaboration, when dealing with projections on subspaces generated by two or more of the standard orthogonal unit vectors e(1), ..., e(m) in R^(m×1). For example, the projection of s = 3e(1) + 4e(2) + 7e(3) on the plane generated by e(1) and e(2) equals

    ŝ = 3e(1) + 4e(2)

which is the sum of the projections of s on e(1) (given by 3e(1)) and on e(2) (given by 4e(2)).

2.12.2 Generating Orthogonal Reference Vectors

The result of the previous subsection clearly demonstrates the advantage of working with orthogonal reference vectors v(1), ..., v(n): least-squares approximations can be obtained without solving simultaneous equations.
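The shortcut of the previous subsection can be checked numerically. The following NumPy sketch (the text's own examples use MATLAB; the vectors here are made up) forms the projection one orthogonal column at a time, with a single division per column, and compares the result with the general least-squares formula:

```python
import numpy as np

# Two orthogonal reference vectors in R^3 (made-up data) and a target s
v1 = np.array([-1.0, 1.0, 1.0])
v2 = np.array([1.0, 2.0, -1.0])       # note v1 . v2 = -1 + 2 - 1 = 0
s = np.array([1.0, 2.0, 3.0])

# One division per column -- no simultaneous equations needed
c1 = (v1 @ s) / (v1 @ v1)
c2 = (v2 @ s) / (v2 @ v2)
s_hat = c1 * v1 + c2 * v2

# Same answer as the general formula c = (V^T V)^{-1} V^T s
V = np.column_stack([v1, v2])
c = np.linalg.solve(V.T @ V, V.T @ s)
print(np.allclose(s_hat, V @ c))      # True
```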

As it turns out, we can always work with orthogonal reference vectors, if we so choose:

Fact. Any linearly independent set {v(1), ..., v(n)} of vectors in R^(m×1) (where n ≤ m) is equivalent to a linearly independent and orthogonal set {w(1), ..., w(n)}, where equivalence means that both sets produce the same subspace of linear combinations.

In other words, if

    V = [ v(1)  ...  v(n) ]

we claim that there exists

    W = [ w(1)  ...  w(n) ]

such that W^T W is diagonal and R(V) = R(W). The last condition means that each v(i) is expressible as a linear combination of w(j)'s and vice versa (with v and w interchanged). In matrix terms, there exists a nonsingular n × n matrix B such that

    V = WB    and (equivalently)    W = VB^(−1)

We will now show how, given any V with linearly independent columns, we can obtain such a matrix B. The construction is based on the LU factorization of V^T V, a matrix known to be

• nonsingular;
• symmetric; and
• positive definite, meaning that x^T V^T V x > 0 for all x ≠ 0.

The three properties listed above have the following implications (proofs are omitted):

• Nonsingularity of V^T V implies that a unique normalized LU factorization of the form V^T V = LDU exists, where both L and U have unit diagonal elements and D has nonzero diagonal elements.

• Symmetry of V^T V, in conjunction with the uniqueness of the above factorization, implies L = U^T.

• Finally, positive definiteness of V^T V implies that D has (strictly) positive diagonal elements.

The following example illustrates the above-mentioned features of the LU factorization of V^T V.

Example 2.12.1. Consider the matrix

    V = [ v(1)  v(2)  v(3) ] =
        [ −1   3  −3 ]
        [  0   1  −2 ]
        [  1   1  −2 ]
        [ −1   1  −1 ]
        [ −1   1   0 ]

Gaussian elimination on

    V^T V = [  4   −4    2 ]
            [ −4   13  −14 ]
            [  2  −14   18 ]

(with multipliers −1, 1/2 and −4/3) yields

    [ 4  −4    2 ]
    [ 0   9  −12 ]
    [ 0   0    1 ]

and hence

    U = [ 1  −1   1/2 ]
        [ 0   1  −4/3 ]
        [ 0   0    1  ]

Therefore

    V^T V = [  1     0    0 ] [ 4  −4    2 ]   [  1     0    0 ] [ 4  0  0 ] [ 1  −1   1/2 ]
            [ −1     1    0 ] [ 0   9  −12 ] = [ −1     1    0 ] [ 0  9  0 ] [ 0   1  −4/3 ]
            [ 1/2  −4/3   1 ] [ 0   0    1 ]   [ 1/2  −4/3   1 ] [ 0  0  1 ] [ 0   0    1  ]

demonstrating that V^T V = U^T D U, where D has strictly positive diagonal entries dii > 0.

Using the above-mentioned properties of V^T V, we will now show that the matrix W = VU^(−1) has n nonzero orthogonal columns, thereby proving the claim made earlier (with B = U). Indeed,

    W^T W = (VU^(−1))^T VU^(−1) = (U^T)^(−1) V^T V U^(−1) = (U^T)^(−1) U^T D U U^(−1) = D

and thus W has n orthogonal columns w(1), ..., w(n) with norms given by

    ‖w(i)‖ = √dii

for all i.

Example 2.12.1. (Continued.) There is no need to compute U^(−1) explicitly in order to determine W = VU^(−1). We have

    WU = V    ⇔    U^T W^T = V^T

The last equation can be written as (note that U^T = L)

    [  1     0    0 ] [ w(1)T ]   [ v(1)T ]
    [ −1     1    0 ] [ w(2)T ] = [ v(2)T ]
    [ 1/2  −4/3   1 ] [ w(3)T ]   [ v(3)T ]

We solve this lower triangular system using standard forward elimination, treating the row vectors w(·)T and v(·)T as scalar variables and omitting the transpose throughout. Thus

    w(1) = v(1)
    w(2) = v(1) + v(2)
    w(3) = −(1/2) v(1) + (4/3) (v(1) + v(2)) + v(3) = (5/6) v(1) + (4/3) v(2) + v(3)

We have obtained three orthogonal vectors with norms 2, 3 and 1 respectively:

    W = [ w(1)  w(2)  w(3) ] =
        [ −1   2    1/6 ]
        [  0   1   −2/3 ]
        [  1   2    1/6 ]
        [ −1   0   −1/2 ]
        [ −1   0    1/2 ]

We make the following final remarks on this topic.

• Dividing each w(i) in the above transformation by its norm ‖w(i)‖ = √dii, we obtain a new set of orthogonal vectors, each having unit norm. A set of vectors having these two properties (orthogonality and unit norm) is known as orthonormal. The new matrix W is related to V via

    V = W D^(1/2) U    and    W = V U^(−1) D^(−1/2)

where D^(1/2) and D^(−1/2) are diagonal matrices obtained by taking the same (respective) powers of the diagonal elements of D.

• The upper triangular matrix D^(1/2) U above can be computed in MATLAB using the function CHOL, with argument given by V^T V. This is the same as producing the Cholesky form of the LU factorization of a positive definite symmetric matrix A, where the lower and upper triangular parts are transposes of each other:

    A = (D^(1/2) U)^T (D^(1/2) U)
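This recipe can be sketched outside MATLAB as well. The NumPy routine numpy.linalg.cholesky plays the role of CHOL (it returns the lower triangular factor, so a transpose is needed); applied to the matrix V of Example 2.12.1 it reproduces the orthogonal vectors found above:

```python
import numpy as np

# The matrix V of Example 2.12.1 (columns v(1), v(2), v(3))
V = np.array([[-1.0,  3.0, -3.0],
              [ 0.0,  1.0, -2.0],
              [ 1.0,  1.0, -2.0],
              [-1.0,  1.0, -1.0],
              [-1.0,  1.0,  0.0]])

# Cholesky factor of V^T V: C = D^{1/2} U, upper triangular
C = np.linalg.cholesky(V.T @ V).T

# W = V C^{-1} has orthonormal columns: W^T W = I
W = V @ np.linalg.inv(C)
print(np.round(W.T @ W, 10))

# Rescaling column i by its norm (2, 3, 1) recovers w(1), w(2), w(3)
print(np.round(W * [2, 3, 1], 6))
```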

2.13 Complex-Valued Matrices and Vectors

2.13.1 Similarities to the Real-Valued Case

Thus far we have considered matrices and vectors with real-valued entries only. From an applications perspective, this restriction is justifiable, since real-world signals are real-valued; it therefore makes sense to represent or approximate them using real-valued reference vectors. One class of reference vectors of particular importance in signal analysis comprises segments of sinusoidal signals in discrete time. As the analysis of sinusoids is simplified by considering complex sinusoids (or phasors), it pays to generalize the tools and concepts developed so far to include complex-valued matrices and vectors.

An m × n complex-valued matrix takes values in C^(m×n), where C denotes the complex plane. Since each entry of the matrix consists of a real and an imaginary part, a complex-valued matrix stores twice as much information as a real-valued matrix of the same size. This fact has interesting implications for the representation of real-valued signals by complex-valued reference vectors.

Partitioning a matrix A ∈ C^(m×n) into its columns, we see that each column of A is a vector z in C^(m×1), which can be expressed as

    z = z1 e(1) + · · · + zm e(m)

Here, e(1), ..., e(m) are the standard (real-valued) unit vectors, and z1, ..., zm are the complex-valued elements of z:

    z = [ z1  z2  ...  zm ]^T

The vector z itself has dimension m, as does any real-valued vector x (which is also a linear combination of e(1), ..., e(m), but with real-valued coefficients x1, ..., xm). Note that the "information-doubling" effect mentioned in the previous paragraph is due to the fact that each zi is complex.

Complex-valued matrices and vectors behave in exactly the same manner as real-valued ones. Since all matrix computations are based on scalar addition and multiplication, one only needs to ensure that in the complex case, these (scalar) operations are replaced by complex addition and multiplication. We thus have the following straightforward extensions of properties and concepts defined earlier:

• Matrix addition: Defined as before (using complex addition).

• Matrix multiplication: Defined as before (using complex addition and multiplication).

• Matrix transposition (T): Defined as before. No algebraic operations are needed here. A modified transpose will be defined in the next subsection.

• Linear independence: Defined as before, allowing complex coefficients in linear combinations. Thus linear independence of the columns of A means that z = 0 is the only solution of Az = 0.

• Matrix inverse: As before, a square matrix is nonsingular (and thus has an inverse) provided its columns are linearly independent.

• Solution of Az = b: Gaussian elimination is still applicable. One can also write out the above matrix equation using real constants and unknowns; since each complex variable zi has a real and an imaginary part, twice as many variables are needed. In terms of computational effort, this is a more intensive problem, since each complex addition involves two real additions, and each complex multiplication involves four real multiplications and two real additions.

2.13.2 Inner Products of Complex-Valued Vectors

The inner product of v and w in C^(m×1) is defined by

    ⟨v, w⟩ = Σ (i = 1 to m)  vi* wi = (v*)^T w

where, as usual, the superscript * denotes the complex conjugate. The combination of the complex conjugate (*) and transpose (T) operators (in either order) is known as the conjugate transpose, or Hermitian, operator H. Thus

    ⟨v, w⟩ = v^H w

Note that if the two vectors are real-valued (i.e., have zero imaginary parts), the complex conjugate has no effect, and the above definition becomes that of the inner product of two real-valued vectors.

The following identities involving complex conjugates of scalars are useful in manipulating inner products and conjugate transposes of vectors:

    (z*)* = z

    (z1 + z2)* = z1* + z2*
    (z1 z2)* = z1* z2*
    |z|^2 = z z*

We immediately obtain

    ⟨w, v⟩ = Σ (i = 1 to m)  wi* vi = ( Σ (i = 1 to m)  vi* wi )* = ⟨v, w⟩*

or equivalently,

    w^H v = (v^H w)*

Also, if c is a complex-valued scalar,

    ⟨cv, w⟩ = c* ⟨v, w⟩
    ⟨v, cw⟩ = c ⟨v, w⟩
    ⟨cv, cw⟩ = |c|^2 ⟨v, w⟩

The norm (or length) of v in C^(m×1) is defined as

    ‖v‖ = ( Σ (i = 1 to m)  |vi|^2 )^(1/2)

where | · | denotes the magnitude of the complex element vi:

    |vi|^2 = vi* vi = Re^2{vi} + Im^2{vi}

We thus see that ‖v‖^2 is given by the sum of squares of all real and imaginary parts contained in the vector v. We also have

    ‖v‖ = ( Σ (i = 1 to m)  vi* vi )^(1/2) = ⟨v, v⟩^(1/2)

which is the same relationship between norm and inner product as in the real-valued case. This is a key reason for introducing the complex conjugate in the definition of the inner product.

Example 2.13.1. Let

    v = [ 1 + j   −j   5 − 2j ]^T    and    w = [ 3   1 − j   1 + 2j ]^T

Then

    ⟨v, w⟩ = v^H w = (1 − j) · 3 + j · (1 − j) + (5 + 2j) · (1 + 2j) = 5 + 10j
    ⟨w, v⟩ = w^H v = 5 − 10j
    ‖v‖^2 = v^H v = (1 + 1) + (1) + (25 + 4) = 32
    ‖w‖^2 = w^H w = 9 + (1 + 1) + (1 + 4) = 16

Note that by separating complex vectors into their real and imaginary parts, e.g., v = a + jb and w = c + jd, we can also express inner products and norms of complex vectors in terms of inner products of real vectors:

    ⟨v, w⟩ = (a^T − jb^T)(c + jd) = a^T c + b^T d + j(a^T d − b^T c)

The following properties of the conjugate transpose H are identical to those of the transpose T:

    (A^H)^H = A
    (AB)^H = B^H A^H
    (A^(−1))^H = (A^H)^(−1)

As a final reminder, in MATLAB:

• .’ denotes T and ’ denotes H.

• In computations involving real matrices (exclusively), the same result will be obtained by either transpose.

• Where complex matrices are involved, the conjugate transpose (H or ’) is usually needed.

• For changing the (row-column) orientation of complex vectors without altering their entries, the ordinary transpose .’ must be used; use of ’ will result in an error (due to the conjugation).
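The numbers in Example 2.13.1 are easy to verify numerically. As a sketch in NumPy (the text itself uses MATLAB), np.vdot conjugates its first argument, playing the same role as ' applied to the first vector:

```python
import numpy as np

# The vectors of Example 2.13.1; vdot(a, b) = sum(conj(a) * b) = a^H b
v = np.array([1 + 1j, -1j, 5 - 2j])
w = np.array([3, 1 - 1j, 1 + 2j])

print(np.vdot(v, w))          # (5+10j)
print(np.vdot(w, v))          # (5-10j), the complex conjugate
print(np.vdot(v, v).real)     # 32.0 = ||v||^2
print(np.vdot(w, w).real)     # 16.0 = ||w||^2
```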

2.13.3 The Least-Squares Problem in the Complex Case

Orthogonality of complex-valued vectors is defined in terms of their inner product:

    v ⊥ w    ⇔    ⟨v, w⟩ = 0    ⇔    ⟨w, v⟩ = 0

or equivalently,

    v ⊥ w    ⇔    v^H w = 0    ⇔    w^H v = 0

This is essentially the same definition as in the real-valued case. Note that in geometric terms, it is considerably more difficult to visualize C^(m×1) than R^(m×1), and even simple concepts such as orthogonality can lead to intuitive pitfalls. For example, the scalars v = 1 and w = j are at right angles to each other on the complex plane; yet viewed as vectors v and w in C^(1×1), they are not orthogonal, since ⟨v, w⟩ = j ≠ 0.

With the given definitions of the inner product and orthogonality for complex vectors, the formulation and solution of the least-squares approximation problem in the complex case turns out to be the same as in the real case. Assuming that v(1), ..., v(n) are linearly independent vectors in C^(m×1) (where n ≤ m), we seek to approximate s ∈ C^(m×1) by Vc so that the squared norm ‖Vc − s‖^2 (i.e., the sum of squares of all real and imaginary parts contained in the error vector Vc − s) is minimized.

Following the same argument as in the real-valued case, we define the projection ŝ of s on R(V) by the following n relationships:

    ŝ − s ⊥ v(i)    ⇔    (v(i))^H ŝ = (v(i))^H s ,    i = 1, ..., n

Equivalently, in matrix form we have

    V^H ŝ = V^H s

and letting ŝ = Vc as before, we obtain

    V^H V c = V^H s

Since H is the extension of T to the complex case, we have, in effect, the same formula for computing projections as before.

The proof that the projection ŝ on R(V) solves the least-squares approximation problem for complex vectors is identical to the proof given in

Subsection 2.10.4 for real vectors. Care should be exercised in writing out one equation: if y is the competing approximation, then

    ‖y − s‖^2 = ‖ŝ − s‖^2 + ‖y − ŝ‖^2 + ⟨y − ŝ, ŝ − s⟩ + ⟨ŝ − s, y − ŝ⟩

The last two terms on the right-hand side (which are conjugates of each other) are, again, equal to zero by the assumption of orthogonality between ŝ − s and any vector in R(V).
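The complex least-squares recipe can be sketched numerically as well. The following NumPy example (all data made up: two phasor reference vectors and a noisy combination of them) solves V^H V c = V^H s and confirms that the residual is orthogonal to every reference vector:

```python
import numpy as np

# Made-up data: approximate s by a combination of two phasor columns
rng = np.random.default_rng(1)
n = np.arange(8)
V = np.column_stack([np.exp(2j * np.pi * n / 8),
                     np.exp(2j * np.pi * 2 * n / 8)])
s = 3 * V[:, 0] - (1 + 2j) * V[:, 1] + 0.01 * rng.standard_normal(8)

# Normal equations with H in place of T: V^H V c = V^H s
c = np.linalg.solve(V.conj().T @ V, V.conj().T @ s)
s_hat = V @ c

# Orthogonality of the residual: (v^(i))^H (s_hat - s) = 0 for each column
print(np.abs(V.conj().T @ (s_hat - s)))
```

The recovered coefficients are close to the true values 3 and −(1 + 2j), off only by the added noise.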

Problems

Section 2.1

P 2.1. (MATLAB) Study the functions FLIPUD, HILB and TOEPLITZ. Let N be a given integer. How would you define the vector r so that the following command string returns an all-zero N-by-N matrix?

    R = 1./r, A = toeplitz(R), B = flipud(A), B(1:N,1:N) - hilb(N)

P 2.2. (MATLAB) Enter the matrix

    A = [1 2 3 4; 5 6 7 8; 9 10 11 12]

(i) Find two-element arrays I and J such that A(I,J) is a 2 × 2 matrix consisting of the corner elements of A.

(ii) Suppose that B=A initially. Find two-element arrays K and L such that B(:,K) = B(:,L) swaps the first and fourth columns of B.

(iii) Explain the result of C = A(:).

(iv) Study the function RESHAPE. Use it together with the transpose operator .’ in a single (one-line) command to generate the matrix

    [ 1 2 3; 4 5 6; 7 8 9; 10 11 12 ]

from A.

P 2. 0 2 0 −1 Write out the matrix A and compute Ax.4. Determine the matrix B. If B 1 0  3 =  −2  −1  and B −1 1  4 = 7  .3. A 1  =  3 . . The transformation A : R3 → R3 is such that         1 4 0 −2 A  0  =  −1  . where x= 2 5 −1 T −1 1 =u and G 2 1 =v 3 3    0 1 A 0  =  0  1 3  (ii) The transformation B : R3 → R2 is such that  1 B 0  = 0  2 −4  1 B 1  = 0  3 −2  1 B 1  = 1  −2 1 .2 P 2. . 2  determine B 2 −1 P 2. Let G be a m × 2 matrix such that G Express G in terms of u and v.129 Section 2.5.

8.130 P 2. y2=A*x2 . Explain mathematically why this is so. Compute by hand the matrix product AB in the following cases:   2 −1 −2 −3 3 A =  3 −2  .60). B= . y3=A*x3 . x4 = cos(pi*n/2)’. x2 = cos(pi*n/6)’. (iii) Use the SUBPLOT feature to display bar graphs of all eight vectors (against n) on the same ﬁgure window. x2. and compute the products y1=A*x1 . c(2) = 1. r=c. .7. If  a b c A= d e f  g h i  ﬁnd vectors u and v such that uT Av = b P 2. 4 0 7 1 5     2 0 0 2 −2 3 4 0 . x1 = cos(0*n)’. (i) Write a command that displays the top left 6 × 6 block of A. x3 and x4 as follows: n = 1:60.6. x3 = cos(pi*n/3)’. 4 −5  B= 0 A =  −2 3 −5 1 0 0 1 P 2. (MATLAB) Generate a 60 × 60 matrix A as follows: c = zeros(1. c(1) = -1. y4=A*x4. (ii) Generate four sinusoidal vectors x1.r). A = toeplitz(c. (iv) One of the eight vectors consists mostly of zero values.

P 2.11. Let  a  b  x=   c  d   a  c  Qx =    d  b  If P and Q are 4 × 4 matrices such that   c  a  Px =   and  b  d determine the product PQ. 6:-1:4. (i) Find a (n × n) matrix P such that if x= then Px = xn xn−1 xn−2 . 3:-1:1] .3 P 2. (iii) Let the vector x be given by x[k] = cos(ωk + φ) . determine the rth power Pr for any integer r. . xn T . 7:9] .131 Section 2. P 2. (1 ≤ k ≤ n) Under what conditions on ω. 4:6. Consider the matrices   a b c A= d e f  g h i and  i g h B= f d e  c a b  Express B as P(r) AP(c) .12.9. Matrices A and B are generated in MATLAB using the commands A = [1:3. (ii) Without performing any calculations. . where P(r) and P(c) are permutation matrices. x1 x1 x2 x3 . . T . φ and n is the relationship Px = x true? . Find matrices P and Q such that B = PAQ.10. P 2. . B = [9:-1:7.

Section 2.15.5 P 2.13. .4–2.14. Express each  a b  2a 2b A=  −a −b −2a −2b of the matrices  c d 2c 2d   −c −d  −2c −2d  and  t−1 B =  −2 −1  t t t−3 t−2 1 t 1  t2 t3 t t2   1 t  t−1 1 as a product of a row vector and a column vector. derive A−1 . Solve the simultaneous equations (i) x1 + x1 + 2 x1 + 3 x2 x3 + 2 3 x2 x3 + 3 4 x2 x3 + 4 5 = 14 = 1 = −2 (ii) 3x1 − 5x2 − 5x3 = −37 5x1 − 5x2 − 2x3 = −17 2x1 + 3x2 + 4x3 = 32 using Gaussian elimination.6 P 2. (i) Without performing any matrix operations. Sections 2. (ii) Verify that AA−1 = I by computing the product on the left-hand side of the equation. Consider the matrix A= cos θ − sin θ sin θ cos θ which represents a counterclockwise rotation by an angle θ.132 P 2.

  4  1   b=  −3  4  Section 2.    4 1 21  52  6 6   . (i) Determine the LU and LDV  4 8  1 5 A=  1 4 1 3 factorizations of  4 0 4 −3   7 2  0 −2 (ii) Solve Ax = b.420 2.002 3.18. . (i) Use row pivoting to where  1 2  2 8 A=  3 10 4 12 solve the simultaneous equations Ax = b. Consider the simultaneous equations Ax = b.16 for  1 1 0 3  2 1 −1 1 A=  3 −1 −1 2 −1 2 6 −1    . Repeat P 2.897 3.7 P 2.422 6.8 P 2. (b) with row pivoting. x2 = 1. P 2.412 . Round to four signiﬁcant digits after each computation. b= 3.17. where b = [28 13 23 4]T . where A= 0. by means of the two triangular systems Ly = b and Ux = y P 2.133 Section 2.19.309 The exact solution is x1 = 1.16. Solve using ﬁnite-precision Gaussian elimination (a) without row pivoting.  b=   79  8 8 10 6 82 (ii) Determine the permuted LU factorization LU = PA.

21. Also. P 2. determine (by hand) the answer X. Use Gaussian elimination to determine the inverses of    1 2 4 3 2 1  2 8 6 A =  −1 5 2  and B=  3 10 8 2 3 1 4 12 10 P 2. (ii) Enter .000 1. Consider the simultaneous equations Ax = b.001 3. use the function INVHILB to display its inverse. let each entry of b vary by ±. To that end.000 2. Consider the following MATLAB script: A = [1 2 -4. whose entries are given by aij = 1 . i+j−1  1 6   8  6 is a classic example of an ill-conditioned matrix.20. 2 7 -3]. The command format rat will display it in rational form.000 b =  1. X = A\I Using Gaussian elimination. -1 -1 5. (i) Use the MATLAB function HILB to display the 5 × 5 Hilbert matrix.000 3. examine how the solution x is aﬀected by rounding the entries of b to four signiﬁcant digits. compute norm(x-A\b)). For each of the 23 = 8 extremal values. and determine the maximum distance from the original solution (i.23.0005. I = [1 0 0.999 −2.000  1.. The n × n Hilbert matrix A.e.000  . compute the solution x using the \ (backslash) operator in MATLAB.22.001 1.000 −4. This problem illustrates the eﬀect of perturbing each entry of A by a small random amount. A =  −1. which has integer entries.134 Section 2.000 −2. P 2.000 Assuming that the entries of A are precise. where     3. 0 0 1].9 P 2. 0 1 0.

001*rand(n. s (ii) Show that the projection of ˆ on v(1) is the same as the projection of s s (1) . a(3) and a(4) be the columns of the matrix   1 1/2 1/4 1/8  0 1 1/2 1/4   A=  0 0 1 1/2  0 0 0 1 Determine the least squares approximation p = c1 a(1) + c2 a(2) + c3 a(3) .24. (i) Determine the projection ˆ of s on the plane deﬁned by v(1) and v(2) . Consider the three-dimensional vectors v(1) = [ −1 1 1 ]T . P 2. (iv) Verify that b − f ⊥ a and a − g ⊥ b./(A\b)-1)) Repeat for n=10. (i) Compute a . a(2) .135 format long n = 5. and s = [ 1 2 3 ]T . (ii) Compute the angle θ between a and b (where 0 ≤ θ ≤ π).25. respectively (λ and µ are scalars). A = hilb(n). (iii) Let f be the projection of b on a. What do you observe? Section 2. Consider the four-dimensional vectors a = [ −1 7 2 4 ]T and b = [ 3 0 −1 −5 ]T . v(2) = [ 2 −1 3 ]T . and g be the projection of a on b.26. (Is this result expected from three-dimensional geometry?) on v P 2. Express f and g as λa and µb. b and b − a .10 P 2. b = ones(n.1) c = b + 0.1) Compare the values max(abs(b-c)) and max(abs((A\c). Let a(1) .

28.3 4. ti xi 0.073 0.1 2.44 1.48 0.11 P 2. each showing the discrete data set and the approximating function.2 3. Also determine the relative (root mean square) error p − a(4) a(4) Section 2.5 6.181 0.89 2.6 2 3.00 3.637 0.358 0.299 0.677 0. P 2. Five measurements xi taken at times ti are shown below. (ii) a quadratic f2 (t) = b2 t2 + b1 t + b0 .15 0.27.51 1.1 Let f (t) = c1 t2 + c2 t + c3 be the parabola that yields the least squares ﬁt to the above data set.2 −1 −0.10 0.e.492 0.29 0. Plot two graphs.862 1.22 0.34 1. what is the value of c2 ? P 2.055 0.75 2.88 0.00 0.136 of a(4) based on a(1) .8 1 1.4 5. a(2) and a(3) .29. (ii) Is it possible to deduce the value of c2 independently of the other variables c1 and c3 (i. and let c = [c1 c2 c3 ]T .2 0 0. Consider the data set t s −2 −0..25 It is believed that the data follow an exponential law of the form x(t) = aebt .184 Find the least squares approximation to the data set shown above in terms of (i) a straight line f1 (t) = a1 t + a0 .62 2. without having to solve the entire system of equations)? If so. Consider the data set t s 0. (i) Determine a matrix A and a vector b such that Ac = b.

Find the values of a1 . b2 and c which provide the least squares approximation ˆ to the data s. .30. The data vector s can be found in s2. ln x(t) = ln a + r ln t Determine the values of a and r in the least squares approximation of the vector [si ] = [ln xi ] by the vector [ln a + r ln ti ] (where i = 1. ti xi 10 14. .9 30 26. Plot s and ˆ on the same s s graph. P 2.5t + a2 e−1. .txt contains noisy measurements of y(t) taken every 200 milliseconds starting at t = 0. P 2. s .137 or equivalently. It consists of noisy samples of the sinusoid x(t) = a1 cos(200πt) + a2 sin(200πt) + b1 cos(250πt) + b2 sin(250πt) + c taken every 100 microseconds (starting at t = 0). s Plot s and ˆ on the same graph. . b1 .32.2t (t ≥ 0) where t is in seconds.8 It is believed that the data obey a power law of the form x(t) = atr or equivalently. . The data vector s in s1. a2 .31. P 2. ln x(t) = ln a + bt Determine the values of a and b in the least squares approximation of the vector [si ] = [ln xi ] by the vector [ln a + bti ] (where i = 1. The constants a1 and a2 provide information about the initial state (at t = 0) of the circuit.7 40 32. The transient response of a passive circuit has the form y(t) = a1 e−1. 5). . . 5). Five measurements xi taken at times ti are shown below. . Find the values a1 and a2 which provide the least squares approximation ˆ to the data s.txt.2 50 36.0 20 20.

35.12 P 2. solve Vc = s for s = [ 7 2 −5 ]T . (iii) Determine the projection ˆ of s on the plane deﬁned by v(1) and v(2) . Display VT V. v(2) and v(3) (and thus also V itself) are orthogonal. (ii) Without performing Gaussian elimination. Consider the matrix  1 1 1 0  1 1 −1 0   V=  1 −1 0 1  1 −1 0 −1 (i) Compute the inner product matrix VT V. Consider the 4 × 3 matrix   1 1/2 1/4  0 1 1/3   V=  0 0 1  0 0 0 . (ii) If x= 0 1 −2 3 T  determine a vector c such that Vc = x. Let v(1) . P 2.34. P 2.33. v(2) and v(3) be the columns of the matrix   2 −1 3 1 −1  V= 4 −1 2 2 (i) Show that v(1) .138 Section 2. s What is the relationship between the error vector ˆ − s and the projection s of s onto v(3) ? (iv) Scale the columns of V so as to obtain an orthonormal matrix.

Consider the symmetric matrix   1 −3 2 7 −4  A =  −3 2 −4 5 (i) Using Gaussian elimination.139 (i) The 3 × 3 inner product matrix VT V has an obvious normalized LU factorization of the form VT V = LDU where the diagonal entries of L and U are unity and U = LT . ﬁnd matrices U and D such that A = UT DU where U is an upper triangular matrix whose main diagonal consists of 1’s only. its columns are orthogonal and have nonzero norm).e. (ii) Does there exist a m × 3 matrix V (where m is arbitrary) such that VT V = A? P 2.  .. P 2.37. What are L. D and U in that factorization? (ii) Show how the orthonormal matrix  1  0 W=  0 0  0 0   1  0 0 1 0 0 can be formed by taking linear combinations of the columns of V. and D is a diagonal matrix. Consider the matrix  1 4 −1  −2 1 5   V=  1 −3 3  0 1 1 Find a 3 × 3 nonsingular matrix B such that W = VB−1 is an orthogonal matrix (i.36.

(MATLAB) Consider the discrete-time signals v1. Consider the 5 × 3 matrix  V= v(1) v(2) v(3)   =   2 1 1 1 1  3 2 4 2   1 −1   3 4  2 −1 (i) Express the inner product matrix VT V in the form LDLT .39.01:1)*(pi/4))’. v1 = sin(t). where: w(1) = v(1) w(2) = c12 v(1) + v(2) w(3) = c13 v(1) + c23 v(2) + v(3) P 2. v2. v2 = sin(2*t). w(2) and w(3) . v3 = sin(3*t). v3 and v4 deﬁned by t = ((0:0. obtain a 5 × 3 matrix W such that R(W) = R(V) and the columns of W are orthogonal. Consider a n × 3 real-valued matrix V= which is such that v(1) v(2) v(3)   16 −8 −12 8 8  VT V =  −8 −12 8 11 (i) Determine matrices D (diagonal with positive entries) and U (upper triangular with unit diagonal entries) such that VT V = UT DU (ii) Determine coeﬃcients cij that result in orthogonal vectors w(1) . . P 2.40. v4 = sin(4*t). (ii) Using U = LT . (iii) Determine the projection of s = [ −6 −1 3 −1 8 ]T on R(V) (which is the same as R(W)).140 P 2.38.

w2. P 2. This column provides resolution of order r = 1.e. express x = [ 3 0 4 3 2 6 ]T as a linear combination of the six given sinusoidal vectors. P 2. r6 = cos(pi*n). in terms of sines only.41. w2. reference vectors reﬂect the frequency content of the signal which is being analyzed. The 64 columns of the matrix V are constructed as follows: • The ﬁrst column is an all-ones vector. Fourier sinusoids are not the only class of orthogonal reference vectors used in signal approximation/representation. Verify that the signals obtained are indeed orthogonal..42. . (i) Use MATLAB to verify that the six sinusoidal vectors in R6 deﬁned below are orthogonal: n = (0:5). r2 = cos(pi*n/3). w3 and w4 on the same graph. In wavelet analysis. as well as its “localized” behavior at diﬀerent times. r1 = cos(0*n). This is accomplished by taking as reference vectors scaled and shifted versions of a basic ﬁniteduration pulse (or wavelet). r5 = sin(2*pi*n/3). • The second column is a full-length Haar wavelet: 128 +1’s followed by 128 −1’s. (iii) Without performing any additional computations. which is a succession of two square pulses of opposite amplitudes. r3 = sin(pi*n/3). i. w3 and w4 which are linearly equivalent to the above signals. r4 = cos(2*pi*n/3). (iii) Plot w1.’. (ii) Verify that the vector s = [ 0 −3 1 0 −1 3 ]T can be represented as a linear combination of r3 and r5. (ii) Use the function CHOL to obtain four orthogonal signals w1. and provides resolution of order r = 0.141 (i) Plot all four signals on the same graph. consider the problem of approximating a (m = 256)dimensional signal s using n = 64 reference vectors derived from the basic Haar wavelet. As a simple example.

V = a. The number of columns at that resolution is the maximum possible under the constraint that pulses across columns do not overlap in time (row index).45) (ii) Plot the following two signals: s1 = [exp((0:255)/256)]’.5+exp((116:255)/256)]. end end (i) Compute VT V and hence verify that the columns of V are orthogonal. .txt): p = 8. resolution. Resolution of order r is provided by columns 2r−1 + 1 through 2r of V. % vector size = m = 2^p rmax = 6.:)].1). The matrix V is generated in MATLAB using the following script (also found in haar1. -a(1:2^(p-r)). You can view V using a ribbon plot. for k = 0:2^(r-1)-1 V = [V v(1+mod(-k*2^(p-r+1):-k*2^(p-r+1)+2^p-1. % first column (r=0) is all-ones for r = 1:rmax v = [a(1:2^(p-r)). n = 2^rmax a = ones(2^p. where each column (i.e. s2 = [exp((0:115)/256) -0. 2^p). followed by 128 zeros.1)]. reference vector) is plotted as a separate ribbon: ribbon(V) % Maximize the graph window for a better view view(15. also. • We continue in this fashion to obtain resolutions of order up to r = 6..142 • The third column is a half-length Haar wavelet: a pulse of 64 +1’s followed by 64 −1’s. zeros(2^p-2^(p-r+1). These two columns together provide resolution of order r = 2. Each of these columns contains a scaled and shifted Haar wavelet of duration 29−r . % max. In the fourth column.45) pause view(45.’. the pulse is shifted to the right by 128 positions so that it does not overlap the pulse in the third column.

(i) Find a real-valued 4 × 4 matrix F and a real-valued 4 × 1 vector c such that the above equation is equivalent to   x1  y  F 1  = c  x2  y2 where z1 = x1 + jy1 . Consider the equation Az = b. which provide the highest resolution (r = 6). where A and s are chosen appropriately.) Section 2. this is because s1 looks rather smooth at high resolution.143 (iii) Using the command y = A*((A’*A)\(A’*s)). z2 = x2 + jy2 .13 P 2. −1 + j −2 − 3j 1+j 3 . using columns 33–64 only.43. in which case the matrix V has 256 columns and can provide a complete representation of any 256-point signal. The order of the resolution can be increased to r = 8. each column of V consists of 254 zeros. (ii) Use MATLAB to verify that A\b and F\c are indeed equal. At the highest resolution. On the other hand. (Note that projecting s1 on the highest-resolution columns results in a vector of small values compared to the amplitude of s1. determine and plot the least-squares approximations of s1 and s2 based on the entire matrix V. Repeat. b= 4 2−j . where A= and z ∈ C2×1 . s2 has a sharp discontinuity which is captured in the highest-resolution projection. one +1 and one −1.

Chapter 3

The Discrete Fourier Transform

3.1 Complex Sinusoids as Reference Vectors

3.1.1 Introduction

We have seen how a discrete-time signal consisting of finitely many samples can be represented as a vector, and how a linear transformation of that vector can be described by a matrix. In what follows, we will consider signal vectors x of length N, i.e., containing N samples or points. The time index n will range from n = 0 to n = N − 1. Thus

    x = [ x[0]  x[1]  ...  x[N − 1] ]^T

Note the shift relative to our earlier indexing scheme: n = 0, ..., N − 1 instead of n = 1, ..., N.

Figure 3.1: An N-point signal vector x.

The sample values in x provide an immediate representation of the signal in terms of unit pulses at times n = 0, ..., N − 1:

    x = x[0] e(0) + x[1] e(1) + · · · + x[N − 1] e(N−1)

The most important feature of this representation is localization in time: each entry x[n] of the vector x gives us the exact value of the signal at time n.

In signal analysis, we seek to interpret the information present in a signal. In particular, we are interested in how signal values at different times depend on each other. In general, this sort of information cannot be obtained by inspection of the vector x. For example, if the samples in x follow a pattern closely described by a linear, exponential or sinusoidal function of time, that pattern cannot be easily discerned by looking at individual numbers in x. And while a trained eye could detect such a simple pattern by looking at a plot of x, it could miss a more complicated pattern such as a sum of five sinusoids of different frequencies.

And while a trained eye could detect such a simple pattern by looking at a plot of x, it could miss a more complicated pattern such as a sum of five sinusoids of different frequencies.

Fourier analysis is the most common approach to signal analysis, and is based on representing or approximating a signal by a linear combination of sinusoids. The importance of sinusoids is due to (among other factors):

• their immunity to distortion when propagating through a linear medium, or when processed by a linear filter;

• our perception of real-world signals, e.g., our ability to detect and separate different frequencies in audio signals, as well as our ability to group harmonics into a single note.

In Fourier analysis, we represent x by a vector X of the same length, where each entry of X provides the (complex) amplitude of a complex sinusoid. The two representations are equivalent. In contrast to the vector x itself, whose values provide localization in time, the vector X provides localization in frequency.

3.1.2 Fourier Sinusoids

A standard complex sinusoid x of length N and frequency ω is given by

x[n] = e^{jωn},   n = 0, ..., N − 1

The designation "standard" means that the complex-valued amplitude is unity; equivalently, the real-valued amplitude is unity and the phase shift is zero. Note that regardless of the value of ω, x[0] = 1.

Recall from Subsection 1.5.3 that frequencies ω1 and ω2 related by ω2 = ω1 + 2kπ (k ∈ Z) yield the same values for x and are thus equivalent. This enables us to limit the effective frequency range to a single interval of angles of length 2π. In what follows, we will usually assume that

0 ≤ ω < 2π

If we extend the discrete time axis to Z (i.e., n = −∞ to n = ∞), then

x[n] = e^{jωn},   n ∈ Z

defines a complex sinusoid having infinite duration. The sinusoid x[·] will repeat itself every N samples provided

(∀n)  e^{jω(n+N)} = e^{jωn}

which reduces to

e^{jωN} = 1

i.e.,

ω = k · (2π/N)   for some integer k.

Since the effective range of ω is [0, 2π), it follows that k can be limited to the integers 0, 1, ..., N−1. Thus there are exactly N distinct frequencies ω for which x[n] = e^{jωn} is periodic with (fundamental) period equal to N or a submultiple of N. These frequencies are given by

ω = 0, 2π/N, 4π/N, ..., (N−1)2π/N

(This expression was also obtained in Subsection 1.5.4.)

Definition 3.1.1. The Fourier frequencies for an N-point vector are given by

ω = k · (2π/N)

where k = 0, 1, ..., N−1. The corresponding Fourier sinusoids v^{(0)}, v^{(1)}, ..., v^{(N−1)} are the N-point vectors given by

v^{(k)}[n] = e^{j(2π/N)kn},   n = 0, ..., N−1

We note the following:

• All Fourier frequencies are multiples of 2π/N.

• ω = 0 (which results in a constant signal of value +1) is always a Fourier frequency, corresponding to k = 0 in the above definition.

• ω = π (which results in a signal whose value alternates between +1 and −1) is a Fourier frequency only when N is even.

• If ω ∈ (0, π) is a Fourier frequency, then so is 2π − ω ∈ (π, 2π). Put differently, if ω is the kth Fourier frequency, then 2π − ω is the (N−k)th Fourier frequency. The corresponding Fourier sinusoids are complex conjugates of each other:

e^{j(2π−ω)n} = e^{−jωn} = (e^{jωn})*,   n = 0, ..., N−1
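The facts above can be checked numerically. The following sketch in Python/NumPy (the text's own computational examples use MATLAB; all variable names here are illustrative) verifies that frequencies differing by 2π give identical discrete-time sinusoids, and that a Fourier frequency k(2π/N) yields a sinusoid that repeats every N samples:

```python
import numpy as np

N = 8
n = np.arange(N)

# Frequencies differing by 2*pi are equivalent in discrete time:
w = 2 * np.pi * 3 / N            # the k = 3 Fourier frequency
x1 = np.exp(1j * w * n)
x2 = np.exp(1j * (w + 2 * np.pi) * n)
print(np.allclose(x1, x2))       # True

# A Fourier frequency repeats every N samples on the extended time axis:
n_ext = np.arange(3 * N)
x = np.exp(1j * w * n_ext)
print(np.allclose(x[:N], x[N:2*N]))   # True
```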

The Fourier frequencies ω for N = 7 (odd) and N = 8 (even) are illustrated in Figure 3.2.

Figure 3.2: Fourier frequencies for N = 7 (left) and N = 8 (right).

Fourier sinusoids of length N will be used as reference vectors for signals in R^{N×1}, and more generally, C^{N×1}. The notation v^{(k)} for the kth Fourier sinusoid is consistent with our earlier notation for reference vectors. In particular, we can arrange the Fourier sinusoids as columns of an N × N matrix V:

V = [ v^{(0)}  v^{(1)}  ...  v^{(N−1)} ]

For V, the row index n = 0, ..., N−1 represents time, while the column index k = 0, ..., N−1 represents frequency. The (n, k)th entry of V is given by

V_{nk} = v^{(k)}[n] = e^{j(2π/N)kn} = cos(2πkn/N) + j sin(2πkn/N)

Note in particular that V_{nk} = V_{kn}, i.e., V is always symmetric. The kth column is generated by moving counterclockwise on the unit circle in increments of ω = k(2π/N), starting from z = 1. Note that the n = 0th row and k = 0th column equal unity for all N.

The Fourier frequencies and matrices V for N = 1, 2, 3 and 4 are shown in Figure 3.3.

3.1.3 Orthogonality of Fourier Sinusoids

Fourier sinusoids of the same length N are orthogonal. As a result, they are well-suited as reference vectors. (Recall that the least squares approximation of a signal by a set of reference vectors is the vector sum of the projections of the signal on each reference vector.) Proof of the orthogonality of Fourier sinusoids involves the summation of geometric series, which we review below.

For N = 1, 2, 3 and 4, the matrices V are

[1],   [ 1  1
         1 −1 ],

[ 1       1            1
  1  −1/2 + j√3/2  −1/2 − j√3/2
  1  −1/2 − j√3/2  −1/2 + j√3/2 ],

[ 1   1   1   1
  1   j  −1  −j
  1  −1   1  −1
  1  −j  −1   j ]

Figure 3.3: Fourier sinusoids for N = 1, 2, 3 and 4. The entries of each matrix are marked on the corresponding unit circle.

Fact. If z is complex and z ≠ 1, then

G_n(z) := 1 + z + z² + ... + z^n = (1 − z^{n+1}) / (1 − z)

If z = 1, then G_n(1) = n + 1. The limit of G_n(z) as n → ∞ exists if and only if |z| < 1, and is given by

G_∞(z) = 1 / (1 − z)

The above results are from elementary calculus, and are easily obtained by noting that:

• G_n(z) − zG_n(z) = 1 − z^{n+1};

• if z = 1, then all terms in the geometric series are equal to 1;

• if |z| < 1, then |z|^{n+1} converges to zero;

• if |z| > 1, then |z|^{n+1} = |z^{n+1}| diverges to infinity;

• if |z| = 1 and z ≠ 1, then z^{n+1} constantly rotates on the unit circle (hence it does not converge).

To establish the orthogonality of the N-point Fourier sinusoids v^{(k)} and v^{(ℓ)} having frequencies ω = k(2π/N) and ω = ℓ(2π/N), we

need to compute the inner product of v^{(k)} and v^{(ℓ)}:

⟨v^{(k)}, v^{(ℓ)}⟩ = Σ_{n=0}^{N−1} (e^{j(2π/N)kn})* e^{j(2π/N)ℓn} = Σ_{n=0}^{N−1} e^{j(2π/N)(ℓ−k)n} = Σ_{n=0}^{N−1} z^n = G_{N−1}(z)

where z = e^{j(2π/N)(ℓ−k)}. If k = ℓ, then z = 1 and

⟨v^{(k)}, v^{(k)}⟩ = ‖v^{(k)}‖² = N

(Note that each entry of v^{(k)} is on the unit circle, hence its squared magnitude equals unity. Of course, this is also consistent with G_{N−1}(1) = N.)

If 0 ≤ k ≠ ℓ ≤ N−1, then ℓ − k cannot be a multiple of N, and thus z ≠ 1. But

z^N = e^{j(2π/N)(ℓ−k)N} = e^{j2π(ℓ−k)} = 1

and hence, using the formula for G_{N−1}(z),

⟨v^{(k)}, v^{(ℓ)}⟩ = (1 − z^N) / (1 − z) = 0

as needed. We have thus obtained the following result.

Fact. The N-point complex Fourier sinusoids comprising the columns of the matrix V = [ v^{(0)}  v^{(1)}  ...  v^{(N−1)} ] are orthogonal, each having squared norm equal to N. In terms of the inner product matrix V^H V, we have

V^H V = N I
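The identity V^H V = N I is easy to confirm numerically; a short NumPy sketch (not part of the original text, which works in MATLAB) building V from its definition:

```python
import numpy as np

N = 8
n = np.arange(N).reshape(-1, 1)     # time index (rows)
k = np.arange(N).reshape(1, -1)     # frequency index (columns)
V = np.exp(2j * np.pi * n * k / N)  # IDFT matrix, V[n, k] = e^{j(2*pi/N)kn}

# Columns are orthogonal with squared norm N:  V^H V = N I
print(np.allclose(V.conj().T @ V, N * np.eye(N)))   # True
```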

Since V is a square matrix, the above relationship tells us that it is also nonsingular and has inverse

V^{−1} = (1/N) V^H

It follows that the N complex Fourier sinusoids v^{(0)}, v^{(1)}, ..., v^{(N−1)} are also linearly independent. In the context of signal approximation and representation, the above facts imply the following:

• The least-squares approximation of any real or complex-valued signal s based on any subset of the above N complex Fourier sinusoids is given by the sum of the projections of s onto each of the sinusoids, where the projection of s onto v^{(k)} is given by the standard expression

( ⟨v^{(k)}, s⟩ / N ) v^{(k)}

• The least-squares approximation of s based on all N complex Fourier sinusoids is an exact representation of s, and is given by

s = Vc  ⇔  c = V^{−1}s = (1/N) V^H s

Example 3.1.1. Let N = 3, and consider the signals

s = [2 −1 −1]^T   and   x = [2 −1 0]^T

The Fourier exponentials are given by the columns of

V = [ 1       1            1
      1  −1/2 + j√3/2  −1/2 − j√3/2
      1  −1/2 − j√3/2  −1/2 + j√3/2 ]

We have s = Vc, where

c0 = (1/3)⟨v^{(0)}, s⟩ = 0
c1 = (1/3)⟨v^{(1)}, s⟩ = 1
c2 = (1/3)⟨v^{(2)}, s⟩ = 1

and x = Vd,

where

d0 = (1/3)⟨v^{(0)}, x⟩ = 1/3
d1 = (1/3)⟨v^{(1)}, x⟩ = 5/6 + j√3/6
d2 = (1/3)⟨v^{(2)}, x⟩ = 5/6 − j√3/6
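Example 3.1.1 can be reproduced in a few lines. A NumPy sketch (the variable names mirror the example; the text itself would do this in MATLAB):

```python
import numpy as np

N = 3
n = np.arange(N).reshape(-1, 1)
k = np.arange(N).reshape(1, -1)
V = np.exp(2j * np.pi * n * k / N)       # Fourier exponentials as columns

s = np.array([2, -1, -1], dtype=complex)
x = np.array([2, -1, 0], dtype=complex)

c = V.conj().T @ s / N                   # coefficients (1/N) <v^(k), s>
d = V.conj().T @ x / N

print(np.allclose(c, [0, 1, 1]))                          # True
print(np.allclose(V @ c, s), np.allclose(V @ d, x))       # True True
```

The coefficient vector d comes out as [1/3, 5/6 + j√3/6, 5/6 − j√3/6], matching the values above.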

3.2 The Discrete Fourier Transform and its Inverse

3.2.1 Definitions

We begin our in-depth discussion of the representation of an N-point signal vector s by the Fourier sinusoids in V with a simple change of notation: we replace the coefficient vector c in the equations

s = Vc  ⇔  c = (1/N) V^H s

by S/N, i.e., we define S = Nc. We thus obtain

s = (1/N) VS  ⇔  S = V^H s

Recall that the matrix V is symmetric:

V_{nk} = e^{j(2π/N)kn} = V_{kn}

Thus V^H = (V^T)* = V*, and

(V^H)_{nk} = e^{−j(2π/N)kn}

Definition 3.2.1. The discrete Fourier transform (DFT) of the N-point vector s is the N-point vector S defined by

S = V^H s

or equivalently,

S[k] = Σ_{n=0}^{N−1} s[n] e^{−j(2π/N)kn},   k = 0, ..., N−1

S will also be referred to as the (complex) spectrum of s.

The equations in the definition of the DFT are also known as the analysis equations: they yield the coefficients needed in order to express the signal s as a sum of Fourier sinusoids. Since these sinusoids are the components of s, computation of the coefficients S[0], ..., S[N−1] amounts to analyzing the signal into N distinct frequencies.

For the remainder of this chapter, the upper-case boldface symbols S, X and Y will be reserved for the discrete Fourier transforms of the signal vectors s, x and y, respectively.

Definition 3.2.2. The inverse discrete Fourier transform (IDFT) of the N-point vector X is the N-point vector x defined by

x = (1/N) VX

or equivalently,

x[n] = (1/N) Σ_{k=0}^{N−1} X[k] e^{j(2π/N)kn},   n = 0, ..., N−1

The equations in the definition of the IDFT are known as the synthesis equations, since they produce a signal vector x by summing together complex sinusoids with (complex) amplitudes given by the entries of (1/N)X.

Clearly,

• if X is the DFT of x, then x is the IDFT of X, and vice versa;

• the analysis and synthesis equations are equivalent.

A signal x and its DFT X form a DFT pair, denoted by

x ←→ X

Throughout our discussion, the lower-case symbol x will denote a signal in the time domain, evolving in time n. The DFT (or spectrum) X will be regarded as a signal in the frequency domain, since its values are indexed by the frequency parameter k (corresponding to a radian frequency ω = k(2π/N)).

Even though time and frequency have a distinctly different physical meaning, the two entities are treated very similarly in the discrete Fourier transform and its inverse. This similarity is quite evident in the analysis and synthesis equations, which differ only in terms of a complex conjugate and a scaling factor. The two signals x and X have the same physical dimensions and units. Either signal can be assigned arbitrary values in C^{N×1}; such assignment automatically determines the value of the other signal (through the analysis or synthesis equations). Thus any signal in the time domain is also perfectly valid as a signal in the frequency domain (i.e., as the spectrum of a different time-domain signal), and vice versa.
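The analysis and synthesis equations can be exercised directly in matrix form. A NumPy sketch (illustrative only; note that NumPy's np.fft.fft happens to use the same sign and scaling convention as the analysis equation above):

```python
import numpy as np

N = 8
n = np.arange(N).reshape(-1, 1)
k = np.arange(N).reshape(1, -1)
V = np.exp(2j * np.pi * n * k / N)    # IDFT matrix
W = V.conj()                          # DFT matrix, W = V^H = V*

rng = np.random.default_rng(0)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

X = W @ x                 # analysis equation
x_back = V @ X / N        # synthesis equation

print(np.allclose(x_back, x))           # True: the equations are inverses
print(np.allclose(X, np.fft.fft(x)))    # True: same convention as np.fft.fft
```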

3.2.2 Symmetry Properties of the DFT and IDFT Matrices

Recall that the N × N matrix

V = [ v^{(0)}  v^{(1)}  ...  v^{(N−1)} ]

is given by

V_{nk} = e^{j(2π/N)kn}

where the row index n = 0, ..., N−1 corresponds to time and the column index k = 0, ..., N−1 corresponds to frequency. V will be referred to as the IDFT matrix for an N-point vector. Whenever the value of N is not clear from the context, we will use V_N instead of V.

Letting

v = v_N := e^{j(2π/N)} = cos(2π/N) + j sin(2π/N)

we obtain the simple expression V_{nk} = v^{kn}, and can thus write

V = [ 1    1        1           ...   1
      1    v        v²          ...   v^{N−1}
      1    v²       v⁴          ...   v^{2(N−1)}
      ...
      1  v^{N−1}  v^{2(N−1)}    ...   v^{(N−1)²} ]

As noted earlier, the entries of V are symmetric about the main diagonal, since V_{nk} = v^{kn} = v^{nk}. For other structural properties of V, we turn to the k = 1st column

v^{(1)} = [ 1  v  v²  ...  v^{N−1} ]^T

and note that the elements of that column represent equally spaced points on the unit circle. Note that

v^N = e^{j(2π/N)N} = e^{j2π} = 1

and thus for any integer r,

v^{n+rN} = v^n

Every integer power of v thus equals one of 1, v, ..., v^{N−1}, which means that all columns of V can be formed using entries taken from the k = 1st column v^{(1)}. This is illustrated in Figure 3.4.

Figure 3.4: The entries of the ﬁrst Fourier sinusoid marked on the
unit circle for N = 7 (left) and N = 8 (right).

Example 3.2.1. For N = 6, the entries of V = V₆ can be expressed in terms of 1, v, v², v³, v⁴ and v⁵, where v = e^{jπ/3}. Note that v³ = e^{jπ} = −1. We have

V = [ 1  1          1          1   1          1
      1  e^{jπ/3}   e^{j2π/3}  −1  e^{j4π/3}  e^{j5π/3}
      1  e^{j2π/3}  e^{j4π/3}  1   e^{j2π/3}  e^{j4π/3}
      1  −1         1          −1  1          −1
      1  e^{j4π/3}  e^{j2π/3}  1   e^{j4π/3}  e^{j2π/3}
      1  e^{j5π/3}  e^{j4π/3}  −1  e^{j2π/3}  e^{jπ/3}  ]

or equivalently (since e^{j(2π−θ)} = e^{−jθ}):

V = [ 1  1           1           1   1           1
      1  e^{jπ/3}    e^{j2π/3}   −1  e^{−j2π/3}  e^{−jπ/3}
      1  e^{j2π/3}   e^{−j2π/3}  1   e^{j2π/3}   e^{−j2π/3}
      1  −1          1           −1  1           −1
      1  e^{−j2π/3}  e^{j2π/3}   1   e^{−j2π/3}  e^{j2π/3}
      1  e^{−jπ/3}   e^{−j2π/3}  −1  e^{j2π/3}   e^{jπ/3}  ]
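The N = 6 matrix above is quickly generated and checked by machine. A NumPy sketch (illustrative; the text's examples use MATLAB):

```python
import numpy as np

N = 6
v = np.exp(1j * np.pi / 3)            # v = e^{j*pi/3} = e^{j*2*pi/6}
n = np.arange(N).reshape(-1, 1)
k = np.arange(N).reshape(1, -1)
V6 = v ** (n * k)                     # V[n, k] = v^{kn}

print(np.allclose(V6, V6.T))                        # True: V is symmetric
print(np.allclose(V6[3], [1, -1, 1, -1, 1, -1]))    # True: row n = 3 alternates, since v^3 = -1
```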

The symmetric nature of V = V₆ is evident in Example 3.2.1. We note that in the general case (arbitrary N), the nth and (N−n)th entries of v^{(k)} are complex conjugates of each other:

v^{k(N−n)} = v^{−kn} = (v^{kn})*

This means that the elements of the k = 1st column exhibit a form of conjugate symmetry. The center of symmetry is the row index N/2, which

is at equal distance from n and N − n:

n = N/2 − (N − 2n)/2   and   N − n = N/2 + (N − 2n)/2

The value N/2 is an actual row index if N is even, and is midway between two row indices if N is odd. We note also that the n = 0th entry is not part of a conjugate symmetric pair, since the highest row index is n = N − 1, not n = N.

Clearly, the same conjugate symmetry arises in the columns of V, with center of symmetry given by the column index k = N/2. This follows easily from the symmetry of V about its diagonal, i.e., from V = V^T. We finally note that since

v^{(N−k)(N−n)} = v^{N² − kN − nN + kn} = v^{kn}

we also have radial symmetry (without conjugation) about the point (n, k) = (N/2, N/2):

V_{N−n,N−k} = V_{nk}

In particular, radial symmetry implies that entries n = 1, ..., N−1 in the (N−k)th column can be obtained by reversing the order of the same entries in the kth column. The symmetry properties of V are summarized in Figure 3.5.

Definition 3.2.3. The DFT matrix W = W_N for an N-point vector is defined by

W = V^H = V*

Specifically, the (k, n)th entry of W is given by

W_{kn} = w^{kn},   where   w = w_N := v_N^{−1} = e^{−j2π/N}

W is known as the DFT matrix because it appears without conjugation in the analysis equation:

X = V^H x = Wx

Note that in the case of W, the frequency index k is the row index, while the time index n is the column index. This is also consistent with the indexing scheme used in the sum form of the analysis equation:

X[k] = Σ_{n=0}^{N−1} x[n] e^{−j(2π/N)kn}

Figure 3.5: Symmetry properties of the IDFT matrix V. Equal values are denoted by "=" and conjugate values by "*=".

We note the following:

• The introduction of W allows us to write the equation V^H V = VV^H = N I in a variety of equivalent forms, using V^H = V* = W and V = W^H = W*.

• Since W is just the complex conjugate of V, it has the same symmetry properties as V.

3.2.3 Signal Structure and the Discrete Fourier Transform

The structure of a signal determines the structure of its spectrum. In order to understand and apply Fourier analysis, one needs to know how certain key features of a signal in the time domain are reﬂected in its spectrum in the frequency domain. A fundamental property of the discrete Fourier transform is its linearity, which is due to the matrix-vector form of the DFT and IDFT transformations.

DFT 1. (Linearity) If x ←→ X and y ←→ Y, then for any real or complex scalars α and β,

s = αx + βy ←→ S = αX + βY

This can be proved using either the analysis or the synthesis equation. Using the former,

S = Ws = W(αx + βy) = αWx + βWy = αX + βY

as needed.

Another fundamental property of the Fourier transform is the duality of the DFT and IDFT transformations. Duality stems from the fact that the two transformations are obtained using matrices W = V* and (1/N)V, which differ only by a scaling factor and a complex conjugate. As a result, if x ←→ X is a DFT pair, then the time-domain signal y = X has a spectrum Y whose structure is very similar to that of the original time-domain signal x. The precise statement of the duality property will be given in Section 3.4.

Before continuing with our systematic development of DFT properties, we consider four simple signals and compute their spectra.

Example 3.2.2. Let x be the 0th unit vector e^{(0)}, namely a single unit pulse at time n = 0:

x = e^{(0)} = [1 0 0 ... 0]^T

The DFT X is given by

X = We^{(0)}

Right-multiplying W by a unit vector amounts to column selection. In this case, the 0th column of W is selected:

X = [1 1 1 ... 1]^T = 1

i.e., X is the all-ones vector. The same result is obtained using the analysis equation in its sum form:

X[k] = Σ_{n=0}^{N−1} x[n]w^{kn} = 1 · w^0 = 1

for all k = 0, ..., N − 1.
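As a one-line sanity check in NumPy (illustrative, not from the text; np.fft.fft follows the same analysis-equation convention):

```python
import numpy as np

N = 8
x = np.zeros(N); x[0] = 1.0           # unit pulse at n = 0
X = np.fft.fft(x)                     # same convention as X = Wx
print(np.allclose(X, np.ones(N)))     # True: a flat, all-ones spectrum
```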

Example 3.2.2: stem plots of the unit pulse x (left) and its all-ones spectrum X (right), for indices 0, ..., N−1.

Example 3.2.3. We now delay the unit pulse by taking x = e^{(m)}, or equivalently,

x[n] = 1 if n = m,   and   x[n] = 0 if n ≠ m,

where 0 ≤ m ≤ N − 1. The DFT becomes

X = We^{(m)}

i.e., X is the mth column of W, which is the same as the complex conjugate of v^{(m)} (the mth column of V). Thus X is a Fourier exponential of frequency (N−m)(2π/N), or equivalently, −m(2π/N):

X[k] = w^{km} = v^{−km} = e^{−j(2π/N)km}

Of course, the same result is obtained using the analysis equation in its sum form:

X[k] = Σ_{n=0}^{N−1} x[n]w^{kn} = 1 · w^{km}

Note that in this example, the spectrum is complex-valued. It is purely real-valued (for all k) if and only if m = 0 (i.e., there is no delay) or m = N/2. The latter value of m is an actual time index only when the number of samples N is even, in which case the resulting spectrum is given by X[k] = (−1)^k.

Example 3.2.4. Building upon Example 3.2.3, we add a second pulse of unit height at time n = N − m (where m ≠ 0). The two pulses are now symmetric to each other relative to the time instant n = N/2. Denoting the new signal by s, we have

s = e^{(m)} + e^{(N−m)}

and thus by linearity of the DFT we obtain

S = We^{(m)} + We^{(N−m)}

This is the sum of the DFTs obtained in Example 3.2.3 for delays m and N − m:

S[k] = e^{−j(2π/N)km} + e^{−j(2π/N)k(N−m)}
     = e^{−j(2π/N)km} + e^{j(2π/N)km}
     = 2 cos(2πmk/N)

The spectrum is now purely real-valued. The figure illustrates the case N = 8 and m = 3.

Example 3.2.4 (N = 8, m = 3): the two-pulse signal s (left) and its real-valued spectrum S (right).

Example 3.2.5. Finally, we introduce unit pulses at every time instant, resulting in the constant signal y = 1. The analysis equation gives

Y = W1

i.e., Y is the (componentwise) sum of all columns of W. The sum form of the equation gives

Y[k] = Σ_{n=0}^{N−1} 1 · w^{kn}

which is the same as the geometric sum G_{N−1}(w^k). It is also the inner product ⟨v^{(k)}, v^{(0)}⟩, which equals N if k = 0 and zero if k ≠ 0. The resulting spectrum is

Y = N e^{(0)}

For a simpler way of deriving Y, note that y = 1 = v^{(0)}, i.e., y consists of a single Fourier sinusoid of frequency zero (corresponding to k = 0) and unit amplitude. This means that in the synthesis equation

y = (1/N) VY

the DFT vector Y selects the k = 0th column of V and multiplies it by N. Hence Y = N e^{(0)}.

Example 3.2.5: the constant signal y = 1 (left) and its spectrum Y = N e^{(0)} (right).

The graphs for Examples 3.2.2 and 3.2.5 are similar: essentially, time and frequency domains are interchanged, with one of the signals undergoing scaling by N. This similarity is a consequence of the duality property of the DFT, which will be discussed formally in Section 3.4.
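The spectra of Examples 3.2.3 through 3.2.5 can all be confirmed with np.fft.fft, which uses the same convention as the analysis equation. A NumPy sketch (not from the text; N = 8 and m = 3 as in the figure):

```python
import numpy as np

N, m = 8, 3
k = np.arange(N)

# Example 3.2.3: a delayed unit pulse has spectrum X[k] = e^{-j(2*pi/N)km}
x = np.zeros(N); x[m] = 1.0
print(np.allclose(np.fft.fft(x), np.exp(-2j * np.pi * k * m / N)))   # True

# Example 3.2.4: pulses at n = m and n = N - m give the real spectrum 2 cos(2*pi*mk/N)
s = np.zeros(N); s[m] = 1.0; s[N - m] = 1.0
print(np.allclose(np.fft.fft(s), 2 * np.cos(2 * np.pi * m * k / N)))   # True

# Example 3.2.5: the constant signal y = 1 has spectrum N * e^(0)
y = np.ones(N)
print(np.allclose(np.fft.fft(y), N * np.eye(N)[0]))   # True
```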

3.3 Structural Properties of the DFT: Part I

3.3.1 The Spectrum of a Real-Valued Signal

As we saw in the previous section, any complex-valued vector can be viewed as either

• a time-domain signal, or

• a frequency-domain signal, i.e., the DFT (or spectrum) of a time-domain signal.

Of course, all time-domain signals encountered in practice are real-valued; the generalization to complex-valued signals is necessary in order to include complex sinusoids (and, as we shall see later, complex exponentials) in our analysis. A natural question to ask is whether the spectrum X of a real-valued signal x has special properties that distinguish it from the spectrum of an arbitrary complex-valued signal in the time domain. The answer is affirmative, and is summarized below.

DFT 2. (DFT of a Real-Valued Signal) If x is a real-valued N-point signal, then X satisfies

X[0] = X*[0]   and   X[k] = X*[N−k]   for k = 1, ..., N−1.

To prove this property, we note first that

X[0] = Σ_{n=0}^{N−1} x[n]w^{0·n} = Σ_{n=0}^{N−1} x[n]

which is real-valued since x has real-valued entries. For k = 1, ..., N−1, we have

X[k] = Σ_{n=0}^{N−1} x[n]w^{kn}

and

X[N−k] = Σ_{n=0}^{N−1} x[n]w^{(N−k)n} = Σ_{n=0}^{N−1} x[n]w^{−kn}

Taking complex conjugates across the second equation, we obtain

X*[N−k] = ( Σ_{n=0}^{N−1} x[n]w^{−kn} )* = Σ_{n=0}^{N−1} x*[n]w^{kn}

Since x is real-valued, we have x*[·] = x[·] and therefore X[k] = X*[N−k], as needed.

DFT 2 tells us that the DFT (or spectrum) of a real-valued signal exhibits the same kind of conjugate symmetry as was seen in the rows and columns of W and V. This symmetry will be explored further in this chapter.

Expressing X[k] in polar form,

X[k] = |X[k]| · e^{j∠X[k]}

we obtain two new frequency-domain vectors indexed by k = 0, ..., N−1. These are:

• The amplitude spectrum, given by |X[·]|. If x is real-valued, then DFT 2 implies that

|X[k]| = |X*[N−k]| = |X[N−k]|,   k = 1, ..., N−1

i.e., the amplitude spectrum is symmetric in k, with center of symmetry at k = N/2.

• The phase spectrum, given by the angle ∠X[·] quoted in the interval [−π, π]. If x is real-valued, then DFT 2 implies that ∠X[0] = 0 or ±π, and

∠X[k] = −∠X[N−k],   k = 1, ..., N−1

i.e., the phase spectrum is antisymmetric in k.
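The conjugate symmetry of DFT 2 is easy to observe on a random real-valued signal; a NumPy sketch (illustrative, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
x = rng.standard_normal(N)        # real-valued signal
X = np.fft.fft(x)

k = np.arange(1, N)
print(np.isclose(X[0].imag, 0.0))              # True: X[0] is real
print(np.allclose(X[k], X[N - k].conj()))      # True: X[k] = X*[N-k]
```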

It is always possible to express a real-valued signal vector x as a linear combination of real-valued sinusoids at the Fourier frequencies. Indeed, the conjugate symmetry of the spectrum X allows us to combine the complex conjugate terms X[k]e^{j(2π/N)kn} and

X[N−k]e^{j(2π/N)(N−k)n} = X*[k]e^{−j(2π/N)kn}

into a single real-valued sinusoid:

2 Re( X[k]e^{j(2π/N)kn} ) = 2|X[k]| cos( 2πkn/N + ∠X[k] )

(Note that such pairs occur for values of k other than 0 or N/2.) The resulting real-valued form of the synthesis equation involves Fourier frequencies in the range [0, π] only:

x[n] = (1/N)X[0] + (1/N)X[N/2](−1)^n + (2/N) Σ_{0<k<N/2} |X[k]| cos( 2πkn/N + ∠X[k] )

The second term (corresponding to frequency ω = π) is present only when N/2 is an integer, i.e., when N is even.

Example 3.3.1. In Example 3.1.1, we considered the vector

x = [2 −1 0]^T

and evaluated its DFT as

X = [ 1,  5/2 + j√3/2,  5/2 − j√3/2 ]^T

(note the scaling X = 3d). There is only one pair of conjugate symmetric entries here:

X[1] = X*[2]

The amplitude spectrum is given by

[1  2.6458  2.6458]^T

while the phase spectrum is given by

[0  0.3335  −0.3335]^T

Example 3.3.1. (Continued.) The representation of x = [2 −1 0]^T using real sinusoids involves a constant term (corresponding to k = 0) together with a sinusoid of frequency 2π/3 (corresponding to k = 1 and k = 2). Substituting the values of X[k] computed earlier into the last equation, we obtain

x[n] = 0.3333 + 1.7639 cos( 2πn/3 + 0.3335 )

for n = 0, 1, 2.

3.3.2 Signal Permutations

Certain important properties of the DFT involve signal transformations in both time and frequency domains. These transformations are simple permutations of the elements of a vector and, as such, can be described by permutation matrices. Recall from Section 2.3 that the columns of an N × N permutation matrix are the unit vectors e^{(0)}, ..., e^{(N−1)} listed in arbitrary order. We introduce three permutations which are particularly important in our study of the DFT.

Circular (or Cyclic) Shift P: This permutation is defined by the relationship

P(x[0], x[1], ..., x[N−1]) = (x[N−1], x[0], ..., x[N−2])

If x is a column vector, a circular shift on x produces the vector Px, where P is a permutation matrix given by

P = [ 0 0 ... 0 1
      1 0 ... 0 0
      0 1 ... 0 0
      ...
      0 0 ... 1 0 ]

Note that removal of the first row and last column of P yields an (N−1) × (N−1) identity matrix. The transformation can be illustrated by placing the N entries of x on a "time wheel", in the same way as the Fourier frequencies appear on the unit circle, namely counterclockwise, starting at angle zero. A circular shift amounts to rotating the wheel counterclockwise by one notch (i.e., by an angle of 2π/N), so that x[N−1] appears at angle zero. The rotation is illustrated in Figure 3.6 for N = 7.
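Returning briefly to Example 3.3.1, the real-sinusoid form derived above can be checked numerically. A NumPy sketch (illustrative; the magnitude and angle match the 2.6458 and 0.3335 quoted earlier):

```python
import numpy as np

x = np.array([2.0, -1.0, 0.0])
X = np.fft.fft(x)                 # X = [1, 5/2 + j*sqrt(3)/2, 5/2 - j*sqrt(3)/2]
N = 3
n = np.arange(N)

# Real form (N odd, so there is no (-1)^n term):
#   x[n] = X[0]/N + (2/N)|X[1]| cos(2*pi*n/N + angle(X[1]))
x_rec = X[0].real / N + (2 / N) * np.abs(X[1]) * np.cos(2 * np.pi * n / N + np.angle(X[1]))
print(np.allclose(x_rec, x))      # True
```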

A circular shift is also known as a rotation; the term delay is also used instead of shift. Clearly, P^m represents a circular shift by m positions. For m = N, the time wheel undergoes one full rotation, i.e.,

P^N x = x   or   P^N = I

Index Reversal Q: Also known as linear index reversal, to distinguish it from the circular reversal discussed below, it is defined by

Q(x[0], x[1], ..., x[N−2], x[N−1]) = (x[N−1], x[N−2], ..., x[1], x[0])

and described in terms of the permutation matrix

Q = [ 0 0 ... 0 1
      0 0 ... 1 0
      ...
      0 1 ... 0 0
      1 0 ... 0 0 ]

Note that Q is an identity matrix flipped left-to-right.

Circular Index Reversal R: Also known as periodic index reversal, this transformation differs from its linear counterpart in that reversal takes place among entries x[1], ..., x[N−1] only, with x[0] kept in the same position:

R(x[0], x[1], ..., x[N−1]) = (x[0], x[N−1], x[N−2], ..., x[1])

The term periodic is also used instead of circular.

 . .7 for the case N = 8). Since Q and R are symmetric. that P is not symmetric. as does the (N/2)th entry when N is even (as depicted in Figure 3. The permutation matrix R for  1  0   0 R=  . and is thus orthonormal. 1 0   .e.  .  .. 0 Note that R = PQ i.3 that a permutation matrix Π satisﬁes ΠT Π = I ⇔ Π−1 = ΠT circular reversal is given by  0 . x[2] x[3] x[1] x[5] x[6] x[7] x[0] x[4] n=0 x[4] x[0] n=0 x[5] x[6] x[7] x[3] x[2] x[1] Figure 3. A circular index reversal amounts to turning the wheel upside down. 1 . it follows that Q−1 = Q and R−1 = R (Note. Recall from Section 2.  . The 0th always remains in the same position. .. .  .. . .) We note one ﬁnal property of any permutation matrix Π: .. 0 0 . on the other hand... 0 1   0 .168 This can be illustrated using again a time wheel with the entries of x arranged in the usual fashion. circular reversal of a column vector can be implemented by linear reversal followed by circular shift. 0 0 0 . .7: Vector x (left) and its circular time-reverse Rx (right)..

Fact. Π acting on a column vector and Π^{−1} = Π^T acting on a row vector both produce the same permutation of indices. This is established by noting that

(Πx)^T = x^T Π^T = x^T Π^{−1}

The following example illustrates some of the signal transformations discussed so far.

Example 3.3.2. Let

x = [1 −2 5 3 4 −1]^T

Also, let

v^{(3)} = [1 −1 1 −1 1 −1]^T

be the third Fourier sinusoid for N = 6, corresponding to ω = π. The following signals can be expressed compactly in terms of x, v^{(3)} and the permutation matrices P and R:

• x^{(1)} = [4 −1 1 −2 5 3]^T
• x^{(2)} = [1 −1 4 3 5 −2]^T
• x^{(3)} = [2 −3 9 6 9 −3]^T
• x^{(4)} = [0 −1 1 0 −1 1]^T
• x^{(5)} = [1 2 5 −3 4 1]^T
• x^{(6)} = [0 4 0 −6 0 2]^T

Indeed, we have:

• x^{(1)} = P²x
• x^{(2)} = Rx
• x^{(3)} = x + Rx
• x^{(4)} = x − Rx
• for every k, x^{(5)}[k] = x[k]v^{(3)}[k]
• for every k, x^{(6)}[k] = x[k]( v^{(3)}[k] − 1 )
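The three permutation matrices are conveniently built from the identity matrix. A NumPy sketch (illustrative; it reuses the x of Example 3.3.2):

```python
import numpy as np

N = 6
I = np.eye(N, dtype=int)

P = np.roll(I, 1, axis=0)       # circular shift: Px = (x[N-1], x[0], ..., x[N-2])
Q = I[::-1]                     # linear index reversal (identity flipped top-to-bottom)
R = P @ Q                       # circular reversal: R = PQ

x = np.array([1, -2, 5, 3, 4, -1])
print(P @ P @ x)                # [ 4 -1  1 -2  5  3], i.e. x^(1)
print(R @ x)                    # [ 1 -1  4  3  5 -2], i.e. x^(2)
print(np.array_equal(np.linalg.matrix_power(P, N), I))   # True: P^N = I
```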

  . . 0    def  0 0 v 2 . The conjugate symmetry property can be also expressed using the circular reversal matrix R. since both V and W are symmetric. v N −1 Since the k th power of F is obtained by raising each diagonal element to that power. .. . Of course. 0 0 0 .170 3. Since these two rows are complex conjugates of each other.e.k = Vnk and ∗ WN −k.2.4.. the resulting matrix is the complex conjugate of the original one. The key symmetry properties are: • Symmetry about the main diagonal. and the zeroth row is real-valued. .. V = VT and W = WT . Applied to either V or W as a row permutation... We thus have RV = VR = V∗ = W and RW = WR = W∗ = V The eﬀect of circular shift on the rows and columns of V and W can be explained better by introducing a diagonal matrix F whose diagonal elements are the entries of the k = 1st Fourier sinusoid:   1 0 0 . R leaves the m = 0th row in the same position. the same is true about column permutations using R.n (and similarly with respect to column index N/2). we observed certain symmetries in the entries of V and W. .4 3.2. . . . • Conjugate symmetry with respect to row index N/2: ∗ VN −n. i. . while interchanging rows m and N − m for 0 < m < N/2.. . . .1 Structural Properties of the DFT: Part II Permutations of DFT and IDFT Matrices In Subsection 3. . it follows that v(k) = Fk 1 . 0  0 v 0 . . 0  F =    .n = Wk.  . .

3. FN −1 1 A circular shift on the columns of V is obtained by right-multiplying it by PT = P−1 (not by P—see the last fact in Subsection 3. a circular shift on the rows or columns of V (or W) is equivalent to right. . . and F∗ = F−1 FN −1 1 F−1 1 1 F1 .171 where.or left-multiplication by a diagonal matrix whose diagonal is given by a suitable Fourier sinusoid.2). FN −2 1 FN −2 1 3.2 Summary of Identities We have deﬁned the following N × N matrices with the aid of v = ej2π/N and w = v −1 = v ∗ : V : IDFT matrix.4. . deﬁned by Wkn = wkn F : diagonal matrix deﬁned by Fnn = v n P : circular shift matrix R : circular reversal matrix . . . Thus all N Fourier sinusoids can be expressed using powers of F: V = 1 F1 F2 1 . as before. 1 F1 . deﬁned by Vnk = v kn W : DFT matrix. 1 is the all-ones column vector (same as v(0) ). The result is VP−1 = = and thus VP−1 = F−1 V This can be easily generalized to any power m of P: VPm = Fm V Taking transposes of both sides yields P−m V = VFm Taking complex conjugates across the last two equations. . and noting that V∗ = W we obtain WPm = F−m W and P−m W = WF−m In conclusion.

9) 1 1 VX = W∗ X (3.5) (3.11) .10) x = N N We will investigate how certain systematic operations on the time-domain signal x aﬀect its spectrum X.4) (3.2) ∗ =R =R ∗ T RV = VR = V = W RW = WR = W = V VP m m (3. again.3 Main Structural Properties of the DFT x ←→ X In what follows. of the arrow. we will consider the DFT pair where. We have Y = Wy = Wx∗ = RW∗ x∗ = RX∗ (3. (Complex Conjugation) Complex conjugation in the time domain is equivalent to complex conjugation together with circular time reversal in the frequency domain: y = x∗ ←→ Y = RX∗ Proof. DFT 3.8) =F V −m m P V = VF WP m m =F −m W P W = WF m The identities listed above will be used to derive structural properties of the DFT. the time-domain signal appears on the left.6) (3. The two signals are related via the analysis and synthesis equations: X = Wx (3. and the frequencydomain signal on the right. and vice versa.3) (3.172 We have shown that P−1 = PT R −1 (3.1) (3.4. 3.7) (3.

(Circular Time Reversal) Circular reversal of the entries of x is equivalent to circular reversal of the entries of X: y = Rx ←→ Y = RX Proof. if x is realvalued. Using DFT 3. then the resulting signal y = Rx has DFT Y = RX = X∗ where the second equality is due to the circular conjugate symmetry of the DFT of a real-valued signal (i. then x = x∗ and taking DFT’s of both sides.4). Remark. The second equation implies that the relative positions (in time) of these (3. In particular. This relationship expresses a type of conjugate symmetry with respect to circular reversal.4). The ﬁrst equation tells us that the two signals contain exactly the same amounts (in terms of amplitudes) of each Fourier frequency. Indeed. which will be henceforth referred to as circular conjugate symmetry.12) . Remark. the amplitude and phase spectra of the two signals x and y = Rx are related by |Y [k]| = |X[k]| ∠Y [k] = −∠X[k] for all values of k. DFT 4. If we (circularly) time-reverse a real-valued signal x. DFT 2). we obtain X = RX∗ which is precisely the statement of DFT 2. we can easily deduce DFT 2.e.. We have Y = Wy = WRx = RWx = RX where the third equality follows from (3.173 where the third equality follows from (3.

The first equation tells us that the two signals contain exactly the same amounts (in terms of amplitudes) of each Fourier frequency. The second equation implies that the relative positions (in time) of these sinusoids will be very different in the two signals. This difference can have a drastic effect on the perceived signal: music played backward, for example, bears little resemblance to the original sound. Thus in general, the phase spectrum can be as important as the amplitude spectrum and cannot be ignored in applications such as audio signal compression.

DFT 5. (Circular Time Delay) Circular time shift of the entries of x by m positions is equivalent to multiplication of the entries of X by the corresponding entries of the (N - m)th Fourier sinusoid:

y = P^{m} x <--> Y = F^{-m} X    (3.13)

i.e.,

Y[k] = v^{k(N-m)} X[k] = v^{-km} X[k] = w^{km} X[k],  k = 0, ..., N - 1

Proof. We have

Y = W P^{m} x = F^{-m} W x = F^{-m} X

where the second equality is due to (3.8).

DFT 6. (Multiplication by a Fourier Sinusoid) Entry-by-entry multiplication of x by the mth Fourier sinusoid is equivalent to a circular shift of the entries of X by m frequency indices:

y = F^{m} x <--> Y = P^{m} X    (3.14)

Proof. We have

Y = W F^{m} x = P^{m} W x = P^{m} X

where the second equality is due to (3.7).

Remark. Multiplication of a signal by a sinusoid is also known as amplitude modulation (abbreviated as AM); for this reason, DFT 6 is also referred to as the modulation property.
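DFT 5 and DFT 6 can likewise be verified in a few lines of numpy (a sketch of ours, not part of the text; np.roll implements the circular delay P^{m}).

```python
import numpy as np

N, m = 8, 3
rng = np.random.default_rng(4)
x = rng.standard_normal(N)
X = np.fft.fft(x)
idx = np.arange(N)

# DFT 5: a circular delay by m multiplies X[k] by w^{km} = exp(-j*2*pi*km/N)
y = np.roll(x, m)                         # y[n] = x[(n - m) mod N]
assert np.allclose(np.fft.fft(y), np.exp(-2j * np.pi * idx * m / N) * X)

# DFT 6: multiplying x[n] by the m-th Fourier sinusoid v^{mn} shifts X by m indices
z = np.exp(2j * np.pi * m * idx / N) * x  # z[n] = v^{mn} x[n]
assert np.allclose(np.fft.fft(z), np.roll(X, m))
print("DFT 5 and DFT 6 verified numerically")
```
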

In real-world communication systems, amplitude modulation enables a baseband signal, one consisting of (relatively) low frequencies (e.g., an audio signal), to be transmitted over a frequency band centered at a much higher carrier frequency. A sinusoid (known as the carrier) of that frequency is multiplied by the baseband signal, resulting in the AM signal. As a result, the spectrum of the baseband signal and that of the AM signal are basically related by a shift (in frequency), similarly to what DFT 6 implies.

Remark. The similarity between DFT 5 and DFT 6 cannot be overlooked: identical signal operations in two different domains result in very similar operations in the opposite domains (the only difference being a complex conjugate). This similarity is a recurrent theme in Fourier analysis, and is particularly prominent in the case of the DFT: basically, the analysis and synthesis equations are identical with the exception of a scaling factor and a complex conjugate. Consequently, computation of any DFT pair yields another DFT pair as a by-product. This is known as the duality property of the DFT, and is stated below.

DFT 7. (Duality) If x <--> X, then

y = X <--> Y = N Rx

Proof. We have

Y = Wy = WX = RVX = N Rx

where the third and fourth equalities are due to (3.3) and (3.10) (the synthesis equation), respectively.

We saw an instance of the duality property in Examples 3.2.2 and 3.2.3, where we showed that

e(0) <--> 1  and  1 <--> N e(0)

The second DFT pair can be obtained from the first one by application of DFT 7, and vice versa. In this particular case, both signals are circularly symmetric, and thus circular reversal has no effect:

R e(0) = e(0)  and  R1 = 1    (3.15)
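A quick numerical check of DFT 7 (our own sketch): taking the DFT of a spectrum returns N times the circularly reversed signal.

```python
import numpy as np

N = 8
rng = np.random.default_rng(2)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
X = np.fft.fft(x)

def circ_reverse(a):
    # (Ra)[n] = a[(-n) mod N]: entry 0 stays put, the rest reverse order
    return np.concatenate(([a[0]], a[:0:-1]))

# DFT 7 (duality): applying the DFT to the spectrum X gives N * Rx
assert np.allclose(np.fft.fft(X), N * circ_reverse(x))
print("duality verified")
```
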

3.5 Structural Properties of the DFT: Examples

3.5.1 Miscellaneous Signal Transformations

Example 3.5.1. Consider the 4-point real-valued signal

x = [1 2 3 4]^T

We will first compute the DFT X of x. Using X, we will then derive the DFT's of the following signals:

• x(1) = [1 4 3 2]^T
• x(2) = [4 1 2 3]^T
• x(3) = [3 4 1 2]^T
• x(4) = [1 -2 3 -4]^T
• x(5) = [5 -1+j -1 -1-j]^T
• x(6) = [5 -1 -1 -1]^T

Signals x(1) through x(6) are derived from either x or X; their spectra can be computed using known structural properties of the DFT. The Fourier frequencies in this case are 0, π/2, π and 3π/2. We have

            | 1   1   1   1 | | 1 |   |  10    |
X = Wx  =   | 1  -j  -1   j | | 2 | = | -2+2j  |
            | 1  -1   1  -1 | | 3 |   |  -2    |
            | 1   j  -1  -j | | 4 |   | -2-2j  |

Note that X exhibits the conjugate symmetry common to the spectra of all real-valued signals (DFT 2).

• x(1) = [1 4 3 2]^T = Rx is the circular time-reverse of x. By DFT 4,

X(1) = RX = [10  -2-2j  -2  -2+2j]^T

• x(2) = [4 1 2 3]^T = Px is the circular delay of x by one time unit. By DFT 5, the kth entry of X is multiplied by w^{k} = (-j)^{k} for each value of k:

X(2) = F^{-1}X = [10  2+2j  2  2-2j]^T

• x(3) = [3 4 1 2]^T = P^{2}x is the circular delay of x by two time units. Again by DFT 5, the kth entry of X is multiplied by w^{2k} = (-1)^{k} for each value of k:

X(3) = F^{-2}X = [10  2-2j  -2  2+2j]^T

• x(4) = [1 -2 3 -4]^T is obtained by multiplying the entries of x by those of the Fourier sinusoid (-1)^{n} = v^{2n}, i.e., x(4) = F^{2}x. By DFT 6, the spectrum is shifted by m = 2 frequency indices, or, in terms of actual frequencies, by π radians. The resulting spectrum is

X(4) = P^{2}X = [-2  -2-2j  10  -2+2j]^T

• x(5) = [5 -1+j -1 -1-j]^T equals X/2. By DFT 7 (duality),

X(5) = 4R(x/2) = [2 8 6 4]^T

• x(6) = [5 -1 -1 -1]^T equals the real part of x(5), i.e.,

x(6) = (x(5) + x(5)*)/2

By DFT 3, complex conjugation in the time domain is equivalent to complex conjugation together with circular reversal in the frequency domain. Thus

X(6) = (1/2)[2 8 6 4]^T + (1/2)[2 4 6 8]^T = [2 6 6 6]^T

3.5.2 Exploring Symmetries

The spectrum of a real-valued signal is always circularly conjugate-symmetric; this was established in DFT 2 and illustrated in earlier examples. It turns out that this property also holds with the time and frequency domains interchanged: a signal which exhibits circular conjugate symmetry in the time domain has a real-valued spectrum. This can be seen by taking DFT's on both sides of the identity x = Rx*

and then applying DFT 4 to obtain

X = RRX* = X*

This means that X is real-valued. (Conjugate symmetry and symmetry are equivalent notions for a real-valued signal, which always equals its complex conjugate.) It follows that if a signal x happens to be both real-valued and circularly symmetric, then its spectrum X will have the same properties. We summarize the above observations as follows.

Fact. If a signal is real-valued in either domain (time or frequency), then it exhibits circular conjugate symmetry in the other domain. A signal exhibiting both properties in one domain also exhibits both properties in the other domain. (The terms symmetry and conjugate symmetry are equivalent for real-valued signals.)

The following example illustrates operations on circularly symmetric, real-valued signals that preserve both properties.

Example 3.5.2. Consider the 16-point signal x whose spectrum X is given by

X = [4 3 2 1 0 0 0 0 0 0 0 0 0 1 2 3]^T

X is shown in Figure E.3.5.2 (i). Clearly, X is both real-valued and circularly symmetric. By the foregoing discussion, the time-domain signal x has the same properties (and can be computed quite easily by issuing the command x = ifft(X) in MATLAB).

[Figure E.3.5.2 (i): the spectrum X plotted against the frequency index k = 0, ..., 15.]

(i) We are interested in determining whether the two fundamental time-domain operations:

• circular delay by m time units

• multiplication by the mth Fourier sinusoid

preserve either or both properties of x (i.e., real values and circular symmetry).

A circular time delay of m units in x clearly preserves the real values in x. To see how the circular symmetry of x is affected by this operation, we consider the spectrum X, which undergoes multiplication by the mth Fourier sinusoid: each X[k] is multiplied by e^{-j(π/8)km}. Unless m = 0 (no delay) or m = 8, the resulting spectrum will contain complex values, which means that the (delayed) time-domain signal will no longer be circularly symmetric. Thus the only nontrivial value of m that preserves circular symmetry is m = 8. The delayed signal is given by x(1) = P^{8}x and its spectrum equals

X(1)[k] = e^{-jπk} X[k] = (-1)^{k} X[k],  k = 0, ..., 15

The spectrum X(1) is plotted in Figure E.3.5.2 (ii).

[Figure E.3.5.2 (ii): the spectrum X(1).]

We note also that summing together two versions of x that have been delayed by complementary amounts, i.e., m and -m (or 16 - m) time units, will preserve the circular symmetry in x regardless of the value of m. This is because

e^{-j(π/8)km} + e^{j(π/8)km} = 2 cos(πkm/8)

and therefore

x(2) = P^{m}x + P^{-m}x <--> X(2)[k] = 2 cos(πkm/8) X[k]

Since the spectrum X(2) is real-valued, x(2) is circularly symmetric. Figure E.3.5.2 (iii) illustrates X(2) for the case m = 4; the graph was generated by multiplying each entry of X by the corresponding entry of

[2 0 -2 0 2 0 -2 0 2 0 -2 0 2 0 -2 0]^T

[Figure E.3.5.2 (iii): the spectrum X(2) for m = 4.]

Multiplication of x by a complex Fourier sinusoid of frequency mπ/8 will result in certain entries of x taking complex values, unless m = 0 or m = 8. Thus the only nontrivial value of m that preserves real values in x is m = 8, for which

x(3)[n] = e^{jπn} x[n] = (-1)^{n} x[n],  n = 0, ..., 15

and

X(3) = P^{8}X

The spectrum X(3) is plotted in Figure E.3.5.2 (iv). Regardless of whether the resulting signal is real or complex, the spectrum will undergo a circular shift by m frequency indices, and circular symmetry will be preserved only in the cases m = 0 and m = 8. We also note that since the Fourier sinusoids are circularly conjugate-symmetric, the element-wise product of x and v(m) will also be circularly conjugate-symmetric.

Finally, we see that if we multiply x by the sum of two complex Fourier sinusoids which are conjugates of each other, i.e., have frequency indices m

and 16 - m (or simply -m), then the resulting signal will be real-valued, since

e^{-j(π/8)mn} + e^{j(π/8)mn} = 2 cos(πmn/8)

(the same identity as encountered earlier). In this case, each entry of x (in the time domain) is multiplied by the corresponding entry of

[2 0 -2 0 2 0 -2 0 2 0 -2 0 2 0 -2 0]^T

and

x(4)[n] = 2 cos(πmn/8) x[n] <--> X(4) = P^{m}X + P^{-m}X

Figure E.3.5.2 (v) illustrates X(4) in the case m = 4. Note the circular symmetry in X(4), as well as the relationship between X(4) and the original spectrum X.

[Figure E.3.5.2 (iv): the spectrum X(3). Figure E.3.5.2 (v): the spectrum X(4) for m = 4.]

3.6 Multiplication and Circular Convolution

The structural properties of the discrete Fourier transform considered thus far involved operations on a single signal vector of fixed length. In this section, we examine two specific ways of combining together two (or more) arbitrary signals in the time domain, and study the spectra of the resulting signals.

3.6.1 Duality of Multiplication and Circular Convolution

We have seen (in DFT 1) that the DFT is a linear transformation, i.e., the DFT of a linear combination of two signals is the linear combination of their DFT's:

αx + βy <--> αX + βY

There are other important ways in which two signal vectors x and y can be combined in both time and frequency domains. We will examine two such operations.

Definition 3.6.1. The (element-wise) product of N-point vectors x and y is the vector s defined by

s[n] = x[n] y[n],  n = 0, ..., N - 1

A special case of the product was encountered earlier, where y = v(m) (the mth Fourier sinusoid). Let us examine the DFT S of the product signal s, which is given by the analysis equation:

S[k] = Σ_{n=0}^{N-1} x[n] y[n] w^{kn},  k = 0, ..., N - 1

(We note that in this case, we do not have a compact expression for the entire vector S as a product of the vectors x, y and the matrix W.) Noting that w^{kn} is the (n, n)th entry of the diagonal matrix F^{-k}, we can write

                               | 1   0    ...  0           | | y[0]   |
S[k] = [x[0] x[1] ... x[N-1]]  | 0   w^k  ...  0           | | y[1]   |
                               | ...                       | | ...    |
                               | 0   0    ...  w^{k(N-1)}  | | y[N-1] |

i.e.,

S[k] = x^T F^{-k} y
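The identity S[k] = x^T F^{-k} y can be checked directly in numpy (an illustrative sketch of ours, with the diagonal matrix built explicitly):

```python
import numpy as np

N = 4
rng = np.random.default_rng(5)
x = rng.standard_normal(N)
y = rng.standard_normal(N)
w = np.exp(-2j * np.pi / N)     # w = e^{-j*2*pi/N}

S = np.fft.fft(x * y)           # DFT of the element-wise product
for k in range(N):
    Fk = np.diag(w ** (k * np.arange(N)))   # the diagonal matrix F^{-k}
    assert np.isclose(x @ Fk @ y, S[k])     # S[k] = x^T F^{-k} y
print("S[k] = x^T F^{-k} y verified")
```
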

We seek an expression for S[k] in terms of the spectra X and Y. To that end, we use the matrix forms of the analysis and synthesis equations, as well as the known identities V = V^T, VF^{-k} = P^{k}V and V = RW. We obtain

S[k] = ((1/N) VX)^T F^{-k} y
     = (1/N) X^T V F^{-k} y
     = (1/N) X^T P^{k} V y
     = (1/N) X^T P^{k} R W y
     = (1/N) X^T P^{k} R Y

The operation involving X, Y and the permutation matrices R (circular reversal) and P (circular shift) in the last expression is known as the circular convolution of the vectors X and Y. It is defined for any two vectors of the same length.

Definition 3.6.2. The circular convolution of the N-point vectors a and b is the N-point vector denoted by a ⊛ b and given by

(a ⊛ b)[n] = a^T P^{n} R b,  n = 0, ..., N - 1

We can now write our result as follows.

DFT 8. (Multiplication of Two Signals) If x <--> X and y <--> Y, then

x[n] y[n] <--> (1/N) (X ⊛ Y)

Fact. Circular convolution is symmetric in its two arguments: since x[n]y[n] = y[n]x[n], we also have X ⊛ Y = Y ⊛ X.

Multiplication in the time domain has many applications in communications (e.g., amplitude modulation, signal spreading) and signal processing (e.g., windowing in spectral analysis and filter design). As it turns out, DFT 8 has a dual property: circular convolution in the time domain corresponds to multiplication in the frequency domain. Indeed, convolution is equally (if not more) important as a time-domain operation, since it can be used to compute the response of a linear system to an arbitrary input vector; this feature will be explored further in the following chapter. In the meantime, we state the dual result.

DFT 9. (Circular Convolution of Two Signals) If x <--> X and y <--> Y, then

x ⊛ y <--> X[k] Y[k]

Proof. We follow an argument parallel to the proof of DFT 8. We consider the product signal S[k] = X[k]Y[k] (now in the frequency domain), and use the synthesis equation (which differs from the analysis equation by a complex conjugate and the factor N) to express the time-domain signal s as

s[n] = (1/N) X^T F^{n} Y

We need to show that s[n] is the nth entry of x ⊛ y:

s[n] = (1/N) (Wx)^T F^{n} Y
     = (1/N) x^T W F^{n} Y
     = (1/N) x^T P^{n} W Y
     = (1/N) x^T P^{n} R (VY)
     = x^T P^{n} R y
     = (x ⊛ y)[n]

as needed.

Remark. Alternatively, this result can be proved using DFT 7 (duality) on the pairs x <--> X, y <--> Y and x[n]y[n] <--> N^{-1}(X ⊛ Y).
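Both DFT 8 and DFT 9 can be confirmed numerically. In the sketch below (ours, not the text's), the helper circ_conv implements (a ⊛ b)[n] = Σ_k a[k] b[(n - k) mod N], which is exactly what a^T P^{n} R b evaluates to.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
x = rng.standard_normal(N)
y = rng.standard_normal(N)

def circ_conv(a, b):
    # (a (*) b)[n] = sum_k a[k] * b[(n - k) mod N]
    n = len(a)
    return np.array([sum(a[k] * b[(i - k) % n] for k in range(n))
                     for i in range(n)])

X, Y = np.fft.fft(x), np.fft.fft(y)

# DFT 9: circular convolution in time <--> entrywise product in frequency
assert np.allclose(np.fft.fft(circ_conv(x, y)), X * Y)

# DFT 8: entrywise product in time <--> (1/N) circular convolution in frequency
assert np.allclose(np.fft.fft(x * y), circ_conv(X, Y) / N)
print("DFT 8 and DFT 9 verified numerically")
```
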

3.6.2 Computation of Circular Convolution

The computation of the circular convolution x ⊛ y for two N-point vectors x and y can be summarized as follows:

• Circularly reverse y to obtain Ry.
• For each n = 0, 1, ..., N - 1, shift Ry circularly by n indices to obtain P^{n}Ry.
• Compute (x ⊛ y)[n] as x^T P^{n}Ry, i.e., as the (unconjugated) dot product of x and P^{n}Ry.

By symmetry of circular convolution, x and y can be interchanged in the above computation.

Example 3.6.1. Consider the four-point vectors

x = [a b c d]^T  and  y = [0 -1 0 1]^T

and let

s = x ⊛ y

The vectors x, y and P^{n}Ry for n = 0, 1, 2, 3 are depicted using four-point time wheels. We have

P^{0}Ry = [0 1 0 -1]^T

[Time-wheel diagrams for x, y, and P^{n}Ry, n = 0, 1, 2, 3.]

P^{1}Ry = [-1 0 1 0]^T
P^{2}Ry = [0 -1 0 1]^T
P^{3}Ry = [1 0 -1 0]^T

and thus

s[0] = x^T P^{0}Ry = b - d
s[1] = x^T P^{1}Ry = c - a
s[2] = x^T P^{2}Ry = d - b
s[3] = x^T P^{3}Ry = a - c

Therefore

s = [b-d  c-a  d-b  a-c]^T

Example 3.6.2. We verify DFT 9 for

x = [2 -3 5 -3]^T  and  y = [0 -1 0 1]^T

The signal y is the same as in Example 3.6.1; thus

s = x ⊛ y = [0 3 0 -3]^T

The DFT's are easily computed as:

X = [1  -3  13  -3]^T
Y = [0  2j  0  -2j]^T
S = [0  -6j  0  6j]^T

Indeed, S[k] = X[k]Y[k] for all values of k.
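The numbers in Example 3.6.2 can be reproduced with numpy; DFT 9 also gives a practical way of computing circular convolutions through the frequency domain (a sketch of ours):

```python
import numpy as np

x = np.array([2, -3, 5, -3])
y = np.array([0, -1, 0, 1])

# circular convolution computed via DFT 9: ifft(fft(x) * fft(y))
s = np.fft.ifft(np.fft.fft(x) * np.fft.fft(y))
assert np.allclose(s, [0, 3, 0, -3])

# the three spectra quoted in the example
assert np.allclose(np.fft.fft(x), [1, -3, 13, -3])
assert np.allclose(np.fft.fft(y), [0, 2j, 0, -2j])
assert np.allclose(np.fft.fft(x) * np.fft.fft(y), [0, -6j, 0, 6j])
print("Example 3.6.2 verified")
```
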

3.7 Periodic and Zero-Padded Extensions of a Signal

3.7.1 Definitions

We now shift our focus to signals derived from a basic signal vector s by extending its length. In what follows, we will assume that s has length L, and will derive two distinct types of signals of variable length N ≥ L based on s.

Definition 3.7.1. The N-point periodic extension of s is the signal x defined by

x[n] = s[n],  0 ≤ n ≤ L - 1
x[n] = x[n - L],  L ≤ n ≤ N - 1

Definition 3.7.2. The N-point zero-padded extension of s is the signal y defined by

y[n] = s[n],  0 ≤ n ≤ L - 1
y[n] = 0,  L ≤ n ≤ N - 1

The two types of extensions are illustrated in Figure 3.8, where L = 10 and N = 36.

[Figure 3.8: The 10-point signal s, its 36-point periodic extension x and its 36-point zero-padded extension y.]

Note that the signals x and y are fundamentally different. In

3. and so do the associated Fourier sinusoids.2 Periodic Extension to an Integer Number of Periods Assuming that N = M L. However. the relationships between X. on the other hand. This fact is easily established by noting that the k th Fourier frequency for an L-point signal is given by k 2π L = kM 2π ML = kM 2π N and thus equals the kM th frequency for an N -point signal. then the set of Fourier frequencies for an L-point signal is formed by taking every M th Fourier frequency for an N -point signal. there are no identiﬁable correspondences or similarities between X. both DFT’s X and Y are computed using elements of the vector s. where M is an integer. exhibits no activity after time n = L − 1. Fact.7.. The assumption N = M L implies that x contains an exact number of copies of s. where N is arbitrary). we have the following relationship between X and S. Y and S in the general case (i. Proof. Since the extensions x and y are formed by either replicating the basic signal s or appending zeros to it. One important exception is the special case where N is an integer multiple of L. 0. if r/M = integer. We know that s= 1 VL S L . otherwise. starting with the zeroth frequency (ω = 0). Since the elements of X and Y are inner products involving such Fourier sinusoids. This is because the Fourier frequencies for X and Y depend on N . Y and S are not at all apparent.188 the case of x. M = 3 and N = 12.e. Fact.9 illustrates this fact in the case where L = 4. then the DFT of the N -point periodic extension x of an L-point signal s is given by X[r] = M S[r/M ]. If N = M L. If N = M L. the same activity is observed every L time units. Figure 3. y.

[Figure 3.9: Fourier frequencies for a 4-point signal (◦) and the remaining Fourier frequencies for a 12-point signal (×), shown at multiples of π/6 around the unit circle.]

where V_L is the matrix of Fourier sinusoids for an L-point signal. By replicating the above equation M times, we obtain

| s |           | V_L |
| s |  = (1/L)  | V_L |  S    (3.16)
| ⋮ |           |  ⋮  |
| s |           | V_L |

The N × 1 vector on the left-hand side of the equation is simply x. The kth column of the N × L matrix on the right-hand side is the N-point periodic extension of

e^{j(2π/L)kn},  n = 0, ..., L - 1

which is the kth Fourier sinusoid for an L-point signal. Since for every integer p,

e^{j(2π/L)kn} = e^{j(2π/L)k(n+pL)}

it follows that the kth column of the N × L matrix is simply given by

e^{j(2π/L)kn} = e^{j(2π/N)kMn},  n = 0, ..., N - 1

But this is just the (kM)th Fourier sinusoid for an N-point signal, i.e., the (kM)th column of the matrix V_N. Thus (3.16) implies that x is a linear combination of no more than L Fourier sinusoids. Not surprisingly, these sinusoids have the same frequencies as the Fourier components of s,

and x can be represented as a linear combination of L Fourier sinusoids only. We have therefore established that X[r] = 0 if r is not a multiple of M. To determine the value of X[kM] for k = 0, ..., L - 1, we compare (3.16) with the synthesis equation

x = (1/N) V_N X

and obtain the identity

(1/L) [V_L; V_L; ...; V_L] S = (1/N) V_N X

This can only be true if

X[kM] = (N/L) S[k] = M S[k],  k = 0, ..., L - 1

Example 3.7.1. If

s = [a b c d]^T

has DFT

S = [A B C D]^T

then

x = [a b c d a b c d a b c d]^T

has DFT

X = [3A 0 0 3B 0 0 3C 0 0 3D 0 0]^T

In summary, the DFT allows us to express an L-point time-domain signal s as a linear combination of L sinusoids at frequencies which are multiples of 2π/L. The same L sinusoids, with the same coefficients, could be used to represent the N-point periodic extension x of s. This representation would not, in general, be consistent with the one provided by Fourier analysis (i.e., the DFT) of x; this is because the Fourier frequencies for s may not form a subset of the Fourier frequencies for x. In the special case where x consists of an integer number of copies of s, all the original frequencies for s will be Fourier frequencies for x, and none of the remaining N - L Fourier sinusoids will appear in x.

3.7.3 Zero-Padding to a Multiple of the Signal Length

We now turn to the zero-padded extension y of s in the case where N = ML. As we noted earlier, the frequency ω = k(2π/L) is both the kth Fourier frequency for s and the (kM)th Fourier frequency for y; the corresponding DFT values are S[k] and Y[kM], respectively. We have the following relationship between the spectra Y and S.

Fact. If N = ML, then the DFT of an L-point signal s can be obtained from the DFT of its N-point zero-padded extension y by sampling:

S[k] = Y[kM],  k = 0, ..., L - 1

Proof. We have

Y[kM] = Σ_{n=0}^{N-1} y[n] e^{-j(2π/N)kMn} = Σ_{n=0}^{L-1} s[n] e^{-j(2π/L)kn} = S[k]

where the first equality follows from the definition of y (i.e., the fact that y[n] coincides with s[n] for the first L time indices and equals zero thereafter).

Example 3.7.2. If

y = [a b c d 0 0 0 0 0 0 0 0]^T

has DFT

Y = [Y0 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 Y11]^T

then

s = [a b c d]^T

has DFT

S = [Y0 Y3 Y6 Y9]^T

The DFT of a finite-length signal vector can thus be obtained by sampling the DFT of its zero-padded extension, provided the number of appended zeros is an integer multiple of the (original) signal length.

3.8 Detection of Sinusoids Using the Discrete Fourier Transform

3.8.1 Introduction

An important application of signal analysis is the identification of different components present in a signal. For example, given a musical recording, we may be interested in identifying all instruments being played at a particular time instant. If the recording contains vocals, we may also want to count and, if possible, identify the different voices that are being heard.

With minimal training, the human ear can perform most of the above-mentioned tasks quite reliably. With additional training, the human ear can provide more detailed information about the content of the signal: it can identify specific notes (i.e., frequencies; a note consists of sinusoids at multiples of a certain fundamental frequency), and can detect whether an instrument is out of tune. Of course, the human auditory system cannot yield quantitative measures of signal parameters, nor can it process anything other than acoustic signals in the audible frequency band (from approximately 20 Hz to 20 kHz). (Analogous statements can be made about the human eye, which is particularly adept at identifying visual patterns.) Signal processing, which is the quantitative analysis of signals, allows us to identify and separate essential components of a signal based on its numerical samples (rather than its perceptual attributes). In many cases, these tasks can be carried out automatically, with little or no human intervention.

Sinusoids comprise a particularly important class of signals, since they arise in a diverse class of physical systems which exhibit linear properties. Such systems were introduced in Chapter 2 and will be the focus of our discussion in Chapter 4. In many applications, it is desirable to identify and isolate certain sinusoidal components present in a signal. Sinusoids could also represent unwanted components of a signal, e.g., interference from a power supply at 50 Hz, or the deterioration of a musical recording by static noise; in restoring such a recording, one would be interested in identifying the note being played during a noisy segment of the piece. In such cases, identifying those sinusoids would be an essential step towards removing the noise from the signal.

The discrete Fourier transform is well-suited for the detection of sinusoids in a discrete-time signal, since it provides a decomposition of the signal into sinusoidal components. For an N-point signal, these components are at the so-called Fourier frequencies, which are the integer multiples of 2π/N.

6π (i.000 samples/sec to yield the discrete-time sinusoid x[n] = 5 cos(0. and the continuous-time signal can be reconstructed perfectly. Example 3. The question is whether we can still use a DFT-based method to estimate the frequency ω0 . 3. then ω0 is the k = 6th Fourier frequency for s. .8.193 First. if N = 20. If ω0 = Ω0 /fs happens to be a Fourier frequency for sample size N . t∈R is sampled at a rate of 1. In general. provided that no aliasing has occurred. and as a result. the frequency of x[·]) is a Fourier frequency for s provided k or equivalently. The answer is aﬃrmative.1..6πn + 3π/4) . it will not exhibit the features seen in the graphs of Example 3. the DFT will analyze the sinusoidal signal into N complex sinusoidal components at frequencies unrelated to ω0 . the frequency ω0 will not be an exact Fourier frequency.1. the amplitude and phase of the sinusoid can be easily obtained from the corresponding entries of the DFT vector. .2 The DFT of a Complex Sinusoid Let s be the L-point complex sinusoid given by s[n] = ejω0 n . let us consider a vector consisting of N uniform samples of a realvalued sinusoid of frequency Ω0 rad/sec. Also. .8. N= 2π = 0. . L − 1 . n∈N If s = x[0 : N − 1] (using MATLAB notation for indices). the remaining entries being zero.6π N 10k 3 for some integer k. The continuous-time sinusoid x(t) = 5 cos(600πt + 3π/4) . obtained at a rate of fs samples per second.e. In that case. then the N -point DFT will will contain exactly two nonzero entries at frequencies ω0 and 2π − ω0 . The frequency ω0 (hence also Ω0 ) can be determined by inspection of the DFT.8. n = 0. For example. The resulting amplitude and phase spectra are shown in the ﬁgure. then ω0 = 0.

s at frequency ω. . . we have L−1 2 =L v. we have v = s and thus v. . s at ω = 2kπ/N .1. s = n=0 v ∗ [n]s[n] = n=0 s[n]e−j(2π/L)kn = S[k] Similarly. L − 1 The inner product v. As N increases. .194 |S[k]| 50 50 0 S[k] 6 10 14 19 3π/4 −3π/4 Example 3. Let v represent a complex sinusoid of variable frequency ω: v[n] = ejωn . the spacing 2π/N between consecutive Fourier frequencies decreases to zero. s over the entire frequency range [0. n = 0.1 where ω0 is a ﬁxed frequency. s is a function of ω. Its value at ω = 2kπ/L (where k is integer) can be computed as the the k th entry in the DFT of s: L−1 L−1 v. can be computed via the DFT of s zero-padded to total length N . s = s For ω = ω0 . in order to establish the orthogonality of the complex sinusoids. and the resulting DFT provides a very dense sampling of v. s = n=0 v ∗ [n]s[n] .8. (This was done in Subsection 3. the inner product v.3 for the special case where both ω and ω0 are Fourier frequencies.) For ω = ω0 . where N > L. 2π). Let us evaluate v.

where the last equality is the familiar expression for the geometric sum. (Note that the denominator cannot be equal to zero, since ω ≠ ω0 and both frequencies are assumed to be in [0, 2π).) The inner product is therefore complex. It can be written as the product of a real term and a complex term of unit magnitude by first applying the identity 1 - z² = z(z^{-1} - z) to both numerator and denominator:

(1 - e^{jL(ω0-ω)}) / (1 - e^{j(ω0-ω)}) = (e^{jL(ω0-ω)/2} / e^{j(ω0-ω)/2}) · (e^{-jL(ω0-ω)/2} - e^{jL(ω0-ω)/2}) / (e^{-j(ω0-ω)/2} - e^{j(ω0-ω)/2})

and then recalling that e^{jθ} - e^{-jθ} = 2j sin θ. The resulting expression is

⟨v, s⟩ = e^{-j(L-1)(ω-ω0)/2} · sin(L(ω-ω0)/2) / sin((ω-ω0)/2) = e^{-j(L-1)(ω-ω0)/2} · D_L(ω - ω0)    (3.17)

where the function D_L(·) is defined by

D_L(θ) = sin(Lθ/2) / sin(θ/2)

Since |e^{-j(L-1)(ω-ω0)/2}| = 1, we have

|⟨v, s⟩| = |D_L(ω - ω0)| = |sin(L(ω-ω0)/2) / sin((ω-ω0)/2)|    (3.18)

Evaluated at ω = 2kπ/L, the above expression gives the kth entry in the amplitude spectrum of s, i.e., |S[k]|. Evaluated at ω = 2kπ/N (where N > L), it yields the kth entry in the amplitude spectrum of the N-point zero-padded extension of s.

The absolute value of the function D_L(θ) is shown in Figure 3.10 for L = 6 and L = 9, in each case for θ varying over (−2π, 2π) (which includes the range of values of ω - ω0 in (3.17)). We make the following key observations about the function D_L(θ):

• At θ = 0, both the numerator and the denominator in the definition of D_L(θ) equal 0. D_L(0) is then defined as the limit of D_L(θ) as θ approaches zero, which equals L. This also gives the correct result for the inner product when ω = ω0.

• D_L(θ) is periodic with period 2π.

• D_L(θ) = 0 for all values of θ which are integer multiples of 2π/L, except those which are also integer multiples of 2π (where |D_L(2πk)| = L).

• In each period, the graph of D_L(θ) contains one main lobe of width 4π/L and height L, and L - 2 side lobes of width 2π/L and varying height. The first side lobe is the tallest one (in absolute value), its height being approximately 2/(3π) times that of the main one.

[Figure 3.10: The function |D_L(θ)| for L = 6 (left) and L = 9 (right), plotted against θ/2π.]

We conclude this lengthy discussion with the following observation.

Fact. Let ω0 be a fixed frequency in [0, 2π) and let ω be a variable frequency in the same interval. Then the magnitude of the inner product ⟨v, s⟩, where

s[n] = e^{jω0 n}  and  v[n] = e^{jωn},  n = 0, ..., L - 1

achieves its unique maximum (as ω varies) when ω = ω0.

Thus if the frequency ω0 of a given complex sinusoid s is not known, it can be estimated with arbitrary accuracy by locating the maximum value of |⟨v, s⟩|, computed for a sufficiently dense set of frequencies ω in [0, 2π).
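The observations above can be checked numerically. In this sketch (ours), the helper dirichlet implements D_L with the limiting value at θ = 0 filled in by continuity:

```python
import numpy as np

def dirichlet(L, theta):
    # D_L(theta) = sin(L*theta/2) / sin(theta/2), with D_L(0) = L by continuity
    theta = np.asarray(theta, dtype=float)
    den = np.sin(theta / 2.0)
    num = np.sin(L * theta / 2.0)
    safe = np.where(den == 0.0, 1.0, den)   # avoid division by zero at theta = 0
    return np.where(den == 0.0, float(L), num / safe)

L = 6
theta = np.linspace(-np.pi, np.pi, 100001)
mag = np.abs(dirichlet(L, theta))

assert np.isclose(mag.max(), L)             # main-lobe height is L ...
assert abs(theta[mag.argmax()]) < 1e-4      # ... attained at theta = 0
assert abs(dirichlet(L, 2 * np.pi / L)) < 1e-12   # zero at 2*pi/L
print("D_L observations verified for L =", L)
```
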

This fact follows from (3.18):

|⟨v, s⟩| = |D_L(ω - ω0)|

For fixed ω0, the difference θ = ω - ω0 lies in the interval [-ω0, 2π - ω0). This is a subinterval of (−2π, 2π) which includes the origin. As illustrated in Figure 3.10, the maximum value of |D_L(θ)| over that interval will be achieved at θ = 0, and equals L. Thus the maximum value of |⟨v, s⟩| = |D_L(ω - ω0)| occurs at ω = ω0.

3.8.3 Detection of a Real-Valued Sinusoid

As we showed in the previous subsection, the frequency of an L-point complex-valued sinusoid s[n] = e^{jω0 n}, n = 0, ..., L - 1, can be determined by zero-padding the signal to obtain a reasonably dense set of Fourier frequencies in [0, 2π), and then locating the maximum of the amplitude spectrum.

A real-valued sinusoid is the sum of two complex-valued sinusoids at conjugate frequencies:

A cos(ω0 n + φ) = (A/2) e^{j(ω0 n + φ)} + (A/2) e^{-j(ω0 n + φ)}

We assume that ω0 lies in [0, π], which is the effective frequency range for real sinusoids. In most practical situations, the observed signal vector will have other components as well. We thus model it as

x[n] = A cos(ω0 n + φ) + r[n],  n = 0, ..., L - 1

where the remaining components of x are collectively represented by r. We then have

x = (A/2) e^{jφ} s + (A/2) e^{-jφ} s* + r

We argue that a satisfactory estimate of the frequency ω0 can be obtained from x using the method developed earlier for the complex exponential s. Again, let

v[n] = e^{jωn},  n = 0, ..., L - 1

be the variable-frequency sinusoid used in the computation of the DFT, and consider the inner product

⟨v, x⟩ = (A/2) e^{jφ} ⟨v, s⟩ + (A/2) e^{-jφ} ⟨v, s*⟩ + ⟨v, r⟩

which yields the DFT of x and its zero-padded extensions by appropriate choice of ω. We have already studied the behavior of the first two terms on the right-hand side. In particular,

|⟨v, s⟩| = |D_L(ω - ω0)|  and  |⟨v, s*⟩| = |D_L(ω + ω0)|

As long as ω0 is not too close to 0 or π, the side lobes of |⟨v, s*⟩| will not interfere significantly with the main lobe of |⟨v, s⟩| over the range [0, π]; the same is true of |⟨v, r⟩|, where "significant" is understood relative to the height and curvature of the main lobe of (A/2)|⟨v, s⟩|. If, in addition, |⟨v, r⟩| does not vary significantly near ω0, then the location of the maximum of |⟨v, x⟩| over the range [0, π] will be negligibly different from ω0.

Conclusion. If a signal vector x has a single strong sinusoidal component, then the frequency of that component can be estimated reasonably accurately from the position of the maximum in the left half of the amplitude spectrum, where the DFT is computed after padding the signal with sufficiently many zeros.

Example 3.8.2. Consider the 16-point signal

x[n] = 5.12 cos(2π(0.3221)n + 1.39) + r[n]

where r[n] is white Gaussian noise with mean zero and standard deviation 0.85 (roughly a sixth of the amplitude of the sinusoid). In the current notation, we have A = 5.12, ω0 = 2π(0.3221) and φ = 1.39. The figure shows plots of (A/2)|⟨v, s⟩|, (A/2)|⟨v, s*⟩|, |⟨v, r⟩| and |⟨v, x⟩| against cyclic frequency f = ω/2π (cycles/sample), all computed using 256-point DFT's. Note that |⟨v, r⟩| and |⟨v, x⟩| are symmetric about ω = π (or f = 1/2), since x and r are real-valued vectors. Both |⟨v, s⟩| and |⟨v, x⟩| achieve their maximum over the range 0 ≤ ω ≤ π at ω = 2π(82/256) ≈ 2π(0.3203), which corresponds to frequency index k = 82 in the 256-point DFT. Additional zero-padding would give an estimate closer to the true value 2π(0.3221).

[Figure, Example 3.8.2: (A/2)|⟨v, s⟩|, (A/2)|⟨v, s*⟩|, |⟨v, r⟩| and |⟨v, x⟩|, each plotted against cyclic frequency f ∈ [0, 1].]

3.8.4 Final Remarks

The methodology discussed above can be extended to a signal containing two or more strong sinusoidal components: as long as these components are well-separated on the frequency axis, their (unknown) frequencies can be estimated well by locating maxima on the left half of the amplitude spectrum. Good separation in frequency typically means that the main lobes corresponding to different components are sufficiently far apart. Since the main lobe width is 4π/L, the only way to obtain sharper peaks is to increase L, i.e., take more samples of the signal. Increasing the number of points N in the DFT does not solve the problem, since lobe width does not depend on N.

Finally, we should note that it is possible to improve the detection and frequency estimation technique outlined above (in the case where multiple sinusoids may be present) by scaling the values of the data vector x using

certain standard time-domain functions (known as windows) before computing the DFT. As a result of this preprocessing, the height ratio between the main lobe and the side lobes is increased by orders of magnitude, at the expense of only a moderate increase in main lobe width. Such techniques provide better resolution for sinusoids that are close in frequency.
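The effect of windowing on side-lobe level is easy to demonstrate numerically. The following sketch (plain Python; the choice of a Hann window is purely illustrative, since the text does not single out a particular window) compares the largest side lobe of a rectangular-windowed and a Hann-windowed sinusoid:

```python
import cmath
import math

L, N = 32, 1024                  # data length and zero-padded DFT size
f0 = 0.25
x = [math.cos(2 * math.pi * f0 * n) for n in range(L)]
# Hann window, one standard choice of tapering function
w = [0.5 - 0.5 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]

def amp_spectrum(v):
    return [abs(sum(v[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(L)))
            for k in range(N)]

def sidelobe_ratio(a, halfwidth):
    # Largest spectral value outside the main lobe, relative to the peak
    kp = max(range(N // 2), key=lambda k: a[k])
    side = max(a[k] for k in range(N // 2) if abs(k - kp) > halfwidth)
    return side / a[kp]

r_rect = sidelobe_ratio(amp_spectrum(x), N // L)          # main lobe width 4*pi/L
r_hann = sidelobe_ratio(amp_spectrum([x[n] * w[n] for n in range(L)]),
                        2 * N // L)                       # roughly twice as wide
print(r_rect, r_hann)   # about 0.22 (-13 dB) for rectangular; far smaller for Hann
```

The side-lobe ratio drops by more than an order of magnitude, while the main lobe roughly doubles in width, matching the trade-off described above.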

Problems

Section 3.1

P 3.1. Let

α = 1/2  and  β = √3/2

(i) Determine a complex number z such that the vector

v = [1 z z² z³ z⁴ z⁵]^T

equals

[1  α+jβ  −α+jβ  −1  −α−jβ  α−jβ]^T

(ii) If

s = [3 2 −1 0 −1 2]^T

determine the least-squares approximation ŝ of s in the form of a linear combination of 1 (i.e., the all-ones vector), v and v*. Clearly show the numerical values of the elements of ŝ.

P 3.2. Let v(0), v(1) and v(7) denote the complex Fourier sinusoids of length N = 8 at frequencies ω = 0, ω = π/4 and ω = 7π/4, respectively. Determine the least-squares approximation ŝ of

s = [4 3 2 1 0 1 2 3]^T

based on v(0), v(1) and v(7). Compute the squared approximation error ‖ŝ − s‖².

P 3.3. Consider the eight-point vector

x = [2√2−1  1  −2√2−1  5  −2√2−1  1  2√2−1  −3]^T

(i) Show that x contains only three Fourier frequencies, i.e., it is the linear combination of exactly three Fourier sinusoids v(k). Determine the coefficients of these sinusoids in the expression for x.
(ii) Find an equivalent expression for x in terms of real-valued sinusoids of the form cos(ωn + φ), where 0 ≤ ω ≤ π.

P 3.4. The columns of the matrix

V = [ 1  1  1  1
      1  j −1 −j
      1 −1  1 −1
      1 −j −1  j ]

are the complex Fourier sinusoids of length N = 4. Express the vector

s = [1 4 −2 5]^T

as a linear combination of the above sinusoids. In other words, find a vector c = [c0 c1 c2 c3]^T such that s = Vc.

P 3.5. Let u = e^{j(2π/9)} and z = e^{j(π/5)}.

(i) Write out the entries of the DFT matrix W9 (corresponding to a nine-point signal) using the real number 1 and complex numbers u^m, where m is a nonzero integer between −4 and 4.
(ii) Write out the entries of the IDFT matrix V10 (corresponding to a ten-point signal) using real numbers 1, −1 and complex numbers z^m, where m is a nonzero integer between −4 and 4.

Section 3.2

P 3.6. (i) Sketch the six-point vectors x and y defined by

x[n] = { 1, n = 2, 4;  0, otherwise }  and  y[n] = cos(2πn/3)

(ii) Prove that

y = (v(2) + v(4)) / 2

where v(k) is the kth Fourier sinusoid for N = 6.
(iii) Compute the DFT's X and Y. Observe the similarities between x and Y, as well as between y and X.

Section 3.3

P 3.7. Consider the real-valued signal x given by

x[n] = 3(−1)ⁿ + 7 cos(πn/4 + 1.2) + 2 cos(πn/2 − 0.8),  n = 0, …, 7

(i) Which Fourier frequencies (for an eight-point sample) are present in the signal x?
(ii) Determine the amplitude spectrum of x, displaying your answer in the form [A0 A1 A2 A3 A4 A5 A6 A7]^T.
(iii) Determine the phase spectrum of x, displaying your answer in the form [φ0 φ1 φ2 φ3 φ4 φ5 φ6 φ7]^T.

P 3.8. Consider the N-point time-domain signal x given by

x[n] = rⁿ,  n = 0, …, N−1

where r is a real number other than −1 or +1.

(i) Using the formula for the geometric sum, show that the DFT X of x is given by

X[k] = (1 − r^N) / (1 − r w^k),  k = 0, …, N−1

where, as usual, w = e^{−j(2π/N)}.
(ii) Using the formula |z|² = z z*, show that the square amplitude spectrum is given by

|X[k]|² = (1 − r^N)² / (1 + r² − 2r cos(2kπ/N)),  k = 0, …, N−1

P 3.9. A five-point real-valued signal x has DFT given by

X = [4  1+j  3−j  z1  z2]^T

(i) Compute x[0] + x[1] + x[2] + x[3] + x[4] using one entry of X only.
(ii) Determine the values of z1 and z2.
(iii) Compute the amplitude and phase spectra of x, displaying each as a vector.
(iv) Express x[n] as a linear combination of three real-valued sinusoids.
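The closed form in P 3.8 is easy to check numerically; a quick sketch in plain Python (the problems themselves use MATLAB):

```python
import cmath
import math

N, r = 16, 0.7
x = [r ** n for n in range(N)]

# Direct DFT from the analysis equation
X = [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
     for k in range(N)]

# Closed form from the geometric sum: X[k] = (1 - r^N) / (1 - r*w^k)
w = cmath.exp(-2j * math.pi / N)
X_closed = [(1 - r ** N) / (1 - r * w ** k) for k in range(N)]

err = max(abs(a - b) for a, b in zip(X, X_closed))
print(err)   # on the order of machine precision
```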

(iii) For the case r = 0.7 and N = 16, use MATLAB to produce a (discrete) plot of the square amplitude spectrum in the equation above. Compare your answer to

N = 16; n = (0:N-1)'; r = 0.7;
x = r.^n;
A = abs(fft(x));
bar(n, A.^2)

Section 3.4

P 3.10. Let

x = [2 1 −1 −2 −3 3]^T

Display (as vectors) and sketch the following signals:

• x(1) = Px
• x(2) = P⁵x
• x(3) = Px + P⁵x
• x(4) = x + Rx
• x(5) = x − Rx
• x(6) = F³x
• x(7) = (I − F³)x

P 3.11. The time-domain signal

x = [a b c d e f]^T

has DFT

X = [A B C D E F]^T

Using the given parameters, and defining λ = 1/2 and μ = √3/2 for convenience, write out the components of the DFT vector X(r) for each of the following time-domain signals x(r). (Note the symmetry.)

x(1) = [a −b c −d e −f]^T
x(2) = [a 0 c 0 e 0]^T
x(3) = [a f e d c b]^T
x(4) = [d e f a b c]^T
x(5) = [b a f e d c]^T
x(6) = [f+b a+c b+d c+e d+f e+a]^T
x(7) = [0 μb μc 0 −μe −μf]^T
x(8) = [A B C D E F]^T

P 3.12. An eight-point signal x has DFT

X = [0 0 0 −1 2 −1 0 0]^T

Without inverting X, compute the DFT Y of y, which is given by the equation

y[n] = x[n] · cos(πn/4),  n = 0, …, 7

P 3.13. A real-valued eight-point signal vector

x = [x0 x1 x2 x3 x4 x5 x6 x7]^T

has DFT X given by

X = [4  5−j  −1+3j  −2  −7  S5  S6  S7]^T

Without inverting X, compute the numerical values in the DFT Y of

y = [2x0  x1+x7  x2+x6  x3+x5  2x4  x5+x3  x6+x2  x7+x1]^T

Section 3.5

P 3.14. Consider the signal x shown in the figure (on the left). Its spectrum is given by

X = [A B C D E F G H]^T

(i) The DFT vector shown above contains duplicate values. What are those values?
(ii) Express the DFT Y of the signal y (shown on the right) in terms of the entries of X.

[Figure, Problem P 3.14: the eight-point signals x (left) and y (right).]

P 3.15. Consider the twelve-point vectors x, y and s shown in the figure. If the DFT X is given by

X = [X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11]^T

express the DFT's Y and S in terms of the entries of X.

[Figure, Problem P 3.15: the twelve-point vectors x, y and s.]

P 3.16. Run the MATLAB script

n = (0:63)';
X = [ones(11,1); zeros(43,1); ones(10,1)];
bar(X), axis tight
max(imag(ifft(X)))       % See (i) below
x = real(ifft(X)); bar(x)
cs = cos(3*pi*n/4);      % See (ii) below
y = x.*cs; bar(y)
max(imag(fft(y)))        % See (iii) below
Y = real(fft(y)); bar(Y) % See (iv) below

(i) Why is this value so small?
(ii) Is this a Fourier sinusoid for this problem?
(iii) Why is this value so small?
(iv) Derive the answer for Y analytically, based on known properties of the DFT.
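The answer to part (i) of P 3.16 rests on the circular conjugate symmetry of X; this can be checked in plain Python, with a direct inverse DFT standing in for MATLAB's ifft:

```python
import cmath
import math

N = 64
# Spectrum from P 3.16: ones at k = 0..10 and k = 54..63, zeros elsewhere
X = [1.0 if (k <= 10 or k >= 54) else 0.0 for k in range(N)]

# Inverse DFT from the synthesis equation
x = [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
     for n in range(N)]

# X is real with X[k] = X[(N - k) % N], so x is real up to round-off
max_imag = max(abs(v.imag) for v in x)
print(max_imag)   # on the order of machine precision
```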

P 3.17. The energy spectrum of the signal vector x is defined as the square of the amplitude spectrum, namely |X[k]|² (for k = 0, 1, …, N−1). Show that the total energy of the signal vector x equals the sum of the entries in the energy spectrum divided by N:

Σ_{n=0}^{N−1} |x[n]|² = (1/N) Σ_{k=0}^{N−1} |X[k]|²

This relationship, known as Parseval's identity, can be also written as

‖x‖² = (1/N)‖X‖²

(and is easier to prove in this form). In geometric terms, this result is consistent with the fact that the length of a vector can be computed (via the Pythagorean theorem) using the projections of that vector on any orthonormal set of reference vectors.

Section 3.6

P 3.18. Determine the circular convolution s = x ⊛ y, where

x = [1 2 3 4]^T  and  y = [a b c d]^T

Also, express s as My, where M is a 4 × 4 matrix of numerical values.

P 3.19. Consider two time-domain vectors x and y whose DFT's are given by

X = [1 −2 3 −4]^T  and  Y = [A B C D]^T

Without explicitly computing x or y, determine the DFT of their elementwise product s[n] = x[n]y[n], n = 0, 1, 2, 3.

P 3.20. The time-domain signals x and y have DFT's

X = [3 5 8 −4]^T  and  Y = [1 0 1 −1]^T

(i) Is either x or y real-valued?
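Both Parseval's identity (P 3.17) and the circular-convolution property exercised in this section can be verified on small vectors. A sketch in plain Python, using [5 6 7 8] as a hypothetical numeric stand-in for [a b c d]:

```python
import cmath
import math

def dft(v):
    N = len(v)
    return [sum(v[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1, 2, 3, 4]
X = dft(x)

# Parseval's identity: total energy equals the summed energy spectrum over N
e_time = sum(abs(v) ** 2 for v in x)          # 30
e_freq = sum(abs(V) ** 2 for V in X) / len(x)
print(e_time, e_freq)

# Circular convolution in time corresponds to multiplication of DFT's
y = [5, 6, 7, 8]          # hypothetical numeric stand-in for [a b c d]
Y = dft(y)
N = 4
s_time = [sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)]
S = [X[k] * Y[k] for k in range(N)]
s_freq = [sum(S[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
          for n in range(N)]
print(s_time)             # [66, 68, 66, 60], matching the real parts of s_freq
```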

(ii) Does either x or y have circular conjugate symmetry?
(iii) Without inverting X or Y, determine the DFT of the signal s(1) defined by s(1)[n] = x[n]y[n], n = 0, …, 3.
(iv) Without inverting X or Y, determine the DFT of the signal s(2) defined by s(2) = x ⊛ y.

P 3.21. The time-domain signals

x = [2 0 1 3]^T  and  y = [1 −1 2 −4]^T

have DFT's X and Y given by

X = [X0 X1 X2 X3]^T  and  Y = [Y0 Y1 Y2 Y3]^T

Determine the time-domain signal s whose DFT is given by

S = [X0·Y2  X1·Y3  X2·Y0  X3·Y1]^T

P 3.22. The circular cross-correlation of two N-point vectors x and y is the N-point vector s defined by

s[n] = ⟨Pⁿy, x⟩ = xᵀPⁿy*,  n = 0, …, N−1

(i) Show that s also equals x ⊛ Ry*.
(ii) Using (i) above, show that the DFT S of s is given by S[k] = X[k]Y*[k], k = 0, …, N−1.
(iii) Show that in the special case where x = y, S is just the energy spectrum of x, i.e., S[k] = |X[k]|², k = 0, …, N−1. What symmetry properties does s have in this case?

(iv) Compute s and S for

x = y = [1+j  2  1−j  0]^T

Section 3.7

P 3.23. The signal

x = [a b c d 0 0 0 0]^T

has DFT

X = [A B C D E F G H]^T

Express the following DFT's in terms of the entries of X:

(i) The DFT Y of y = [a b c d a b c d]^T
(ii) The DFT S of s = [0 0 0 0 a b c d]^T

P 3.24. The DFT of the signal

x = [1 −1 1 −1 0 0 0 0]^T

is given by

X = [X0 X1 X2 X3 X4 X5 X6 X7]^T

(i) What are the values of X0, X2, X4 and X6?
(ii) Display the time-domain signal y whose DFT is given by

Y = [X0 0 0 X2 0 0 X4 0 0 X6 0 0]^T

P 3.25. (i) The signal x has spectrum (DFT)

X = [A 0 0 B 0 0 C 0 0 D 0 0]^T

What special properties does x have?
(ii) The signal y has spectrum

Y = [A B C D A B C D A B C D]^T

What special properties does y have?

P 3.26. In MATLAB notation, consider the 4-point vector

s = [a b c d].'

and its zero-padded extension

x = [s ; zeros(12,1)]

Let X denote the DFT of x. Express the DFT's of the following vectors using the entries of X:

x1 = s
x2 = [s ; s]
x3 = [s ; z4]
x4 = [z4 ; s]
x5 = [s ; s ; s ; s]
x6 = [s ; z4 ; s ; z4]
x7 = [s ; z4 ; s ; z4 ; s]   % length=20

where z4 = zeros(4,1).

P 3.27. The twelve-point signal

x = [a b c 0 0 0 0 0 0 0 0 0]^T

has DFT

X = [X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11]^T

Express the following DFT's in terms of the entries of X:

(i) The DFT Y of y = [a b c 0 0 0]^T
(ii) The DFT S of s = [a b c a b c a b c a b c]^T

P 3.28. Consider a real-valued vector s of length L, and define a vector x of length 2L+1 by

x = [ a ; s ; Qs ]

where Q is the linear reversal matrix and a ∈ ℝ.

Let y be the vector obtained by padding s with L+1 zeros. Show that the DFT X of x has the following expression in terms of the DFT Y of y and the complex number w = e^{−j2π/(2L+1)}:

X[k] = a + 2·ℜe{w^k Y[k]},  k = 0, …, 2L

Section 3.8

P 3.29. A continuous-time signal consists of two sinusoids at frequencies f1 and f2 (Hz). The signal is sampled at a rate of 500 samples/sec (assumed to be greater than the Nyquist rate), and 32 consecutive samples are recorded. The figure shows the graph of the magnitude of the 32-point DFT (i.e., without zero-padding) as a function of the frequency index k.

[Figure, Problem P 3.29: |X[k]| for k = 0 to 31, with symmetric peaks of heights 126 and 84.]

(i) What are the frequencies f1 and f2 (in Hz)?
(ii) Is it possible to write an equation for the continuous-time signal based on the information given? If so, write that equation. If not, explain why.

P 3.30. A continuous-time signal is given by

x(t) = A1 cos(2πf1 t + φ1) + A2 cos(2πf2 t + φ2) + z(t)

where z(t) is noise. The signal x(t) is sampled every 1.25 ms for a total of 80 samples; the sampling rate of 800 samples/sec is no less than twice each of f1 and f2. The figure shows the graph of the DFT of the 80-point signal as a function of the frequency index k = 0, …, 79. Based on the given graph, what are your estimates of f1 and f2? (You may assume that no aliasing has occurred.)
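In both P 3.29 and P 3.30, a peak at frequency index k in the N-point DFT corresponds to the analog frequency f = k·fs/N Hz. A minimal sketch in Python with the numbers of P 3.29 (the peak indices here are hypothetical, since they must be read off the figure):

```python
# Mapping between DFT index k and analog frequency (Hz): f = k * fs / N
fs, N = 500, 32          # sampling rate and record length from P 3.29
k_peaks = [4, 12]        # hypothetical peak locations read off the graph
freqs = [k * fs / N for k in k_peaks]
print(freqs)             # [62.5, 187.5]
```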

[Figure, Problem P 3.30: DFT magnitude of the 80-point signal versus frequency index k = 0 to 79.]

P 3.31. A continuous-time signal is a sum of two sinusoids at frequencies 164 Hz and 182 Hz. 200 samples of the signal are obtained at the rate of 640 samples/sec, and the DFT of the samples is computed.

(i) What frequencies ω1 = 2πf1 and ω2 = 2πf2 are present in the discrete-time signal obtained by sampling at the above rate?
(ii) Of the 200 Fourier frequencies in the DFT, which two are closest to ω1 = 2πf1 and ω2 = 2πf2?
(iii) If the 200 samples are padded with M zeros and the (M + 200)-point DFT is computed, what would be the least value of M for which both ω1 and ω2 are Fourier frequencies?

P 3.32. A musical note is a pure sinusoid of a specific frequency known as the fundamental frequency. Played on an instrument, notes are "colored" by the introduction of harmonics, namely sinusoids having frequencies which are exact multiples of the fundamental. An instrument playing a 330 Hz note is recorded digitally over a time interval of duration 50 ms at a sampling rate of 46,200 samples/sec, yielding 2,310 samples. It is assumed that the sampling rate exceeds the Nyquist rate; this means that there are no harmonics at frequencies k(330) Hz for k ≥ 70.

(i) Does the fundamental frequency of 330 Hz correspond to a Fourier frequency for the 2,310-point sample vector? (Note that if it is a Fourier frequency, its harmonics will be Fourier frequencies also.) If it is not a Fourier

frequency, what is the largest value of N less than or equal to 2,310 that would make it a Fourier frequency?

(ii) (MATLAB) The vector s3 contains the 2,310 samples of the note. Take the first N entries of that vector, where N was the answer to part (i), and compute their DFT. Plot abs(X) against cyclic frequency f. (Note: f is related to the Fourier frequency index k by f = k/N.) Determine the total number of harmonics (positive frequencies only, and excluding the fundamental) which are within 40 dB of the fundamental, i.e., the ratio of their amplitude to that of the fundamental is no less than 1%.

P 3.33. The 64-point vector s4 was generated in MATLAB using

n = (0:63).';
s4 = A*cos(2*pi*f1*n+q) + z;

where each sample is distorted by a small amount of noise. The parameters A, f1 (between 0 and 1/2) and q were previously specified. The vector z, which represents noise, was also generated earlier.

(i) Compute the DFT of s4 extended by zero-padding to N = 1000 points:

X = fft(s4,1000)

Obtain an estimate of f1 by locating the maximum value of abs(X).

(ii) Obtain an estimate of the phase q using the estimate f1 and the following script:

p = [];
for r = 2*pi*(0:0.002:1)
    s = cos(2*pi*f1*n+r);
    p = [p; s4'*s];
end

(The maximum of the inner product s4'*s occurs for a value of r close to the true value of the phase shift q.) Finally, obtain an estimate of the amplitude A.

P 3.34. The 128-point vector s5 consists of three sinusoids of cyclic frequencies f1, f2 and f3 (where 0 ≤ f1 < f2 < f3 ≤ 1/2) plus a small amount of noise. Plot s5 against time. Use zero-padding to N = 2048 points and follow the same technique as in part (i) of Problem 3.33 to obtain estimates of f1, f2 and f3.
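The two-step estimation procedure of P 3.33 can be sketched in plain Python (the problem itself uses MATLAB). The parameters A, f1 and q below are hypothetical, and the noise z is omitted so that the run is deterministic:

```python
import cmath
import math

L, N = 64, 1000
A_true, f1_true, q_true = 2.0, 0.21, 0.8      # hypothetical parameters
s4 = [A_true * math.cos(2 * math.pi * f1_true * n + q_true) for n in range(L)]

# (i) zero-padded DFT and frequency estimate from the peak location
X = [sum(s4[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(L))
     for k in range(N)]
k_hat = max(range(1, N // 2), key=lambda k: abs(X[k]))
f1_hat = k_hat / N

# (ii) phase estimate: scan r and maximize the inner product s4' * cos(...)
def inner(r):
    return sum(s4[n] * math.cos(2 * math.pi * f1_hat * n + r) for n in range(L))

q_hat = max((2 * math.pi * i / 500 for i in range(500)), key=inner)
print(f1_hat, q_hat)   # close to the true values 0.21 and 0.8
```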

Chapter 4

Introduction to Linear Filtering

4.1 The Discrete-Time Fourier Transform

4.1.1 From Vectors to Sequences

We will now discuss how the DFT tools developed in the previous chapter can be adapted for the analysis of discrete-time signals of infinite length, also known as sequences. The index set (i.e., time axis) for such a sequence x will consist of all integers:

x = {x[n], n ∈ ℤ}

As in MATLAB, we will use the notation

x[n1 : n2] = [x[n1] … x[n2]]^T

to denote a segment of the sequence x corresponding to time indices n1 through n2. The length of the segment is finite provided both n1 and n2 are finite.

In order to express x as a linear combination of sinusoids of infinite length, it is important to understand how the DFT of a signal vector s behaves as the length N of s approaches infinity. To that end, we note the following:

• The analysis equation

S[k] = Σ_{n=0}^{N−1} s[n] e^{−j(2π/N)kn}

involves a sum over time indices 0 through N−1. A similar expression for the infinite sequence x would have to include both positive and negative time indices, which would range from n = −∞ to n = +∞. This clearly becomes an infinite sum in the limit as N → ∞.

• The synthesis equation

s[n] = (1/N) Σ_{k=0}^{N−1} S[k] e^{j(2π/N)kn}

is a sum over frequency indices k = 0 through k = N−1. The corresponding radian frequencies are ω = 0 through ω = 2π − (2π/N) in increments of 2π/N. As N → ∞, the spacing between consecutive frequencies shrinks to zero, and thus every frequency ω in [0, 2π) becomes a Fourier frequency. The continuous-index version of a sum is

an integral: as N → ∞,

Σ_{k=0}^{N−1} (·)  →  ∫_{ω=0}^{2π} (·) dω

In brief, both the analysis and synthesis equations are expected to take on different forms as N → ∞.

4.1.2 Development of the Discrete-Time Fourier Transform

To develop the exact form of the analysis and synthesis equations for the sequence x, consider the segment

x(L) = x[−L : L] = [x[−L] … x[L]]^T

which has length N = 2L+1. The infinite sequence x is obtained by letting L → ∞, as shown in Figure 4.1.

[Figure 4.1: The DTFT of a sequence x is obtained from the DFT of the vector x(L) by letting L → ∞.]

The standard DFT of x(L) provides us with the coefficients needed for representing x(L) as a linear combination of N sinusoidal vectors v(k) of length N, each having a different Fourier frequency k(2π/N). The topmost element in each v(k) (which would correspond to time n = −L in this case) equals e^{j0} = 1. For the task at hand, a more suitable choice for the kth Fourier sinusoid is

ṽ(k) = [ e^{j(2π/N)kn} ]_{n=−L}^{n=L}

2L (4. n ∈ Z} is deﬁned by ∞ X(ejω ) = n=−∞ x[n]e−jωn provided the inﬁnite sum converges. This deﬁnition serves as the analysis equation in the case of an inﬁnitelength sequence.218 whose middle entry (corresponding to time n = 0) equals 1.1. in which case the Fourier frequencies are taken in (−π.2) where. . (Incidentally. nor does it alter their norm. N → ∞). v( ) = N. ˜ ˜ v(k) .1) and 1 x[n] = N N −1 X (L) [k]ej(2π/N )kn . Since ejω is periodic with period 2π. the set of Fourier frequencies k(2π/N ) becomes the entire interval [0. It remains to derive a synthesis equation. This circular shift has no eﬀect on the orthogonality of the Fourier sinusoids.1) as a function of ω instead of k. we note that the frequency index k can also range from −L to L. . k = . i.1. The discrete-time Fourier transform (DTFT) of the sequence x = {x[n]. N = 2L + 1. This prompts us to rewrite the sum in (4. . k=0 n = −L. . and provides us with a continuous set of coeﬃcients X(ejω ) for frequencies in [0. L (4. . Thus. 0. again. and to also take the limit as L → ∞ (i. .e. . The resulting expression ∞ x[n]e−jωn n=−∞ is a power series in ejω .e. as in Subsection 3. Deﬁnition 4. to show how the sequence x can be reconstructed by linearly combining . k = The resulting modiﬁed analysis and synthesis equations are L X (L) [k] = n=−L x[n]e−j(2π/N )kn . 2π). 2π) as N → ∞..1. .3. π).) As we noted earlier.. so is the resulting sum. k = 0.

sinusoids e^{jωn} with the above coefficients. As suggested earlier, this linear combination takes the form of an integral, which is constructed as follows.

For large values of L, we can write an approximation for the sum in (4.2) by replacing each coefficient X(L)[k] by the value of X(e^{jω}) at the same frequency:

x[n] ≈ (1/N) Σ_{k=0}^{N−1} X(e^{j(2π/N)k}) · e^{j(2π/N)kn}

Multiplying and dividing the right-hand side by 2π results in

x[n] ≈ (1/2π) Σ_{k=0}^{N−1} X(e^{j(2π/N)k}) · e^{j(2π/N)kn} · (2π/N)

The sum consists of N equally spaced samples of the function X(e^{jω})e^{jωn} over the interval [0, 2π), each multiplied by the spacing 2π/N (in frequency). As N → ∞, the sum converges to the integral of the function over the same interval. The approximation (≈) also becomes exact in the limit, and therefore

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω,  n ∈ ℤ

In summary, we have obtained the DTFT pair

x[n]  ←DTFT→  X(e^{jω})

defined by the equivalent equations:

• Analysis:

X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}    (4.3)

• Synthesis:

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω,  n ∈ ℤ    (4.4)

The synthesis equation is an example of a linear combination of a continuously indexed set of signals (the continuous index being frequency ω in this case). Each signal has its own coefficient (as was the case with the DFT), and the signals are combined at each time instant n by integrating over the range of the continuous index.
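For a finite-duration sequence, the pair (4.3)–(4.4) can be verified numerically: the integral in the synthesis equation, approximated by a Riemann sum over [0, 2π), recovers the original samples. A sketch in plain Python:

```python
import cmath
import math

x = {-2: 1.0, 0: 2.0, 3: -1.0}     # a finite-duration sequence (zero elsewhere)

def X(omega):
    # Analysis equation (4.3); the sum is finite here
    return sum(v * cmath.exp(-1j * omega * n) for n, v in x.items())

M = 4096                            # number of frequency samples on [0, 2*pi)
def synth(n):
    # Riemann-sum approximation of the synthesis integral (4.4)
    return sum(X(2 * math.pi * m / M) * cmath.exp(2j * math.pi * m * n / M)
               for m in range(M)) / M

# Recovers 2.0, -1.0 and 0.0 (up to round-off)
print(synth(0).real, synth(3).real, synth(1).real)
```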

4.2 Computation of the Discrete-Time Fourier Transform

4.2.1 Signals of Finite Duration

From a computational viewpoint, there are essential differences between the DFT of a finite-length vector and the DTFT of a sequence of infinite length. The DFT is a finite vector of complex values obtained using a finite number of floating-point operations. The DTFT, on the other hand, is an infinite sum computed over a continuum of frequencies, which

• is rarely expressible in a simple closed form; and
• involves, as a rule, an infinite number of floating-point operations at each frequency.

An obvious exception to the second statement is the class of signal sequences which contain only finitely many nonzero values.

Definition 4.2.1. The signal sequence x has finite duration if there exist finite time indices n1 and n2 such that

n < n1 or n > n2  ⇒  x[n] = 0

It has infinite duration otherwise.

The DTFT of a finite-duration sequence is computed by summing together finitely many terms. We begin by considering three simple examples of such sequences and their DTFT's.

Example 4.2.1. The unit impulse sequence (also known as unit sample sequence) is defined by

x[n] = δ[n] = { 1, n = 0;  0, n ≠ 0 }

Its DTFT is given by

X(e^{jω}) = 1 · e^{−jω·0} = 1

Using the synthesis equation (4.4), we see that the unit impulse combines all sinusoids in the frequency range [0, 2π) with equal coefficients:

δ[n] = (1/2π) ∫_0^{2π} e^{jωn} dω

[Figure, Example 4.2.1: the unit impulse x[n] = δ[n] and its DTFT X(e^{jω}) = 1.]

Example 4.2.2. The delayed impulse

x[n] = δ[n−m] = { 1, n = m;  0, otherwise }

is shown in the figure.

[Figure, Example 4.2.2: the delayed impulse x[n] = δ[n−m].]

Its DTFT is given by

X(e^{jω}) = e^{−jωm} = cos ωm − j sin ωm

and is complex valued except for m = 0.

Example 4.2.3. The symmetric impulse pair

x[n] = δ[n+m] + δ[n−m]

has DTFT given by

X(e^{jω}) = e^{−jω(−m)} + e^{−jωm} = 2 cos mω

This is a real sinusoid in ω having period 2π/m. It is plotted in the case m = 2.

[Figure, Example 4.2.3: the impulse pair x[n] = δ[n+m] + δ[n−m] and its DTFT X(e^{jω}) = 2 cos mω, plotted for m = 2.]
In Example 4.2.3, the time-domain signal was real-valued and symmetric about n = 0: x[n] = x[−n] The DTFT X(ejω ) was also real-valued and symmetric about ω = π. A similar relationship between signals in the time and frequency domains was encountered earlier in our discussion of the DFT. There is, in fact, a direct correspondence between the structural properties of the two transforms, and one set of properties can be derived from the other using standard substitution rules. In the case of the DTFT, symmetry in the time domain is about n = 0 (which is the middle index of the modiﬁed DFT X (L) [k] introduced in Subsection 4.1.2). Also, time-index reversal (involved in the deﬁnition of symmetry) is understood in linear terms, i.e., n −→ −n; circular time reversal is impossible on an inﬁnite time axis. The same is true for time delays of sequences, i.e., they are linear instead of circular. We know that delaying a vector in time results in multiplying its spectrum by a complex sinusoid. As it turns out, the same is true for sequences: Fact. (Time Delay Property of the DTFT) If y[n] = x[n−m], then Y (ejω ) = e−jωm X(ejω ). Proof. We use the analysis equation (4.4) with a change of summation index (n = n − m):

Y (e ) =
n=−∞ ∞

y[n]e−jωn x[n − m]e−jωn
n=−∞

=

223

=
n =−∞

x[n ]e−jω(n +m)

= e−jωm · = e
−jωm

x[n ]e−jωn
n =−∞ jω

X(e )
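The time delay property is easy to confirm numerically for a finite-duration sequence; a small Python sketch:

```python
import cmath

x = {0: 1.0, 1: -2.0, 2: 3.0}          # finite-duration sequence
m = 4
y = {n + m: v for n, v in x.items()}   # y[n] = x[n - m]

def dtft(sig, omega):
    return sum(v * cmath.exp(-1j * omega * n) for n, v in sig.items())

w = 1.3                                # an arbitrary frequency
lhs = dtft(y, w)
rhs = cmath.exp(-1j * w * m) * dtft(x, w)
print(abs(lhs - rhs))                  # ~0
```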

4.2.2 The DFT as a Sampled DTFT

We now turn to an important connection between the DTFT of a finite-duration signal x and the DFT of a finite segment which contains all the nonzero values in x. Let x be given by

x[n] = { s[n], 0 ≤ n ≤ L−1;  0, otherwise }

[Figure 4.2: Vector s and its two-sided infinite zero-padded extension x.]

We see that x is obtained from the L-point vector s by padding s with infinitely many zeros on both sides (as shown in Figure 4.2). The DFT S of s is given by

S[k] = Σ_{n=0}^{L−1} s[n] e^{−j(2π/L)kn} = Σ_{n=0}^{L−1} x[n] e^{−j(2π/L)kn} = X(e^{j(2π/L)k})

i.e., it is obtained by sampling the DTFT X(e^{jω}) of x at the Fourier frequencies for an L-point sample. Similarly, the DFT of the N-point zero-padded extension of s is obtained by sampling X(e^{jω}) at the Fourier frequencies for an N-point sample. As N increases, the set of Fourier frequencies becomes denser (since the spacing equals 2π/N), and the DFT of the zero-padded vector provides uniformly spaced samples of X(e^{jω}) at a higher resolution.
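This sampling relationship can be checked directly; a sketch in plain Python for an arbitrary 4-point vector:

```python
import cmath
import math

s = [1.0, -2.0, 0.5, 3.0]
L = len(s)

def dtft(omega):
    # DTFT of the zero-padded extension of s
    return sum(s[n] * cmath.exp(-1j * omega * n) for n in range(L))

# L-point DFT of s
S = [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
     for k in range(L)]

err = max(abs(S[k] - dtft(2 * math.pi * k / L)) for k in range(L))
print(err)   # ~0: the DFT samples the DTFT at the Fourier frequencies
```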

If the vector s appears in the sequence x with a delay of m time units, i.e.,

x[n] = { s[n−m], m ≤ n ≤ m+L−1;  0, otherwise }

then by the time delay property established earlier, the DFT S is obtained by sampling e^{jωm}X(e^{jω}) at frequencies ω = k(2π/L). Also, a dense plot of the DTFT X(e^{jω}) can be obtained from the DFT of a zero-padded extension of s by multiplying each entry (of the DFT vector) by the corresponding sample of e^{−jωm}.

Example 4.2.4. Each of the finite-duration sequences x, y and u is a two-sided zero-padded extension of an eight-point vector s delayed by a different amount. Specifically, let

x[0 : 7] = y[3 : 10] = u[−5 : 2] = s

as illustrated in the figure.

then by the time delay property established earlier, the DFT S is obtained by sampling ejωm X(ejω ) at frequencies ω = k(2π/L). Also, a dense plot of the DTFT X(ejω ) can be obtained from the DFT of a zero-padded extension of s by multiplying each entry (of the DFT vector) by the corresponding sample of e−jωm . Example 4.2.4. Each of the ﬁnite-duration sequences x, y and u is a twosided zero-padded extension of an eight-point vector s delayed by a diﬀerent amount. Speciﬁcally, let x[0 : 7] = y[3 : 10] = u[−5 : 2] = s as illustrated in the ﬁgure.

x

0 s 0 u 0
Example 4.2.4

y

Suppose we are interested in computing the DTFT’s X(ejω ), Y (ejω ) and U (ejω ) with frequency resolution of 0.01 cycle/sample, i.e., for ω equal to multiples of 2π/100. For that purpose, it suﬃces to compute 100-point DFT’s. Using the standard MATLAB syntax fft(s,N) for the DFT of the N -point zero-padded extension of s, we have: k = (0:99).’ ; % s also assumed column vector

v = exp(j*2*pi*k/100);
X = fft(s,100);
Y = X.*(v.^(-3));
U = X.*(v.^5);

(Alternatively, each of Y and U is the DFT of a circular shift on the zero-padded extension of s.)

4.2.3 Signals of Infinite Duration

An infinite-duration signal sequence x has the property that x[n] ≠ 0 for infinitely many values of n. As a result, the analysis equation

X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}

involves an infinite sum of complex-valued terms. The expression is meaningful only when the sum converges absolutely, in other words, when the magnitudes of the summands add up to a finite value:

Σ_{n=−∞}^{∞} |x[n]| · |e^{−jωn}| = Σ_{n=−∞}^{∞} |x[n]| < ∞    (4.5)

Two important types of infinite-duration signals which violate the above condition are:

• sinusoids (real or complex) of any frequency; and
• periodic signals.

Fortunately, the analysis equation is not needed in order to derive the spectrum of either type of signal. A real or complex sinusoid requires no further frequency analysis, since every ω in [0, 2π) is a valid Fourier frequency for the purpose of representing a sequence. Thus, e.g.,

x[n] = A cos(ω0 n + φ) = (A/2) e^{jφ} e^{jω0 n} + (A/2) e^{−jφ} e^{−jω0 n}

is a sum of two complex sinusoids at frequencies ω0 and 2π − ω0. We say that x[n] has a discrete spectrum with components, or lines, at these two frequencies. To plot a discrete spectrum against frequency, we use a vertical line for each sinusoidal component, as shown in Figure 4.3.

[Figure 4.3: The spectrum of the real-valued sinusoid x[n] = A cos(ω0 n + φ), shown for ω ∈ [0, 2π): a line of value (A/2)e^{jφ} at ω0 and a line of value (A/2)e^{−jφ} at 2π − ω0.]

Note that (A/2)e^{jφ} and (A/2)e^{−jφ} are real-valued only when φ = 0 or φ = π.

Remark. Even though the sum in the analysis equation fails to converge to a valid function X(e^{jω}), it is still possible to express a discrete spectrum mathematically using a special type of function (of ω), so that the integral in the synthesis equation provides the correct time-domain signal. This representation is beyond the scope of the present discussion.

Periodic sequences can be also represented in terms of sinusoids without recourse to the analysis equation. This is because a sequence x of period L is the infinite two-sided periodic extension of the L-point vector

s = x[0 : L−1]

as illustrated in Figure 4.4.

[Figure 4.4: An infinite periodic sequence as the two-sided periodic extension of its first period.]

From our discussion in Subsection 3.7, we know that if S is the (L-point) DFT of s, then the synthesis equation

s[n] = (1/L) Σ_{k=0}^{L−1} S[k] e^{j(2π/L)kn},  n = 0, …, L−1

also returns x[n] for values of n outside the range 0 : L−1. Thus

x[n] = (1/L) Σ_{k=0}^{L−1} S[k] e^{j(2π/L)kn},  n ∈ ℤ

and we have established the following.

Fact. A periodic sequence x of period L is the sum of L sinusoidal components at frequencies which are multiples of 2π/L. The coefficients of these sinusoids are given by the DFT of x[0 : L−1] scaled by 1/L.

The spectrum X(e^{jω}) also consists of L lines at the above-mentioned frequencies. The spectrum of a periodic sequence is illustrated in Figure 4.5.

[Figure 4.5: The spectrum of a periodic signal of period L, shown for ω ∈ [0, 2π): lines spaced 2π/L apart.]

We conclude our discussion with an example of an infinite-duration signal which satisfies the convergence condition (4.5).

Example 4.2.5. Let x be a decaying exponential in positive time:

x[n] = { aⁿ, n ≥ 0;  0, n < 0 }

where |a| < 1. This signal is shown (for 0 < a < 1) together with the cases a = 1 and a > 1. For |a| < 1, we have

X(e^{jω}) = Σ_{n=0}^{∞} aⁿ e^{−jωn} = Σ_{n=0}^{∞} (a e^{−jω})ⁿ = 1 / (1 − a e^{−jω})

where the last expression is the formula for the infinite geometric sum from Subsection 3.2. The condition |a e^{−jω}| = |a| < 1 guarantees convergence of the infinite sum.

[Figure, Example 4.2.5: the signal x[n] plotted for 0 < a < 1, a = 1 and a > 1.]
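The closed form of Example 4.2.5 can be checked by truncating the infinite sum, whose tail is bounded by the geometric series; a Python sketch:

```python
import cmath

a = 0.6
M = 200                          # truncation point; the tail is O(a^M)

def dtft_truncated(omega):
    return sum((a ** n) * cmath.exp(-1j * omega * n) for n in range(M))

w = 0.9                          # an arbitrary test frequency
closed = 1 / (1 - a * cmath.exp(-1j * w))
err = abs(dtft_truncated(w) - closed)
print(err)                       # negligible: X(e^jw) = 1/(1 - a e^{-jw})
```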

4.3 Introduction to Linear Filters

4.3.1 Linearity, Time-Invariance and Causality

In our earlier discussion of the DFT and its applications, we saw how the main sinusoidal components of a signal manifest themselves in the DFT of a suitably long segment of that signal. Identifying these components in the DFT allows us to reconstitute the signal (via an inverse DFT) in a more selective manner, e.g., by retaining desirable frequencies while eliminating undesirable ones. This type of signal transformation is known as frequency-selective filtering and can be applied to any segment of the signal, once the frequencies of interest have been determined.

In many applications, these frequencies are known ahead of time, e.g., radar return signals in a particular frequency band, or interference signals from a known source (e.g., a power supply at 50 Hz). In such cases, it is possible to use a so-called linear filter to generate the filtered signal in real time—i.e., at the rate at which the original signal is being sampled or recorded, and with minimal delay. Frequency-selective filtering can thus be performed more directly and efficiently without use of DFT's. The remainder of this chapter is devoted to basic concepts needed for the analysis of linear filters and their operation.

A linear filter is a special type of a linear transformation of an input signal sequence x to an output sequence y. The term linear system, used earlier for a linear transformation, is also applicable to a linear filter. Thus a linear filter H is a linear system whose input and output are related by

y = H(x)

We call y the response of the filter to the input x.

[Figure 4.6: A linear filter H with input x and output y.]

Since the time index for a signal sequence varies from n = −∞ to n = +∞, we implicitly assume that the linear filter is active at all times—this is despite the fact that all signals encountered in practice have finite duration. Since the objective of filtering is to retain certain sinusoidal components while eliminating others, an essential property of the transformation H is the relationship between the spectra (i.e., DTFT's) of the sequences x and y. We begin our discussion of filters by exploring that relationship in a very simple example.

Consider the transformation H described by the input-output relationship

y[n] = x[n] − x[n−1] + x[n−2],  n ∈ ℤ    (4.6)

The output y[n] is a linear combination of x[n], x[n−1] and x[n−2] with coefficients 1, −1 and 1, respectively. Recall our earlier definition of a linear transformation:

y(1) = H(x(1)) and y(2) = H(x(2))  ⇒  c1·y(1) + c2·y(2) = H(c1·x(1) + c2·x(2))

It is straightforward to show that the transformation H defined by (4.6) satisfies the above definition of linearity. In addition, H has the following properties:

• Time-Invariance. The relationship between the output sample y[n] and the input samples x[n], x[n−1] and x[n−2] does not change with the time index n; in the context of the above example, this means that the same coefficients (1, −1 and 1) are used to combine the three most recent input samples at any time n. Writing (4.6) in the form

y[·] = x[·] − x[·−1] + x[·−2]

better illustrates this feature.

• Causality. The current output y[n] depends only on present and past values of the input sequence—in this particular case, x[n], x[n−1] and x[n−2]. This is an essential constraint on all systems operating in real time, where future values of the input are unavailable. If, on the other hand, n represents a parameter other than time (e.g., a spatial index in an image or a pointer in an array of recorded data), causality is not a crucial constraint.

The defining properties of a linear filter are linearity and time invariance; causality is an option which becomes necessary for filters operating in real time.

Before proceeding with the analysis of the filter described by (4.6), we illustrate the operation of the filter in Figure 4.7. As n increases, the time window containing the three most recent values of the input slides to the right, while the values of the coefficients remain the same.
where future values of the input are unavailable.6) in the form y[·] = x[·] − x[· − 1] + x[· − 2] better illustrates this feature.230 while eliminating others. DTFT’s) of the sequences x and y. −1 and 1.
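The sliding-window action of the difference equation above is easy to make concrete. The following is a minimal Python sketch (the text's own listings use MATLAB) that applies an FIR difference equation to a finite recording, under the assumption that x[n] = 0 outside the recorded samples; the function name fir_filter is ours, introduced here for illustration only.

```python
def fir_filter(b, x):
    """Compute y[n] = sum_k b[k] * x[n-k], treating x[n] as 0 outside the list."""
    M = len(b) - 1
    y = []
    for n in range(len(x) + M):          # output is nonzero only on this range
        acc = 0
        for k, bk in enumerate(b):       # combine the M+1 most recent inputs
            if 0 <= n - k < len(x):
                acc += bk * x[n - k]
        y.append(acc)
    return y

b = [1, -1, 1]   # coefficients of y[n] = x[n] - x[n-1] + x[n-2]
print(fir_filter(b, [1, 2, 3, 4]))   # -> [1, 1, 2, 3, -1, 4]
```

Note that the output is causal: each y[n] uses only input samples at times n, n−1, n−2.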

Figure 4.7: Illustration of the operation of the filter y[n] = x[n] − x[n − 1] + x[n − 2].

4.3.2 Sinusoidal Inputs and Frequency Response

In order to understand how the time-domain equation (4.6) determines the relationship between the spectra of the input and output sequences x and y, consider a purely sinusoidal input at frequency ω:

x[n] = e^{jωn} ,    n ∈ Z

The output sequence is then given by

y[n] = e^{jωn} − e^{jω(n−1)} + e^{jω(n−2)}
     = (1 − e^{−jω} + e^{−j2ω}) · e^{jωn}
     = (1 − e^{−jω} + e^{−j2ω}) · x[n]

for every value of n. Note that y is obtained by scaling each sample in x by the same (i.e., independent of time) amount. Since a complex scaling factor is equivalent to a change of amplitude and a shift in phase, we have illustrated the following important property of linear filters.
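This scaling property can be checked numerically. The sketch below (Python rather than the text's MATLAB; the frequency 0.7 is our arbitrary choice) confirms that, for a complex sinusoidal input, every output sample equals the input sample times the same complex factor:

```python
import cmath

def H(w):
    # complex scaling factor of y[n] = x[n] - x[n-1] + x[n-2] at frequency w
    return 1 - cmath.exp(-1j * w) + cmath.exp(-2j * w)

w = 0.7                                   # an arbitrary test frequency
x = lambda n: cmath.exp(1j * w * n)       # x[n] = e^{jwn}
for n in range(-3, 8):
    y_n = x(n) - x(n - 1) + x(n - 2)      # time-domain output
    assert abs(y_n - H(w) * x(n)) < 1e-12
print("every output sample is H(e^{jw}) times the input sample")
```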

Fact. When processed by a linear filter, pure sinusoids undergo no distortion other than a change in amplitude and a shift in phase.

The function H(e^{jω}) is known as the frequency response of the filter. For our example,

H(e^{jω}) = 1 − e^{−jω} + e^{−j2ω}

We can thus write

x[n] = e^{jωn} , n ∈ Z  ⇒  y[n] = H(e^{jω}) e^{jωn} , n ∈ Z    (4.7)

Remark. Nonlinear filters distort pure sinusoids by introducing components at other frequencies, notably multiples of the input frequency; this effect is known as harmonic distortion. The fact stated above can be proved for (almost) any linear filter using a similar factorization of the output y[n] in terms of the input x[n] = e^{jωn} and a frequency-dependent complex scaling factor H(e^{jω}).

We are now in a position to explore the relationship between an arbitrary input sequence x and the corresponding output sequence y = H(x) by shifting our focus to the frequency domain. We may assume that the sequence x can be expressed as a sum of sinusoidal components, where the summation is either (i) discrete or (ii) continuous. In the first case, which includes all periodic sequences, we have

x[n] = Σ_k X_k e^{jω_k n}

where the sum is over a discrete set of frequencies ω_k. The coefficients X_k are, in general, complex-valued. Since the filter is linear, the response y will be the sum of the responses to each of the sinusoids on the right-hand side. From (4.7), we know that the response to e^{jω_k n} is given by H(e^{jω_k}) e^{jω_k n}. We thus obtain

y[n] = Σ_k H(e^{jω_k}) X_k e^{jω_k n}

We conclude that the output sequence y also has a discrete spectrum, consisting of lines at the same frequencies (i.e., the ω_k's) as for the input sequence x, but with coefficients scaled by the corresponding value of the frequency response H(e^{jω_k}):

y[n] = Σ_k Y_k e^{jω_k n}

where

Y_k = H(e^{jω_k}) X_k

In the second case, x is a linear combination of complex sinusoids whose frequencies range over the continuous interval [0, 2π). By the synthesis equation for the inverse DTFT, we have a representation of the filter input as

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω

Each of these sinusoids is scaled by the corresponding value of the frequency response H(e^{jω}) when processed by the filter, and linearity of the filter implies that

y[n] = (1/2π) ∫_0^{2π} H(e^{jω}) X(e^{jω}) e^{jωn} dω

Comparing this expression for y[n] with the one provided by the synthesis equation

y[n] = (1/2π) ∫_0^{2π} Y(e^{jω}) e^{jωn} dω

we conclude that Y(e^{jω}) is, in fact, given by H(e^{jω}) X(e^{jω}). This important result is stated below.

Fact. For a linear filter, the complex spectrum of the output signal is given by the product of the complex spectrum of the input signal and the filter frequency response:

Y(e^{jω}) = H(e^{jω}) X(e^{jω})

If the input spectrum is discrete, then so is the output spectrum, and the same relationship holds with X(e^{jω}) and Y(e^{jω}) replaced by the coefficients in the two spectra. Figure 4.8 illustrates the relationship between input and output spectra.

4.3.3 Amplitude and Phase Response

We return to our basic example

y[n] = x[n] − x[n − 1] + x[n − 2] ,    n ∈ Z

and compute the magnitude and angle of the frequency response

H(e^{jω}) = 1 − e^{−jω} + e^{−j2ω}

Figure 4.8: Input-output relationship in the frequency domain: X(e^{jω}) enters the filter H(e^{jω}), producing Y(e^{jω}) = H(e^{jω})X(e^{jω}).

For the magnitude |H(e^{jω})| (also known as the amplitude response of the filter), we have

|H(e^{jω})|^2 = H*(e^{jω}) H(e^{jω})
             = (1 − e^{−jω} + e^{−j2ω})(1 − e^{jω} + e^{j2ω})
             = 3 − 2(e^{jω} + e^{−jω}) + (e^{j2ω} + e^{−j2ω})
             = 3 − 4 cos ω + 2 cos 2ω

and thus

|H(e^{jω})| = (3 − 4 cos ω + 2 cos 2ω)^{1/2}

Since cos(nπ + θ) = cos(nπ − θ), we see that |H(e^{jω})| is symmetric about ω = π. This is due to the fact that H(e^{jω}) is the DTFT of a real-valued sequence, namely

δ[n] − δ[n − 1] + δ[n − 2]

and as such, it exhibits the same kind of conjugate symmetry as does the DFT of a real-valued vector. The amplitude response achieves its maximum at ω = π, and has two zeros, at ω = π/3 and ω = 5π/3. This implies that:

• among all possible complex sinusoidal inputs, (−1)^n will undergo the most amplification (or least attenuation);

• sinusoidal inputs such as e^{j(π/3)n}, e^{j(5π/3)n} and cos(πn/3 + φ) will all be nulled out, i.e., will result in y[n] = 0 for all n.

The symmetry of the filter coefficients also allows us to obtain |H(e^{jω})| in a more direct way:

H(e^{jω}) = e^{−jω}(e^{jω} − 1 + e^{−jω}) = e^{−jω}(2 cos ω − 1)

Since |e^{−jω}| = 1, we obtain

|H(e^{jω})| = |2 cos ω − 1|

which is equivalent to the earlier expression (as can be shown by using the identity cos 2ω = 2 cos² ω − 1). The angle ∠H(e^{jω}) is known as the phase response of the filter. In this case, we have

∠H(e^{jω}) = −ω + ∠(2 cos ω − 1)

Since 2 cos ω − 1 is a real number, the second term equals 0 (when the number is positive) or π (when the number is negative). We thus have

∠H(e^{jω}) = −ω ,        0 ≤ ω < π/3 ;
∠H(e^{jω}) = −ω + π ,    π/3 ≤ ω ≤ 5π/3 ;
∠H(e^{jω}) = −ω ,        5π/3 < ω < 2π .

The amplitude and phase responses are plotted in Figure 4.9, against cyclic frequency ω/2π. Frequencies and phases are normalized by 2π and π, respectively. Note that the phase response is antisymmetric about ω = π, i.e., ∠H(e^{j(2π−ω)}) = −∠H(e^{jω}), which is also due to the fact that the filter coefficients are real-valued.

Figure 4.9: Amplitude response (left) and phase response in rad/π (right) of the filter y[n] = x[n] − x[n − 1] + x[n − 2], plotted against ω/2π.
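The two expressions for the amplitude response can be checked against each other numerically; the sketch below (Python, not the text's MATLAB) also confirms the nulls at ω = π/3 and 5π/3 and the maximum of 3 at ω = π:

```python
import math

def amp(w):
    # |H(e^{jw})| = (3 - 4 cos w + 2 cos 2w)^(1/2)
    v = 3 - 4 * math.cos(w) + 2 * math.cos(2 * w)
    return math.sqrt(max(v, 0.0))   # guard against tiny negative round-off

# equivalent closed form |2 cos w - 1|; nulls at pi/3 and 5*pi/3; maximum at pi
for w in [0.0, 0.4, 1.0, 2.5, math.pi, 5.0]:
    assert abs(amp(w) - abs(2 * math.cos(w) - 1)) < 1e-12
assert amp(math.pi / 3) < 1e-7 and amp(5 * math.pi / 3) < 1e-7
assert abs(amp(math.pi) - 3.0) < 1e-12
print("amplitude response: maximum 3 at w = pi, nulls at w = pi/3 and 5pi/3")
```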

4.4 Linear Filtering in the Frequency Domain

4.4.1 System Function and its Relationship to Frequency Response

In the previous section, we saw that the response of the linear filter H to the complex sinusoidal input x[n] = e^{jωn}, n ∈ Z, equals

y[n] = H(e^{jω}) e^{jωn} ,    n ∈ Z

In other words, the input and output signal sequences differ by a complex scaling factor which depends on the frequency ω:

y = H(e^{jω}) · x

This factor was called the frequency response of the filter. This scaling (or proportionality) relationship between the input and output sequences can be generalized to a class of signals known as complex exponentials:

x[n] = z^n ,    n ∈ Z

where z is any complex number other than z = 0 (which would give z^n = ∞ for n < 0). Writing z in the standard polar form

z = r e^{jω}

we have

x[n] = r^n e^{jωn}

Thus a complex sinusoid is a special case of a complex exponential where r = 1, i.e., z lies on the unit circle. Also,

|x[n]| = |r^n · e^{jωn}| = r^n

and thus the magnitude of x[n]

• increases geometrically (in n) if |z| > 1;
• decreases geometrically if |z| < 1;
• is constant if |z| = 1.

Figure 4.10: Three choices of z (left) and the corresponding complex exponentials in magnitude (right).

Consider the filter introduced in Section 4.3, namely

y[n] = x[n] − x[n − 1] + x[n − 2] ,    n ∈ Z

The response of the filter to x[n] = z^n, where z is complex, equals

y[n] = z^n − z^{n−1} + z^{n−2}
     = (1 − z^{−1} + z^{−2}) z^n
     = (1 − z^{−1} + z^{−2}) · x[n]

Letting

H(z) = 1 − z^{−1} + z^{−2}

we can restate the above result as follows:

x[n] = z^n , n ∈ Z  ⇒  y[n] = H(z) z^n , n ∈ Z    (4.8)

The function H(z), where z is complex, is known as the system function, or transfer function, of the filter H. The fact established in Subsection 4.3.2, namely that

x[n] = e^{jωn} , n ∈ Z  ⇒  y[n] = H(e^{jω}) e^{jωn} , n ∈ Z

is a special case of (4.8) where z = e^{jω}. The frequency response of the filter is obtained from H(z) by taking z on the unit circle, i.e., z = e^{jω}, as illustrated in Figure 4.10.
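The scaling relationship (4.8) holds for any complex base z, on or off the unit circle. A quick Python check (the base 1.5e^{j0.4} is our arbitrary choice, not from the text):

```python
import cmath

def Hz(z):
    # system function of y[n] = x[n] - x[n-1] + x[n-2]
    return 1 - z**-1 + z**-2

z = 1.5 * cmath.exp(0.4j)   # an arbitrary complex exponential base
for n in range(-2, 6):
    y_n = z**n - z**(n - 1) + z**(n - 2)     # time-domain output for x[n] = z^n
    assert abs(y_n - Hz(z) * z**n) < 1e-9

# on the unit circle, H(z) reduces to the frequency response H(e^{jw})
w = 0.7
assert abs(Hz(cmath.exp(1j * w)) - (1 - cmath.exp(-1j * w) + cmath.exp(-2j * w))) < 1e-12
print("y[n] = H(z) z^n verified; H(e^{jw}) is H(z) on the unit circle")
```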

4.4.2 Cascaded Filters

The cascade (also referred to as series or tandem) connection of two filters H1 and H2 is illustrated in Figure 4.11. The output of H1 serves as the input to H2. In Chapter 2, we showed that the cascade (A ◦ B)(·) of two linear transformations A(·) and B(·) is represented by the matrix product AB (where A and B are the matrices corresponding to A(·) and B(·), respectively). A linear filter is a special type of linear transformation which is invariant in time. The cascade of two such filters is also a linear filter H, whose system function H(z) bears a particularly simple relationship to the system functions H1(z) and H2(z) of the constituent filters.

Figure 4.11: Cascade connection of two filters: x = x(1) enters H1, whose output y(1) = x(2) enters H2, whose output is y(2) = y.

From (4.8), we know that H(z) is the scaling factor between the input and output sequences x and y when the input is given by x[n] = z^n. At each time instant, x is also the input to the first filter H1, which results in an output

y(1)[n] = H1(z) z^n

This scaled complex exponential is the input to the second filter H2, and thus

y[n] = y(2)[n] = H1(z) H2(z) z^n

We have established the following.

Fact. The cascade connection H of two filters H1 and H2 has system function given by

H(z) = H1(z) H2(z)

Similarly, the frequency response of H is given by

H(e^{jω}) = H1(e^{jω}) H2(e^{jω})

Note that the order in which the two filters are cascaded is immaterial: the same system function is obtained if H2 is followed by H1.

Example 4.4.1. Consider a cascade of two identical filters, whose input-output relationship is given by (4.6). Thus

y(1)[n] = x[n] − x[n − 1] + x[n − 2]
y[n] = y(1)[n] − y(1)[n − 1] + y(1)[n − 2]

In this case,

H1(z) = H2(z) = 1 − z^{−1} + z^{−2}

and thus the system function of the cascade is given by

H(z) = H1(z) H2(z) = 1 − 2z^{−1} + 3z^{−2} − 2z^{−3} + z^{−4}

Similarly,

H(e^{jω}) = H1(e^{jω}) H2(e^{jω})
          = 1 − 2e^{−jω} + 3e^{−j2ω} − 2e^{−j3ω} + e^{−j4ω}
          = e^{−j2ω}(3 − 4 cos ω + 2 cos 2ω)
          = e^{−j2ω}(1 − 2 cos ω)^2

4.4.3 The General Finite Impulse Response Filter

We introduced linear filters by considering a particular example where the filter output at any particular instant is formed by linearly combining the three most recent values of the input. There is nothing special about choosing three input samples: with coefficients which are constant in time, we can draw the same conclusions about the filter given by the equation

y[n] = b0 x[n] + b1 x[n − 1] + · · · + bM x[n − M ] ,    n ∈ Z    (4.9)

where the M + 1 most recent values of the input are combined to form the output at any given time.

Definition 4.4.1. A filter defined by (4.9) is known as a finite impulse response (FIR) filter of order M.

The qualifier finite in the above definition reflects the fact that a finite number of input samples are combined to form a particular output sample. The term impulse response will be defined shortly.
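Since cascading multiplies system functions, the coefficient vector of the cascade is the convolution of the constituent coefficient vectors (polynomial multiplication in z^{−1}). A small Python check of Example 4.4.1 (poly_mult is our helper name, defined here for illustration):

```python
def poly_mult(a, b):
    """Multiply two polynomials given by coefficient lists (lowest power first).
    Cascading FIR filters multiplies their system functions, so this also
    computes the coefficient vector of a cascade."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# two identical filters with coefficients [1, -1, 1], as in Example 4.4.1
print(poly_mult([1, -1, 1], [1, -1, 1]))   # -> [1, -2, 3, -2, 1]
```

The result matches H(z) = 1 − 2z^{−1} + 3z^{−2} − 2z^{−3} + z^{−4} above.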

The system function of the FIR filter defined by (4.9) is given by

H(z) = b0 + b1 z^{−1} + · · · + bM z^{−M}

and the frequency response is obtained by letting z = e^{jω}:

H(e^{jω}) = b0 + b1 e^{−jω} + · · · + bM e^{−jωM}

Comparing the above equation for H(e^{jω}) with the definition of the DTFT, we see that H(e^{jω}) is, in fact, the DTFT of the finite-duration sequence h defined by

h[n] = bn ,  0 ≤ n ≤ M ;    h[n] = 0 ,  otherwise.

The sequence h is known as the impulse response of the filter. As we shall see soon, it is the response of the filter to a unit impulse, i.e., to the signal x[n] = δ[n]. The coefficient vector b and the impulse response h for the example of Section 4.3 are shown in Figure 4.12.

Figure 4.12: The coefficient vector (left) and the impulse response sequence (right) of the FIR filter y[n] = x[n] − x[n − 1] + x[n − 2].

The relationship between b and the frequency response H(e^{jω}) suggests a convenient way of computing H(e^{jω}) based on the DFT of b. To evaluate H(e^{jω}) at N ≥ M + 1 evenly spaced frequencies in [0, 2π), all we need to do is compute the DFT of the (M + 1)-point vector b zero-padded to length N. Thus for the filter considered in Section 4.3, the frequency response at N = 256 evenly spaced frequencies can be computed in MATLAB using the command

H = fft([1,-1,1], 256)

The amplitude and phase responses in Figure 4.9 in the previous section are the graphs of abs(H) and angle(H), respectively.
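The same zero-padding computation can be spelled out without library FFT's. The sketch below uses a direct-form DFT in Python (N = 8 rather than the text's 256, to keep the loop small) and checks that each DFT entry equals H(e^{jω}) at ω = 2πk/N:

```python
import cmath

def dft(v):
    # direct-form DFT: what MATLAB's fft computes, up to speed
    N = len(v)
    return [sum(v[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 8                                    # small N for illustration
Hk = dft([1, -1, 1] + [0] * (N - 3))     # DFT of the zero-padded coefficient vector
for k in range(N):
    w = 2 * cmath.pi * k / N
    assert abs(Hk[k] - (1 - cmath.exp(-1j * w) + cmath.exp(-2j * w))) < 1e-9
print("DFT of zero-padded b samples H(e^{jw}) at w = 2*pi*k/N")
```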

4.4.4 Frequency Domain Approach to Computing Filter Response

The input-output relationship

y[n] = b0 x[n] + b1 x[n − 1] + · · · + bM x[n − M ]

provides us with a direct way of implementing an FIR filter. At every instant n, a new value of the input x[n] is read into a buffer, an old value (namely x[n − M − 1]) is shifted out of the same buffer, and the output y[n] is computed using M + 1 multiplications and M additions. In applications where a single processor is used to perform many signal processing tasks in parallel, computational efficiency is a key consideration. It therefore pays to examine how a simple component such as an FIR filter (given by the above equation) can be implemented efficiently.

In this subsection, we provide a partial answer to this question for two special types of inputs, both of which are readily expressed in terms of complex exponentials or complex sinusoids. A third class of such input signals will be considered in the following subsection.

The first type of input we consider is a real-valued sinusoid:

x[n] = A cos(ω0 n + φ) ,    n ∈ Z

Using the identity e^{jθ} + e^{−jθ} = 2 cos θ, we can write the signal in terms of two complex sinusoids as

x[n] = (A/2) e^{jφ} e^{jω0 n} + (A/2) e^{−jφ} e^{−jω0 n}

(In other words, x has a discrete spectrum with lines at frequencies ω0 and 2π − ω0.) By linearity of the filter, the output is given by

y[n] = (A/2) e^{jφ} H(e^{jω0}) e^{jω0 n} + (A/2) e^{−jφ} H(e^{−jω0}) e^{−jω0 n}

At this point, we note that

H(e^{−jω}) = b0 + b1 e^{jω} + · · · + bM e^{jωM}
           = (b0 + b1 e^{−jω} + · · · + bM e^{−jωM})*
           = H*(e^{jω})

where, for the last equality, we also used the fact that the filter coefficients bi are real. We thus have

y[n] = (A/2) e^{jφ} H(e^{jω0}) e^{jω0 n} + (A/2) e^{−jφ} H*(e^{jω0}) e^{−jω0 n}

The two terms on the right-hand side are complex conjugates of each other, and thus the identity z + z* = 2 Re{z} gives

y[n] = Re{ A H(e^{jω0}) e^{j(ω0 n + φ)} }

An alternative expression can be written using the amplitude and phase responses |H(e^{jω})| and ∠H(e^{jω}). Since

H(e^{jω}) = |H(e^{jω})| e^{j∠H(e^{jω})}

we have

y[n] = Re{ A |H(e^{jω0})| e^{j(ω0 n + φ + ∠H(e^{jω0}))} }
     = A |H(e^{jω0})| cos(ω0 n + φ + ∠H(e^{jω0}))

Comparing the resulting expression with x[n] = A cos(ω0 n + φ), we conclude the following.

Fact. The response of a linear filter to a real-valued sinusoid of frequency ω0 is a sinusoid of the same frequency. The gain (ratio of output to input amplitude) is given by the filter amplitude response at ω0, while the phase shift (of the output relative to the input) is given by the phase response at the same frequency.

Example 4.4.2. Let

x[n] = cos(2πn/3 − π/4) ,    n ∈ Z

be the input to the second-order FIR filter introduced in Section 4.3,

y[n] = x[n] − x[n − 1] + x[n − 2] ,    n ∈ Z

Using the fact established above, we have

H(e^{j(2π/3)}) = e^{−j(2π/3)}(2 cos(2π/3) − 1) = −2e^{−j(2π/3)} = 2e^{j(π/3)}

and thus |H(e^{j(2π/3)})| = 2 and ∠H(e^{j(2π/3)}) = π/3. We obtain

y[n] = 2 cos(2πn/3 − π/4 + π/3)
     = 2 cos(2πn/3 + π/12) ,    n ∈ Z
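Example 4.4.2 can be verified sample by sample: applying the difference equation directly to the real sinusoid must reproduce the predicted gain of 2 and phase shift of π/3. A Python sketch:

```python
import math

w0 = 2 * math.pi / 3
x = lambda n: math.cos(w0 * n - math.pi / 4)     # input of Example 4.4.2
for n in range(-5, 10):
    y_n = x(n) - x(n - 1) + x(n - 2)             # time-domain filter output
    assert abs(y_n - 2 * math.cos(w0 * n + math.pi / 12)) < 1e-12
print("y[n] = 2 cos(2*pi*n/3 + pi/12): gain 2, phase shift pi/3 confirmed")
```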

The foregoing analysis can be also extended to the real exponential

x[n] = A r^n ,    n ∈ Z

and the real oscillating exponential

x[n] = A r^n cos(ω0 n + φ) ,    n ∈ Z

where A and r are real-valued. In each case, the output can be computed in terms of the system function H(z) as follows:

x[n] = A r^n  ⇒  y[n] = A H(r) r^n
x[n] = A r^n cos(ω0 n + φ)  ⇒  y[n] = A |H(re^{jω0})| r^n cos(ω0 n + φ + ∠H(re^{jω0}))

Example 4.4.3. Consider the same filter as in Section 4.3, with two different input sequences:

x(1)[n] = 3^n
x(2)[n] = 3^n cos(2πn/3)

The filter system function is given by

H(z) = 1 − z^{−1} + z^{−2}

and thus

H(3) = 7/9 ,    H(3e^{j(2π/3)}) = 2.793 · e^{j1.326}

It follows that

y(1)[n] = (7/9) · 3^n
y(2)[n] = 2.793 · 3^n · cos(2πn/3 + 1.326)
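The first result in Example 4.4.3 is easy to confirm numerically: feeding 3^n through the difference equation scales every sample by H(3) = 7/9. A Python sketch (we verify only the real-exponential case here; the oscillating case follows the same pattern):

```python
r = 3.0
Hr = 1 - 1 / r + 1 / r**2      # H(3) = 1 - 1/3 + 1/9 = 7/9
for n in range(0, 8):
    y_n = r**n - r**(n - 1) + r**(n - 2)   # filter output for x[n] = 3^n
    assert abs(y_n - Hr * r**n) < 1e-9 * r**n   # relative tolerance, values grow fast
print("x[n] = 3^n produces y[n] = (7/9) * 3^n")
```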

Before we explain how H(e^{jω}) can be used to compute y, we note that

y[n + L] = b0 x[n + L] + b1 x[n + L − 1] + · · · + bM x[n + L − M ]
         = b0 x[n] + b1 x[n − 1] + · · · + bM x[n − M ]
         = y[n]

and thus the response is also periodic with period L. It therefore suffices to compute y[0 : L − 1], which will be referred to as the first period of y.

The frequency domain approach to computing y[0 : L − 1] exploits the fact that the periodic input signal x is the sum of L sinusoids at frequencies which are multiples of 2π/L. As was discussed in Subsection 4.2, if the first period x[0 : L − 1] of x has DFT X[·], then the equation

x[n] = (1/L) Σ_{k=0}^{L−1} X[k] e^{j(2π/L)kn}

holds for every n, i.e., it describes the entire sequence x. By linearity of the filter, the output sequence y is given by

y[n] = (1/L) Σ_{k=0}^{L−1} H(e^{j(2π/L)k}) X[k] e^{j(2π/L)kn}

Since y is also periodic with period L, we have (similarly to x)

y[n] = (1/L) Σ_{k=0}^{L−1} Y[k] e^{j(2π/L)kn}

where Y[·] is the DFT of y[0 : L − 1]. Comparing the two expressions for y[n], we see that

Y[k] = H(e^{j(2π/L)k}) X[k]

In other words, the DFT of y[0 : L − 1] is obtained by multiplying each entry of the DFT of x[0 : L − 1] by the frequency response of the filter at the corresponding Fourier frequency. This observation leads to the following algorithm for computing y[0 : L − 1] given x[0 : L − 1] and the coefficient vector b:

• Compute the DFT of x[0 : L − 1].
• Compute H(e^{jω}) at the same Fourier frequencies, e.g., using the DFT of b after zero-padding to a multiple of L.

• Multiply the two vectors element-by-element.
• Invert the DFT of the resulting vector to obtain y[0 : L − 1].

By zero-padding the coefficient vector b to length L and taking its DFT, we obtain precisely those values of the frequency response H(e^{jω}) which are needed in order to compute the product in (4.7). Recalling that element-wise multiplication of two DFT's of the same length is equivalent to circular convolution in the time domain, we see that the first period of the output signal is, in effect, computed by circularly convolving the first period of the input signal with the zero-padded (to length L) coefficient vector. This fact can be established independently by examining the time-domain expression

y[n] = Σ_{k=0}^{M} bk x[n − k]

in the special case where x is periodic with period L. For this reason, circular convolution is also known as periodic convolution.

Example 4.4.4. Consider the second-order filter introduced in Section 4.3, with coefficient vector b = [1 −1 1]^T, and suppose that the input sequence x is periodic with period L = 7, such that

x[0 : 6] = [2 5 3 4 −7 −4 −1]^T

The first period of the output sequence is computed in MATLAB as follows:

x = [2 5 3 4 -7 -4 -1].';
b = [1 -1 1].';          % filter coefficient vector
X = fft(x);
H = fft(b,7);
Y = H.*X;
y = ifft(Y)

resulting in

y[0 : 6] = [−1 2 0 6 −8 7 −4]^T

In the example above, the coefficient vector was shorter than the first input period, i.e., M + 1 < L. If the length of the coefficient vector is greater than the input period L, then the algorithm outlined in Example 4.4.4 needs to be modified—otherwise the command H = fft(b,L) would lead to truncation of b, resulting in an error. Assuming that M + 1 satisfies (J − 1)L < M + 1 ≤ JL for some integer J > 1, two possible modifications are:
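Example 4.4.4 can be reproduced without MATLAB. The Python sketch below implements the four-step algorithm with direct-form DFT/IDFT loops (slow, but enough for L = 7) and recovers the same first period of the output:

```python
import cmath

def dft(v):
    N = len(v)
    return [sum(v[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(V):
    N = len(V)
    return [sum(V[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [2, 5, 3, 4, -7, -4, -1]     # one period of the input, L = 7
b = [1, -1, 1] + [0] * 4         # coefficient vector zero-padded to length L
Y = [Hk * Xk for Hk, Xk in zip(dft(b), dft(x))]   # Y[k] = H(e^{j(2pi/L)k}) X[k]
y = [round(v.real) for v in idft(Y)]              # entries are integers up to round-off
print(y)   # -> [-1, 2, 0, 6, -8, 7, -4]
```

As the text notes, this is exactly a circular convolution of x[0 : 6] with the zero-padded b.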

• Zero-pad b to length JL, then extract every Jth entry of the resulting DFT.
• Or, use the first J periods of x (in conjunction with zero-padding b to length JL), in which case the answer will consist of the first J periods of y. This approach is also equivalent to a circular convolution in the time domain (with vectors of length JL).

4.5 Frequency-Selective Filters

4.5.1 Main Filter Types

We motivated our discussion of linear filters using the concept of frequency selection. A frequency-selective filter reconstructs at its output certain desirable sinusoidal components present in the input signal, while attenuating or eliminating others. The range of frequencies which are preserved is known as the filter passband, while those that are rejected are referred to as the stopband.

For filters with real-valued coefficients operating in discrete time, it is customary to specify frequency bands as subintervals of [0, π], which is also the effective range of frequencies for real-valued sinusoids. This is because the filter frequency response H(e^{jω}) has conjugate symmetry about ω = π:

H(e^{j(2π−ω)}) = H(e^{−jω}) = H*(e^{jω})    (4.10)

(This is equivalent to symmetry in the amplitude response and antisymmetry in the phase response.) Equation (4.10) also implies that H(e^{jω}) has conjugate symmetry about ω = 0, as well. Thus it suffices to specify H(e^{jω}) over [0, π]: any frequency band (i.e., interval) in [0, π] has a symmetric band in [π, 2π], and a symmetric band in [−π, 0]. This is illustrated in Figure 4.13.

Figure 4.13: A frequency band in [0, π] (dark) and its symmetric bands (light), shown on the unit circle (left) and the frequency axis (right).

The four main types of frequency-selective filters are: lowpass, highpass, bandpass and bandstop. The frequency characteristics (i.e., passband and stopband) of each type are illustrated in Figure 4.14.

Figure 4.14: Passbands and stopbands of frequency-selective filters over [0, π]: lowpass (PASS, then STOP), highpass (STOP, then PASS), bandpass (STOP–PASS–STOP) and bandstop (PASS–STOP–PASS).

4.5.2 Ideal Filters

A (zero-delay) ideal filter has the following properties:

• the passband and stopband are complements of each other (with respect to [0, π]);
• the frequency response is constant over the passband; and
• the frequency response is zero over the stopband.

We illustrate some general properties of ideal filters by considering the ideal lowpass filter. The filter frequency response is shown in Figure 4.15, plotted over a full cycle (−π, π]. The edge of the passband is known as the cutoff frequency ωc.

Figure 4.15: Frequency response of an ideal lowpass filter: H(e^{jω}) = A for |ω| ≤ ωc, and 0 for ωc < |ω| ≤ π.

The impulse response h of the ideal lowpass filter is given by the inverse DTFT of its frequency response H(e^{jω}). We thus have

h[n] = (1/2π) ∫_0^{2π} H(e^{jω}) e^{jωn} dω
     = (1/2π) ∫_{−π}^{π} H(e^{jω}) e^{jωn} dω
     = (A/2π) ∫_{−ωc}^{ωc} e^{jωn} dω
     = (A/2π) · [e^{jωn}/(jn)] evaluated from −ωc to ωc
     = (A/2π) · (e^{jωc n} − e^{−jωc n})/(jn)
     = A sin(nωc)/(πn)

The impulse response h is plotted in Figure 4.16 for the case ωc = π/4 (with A = 1). In short, h is a sequence of infinite duration, extending from n = −∞ to n = +∞. By examining the input-output relationship

y[n] = Σ_{k=−∞}^{∞} h[k] x[n − k]

we see that infinitely many future values of the input signal x are needed in order to compute the output at any time n. This is clearly infeasible in practice (i.e., the filter is "irrecoverably" noncausal). For any cutoff frequency ωc < π, filters with perfectly flat response over the passband, zero response over the stopband, and infinitely steep edges at the cutoff frequencies are not practically realizable. Practical lowpass filters have frequency responses that necessarily deviate from the ideal form shown in Figure 4.15. Similar conclusions can be drawn for other ideal frequency-selective filters (highpass, bandpass and bandstop).

4.5.3 Amplitude Response of Practical Filters

Figure 4.17 shows the main features of the amplitude response |H(e^{jω})| of a lowpass filter used in practice.

Figure 4.16: Impulse response h[n] of an ideal lowpass filter with cutoff frequency ωc = π/4, plotted for n = −15 to n = 15.

Figure 4.17: Amplitude response of a practical lowpass filter, showing the passband levels A(1 − δ) and A(1 + δ), the stopband level Aε, and the band edges ωp and ωs.

• The passband is the interval of frequencies over which the amplitude response takes high values. These values range from A(1 − δ) to A(1 + δ), as shown in Figure 4.17, and the fluctuation in value is known as (passband) ripple. The edge (endpoint) of the passband ωp is the highest frequency for which |H(e^{jω})| = A(1 − δ).

• The stopband is the interval of frequencies over which the amplitude response takes low values; these range from 0 to Aε in Figure 4.17. Again, the fluctuation in the value of |H(e^{jω})| is referred to as (stopband) ripple. The edge (endpoint) of the stopband ωs is the lowest frequency for which |H(e^{jω})| = Aε.

• The transition band is the interval (ωp, ωs) separating the passband from the stopband. (Note that an ideal filter has δ = 0, ε = 0, and an empty transition band, i.e., ωp = ωs = ωc.)

The ratio

(midpoint of amplitude range over passband) / (maximum of amplitude over stopband) = A / (Aε) = 1/ε

is known as the stopband attenuation. The parameters of other filter types (highpass, bandpass and bandstop) are defined similarly.

4.5.4 Practical FIR Filters

We began our discussion of FIR filters in Section 4.3 by examining the second-order filter

y[n] = x[n] − x[n − 1] + x[n − 2] ,    n ∈ Z

whose frequency response

H(e^{jω}) = e^{−jω}(2 cos ω − 1)

crudely approximates that of a bandstop filter. To obtain filters with good frequency responses, it is necessary to combine more values of the input signal, i.e., use a higher filter order M. There are several design techniques for frequency-selective FIR filters, the details of which are beyond the scope of our discussion. In the case where M is even-valued, the resulting impulse response sequence

h[n] = bn ,  0 ≤ n ≤ M ;    h[n] = 0 ,  otherwise

resembles the impulse response sequence of an ideal filter over the time interval between n = −M/2 and n = M/2, delayed in time by M/2 units. Odd values of M are also used in practice.
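One simple way to see the "delayed ideal response" shape is to build an order-M coefficient vector by delaying and truncating the ideal lowpass impulse response. The Python sketch below does exactly that (this truncation construction is a common illustration, not necessarily one of the design techniques the text alludes to; the order M = 16 and cutoff ωc = π/4 are our choices):

```python
import math

wc = math.pi / 4   # cutoff frequency, as in Figure 4.16
A = 1.0

def h_ideal(n):
    # impulse response of the ideal lowpass filter: A sin(n*wc)/(pi*n)
    if n == 0:
        return A * wc / math.pi    # limiting value at n = 0
    return A * math.sin(n * wc) / (math.pi * n)

M = 16   # even filter order (illustrative choice)
b = [h_ideal(n - M // 2) for n in range(M + 1)]   # delay by M/2, keep 0..M

assert abs(h_ideal(0) - 0.25) < 1e-12             # wc/pi = 1/4
assert all(abs(b[n] - b[M - n]) < 1e-12 for n in range(M + 1))   # b[n] = b[M-n]
print("truncated ideal response is symmetric about n = M/2")
```

The resulting coefficients are symmetric about n = M/2, the property exploited below.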

The coefficient vectors of FIR filters used for frequency selection typically have symmetry about n = M/2:

bn = bM−n    (4.11)

(This feature was also present in the second-order filter introduced in Section 4.3.) As a result, their impulse response sequences are also symmetric about n = M/2:

h[n] = h[M − n]

Figure 4.18 shows the impulse response of a lowpass filter of order M = 17, which is symmetric about n = 17/2.

Figure 4.18: Impulse response of a lowpass filter of order M = 17.

The frequency response of an FIR filter satisfying (4.11) can always be written in the form

H(e^{jω}) = e^{−j(Mω/2)} F(ω)    (4.12)

where F(ω) is a real-valued function expressible as a sum of cosines. We illustrate this property using two examples.

Example 4.5.1. Consider the FIR filter with coefficient vector

b = [1 −2 3 −2 1]^T

Here M = 4 (even). The filter frequency response is given by

H(e^{jω}) = 1 − 2e^{−jω} + 3e^{−j2ω} − 2e^{−j3ω} + e^{−j4ω}
          = e^{−j2ω}(e^{j2ω} − 2e^{jω} + 3 − 2e^{−jω} + e^{−j2ω})
          = e^{−j2ω}(3 − 2(e^{jω} + e^{−jω}) + (e^{j2ω} + e^{−j2ω}))
          = e^{−j2ω}(3 − 4 cos ω + 2 cos 2ω)

and thus

F(ω) = 3 − 4 cos ω + 2 cos 2ω

Example 4.5.2. In this case, M = 5 (odd). The coefficient vector is given by

b = [1 −1 2 2 −1 1]^T

and the filter frequency response is given by

H(e^{jω}) = 1 − e^{−jω} + 2e^{−j2ω} + 2e^{−j3ω} − e^{−j4ω} + e^{−j5ω}
          = e^{−j(5ω/2)}(e^{j(5ω/2)} − e^{j(3ω/2)} + 2e^{j(ω/2)} + 2e^{−j(ω/2)} − e^{−j(3ω/2)} + e^{−j(5ω/2)})
          = e^{−j(5ω/2)}(4 cos(ω/2) − 2 cos(3ω/2) + 2 cos(5ω/2))

and thus

F(ω) = 4 cos(ω/2) − 2 cos(3ω/2) + 2 cos(5ω/2)

Equation (4.12) allows us to express the amplitude and phase responses as

|H(e^{jω})| = |F(ω)|

and

∠H(e^{jω}) = −Mω/2 ,        where F(ω) ≥ 0 ;
∠H(e^{jω}) = −Mω/2 + π ,    where F(ω) < 0 .

Note that the phase response is a linear function of ω, with jumps of π occurring wherever F(ω) changes sign. Since F(ω) has no zeros over the passband, no such discontinuities occur there; thus over the passband, the phase response is exactly linear. By the time delay property of the DTFT (Subsection 4.2.3), adding a linear function of frequency to the phase spectrum is equivalent to a delay in the time domain. Thus all frequency components of interest (i.e., those in the passband) are reconstructed at the output with the same delay, equal to M/2 time units. This is a very appealing property of FIR filters with symmetric coefficients.

The other important class of filters (known as IIR, or infinite impulse response) exhibits a nonlinear phase response, resulting in sinusoidal components reappearing at the output with variable delays. This effect, known as phase distortion, can severely change the shape of a pulse and can be particularly noticeable in images.
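The linear-phase factorization (4.12) is straightforward to verify numerically for Example 4.5.1; the Python sketch below checks that H(e^{jω}) equals e^{−jMω/2} times the real-valued F(ω) at several frequencies:

```python
import cmath
import math

b = [1, -2, 3, -2, 1]   # Example 4.5.1: symmetric, b[n] = b[M-n], M = 4
M = len(b) - 1
for w in [0.3, 1.0, 2.0, 3.0]:
    Hw = sum(bk * cmath.exp(-1j * w * k) for k, bk in enumerate(b))
    F = 3 - 4 * math.cos(w) + 2 * math.cos(2 * w)    # real-valued F(w)
    assert abs(Hw - cmath.exp(-1j * M * w / 2) * F) < 1e-12
print("H(e^{jw}) = e^{-jMw/2} F(w) with F(w) real, as in (4.12)")
```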

4.5.5 Filter Transformations

Multiplication of a sequence by a complex sinusoid in the time domain results in a shift of its DTFT in frequency.

Fact. (Multiplication by a Complex Sinusoid in Time Domain) If y[n] = e^{jω0 n} x[n], then Y(e^{jω}) = X(e^{j(ω−ω0)}).

This follows from the analysis equation:

Y(e^{jω}) = Σ_{n=−∞}^{∞} y[n] e^{−jnω}
          = Σ_{n=−∞}^{∞} x[n] e^{jω0 n} e^{−jnω}
          = Σ_{n=−∞}^{∞} x[n] e^{−jn(ω−ω0)}
          = X(e^{j(ω−ω0)})

This property, applied to the impulse response sequence h and the frequency response H(e^{jω}), allows us to make certain straightforward transformations between frequency-selective filters of different types (lowpass, highpass, bandpass and bandstop). For example:

• Multiplication of h[n] by e^{jπn} = (−1)^n results in H(e^{jω}) being shifted in frequency by π radians—this transforms a lowpass filter into a highpass one, and vice versa.

• Multiplication of h[n] by

cos(ω0 n) = (e^{jω0 n} + e^{−jω0 n})/2

produces a new frequency response equal to

(H(e^{j(ω−ω0)}) + H(e^{j(ω+ω0)}))/2

If the original filter is lowpass with passband and stopband edges at ω = ωp and ω = ωs, respectively, then the above transformation will produce a bandpass filter provided ωs < ω0 < π − ωs. The center of the passband will be at ω = ω0 and its width will be about 2ωp.

These transformations are illustrated in Figure 4.19.
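The lowpass-to-highpass transformation is easy to demonstrate on a toy filter. The Python sketch below uses a 2-tap averager (our own toy example, not one from the text), modulates it by (−1)^n, and checks that the new response is the old one shifted by π, so low and high frequencies trade places:

```python
import cmath

def freq_resp(b, w):
    # H(e^{jw}) = sum_k b[k] e^{-jwk} for an FIR coefficient list b
    return sum(bk * cmath.exp(-1j * w * k) for k, bk in enumerate(b))

h = [0.5, 0.5]                                      # simple 2-tap lowpass averager
g = [((-1) ** n) * hn for n, hn in enumerate(h)]    # multiply h[n] by (-1)^n
for w in [0.0, 0.5, 1.5, 2.5, 3.0]:
    # new response is the old one shifted by pi: G(e^{jw}) = H(e^{j(w-pi)})
    assert abs(freq_resp(g, w) - freq_resp(h, w - cmath.pi)) < 1e-12

assert abs(freq_resp(h, 0.0)) > abs(freq_resp(h, 3.0))   # h passes low frequencies
assert abs(freq_resp(g, 3.0)) > abs(freq_resp(g, 0.0))   # g passes high frequencies
print("(-1)^n modulation turned the lowpass averager into a highpass filter")
```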

Figure 4.19: Multiplication of the impulse response by a sinusoid results in filter type transformation: (i) the original lowpass response, of height A; (ii) lowpass to highpass, via (−1)^n; (iii) lowpass to bandpass, of height A/2, with passbands centered at ±ω0.

4.6 Time Domain Computation of Filter Response

4.6.1 Impulse Response

The input-output relationship

y[n] = b0 x[n] + b1 x[n − 1] + · · · + bM x[n − M ]    (4.13)

of a finite impulse response filter allows us to directly compute the response y of the filter to an arbitrary input sequence x. We have seen that for certain classes of input signals (notably sinusoids, exponentials and periodic signals), the output sequence can be computed efficiently using frequency-domain properties of the filter such as the system function H(z) and the frequency response H(e^{jω}). The frequency domain-based approach is generally recommended for input signals whose spectra are discrete, or else have "nice" closed forms which enable us to invert the equation Y(e^{jω}) = H(e^{jω})X(e^{jω}) with ease. In most practical applications, the signals involved are not as structured, and it becomes necessary to use the equation (4.13) directly.

In Section 4.3, we described the operation of the FIR filter defined by (4.13) as follows. At each time n, the time-reverse of the vector b is aligned with a window of the input signal spanning time indices n − M through n. The products bk x[n − k] are computed, then added together to produce the output sample y[n]. The time-reverse of b is then shifted to the right by one time index, a new window is formed, and the computation is repeated to yield y[n + 1]. This procedure is illustrated graphically in Figure 4.20 (i).

In our discussion of the FIR frequency response and system function, we defined the so-called impulse response of the filter as the sequence h formed by padding the vector b with infinitely many zeros on both sides. To see why that sequence was named so, consider an input signal consisting of a unit impulse at time zero:

x[n] = δ[n]

This signal is plotted in the top graph of Figure 4.20 (ii). If n < 0, then indices n − M through n are all (strictly) negative, and thus x[n − M] = . . . = x[n] = 0. This implies that

y[n] = b0 x[n] + b1 x[n − 1] + · · · + bM x[n − M ] = 0

[Figure 4.20 shows (i) the time-reversed coefficients bM , . . . , b2, b1, b0 sliding over the input signal and feeding a summation Σ that produces y[n]; and (ii) the unit impulse x[n] = δ[n] on top, with the resulting output samples b0, b1, b2, b3, b4, . . . below.]

Figure 4.20: (i) Filter response at time n formed by linear combination of input signal values in a sliding time window. (ii) Filter response to a unit impulse.

A zero value for y[n] is also obtained for n > M, in which case indices n − M through n are all (strictly) positive and the corresponding values of the input signal are all zero. For 0 ≤ n ≤ M, the window of input samples which are aligned with the (time-reversed) vector b includes the only nonzero value of the input, namely x[0] = 1. Then

y[n] = bn x[0] = bn ,

as illustrated in the bottom graph of Figure 4.20 (ii). Thus the response of the filter to a unit impulse is given by

h[n] = bn for 0 ≤ n ≤ M, and h[n] = 0 otherwise.    (4.14)

Clearly, the designation finite impulse response (FIR) is appropriate for this filter, since the duration of the impulse response sequence is finite (it begins at time 0 and ends at time M).
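As a numerical check of (4.14), the sliding-window relationship (4.13) can be applied to a unit impulse directly. The sketch below is plain Python rather than the book's MATLAB, and fir_filter is a hypothetical helper name introduced only for this illustration:

```python
def fir_filter(b, x):
    """Compute y[n] = b0 x[n] + ... + bM x[n - M] for a finite input list x."""
    M = len(b) - 1
    L = len(x)
    y = []
    for n in range(L + M):            # the response lasts L + M samples
        window = [x[n - k] if 0 <= n - k < L else 0 for k in range(M + 1)]
        y.append(sum(bk * xk for bk, xk in zip(b, window)))
    return y

b = [2, -1, 3]                        # an arbitrary coefficient vector (M = 2)
delta = [1, 0, 0, 0, 0]               # unit impulse at time zero
h = fir_filter(b, delta)              # h[n] = b_n for 0 <= n <= M, 0 after
```

The output list begins with the entries of b and is zero thereafter, in agreement with (4.14).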

e.15) where the conversion to an inﬁnite sum was possible because h[k] = 0 for k < 0 and k > M .. the inﬁnite sum in (4. The concept of convolution was encountered earlier in its circular form.1.258 4.6. which involved two vectors of the same (ﬁnite) length. In the meantime. ∞ ∞ h[k]x[n − k] = k=−∞ k=−∞ x[k]h[n − k] .2 Convolution and Linearity The deﬁnition of h in (4. Although only a ﬁnite number (i. At a given time n. we note that linear convolution is symmetric: h∗x=x∗h or equivalently.13) as follows: M y[n] = k=0 M bk x[n − k] h[k]x[n − k] k=0 ∞ = = h[k]x[n − k] k=−∞ (4.14) allows us to rewrite the input-output relationship (4. The relationship between linear and circular convolution will be explored in the next section. the output is formed by summing together all products of the form h[k]x[n − k] (note that the time indices in the two arguments always add up to n). The (linear) convolution of sequences h and x is the sequence y denoted by y =h∗x and deﬁned by y[n] = k=−∞ ∞ h[k]x[n − k] In the above sum. k is a (dummy) variable of summation. (Such ﬁlters are beyond the scope of the present discussion. M + 1) of summands are involved here.15) gives us a general form for the input-output relationship of a linear ﬁlter which would also hold if the impulse response had inﬁnite duration.) Deﬁnition 4.6.

This can be shown using a change of variable from k to k′ = n − k in (4.15). As k ranges over all integers (for fixed n), so does k′ (in reverse order), and thus

∑_{k=−∞}^{∞} h[k]x[n − k] = ∑_{k′=−∞}^{∞} h[n − k′]x[k′] = ∑_{k=−∞}^{∞} x[k]h[n − k] .

The last expression for the convolution sum, namely

y[n] = ∑_{k=−∞}^{∞} x[k]h[n − k] ,    (4.16)

can also be obtained directly from the linearity of the filter. We saw earlier that the response of the FIR filter to the input x[n] = δ[n] is given by h[n], defined in (4.14) above. By a similar argument, the response to a delayed impulse

x[n] = δ[n − k] ,  n ∈ Z ,

is given by h[n] delayed by k time instants, which is the signal h[n − k]. We can express any input x as a linear combination of delayed impulses:

x[n] = ∑_{k=−∞}^{∞} x[k]δ[n − k] .

By linearity of the filter, the response y to input x will be the linear combination of the responses to each of the delayed impulses in the sum, namely

y[n] = ∑_{k=−∞}^{∞} x[k]h[n − k] .

This is the same equation as (4.16) above.

4.6.3 Computation of the Convolution Sum

We now illustrate the computation of (4.15) in its simplest and most frequently encountered form, where both signal sequences h and x have finite duration. We will assume that x[n] = 0 for n < 0 or n ≥ L, and similarly, h[n] = 0 for n < 0 or n ≥ P (thus the order of the FIR filter equals M = P − 1). As we noted previously, in order to compute y[n], we need to form the products h[k]x[n − k] (where k ranges over all time indices), then sum them together. The easiest way to form x[n − k] is to reverse x in time,

x̃[k] = x[−k] ,

then delay x̃ by n time instants:

x̃[k − n] = x[n − k] .

This procedure is depicted in Figure 4.21, where x[n − k] is plotted (as a function of k) for different values of n.

[Figure 4.21 shows x[k] (nonzero over 0 : L − 1), h[k] (nonzero over 0 : P − 1), the time-reverse x[−k], and the shifted reversals x[n1 − k] through x[n5 − k] for increasing values of n.]

Figure 4.21: Illustration of the convolution of two finite-length sequences h and x, where x[n − k] is plotted (as a function of k) for different values of n.

We note the following:

• For n < 0, the nonzero values of the sequence x[n − k] do not overlap (in time) with those of h[k]. As a result, h[k]x[n − k] = 0 for all k, and

y[n] = 0 .

(This is expected, since the impulse response is causal and the input is zero for negative time indices.)

• Similarly, no overlap occurs when the left edge of x[n − k] follows the right edge of h[k]. This happens when n − (L − 1) > P − 1, i.e., when n > P + L − 2.

• For 0 ≤ n ≤ P + L − 2, the nonzero segments of x[n − k] and h[k] overlap in varying degrees, and the resulting value of y[n] is, in general, nonzero.

Thus we only need to compute y[n] for 0 ≤ n ≤ P + L − 2. We summarize our conclusions as follows.

Fact. The convolution y = h ∗ x of two finite-duration sequences h and x is also a finite-duration sequence. If the nonzero segments of h and x begin at time n = 0 and have length P and L, respectively, then the nonzero segment of y also begins at time n = 0 and has length P + L − 1.

Example 4.6.1. Consider the two sequences h and x given by

h[n] = −δ[n] + 3δ[n − 1] − 3δ[n − 2] + δ[n − 3]

and

x[n] = δ[n] + 2δ[n − 1] + 3δ[n − 2] .

[Stem plots: h[k] takes the values −1, 3, −3, 1 at k = 0, 1, 2, 3; x[k] takes the values 1, 2, 3 at k = 0, 1, 2.]

Based on our earlier discussion, the nonzero segment of y = h ∗ x begins at time n = 0 and has length 4 + 3 − 1 = 6.

We follow the technique outlined earlier and illustrated in Figure 4.21. Instead of plotting, we tabulate (in rows) the values of h[k] along with those of x[n − k] for each time index n of interest (i.e., n = 0 : 5); zero entries are omitted. The products h[k]x[n − k] are then computed and summed together for every value of n, and the resulting output value y[n] is entered in the last column.

  k        | −2  −1   0   1   2   3   4   5 | y[n]
  h[k]     |         −1   3  −3   1         |
  x[−k]    |  3   2   1                     |  −1
  x[1 − k] |      3   2   1                 |   1
  x[2 − k] |          3   2   1             |   0
  x[3 − k] |              3   2   1         |   4
  x[4 − k] |                  3   2   1     |  −7
  x[5 − k] |                      3   2   1 |   3

We thus obtain

y[0 : 5] = [ −1 1 0 4 −7 3 ]^T .

Remark. Since convolution is symmetric, the same result would have been obtained in the previous example by interchanging the signals x and h.

4.6.4 Convolution and Polynomial Multiplication

The z-transform of a sequence x is a function of the complex variable z ∈ C defined by

X(z) = ∑_{k=−∞}^{∞} x[k] z^{−k} ,

provided, of course, that the infinite sum converges. If x has finite duration, then the sum consists of a finite number of nonzero terms and convergence is not an issue (except at z = 0). If x[n] = 0 for n < 0 and n ≥ L (as was the case in the previous section), then

X(z) = ∑_{k=0}^{L−1} x[k] z^{−k} ,

i.e., X(z) is a polynomial of degree L − 1 in the variable z^{−1} = 1/z.
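The tabulated computation of Example 4.6.1 can be double-checked by coding the convolution sum directly. The sketch below is plain Python rather than the book's MATLAB, and uses only the example's own numbers:

```python
h = [-1, 3, -3, 1]      # h[n] = -delta[n] + 3delta[n-1] - 3delta[n-2] + delta[n-3]
x = [1, 2, 3]           # x[n] = delta[n] + 2delta[n-1] + 3delta[n-2]

P, L = len(h), len(x)
y = [sum(h[k] * x[n - k] for k in range(P) if 0 <= n - k < L)
     for n in range(P + L - 1)]
assert y == [-1, 1, 0, 4, -7, 3]      # matches the tabulated y[0:5]
```

Each inner sum is one row of the table: h[k] multiplied against the shifted reversal x[n − k], with out-of-range samples treated as zero.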

Note that the z-transform was encountered earlier in Section 4.4: if h is a FIR filter response of order M = P − 1, then

H(z) = ∑_{k=0}^{P−1} h[k] z^{−k}

is the filter system function, which reduces to the filter frequency response when z = e^{jω}.

The convolution y = h ∗ x mirrors the multiplication of the polynomials H(z) and X(z). To see this, write

H(z)X(z) = ( h[0] + h[1]z^{−1} + · · · + h[P − 1]z^{−(P−1)} ) · ( x[0] + x[1]z^{−1} + · · · + x[L − 1]z^{−(L−1)} )

and consider the z^{−n} term (where 0 ≤ n ≤ L + P − 2) in the resulting product. This term is formed by adding together products of the form

h[k]z^{−k} · x[n − k]z^{−(n−k)} = h[k]x[n − k]z^{−n} .

Its coefficient is thus given by the sum

h[0]x[n] + h[1]x[n − 1] + · · · + h[n − 1]x[1] + h[n]x[0] ,

which in turn equals (h ∗ x)[n] = y[n]. We therefore have

Y(z) = ∑_{n=0}^{L+P−2} y[n] z^{−n} = H(z)X(z) .

This result can be simply stated as follows.

Fact. The z-transform of the filter response is given by the product of the filter system function and the z-transform of the input sequence.

A special case of this result is the input-output relationship Y(e^{jω}) = H(e^{jω})X(e^{jω})

in the frequency domain, which was encountered earlier. We note that Y(z) = H(z)X(z) also holds for infinite-duration sequences, provided the infinite sums involved converge.

Conclusion. Linear convolution of two sequences in the time domain is equivalent to multiplication (of their DTFT’s or, more generally, their z-transforms) in the frequency domain. Thus linear convolution of sequences is analogous to circular convolution of vectors (which also corresponds to multiplication of DFT’s in the frequency domain).

4.6.5 Impulse Response of Cascaded Filters

Consider two filters with impulse response sequences h(1) and h(2), and system functions H1(z) and H2(z), respectively. We have seen in Subsection 4.2 that the system function H(z) of the cascade of the two filters is given by

H(z) = H1(z)H2(z) .

Thus we must also have

h = h(1) ∗ h(2) ,

i.e., the impulse response sequence of the cascade is given by the convolution of the two impulse response sequences. This is illustrated graphically in Figure 4.22.

The same result can be obtained using a time domain-based argument. Briefly, if the input to the cascade is a unit impulse, then the first filter will produce an output equal to h(1). The second filter will act on h(1) to produce a response equal to h(1) ∗ h(2), which is also the impulse response of the cascade.

[Figure 4.22: a unit impulse δ drives H1, whose output h(1) drives H2, whose output is h(1) ∗ h(2).]

Figure 4.22: Impulse response of a cascade.

Example 4.6.2. Consider two FIR filters with impulse responses

h(1)[n] = δ[n] − 2δ[n − 1] − 2δ[n − 2] + δ[n − 3]

and

h(2)[n] = δ[n] + δ[n − 1] − 3δ[n − 2] + δ[n − 3] + δ[n − 4] .

The impulse response h of the cascade can be computed directly in the time domain, using the formula h = h(1) ∗ h(2) and the convolution technique explained in Subsection 4.6.3. Alternatively, we have

H1(e^{jω}) = 1 − 2e^{−jω} − 2e^{−j2ω} + e^{−j3ω}
H2(e^{jω}) = 1 + e^{−jω} − 3e^{−j2ω} + e^{−j3ω} + e^{−j4ω}

and therefore

H(e^{jω}) = H1(e^{jω})H2(e^{jω})
          = 1 − e^{−jω} − 7e^{−j2ω} + 6e^{−j3ω} + 6e^{−j4ω} − 7e^{−j5ω} − e^{−j6ω} + e^{−j7ω} .

The impulse response h of the cascade is read off the coefficients of H(e^{jω}):

h[n] = δ[n] − δ[n − 1] − 7δ[n − 2] + 6δ[n − 3] + 6δ[n − 4] − 7δ[n − 5] − δ[n − 6] + δ[n − 7] .

In particular, the cascade is an FIR filter of order M = 7 with coefficient vector

b = h[0 : 7] = [ 1 −1 −7 6 6 −7 −1 1 ]^T .

As a closing remark, we note that the time domain and frequency domain formulas for cascaded filters developed here and in Subsection 4.2 can be generalized to an arbitrary number of filters connected in cascade.
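Both routes in the example above amount to the same computation: convolving h(1) with h(2) is exactly multiplying the two polynomials coefficient by coefficient, as the polynomial-multiplication fact states. A short sketch (plain Python rather than MATLAB; poly_mult is a hypothetical name):

```python
def poly_mult(a, b):
    """Coefficients of the product polynomial; identical to linear convolution."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj        # z^{-i} times z^{-j} contributes to z^{-(i+j)}
    return c

h1 = [1, -2, -2, 1]                    # coefficients of H1
h2 = [1, 1, -3, 1, 1]                  # coefficients of H2
h = poly_mult(h1, h2)                  # impulse response of the cascade
assert h == [1, -1, -7, 6, 6, -7, -1, 1]
```

The assertion reproduces the coefficient vector b of the cascade obtained in the example.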

4.7 Convolution in Practice

4.7.1 Linear Convolution of Two Vectors

In the previous section, we saw that the convolution of two finite-duration sequences also results in a sequence of finite duration. This allows us to extend the concept of (linear) convolution to vectors, by converting them to sequences of finite duration. Suppose b and s are vectors of length P and L, respectively. Let

h[n] = b[n] for 0 ≤ n ≤ P − 1 (and h[n] = 0 otherwise)

and

x[n] = s[n] for 0 ≤ n ≤ L − 1 (and x[n] = 0 otherwise)

be the sequences obtained by padding b and s with infinitely many zeros on both sides. As we saw earlier,

y = h ∗ x

is a sequence of total duration P + L − 1 time instants:

y[n] = c[n] for 0 ≤ n ≤ P + L − 2 (and y[n] = 0 otherwise).

The convolution of b and s is defined by

b ∗ s = c = y[0 : P + L − 2] .

Example 4.7.1. In Example 4.6.1, we had

b = [ −1 3 −3 1 ]^T ,  s = [ 1 2 3 ]^T

and

c = b ∗ s = [ −1 1 0 4 −7 3 ]^T .

The vectors b and s often (though not always) have nonzero first and last entries. In such cases, the first and last entries of c = b ∗ s are also nonzero, since

c[0] = b[0]s[0]  and  c[P + L − 2] = b[P − 1]s[L − 1] .

If any single endpoint (i.e., b[0], b[P − 1], s[0] or s[L − 1]) is replaced by a zero, then the corresponding endpoint of c (i.e., c[0] or c[P + L − 2]) also becomes a zero. This leads to the following useful property:

Fact. If 0_i represents an all-zeros vector of length i, then (using the semicolon notation as in MATLAB)

[0_i ; b] ∗ s = [0_i ; b ∗ s]  and  b ∗ [s ; 0_i] = [b ∗ s ; 0_i] .

4.7.2 Linear Convolution versus Circular Convolution

In our discussion of the DFT, we introduced the circular convolution (denoted here by ⊛) of two vectors of the same length N, and showed that circular convolution in the time domain is equivalent to (element-wise) multiplication of DFT’s in the frequency domain. It turns out that linear convolution of two vectors can also be implemented by circular convolution, after zero-padding the two vectors to the appropriate output length (which is known in advance).

Fact. If b = b[0 : P − 1] and s = s[0 : L − 1], then

b ∗ s = [b ; 0_{L−1}] ⊛ [s ; 0_{P−1}] .

This result is demonstrated in the arrays below for the case P = 6 and L = 9, where the output vector has length P + L − 1 = 14. As before, zero entries are omitted.

[Linear convolution array: each of the fourteen rows n = 0, . . . , 13 aligns the time-reversed coefficients b0, . . . , b5 with the samples s0, . . . , s8 shifted by n positions; the sum of products in row n yields the output entry c_n.]
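Both Facts are easy to verify numerically. The sketch below (plain Python rather than the book's MATLAB; lin_conv and circ_conv are hypothetical names) zero-pads two short vectors to length P + L − 1 and checks that their circular convolution reproduces the linear one:

```python
def lin_conv(a, b):
    """Linear convolution of two finite vectors."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def circ_conv(u, v):
    """Circular convolution of two vectors of the same length N."""
    N = len(u)
    return [sum(u[k] * v[(n - k) % N] for k in range(N)) for n in range(N)]

b = [1, -3, -3, 1]                      # P = 4
s = [1, 2, 3, 4, 4, -2, 7, 5, 1]        # L = 9
N = len(b) + len(s) - 1                 # output length P + L - 1 = 12
bp = b + [0] * (N - len(b))             # [b ; 0_{L-1}]
sp = s + [0] * (N - len(s))             # [s ; 0_{P-1}]
assert circ_conv(bp, sp) == lin_conv(b, s)

# the zero-padding property: [0_i ; b] * s = [0_i ; b * s]
assert lin_conv([0, 0] + b, s) == [0, 0] + lin_conv(b, s)
```

With this much padding, the wrapped indices in the circular sum always meet zeros, which is why no aliasing of output samples occurs.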

[Circular convolution array: each of the fourteen rows aligns b0, . . . , b5 with circularly shifted copies of the zero-padded vector [s ; 0_5]; since the wrapped entries always meet the zero padding, the rows again produce c_0 through c_13.]

The formal proof of this fact using the time-domain definitions of the linear and circular convolution is neither difficult nor particularly insightful. For a more interesting proof, we turn to the frequency domain. If, as before, h, x and y are the finite-duration sequences obtained by padding b, s and c = b ∗ s (respectively) with infinitely many zeros, then

y = h ∗ x .

The DTFT’s of the three sequences in the above equation are related by

Y(e^{jω}) = H(e^{jω})X(e^{jω}) .    (4.17)

We have

H(e^{jω}) = ∑_{n=0}^{P−1} b[n] e^{−jωn} ,
X(e^{jω}) = ∑_{n=0}^{L−1} s[n] e^{−jωn} ,
Y(e^{jω}) = ∑_{n=0}^{P+L−2} c[n] e^{−jωn} .

In Subsection 4.2, we saw that sampling the DTFT of a finite-duration sequence at frequencies

ωk = 2πk/N ,  k = 0, . . . , N − 1

yields the DFT of that sequence zero-padded to length N, provided, of course, that N is greater than or equal to the length of the sequence. Taking N = P + L − 1 (the size of the longest vector in this case), we have

H(e^{j(2π/N)k}) = k-th entry in the DFT of [b ; 0_{L−1}]
X(e^{j(2π/N)k}) = k-th entry in the DFT of [s ; 0_{P−1}]
Y(e^{j(2π/N)k}) = k-th entry in the DFT of c

From (4.17), we conclude that the DFT of c is given by the element-wise product of the DFT of [b ; 0_{L−1}] and that of [s ; 0_{P−1}]. From DFT 9, it follows that the time-domain vectors must satisfy

c = [b ; 0_{L−1}] ⊛ [s ; 0_{P−1}] .

Remark. If b and s are zero-padded to a total length N > P + L − 1, then circular convolution will result in c being padded with N − (P + L − 1) zeros.

The equivalence of linear and circular convolution for finite vectors gives us the option of performing this operation in the frequency domain, by computing the DFT’s of the two vectors involved (zero-padded to the appropriate length), then inverting the element-wise product of the DFT’s. This approach was illustrated in Section 4.2, where we computed the response of an FIR filter to a periodic input using (in effect) a circular convolution implemented via DFT’s. For large values of N, implementation via DFT’s can be greatly advantageous in practice. This is because the number of floating point operations required to convolve two vectors of length N in the time domain is of the order of N², while the DFT and its inverse can be implemented using fast Fourier transform (FFT) algorithms, for which the number of such operations is only of the order of N log₂ N.

In MATLAB, the standard command for convolving two vectors b and s is

c = conv(b,s)

which is implemented entirely in the time domain. An equivalent frequency-domain implementation using FFT’s is

N = length(b) + length(s) - 1;
B = fft(b,N);
S = fft(s,N);
C = B.*S;
c = ifft(C)

4.7.3 Circular Buffers for Real-Time FIR Filters

In many practical applications, a linear filter will operate over long periods of time, processing input signals which are much longer than the filter’s impulse response. If the filter is part of a system operating in real time (e.g., a digital speech encoder used during a telephone conversation), the delay in generating output samples shouldn’t—and needn’t—be too long. The current output y[n] can be computed as soon as the current input x[n] becomes available, using the input-output relationship

y[n] = ∑_{k=0}^{M} bk x[n − k] .    (4.18)

Clearly, since computation of y[n] requires knowledge of the vector x[n − M : n], the M previous input values must also be available at any given time. The subsequent filter output y[n + 1] is computed using the vector x[n − M + 1 : n + 1]. Both x[n − M : n] and x[n − M + 1 : n + 1] contain the subvector x[n − M + 1 : n], and differ only in one element:

x[n − M : n] = [ x[n − M] ; x[n − M + 1 : n] ]

while

x[n − M + 1 : n + 1] = [ x[n − M + 1 : n] ; x[n + 1] ] .

Using signal permutations, x[n − M + 1 : n + 1] can be generated by applying a left circular shift to x[n − M : n],

P^{−1} x[n − M : n] = [ x[n − M + 1 : n] ; x[n − M] ] ,

followed by replacing the last entry x[n − M] by x[n + 1].

The use of a circular shift in updating the vector of input values suggests a computationally efficient implementation of an FIR filter in terms of a so-called circular buffer. The circular buffer is a vector of size M + 1 which at any particular time n holds P^r x[n − M : n], i.e., a circularly shifted version of x[n − M : n]. A pointer i_now gives the position of x[n] in the circular buffer; it is not difficult to see that if the buffer is indexed by 1 : M + 1, then r = i_now. At time n + 1, the buffer is updated as follows:

Figure 4. Convolving long vectors using frequency . .23: A circular buﬀer of size P = M + 1 = 7 used for computing y[0 : 8]. by convolving consecutive blocks (segments) of the input signal with the ﬁlter impulse response.7. . and • the input x[n−M +1] at that position is replaced by the current input x[n + 1].18) at time n is the (unconjugated) inner product of the coeﬃcient vector b with the contents of the circular buﬀer read in reverse circular order starting with the current input sample x[n]. . it may be advantageous to ﬁlter the input signal in a block-wise fashion. 4. After the update. assuming all inputs before n = 0 were identically equal to zero. i..4 Block Convolution In applications where longer delays between input and output can be tolerated. n=6 n=7 n=8 . .271 • the pointer inow is incremented circularly by one (position). The ﬁlter output (4. x[0] x[1] x[2] x[3] x[4] x[5] x[6] x[7] x[1] x[2] x[3] x[4] x[5] x[6] x[7] x[8] x[2] x[3] x[4] x[5] x[6] Figure 4. For each value of n. the arrow (same as the pointer inow ) indicates the current input x[n].23 illustrates the sequential updating of the circular buﬀer in the case M = 6.e. the buﬀer holds Pr+1 x[n − M + 1 : n + 1]. n=0 n=1 n=2 x[0] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 x[0] x[1] x[0] x[1] x[2] . The initial position of x[0] (obtained by initialization of inow ) is arbitrary.

Convolving long vectors using frequency domain-based algorithms such as the FFT is generally more efficient than following the sliding window approach (in the time domain) discussed in the previous subsection.

To develop a model for the block-wise implementation of convolution, consider an FIR filter of order M = P − 1 > 0 with coefficient vector b. The filter acts on an input vector s, which has been partitioned into J blocks of length L:

s = [s(0) ; s(1) ; . . . ; s(J−1)] .

The block length L is a design parameter chosen according to factors such as processing speed and acceptable delay between input and output. Here we will assume that L ≥ P. Note that we have also taken the length of s as an exact multiple of L (zero-padding may be used for the last block if necessary).

By linearity and time-invariance of the filter, the output

c = b ∗ s

is the superposition (i.e., sum) of the responses to each of the input blocks s(j), where each response is delayed in time by the appropriate index, namely jL. Since the response to s(j) has a total duration of L + P − 1 > L time units, it follows that the responses overlap in time. Specifically, for 0 < j ≤ J − 1, the vector c[jL : jL + P − 2] is formed by summing together parts of two vectors, namely the time-shifted responses to s(j−1) and s(j). The underlying concept of superposition is illustrated graphically in Figure 4.24 for the case J = 2.

[Figure 4.24: the responses of the coefficient vector b to the zero-padded blocks [s(1) ; 0_L] and [0_L ; s(2)] are added to form the response to the full input.]

Figure 4.24: Linearity and time-invariance applied to block convolution.

Example 4.7.2. Consider the FIR filter with coefficient vector

b = [ 1 −3 −3 1 ]^T

driven by the finite-duration input sequence x, where

x[0 : 11] = s = [ 1 −2 3 4 4 −2 7 5 1 3 −1 −4 ]^T

and x[n] = 0 for n < 0 and n > 11. Suppose we wish to compute the filter response y to x using block convolution with block length L = 6. We express s as

s = [s(1) ; s(2)] = [s(1) ; 0_6] + [0_6 ; s(2)] ,

where

s(1) = [ 1 −2 3 4 4 −2 ]^T  and  s(2) = [ 7 5 1 3 −1 −4 ]^T .

Then

b ∗ [s(1) ; 0_6] = [b ∗ s(1) ; 0_6] = [ 1 −5 6 2 −19 −23 −2 10 −2 0 0 0 0 0 0 ]^T

and

b ∗ [0_6 ; s(2)] = [0_6 ; b ∗ s(2)] = [ 0 0 0 0 0 0 7 −16 −35 −8 −8 −9 18 11 −4 ]^T .

We thus have

b ∗ s = b ∗ [s(1) ; 0_6] + b ∗ [0_6 ; s(2)] = [ 1 −5 6 2 −19 −23 5 −6 −37 −8 −8 −9 18 11 −4 ]^T

and the filter response is given by

y[n] = (b ∗ s)[n] for 0 ≤ n ≤ 14, and y[n] = 0 otherwise.
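The block decomposition used in this example generalizes to any number of blocks: convolve each block, delay by jL, and superpose. A sketch in plain Python (block_convolve is a hypothetical name; the book carries out the same steps in MATLAB):

```python
def convolve(a, b):
    """Linear convolution of two finite vectors."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def block_convolve(b, s, L):
    """Convolve each length-L block of s with b, delay by jL, and superpose."""
    y = [0] * (len(s) + len(b) - 1)
    for j in range(0, len(s), L):
        block = convolve(b, s[j:j + L])   # length L + P - 1 > L, so tails overlap
        for i, v in enumerate(block):
            y[j + i] += v                 # responses to adjacent blocks are summed
    return y

b = [1, -3, -3, 1]
s = [1, -2, 3, 4, 4, -2, 7, 5, 1, 3, -1, -4]
y = block_convolve(b, s, 6)
assert y == convolve(b, s)                # same answer as a single convolution
```

With the example's data and L = 6 this reproduces the vector b ∗ s computed above; in practice each block convolution would itself be done with zero-padded FFT's.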

(Hint: Consult Subsection 3. inclusive? P 4. How would you deﬁne an array x in MATLAB. |n| > L express X(ejω ) as a sum of L cosines plus a constant.274 Problems Sections 4.996π.2.7 + 6 cos 35 17πn + 2. what is its period? (ii) Sketch the amplitude and phase spectra of x (both of which are line spectra). so that the command X = fft(x.1.) (ii) Let M = 23. Show that X(ejω ) can be written as a sum of cosines with real coeﬃcients.0 24 (i) Is the sequence periodic.3. (i) Sketch the signal sequence x[n] = 1.500) computes X(ejω ) at 500 equally spaced frequencies ω ranging from 0 to 1.2.2 P 4. 0 ≤ n ≤ M 0.8 + 2 cos 14 18πn − 0. |n| ≤ L 0. (ii) Generalize the result of part (i) to a symmetric signal sequence of ﬁnite duration (2L + 1 samples): If x[n] = x[−n] = β|n| . P 4.1–4. and if so. otherwise and express its Fourier transform X(ejω ) in the form ejKω F (ω) where F (ω) is a real-valued function which is symmetric about ω = 0. Consider the signal sequence x deﬁned by x[n] = cos 3πn − 1. (i) Sketch the signal sequence x[n] = δ[n + 2] − δ[n + 1] + 5δ[n] − δ[n − 1] + δ[n − 2] and write an expression for its discrete-time Fourier transform X(ejω ). .8.

n∈Z 2 . and which computes the amplitude and phase response of the ﬁlter at 256 equally spaced frequencies between 0 and 2π(1 − 256−1 ). Consider the FIR ﬁlter whose input x and output y are related by y[n] = x[n] − x[n − 1] − x[n − 2] + x[n − 3] (i) Write out an expression for the system function H(z).5.4 P 4.) P 4. (iii) Determine the output y[n] when the input sequence x is given by each of the following expressions (where n ∈ Z): • x[n] = 1 • x[n] = (−1)n • x[n] = ejπn/4 • x[n] = cos(πn/4 + φ) • x[n] = 2−n • x[n] = 2−n cos(πn/4) (In all cases except the third. (ii) Express |H(ejω )|2 in terms of cosines only.3–4. Consider the FIR ﬁlter y[n] = x[n] − 3x[n − 1] + x[n − 2] + x[n − 3] − 3x[n − 4] + x[n − 5] (i) Write MATLAB code which includes the function fft. (ii) Express the frequency response of the ﬁlter in the form e−jαω F (ω) where F (ω) is a real-valued sum of cosines. your answer should involve real-valued terms only. Plot |H(ejω )| as a function of ω.4. (iii) Determine the response y[n] of the ﬁlter to the exponential input sequence 1 n x[n] = .275 Sections 4.

and such that x[0 : 6] = [ 1 −1 0 3 1 −2 0 ]T Using DFT’s (and MATLAB). n∈Z (iii) Express the frequency response of the ﬁlter in the form e−jαω F (ω) where F (ω) is a real-valued sum of cosines. P 4. angle(H).276 P 4. P 4. where the coeﬃcient vector b is given by b = [ 1 2 −2 −1 4 −1 −2 2 1 ]T Let the input x be an inﬁnite periodic sequence with period L = 7.7. fft(a. abs(H).6.500). (ii) Determine the output y of the ﬁlter when the input x is given by x[n] = 1 3 n . Consider two FIR ﬁlters with coeﬃcient vectors b and c. 2π). The MATLAB code a H A q = = = = [ 1 -3 5 -3 1 ].8. computes the amplitude response A and phase response q of a FIR ﬁlter over 500 equally spaced frequencies in the interval [0. Consider a FIR ﬁlter whose input x and output y are related by 8 y[n] = k=0 bk x[n − k] .’ . determine the ﬁrst period y[0 : 6] of the output sequence y. where b= and c= 1 −2 2 −1 3 2 1 2 3 T T . write an expression for y[n] in terms of values of x. (i) If x and y are (respectively) the input and output sequences of that ﬁlter.

the vector in s6. The text ﬁle s6. determine its coeﬃcient vector. (ii) Determine the response y[n] of the cascade to the sinusoidal input sequence nπ . (i) Determine the system function H(z) of the cascade. Is the cascade also a FIR ﬁlter? If so. What kind of symmetry does the impulse response exhibit? (ii) Plot |H(ejω )|..txt contains the coeﬃcient vector of a FIR ﬁlter of order M = 40. . What type of ﬁlter (lowpass. the magnitude of the frequency response of the ﬁlter.e. (iv) How would you convert this ﬁlter to a bandpass ﬁlter whose passband is centered around ω = π/2? What would be resulting width of the passband? Verify your answers using MATLAB.9.e.. n∈Z x[n] = cos 2 Section 4.5 P 4. P 4. (ii) Express the amplitude response of the cascade as a sum of sines or cosines (as appropriate) with real-valued coeﬃcients. Is the cascade also a FIR ﬁlter? If so. determine its coeﬃcient vector. i. (i) Plot the impulse response of the ﬁlter (i.277 (i) Determine the system function H(z) of the cascade. bandpass or bandstop) do we have here? Determine the edges of the passband and stopband. (iii) If A1 = maximum value of |H(ejω )| over the passband A2 = minimum value of |H(ejω )| over the passband A3 = maximum value of |H(ejω )| over the stopband compute the ratios A1 /A2 and A2 /A3 (using your graphs). Consider the FIR ﬁlter with coeﬃcient vector b= 1 1 1 1 T Two copies of this ﬁlter are connected in series (cascade). highpass.10.txt) using the STEM or BAR functions in MATLAB.

what is its period? (ii) If x[n] is the input to a ﬁlter with amplitude and phase responses as shown in ﬁgure. Consider the signal sequence x given by x[n] = 2 cos(3.12.12 Section 4. 2π/3 ≤ ω ≤ 4π/3. Consider the signal x[n] = 3 cos πn π πn π − + 5 cos + 15 3 4 2 (i) Is the signal periodic.7) + cos(1. P 4. what is its period? (ii) If x is the input to the linear ﬁlter with frequency response described by 2e−j3ω . 2π). H(e jω ) A π/3 H(e jω ) −π 0 π/6 π/3 π/2 2π/3 5π/6 π −π −π/3 π/6 π/2 5π/6 π Problem P 4. n∈Z (i) Is x periodic? If so.11.13.6 P 4. H(ejω ) = 0.4n + 0.0n − 1. and if so. determine the output sequence y. Consider the FIR ﬁlter with impulse response h given by h[n] = δ[n] − 2δ[n − 1] + 3δ[n − 3] − 2δ[n − 4] (i) Without using convolution.8) .278 P 4. all other ω in [0. determine the resulting output y[n]. determine the response y of the ﬁlter to the input signal x given by x[n] = δ[n + 1] − δ[n − 1] .

P 4. 1. (i) Determine the convolution y =x∗x For what values of n is y[n] nonzero? (ii) Using the results of Problem P 4. Consider the signal sequence x[n] = of Problem P 4. (iii) By how many instants should y be delayed (or advanced) in time so that the resulting signal is symmetric about n = 0? What is the DTFT of that signal? P 4. write an equation for the DTFT Y (ejω ) of y. Consider the ﬁlter with impulse response given by h[n] = 1. P 4.16. Sketch your answer in the case α = 2.2.2.279 (Hint: Express y in terms of h. 0. Deﬁne x by x[n] = δ[n] − δ[n − 1] + δ[n − 2] − δ[n − 3] and let h be the (inﬁnite impulse response) sequence given by h[n] = αn . otherwise Compute y = x ∗ h. (iii) Obtain the answer to (ii) by direct multiplication of the z-transforms H(z) and X(z).14. determine the output signal y. 0 ≤ n ≤ 8 0. n < 0.15. n ≥ 0. otherwise . 0 ≤ n ≤ M 0.) (ii) Now let the input signal x be given by x[n] = δ[n] + 2δ[n − 1] + 3δ[n − 2] − δ[n − 3] Using convolution.

where b represents a FIR coeﬃcient vector and s represents the ﬁlter input.) (ii) Verify that the following MATLAB script generates the required matrix B from a column vector b: . H (2) (z) and H(z)..18.7 P 4. n∈Z are connected in series (cascade) to form a single ﬁlter with impulse response h. (iii) Sketch the output y of the cascade when the input x is given by x[n] = δ[n + 1] . See also Problem P 2.8. The matrix B describes this transformation in the case where the input signal has ﬁnite duration L. (i) Construct a (P + L − 1) × L matrix B such that b ∗ s = Bs (In eﬀect. Two FIR ﬁlters whose impulse response sequences h(1) and h(2) are given by h(1) [n] = δ[n] + δ[n − 1] + δ[n − 2] + δ[n − 3] . every FIR ﬁlter can be viewed as a linear transformation of the input signal sequence.e. (i) Determine the corresponding system functions H (1) (z).280 (i) Determine the response of the ﬁlter to the input x(1) [n] = cos nπ π + 2 8 . (ii) Determine h (i. n∈Z n∈Z Section 4. h[n] for every n ∈ Z). Consider the convolution of b = b[0 : P − 1] and s = s[0 : L − 1]. and h(2) [n] = δ[n] − δ[n − 2] .17. n∈Z (ii) Determine the output of the ﬁlter to the input x(2) [n] = δ[n] − δ[n − 1] + δ[n − 2] P 4.

c = [b ; zeros(L-1,1)];
r = [b(1) zeros(1,L-1)];
B = toeplitz(c,r);

Compare B*s and conv(b,s) for random choices of s.

P 4.19. Consider the following incomplete MATLAB code, where P is a known (previously defined) integer and b is a known column vector of length P:

xvec = []; yvec = [];
xcirc = zeros(P,1);
inow = P;
iupdate = [2:P 1];
ireverse = toeplitz(1:P, [1 P:-1:2]);
while 1
    x = randn(1);
    %
    % computation of scalar variable y
    %
    xvec = [xvec ; x];
    yvec = [yvec ; y];
end

This code simulates the continuous operation (note the while loop) of a FIR filter with coefficient vector b, driven by a sequence xvec of Gaussian noise. The scalar variables x and y denote the current input and output, and the output sequence is given by yvec. Complete the code using three commands (in the commented-out section) that contain no additional variables (i.e., other than previously defined), numbers or loops. Test your code against the results obtained using the function CONV.

P 4.20. A circular buffer for an FIR filter of order M = 16 is initialized at time n = 0 by placing x[0] in the first (leftmost or topmost) position within the buffer. Express the contents of the buffer at time n = 139 in the form

[ x[n1 : n2] ; x[n3 : n4] ]

P 4.21. Consider the FIR filter

y[n] = x[n] − 2x[n − 1] − x[n − 2] + 4x[n − 3] − x[n − 4] − 2x[n − 5] + x[n − 6]

The response of the filter to the input x[0 : 3] = [ a0 a1 a2 a3 ]^T (x[·] = 0 otherwise) is given by

y[0 : 9] = [ 1 −1 0 −7 8 13 −20 −1 11 −4 ]^T

(y[·] = 0 otherwise), while the response to x̃[0 : 3] = [ c0 c1 c2 c3 ]^T (x̃[·] = 0 otherwise) is given by

ỹ[0 : 9] = [ 1 −2 −2 10 −8 −10 18 −2 −9 4 ]^T

(ỹ[·] = 0 otherwise). Determine the response ŷ of the filter to

x̂[0 : 7] = [ a0 a1 a2 a3 2c0 2c1 2c2 2c3 ]^T

(x̂[·] = 0 otherwise).

P 4.22. Consider the two vectors

b = [ 2 −3 0 7 −3 4 1 ]^T  and  s = [ 5 2 −3 4 −5 −1 −1 ]^T .

(i) Determine the convolution b ∗ s using the function CONV in MATLAB.
(ii) Repeat using the functions FFT and IFFT instead of CONV.

P 4.23. Consider the following MATLAB code:

a = [ 1 -2 3 -4 ];
b = [ 1 2 -1 2 1];
A = fft(a,8);
B = fft(b,8);
C = A.*B;
c = ifft(C);

(i) Without using MATLAB, determine the vector c.
(ii) Let a = x[0 : 3], where x is a sequence of period L = 4. The signal x is the input to a FIR filter with coefficient vector b, resulting in an output sequence y. How would you modify the code shown above so that c = y[0 : 3]?

P 4.24. Consider the following MATLAB code:

    a = [ 2 -1 0 -1 2 ];
    b = [ 3 -2 5 1 -1 ];
    A = fft(a,9);
    B = fft(b,9);
    C = A.*B;
    c = ifft(C);

(i) Without using MATLAB, determine the vector c.
(ii) Without performing a new convolution, determine the vector e obtained by the following code:

    a = [ 2 -1 0 -1 2 ];
    d = [ -3 2 -5 -1 1 3 -2 5 1 -1 ];
    A = fft(a,14);
    D = fft(d,14);
    E = A.*D;
    e = ifft(E);

P 4.25. The data file s7.txt contains a vector s of length L = 64. Let

b = [ 2 −1 −3 1 3 1 −3 −1 2 ]T .

(i) Use the CONV function in MATLAB to compute the convolution c = b ∗ s. What is the length of c?
(ii) Compute c using three convolutions of output length 2^5 = 32, followed by a summation. Your answer should be the same as for part (i).
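Problems P 4.22–P 4.25 all rest on the same fact: multiplying N-point DFTs and inverting implements linear convolution whenever N ≥ length(a) + length(b) − 1. A self-contained Python sketch of this, using a naive O(N²) DFT in place of MATLAB's FFT/IFFT and the vectors of P 4.24:

```python
import cmath, math

def dft(x, N):
    """N-point DFT of x, zero-padded to length N (naive O(N^2) version)."""
    x = list(x) + [0] * (N - len(x))
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT (naive O(N^2) version)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

a = [2, -1, 0, -1, 2]                  # vectors from P 4.24
b = [3, -2, 5, 1, -1]
N = len(a) + len(b) - 1                # 9, matching fft(a,9) and fft(b,9)
C = [A * B for A, B in zip(dft(a, N), dft(b, N))]
c = [round(v.real, 6) for v in idft(C)]   # imaginary parts are roundoff only
print(c)                               # conv(a, b) = [6, -7, 12, -6, 5, -8, 9, 3, -2]
```

With N smaller than length(a) + length(b) − 1 the product of DFTs would instead return a time-aliased (circular) convolution — which is exactly the effect part (ii) of P 4.22 exploits.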

Epilogue

Our overview of signals took us from analog waveforms to discrete-time sequences (obtained by sampling) to finite-dimensional vectors. With the aid of basic tools and concepts from linear algebra—notably linear independence, Gaussian elimination, least-squares approximation (projection) and orthogonality—we developed the representation of an N-dimensional signal vector in terms of N sinusoidal components known as the discrete Fourier transform (DFT). The extension of vectors to sequences (by taking N → ∞) brought the continuum of frequency into play and gave rise to the discrete-time Fourier transform (DTFT).

The traditional exposition of signal analysis begins with the sinusoidal representation of periodic analog waveforms known as the Fourier series. This representation is then extended to aperiodic waveforms (by taking a limit as the signal period tends to infinity), resulting in the so-called Fourier transform. Sequences and vectors are treated next, and the DTFT and DFT are developed using some of the ideas and results from Fourier series and the Fourier transform. Our approach in this book clearly follows a different order.

In the second part of this book, we will develop similar tools for the analysis of analog waveforms and examine the implications of sampling from a frequency-domain viewpoint. While both the Fourier series and the Fourier transform for analog waveforms will be developed in the second part of this book, the motivated reader might be interested in a preview of that development, which, as it turns out, is quite accessible at this point. It is also possible to provide here a definitive answer to a question raised in Section 1.6, namely on the sufficiency of Nyquist-rate sampling for the faithful reconstruction of bandlimited analog waveforms.

The Fourier Series

A continuous-time signal {s(t), t ∈ R} is periodic with period T0 if

s(t + T0) = s(t) ,   t ∈ R

Associated with the period T0 are two familiar parameters: the cyclic frequency f0 = 1/T0 (Hz) and the angular frequency Ω0 = 2π/T0 (rad/sec). The continuous-time sinusoids cos Ω0t, sin Ω0t and e^{jΩ0t} encountered in Chapter 1 are all periodic with period T0. It follows that for any nonzero integer k, the sinusoids cos kΩ0t, sin kΩ0t and e^{jkΩ0t} are periodic with period T0/|k|; and since T0 is an integer multiple of T0/|k|, they are also periodic with period T0. These sinusoids are also referred to as harmonics of the basic sinusoid (real or complex) of frequency Ω0.

The key result in Fourier series is that a "well-behaved" periodic signal s(t) with period T0 can be expressed as a linear combination of the sinusoids e^{jkΩ0t}, all of which are periodic with period T0:

s(t) = Σ_{k=−∞}^{∞} S_k e^{jkΩ0t} ,   t ∈ R

The coefficients S_k can be obtained from s(t) by a suitable integral over a time interval of length equal to one period. Together with the corresponding frequencies kΩ0, these coefficients yield a discrete (line) spectrum {(kΩ0, S_k), k ∈ Z} for the periodic signal s(t), as shown in Figure E.1. Note that the frequency Ω is in radians per second (as opposed to per sample), and that there is no periodicity in the spectrum (as was the case with discrete-time sequences).

As it turns out, this result can be established by considering the usual DTFT analysis and synthesis formulas for discrete-time signals. We know that the spectrum of a time-domain sequence is a periodic function of the frequency ω (measured in radians per sample), with period 2π. Thus given a spectrum X(e^{jω}), the time-domain sequence {x[n], n ∈ Z} can be recovered from X(e^{jω}) using the integral (synthesis equation)

x[n] = (1/2π) ∫_0^{2π} X(e^{jω}) e^{jωn} dω
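The synthesis equation above can be checked numerically: for a finite-length sequence, a Riemann sum over M equally spaced frequencies recovers x[n] exactly, by discrete orthogonality of the complex exponentials. A small Python sketch with a hypothetical three-point sequence:

```python
import cmath, math

x = [1.0, 2.0, 3.0]                     # a short hypothetical sequence

def X(w):
    """DTFT of x: X(e^{jw}) = sum_n x[n] e^{-jwn}, periodic in w with period 2pi."""
    return sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x))

def synth(n, M=64):
    """Riemann sum for x[n] = (1/2pi) * integral_0^{2pi} X(e^{jw}) e^{jwn} dw."""
    total = sum(X(2 * math.pi * m / M) * cmath.exp(2j * math.pi * m * n / M)
                for m in range(M))
    return (total / M).real

print([synth(n) for n in range(4)])     # recovers 1, 2, 3, then 0
```

The sum is exact (up to floating-point error) whenever M exceeds the sequence length, since the frequency samples then resolve every exponential e^{jωn} appearing in X.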

while X(e^{jω}) can be obtained from {x[n], n ∈ Z} using the sum (analysis equation)

X(e^{jω}) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}

The trick here is to associate the continuous frequency parameter ω with continuous time. To that end, we define t = ω/Ω0 (which has the correct units, namely seconds) and use the periodic frequency-domain signal X(e^{jω}) to create a periodic time-domain signal s(t):

s(t) = X(e^{jΩ0t}) = X(e^{jω})

Since X(e^{jω}) is an arbitrary signal of period 2π (in ω), s(t) is an arbitrary signal of period 2π/Ω0 = T0 (in t). From the synthesis equation, we see that s(t) can be associated with a sequence {S_k, k ∈ Z} defined by S_k = x[−k] and such that

S_k = (1/2π) ∫_0^{2π} X(e^{jω}) e^{−jωk} dω
    = (Ω0/2π) ∫_0^{2π/Ω0} s(t) e^{−jkΩ0t} dt
    = (1/T0) ∫_0^{T0} s(t) e^{−jkΩ0t} dt

From the analysis equation, we then obtain

s(t) = X(e^{jΩ0t}) = Σ_{k=−∞}^{∞} x[k] e^{−jkΩ0t} = Σ_{k=−∞}^{∞} S_k e^{jkΩ0t}

We have therefore established that the periodic signal s(t) can be expressed as a Fourier series, i.e., a linear combination of complex sinusoids whose frequencies are multiples of Ω0. (A Fourier series can also be regarded as a power series in e^{jΩ0t}.) The k-th coefficient S_k is obtained by integrating, over one period, the product of s(t) and the complex conjugate of the k-th sinusoid.
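The coefficient formula S_k = (1/T0) ∫_0^{T0} s(t) e^{−jkΩ0t} dt can be exercised numerically. A Python sketch for a hypothetical two-harmonic signal whose coefficients are known in closed form:

```python
import cmath, math

T0 = 2.0                                # assumed period for this example
W0 = 2 * math.pi / T0                   # fundamental frequency Omega_0

def s(t):                               # hypothetical two-harmonic signal
    return math.cos(W0 * t) + 0.5 * math.sin(2 * W0 * t)

def S(k, M=1024):
    """Riemann-sum version of S_k = (1/T0) * integral_0^{T0} s(t) e^{-jk W0 t} dt."""
    dt = T0 / M
    return sum(s(m * dt) * cmath.exp(-1j * k * W0 * m * dt)
               for m in range(M)) * dt / T0

# cos(W0 t) = (e^{jW0 t} + e^{-jW0 t})/2          ->  S_1 = S_-1 = 1/2
# 0.5 sin(2W0 t) = (e^{j2W0 t} - e^{-j2W0 t})/(4j) ->  S_2 = -j/4, S_-2 = +j/4
for k in range(-2, 3):
    print(k, S(k))
```

Because the integrand is a trigonometric polynomial in Ω0t, the uniform Riemann sum over one full period reproduces the coefficients essentially exactly, illustrating the line spectrum {(kΩ0, S_k)}.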

In using the DTFT to derive the Fourier series, we exploited the duality of time and frequency, which is a central theme in Fourier analysis. Duality was encountered earlier in the discussion of the DFT, where it was shown that the time and frequency domains had the same dimension (namely vector size N), and that the DFT and IDFT operations were essentially the same (with the exception of a scaling factor and a complex conjugate). In the case of the DTFT, the two domains are quite different: time n is a discrete parameter taking values in Z, while frequency ω is a continuous parameter taking values in an interval of length 2π. In effect, the Fourier series is the dual of the DTFT where the two domains are interchanged: the time-domain signal is a continuous-parameter periodic signal, while the frequency-domain signal is a discrete sequence of coefficients (i.e., a line spectrum). Both the Fourier series and the DTFT use (essentially) the same sum and integral for transitioning between the two domains (discrete-to-continuous and continuous-to-discrete, respectively). Figure E.1 is a graphical illustration of the duality of the Fourier series and the DTFT.

The Fourier Transform

The Fourier transform is a frequency-domain representation for certain classes of aperiodic continuous-time signals. These signals are typically either absolutely integrable or square-integrable, i.e., they satisfy

∫_{−∞}^{∞} |s(t)| dt < ∞   or   ∫_{−∞}^{∞} |s(t)|² dt < ∞

(or both). Either type of integrability implies that the signal s(t) decays sufficiently fast in t as t → ∞ or t → −∞, and that if s(t) is locally unbounded, it cannot approach infinity too fast. Consider, for example, the three signals defined, for all t ∈ R, by

e^{−|t|} ,   e^{t}   and   cos t + cos πt

The first signal satisfies both integrability conditions and has a Fourier transform, while the second signal violates both integrability conditions and does not have a Fourier transform. The third signal (which is aperiodic) also violates both integrability conditions and does not have a Fourier transform in the conventional sense; yet it has an obvious frequency-domain representation in terms of a line spectrum (at frequencies Ω = ±1 and Ω = ±π rad/sec).
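The first of these examples has a transform that is known in closed form — S(Ω) = 2/(1 + Ω²) for s(t) = e^{−|t|}, a standard result — which makes it a convenient numerical sanity check on the integrability discussion. A Python sketch, truncating the integral at |t| ≤ 40 where e^{−|t|} is negligible:

```python
import math

def ft_abs_exp(W, T=40.0, M=50000):
    """Midpoint-rule approximation of S(W) = integral e^{-|t|} e^{-jWt} dt.
    By symmetry the transform is real: 2 * integral_0^inf e^{-t} cos(Wt) dt."""
    dt = T / M
    return 2 * sum(math.exp(-(m + 0.5) * dt) * math.cos(W * (m + 0.5) * dt)
                   for m in range(M)) * dt

for W in (0.0, 1.0, 2.0):
    print(W, ft_abs_exp(W), 2 / (1 + W * W))   # numeric vs closed form
```

Trying the same numerical integral with e^{t} in place of e^{−|t|} would diverge as T grows — the computational face of the failed integrability condition.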

Figure E.1: A periodic waveform (top left) and its Fourier series coefficients (top right); a sample sequence (bottom left) and its discrete-time Fourier transform (bottom right). If the continuous-parameter graphs are scaled versions of each other, then the discrete-parameter graphs are also scaled and time-reversed versions of each other (duality of the Fourier series and the DTFT).

It is possible to obtain the Fourier transform of an aperiodic signal {s(t), t ∈ R} by considering the Fourier series of the periodic extension of the time-truncated signal {s(t), −T/2 < t < T/2}, and then taking a suitable limit as T → ∞. Here we will follow a different approach, based on the DTFT of the sampled signal

s∆[n] := s(n∆) ,   n ∈ Z

(where ∆ > 0 is the sampling period). The idea behind the proof is as follows: if s∆[·] has a frequency-domain representation which converges, as ∆ → 0, to a single "spectrum-like" function, then that function must be valid as a frequency-domain representation for the continuous-time signal {s(t), t ∈ R} also, since that signal can be asymptotically obtained in this fashion, i.e., by reducing the sampling period to zero. To simplify the analysis, we will let ∆ → 0 by taking ∆ = T, T/2, T/4, . . . , T/2^r, . . . ,

where T > 0 is fixed. This ensures that the sampling instants for different values of ∆ are nested; in particular, any time t = (m2^{−r})T, where r ∈ N and m ∈ Z, is a sampling instant for all sufficiently small ∆. If t is such a point, then t/∆ is an integer for sufficiently small ∆, and thus t is the (t/∆)-th sampling instant for the sequence s∆[·]. By varying m and r, we obtain a dense set of points t on the time axis, which are sufficient for the specification of the entire signal s(·).

The synthesis equation (inverse DTFT) gives

s(t) = s∆[t/∆]
     = (1/2π) ∫_{−π}^{π} S∆(e^{jω}) e^{jω(t/∆)} dω
     = (1/2π) ∫_{−π/∆}^{π/∆} {∆S∆(e^{jΩ∆})} e^{jΩt} dΩ

where the change of variables Ω = ω/∆ was used for the last integral. Note that Ω is now measured in radians per second, which is appropriate for a continuous-time signal. Viewed as a function of Ω, the expression ∆S∆(e^{jΩ∆}) is periodic with period 2π/∆. As we will soon see, this expression converges (as ∆ → 0) for each Ω ∈ R, leading to a new function

S(Ω) := lim_{∆→0} ∆S∆(e^{jΩ∆}) ,   Ω ∈ R

The function S(Ω) cannot be periodic. Assuming that it decays sufficiently fast as Ω → ∞ and Ω → −∞ (which it does, by virtue of the integrability conditions on s(·)), the last integral will converge, as ∆ → 0, to

(1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ

Thus also

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ

Remark. It is interesting to note that the equivalent expression

s(t) = (1/2π) ∫_0^{2π/∆} ∆S∆(e^{jΩ∆}) e^{jΩt} dΩ

will, upon replacing ∆S∆(e^{jΩ∆}) by S(Ω) and 2π/∆ by its limiting value ∞, result in the incorrect integral

(1/2π) ∫_0^{∞} S(Ω) e^{jΩt} dΩ

for s(t). This paradox can be resolved by observing that, no matter how small ∆ is, the integral for s(t) is taken over an entire period of the function ∆S∆(e^{jΩ∆}). As ∆ → 0, the period 2π/∆ approaches infinity, a period of S(Ω) takes up the entire real axis, and the limiting function S(Ω) is aperiodic. Thus the correct limits for the integral should be −∞ to ∞.

To show that ∆S∆(e^{jΩ∆}) indeed converges, we use the analysis equation:

∆S∆(e^{jΩ∆}) = ∆ · Σ_{n=−∞}^{∞} s∆[n] e^{−jnΩ∆}
             = Σ_{n=−∞}^{∞} s(n∆) e^{−jnΩ∆} · ∆

The last sum consists of equally spaced samples (over the entire time axis) of the function s(t)e^{−jΩt}, each multiplied by the spacing, or sampling period, ∆. As ∆ → 0, the sum converges to an integral, and thus

S(Ω) = lim_{∆→0} ∆S∆(e^{jΩ∆}) = ∫_{−∞}^{∞} s(t) e^{−jΩt} dt

The function S(Ω) is the Fourier transform, or spectrum, of the continuous-time signal s(t). Similarly, s(t) is the inverse Fourier transform of S(Ω). Here, Ω (in rad/sec) is a continuous frequency parameter ranging from −∞ to ∞, and the two domains, time and frequency, have the same dimension (that of the continuum). This gives rise to interesting duality properties, which are apparent from the similarities between the analysis and synthesis equations:

S(Ω) = ∫_{−∞}^{∞} s(t) e^{−jΩt} dt

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ

Nyquist Sampling

In Section 1.6, we saw that an analog signal made up of a continuum of sinusoidal components with frequencies in the range [0, fB) (Hz) cannot be sampled at a rate less than 2fB without loss of information due to aliasing. We will show how sampling at a rate greater than, or equal to, 2fB
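The convergence of ∆S∆(e^{jΩ∆}) to S(Ω) can be watched numerically. For s(t) = e^{−|t|} the limit is S(Ω) = 2/(1 + Ω²) (a standard closed form), so the Python sketch below tabulates the Riemann-sum expression at Ω = 1 for shrinking ∆:

```python
import math

def delta_S(W, delta):
    """delta * S_delta(e^{jW delta}) = delta * sum_n s(n delta) e^{-jn W delta}
    for s(t) = e^{-|t|}; by symmetry the sum is real."""
    N = int(40 / delta)                 # e^{-|t|} is negligible beyond |t| = 40
    total = 1.0                         # n = 0 term
    for n in range(1, N + 1):
        total += 2 * math.exp(-n * delta) * math.cos(W * n * delta)
    return delta * total

# As delta -> 0 the values approach S(1) = 2/(1 + 1) = 1.
for delta in (0.5, 0.1, 0.02):
    print(delta, delta_S(1.0, delta))
```

The residual error at each ∆ is exactly the aliasing term Σ_{k≠0} S(Ω + 2πk/∆), which shrinks as the period 2π/∆ of ∆S∆(e^{jΩ∆}) grows — a concrete view of the limit discussed above.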

(samples/second) allows for perfect reconstruction of the continuous-time signal. Let us assume then that the continuous-time signal {s(t), t ∈ R} has a Fourier transform S(Ω) which vanishes outside the interval (−ΩB, ΩB), where ΩB = 2πfB (see Figure E.2). We thus have

s(t) = (1/2π) ∫_{−∞}^{∞} S(Ω) e^{jΩt} dΩ = (1/2π) ∫_{−ΩB}^{ΩB} S(Ω) e^{jΩt} dΩ    (E.1)

Figure E.2: Shown in bold is the Fourier transform (spectrum) S(Ω) of a bandlimited continuous-time signal.

Sampling at the Nyquist rate 2fB yields a sequence of samples at times t = nTs, where

Ts := 1/(2fB) = π/ΩB

From (E.1), we obtain an expression for each of these samples:

s(nTs) = (1/2π) ∫_{−ΩB}^{ΩB} S(Ω) e^{jnπ(Ω/ΩB)} dΩ    (E.2)

The signal S(Ω) has finite duration in the frequency domain. As such, it can be periodically extended outside the interval (−ΩB, ΩB), and the resulting periodic extension can be expressed in terms of a Fourier series in Ω. Clearly, the Fourier series will return S(Ω) for Ω in the interval (−ΩB, ΩB). Using the Fourier series formulas developed earlier (with the period parameter equal to 2ΩB), we have

S(Ω) = Σ_{n=−∞}^{∞} A_n e^{jnπ(Ω/ΩB)}    (E.3)

where

A_n = (1/2ΩB) ∫_{−ΩB}^{ΩB} S(Ω) e^{−jnπ(Ω/ΩB)} dΩ    (E.4)

Comparing (E.2) and (E.4), we immediately obtain

A_{−n} = (π/ΩB) s(nTs) = Ts s(nTs)    (E.5)

Thus the sequence of samples of s(·) obtained at the Nyquist rate 2fB completely determines the spectrum S(Ω) through its Fourier series expansion (E.3). Since S(Ω) uniquely determines s(t) (via the analysis equation (E.1)), it follows that sampling at the Nyquist rate is sufficient for the faithful reconstruction of the bandlimited signal s(t).

As it turns out, we can obtain an explicit formula for s(t) at any time t in terms of the samples {s(nTs)} by combining (E.1), (E.3) and (E.5) as follows:

s(t) = (1/2π) ∫_{−ΩB}^{ΩB} ( Σ_{n=−∞}^{∞} A_{−n} e^{−jnπ(Ω/ΩB)} ) e^{jΩt} dΩ
     = (1/2ΩB) Σ_{n=−∞}^{∞} s(nTs) ∫_{−ΩB}^{ΩB} e^{jΩ(t−nTs)} dΩ

The inner integral equals

( e^{jΩB(t−nTs)} − e^{−jΩB(t−nTs)} ) / ( j(t−nTs) ) = 2 sin(ΩB(t−nTs)) / (t−nTs)

and thus the final expression for s(t) is

s(t) = Σ_{n=−∞}^{∞} s(nTs) · sin(ΩB(t−nTs)) / (ΩB(t−nTs))

Clearly, s(t) is a weighted sum of all samples s(nTs), with coefficients provided by the interpolation function

I_{ΩB}(τ) := sin(ΩB τ) / (ΩB τ)

evaluated for τ = t − nTs. Writing

s(t) = Σ_{n=−∞}^{∞} s(nTs) · I_{ΩB}(t − nTs)

we can also see that s(·) is a linear combination of the signals I_{ΩB}(· − nTs), which are time-delayed versions of I_{ΩB}(·) centered at the sampling instants nTs.