Lecture Notes
Fall 2009
Dr. Neal Patwari
University of Utah
Department of Electrical and Computer Engineering
© 2006
ECE 5520 Fall 2009 2
Contents
1 Class Organization 8
2 Introduction 8
2.1 "Executive Summary" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Why not Analog? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Networking Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4 Channels and Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5 Encoding / Decoding Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.6 Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.7 Topic: Random Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.8 Topic: Frequency Domain Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.9 Topic: Orthogonality and Signal spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.10 Related classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Power and Energy 13
3.1 Discrete-Time Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Decibel Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4 Time-Domain Concept Review 15
4.1 Periodicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.2 Impulse Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Bandwidth 16
5.1 Continuous-time Frequency Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.1.1 Fourier Transform Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5.2 Linear Time Invariant (LTI) Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
6 Bandpass Signals 21
6.1 Upconversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
6.2 Downconversion of Bandpass Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
7 Sampling 23
7.1 Aliasing Due To Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
7.2 Connection to DTFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
7.3 Bandpass sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
8 Orthogonality 28
8.1 Inner Product of Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
8.2 Inner Product of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
8.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
8.4 Orthogonal Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
9 Orthonormal Signal Representations 32
9.1 Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
9.2 Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
9.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
10 Multi-Variate Distributions 37
10.1 Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
10.2 Conditional Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
10.3 Simulation of Digital Communication Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
10.4 Mixed Discrete and Continuous Joint Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
10.5 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
10.6 Gaussian Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
10.6.1 Complementary CDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
10.6.2 Error Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
10.7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
10.8 Gaussian Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
10.8.1 Envelope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
11 Random Processes 46
11.1 Autocorrelation and Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
11.1.1 White Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
12 Correlation and Matched-Filter Receivers 47
12.1 Correlation Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
12.2 Matched Filter Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
12.3 Amplitude . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
12.4 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
12.5 Correlation Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
13 Optimal Detection 51
13.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
13.2 Bayesian Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
14 Binary Detection 52
14.1 Decision Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
14.2 Formula for Probability of Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
14.3 Selecting R_0 to Minimize Probability of Error . . . . . . . . . . . . . . . . . . . . . . . . . 53
14.4 Log-Likelihood Ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
14.5 Case of a_0 = 0, a_1 = 1 in Gaussian noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
14.6 General Case for Arbitrary Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
14.7 Equiprobable Special Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
14.8 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
14.9 Review of Binary Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
15 Pulse Amplitude Modulation (PAM) 58
15.1 Baseband Signal Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
15.2 Average Bit Energy in M-ary PAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
16 Topics for Exam 1 60
17 Additional Problems 61
17.1 Spectrum of Communication Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
17.2 Sampling and Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
17.3 Orthogonality and Signal Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
17.4 Random Processes, PSD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
17.5 Correlation / Matched Filter Receivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
18 Probability of Error in Binary PAM 63
18.1 Signal Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
18.2 BER Function of Distance, Noise PSD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
18.3 Binary PAM Error Probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
19 Detection with Multiple Symbols 66
20 M-ary PAM Probability of Error 67
20.1 Symbol Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
20.1.1 Symbol Error Rate and Average Bit Energy . . . . . . . . . . . . . . . . . . . . . . . . . . 67
20.2 Bit Errors and Gray Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
20.2.1 Bit Error Probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
21 Intersymbol Interference 69
21.1 Multipath Radio Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
21.2 Transmitter and Receiver Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
22 Nyquist Filtering 70
22.1 Raised Cosine Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
22.2 Square-Root Raised Cosine Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
23 M-ary Detection Theory in N-dimensional signal space 72
24 Quadrature Amplitude Modulation (QAM) 74
24.1 Showing Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
24.2 Constellation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
24.3 Signal Constellations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
24.4 Angle and Magnitude Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
24.5 Average Energy in M-QAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
24.6 Phase-Shift Keying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
24.7 Systems which use QAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
25 QAM Probability of Error 80
25.1 Overview of Future Discussions on QAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
25.2 Options for Probability of Error Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
25.3 Exact Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
25.4 Probability of Error in QPSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
25.5 Union Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
25.6 Application of Union Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
25.6.1 General Formula for Union Bound-based Probability of Error . . . . . . . . . . . . . . . . 84
26 QAM Probability of Error 85
26.1 Nearest-Neighbor Approximate Probability of Error . . . . . . . . . . . . . . . . . . . . . . . 85
26.2 Summary and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
27 Frequency Shift Keying 88
27.1 Orthogonal Frequencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
27.2 Transmission of FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
27.3 Reception of FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
27.4 Coherent Reception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
27.5 Noncoherent Reception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
27.6 Receiver Block Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
27.7 Probability of Error for Coherent Binary FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
27.8 Probability of Error for Noncoherent Binary FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
27.9 FSK Error Probabilities, Part 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
27.9.1 M-ary Non-Coherent FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
27.9.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
27.10 Bandwidth of FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
28 Frequency Multiplexing 95
28.1 Frequency Selective Fading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
28.2 Benefits of Frequency Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
28.3 OFDM as an extension of FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
29 Comparison of Modulation Methods 98
29.1 Differential Encoding for BPSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
29.1.1 DPSK Transmitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
29.1.2 DPSK Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
29.1.3 Probability of Bit Error for DPSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
29.2 Points for Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
29.3 Bandwidth Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
29.3.1 PSK, PAM and QAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
29.3.2 FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
29.4 Bandwidth Efficiency vs. E_b/N_0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
29.5 Fidelity (P[error]) vs. E_b/N_0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
29.6 Transmitter Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
29.6.1 Linear / Nonlinear Amplifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
29.7 Offset QPSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
29.8 Receiver Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
30 Link Budgets and System Design 105
30.1 Link Budgets Given C/N_0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
30.2 Power and Energy Limited Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
30.3 Computing Received Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
30.3.1 Free Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
30.3.2 Non-free-space Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
30.3.3 Wired Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
30.4 Computing Noise Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
30.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
31 Timing Synchronization 113
32 Interpolation 114
32.1 Sampling Time Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
32.2 Seeing Interpolation as Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
32.3 Approximate Interpolation Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
32.4 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
32.4.1 Higher order polynomial interpolation filters . . . . . . . . . . . . . . . . . . . . . . . . . 117
33 Final Project Overview 118
33.1 Review of Interpolation Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
33.2 Timing Error Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
33.3 Early-late timing error detector (ELTED) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
33.4 Zero-crossing timing error detector (ZCTED) . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
33.4.1 QPSK Timing Error Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
33.5 Voltage Controlled Clock (VCC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
33.6 Phase Locked Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
33.6.1 Phase Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
33.6.2 Loop Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
33.6.3 VCO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
33.6.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
33.6.5 Discrete-Time Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
34 Exam 2 Topics 126
35 Source Coding 127
35.1 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
35.2 Joint Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
35.3 Conditional Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
35.4 Entropy Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
35.5 Source Coding Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
36 Review 133
37 Channel Coding 133
37.1 R. V. L. Hartley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
37.2 C. E. Shannon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
37.2.1 Noisy Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
37.2.2 Introduction of Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
37.2.3 Introduction of Power Limitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
37.3 Combining Two Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
37.3.1 Returning to Hartley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
37.3.2 Final Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
37.4 Efficiency Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
38 Review 138
39 Channel Coding 138
39.1 R. V. L. Hartley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
39.2 C. E. Shannon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
39.2.1 Noisy Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
39.2.2 Introduction of Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
39.2.3 Introduction of Power Limitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
39.3 Combining Two Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
39.3.1 Returning to Hartley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
39.3.2 Final Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
39.4 Efficiency Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Lecture 1
Today: (1) Syllabus (2) Intro to Digital Communications
1 Class Organization
Textbook Few textbooks cover solely digital communications (without analog) in an introductory communications course. But graduates today will almost always encounter, or be developing, solely digital communication systems. So half of most textbooks is useless, and the other half is sparse and needs supplemental material.
For example, the past text was the 'standard' text in the area for an undergraduate course: J.G. Proakis and M. Salehi, Communication Systems Engineering, 2nd edition, Prentice Hall, 2001. Students didn't like that I had so many supplemental readings. This year's text covers primarily digital communications, and does so in depth. Finally, I find it to be very well-written. And there are few options in this area. I will provide additional readings solely to provide another presentation style or fit another learning style. Unless specified, these are optional.
Lecture Notes I type my lecture notes. I have taught ECE 5520 previously and have my lecture notes from
past years. I’m constantly updating my notes, even up to the lecture time. These can be available to you at
lecture, and/or after lecture online. However, you must accept these conditions:
1. Taking notes is important: I find most learning requires some writing on your part, not just watching. Please take your own notes.
2. My written notes do not and cannot reflect everything said during lecture: I answer questions and understand your perspective better after hearing your questions, and I try to tailor my approach during the lecture. If I didn't, you could just watch a recording!
2 Introduction
A digital communication system conveys discrete-time, discrete-valued information across a physical channel. Information sources might include audio, video, text, or data. They might be continuous-time (analog) signals (audio, images) and even 1-D or 2-D. Or they may already be digital (discrete-time, discrete-valued). Our objective is to convey the signals or data to another place (or time) with as faithful a representation as possible.
In this section we talk about what we’ll cover in this class, and more importantly, what we won’t cover.
2.1 "Executive Summary"
Here is the one-sentence version: We will study how to efficiently encode digital data on a noisy, bandwidth-limited analog medium, so that decoding the data (i.e., reception) at a receiver is simple, efficient, and high fidelity.
The key points stuffed into that one sentence are:
1. Digital information on an analog medium: We can send waveforms, i.e., real-valued, continuous-time functions, on the channel (medium). These waveforms are from a discrete set of possible waveforms. What set of waveforms should we use? Why?
ECE 5520 Fall 2009 9
2. Decoding the data: When receiving a signal (a function) in noise, none of the original waveforms will match exactly. How do you make a decision about which waveform was sent?
3. What makes a receiver difficult to realize? What choices of waveforms make a receiver simpler to implement? What techniques are used in a receiver to compensate?
4. Efficiency, Bandwidth, and Fidelity: Fidelity is the correctness of the received data (i.e., the opposite of error rate). What is the tradeoff between energy, bandwidth, and fidelity? We all want high fidelity, and low energy consumption and bandwidth usage (the costs of our communication system).
You can look at this like an impedance matching problem from circuits. For power efficiency, you want the source impedance to match the destination impedance. In digital communications, this means that we want our waveform choices to match the channel and receiver to maximize the efficiency of the communication system.
2.2 Why not Analog?
The previous text used for this course, by Proakis & Salehi, has an extensive analysis and study of analog
communication systems, such as radio and television broadcasting (Chapter 3). In the recent past, this course
would study both analog and digital communication systems. Analog systems still exist and will continue to
exist; however, development of new systems will almost certainly be of digital communication systems. Why?
• Fidelity
• Energy: transmit power, and device power consumption
• Bandwidth efficiency: due to coding gains
• Moore’s Law is decreasing device costs for digital hardware
• Increasing need for digital information
• More powerful information security
2.3 Networking Stack
In this course, we study digital communications from bits to bits. That is, we study how to take ones and zeros
from a transmitter, send them through a medium, and then (hopefully) correctly identify the same ones and
zeros at the receiver. There’s a lot more than this to the digital communication systems which you use on a
daily basis (e.g., iPhone, WiFi, Bluetooth, wireless keyboard, wireless car key).
To manage complexity, we (engineers) don’t try to build a system to do everything all at once. We typically
start with an application, and we build a layered network to handle the application. The 7layer OSI stack,
which you would study in a CS computer networking class, is as follows:
• Application
• Presentation (*)
• Session (*)
• Transport
• Network
ECE 5520 Fall 2009 10
• Link Layer
• Physical (PHY) Layer
(Note that there is also a 5-layer model in which the * layers are considered part of the application layer.) ECE 5520 is part of the bottom layer, the physical layer. In fact, the physical layer has much more detail. It is primarily divided into:
• Multiple Access Control (MAC)
• Encoding
• Channel / Medium
We can control the MAC and the encoding chosen for a digital communication.
2.4 Channels and Media
We can choose from a few media, but we largely can't change the properties of the medium (although there are exceptions). Here are some media:
• EM Spectra: (anything above 0 Hz) Radio, microwave, mm-wave bands, light
• Acoustic: ultrasound
• Transmission lines, waveguides, optical ﬁber, coaxial cable, wire pairs, ...
• Disk (data storage applications)
2.5 Encoding / Decoding Block Diagram
[Figure 1 depicts the transmitter chain (Information Source → Source Encoder → Channel Encoder → Modulator → Up-conversion), the channel, and the receiver chain (Down-conversion → Demodulator → Channel Decoder → Source Decoder → Information Output), plus other components: digital-to-analog converter, analog-to-digital converter, and synchronization.]
Figure 1: Block diagram of a single-user digital communication system, including (top) transmitter, (middle) channel, and (bottom) receiver.
Notes:
• Information source comes from higher networking layers. It may be continuous or packetized.
• Source Encoding: Finding a compact digital representation for the data source. This includes sampling of continuous-time signals, quantization of continuous-valued signals, and compression of those sources (lossy or lossless). What are some compression methods that you're familiar with? We present an introduction to source encoding at the end of this course.
ECE 5520 Fall 2009 11
• Channel encoding refers to redundancy added to the signal so that any bit errors can be corrected. A channel decoder, because of the redundancy, can correct some bit errors. We will not study channel encoding; it is a topic in (ECE 6520) Coding Theory.
• Modulation refers to the digital-to-analog conversion which produces a continuous-time signal that can be sent on the physical channel. It is analogous to impedance matching: proper matching of a modulation to a channel allows optimal information transfer, just as impedance matching ensures optimal power transfer. Modulation and demodulation will be the main focus of this course.
• Channels: See above for examples. Typical models are additive noise, or a linear filtering channel.
Why do we do both source encoding (which compresses the signal as much as possible) and also channel encoding (which adds redundancy to the signal)? Because of Shannon's source-channel coding separation theorem. He showed that (given enough time) we can consider them separately without additional loss. And separation, like layering, reduces complexity for the designer.
2.6 Channels
A channel can typically be modeled as a linear filter with the addition of noise. The noise comes from a variety of sources, but predominantly:
1. Thermal background noise: Due to the physics of living above 0 Kelvin. Well modeled as Gaussian and white; thus it is referred to as additive white Gaussian noise (AWGN).
2. Interference from other transmitted signals. Other transmitters whose signals we cannot completely cancel, we lump into the 'interference' category. These may result in a non-Gaussian noise distribution, or a non-white noise spectral density.
The linear filtering of the channel results from the physics and EM of the medium. For example, attenuation in telephone wires varies by frequency. Narrowband wireless channels experience fading that varies quickly as a function of frequency. Wideband wireless channels display multipath, due to multiple time-delayed reflections, diffractions, and scattering of the signal off of objects in the environment. All of these can be modeled as linear filters.
The filter may be constant, or time-invariant, if the medium and the TX and RX do not move or change. However, for mobile radio, the channel may change very quickly over time. Even for stationary TX and RX in real wireless channels, movement of cars, people, trees, etc. in the environment may change the channel slowly over time.
[Figure 2 depicts the transmitted signal passing through an LTI filter h(t), with noise added to produce the received signal.]
Figure 2: Linear filter and additive noise channel model.
In this course, we will focus primarily on the AWGN channel, but we will mention what variations exist for
particular channels, and how they are addressed.
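The linear-filter-plus-AWGN model of Figure 2 can be sketched numerically. This is an illustrative sketch only: the two-tap filter, tone frequency, sample rate, and noise level are invented for the example, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 10_000                          # sample rate in Hz (arbitrary choice)
t = np.arange(0, 0.01, 1 / fs)       # a 10 ms observation window
x = np.cos(2 * np.pi * 1_000 * t)    # transmitted signal: a 1 kHz tone

# LTI filter h(t): a direct path plus one attenuated, delayed echo,
# a crude stand-in for a multipath channel impulse response.
h = np.zeros(20)
h[0], h[15] = 1.0, 0.4

sigma = 0.1                          # noise standard deviation (arbitrary)
n = sigma * rng.standard_normal(len(x) + len(h) - 1)

y = np.convolve(h, x) + n            # received signal = (h * x)(t) + n(t)
```

A receiver only ever sees y; recovering the bits despite h and n is exactly the problem this course studies.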
ECE 5520 Fall 2009 12
2.7 Topic: Random Processes
Random things in a communication system:
• Noise in the channel
• Signal (bits)
• Channel filtering, attenuation, and fading
• Device frequency, phase, and timing offsets
These random signals often pass through LTI filters and are sampled. We want to build the best receiver possible despite the impediments. Optimal receiver design is something that we study using probability theory.
We have to tolerate errors. Noise and attenuation of the channel will cause bit errors to be made by the demodulator and even the channel decoder. This may be tolerated, or a higher-layer networking protocol (e.g., TCP/IP) can determine that an error occurred and then re-request the data.
2.8 Topic: Frequency Domain Representations
To fit as many signals as possible onto a channel, we often split the signals by frequency. The concept of sharing a channel is called multiple access (MA). Separating signals by frequency band is called frequency division multiple access (FDMA). For the wireless channel, this is controlled by the FCC (in the US) and called spectrum allocation. There is a tradeoff between frequency requirements and time requirements, which will be a major part of this course. The Fourier transform of our modulated, transmitted signal is used to show that it meets the FCC's spectrum allocation limits.
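A quick FFT shows where a signal's energy sits in frequency, which is the computation behind checking a spectrum allocation. A sketch (the tone frequency and sample rate are invented for illustration):

```python
import numpy as np

fs = 8_000                            # sample rate in Hz (illustrative)
t = np.arange(0, 1.0, 1 / fs)         # one second of signal
x = np.cos(2 * np.pi * 1_200 * t)     # a tone at 1.2 kHz

X = np.fft.rfft(x)                    # spectrum at nonnegative frequencies
f = np.fft.rfftfreq(len(x), 1 / fs)   # matching frequency axis in Hz

peak = f[np.argmax(np.abs(X))]        # frequency bin with the most energy
print(peak)                           # → 1200.0
```

With one second of data the bin spacing is 1 Hz, so the 1.2 kHz tone lands exactly on a bin; a modulated data signal would instead spread its energy over a band around the carrier.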
2.9 Topic: Orthogonality and Signal spaces
To show that signals sharing the same channel don’t interfere with each other, we need to show that they are
orthogonal. This means, in short, that a receiver can uniquely separate them. Signals in diﬀerent frequency
bands are orthogonal.
We can also employ multiple orthogonal signals in a single transmitter and receiver, in order to provide
multiple independent means (dimensions) on which to modulate information. We will study orthogonal signals,
and learn an algorithm to take an arbitrary set of signals and output a set of orthogonal signals with which to
represent them. We’ll use signal spaces to show the results graphically, as in the example in Figure 3.

Figure 3: Example signal space diagram for M-ary Phase Shift Keying, for (a) M = 8 and (b) M = 16. Each
point is a vector which can be used to send a 3- or 4-bit sequence.
2.10 Related classes
1. Prerequisites: (ECE 5510) Random Processes; (ECE 3500) Signals and Systems.
2. Signal Processing: (ECE 5530): Digital Signal Processing
3. Electromagnetics: EM Waves, (ECE 5320-5321) Microwave Engineering, (ECE 5324) Antenna Theory,
(ECE 5411) Fiber-optic Systems
4. Breadth: (ECE 5325) Wireless Communications
5. Devices and Circuits: (ECE 3700) Fundamentals of Digital System Design, (ECE 5720) Analog IC Design
6. Networking: (ECE 5780) Embedded System Design, (CS 5480) Computer Networks
7. Advanced Classes: (ECE 6590) Software Radio, (ECE 6520) Information Theory and Coding, (ECE 6540):
Estimation Theory
Lecture 2
Today: (1) Power, Energy, dB (2) Time-domain concepts (3) Bandwidth, Fourier Transform
Two of the biggest limitations in communications systems are (1) energy / power; and (2) bandwidth. Today’s
lecture provides some tools to deal with power and energy, and starts the review of tools to analyze frequency
content and bandwidth.
3 Power and Energy
Recall that energy is power times time. Use the units: energy is measured in Joules (J); power is measured in
Watts (W), which is the same as Joules/second (J/sec). Also, recall that our standard in signals and systems is
to define our signals, such as x(t), as voltage signals (V). When we want to know the power of a signal, we assume
it is being dissipated in a 1 Ohm resistor, so |x(t)|^2 is the power dissipated at time t (since power is equal to
the voltage squared divided by the resistance).
A signal x(t) has energy defined as

E = ∫_{−∞}^{∞} |x(t)|^2 dt
For some signals, E will be infinite because the signal is non-zero for an infinite duration of time (it is always
on). We call these signals power signals, and we compute their power as

P = lim_{T→∞} (1/(2T)) ∫_{−T}^{T} |x(t)|^2 dt

A signal with finite energy is called an energy signal.
3.1 Discrete-Time Signals
In this book, we refer to discrete samples of the sampled signal x as x(n). You may be more familiar with the
x[n] notation. But Matlab uses parentheses also, so we’ll follow the Rice text notation. Essentially, whenever
you see a function of n (or k, l, m), it is a discrete-time function; whenever you see a function of t (or perhaps
τ) it is a continuous-time function. I’m sorry this is not more obvious in the notation.
For discrete-time signals, energy and power are defined as:

E = Σ_{n=−∞}^{∞} |x(n)|^2    (1)

P = lim_{N→∞} (1/(2N+1)) Σ_{n=−N}^{N} |x(n)|^2    (2)
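These definitions translate directly into code. Here is a minimal sketch in Python (the course's examples use Matlab, but the arithmetic is identical); the helper names `energy` and `avg_power` are mine:

```python
def energy(x):
    # E = sum over n of |x(n)|^2, for a finite-support signal, as in (1)
    return sum(abs(v) ** 2 for v in x)

def avg_power(x):
    # P approximated by averaging |x(n)|^2 over the samples we have,
    # i.e., a finite-window version of (2)
    return energy(x) / len(x)

# For x = [1, -2, 2]: E = 1 + 4 + 4 = 9 and P = 9/3 = 3
x = [1, -2, 2]
```

For a true power signal the average in `avg_power` would be taken over an ever-growing window, as the limit in (2) indicates.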
3.2 Decibel Notation
We often use a decibel (dB) scale for power. If P_lin is the power in Watts, then

[P]_dBW = 10 log10(P_lin)

Decibels are more general: they can apply to other unitless quantities as well, such as a gain (loss) L(f) through
a filter H(f),

[L(f)]_dB = 10 log10 |H(f)|^2    (3)
Note: Why is the capital B used? One explanation is that the lowercase ‘b’ in the SI system is reserved for bits, so when the
‘bel’ was first proposed as log10(·), it was capitalized; another is that it referred to the name ‘Bell’, so it was capitalized. In
either case, we use the unit decibel, 10 log10(·), which is then abbreviated as dB in the SI system.
Note that (3) could also be written as:

[L(f)]_dB = 20 log10 |H(f)|    (4)

Be careful with your use of 10 vs. 20 in the dB formula.
• Only use 20 as the multiplier if you are converting from voltage to power; i.e., taking the log10 of a voltage
and expecting the result to be a dB power value.
Our standard is to consider power gains and losses, not voltage gains and losses. So if we say, for example,
the channel has a loss of 20 dB, this refers to a loss in power. In particular, the output of the channel has 100
times less power than the input to the channel.
Remember these two dB numbers:
• 3 dB: This means the number is double in linear terms.
• 10 dB: This means the number is ten times in linear terms.
And maybe this one:
• 1 dB: This means the number is a little over 25% more (multiply by 5/4) in linear terms.
With these three numbers, you can quickly convert losses or gains between linear and dB units without a
calculator. Just convert any dB number into a sum of multiples of 10, 3, and 1.
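The decomposition trick is easy to check against the exact formulas. A small Python sketch (function names are mine):

```python
import math

def db_to_linear(db):
    # Invert [P]_dB = 10 log10(P_lin)
    return 10 ** (db / 10)

def linear_to_db(lin):
    return 10 * math.log10(lin)

# Mental math: 33 dB = 10 + 10 + 10 + 3 dB, so roughly 10 * 10 * 10 * 2 = 2000
exact = db_to_linear(33)   # about 1995.3, close to the mental estimate of 2000
```

The 3 dB ≈ ×2 and 1 dB ≈ ×5/4 rules are approximations; the exact factors are 10^0.3 ≈ 1.995 and 10^0.1 ≈ 1.259.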
Example: Convert dB to linear values:
1. 30 dBW
2. 33 dBm
3. 20 dB
4. 4 dB
Example: Convert linear values to dB:
1. 0.2 W
2. 40 mW
Example: Convert power relationships to dB:
Convert the expression to one which involves only dB terms.
1. P_y,lin = 100 P_x,lin

2. P_o,lin = G_connector,lin · L_cable,lin^{−d}, where P_o,lin is the received power in a fiber-optic link, d is the cable
length (typically in units of km), G_connector,lin is the gain in any connectors, and L_cable,lin is the loss in a 1
km cable.

3. P_r,lin = P_t,lin G_t,lin G_r,lin λ^2 / (4πd)^2, where λ is the wavelength (m), d is the path length (m), G_t,lin and G_r,lin
are the linear gains in the transmit and receive antennas, P_t,lin is the transmit power (W), and P_r,lin is the received power (W).
This is the Friis free space path loss formula.
These last two are what we will need in Section 6.4, when we discuss link budgets. The main idea is that
we have a limited amount of power which will be available at the receiver.
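As a preview of link budgets, the dB form of the Friis formula above can be evaluated directly. A Python sketch with illustrative numbers of my choosing (not from the notes):

```python
import math

def friis_received_dbm(p_t_dbm, g_t_db, g_r_db, freq_hz, dist_m):
    # dB form of Friis: P_r = P_t + G_t + G_r + 20 log10(lambda / (4 pi d)),
    # where all powers are in dBm and gains in dB
    lam = 3e8 / freq_hz  # wavelength in meters
    return p_t_dbm + g_t_db + g_r_db + 20 * math.log10(lam / (4 * math.pi * dist_m))

# Illustrative link: 30 dBm transmitter, 0 dB antenna gains, 2.4 GHz, 1 km path
p_r = friis_received_dbm(30.0, 0.0, 0.0, 2.4e9, 1000.0)  # about -70 dBm
```

Note how the multiplicative linear-terms formula becomes a simple sum of dB terms, which is why link budgets are done in dB.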
4 Time-Domain Concept Review
4.1 Periodicity
Def ’n: Periodic (continuous-time)
A signal x(t) is periodic if x(t) = x(t + T_0) for some constant T_0 ≠ 0, for all t ∈ R. The smallest such constant
T_0 > 0 is the period.

If a signal is not periodic it is aperiodic.
Periodic signals have Fourier series representations, as defined in Rice Ch. 2.

Def ’n: Periodic (discrete-time)
A DT signal x(n) is periodic if x(n) = x(n + N_0) for some integer N_0 ≠ 0, for all integers n. The smallest
positive integer N_0 is the period.
4.2 Impulse Functions
Def ’n: Impulse Function
The (Dirac) impulse function δ(t) is the function which makes

∫_{−∞}^{∞} x(t)δ(t)dt = x(0)    (5)

true for any function x(t) which is continuous at t = 0.

We are defining a function by its most important property, the ‘sifting property’. Is there another definition
which is more familiar?
Solution:

δ(t) = lim_{T→0} { 1/(2T), −T ≤ t ≤ T;  0, o.w. }

You can visualize δ(t) here as an infinitely high, infinitesimally wide pulse at the origin, with area one. This is
why it ‘pulls out’ the value of x(t) in the integral in (5).
Other properties of the impulse function:
• Time scaling,
• Symmetry,
• Sifting at arbitrary time t_0,

The continuous-time unit step function is

u(t) = { 1, t ≥ 0;  0, o.w. }
Example: Sifting Property
What is

∫_{−∞}^{∞} (sin(πt)/(πt)) δ(1 − t) dt ?

The discrete-time impulse function (also called the Kronecker delta or δ_K) is defined as:

δ(n) = { 1, n = 0;  0, o.w. }

(There is no need to get complicated with the math; this is well defined.) Also,

u(n) = { 1, n ≥ 0;  0, o.w. }
5 Bandwidth
Bandwidth is another critical resource for a digital communications system; we have various deﬁnitions to
quantify it. In short, it isn’t easy to describe a signal in the frequency domain with a single number. And, in
the end, a system will be designed to meet a spectral mask required by the FCC or system standard.
Table 1 organizes the frequency transforms by whether time is continuous or discrete, and whether the signal is periodic or aperiodic.

Continuous-time, in general: Laplace Transform, x(t) ↔ X(s).
Continuous-time, periodic: Fourier Series, x(t) ↔ a_k.
Continuous-time, aperiodic: Fourier Transform, x(t) ↔ X(jω), where

X(jω) = ∫_{t=−∞}^{∞} x(t) e^{−jωt} dt,    x(t) = (1/(2π)) ∫_{ω=−∞}^{∞} X(jω) e^{jωt} dω

Discrete-time, in general: z-Transform, x(n) ↔ X(z).
Discrete-time, periodic: Discrete Fourier Transform (DFT), x(n) ↔ X[k], where

X[k] = Σ_{n=0}^{N−1} x(n) e^{−j(2π/N)kn},    x(n) = (1/N) Σ_{k=0}^{N−1} X[k] e^{j(2π/N)nk}

Discrete-time, aperiodic: Discrete Time Fourier Transform (DTFT), x(n) ↔ X(e^{jΩ}), where

X(e^{jΩ}) = Σ_{n=−∞}^{∞} x(n) e^{−jΩn},    x(n) = (1/(2π)) ∫_{Ω=−π}^{π} X(e^{jΩ}) e^{jΩn} dΩ

Table 1: Frequency Transforms
Intuitively, bandwidth is the maximum extent of our signal’s frequency domain characterization, call it X(f).
A baseband signal’s absolute bandwidth is often defined as the W such that X(f) = 0 for all f except for the
range −W ≤ f ≤ W. Other definitions for bandwidth are
• 3-dB bandwidth: B_3dB is the value of f such that |X(f)|^2 = |X(0)|^2/2.
• 90% bandwidth: B_90% is the value which captures 90% of the energy in the signal:

∫_{−B_90%}^{B_90%} |X(f)|^2 df = 0.90 ∫_{−∞}^{∞} |X(f)|^2 df
As a motivating example, I mention the square-root raised cosine (SRRC) pulse, which has the following
desirable Fourier transform:

H_RRC(f) =
  √(T_s),                                                        0 ≤ |f| ≤ (1−α)/(2T_s)
  √( (T_s/2) [ 1 + cos( (πT_s/α)( |f| − (1−α)/(2T_s) ) ) ] ),    (1−α)/(2T_s) ≤ |f| ≤ (1+α)/(2T_s)
  0,                                                             o.w.
    (6)
where α is a parameter called the “rolloﬀ factor”. We can actually analyze this using the properties of the
Fourier transform and many of the standard transforms you’ll ﬁnd in a Fourier transform table.
The SRRC and other pulse shapes are discussed in Appendix A, and we will go into more detail later on.
The purpose so far is to motivate practicing up on frequency transforms.
5.1 Continuous-time Frequency Transforms
Notes about continuous-time frequency transforms:
1. You are probably most familiar with the Laplace Transform. To convert it to the Fourier transform, we
replace s with jω, where ω is the radial frequency, with units radians per second (rad/s).
2. You may prefer the radial frequency representation, but also feel free to use the rotational frequency f
(which has units of cycles per second, or Hz). Frequency in Hz is more standard for communications; you
should use it for intuition. In this case, just substitute ω = 2πf. You could write X(j2πf) as the notation
for this, but typically you’ll see it abbreviated as X(f). Note that the definition of the Fourier transform
in the f domain loses the 1/(2π) in the inverse Fourier transform definition:

X(j2πf) = ∫_{t=−∞}^{∞} x(t) e^{−j2πft} dt,    x(t) = ∫_{f=−∞}^{∞} X(j2πf) e^{j2πft} df    (7)
3. The Fourier series is limited to purely periodic signals. Both Laplace and Fourier transforms are not
limited to periodic signals.
4. Note that e^{jα} = cos(α) + j sin(α).
See Table 2.4.4 in the Rice book.
Example: Square Wave
Given a rectangular pulse x(t) = rect(t/T_s),

x(t) = { 1, −T_s/2 < t ≤ T_s/2;  0, o.w. }
What is the Fourier transform X(f)? Calculate both from the deﬁnition and from a table.
Solution: Method 1: From the definition:

X(jω) = ∫_{t=−T_s/2}^{T_s/2} e^{−jωt} dt
      = (1/(−jω)) e^{−jωt} |_{t=−T_s/2}^{T_s/2}
      = (1/(−jω)) ( e^{−jωT_s/2} − e^{jωT_s/2} )
      = 2 sin(ωT_s/2)/ω
      = T_s · sin(ωT_s/2)/(ωT_s/2)
This uses the fact that (1/(−2j)) ( e^{−jα} − e^{jα} ) = sin(α). While it is sometimes convenient to replace sin(πx)/(πx) with
sinc(x), it is confusing because sinc(x) is sometimes defined as sin(πx)/(πx) and sometimes defined as (sin x)/x.
No standard definition for ‘sinc’ exists! Rather than make a mistake because of this, the Rice book always
writes out the expression fully. I will try to follow suit.
Method 2: From the tables and properties:

x(t) = g(2t)  where  g(t) = { 1, |t| < T_s;  0, o.w. }    (8)

From the table, G(jω) = 2T_s sin(ωT_s)/(ωT_s). From the properties, X(jω) = (1/2) G(jω/2). So

X(jω) = T_s · sin(ωT_s/2)/(ωT_s/2)
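The closed-form result can be sanity-checked numerically by approximating the defining integral with a Riemann sum. A Python sketch (function names are mine):

```python
import cmath
import math

def rect_ft_numeric(omega, Ts, n=10000):
    # Midpoint-rule approximation of X(jw) = integral_{-Ts/2}^{Ts/2} e^{-jwt} dt
    dt = Ts / n
    return sum(cmath.exp(-1j * omega * (-Ts / 2 + (k + 0.5) * dt)) * dt
               for k in range(n))

def rect_ft_closed(omega, Ts):
    # X(jw) = Ts * sin(w Ts / 2) / (w Ts / 2)
    a = omega * Ts / 2
    return Ts if a == 0 else Ts * math.sin(a) / a

Ts = 0.2
w = 2 * math.pi * 3.0       # evaluate at f = 3 Hz
num = rect_ft_numeric(w, Ts)
closed = rect_ft_closed(w, Ts)
```

Because the pulse is real and even, the numerically computed transform is (to rounding error) purely real, matching the closed form.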
See Figure 4(a).

Figure 4: (a) Fourier transform X(j2πf) of the rect pulse of duration T_s, equal to T_s sinc(πfT_s), and (b) power
vs. frequency, 20 log10(X(j2πf)/T_s), in dB.
Question: What if Y (jω) was a rect function? What would the inverse Fourier transform y(t) be?
5.1.1 Fourier Transform Properties
See Table 2.4.3 in the Rice book.
Assume that F{x(t)} = X(jω). Important properties of the Fourier transform:

1. Duality property:

x(jω) = F{X(−t)}
x(−jω) = F{X(t)}

(Confusing. It says that you can go backwards on the Fourier transform; just remember to flip the
result around the origin.)

2. Time shift property:

F{x(t − t_0)} = e^{−jωt_0} X(jω)

3. Scaling property: for any real a ≠ 0,

F{x(at)} = (1/|a|) X(jω/a)

4. Convolution property: if, additionally, y(t) has Fourier transform Y(jω),

F{x(t) ⋆ y(t)} = X(jω) · Y(jω)

5. Modulation property:

F{x(t) cos(ω_0 t)} = (1/2) X(j(ω − ω_0)) + (1/2) X(j(ω + ω_0))

6. Parseval’s theorem: The energy calculated in the frequency domain is equal to the energy calculated in
the time domain:

∫_{t=−∞}^{∞} |x(t)|^2 dt = ∫_{f=−∞}^{∞} |X(f)|^2 df = (1/(2π)) ∫_{ω=−∞}^{∞} |X(jω)|^2 dω

So do whichever one is easiest! Or, check your answer by doing both.
5.2 Linear Time Invariant (LTI) Filters
If a (deterministic) signal x(t) is input to an LTI filter with impulse response h(t), the output signal is

y(t) = h(t) ⋆ x(t)

Using the above convolution property,

Y(jω) = X(jω) · H(jω)
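The discrete-time analogue of the convolution property can be verified numerically: zero-pad two sequences, multiply their DFTs, and compare against direct convolution. A self-contained Python sketch (an O(N^2) DFT, fine for tiny examples; all names are mine):

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def conv(x, h):
    # Direct linear convolution, y = x * h
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

x = [1.0, 2.0, 3.0]
h = [1.0, -1.0]
N = len(x) + len(h) - 1     # zero-pad so circular convolution equals linear
X = dft(x + [0.0] * (N - len(x)))
H = dft(h + [0.0] * (N - len(h)))
y_freq = [v.real for v in idft([Xk * Hk for Xk, Hk in zip(X, H)])]
y_time = conv(x, h)
```

Both paths yield the same output sequence, which is the discrete counterpart of Y(jω) = X(jω)·H(jω).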
5.3 Examples
Example: Applying FT Properties
If w(t) has the Fourier transform

W(jω) = jω / (1 + jω)

find X(jω) for the following waveforms:
1. x(t) = w(2t + 2)
2. x(t) = e^{−jt} w(t − 1)
3. x(t) = 2 ∂w(t)/∂t
4. x(t) = w(1 − t)
Solution: To be worked out in class.
Lecture 3
Today: (1) Bandpass Signals (2) Sampling
Solution: From the Examples at the end of Lecture 2:
1. Let x(t) = z(2t) where z(t) = w(t + 2). Then

Z(jω) = e^{j2ω} · jω/(1 + jω),  and  X(jω) = (1/2) Z(jω/2).

So X(jω) = (1/2) e^{jω} (jω/2)/(1 + jω/2). Alternatively, let x(t) = r(t + 1) where r(t) = w(2t). Then

R(jω) = (1/2) W(jω/2),  and  X(jω) = e^{jω} R(jω).

Again, X(jω) = (1/2) e^{jω} (jω/2)/(1 + jω/2).

2. Let x(t) = e^{−jt} z(t) where z(t) = w(t − 1). Then Z(jω) = e^{−jω} W(jω), and X(jω) = Z(j(ω + 1)). So
X(jω) = e^{−j(ω+1)} W(j(ω + 1)) = e^{−j(ω+1)} · j(ω + 1)/(1 + j(ω + 1)).

3. X(jω) = 2jω · jω/(1 + jω) = −2ω^2/(1 + jω).

4. Let x(t) = y(−t), where y(t) = w(1 + t). Then

X(jω) = Y(−jω),  where  Y(jω) = e^{jω} W(jω) = e^{jω} · jω/(1 + jω).

So X(jω) = e^{−jω} (−jω)/(1 − jω).
6 Bandpass Signals
Def ’n: Bandpass Signal
A bandpass signal x(t) has Fourier transform X(jω) = 0 for all ω such that |ω ± ω_c| > W/2, where W/2 < ω_c.
Equivalently, a bandpass signal x(t) has Fourier transform X(f) = 0 for all f such that |f ± f_c| > B/2, where
B/2 < f_c.
Figure 5: Bandpass signal X(f) with center frequency f_c and bandwidth B.

Realistically, X(jω) will not be exactly zero outside of the bandwidth; there will be some ‘sidelobes’ outside of
the main band.
6.1 Upconversion
We can take a baseband signal x_C(t) and ‘upconvert’ it to be a bandpass signal by multiplying with a cosine at
the desired center frequency, as shown in Fig. 6.

Figure 6: Modulator block diagram: x_C(t) is multiplied by cos(2πf_c t).
Example: Modulated Square Wave
We dealt last time with the square wave x(t) = rect(t/T_s). This time, we will look at:

x(t) = rect(t/T_s) cos(ω_c t)

where ω_c is the center frequency in radians/sec (f_c = ω_c/(2π) is the center frequency in Hz). Note that I use
“rect” as follows:

rect(t) = { 1, −1/2 < t ≤ 1/2;  0, o.w. }

What is X(jω)?

Solution:

X(jω) = (1/(2π)) F{rect(t/T_s)} ⋆ F{cos(ω_c t)}
X(jω) = (1/(2π)) T_s [sin(ωT_s/2)/(ωT_s/2)] ⋆ π [δ(ω − ω_c) + δ(ω + ω_c)]
X(jω) = (T_s/2) · sin[(ω − ω_c)T_s/2] / [(ω − ω_c)T_s/2] + (T_s/2) · sin[(ω + ω_c)T_s/2] / [(ω + ω_c)T_s/2]
The plot of X(j2πf) is shown in Fig. 7.

Figure 7: Plot of X(j2πf) for the modulated square wave example, with copies of the pulse spectrum centered
at ±f_c.
6.2 Downconversion of Bandpass Signals
How will we get back the desired baseband signal x_C(t) at the receiver?
1. Multiply (again) by the carrier 2 cos(ω_c t).
2. Input to a lowpass filter (LPF) with cutoff frequency ω_c.
This is shown in Fig. 8.

Figure 8: Demodulator block diagram: x(t) is multiplied by 2 cos(2πf_c t) and lowpass filtered.
Assume the general modulated signal was

x(t) = x_C(t) cos(ω_c t)
X(jω) = (1/2) X_C(j(ω − ω_c)) + (1/2) X_C(j(ω + ω_c))
What happens after the multiplication with the carrier in the receiver?
Solution:

X_1(jω) = (1/(2π)) X(jω) ⋆ 2π [δ(ω − ω_c) + δ(ω + ω_c)]
X_1(jω) = (1/2) X_C(j(ω − 2ω_c)) + (1/2) X_C(jω) + (1/2) X_C(jω) + (1/2) X_C(j(ω + 2ω_c))

There are components at 2ω_c and −2ω_c which will be canceled by the LPF. (Does the LPF need to be ideal?)

X_2(jω) = X_C(jω)

So x_2(t) = x_C(t).
7 Sampling
A common statement of the Nyquist sampling theorem is that a signal can be sampled at twice its bandwidth.
But the theorem really has to do with signal reconstruction from a sampled signal.

Theorem: (Nyquist Sampling Theorem.) Let x_c(t) be a baseband, continuous signal with bandwidth B (in
Hz), i.e., X_c(jω) = 0 for all |ω| ≥ 2πB. Let x_c(t) be sampled at multiples of T, where 1/T ≥ 2B, to yield the
sequence {x_c(nT)}_{n=−∞}^{∞}. Then

x_c(t) = 2BT Σ_{n=−∞}^{∞} x_c(nT) · sin(2πB(t − nT)) / (2πB(t − nT)).    (9)

Proof: Not covered.
Notes:
• This is an interpolation procedure.
• Given {x_n}_{n=−∞}^{∞}, how would you find the maximum of x(t)?
• This is only precise when X(jω) = 0 for all |ω| ≥ 2πB.
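The interpolation formula (9) is straightforward to implement over a finite set of samples. A Python sketch (the `sinc_interp` name is mine); at critical sampling, 1/T = 2B, the kernel equals 1 at its own sample time and 0 at every other sample time, so the interpolated curve passes through the samples exactly:

```python
import math

def sinc_interp(samples, T, B, t):
    # Truncated version of (9): x(t) = 2BT * sum_n x(nT) sin(2piB(t-nT))/(2piB(t-nT))
    total = 0.0
    for n, xn in enumerate(samples):
        arg = 2 * math.pi * B * (t - n * T)
        total += xn * (1.0 if arg == 0 else math.sin(arg) / arg)
    return 2 * B * T * total

# Critical sampling: 1/T = 2B, so 2BT = 1
T, B = 0.1, 5.0
samples = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
```

For times between samples the sum blends neighboring samples with sinc weights, which is what produces the smooth interpolated curves in the figures below.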
7.1 Aliasing Due To Sampling
Essentially, sampling is the multiplication of an impulse train (at period T) with the desired signal x(t):

x_sa(t) = x(t) Σ_{n=−∞}^{∞} δ(t − nT)
x_sa(t) = Σ_{n=−∞}^{∞} x(nT) δ(t − nT)

What is the Fourier transform of x_sa(t)?
Solution: In the frequency domain, this is a convolution:

X_sa(jω) = (1/(2π)) X(jω) ⋆ (2π/T) Σ_{n=−∞}^{∞} δ(ω − 2πn/T)
         = (1/T) Σ_{n=−∞}^{∞} X(j(ω − 2πn/T))   for all ω    (10)
         = (1/T) X(jω)   for |ω| < 2πB

This is shown graphically in the Rice book in Figure 2.12.
The Fourier transform of the sampled signal is many copies of X(jω) strung at integer multiples of 2π/T,
as shown in Fig. 9.

Figure 9: The effect of sampling on the frequency spectrum in terms of frequency f in Hz.
Example: Sinusoid sampled above and below Nyquist rate
Consider two sinusoidal signals sampled at 1/T = 25 Hz:

x_1(nT) = sin(2π5nT)
x_2(nT) = sin(2π20nT)

What are the two frequencies of the sinusoids, and what is the Nyquist rate? Which of them does the Nyquist
theorem apply to? Draw the spectra of the continuous signals x_1(t) and x_2(t), and indicate what the spectrum
is of the sampled signals.
Figure 10 shows what happens when the Nyquist interpolation formula is applied to each signal (whether or not it is
valid). What observations would you make about Figure 10(b), compared to Figure 10(a)?
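The aliasing in this example can be demonstrated directly: the samples of the 20 Hz sinusoid (above the 12.5 Hz Nyquist frequency) are identical to the samples of a sinusoid aliased down to 5 Hz (with a sign flip). A short Python sketch:

```python
import math

T = 1 / 25  # sampling period for a 25 Hz sampling rate

# sin(2pi*20*nT) = sin(2pi*n - 2pi*5*nT) = -sin(2pi*5*nT),
# so the 20 Hz samples are indistinguishable from (negated) 5 Hz samples
x2 = [math.sin(2 * math.pi * 20 * n * T) for n in range(25)]
alias = [-math.sin(2 * math.pi * 5 * n * T) for n in range(25)]
```

This is why the interpolated curve in Figure 10(b) looks like a low-frequency sinusoid rather than the original 20 Hz signal.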
Example: Square vs. round pulse shape
Consider the square pulse considered before, x
1
(t) = rect(t/T
s
). Also consider a parabola pulse (this doesn’t
Figure 10: Sampled (a) x_1(nT) and (b) x_2(nT) are interpolated (—) using the Nyquist interpolation formula.
really exist in the wild; I’m making it up for an example):

x_2(t) = { 1 − (2t/T_s)^2,  −T_s/2 ≤ t ≤ T_s/2;  0, o.w. }

What happens to x_1(t) and x_2(t) when they are sampled with period T?
In the Matlab code EgNyquistInterpolation.m we set T_s = 1/5 and T = 1/25. Then, the sampled pulses
are interpolated using (9). Even though we’ve sampled at a pretty high rate, the reconstructed signals will not
be perfect representations of the original, in particular for x_1(t). See Figure 11.
7.2 Connection to DTFT
Recall (10). We calculated the Fourier transform of the product by doing the convolution in the frequency
domain. Instead, we can calculate it directly from the formula.
Figure 11: Sampled (a) x_1(nT) and (b) x_2(nT) are interpolated (—) using the Nyquist interpolation formula.
Solution:

X_sa(jω) = ∫_{t=−∞}^{∞} Σ_{n=−∞}^{∞} x(nT) δ(t − nT) e^{−jωt} dt
         = Σ_{n=−∞}^{∞} x(nT) ∫_{t=−∞}^{∞} δ(t − nT) e^{−jωt} dt    (11)
         = Σ_{n=−∞}^{∞} x(nT) e^{−jωnT}

The discrete-time Fourier transform of x_sa(t), which I denote DTFT{x_sa(t)}, is:

DTFT{x_sa(t)} = X_d(e^{jΩ}) = Σ_{n=−∞}^{∞} x(nT) e^{−jΩn}

Essentially, the difference between the DTFT and the Fourier transform of the sampled signal is the relationship

Ω = ωT = 2πfT
But this defines the relationship only between the Fourier transforms of the sampled signal. How can we
relate this to the FT of the continuous-time signal? A: Using (10). We have that X_d(e^{jΩ}) = X_sa(jΩ/T). Then
plugging into (10),

Solution:

X_d(e^{jΩ}) = X_sa(jΩ/T) = (1/T) Σ_{n=−∞}^{∞} X(j( Ω/T − 2πn/T ))
            = (1/T) Σ_{n=−∞}^{∞} X(j(Ω − 2πn)/T)    (12)

If x(t) is sufficiently bandlimited, X_d(e^{jΩ}) ∝ X(jΩ/T) in the interval −π < Ω < π. This relationship holds
for the Fourier transform of the original, continuous-time signal between −π < Ω < π if and only if the
original signal x(t) is sufficiently bandlimited so that the Nyquist sampling theorem applies.
Notes:
• Most things in the DTFT table in Table 2.8 (page 68) are analogous to continuous functions which are
not bandlimited. So, don’t expect the last form to apply to any continuous function.
• The DTFT is periodic in Ω with period 2π.
• Don’t be confused: the DTFT is a continuous function of Ω.
Lecture 4
Today: (1) Example: Bandpass Sampling (2) Orthogonality / Signal Spaces I
7.3 Bandpass sampling
Bandpass sampling is the use of sampling to perform downconversion via frequency aliasing. We assume that
the signal is truly a bandpass signal (has no DC component). Then, we sample with rate 1/T and we rely on
aliasing to produce an image of the spectrum at a relatively low frequency (less than one-half the sampling rate).

Example: Bandpass Sampling
This is Example 2.6.2 (pp. 60-65) in the Rice book. In short, we have a bandpass signal in the frequency band
450 Hz to 460 Hz. The center frequency is 455 Hz. We want to sample the signal. We have two choices:
1. Sample at more than twice the maximum frequency.
2. Sample at more than twice the bandwidth.
The first gives a sampling frequency of more than 920 Hz. The latter gives a sampling frequency of more than
20 Hz. There is clearly a benefit to the second choice. However, traditionally, we would need to downconvert the
signal to baseband in order to sample it at the lower rate. (See Lecture 3 notes.)
What is the sampled spectrum, and what is the bandwidth of the sampled signal in radians/sample if the
sampling rate is:
1. 1000 samples/sec?
2. 140 samples/sec?

Solution:
1. Sample at 1000 samples/sec: Ω_c = 2π(455/1000) = (455/500)π. The signal is very sparse; it occupies only
2π · (10 Hz)/(1000 samples/sec) = π/50 radians/sample.
2. Sample at 140 samples/sec: Copies of the entire frequency spectrum will appear in the frequency
domain at multiples of 140 Hz, or 2π140 radians/sec. The discrete-time frequency is modulo 2π, so the
center frequency of the sampled signal is:

Ω_c = 2π(455/140) mod 2π = (13π/2) mod 2π = π/2

The value π/2 radians/sample is equivalent to 35 cycles/second. The bandwidth of the sampled spectrum is
2π(10/140) = π/7 radians/sample.
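The modulo-2π reduction in this example can be sketched as a small helper (the function name is mine); for a real signal, a discrete-time frequency above π folds back into [0, π]:

```python
import math

def aliased_center(fc_hz, fs_hz):
    # Discrete-time center frequency Omega_c = 2*pi*fc/fs, reduced mod 2*pi and
    # folded into [0, pi] (real signals have conjugate-symmetric spectra)
    omega = (2 * math.pi * fc_hz / fs_hz) % (2 * math.pi)
    return omega if omega <= math.pi else 2 * math.pi - omega

# The 455 Hz bandpass signal from the example:
w1 = aliased_center(455, 1000)   # (455/500)*pi = 0.91*pi rad/sample
w2 = aliased_center(455, 140)    # pi/2 rad/sample, i.e., 35 cycles/sec
```

Converting back, w2/(2π) · 140 samples/sec = 35 cycles/sec, matching the value above.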
8 Orthogonality
From the ﬁrst lecture, we talked about how digital communications is largely about using (and choosing some
combination of) a discrete set of waveforms to convey information on the (continuoustime, realvalued) channel.
At the receiver, the idea is to determine which waveforms were sent and in what quantities.
This choice of waveforms is partially determined by bandwidth, which we have covered. It is also determined
by orthogonality. Loosely, orthogonality of waveforms means that some combination of the waveforms can be
uniquely separated at the receiver. This is the critical concept needed to understand digital receivers. To build
a better framework for it, we’ll start with something you’re probably more familiar with: vector multiplication.
8.1 Inner Product of Vectors
We often use an idea called inner product, which you’re familiar with from vectors (as the dot product). If we
have two vectors x = [x_1, x_2, . . . , x_n]^T and y = [y_1, y_2, . . . , y_n]^T, then we can take their inner product as:

x^T y = Σ_i x_i y_i

Note the transpose T ‘in-between’ an inner product. The inner product is also denoted ⟨x, y⟩. Finally, the norm
of a vector is the square root of the inner product of a vector with itself,

‖x‖ = √⟨x, x⟩
A norm is always indicated with the ‖ · ‖ symbols outside of the vector.

Example: Example Inner Products
Let x = [1, 1, 1]^T, y = [0, 1, 1]^T, z = [1, 0, 0]^T. What are:
• ⟨x, y⟩?
• ⟨z, y⟩?
• ‖x‖?
• ‖z‖?

Recall the inner product between two vectors can also be written as

x^T y = ‖x‖ ‖y‖ cos θ

where θ is the angle between vectors x and y. In words, the inner product is a measure of ‘same-directionness’:
when positive, the two vectors are pointing in a similar direction; when negative, they are pointing in somewhat
opposite directions; when zero, they’re at cross-directions.
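The example inner products above can be checked in a few lines of Python (helper names are mine):

```python
import math

def inner(x, y):
    # <x, y> = sum_i x_i * y_i
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # ||x|| = sqrt(<x, x>)
    return math.sqrt(inner(x, x))

x = [1, 1, 1]
y = [0, 1, 1]
z = [1, 0, 0]
# <x,y> = 2, <z,y> = 0 (z and y are orthogonal), ||x|| = sqrt(3), ||z|| = 1
```

Note that ⟨z, y⟩ = 0 confirms the geometric picture: z and y point at cross-directions (θ = 90°).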
8.2 Inner Product of Functions
The inner product is not limited to vectors. It also applies to the ‘space’ of square-integrable real functions,
i.e., those functions x(t) for which

∫_{−∞}^{∞} x^2(t) dt < ∞.

The ‘space’ of square-integrable complex functions, i.e., those functions x(t) for which

∫_{−∞}^{∞} |x(t)|^2 dt < ∞

also has an inner product. These inner products are:

⟨x(t), y(t)⟩ = ∫_{−∞}^{∞} x(t) y(t) dt       (real functions)
⟨x(t), y(t)⟩ = ∫_{−∞}^{∞} x(t) y*(t) dt      (complex functions)

As a result,

‖x(t)‖ = √( ∫_{−∞}^{∞} x^2(t) dt )           (real functions)
‖x(t)‖ = √( ∫_{−∞}^{∞} |x(t)|^2 dt )         (complex functions)
Example: Sine and Cosine
Let

x(t) = { cos(2πt), 0 < t ≤ 1;  0, o.w. }
y(t) = { sin(2πt), 0 < t ≤ 1;  0, o.w. }

What are ‖x(t)‖ and ‖y(t)‖? What is ⟨x(t), y(t)⟩?

Solution: Using cos^2 x = (1/2)(1 + cos 2x),

‖x(t)‖ = √( ∫_0^1 cos^2(2πt) dt ) = √( (1/2) [ t + (1/(4π)) sin(4πt) ] |_0^1 ) = √(1/2)

The solution for ‖y(t)‖ also turns out to be √(1/2). Using sin 2x = 2 cos x sin x, the inner product is,

⟨x(t), y(t)⟩ = ∫_0^1 cos(2πt) sin(2πt) dt
            = ∫_0^1 (1/2) sin(4πt) dt
            = (−1/(8π)) cos(4πt) |_0^1
            = (−1/(8π)) (1 − 1) = 0
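These closed-form results can be verified by numerical integration. A Python sketch using a midpoint-rule approximation of the inner product integral (the `inner_product` name is mine):

```python
import math

def inner_product(f, g, a=0.0, b=1.0, n=100000):
    # Midpoint-rule approximation of integral_a^b f(t) g(t) dt
    dt = (b - a) / n
    return sum(f(a + (k + 0.5) * dt) * g(a + (k + 0.5) * dt) * dt
               for k in range(n))

x = lambda t: math.cos(2 * math.pi * t)
y = lambda t: math.sin(2 * math.pi * t)

norm_x = math.sqrt(inner_product(x, x))   # sqrt(1/2), about 0.7071
xy = inner_product(x, y)                  # about 0: the signals are orthogonal
```

The same numeric check works for any pair of candidate basis functions, which is handy when an antiderivative is not obvious.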
8.3 Definitions
Def ’n: Orthogonal
Two signals x(t) and y(t) are orthogonal if

⟨x(t), y(t)⟩ = 0

Def ’n: Orthonormal
Two signals x(t) and y(t) are orthonormal if they are orthogonal and they both have norm 1, i.e.,

‖x(t)‖ = 1    ‖y(t)‖ = 1

(a) Are x(t) and y(t) from the sine/cosine example above orthogonal? (b) Are they orthonormal?
Solution: (a) Yes. (b) No. They would be orthonormal if they had been scaled by √2.
Figure 12: Two Walsh-Hadamard functions, x_1(t) and x_2(t), each with amplitude A.
Example: Walsh-Hadamard 2 Functions
Let

x_1(t) = { A, 0 < t ≤ 0.5;  −A, 0.5 < t ≤ 1;  0, o.w. }
x_2(t) = { A, 0 < t ≤ 1;  0, o.w. }
These are shown in Fig. 12. (a) Are x_1(t) and x_2(t) orthogonal? (b) Are they orthonormal?
Solution: (a) Yes. (b) Only if A = 1.
8.4 Orthogonal Sets
Def ’n: Orthogonal Set
M signals x_1(t), . . . , x_M(t) are mutually orthogonal, or form an orthogonal set, if ⟨x_i(t), x_j(t)⟩ = 0 for all i ≠ j.

Def ’n: Orthonormal Set
M mutually orthogonal signals x_1(t), . . . , x_M(t) form an orthonormal set if ‖x_i‖ = 1 for all i.
Figure 13: Walsh-Hadamard length-2 and length-4 functions in image form. Each function is a row of the image,
where crimson red is 1 and black is −1.

Figure 14: Walsh-Hadamard length-64 functions in image form.
Do the 64 WH functions form an orthonormal set? Why or why not?
Solution: Yes. Every pair i, j with i ≠ j agrees half of the time and disagrees half of the time, so the product
of the two functions is +1 half the time and −1 half the time; the contributions cancel and the inner product
is zero. The norm of the function in row i is 1 since the amplitude is always +1 or −1.
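The Walsh-Hadamard functions are easy to generate and check programmatically. A Python sketch using the Sylvester construction (the `hadamard` name is mine); the rows of the resulting matrix are the length-64 functions:

```python
def hadamard(n):
    # Sylvester construction: H_1 = [1], and H_2N = [[H, H], [H, -H]]
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

H64 = hadamard(64)
# <row_i, row_j> = 64 when i == j and 0 otherwise: the rows are mutually
# orthogonal; scaling each entry by 1/8 would make the set orthonormal
```

As discrete vectors, each row has squared norm 64 (64 entries of ±1); as continuous functions of unit duration and unit amplitude, the norm is 1, which is the sense used in the solution above.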
Example: CDMA and Walsh-Hadamard
This example is just for your information (not for any test). The set of length-64 Walsh-Hadamard sequences
is used in CDMA (IS-95) in two ways:
1. Reverse Link: Modulation: For each group of 6 bits to be sent, the modulator chooses one of the 64
possible length-64 functions.
2. Forward Link: Channelization or Multiple Access: The base station assigns each user a single WH function
(channel). It uses 64 functions (channels). The data on each channel is multiplied by a Walsh function so
that it is orthogonal to the data received by all other users.
Lecture 5
Today: (1) Orthogonal Signal Representations
9 Orthonormal Signal Representations
Last time we deﬁned the inner product for vectors and for functions. We talked about orthogonality and
orthonormality of functions and sets of functions. Now, we consider how to take a set of arbitrary signals and
represent them as vectors in an orthonormal basis.
What are some common orthonormal signal representations?
• Nyquist sampling: sinc functions centered at sampling times.
• Fourier series: complex sinusoids at diﬀerent frequencies
• Sine and cosine at the same frequency
• Wavelets
And, we will come up with others. Each one has a limitation: only a certain set of functions can be exactly
represented in a particular signal representation. Essentially, we must limit ourselves to some subset of all
possible functions.
We refer to the set of arbitrary signals as:

S = {s_0(t), s_1(t), . . . , s_{M−1}(t)}
For example, a transmitter may be allowed to send any one of these M signals in order to convey information
to a receiver.
We’re going to introduce another set called an orthonormal basis:

B = {φ_0(t), φ_1(t), . . . , φ_{N−1}(t)}

Each function in B is called a basis function. You can think of B as an unambiguous and useful language to
represent the signals from our set S. Or, in analogy to color printers: a color printer can produce any of millions
of possible colors (signal set) using only black, red, blue, and green (basis set) (or black, cyan, magenta, yellow).
9.1 Orthonormal Bases
Def ’n: Span
The span of the set B is the set of all functions which are linear combinations of the functions in B. The span
is referred to as Span{B} and is

Span{B} = { Σ_{k=0}^{N−1} a_k φ_k(t) }_{a_0,...,a_{N−1} ∈ R}

Def ’n: Orthogonal Basis
For any arbitrary set of signals S = {s_i(t)}_{i=0}^{M−1}, the orthogonal basis is an orthogonal set of the smallest size N,
B = {φ_i(t)}_{i=0}^{N−1}, for which s_i(t) ∈ Span{B} for every i = 0, . . . , M − 1. The value of N is called the dimension
of the signal set.

Def ’n: Orthonormal Basis
The orthonormal basis is an orthogonal basis in which each basis function has norm 1.
Notes:
• The orthonormal basis is not unique. If you can find one basis, you can use, for example, {−φ_k(t)}_{k=0}^{N−1} as
another orthonormal basis.
• All orthogonal bases will have size N: no smaller basis can include all signals s_i(t) in its span, and a larger
set would not be a basis by definition.

The signal set might be a set of signals we can use to transmit particular bits (or sets of bits), as the
Walsh-Hadamard functions are used in the IS-95 reverse link. An orthonormal basis for an arbitrary signal set
tells us:
• how to build the receiver,
• how to represent the signals in ‘signal-space’, and
• how to quickly analyze the error rate of the scheme.
9.2 Synthesis
Consider one of the signals in our signal set, s_i(t). Given that it is in the span of our basis B, it can be
represented as a linear combination of the basis functions,

s_i(t) = Σ_{k=0}^{N−1} a_{i,k} φ_k(t)

The particular constants a_{i,k} are calculated using the inner product:

a_{i,k} = ⟨s_i(t), φ_k(t)⟩

The projection of one signal i onto basis function k is defined as a_{i,k} φ_k(t) = ⟨s_i(t), φ_k(t)⟩ φ_k(t). Then the signal
is equal to the sum of its projections onto the basis functions.
Why is this?
Solution: Since s_i(t) ∈ Span{B}, we know there are some constants {a_{i,k}}_{k=0}^{N−1} such that

s_i(t) = Σ_{k=0}^{N−1} a_{i,k} φ_k(t)

Taking the inner product of both sides with φ_j(t),

⟨s_i(t), φ_j(t)⟩ = ⟨ Σ_{k=0}^{N−1} a_{i,k} φ_k(t), φ_j(t) ⟩
⟨s_i(t), φ_j(t)⟩ = Σ_{k=0}^{N−1} a_{i,k} ⟨φ_k(t), φ_j(t)⟩
⟨s_i(t), φ_j(t)⟩ = a_{i,j}
So, now we can represent a signal by a vector,
    s_i = [a_{i,0}, a_{i,1}, \ldots, a_{i,N-1}]^T
This vector and the basis functions completely represent each signal. Plotting \{s_i\}_i in an N-dimensional grid is termed a constellation diagram. Generally, this space that we're plotting in is called signal space.
We can also synthesize any of our M signals in the signal set by adding the proper linear combination of the N bases. By choosing one of the M signals, we convey information, specifically, log_2 M bits of information. (Generally, we choose M to be a power of 2 for simplicity.)
See Figure 5.5 in Rice (page 254), which shows a block diagram of how a transmitter would synthesize one of the M signals to send, based on an input bitstream.
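The synthesis and analysis equations can be sketched numerically. The course's own code is in Matlab, so the Python below is only a sketch; the sampled unit pulses are stand-ins for the basis functions of the position-shifted pulse example that follows.

```python
import numpy as np

# Time axis sampled at dt; two orthonormal basis functions approximated as
# sampled unit pulses on [0,1) and [1,2).
dt = 0.001
t = np.arange(0, 2, dt)
phi0 = ((t >= 0) & (t < 1)).astype(float)
phi1 = ((t >= 1) & (t < 2)).astype(float)

def synthesize(a, basis):
    """Build s(t) = sum_k a_k * phi_k(t) from signal-space coefficients."""
    return sum(ak * phik for ak, phik in zip(a, basis))

# Synthesize s2(t) = u(t) - u(t-2), whose signal-space vector is [1, 1]^T.
s2 = synthesize([1.0, 1.0], [phi0, phi1])

# Recover the coefficients with the inner product a_k = <s(t), phi_k(t)>.
a0 = np.sum(s2 * phi0) * dt
a1 = np.sum(s2 * phi1) * dt
print(round(a0, 3), round(a1, 3))  # both approximately 1.0
```

The inner products recover exactly the coefficients used in synthesis, which is the point of the derivation above.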
Example: Position-shifted pulses
Plot the signal space diagram for the signals,
    s_0(t) = u(t) - u(t-1)
    s_1(t) = u(t-1) - u(t-2)
    s_2(t) = u(t) - u(t-2)
given the orthonormal basis,
    \phi_0(t) = u(t) - u(t-1)
    \phi_1(t) = u(t-1) - u(t-2)
What are the signal space vectors, s_0, s_1, and s_2?
Solution: They are s_0 = [1, 0]^T, s_1 = [0, 1]^T, and s_2 = [1, 1]^T. They are plotted in the signal space diagram in Figure 15.
Energy:
• Energy can be calculated in signal space as
    Energy\{s_i(t)\} = \int_{-\infty}^{\infty} s_i^2(t) dt = \sum_{k=0}^{N-1} a_{i,k}^2
Proof?
Figure 15: Signal space diagram for position-shifted signals example.
• We will find out later that it is the distances between the points in signal space which determine the bit error rate performance of the receiver,
    d_{i,j} = \sqrt{ \sum_{k=0}^{N-1} (a_{i,k} - a_{j,k})^2 }
for i, j in \{0, \ldots, M-1\}.
• Although different ON bases can be used, the energy and distance between points will not change.
Example: Amplitude-shifted signals
Now consider
    s_0(t) = 1.5[u(t) - u(t-1)]
    s_1(t) = 0.5[u(t) - u(t-1)]
    s_2(t) = -0.5[u(t) - u(t-1)]
    s_3(t) = -1.5[u(t) - u(t-1)]
and the orthonormal basis,
    \phi_1(t) = u(t) - u(t-1)
What are the signal space vectors for the signals \{s_i(t)\}? What are their energies?
Solution: s_0 = [1.5], s_1 = [0.5], s_2 = [-0.5], s_3 = [-1.5]. See Figure 16.
Figure 16: Signal space diagram for amplitude-shifted signals example.
Energies are just the squared magnitude of the vector: 2.25, 0.25, 0.25, and 2.25, respectively.
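As a quick numerical check (a Python sketch rather than the course's Matlab), the energies and a pairwise distance for these amplitude-shifted points can be computed directly from their signal-space vectors:

```python
import numpy as np

# Signal-space vectors for the amplitude-shifted example (one basis function).
points = {0: np.array([1.5]), 1: np.array([0.5]),
          2: np.array([-0.5]), 3: np.array([-1.5])}

# Energy of each signal: squared magnitude of its signal-space vector.
energies = {i: float(np.sum(v**2)) for i, v in points.items()}
print(energies)  # {0: 2.25, 1: 0.25, 2: 0.25, 3: 2.25}

# Distance between points i and j: d_ij = ||a_i - a_j||.
d01 = float(np.linalg.norm(points[0] - points[1]))
print(d01)  # 1.0
```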
9.3 Analysis
At a receiver, our job will be to analyze the received signal (a function) and to decide which of the M possible signals was sent. This is the task of analysis. It turns out that an orthonormal basis makes our analysis very straightforward and elegant.
We won't receive exactly what we sent; there will be additional functions added to the signal function we sent:
ECE 5520 Fall 2009 36
• Thermal noise
• Interference from other users
• Selfinterference
We won’t get into these problems today. But, we might say that if we send signal m, i.e., s
m
(t) from our signal
set, then we would receive
r(t) = s
m
(t) +w(t)
where the w(t) is the sum of all of the additive signals that we did not intend to receive. But w(t) might
not (probably not) be in the span of our basis B, so r(t) would not be in Span¦B¦ either. What is the best
approximation to r(t) in the signal space? Speciﬁcally, what is ˆ r(t) ∈ Span¦B¦ such that the energy of the
diﬀerence between ˆ r(t) and r(t) is minimized, i.e.,
argmin
ˆ r(t)∈Span{B}
_
∞
−∞
[ˆ r(t) −r(t)[
2
dt (13)
Solution: Since \hat{r}(t) ∈ Span{B}, it can be represented as a vector in signal space,
    x = [x_0, x_1, \ldots, x_{N-1}]^T
and the synthesis equation is
    \hat{r}(t) = \sum_{k=0}^{N-1} x_k \phi_k(t)
If you plug the above expression for \hat{r}(t) into (13), and then find the minimum with respect to each x_k, you'd see that the minimum error is at
    x_k = \int_{-\infty}^{\infty} r(t) \phi_k(t) dt
that is, x_k = \langle r(t), \phi_k(t) \rangle, for k = 0, \ldots, N-1.
Example: Analysis using a Walsh-Hadamard 2 basis
See Figure 17. Let s_0(t) = \phi_0(t) and s_1(t) = \phi_1(t). What is \hat{r}(t)?
Solution:
    \hat{r}(t) = 1 for 0 ≤ t < 1; -1/2 for 1 ≤ t < 2; 0 otherwise.
Lecture 6
Today: (1) Multivariate Distributions
Figure 17: Signal and basis functions for Analysis example.
10 Multi-Variate Distributions
For two random variables X_1 and X_2,
• Joint CDF: F_{X_1,X_2}(x_1, x_2) = P[\{X_1 ≤ x_1\} ∩ \{X_2 ≤ x_2\}]. It is the probability that both events happen simultaneously.
• Joint pmf: P_{X_1,X_2}(x_1, x_2) = P[\{X_1 = x_1\} ∩ \{X_2 = x_2\}]. It is the probability that both events happen simultaneously.
• Joint pdf: f_{X_1,X_2}(x_1, x_2) = \frac{\partial^2}{\partial x_1 \partial x_2} F_{X_1,X_2}(x_1, x_2)
The pdf and pmf integrate / sum to one, and are non-negative. The CDF is non-negative and non-decreasing, with lim_{x_i → -∞} F_{X_1,X_2}(x_1, x_2) = 0 and lim_{x_1, x_2 → +∞} F_{X_1,X_2}(x_1, x_2) = 1.
Note: Two errors in the Rice book dealing with the definition of the CDF. Eqn 4.9 should be:
    F_X(x) = P[X ≤ x] = \int_{-\infty}^{x} f_X(t) dt
and Eqn 4.15 should be:
    F_X(x) = P[X ≤ x]
To find the probability of an event, you integrate. For example, for event B ∈ S,
• Discrete case: P[B] = \sum_{(x_1, x_2) ∈ B} P_{X_1,X_2}(x_1, x_2)
• Continuous case: P[B] = \iint_{(x_1, x_2) ∈ B} f_{X_1,X_2}(x_1, x_2) dx_1 dx_2
The marginal distributions are:
• Marginal pmf: P_{X_2}(x_2) = \sum_{x_1 ∈ S_{X_1}} P_{X_1,X_2}(x_1, x_2)
• Marginal pdf: f_{X_2}(x_2) = \int_{x_1 ∈ S_{X_1}} f_{X_1,X_2}(x_1, x_2) dx_1
Two random variables X_1 and X_2 are independent iff for all x_1 and x_2,
• P_{X_1,X_2}(x_1, x_2) = P_{X_1}(x_1) P_{X_2}(x_2)
• f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2)
10.1 Random Vectors
Def'n: Random Vector
A random vector (R.V.) is a list of multiple random variables X_1, X_2, \ldots, X_n,
    X = [X_1, X_2, \ldots, X_n]^T
Here are the models of R.V.s:
1. The CDF of R.V. X is F_X(x) = F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = P[X_1 ≤ x_1, \ldots, X_n ≤ x_n].
2. The pmf of a discrete R.V. X is P_X(x) = P_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = P[X_1 = x_1, \ldots, X_n = x_n].
3. The pdf of a continuous R.V. X is f_X(x) = f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = \frac{\partial^n}{\partial x_1 \cdots \partial x_n} F_X(x).
10.2 Conditional Distributions
Given event B ∈ S which has P[B] > 0, the joint probability conditioned on event B is
• Discrete case:
    P_{X_1,X_2|B}(x_1, x_2) = P_{X_1,X_2}(x_1, x_2) / P[B] for (X_1, X_2) ∈ B, and 0 otherwise.
• Continuous case:
    f_{X_1,X_2|B}(x_1, x_2) = f_{X_1,X_2}(x_1, x_2) / P[B] for (X_1, X_2) ∈ B, and 0 otherwise.
Given r.v.s X_1 and X_2,
• Discrete case: The conditional pmf of X_1 given X_2 = x_2, where P_{X_2}(x_2) > 0, is
    P_{X_1|X_2}(x_1|x_2) = P_{X_1,X_2}(x_1, x_2) / P_{X_2}(x_2)
• Continuous case: The conditional pdf of X_1 given X_2 = x_2, where f_{X_2}(x_2) > 0, is
    f_{X_1|X_2}(x_1|x_2) = f_{X_1,X_2}(x_1, x_2) / f_{X_2}(x_2)
Def'n: Bayes' Law
Bayes' Law is a reformulation of the definition of the conditional pdf. It is written either as:
    f_{X_1,X_2}(x_1, x_2) = f_{X_2|X_1}(x_2|x_1) f_{X_1}(x_1)
or
    f_{X_1|X_2}(x_1|x_2) = \frac{f_{X_2|X_1}(x_2|x_1) f_{X_1}(x_1)}{f_{X_2}(x_2)}
10.3 Simulation of Digital Communication Systems
A simulation of a digital communication system is often used to estimate a bit error rate. Each bit can either be demodulated without error, or with error. Thus the simulation of one bit is a Bernoulli trial. This trial E_i is in error (E_i = 1) with probability p_e (the true bit error rate) and correct (E_i = 0) with probability 1 - p_e. What type of random variable is E_i?
Simulations run many bits, say N bits, through a model of the communication system, and count the number of bits that are in error. Let S = \sum_{i=1}^{N} E_i, and assume that \{E_i\} are independent and identically distributed (i.i.d.).
1. What type of random variable is S?
2. What is the pmf of S?
3. What is the mean and variance of S?
Solution: E_i is called a Bernoulli r.v. and S is called a Binomial r.v., with pmf
    P_S(s) = \binom{N}{s} p_e^s (1 - p_e)^{N-s}
The mean of S is the same as the mean of the sum of \{E_i\},
    E_S[S] = E_{\{E_i\}}\left[ \sum_{i=1}^{N} E_i \right] = \sum_{i=1}^{N} E_{E_i}[E_i] = \sum_{i=1}^{N} [(1 - p_e) \cdot 0 + p_e \cdot 1] = N p_e
We can find the variance of S the same way:
    Var_S[S] = Var_{\{E_i\}}\left[ \sum_{i=1}^{N} E_i \right] = \sum_{i=1}^{N} Var_{E_i}[E_i]
             = \sum_{i=1}^{N} [(1 - p_e)(0 - p_e)^2 + p_e (1 - p_e)^2]
             = \sum_{i=1}^{N} [(1 - p_e) p_e^2 + (1 - p_e)(p_e - p_e^2)]
             = N p_e (1 - p_e)
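The Binomial pmf, mean Np_e, and variance Np_e(1 - p_e) can be checked numerically; the sketch below is in Python (the course uses Matlab), and the values N = 100, p_e = 0.01 are purely illustrative.

```python
from math import comb

N, pe = 100, 0.01  # illustrative simulation length and bit error rate

# Binomial pmf of S, the number of bit errors in N Bernoulli trials.
pmf = [comb(N, s) * pe**s * (1 - pe)**(N - s) for s in range(N + 1)]

# Mean and variance computed directly from the pmf.
mean = sum(s * p for s, p in enumerate(pmf))
var = sum((s - mean)**2 * p for s, p in enumerate(pmf))
print(round(mean, 6), round(var, 6))  # N*pe = 1.0 and N*pe*(1-pe) = 0.99
```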
We may also be interested in knowing how many bits to run in order to get an estimate of the bit error rate. For example, if we run a simulation and get zero bit errors, we won't have a very good idea of the bit error rate. Let T_1 be the time (number of bits) up to and including the first error.
1. What type of random variable is T_1?
2. What is the pmf of T_1?
3. What is the mean of T_1?
Solution: T_1 is a Geometric r.v. with pmf
    P_{T_1}(t) = (1 - p_e)^{t-1} p_e
The mean of T_1 is
    E[T_1] = 1 / p_e
Note the variance of T_1 is Var[T_1] = (1 - p_e)/p_e^2, so the standard deviation for very low p_e is almost the same as the expected value.
So, even if we run an experiment until the first bit error, our estimate of p_e will have relatively high variance.
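A quick numerical sketch (Python; the truncation bound Tmax is an arbitrary illustrative choice, large enough that the tail is negligible) confirming the geometric mean 1/p_e:

```python
pe = 0.01
Tmax = 10000  # truncate the infinite sum; the tail beyond this is negligible

# Geometric pmf P_{T1}(t) = (1 - pe)^(t-1) * pe for t = 1, 2, ...
pmf = [(1 - pe)**(t - 1) * pe for t in range(1, Tmax + 1)]
mean = sum(t * p for t, p in zip(range(1, Tmax + 1), pmf))
print(round(mean, 3))  # approximately 100.0 = 1/pe
```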
10.4 Mixed Discrete and Continuous Joint Variables
This was not covered in ECE 5510, although it was in the Yates & Goodman textbook. We'll often have X_1 discrete and X_2 continuous.
Example: Digital communication system in noise
Let X_1 be the transmitted signal voltage, and X_2 the received signal voltage, which is modeled as
    X_2 = a X_1 + N
where N is a continuous-valued random variable representing the additive noise of the channel, and a is a constant which represents the attenuation of the channel. In a digital system, X_1 may take a discrete set of values, e.g., \{1.5, 0.5, -0.5, -1.5\}. But the noise N may be continuous, e.g., Gaussian with mean µ and variance σ². As long as σ² > 0, X_2 will be continuous-valued.
Workaround: We will sometimes use a pdf for a discrete random variable. For instance, if
    P_{X_1}(x_1) = 0.6 for x_1 = 0; 0.4 for x_1 = 1; 0 otherwise,
then we would write a pdf with the probabilities as amplitudes of a Dirac delta function centered at the value that has that probability,
    f_{X_1}(x_1) = 0.6 δ(x_1) + 0.4 δ(x_1 - 1)
This pdf has integral 1, and non-negative values for all x_1. Any probability integral will return the proper value. For example, the probability that -0.5 < x_1 < 0.5 would be an integral that would return 0.6.
Example: Joint distribution of X_1, X_2
Consider the channel model X_2 = X_1 + N, and
    f_N(n) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-n^2/(2\sigma^2)}
where σ² is some known variance, and the r.v. X_1 is independent of N with
    P_{X_1}(x_1) = 0.5 for x_1 = 0; 0.5 for x_1 = 1; 0 otherwise.
1. What is the pdf of X_2?
2. What is the joint p.d.f. of (X_1, X_2)?
Solution:
1. When two independent r.v.s are added, the pdf of the sum is the convolution of the pdfs of the inputs. Writing
    f_{X_1}(x_1) = 0.5 δ(x_1) + 0.5 δ(x_1 - 1)
we convolve this with f_N(n) above, to get
    f_{X_2}(x_2) = 0.5 f_N(x_2) + 0.5 f_N(x_2 - 1)
    f_{X_2}(x_2) = \frac{1}{2\sqrt{2\pi\sigma^2}} \left[ e^{-x_2^2/(2\sigma^2)} + e^{-(x_2-1)^2/(2\sigma^2)} \right]
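As a numerical sketch (Python; σ = 1 is an arbitrary illustrative choice), this two-component Gaussian mixture integrates to 1, as any pdf must:

```python
import numpy as np

sigma = 1.0
x2 = np.linspace(-8, 9, 34001)  # wide enough grid to capture both modes
dx = x2[1] - x2[0]

# The mixture pdf f_{X2}(x2) = 0.5 f_N(x2) + 0.5 f_N(x2 - 1).
pdf = (1.0 / (2.0 * np.sqrt(2.0 * np.pi * sigma**2))) * (
    np.exp(-x2**2 / (2.0 * sigma**2)) +
    np.exp(-(x2 - 1.0)**2 / (2.0 * sigma**2)))

total = np.sum(pdf) * dx  # Riemann-sum approximation of the integral
print(round(total, 6))  # approximately 1.0
```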
2. What is the joint p.d.f. of (X_1, X_2)? Since X_1 and X_2 are NOT independent, we cannot simply multiply the marginal pdfs together. It is necessary to use Bayes' Law.
    f_{X_1,X_2}(x_1, x_2) = f_{X_2|X_1}(x_2|x_1) f_{X_1}(x_1)
Given a value of X_1 (either 0 or 1) we can write down the pdf of X_2. So break this into two cases:
    f_{X_1,X_2}(x_1, x_2) = f_{X_2|X_1}(x_2|0) f_{X_1}(0) for x_1 = 0; f_{X_2|X_1}(x_2|1) f_{X_1}(1) for x_1 = 1; 0 otherwise.
    f_{X_1,X_2}(x_1, x_2) = 0.5 f_N(x_2) for x_1 = 0; 0.5 f_N(x_2 - 1) for x_1 = 1; 0 otherwise.
    f_{X_1,X_2}(x_1, x_2) = 0.5 f_N(x_2) δ(x_1) + 0.5 f_N(x_2 - 1) δ(x_1 - 1)
These last two lines are completely equivalent. Use whichever seems more convenient for you. See Figure 18.
Figure 18: Joint pdf of X_1 and X_2, the input and output (respectively) of the example additive noise communication system, when σ = 1.
10.5 Expectation
Def'n: Expected Value (Joint)
The expected value of a function g(X_1, X_2) of random variables X_1 and X_2 is given by,
1. Discrete: E[g(X_1, X_2)] = \sum_{x_1 ∈ S_{X_1}} \sum_{x_2 ∈ S_{X_2}} g(x_1, x_2) P_{X_1,X_2}(x_1, x_2)
2. Continuous: E[g(X_1, X_2)] = \int_{x_1 ∈ S_{X_1}} \int_{x_2 ∈ S_{X_2}} g(x_1, x_2) f_{X_1,X_2}(x_1, x_2) dx_1 dx_2
Typical functions g(X_1, X_2) are:
• Mean of X_1 or X_2: g(X_1, X_2) = X_1 or g(X_1, X_2) = X_2 will result in the means µ_{X_1} and µ_{X_2}.
• Variance (or 2nd central moment) of X_1 or X_2: g(X_1, X_2) = (X_1 - µ_{X_1})^2 or g(X_1, X_2) = (X_2 - µ_{X_2})^2. Often denoted σ²_{X_1} and σ²_{X_2}.
• Covariance of X_1 and X_2: g(X_1, X_2) = (X_1 - µ_{X_1})(X_2 - µ_{X_2}).
• Expected value of the product of X_1 and X_2, also called the 'correlation' of X_1 and X_2: g(X_1, X_2) = X_1 X_2.
10.6 Gaussian Random Variables
For a single Gaussian r.v. X with mean µ_X and variance σ²_X, we have the pdf,
    f_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}} e^{-\frac{(x-\mu_X)^2}{2\sigma_X^2}}
Consider Y to be Gaussian with mean 0 and variance 1. Then, the CDF of Y is denoted as F_Y(y) = P[Y ≤ y] = Φ(y). So, for X, which has non-zero mean and non-unit variance, we can write its CDF as
    F_X(x) = P[X ≤ x] = Φ\left( \frac{x - \mu_X}{\sigma_X} \right)
You can prove this by showing that the event X ≤ x is the same as the event
    \frac{X - \mu_X}{\sigma_X} ≤ \frac{x - \mu_X}{\sigma_X}
Since the left-hand side is a unit-variance, zero-mean Gaussian random variable, we can write the probability of this event using the unit-variance, zero-mean Gaussian CDF.
10.6.1 Complementary CDF
The probability that a unit-variance, zero-mean Gaussian r.v. X exceeds some value x is one minus the CDF, that is, 1 - Φ(x). This is so common in digital communications, it is given its own name, Q(x),
    Q(x) = P[X > x] = 1 - Φ(x)
What is Q(x) in integral form?
    Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} e^{-w^2/2} dw
For a Gaussian r.v. X with mean µ_X and variance σ²_X,
    P[X > x] = Q\left( \frac{x - \mu_X}{\sigma_X} \right) = 1 - Φ\left( \frac{x - \mu_X}{\sigma_X} \right)
10.6.2 Error Function
In math, in some texts, and in Matlab, the Q(x) function is not used. Instead, there is a function called erf(x),
    erf(x) ≜ \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} dt
Example: Relationship between Q() and erf()
What is the functional relationship between Q() and erf()?
Solution: Substituting t = u/\sqrt{2} (and thus dt = du/\sqrt{2}),
    erf(x) = \frac{2}{\sqrt{2\pi}} \int_0^{\sqrt{2}x} e^{-u^2/2} du
           = 2 \int_0^{\sqrt{2}x} \frac{1}{\sqrt{2\pi}} e^{-u^2/2} du
           = 2 \left[ Φ(\sqrt{2}x) - \frac{1}{2} \right]
Equivalently, we can write Φ() in terms of the erf() function,
    Φ(\sqrt{2}x) = \frac{1}{2} erf(x) + \frac{1}{2}
Finally let y = \sqrt{2}x, so that
    Φ(y) = \frac{1}{2} erf\left( \frac{y}{\sqrt{2}} \right) + \frac{1}{2}
Or in terms of Q(),
    Q(y) = 1 - Φ(y) = \frac{1}{2} - \frac{1}{2} erf\left( \frac{y}{\sqrt{2}} \right)    (14)
You should go to Matlab and create a function Q(y) which implements:
function rval = Q(y)
rval = 1/2 - 1/2 .* erf(y./sqrt(2));
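For readers working outside Matlab, an equivalent sketch in Python uses `math.erf` from the standard library, directly implementing (14):

```python
from math import erf, sqrt

def Q(y):
    """Gaussian tail probability Q(y) = 1 - Phi(y), via equation (14)."""
    return 0.5 - 0.5 * erf(y / sqrt(2))

print(Q(0.0))            # 0.5: half the mass of a zero-mean Gaussian is above 0
print(round(Q(1.0), 4))  # approximately 0.1587
```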
10.7 Examples
Example: Probability of Error in Binary Example
As in the previous example, we have a model system in which the receiver sees X_2 = X_1 + N. Here, X_1 ∈ \{0, 1\} with equal probabilities and N is independent of X_1 and zero-mean Gaussian with variance 1/4. The receiver decides as follows:
• If X_2 ≤ 1/3, then decide that the transmitter sent a '0'.
• If X_2 > 1/3, then decide that the transmitter sent a '1'.
1. Given that X_1 = 1, what is the probability that the receiver decides that a '0' was sent?
2. Given that X_1 = 0, what is the probability that the receiver decides that a '1' was sent?
Solution:
1. Given that X_1 = 1, since X_2 = X_1 + N, it is clear that X_2 is also a Gaussian r.v. with mean 1 and variance 1/4. Then the probability that the receiver decides '0' is the probability that X_2 ≤ 1/3,
    P[error|X_1 = 1] = P[X_2 ≤ 1/3]
                     = P\left[ \frac{X_2 - 1}{\sqrt{1/4}} ≤ \frac{1/3 - 1}{\sqrt{1/4}} \right]
                     = 1 - Q((-2/3)/(1/2))
                     = 1 - Q(-4/3)
2. Given that X_1 = 0, the probability that the receiver decides '1' is the probability that X_2 > 1/3,
    P[error|X_1 = 0] = P[X_2 > 1/3]
                     = P\left[ \frac{X_2}{\sqrt{1/4}} > \frac{1/3}{\sqrt{1/4}} \right]
                     = Q((1/3)/(1/2))
                     = Q(2/3)
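These two conditional error probabilities can be evaluated numerically (Python sketch using `math.erf`; the printed values are approximate):

```python
from math import erf, sqrt

def Q(y):
    """Gaussian tail probability Q(y) = 1 - Phi(y)."""
    return 0.5 - 0.5 * erf(y / sqrt(2))

p_err_given_1 = 1 - Q(-4/3)  # P[decide '0' | X1 = 1]
p_err_given_0 = Q(2/3)       # P[decide '1' | X1 = 0]
print(round(p_err_given_1, 4), round(p_err_given_0, 4))
# The asymmetric threshold 1/3 makes the two error probabilities unequal:
# the '0' symbol (mean 0) is more often pushed past the threshold by noise.
```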
10.8 Gaussian Random Vectors
Def'n: Multivariate Gaussian R.V.
An n-length R.V. X is multivariate Gaussian with mean µ_X and covariance matrix C_X if it has the pdf,
    f_X(x) = \frac{1}{\sqrt{(2\pi)^n \det(C_X)}} \exp\left[ -\frac{1}{2} (x - \mu_X)^T C_X^{-1} (x - \mu_X) \right]
where det() is the determinant of the covariance matrix, and C_X^{-1} is the inverse of the covariance matrix.
Any linear combination of jointly Gaussian random variables is another jointly Gaussian random variable. For example, if we have a matrix A and we let a new random vector Y = AX, then Y is also a Gaussian random vector with mean A µ_X and covariance matrix A C_X A^T.
If the elements of X were independent random variables, the pdf would be the product of the individual pdfs (as with any independent random vector) and in this case the pdf would be:
    f_X(x) = \frac{1}{\sqrt{(2\pi)^n \prod_i \sigma_i^2}} \exp\left[ -\sum_{i=1}^{n} \frac{(x_i - \mu_{X_i})^2}{2\sigma_{X_i}^2} \right]
Section 4.3 spends some time with 2D Gaussian random vectors, which is the dimension with which we
spend most of our time in this class.
10.8.1 Envelope
The envelope of a 2-D random vector is its distance from the origin (0, 0). Specifically, if the vector is (X_1, X_2), the envelope R is
    R = \sqrt{X_1^2 + X_2^2}    (15)
If both X_1 and X_2 are i.i.d. Gaussian with mean 0 and variance σ², the pdf of R is
    f_R(r) = \frac{r}{\sigma^2} \exp\left( -\frac{r^2}{2\sigma^2} \right) for r ≥ 0, and 0 otherwise.
This is the Rayleigh distribution. If instead X_1 and X_2 are independent Gaussian with means µ_1 and µ_2, respectively, and identical variances σ², the envelope R would now have a Rice (a.k.a. Rician) distribution,
    f_R(r) = \frac{r}{\sigma^2} \exp\left( -\frac{r^2 + s^2}{2\sigma^2} \right) I_0\left( \frac{rs}{\sigma^2} \right) for r ≥ 0, and 0 otherwise,
where s² = µ_1² + µ_2², and I_0() is the zeroth-order modified Bessel function of the first kind.
Note that both the Rayleigh and Rician pdfs can be derived from the Gaussian distribution and (15) using the transformation of random variables methods studied in ECE 5510. Both pdfs are sometimes needed to derive probability of error formulas.
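As a numerical sketch (Python; σ = 1 is an illustrative choice), the Rayleigh pdf above integrates to 1, and its mode sits at r = σ, a standard property of the Rayleigh distribution:

```python
import numpy as np

sigma = 1.0
r = np.linspace(0, 10, 20001)  # a 10-sigma grid captures essentially all mass
dr = r[1] - r[0]

# Rayleigh pdf f_R(r) = (r / sigma^2) exp(-r^2 / (2 sigma^2)), r >= 0.
pdf = (r / sigma**2) * np.exp(-r**2 / (2.0 * sigma**2))

total = np.sum(pdf) * dr       # Riemann-sum approximation of the integral
mode = r[np.argmax(pdf)]       # location of the pdf's maximum
print(round(total, 4), round(mode, 3))  # approximately 1.0 and 1.0
```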
Lecture 7
Today: (1) Matlab Simulation (2) Random Processes (3) Correlation and Matched Filter Receivers
11 Random Processes
A random process X(t, s) is a function of time t and outcome (realization) s. Outcome s lies in sample space S. The outcome or realization is included because there could be multiple recordings of X even at the same time. For example, multiple receivers would record different noisy signals even of the same transmission.
A random sequence X(n, s) is a sequence of random variables indexed by time index n and realization s ∈ S.
Typically, we omit the "s" when writing the name of the random process or random sequence, and abbreviate it to X(t) or X(n), respectively.
11.1 Autocorrelation and Power
Def'n: Mean Function
The mean function of the random process X(t, s) is
    µ_X(t) = E[X(t, s)]
Note the mean is taken over all possible realizations s. If you record one signal over all time t, you don't have anything to average to get the mean function µ_X(t).
Def'n: Autocorrelation Function
The autocorrelation function of a random process X(t) is
    R_X(t, τ) = E[X(t) X(t - τ)]
The autocorrelation of a random sequence X(n) is
    R_X(n, k) = E[X(n) X(n - k)]
Def'n: Wide-sense stationary (WSS)
A random process is wide-sense stationary (WSS) if
1. µ_X = µ_X(t) = E[X(t)] is independent of t.
2. R_X(t, τ) depends only on the time difference τ and not on t. We then denote the autocorrelation function as R_X(τ).
A random sequence is wide-sense stationary (WSS) if
1. µ_X = µ_X(n) = E[X(n)] is independent of n.
2. R_X(n, k) depends only on k and not on n. We then denote the autocorrelation function as R_X(k).
The power of a signal is given by R_X(0).
We can estimate the autocorrelation function of an ergodic, wide-sense stationary random process from one realization of a power-type signal x(t),
    R_X(τ) = \lim_{T→∞} \frac{1}{T} \int_{t=-T/2}^{T/2} x(t) x(t - τ) dt
This is not a critical topic for this class, but we now must define "ergodic". For an ergodic random process, its time averages are equivalent to its ensemble averages. An example that helps demonstrate the difference between time averages and ensemble averages is the non-ergodic process of opinion polling. Consider what happens when a pollster takes either a time average or an ensemble average:
• Time Average: The pollster asks the same person, every day for N days, whether or not she will vote for person X. The time average is taken by dividing the total number of "Yes"s by N.
• Ensemble Average: The pollster asks N different people, on the same day, whether or not they will vote for person X. The ensemble average is taken by dividing the total number of "Yes"s by N.
Will they come up with the same answer? Perhaps, but probably not. This makes the process non-ergodic.
Power Spectral Density. We have seen that for a WSS random process X(t) (and for a WSS random sequence X(n)) we compute the power spectral density as,
    S_X(f) = F\{R_X(τ)\}
    S_X(e^{jΩ}) = DTFT\{R_X(k)\}
11.1.1 White Noise
Let X(n) be a WSS random sequence with autocorrelation function
    R_X(k) = E[X(n) X(n - k)] = σ² δ(k)
This says that each element of the sequence X(n) has zero covariance with every other sample of the sequence, i.e., it is uncorrelated with X(m), m ≠ n.
• Does this mean it is independent of every other sample?
• What is the PSD of the sequence?
This sequence is commonly called 'white noise' because it has equal parts of every frequency (in analogy to light).
For continuous-time signals, white noise is a commonly used approximation for thermal noise. Let X(t) be a WSS random process with autocorrelation function
    R_X(τ) = E[X(t) X(t - τ)] = σ² δ(τ)
• What is the PSD of the process?
• What is the power of the signal?
Realistically, thermal noise is not constant in frequency, because as frequency goes very high (e.g., 10^15 Hz), the power spectral density goes to zero. The Rice book (4.5.2) has a good analysis of the physics of thermal noise.
12 Correlation and Matched-Filter Receivers
We've been talking about how our digital transmitter can transmit at each time one of a set of signals \{s_i(t)\}_{i=1}^{M}. This transmission conveys which of M messages we want to send, that is, log_2 M bits of information.
At the receiver, we receive signal s_i(t) scaled and corrupted by noise:
    r(t) = b s_i(t) + n(t)
How do we decide which signal i was transmitted?
12.1 Correlation Receiver
What we've been talking about is the inner product. In other terms, the inner product is a correlation:
    \langle r(t), s_i(t) \rangle = \int_{-\infty}^{\infty} r(t) s_i(t) dt
But we want to build a receiver that does the minimum amount of calculation. If the s_i(t) are non-orthogonal, then we can reduce the amount of correlation (and thus multiplication and addition) done in the receiver by instead correlating with the basis functions, \{\phi_k(t)\}_{k=1}^{N}. Correlation with the basis functions gives
    x_k = \langle r(t), \phi_k(t) \rangle = \int_{-\infty}^{\infty} r(t) \phi_k(t) dt
for k = 1, \ldots, N. As notation,
    x = [x_1, x_2, \ldots, x_N]^T
Now that we've done these N correlations (inner products), we can compute the estimate of the received signal as
    \hat{r}(t) = \sum_{k=1}^{N} x_k \phi_k(t)
This is the 'correlation receiver', shown in Figure 5.1.6 in Rice (page 226).
Noise-free channel: Since r(t) = b s_i(t) + n(t), if n(t) = 0 and b = 1 then
    x = a_i
where a_i is the signal space vector for s_i(t).
Noisy channel: Now, let b = 1 but consider n(t) to be a white Gaussian random process with zero mean and PSD S_N(f) = N_0/2. Define
    n_k = \langle n(t), \phi_k(t) \rangle = \int_{-\infty}^{\infty} n(t) \phi_k(t) dt.
What is x in terms of a_i and the noise \{n_k\}? What are the mean and covariance of \{n_k\}?
Solution:
    x_k = \int_{-\infty}^{\infty} r(t) \phi_k(t) dt
        = \int_{-\infty}^{\infty} [s_i(t) + n(t)] \phi_k(t) dt
        = a_{i,k} + \int_{-\infty}^{\infty} n(t) \phi_k(t) dt
        = a_{i,k} + n_k
First, n_k is zero mean:
    E[n_k] = \int_{-\infty}^{\infty} E[n(t)] \phi_k(t) dt = 0
Next we can show that n_1, \ldots, n_N are i.i.d. by calculating the autocorrelation R_n(m, k).
    R_n(m, k) = E[n_k n_m]
              = \int_{t=-\infty}^{\infty} \int_{\tau=-\infty}^{\infty} E[n(t) n(\tau)] \phi_k(t) \phi_m(\tau) d\tau dt
              = \int_{t=-\infty}^{\infty} \int_{\tau=-\infty}^{\infty} \frac{N_0}{2} \delta(t - \tau) \phi_k(t) \phi_m(\tau) d\tau dt
              = \frac{N_0}{2} \int_{t=-\infty}^{\infty} \phi_k(t) \phi_m(t) dt
              = \frac{N_0}{2} \delta_{k,m} = N_0/2 for m = k, and 0 otherwise.
Is \{n_k\} WSS? Is it a Gaussian random sequence?
Since the noise components are independent, the x_k are also independent. (Why?) What is the pdf of x_k? What is the pdf of x?
Solution: The x_k are independent because x_k = a_{i,k} + n_k and a_{i,k} is a deterministic constant. Then since n_k is Gaussian,
    f_{X_k}(x_k) = \frac{1}{\sqrt{2\pi(N_0/2)}} e^{-\frac{(x_k - a_{i,k})^2}{2(N_0/2)}}
And, since the \{X_k\} are independent,
    f_x(x) = \prod_{k=1}^{N} f_{X_k}(x_k)
           = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi(N_0/2)}} e^{-\frac{(x_k - a_{i,k})^2}{2(N_0/2)}}
           = \frac{1}{[2\pi(N_0/2)]^{N/2}} e^{-\frac{\sum_{k=1}^{N}(x_k - a_{i,k})^2}{2(N_0/2)}}
An example is in ece5520_lec07.m.
12.2 Matched Filter Receiver
Above we said that
    x_k = \int_{t=-\infty}^{\infty} r(t) \phi_k(t) dt
But there really are finite limits. Let's say that the signal has a duration T, and then rewrite the integral as
    x_k = \int_{t=0}^{T} r(t) \phi_k(t) dt
This can be written as
    x_k = \int_{t=0}^{T} r(t) h_k(T - t) dt
where h_k(t) = \phi_k(T - t). (Plug (T - t) into this formula; the Ts cancel out and only positive t is left.) This is the output of a convolution, taken at time T,
    x_k = r(t) \star h_k(t) |_{t=T}
Or equivalently
    x_k = r(t) \star \phi_k(T - t) |_{t=T}
This is Rice Section 5.1.4.
Notes:
• The x_k can be seen as the output of a 'matched' filter at time T.
• This equivalence holds at time T. The outputs at other times will differ between the correlation and matched filter implementations.
• These are just two different physical implementations. We might, for example, have a physical filter with the impulse response \phi_k(T - t), and thus it is easy to do a matched filter implementation.
• It may be easier to 'see' why the correlation receiver works.
Try out the Matlab code, correlation_and_matched_filter_rx.m, which is posted on WebCT.
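The equivalence of the two implementations at t = T can also be checked numerically. This Python sketch uses a hypothetical pulse shape and interference term standing in for the course's Matlab demo:

```python
import numpy as np

dt = 0.001
T = 1.0
t = np.arange(0, T, dt)
phi = np.sqrt(2) * np.sin(2 * np.pi * t)  # an example pulse on [0, T)
r = 0.7 * phi + 0.1 * np.cos(5 * t)       # received signal with interference

# Correlation receiver: x = integral of r(t) phi(t) over [0, T].
x_corr = np.sum(r * phi) * dt

# Matched filter: convolve r with h(t) = phi(T - t), sample output at t = T.
h = phi[::-1]
y = np.convolve(r, h) * dt  # y[n] approximates the convolution at time n*dt
x_mf = y[len(t) - 1]        # index corresponding to t = T

print(np.isclose(x_corr, x_mf))  # True: identical outputs at t = T
```

At any other sample index the two would generally disagree, which is the second note above.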
12.3 Amplitude
Before we said b = 1. A real channel attenuates! The job of our receiver will be to amplify the signal by approximately 1/b. This effectively multiplies the noise as well. In general, textbooks will assume that the automatic gain control (AGC) works properly, and then will assume the noise signal is multiplied accordingly.
Lecture 8
Today: (1) Optimal Binary Detection
12.4 Review
At a digital transmitter, we can transmit at each time one of a set of signals \{s_i(t)\}_{i=0}^{M-1}. This transmission conveys which of M messages we want to send, that is, log_2 M bits of information.
At the receiver, we assume we receive signal s_i(t) corrupted by noise:
    r(t) = s_i(t) + n(t)
How do we decide which signal i ∈ \{0, \ldots, M-1\} was transmitted? We split the task into downconversion, gain control, correlation or matched filter reception, and detection, as shown in Figure 19.
Figure 19: A block diagram of the receiver blocks discussed in lectures 7 & 8: downconverter (multiplication by e^{-j2\pi f_c t}), AGC, correlation or matched filter, and optimal detector.
12.5 Correlation Receiver
Our signal set can be represented by the orthonormal basis functions, \{\phi_k(t)\}_{k=0}^{K-1}. Correlation with the basis functions gives
    x_k = \langle r(t), \phi_k(t) \rangle = a_{i,k} + n_k
for k = 0, \ldots, K-1. We denote vectors:
    x = [x_0, x_1, \ldots, x_{K-1}]^T
    a_i = [a_{i,0}, a_{i,1}, \ldots, a_{i,K-1}]^T
    n = [n_0, n_1, \ldots, n_{K-1}]^T
Hopefully, x and a_i should be close since i was actually sent. In an example Matlab simulation, ece5520_lec07.m, we simulated sending a_i = [1, 1]^T and receiving r(t) (and thus x) in noise.
In general, the pdf of x is multivariate Gaussian with each component x_k independent, because:
• x_k = a_{i,k} + n_k
• a_{i,k} is a deterministic constant
• The \{n_k\} are i.i.d. Gaussian.
The joint pdf of x is
    f_X(x) = \prod_{k=0}^{K-1} f_{X_k}(x_k) = \frac{1}{[2\pi(N_0/2)]^{K/2}} e^{-\frac{\sum_{k=0}^{K-1}(x_k - a_{i,k})^2}{2(N_0/2)}}    (16)
We showed that there are two ways to implement this: the correlation receiver, and the matched filter receiver. Both result in the same output x. We simulated this in the Matlab code correlation_and_matched_filter_rx.m.
13 Optimal Detection
We receive r(t) and use the matched filter or correlation receiver to calculate x. Now, how exactly do we decide which s_i(t) was sent?
Consider this in signal space, as shown in Figure 20.
Figure 20: Signal space diagram of a PAM system with an X to mark the receiver output x.
Given the signal space diagram of the transmitted signals and the received signal space vector x, what rules should we have to decide on i? This is the topic of detection. Optimal detection uses rules designed to minimize the probability that a symbol error will occur.
13.1 Overview
We'll show two things:
• If each symbol i is equally probable, then the decision will be: pick the i with a_i closest to x.
• If symbols aren't equally probable, then we'll need to shift the decision boundaries.
Detection theory is a major learning objective of this course. So even though it is somewhat theoretical, it is very applicable in digital receivers. Further, detection is applicable in a wide variety of problems, for example,
• Medical applications: Does an image show a tumor? Does a blood sample indicate a disease? Does a breathing sound indicate an obstructed airway?
• Communications applications: Receiver design, signal detection.
• Radar applications: Obstruction detection, motion detection.
13.2 Bayesian Detection
When we say 'optimal detection' in the Bayesian detection framework, we mean that we want the smallest probability of error. The probability of error is denoted
    P[error]
By error, we mean that a different signal was detected than the one that was sent. At the start of every detection problem, we list the events that could have occurred, i.e., the symbols that could have been sent. We follow all detection and statistics textbooks and label these classes H_i, and write:
    H_0: r(t) = s_0(t) + n(t)
    H_1: r(t) = s_1(t) + n(t)
    ...
    H_{M-1}: r(t) = s_{M-1}(t) + n(t)
This must be a complete listing of events. That is, the events satisfy H_0 ∪ H_1 ∪ \cdots ∪ H_{M-1} = S, where ∪ means union, and S is the complete event space.
14 Binary Detection
Let's just say for now that there are only two signals s_0(t) and s_1(t), and only one basis function \phi_0(t). Then, instead of vectors x, a_0, a_1, and n, we'll have only scalars: x, a_0, a_1, and n. When we talk about them as random variables, we'll substitute N for n and X for x. We need to decide from X whether s_0(t) or s_1(t) was sent.
The event listing is,
    H_0: r(t) = s_0(t) + n(t)
    H_1: r(t) = s_1(t) + n(t)
Equivalently,
    H_0: X = a_0 + N
    H_1: X = a_1 + N
We use the law of total probability to say that
    P[error] = P[error ∩ H_0] + P[error ∩ H_1]    (17)
where the ∩ means 'and'. Then using Bayes' Law,
    P[error] = P[error|H_0] P[H_0] + P[error|H_1] P[H_1]
14.1 Decision Region
We're making a decision based only on X. Over some set R_0 of values of X, we'll decide that H_0 happened (s_0(t) was sent). Over a different set R_1 of values, we'll decide H_1 occurred (that s_1(t) was sent). We can't be indecisive, so
• There is no overlap: R_0 ∩ R_1 = ∅.
• There are no values of x disregarded: R_0 ∪ R_1 = S.
14.2 Formula for Probability of Error
So the probability of error is
    P[error] = P[X ∈ R_1|H_0] P[H_0] + P[X ∈ R_0|H_1] P[H_1]    (18)
The probability that X is in R_1 is one minus the probability that it is in R_0, since the two are complementary sets.
    P[error] = (1 - P[X ∈ R_0|H_0]) P[H_0] + P[X ∈ R_0|H_1] P[H_1]
    P[error] = P[H_0] - P[X ∈ R_0|H_0] P[H_0] + P[X ∈ R_0|H_1] P[H_1]
Now note that probabilities that X ∈ R_0 are integrals over the event (region) R_0.
    P[error] = P[H_0] - \int_{x ∈ R_0} f_{X|H_0}(x|H_0) P[H_0] dx + \int_{x ∈ R_0} f_{X|H_1}(x|H_1) P[H_1] dx
             = P[H_0] + \int_{x ∈ R_0} \left[ f_{X|H_1}(x|H_1) P[H_1] - f_{X|H_0}(x|H_0) P[H_0] \right] dx    (19)
We've got a lot of things in the expression in (19), but the only thing we can change is the region R_0. Everything else is determined by the time we get to this point. So the question is, how do you pick R_0 to minimize (19)?
14.3 Selecting R_0 to Minimize Probability of Error
We can see what the integrand looks like. Figure 21(a) shows the conditional probability density functions. Figure 21(b) shows the joint densities (the conditional pdfs multiplied by the bit probabilities P[H_0] and P[H_1]). Finally, Figure 21(c) shows the full integrand of (19), the difference between the joint densities.
We can pick R_0 however we want; we just say what region of x, and the integral in (19) will integrate over it. The objective is to minimize the probability of error. Which x's should we include in the region? Should we include x which has a positive value of the integrand? Or should we include the parts of x which have a negative value of the integrand?
Solution: Select R_0 to be all x such that the integrand is negative.
Then R_0 is the region in which
    f_{X|H_0}(x|H_0) P[H_0] > f_{X|H_1}(x|H_1) P[H_1]
If P[H_0] = P[H_1], then this is the region in which X is more probable given H_0 than given H_1.
ECE 5520 Fall 2009 54
Figure 21: The (a) conditional p.d.f.s (likelihood functions) f_{X|H_0}(x|H_0) and f_{X|H_1}(x|H_1), (b) joint p.d.f.s f_{X|H_0}(x|H_0)P[H_0] and f_{X|H_1}(x|H_1)P[H_1], and (c) difference between the joint p.d.f.s, f_{X|H_1}(x|H_1)P[H_1] − f_{X|H_0}(x|H_0)P[H_0], which is the integrand in (19).
Rearranging the terms,

f_{X|H_1}(x|H_1) / f_{X|H_0}(x|H_0) < P[H_0] / P[H_1]    (20)

The left hand side is called the likelihood ratio. The right hand side is a threshold. Whenever x indicates that the likelihood ratio is less than the threshold, then we'll decide H_0, i.e., that s_0(t) was sent. Otherwise, we'll decide H_1, i.e., that s_1(t) was sent.
Equation (20) is a very general result, applicable no matter what conditional distributions x has.
14.4 Log-Likelihood Ratio

For the Gaussian distribution, the math gets much easier if we take the log of both sides. Why can we do this?
Solution: 1. Both sides are positive. 2. The log() function is strictly increasing.
Now, the log-likelihood ratio test is

log [ f_{X|H_1}(x|H_1) / f_{X|H_0}(x|H_0) ] < log ( P[H_0] / P[H_1] )
14.5 Case of a_0 = 0, a_1 = 1 in Gaussian noise

In this example, n ∼ N(0, σ²_N). In addition, assume for a minute that a_0 = 0 and a_1 = 1. What is:
1. The log of a Gaussian pdf?
2. The log-likelihood ratio?
3. The decision regions for x?
Solution: What is the log of a Gaussian pdf?

log f_{X|H_0}(x|H_0) = log [ (1/√(2πσ²_N)) e^{−x²/(2σ²_N)} ]
                     = −(1/2) log(2πσ²_N) − x²/(2σ²_N)    (21)

The log f_{X|H_1}(x|H_1) term will be the same but with (x − 1)² instead of x². Continuing with the log-likelihood ratio,
log f_{X|H_1}(x|H_1) − log f_{X|H_0}(x|H_0) < log ( P[H_0] / P[H_1] )

x²/(2σ²_N) − (x − 1)²/(2σ²_N) < log ( P[H_0] / P[H_1] )

x² − (x − 1)² < 2σ²_N log ( P[H_0] / P[H_1] )

2x − 1 < 2σ²_N log ( P[H_0] / P[H_1] )

x < 1/2 + σ²_N log ( P[H_0] / P[H_1] )
In the end result, there is a simple test for x – if it is below the decision threshold, decide H_0. If it is above the decision threshold,

x > 1/2 + σ²_N log ( P[H_0] / P[H_1] )

decide H_1. Rather than writing both inequalities each time, we use the following notation:

x ≷_{H_0}^{H_1} 1/2 + σ²_N log ( P[H_0] / P[H_1] )

This completely describes the detector receiver.
For simplicity, we also write x ≷_{H_0}^{H_1} γ where

γ = 1/2 + σ²_N log ( P[H_0] / P[H_1] )    (22)
14.6 General Case for Arbitrary Signals

If, instead of a_0 = 0 and a_1 = 1, we had arbitrary values for them (the signal space representations of s_0(t) and s_1(t)), we could have derived the result in the last section the same way. As long as a_0 < a_1, we'd still have r ≷_{H_0}^{H_1} γ, but now,

γ = (a_0 + a_1)/2 + (σ²_N / (a_1 − a_0)) log ( P[H_0] / P[H_1] )    (23)
14.7 Equiprobable Special Case

If symbols are equally likely, P[H_1] = P[H_0], then P[H_1]/P[H_0] = 1 and the logarithm of the fraction is zero. So then

x ≷_{H_0}^{H_1} (a_0 + a_1)/2

The decision above says that if x is closer to a_0, decide that s_0(t) was sent. And if x is closer to a_1, decide that s_1(t) was sent. The boundary is exactly halfway in between the two signal space vectors.
This receiver is also called a maximum likelihood detector, because we only decide which likelihood function is higher (neither is scaled by the prior probabilities P[H_0] or P[H_1]).
14.8 Examples

Example: When H_1 becomes less likely, which direction will the optimal threshold move, towards a_0 or towards a_1?
Solution: Towards a_1.
Example: Let a_0 = −1, a_1 = 1, σ²_N = 0.1, P[H_1] = 0.4, and P[H_0] = 0.6. What is the decision threshold for x?
Solution: From (23),

γ = 0 + (0.1/2) log (0.6/0.4) = 0.05 log 1.5 ≈ 0.0203
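As a quick numerical check of (23), here is a minimal Python sketch (the function name `map_threshold` is my own, not from the notes):

```python
import math

def map_threshold(a0, a1, sigma2, p0, p1):
    """MAP decision threshold for binary detection in Gaussian noise, eq. (23):
    gamma = (a0 + a1)/2 + sigma_N^2/(a1 - a0) * log(P[H0]/P[H1])."""
    assert a0 < a1
    return (a0 + a1) / 2 + (sigma2 / (a1 - a0)) * math.log(p0 / p1)

# The worked example: a0 = -1, a1 = 1, sigma_N^2 = 0.1, P[H0] = 0.6, P[H1] = 0.4.
gamma = map_threshold(-1.0, 1.0, 0.1, 0.6, 0.4)
print(gamma)  # ~0.0203
```

Note the threshold moves toward the less likely symbol, consistent with the example above.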
Example: Can the decision threshold be higher than both a_0 and a_1 in this binary, one-dimensional signalling receiver?
Given a_0, a_1, σ²_N, P[H_1], and P[H_0], you should be able to calculate the optimal decision threshold γ.
Example: In this example, given all of the above constants and the optimal threshold γ, calculate the probability of error from (18).
Starting from

P[error] = P[x ∈ R_1 | H_0] P[H_0] + P[x ∈ R_0 | H_1] P[H_1]

we can use the decision regions in (22) to write

P[error] = P[x > γ | H_0] P[H_0] + P[x < γ | H_1] P[H_1]
What is the first probability, given that x|H_0 is Gaussian with mean a_0 and variance σ²_N? What is the second probability, given that x|H_1 is Gaussian with mean a_1 and variance σ²_N? What is then the overall probability of error?
Solution:

P[x > γ | H_0] = Q( (γ − a_0)/σ_N )

P[x < γ | H_1] = 1 − Q( (γ − a_1)/σ_N ) = Q( (a_1 − γ)/σ_N )

P[error] = P[H_0] Q( (γ − a_0)/σ_N ) + P[H_1] Q( (a_1 − γ)/σ_N )
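These expressions can be evaluated for the running example (a_0 = −1, a_1 = 1, σ²_N = 0.1, P[H_0] = 0.6), using Q(x) = (1/2) erfc(x/√2). A minimal Python sketch (variable names are mine):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P[N(0,1) > x], via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Constants from the running example.
a0, a1, sigma = -1.0, 1.0, math.sqrt(0.1)
p0, p1 = 0.6, 0.4
gamma = (a0 + a1) / 2 + (sigma**2 / (a1 - a0)) * math.log(p0 / p1)

p_err = p0 * Q((gamma - a0) / sigma) + p1 * Q((a1 - gamma) / sigma)
print(p_err)  # a small number, on the order of 1e-3
```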
14.9 Review of Binary Detection

We did the following to prove some things about the optimal detector:
• We wrote the formula for the probability of error.
• We found the decision regions which minimized the probability of error.
• We used the log operator to show that for the Gaussian error case the decision regions are separated by a single threshold.
• We showed the formula for that threshold, both in the equiprobable symbol case, and in the general case.
Lecture 9

Today: (1) Finish Lecture 8, (2) Intro to M-ary PAM
Sample exams (2007-8) and solutions posted on WebCT. Of course, by putting these sample exams up, you know the actual exam won't contain the same exact problems. There is no guarantee this exam will be the same level of difficulty as either past exam.
Notes on 2008 sample exam:
• Problem 1(b) was not material covered this semester.
• Problem 3 is not on the topic of detection theory, it is on the probability topic. There is a typo: f_N(n) should be f_N(t).
• Problem 6 typo: x_1(t) = { t, −1 ≤ t ≤ 1; 0, o.w. } and x_2(t) = { (1/2)(3t² − 1), −1 ≤ t ≤ 1; 0, o.w. }. That is, the "x"s on the RHS of the given expressions should be "t"s.
Notes on 2007 sample exam:
• Problem 2 should have said "with period T_sa" rather than "at rate T_sa".
• The book that year used ϕ_i(t) instead of φ_i(t) as the notation for a basis function, and α_i instead of s_i as the signal space vector for signal s_i(t).
• Problem 6 is on the topic of detection theory.
15 Pulse Amplitude Modulation (PAM)

Def'n: Pulse Amplitude Modulation (PAM)
M-ary Pulse Amplitude Modulation is the use of M scaled versions of a single basis function p(t) = φ_0(t) as a signal set,

s_i(t) = a_i p(t),   for i = 0, . . . , M − 1

where a_i is the amplitude of waveform i.
Each sent waveform (a.k.a. "symbol") conveys k = log_2 M bits of information. Note we're calling it p(t) instead of φ_0(t), and a_i instead of a_{i,0}, both for simplicity, since there's only one basis function.
15.1 Baseband Signal Examples

When you compose a signal of a number of symbols, you'll just add each time-delayed signal to the transmitted waveform. The resulting signal we'll call s(t) (without any subscript) and is

s(t) = Σ_n a(n) p(t − nT_s)

where a(n) ∈ {a_0, . . . , a_{M−1}} is the nth symbol transmitted, and T_s is the symbol period (not the sample period!). Instead of just sending one symbol, we are sending one symbol every T_s units of time. That makes our symbol rate 1/T_s symbols per second.
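The superposition above is easy to sketch in code. A minimal Python example with a unit-energy square pulse (the sampling grid, oversampling factor, and helper name are my own choices):

```python
import numpy as np

def pam_waveform(symbols, Ts=1.0, samples_per_symbol=8):
    """Build s(t) = sum_n a(n) p(t - n*Ts) for a unit-energy square pulse
    p(t) = 1/sqrt(Ts) on [0, Ts). Oversampling factor is an illustrative choice."""
    p = np.ones(samples_per_symbol) / np.sqrt(Ts)   # square pulse samples
    s = np.zeros(len(symbols) * samples_per_symbol)
    for n, a in enumerate(symbols):
        # each symbol occupies its own T_s-long slot (square pulses do not overlap)
        s[n * samples_per_symbol:(n + 1) * samples_per_symbol] = a * p
    return s

# Bipolar binary PAM with A = sqrt(Ts) = 1:
s = pam_waveform([1, -1, -1, 1])
print(s[:8])   # first symbol period: all +1
```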
Note: Keep in mind that when s(t) is to be transmitted on a bandpass channel, it is modulated with a carrier cos(2πf_c t),

x(t) = ℜ{ s(t) e^{j2πf_c t} }

Rice Figures 5.2.3 and 5.2.6 show continuous-time and discrete-time realizations, respectively, of PAM transmitters and receivers.
Example: Binary Bipolar PAM
Figure 22 shows a binary bipolar PAM signal set using amplitudes a_0 = −A, a_1 = A. It shows a signal set using square pulses,

φ_0(t) = p(t) = { 1/√T_s, 0 ≤ t < T_s; 0, o.w. }
Figure 22: Example signal for binary bipolar PAM example for A = √T_s.
Example: Binary Unipolar PAM
Unipolar binary PAM uses the amplitudes a_0 = 0, a_1 = A. The average energy per symbol has decreased. This is also called On-Off Keying (OOK). If the y-axis in Figure 22 was scaled such that the minimum was 0 instead of −1, it would represent unipolar PAM.
Example: 4-ary PAM
A 4-ary PAM signal set using amplitudes a_0 = −3A, a_1 = −A, a_2 = A, a_3 = 3A is shown in Figure 23. It shows a signal set using square pulses,

p(t) = { 1/√T_s, 0 ≤ t < T_s; 0, o.w. }

Typical M-ary PAM (for all even M) uses the following amplitudes:

−(M − 1)A, −(M − 3)A, . . . , −A, A, . . . , +(M − 3)A, +(M − 1)A

Rice Figure 5.2.1 shows the signal space diagram of M-ary PAM for different values of M.
15.2 Average Bit Energy in M-ary PAM

Assuming that each symbol s_i(t) is equally likely to be sent, we want to calculate the average bit energy E_b, which is 1/log_2 M times the symbol energy E_s,

E_s = E[ ∫_{−∞}^{∞} a_i² p²(t) dt ] = ∫_{−∞}^{∞} E[a_i²] φ_0²(t) dt = E[a_i²]    (24)
Figure 23: Example signal for 4-ary PAM example.
Let {a_0, . . . , a_{M−1}} = −(M − 1)A, −(M − 3)A, . . . , (M − 1)A, as described above to be the typical M-ary PAM case. Then

E[a_i²] = (A²/M) [ (1 − M)² + (3 − M)² + ⋯ + (M − 1)² ]
        = (A²/M) Σ_{m=1}^{M} (2m − 1 − M)²
        = (A²/M) · M(M² − 1)/3 = A² (M² − 1)/3

So E_s = ((M² − 1)/3) A², and

E_b = (1/log_2 M) · ((M² − 1)/3) A²
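The closed form for E[a_i²] can be verified numerically; a short Python check (function name is mine):

```python
# Numerical check of E[a_i^2] = A^2 (M^2 - 1)/3 for standard M-ary PAM
# amplitudes -(M-1)A, -(M-3)A, ..., (M-1)A, assuming equally likely symbols.
def mean_square_amplitude(M, A=1.0):
    amps = [(2 * m - 1 - M) * A for m in range(1, M + 1)]
    return sum(a**2 for a in amps) / M

for M in (2, 4, 8, 16):
    closed_form = (M**2 - 1) / 3.0
    print(M, mean_square_amplitude(M), closed_form)
```

For M = 4 both give 5A², and for M = 8 both give 21A², matching the formula above.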
Lecture 10
Today: Review for Exam 1
16 Topics for Exam 1
1. Fourier transform of signals, properties (12)
2. Bandpass signals
3. Sampling and aliasing
4. Orthogonality / Orthonormality
5. Correlation and Covariance
6. Signal space representation
7. Correlation / matched ﬁlter receivers
Make sure you know how to do each HW problem in HWs 1-3. It will be good to know the 'tricks' of how each problem is done, since they will reappear.
Read over the lecture notes 1-7. If anything is confusing, ask me.
You may have one side of an 8.5×11 sheet of paper for notes. You will be provided with Table 2.4.4 and Table 2.4.8.
17 Additional Problems

17.1 Spectrum of Communication Signals

1. Bateman Problem 1.9: An otherwise ideal mixer (multiplier) generates an internal DC offset which sums with the baseband signal prior to multiplication with the cos(2πf_c t) carrier. How does this affect the spectrum of the output signal?
2. Haykin & Moyer Problem 2.25. A signal x(t) of finite energy is applied to a square law device whose output y(t) is defined by y(t) = x²(t). The spectrum of x(t) is band-limited to −W ≤ ω ≤ W. Prove that the spectrum of y(t) is band-limited to −2W ≤ ω ≤ 2W.
3. Consider the pulse shape

x(t) = { cos(πt/T_s), −T_s/2 < t < T_s/2; 0, o.w. }
(a) Draw a plot of the pulse shape x(t).
(b) Find the Fourier transform of x(t).
4. Rice 2.39, 2.52
17.2 Sampling and Aliasing

1. Assume that x(t) is a bandlimited signal with X(f) = 0 for |f| > W. When x(t) is sampled at a rate T = 1/(2W), its samples X_n are,

X_n = { 1, n = ±1; 2, n = 0; 0, o.w. }

What was the original continuous-time signal x(t)? Or, if x(t) cannot be determined, why not?
2. Rice 2.99, 2.100
17.3 Orthogonality and Signal Space

1. (from L.W. Couch 2007, Problem 2-49). Three functions are shown in Figure P2-49.
(a) Show that these functions are orthogonal over the interval (−4, 4).
(b) Find the scale factors needed to scale the functions to make them into an orthonormal set over the interval (−4, 4).
(c) Express the waveform

w(t) = { 1, 0 ≤ t ≤ 4; 0, o.w. }

in signal space using the orthonormal set found in part (b).
2. Rice Example 5.1.1 on Page 219.
3. Rice 5.2, 5.5, 5.13, 5.14, 5.16, 5.23, 5.26
17.4 Random Processes, PSD

1. (From Couch 2007 Problem 2-80) An RC low-pass filter is the circuit shown in Figure 24. Its transfer function is

H(f) = Y(f)/X(f) = 1/(1 + j2πRCf)

Figure 24: An RC low-pass filter.

Given that the PSD of the input signal is flat and constant at 1, i.e., S_X(f) = 1, design an RC low-pass filter that will attenuate this signal by 20 dB at 15 kHz. That is, find the value of RC to satisfy the design specifications.
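Setting |H(f)|² = 10⁻² at 15 kHz (interpreting the 20 dB as power attenuation of the PSD, which is one reasonable reading of the problem) gives (2πRCf)² = 99. A short script confirms the design:

```python
import math

# Solve for RC so that |H(f)|^2 = 1/(1 + (2*pi*RC*f)^2) is -20 dB at f = 15 kHz.
# -20 dB in power means |H(f)|^2 = 10^-2, so (2*pi*RC*f)^2 = 99.
f = 15e3
RC = math.sqrt(99) / (2 * math.pi * f)
atten_dB = -10 * math.log10(1 / (1 + (2 * math.pi * RC * f) ** 2))
print(RC, atten_dB)  # RC ~ 1.06e-4 s, attenuation exactly 20 dB
```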
2. Let a binary 1-D communication system be described by two possible signals, and noise with different variance depending on which signal is sent, i.e.,

H_0: r = a_0 + n_0
H_1: r = a_1 + n_1

where a_0 = −1 and a_1 = 1, and n_0 is zero-mean Gaussian with variance 0.1, and n_1 is zero-mean Gaussian with variance 0.3.
(a) What is the conditional pdf of r given H_0?
(b) What is the conditional pdf of r given H_1?
17.5 Correlation / Matched Filter Receivers

1. A binary communication system uses the following equally-likely signals,

s_1(t) = { cos(πt/T), −T/2 ≤ t ≤ T/2; 0, o.w. }
s_2(t) = { −cos(πt/T), −T/2 ≤ t ≤ T/2; 0, o.w. }

At the receiver, a signal x(t) = s_i(t) + n(t) is received.
(a) Is s_1(t) an energy-type or power-type signal?
(b) What is the energy or power (depending on answer to (a)) in s_1(t)?
(c) Describe in words and a block diagram the operation of a correlation receiver.
2. Proakis & Salehi 7.8. Suppose that two signal waveforms s_1(t) and s_2(t) are orthogonal over the interval (0, T). A sample function n(t) of a zero-mean, white noise process is correlated with s_1(t) and s_2(t) to yield,

n_1 = ∫_0^T s_1(t) n(t) dt

n_2 = ∫_0^T s_2(t) n(t) dt

Prove that E[n_1 n_2] = 0.
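One way to see why this should hold: for white noise with PSD N_0/2, E[n_1 n_2] reduces to (N_0/2) ∫_0^T s_1(t) s_2(t) dt, so orthogonality forces it to zero. A numerical check of that inner product for one example orthogonal pair (the specific pair is my choice, not from the problem):

```python
import math

# For white noise, E[n1*n2] is proportional to the inner product of s1 and s2,
# so orthogonal s1, s2 give uncorrelated noise projections.  Check the inner
# product numerically for s1 = cos(2*pi*t/T), s2 = sin(2*pi*t/T) on (0, T),
# using a midpoint-rule numerical integral.
T, N = 1.0, 100000
dt = T / N
inner = sum(math.cos(2 * math.pi * (k + 0.5) * dt / T)
            * math.sin(2 * math.pi * (k + 0.5) * dt / T)
            for k in range(N)) * dt
print(inner)  # ~0
```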
3. Proakis & Salehi 7.16. Prove that when a sinc pulse g_T(t) = sin(πt/T)/(πt/T) is passed through its matched filter, the output is the same sinc pulse.
Lecture 11

Today: (1) M-ary PAM Intro from Lecture 9 notes, (2) Probability of Error in PAM
18 Probability of Error in Binary PAM

This is in Section 6.1.2 of Rice.
How are N_0/2 and σ_N related? Recall from lecture 7 that the variance of n_k came out to be N_0/2. We had also called the variance of noise component n_k as σ²_N. So:

σ²_N = N_0/2.    (25)
18.1 Signal Distance

We mentioned, when talking about signal space diagrams, a distance between vectors,

d_{i,j} = ‖a_i − a_j‖ = [ Σ_{k=1}^{M} (a_{i,k} − a_{j,k})² ]^{1/2}

For the 1-D, two signal case, there is only d_{1,0},

d_{1,0} = |a_1 − a_0|    (26)
18.2 BER Function of Distance, Noise PSD

We have consistently used a_1 > a_0. In the Gaussian noise case (with equal variance of noise in H_0 and H_1), and with P[H_0] = P[H_1], we came up with threshold γ = (a_0 + a_1)/2, and

P[error] = P[x > γ | H_0] P[H_0] + P[x < γ | H_1] P[H_1]
Which simplified using the Q function,

P[x > γ | H_0] = Q( (γ − a_0)/σ_N )

P[x < γ | H_1] = 1 − Q( (γ − a_1)/σ_N ) = Q( (a_1 − γ)/σ_N )    (27)
Thus

P[x > γ | H_0] = Q( ((a_1 − a_0)/2) / σ_N )

P[x < γ | H_1] = Q( ((a_1 − a_0)/2) / σ_N )    (28)

So both are equal, and again with P[H_0] = P[H_1] = 1/2,

P[error] = Q( ((a_1 − a_0)/2) / σ_N )    (29)
We'd assumed a_1 > a_0, so if we had also calculated for the case a_1 < a_0, we'd have seen that in general, the following formula works:

P[error] = Q( |a_1 − a_0| / (2σ_N) )

Then, using (25) and (26), we have

P[error] = Q( (d_{0,1}/2) / √(N_0/2) ) = Q( √( d²_{0,1} / (2N_0) ) )    (30)

This will be important as we talk about binary PAM. This expression,

Q( √( d²_{0,1} / (2N_0) ) )

is one that we will see over and over again in this class.
18.3 Binary PAM Error Probabilities

For binary PAM (M = 2) it takes one symbol to encode one bit. Thus 'bit' and 'symbol' are interchangeable. In signal space, our signals s_0(t) = a_0 p(t) and s_1(t) = a_1 p(t) are just s_0 = a_0 and s_1 = a_1. Now we can use (30) to compute the probability of error.
For bipolar signaling, when A_0 = −A and A_1 = A,

P[error] = Q( √( 4A² / (2N_0) ) ) = Q( √( 2A² / N_0 ) )

For unipolar signaling, when A_0 = 0 and A_1 = A,

P[error] = Q( √( A² / (2N_0) ) )
However, this doesn't take into consideration that the unipolar signaling method uses only half of the energy to transmit. Since half of the bits are exactly zero, they take no signal energy. Using (24), what is the average energy of bipolar and unipolar signaling?
Solution: For the bipolar signalling, A²_m = A² always, so E_b = E_s = A². For the unipolar signalling, A²_m equals A² with probability 1/2 and zero with probability 1/2. Thus E_b = E_s = (1/2)A².
Now, we rewrite the probability of error expressions in terms of E_b. For bipolar signaling,

P[error] = Q( √( 4A² / (2N_0) ) ) = Q( √( 2E_b / N_0 ) )

For unipolar signaling,

P[error] = Q( √( 2E_b / (2N_0) ) ) = Q( √( E_b / N_0 ) )
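These two expressions can be compared numerically; a minimal Python sketch (helper names are mine):

```python
import math

def Q(x):
    """Gaussian tail probability via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def p_error_bipolar(EbN0_dB):
    """Q(sqrt(2*Eb/N0)) for bipolar binary PAM."""
    ebn0 = 10 ** (EbN0_dB / 10)
    return Q(math.sqrt(2 * ebn0))

def p_error_unipolar(EbN0_dB):
    """Q(sqrt(Eb/N0)) for unipolar binary PAM (on-off keying)."""
    ebn0 = 10 ** (EbN0_dB / 10)
    return Q(math.sqrt(ebn0))

for db in (0, 5, 10):
    print(db, p_error_bipolar(db), p_error_unipolar(db))
```

At every E_b/N_0, the bipolar curve lies below the unipolar one, as the discussion below notes.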
Discussion: These probabilities of error are shown in Figure 25.

Figure 25: Probability of Error in binary PAM signalling (unipolar and bipolar curves, vs. E_b/N_0 ratio in dB).
• Even for the same bit energy E_b, the bit error rate of bipolar PAM beats unipolar PAM.
• What is the difference in E_b/N_0 for a constant probability of error?
• What is the difference in terms of probability of error for a given E_b/N_0?
Lecture 12

Today: (0) Exam 1 Return, (1) Multi-Hypothesis Detection Theory, (2) Probability of Error in M-ary PAM
19 Detection with Multiple Symbols

When we only had s_0(t) and s_1(t), we saw that our decision was based on which of the following was higher:

f_{X|H_0}(x|H_0) P[H_0] and f_{X|H_1}(x|H_1) P[H_1]

For M-ary signals, we'll see that we need to consider the highest of all M possible events H_m,

H_0: r(t) = s_0(t) + n(t)
H_1: r(t) = s_1(t) + n(t)
⋮
H_{M−1}: r(t) = s_{M−1}(t) + n(t)

which have joint probabilities,

H_0: f_{X|H_0}(x|H_0) P[H_0]
H_1: f_{X|H_1}(x|H_1) P[H_1]
⋮
H_{M−1}: f_{X|H_{M−1}}(x|H_{M−1}) P[H_{M−1}]

We will find which of these joint probabilities is highest. For this class, we'll only consider the case of equally probable signals. (While equiprobable signals is sometimes not the case for M = 2 binary detection, it is very rare in higher-M communication systems.) If P[H_0] = ⋯ = P[H_{M−1}] then we only need to find the i that makes the likelihood f_{X|H_i}(x|H_i) maximum, that is, maximum likelihood detection.

Symbol Decision = argmax_i f_{X|H_i}(x|H_i)
For Gaussian (conditional) r.v.s with equal variances σ²_N, it is better to maximize the log of the likelihood rather than the likelihood directly, so

log f_{X|H_i}(x|H_i) = −(1/2) log(2πσ²_N) − (x − a_i)²/(2σ²_N)

This is maximized when (x − a_i)² is minimized. Essentially, this is a (squared) distance between x and a_i. So, the decision is: find the a_i which is closest to x.
The decision between two neighboring signal vectors a_i and a_{i+1} will be

r ≷_{H_i}^{H_{i+1}} γ_{i,i+1} = (a_i + a_{i+1})/2

As an example: 4-ary PAM. See Figure 26.

Figure 26: Signal space representation and decision thresholds for 4-ary PAM.
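The nearest-neighbor rule for 4-ary PAM can be sketched directly; a minimal Python example (the amplitude values follow the 4-ary example above; helper names are mine). Choosing the closest a_i is equivalent to comparing r against the midpoint thresholds γ_{i,i+1}:

```python
# Minimum-distance (ML) symbol decision for 4-ary PAM: pick the a_i closest
# to the matched filter output x.  Equivalent to comparing x against the
# midpoint thresholds gamma_{i,i+1} = (a_i + a_{i+1})/2.
A = 1.0
amplitudes = [-3 * A, -A, A, 3 * A]

def decide(x):
    return min(range(len(amplitudes)), key=lambda i: abs(x - amplitudes[i]))

print(decide(-2.7), decide(-0.4), decide(0.1), decide(5.0))  # 0 1 2 3
```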
20 M-ary PAM Probability of Error

20.1 Symbol Error

The probability that we don't get the symbol correct is the probability that x does not fall within the range between the thresholds in which it belongs. Here, each a_i = A_i. Also the noise σ²_N = N_0/2, as discussed above. Assuming neighboring symbols a_i are spaced by 2A, the decision threshold is always A away from the symbol values. For the symbols i in the 'middle',

P(symbol error | H_i) = 2 Q( A / √(N_0/2) )

For the symbols i on the 'sides',

P(symbol error | H_i) = Q( A / √(N_0/2) )

So overall,

P(symbol error) = (2(M − 1)/M) Q( A / √(N_0/2) )
20.1.1 Symbol Error Rate and Average Bit Energy

How does this relate to the average bit energy E_b? From last lecture, E_b = (1/log_2 M)·((M² − 1)/3) A², which means that

A = √( (3 log_2 M / (M² − 1)) E_b )

So

P(symbol error) = (2(M − 1)/M) Q( √( (6 log_2 M / (M² − 1)) · E_b/N_0 ) )    (31)

Equation (31) is plotted in Figure 27.
Figure 27: Probability of Symbol Error in M-ary PAM, for M = 2, 4, 8, 16, vs. E_b/N_0 ratio in dB.
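Equation (31) is easy to tabulate; a minimal Python sketch (helper names are mine). Note that for M = 2 it reduces to the bipolar binary PAM result Q(√(2E_b/N_0)):

```python
import math

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def p_symbol_error(M, EbN0_dB):
    """Equation (31): P(symbol error) for M-ary PAM vs. Eb/N0 in dB."""
    ebn0 = 10 ** (EbN0_dB / 10)
    arg = math.sqrt(6 * math.log2(M) / (M**2 - 1) * ebn0)
    return 2 * (M - 1) / M * Q(arg)

for M in (2, 4, 8, 16):
    print(M, p_symbol_error(M, 10.0))
```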
20.2 Bit Errors and Gray Encoding

For binary PAM, there are only two symbols; one will be assigned binary 0 and the other binary 1. When you make one symbol error (decide H_0 or H_1 in error) then it will cause one bit error.
For M-ary PAM, bits and symbols are not synonymous. Instead, we have to carefully assign bit codes to symbols 1 . . . M.

Example: Bit coding of M = 4 symbols
While the two options shown in Figure 28 both assign 2 bits to each symbol in unique ways, one will lead to a higher bit error rate than the other. Is one better or worse?
Figure 28: Two options for assigning bits to symbols.
The key is to recall the model for noise. It will not shift a signal uniformly across symbols. It will tend to leave the received signal x close to the original signal a_i. The neighbors of a_i will be more likely than distant signal space vectors.
Thus Gray encoding will only change one bit across boundaries, as in Option 2 in Figure 28.
Example: Bit coding of M = 8 symbols
Assign three bits to each symbol such that any two nearest neighbors are different in only one bit (Gray encoding).
Solution: Here is one solution.

Figure 29: Gray encoding for 8-ary PAM.
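One systematic way to produce such an assignment is the binary-reflected Gray code; a short Python sketch (this generator is a standard construction, not necessarily the exact assignment drawn in Figure 29):

```python
def gray_code(k):
    """Binary-reflected Gray code for k bits: 2^k codewords where adjacent
    codewords differ in exactly one bit."""
    return [n ^ (n >> 1) for n in range(2 ** k)]

codes = gray_code(3)  # for 8-ary PAM
print([format(c, '03b') for c in codes])
# Verify the Gray property: every adjacent pair differs in exactly one bit.
assert all(bin(codes[i] ^ codes[i + 1]).count('1') == 1
           for i in range(len(codes) - 1))
```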
20.2.1 Bit Error Probabilities

How many bit errors are caused by a symbol error in M-ary PAM?
• One. If Gray encoding is used, the errors will tend to be just one bit flipped, more than multiple bits flipped. At least at high E_b/N_0,

P(error) ≈ (1/log_2 M) P(symbol error)    (32)

• Maybe more, up to log_2 M in the worst case. Then, we need to study further the probability that x will jump more than one decision region.

As of now – if you have to estimate a bit error rate for M-ary PAM – use (32). We will show that this approximation is very good, in almost all cases. We will also show examples in multi-dimensional signalling (e.g., QAM) when this is not a good approximation.
Lecture 13

Today: (1) Pulse Shaping / ISI, (2) N-Dim Detection Theory
21 Intersymbol Interference

1. Consider the spectrum of the ideal 1-D PAM system with a square pulse.
2. Consider the time-domain of the signal which is close to a rect function in the frequency domain.

We don't want either: (1) occupies too much spectrum, and (2) occupies too much time.

1. If we try (1) above and use FDMA (frequency division multiple access) then the interference is out-of-band interference.
2. If we try (2) above and put symbols right next to each other in time, our own symbols experience interference called intersymbol interference.

In reality, we want to compromise between (1) and (2) and experience only a small amount of both.
Now, consider the effect of filtering in the path between transmitter and receiver. Filters will be seen in

1. Transmitter: Baseband filters come from the limited bandwidth and speed of digital circuitry. RF filters are used to limit the spectral output of the signal (to meet out-of-band interference requirements).
2. Channel: The wideband radio channel is a sum of time-delayed impulses. Wired channels have (non-ideal) bandwidth – the attenuation of a cable or twisted pair is a function of frequency. Waveguides have bandwidth limits (and modes).
3. Receiver: Bandpass filters are used to reduce interference and noise power entering the receiver. Note that the 'matched filter' receiver is also a filter!
21.1 Multipath Radio Channel

(Background knowledge.) The measured radio channel is a filter; call its impulse response h(τ, t). If s(t) is transmitted, then

r(t) = s(t) ⋆ h(τ, t)

will be received. Typically, the radio channel is modeled as

h(τ, t) = Σ_{l=1}^{L} α_l e^{jφ_l} δ(τ − τ_l),    (33)

where α_l and φ_l are the amplitude and phase of the lth multipath component, and τ_l is its time delay. Essentially, each reflection, diffraction, or scattering adds an additional impulse to (33) with a particular time delay (depending on the extra path length compared to the line-of-sight path) and complex amplitude (depending on the losses and phase shifts experienced along the path). The amplitudes of the impulses tend to decay over time, but multipath with delays of hundreds of nanoseconds often exist in indoor links and delays of several microseconds often exist in outdoor links.
The channel in (33) has infinite bandwidth (why?) so isn't really what we'd see if we looked just at a narrow band of the channel.

Example: 2400-2480 MHz Measured Radio Channel
The 'Delay Profile' is shown in Figure 30 for an example radio channel. Actually what we observe is

PDP(t) = r(t) ⋆ x(t) = (x(t) ⋆ h(t)) ⋆ x(t)
PDP(t) = r(t) ⋆ x(t) = R_X(t) ⋆ h(t)
PDP(f) = H(f) S_X(f)    (34)

Note that 200 ns is 0.2 µs is 1/(5 MHz). At this symbol rate, each symbol would fully bleed into the next symbol.
Figure 30: Multiple delay profiles measured on two example links: (a) 13 to 43, and (b) 14 to 43.
How about for a 5 kHz symbol rate? Why or why not?
Other solutions:
• OFDM (Orthogonal Frequency Division Multiplexing)
• Direct Sequence Spread Spectrum / CDMA
• Adaptive Equalization
21.2 Transmitter and Receiver Filters

Example: Bipolar PAM Signal Input to a Filter
How does this affect the correlation receiver and detector? Symbols sent over time are no longer orthogonal – one symbol 'leaks' into the next (or beyond).
22 Nyquist Filtering

Key insight:
• We don't need the leakage to be zero at all times.
• The value out of the matched filter (correlator) is taken only at times nT_s, for integer n.

Nyquist filtering has the objective of, at the matched filter output, making the sum of 'leakage' or ISI from one symbol be exactly 0 at all other sampling times nT_s.
Where have you seen this condition before? This condition, that there is an inner product of zero, is the condition for orthogonality of two waveforms. What we need are pulse shapes (waveforms) that, when shifted in time by T_s, are orthogonal to each other. In more mathematical language, we need p(t) such that

{ φ_n(t) = p(t − nT_s) }_{n=...,−1,0,1,...}

forms an orthonormal basis. The square-root raised cosine, shown in Figure 32, accomplishes this. But lots of other functions also accomplish this. The Nyquist filtering theorem provides an easy means to come up with others.
Theorem: Nyquist Filtering
A necessary and sufficient condition for x(t) to satisfy

x(nT_s) = { 1, n = 0; 0, o.w. }

is that its Fourier transform X(f) satisfy

Σ_{m=−∞}^{∞} X( f + m/T_s ) = T_s

Proof: on page 833, Appendix A, of the Rice book.
Basically, X(f) could be:
• X(f) = rect(fT_s), i.e., exactly constant within the −1/(2T_s) < f < 1/(2T_s) band and zero outside.
• It may bleed into the next 'channel' but the sum of

⋯ + X( f − 1/T_s ) + X(f) + X( f + 1/T_s ) + ⋯

must be constant across all f.

Thus X(f) is allowed to bleed into other frequency bands – but the neighboring frequency-shifted copy of X(f) must be lower s.t. the sum is constant. See Figure 31.
If X(f) only bleeds into one neighboring channel (that is, X(f) = 0 for all |f| > 1/T_s), we denote the difference between the ideal rect function and our X(f) as ∆(f),

∆(f) = |X(f) − rect(fT_s)|

then we can rewrite the Nyquist filtering condition as,

∆( 1/(2T_s) − f ) = ∆( 1/(2T_s) + f )

for −1/T_s ≤ f < 1/T_s. Essentially, ∆(f) is symmetric about f = 1/(2T_s). See Figure 31.
Figure 31: The Nyquist filtering criterion – 1/T_s frequency-shifted copies of X(f) must add up to a constant.
Bateman presents this whole condition as "A Nyquist channel response is characterized by the transfer function having a transition band that is symmetrical about a frequency equal to 0.5 × 1/T_s."

Activity: Do-it-yourself Nyquist filter. Take a sheet of paper and fold it in half, and then in half the other direction. Cut a function in the thickest side (the edge that you just folded). Leave a tiny bit so that it is not completely disconnected into two halves. Unfold. Drawing a horizontal line for the frequency f axis, the middle is 0.5/T_s, and the vertical axis is X(f).
22.1 Raised Cosine Filtering

H_RC(f) = { T_s,                                                      0 ≤ |f| ≤ (1−α)/(2T_s)
          { (T_s/2) [ 1 + cos( (πT_s/α) ( |f| − (1−α)/(2T_s) ) ) ],    (1−α)/(2T_s) ≤ |f| ≤ (1+α)/(2T_s)
          { 0,                                                         o.w.
    (35)
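Equation (35) can be checked against the Nyquist criterion numerically; a minimal Python sketch (the α and T_s values are my choices):

```python
import math

def H_RC(f, Ts=1.0, alpha=0.5):
    """Raised cosine frequency response, eq. (35)."""
    af = abs(f)
    f1 = (1 - alpha) / (2 * Ts)
    f2 = (1 + alpha) / (2 * Ts)
    if af <= f1:
        return Ts
    if af <= f2:
        return Ts / 2 * (1 + math.cos(math.pi * Ts / alpha * (af - f1)))
    return 0.0

# Nyquist criterion: sum over m of H_RC(f + m/Ts) should equal Ts for every f.
Ts = 1.0
for f in (0.0, 0.2, 0.45, 0.5, 0.7):
    total = sum(H_RC(f + m / Ts, Ts) for m in range(-3, 4))
    assert abs(total - Ts) < 1e-12
print("Nyquist criterion satisfied")
```

The transition band's symmetry about 0.5/T_s is exactly what makes the shifted copies sum to a constant.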
22.2 Square-Root Raised Cosine Filtering

But we usually need to design a system with two identical filters, one at the transmitter and one at the receiver, which in series produce a zero-ISI filter. In other words, we need |H(f)|² to meet the Nyquist filtering condition.

H_RRC(f) = { √T_s,                                                          0 ≤ |f| ≤ (1−α)/(2T_s)
           { √( (T_s/2) [ 1 + cos( (πT_s/α) ( |f| − (1−α)/(2T_s) ) ) ] ),    (1−α)/(2T_s) ≤ |f| ≤ (1+α)/(2T_s)
           { 0,                                                              o.w.
    (36)
23 M-ary Detection Theory in N-dimensional Signal Space

We are going to start to talk about QAM, a modulation with two-dimensional signal vectors, and later even higher-dimensional signal vectors. We have developed M-ary detection theory for 1-D signal vectors, and now we will extend that to N-dimensional signal vectors.
Setup:
• Transmit: one of M possible symbols, a_0, . . . , a_{M−1}.
Figure 32: Two SRRC pulses, delayed in time by nT_s for any integer n, are orthogonal to each other.
• Receive: the symbol plus noise:

H_0: X = a_0 + n
⋮
H_{M−1}: X = a_{M−1} + n

• Assume: n is multivariate Gaussian, each component n_i is independent with zero mean and variance σ²_N = N_0/2.
• Assume: Symbols are equally likely.
• Question: What are the optimal decision regions?

When symbols are equally likely, the optimal decision turns out to be given by the maximum likelihood receiver,

î = argmax_i log f_{X|H_i}(x|H_i)    (37)
Here,

f_{X|H_i}(x|H_i) = (1/(2πσ²_N)^{N/2}) exp( −Σ_j (x_j − a_{i,j})² / (2σ²_N) )

which can also be written as

f_{X|H_i}(x|H_i) = (1/(2πσ²_N)^{N/2}) exp( −‖x − a_i‖² / (2σ²_N) )
So when we want to solve (37) we can simplify quickly to:

î = argmax_i [ log( 1/(2πσ²_N)^{N/2} ) − ‖x − a_i‖² / (2σ²_N) ]

î = argmax_i −‖x − a_i‖² / (2σ²_N)

î = argmin_i ‖x − a_i‖²

î = argmin_i ‖x − a_i‖    (38)
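The chain of simplifications ending in (38) says that ML detection is a nearest-neighbor search over the signal space points. A minimal numpy sketch (the 2-D constellation here is an illustrative choice, not from the notes):

```python
import numpy as np

# ML detection in N-dimensional signal space with equal priors and i.i.d.
# Gaussian noise reduces to eq. (38): pick the closest signal point.
points = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])

def ml_decide(x):
    """Return the index of the signal space point nearest to x."""
    return int(np.argmin(np.linalg.norm(points - x, axis=1)))

print(ml_decide(np.array([0.9, 1.2])))    # closest to [1, 1] -> index 0
print(ml_decide(np.array([-0.2, -0.7])))  # closest to [-1, -1] -> index 3
```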
Again: Just find the a_i in the signal space diagram which is closest to x.

Pairwise Comparisons: When is x closer to a_i than to a_j for some other signal space point j? Solution: The two decision regions are separated by a straight line. (Note: replace "line" with a plane in 3-D, or a hyperplane in N-D.) To find this line:
1. Draw a line segment connecting a_i and a_j.
2. Draw a point in the middle of that line segment.
3. Draw the perpendicular bisector of the line segment through that point.
Why?
Solution: Try to find the locus of points x which satisfy

‖x − a_i‖² = ‖x − a_j‖²

You can do this by using the inner product to represent the magnitude squared operator:

(x − a_i)ᵀ(x − a_i) = (x − a_j)ᵀ(x − a_j)

Then use FOIL (multiply out), cancel, and reorganize to find a linear equation in terms of x. This is left as an exercise.
Decision Regions: Each pairwise comparison results in a linear division of space. The combined decision region R_i is the space which is the intersection of all pairwise decision regions. (All conditions must be satisfied.)

Example: Optimal Decision Regions
See Figure 33.
Lecture 14
Today: (1) QAM & PSK, (2) QAM Probability of Error
24 Quadrature Amplitude Modulation (QAM)

Quadrature Amplitude Modulation (QAM) is a two-dimensional signalling method which uses the in-phase and quadrature (cosine and sine waves, respectively) as the two dimensions. Thus QAM uses two basis functions. These are:

φ_0(t) = √2 p(t) cos(ω_0 t)
φ_1(t) = √2 p(t) sin(ω_0 t)

where p(t) is a pulse shape (like the ones we've looked at previously) with support on T_1 ≤ t ≤ T_2. That is, p(t) is only non-zero within that window.
Previously, we've separated frequency up-conversion and down-conversion from pulse shaping. This definition of the orthonormal basis specifically considers a pulse shape at a frequency ω_0. We include it here because
Figure 33: Example signal space diagrams. Draw the optimal decision regions.
it is critical to see how, with the same pulse shape p(t), we can have two orthogonal basis functions. (This is not intuitive!)
We have two restrictions on p(t) that make these two basis functions orthonormal:
• p(t) is unit-energy.
• p(t) is 'low-pass'; that is, it has low frequency content compared to the carrier at ω_0.
24.1 Showing Orthogonality
Let’s show that the basis functions are orthogonal. We’ll need these cosine / sine function relationships:

sin(2A) = 2 cos A sin A
cos A − cos B = −2 sin((A+B)/2) sin((A−B)/2)

As a first example, let p(t) = c for a constant c, for T_1 ≤ t ≤ T_2. Are the two bases orthogonal? First, show that

⟨φ_0(t), φ_1(t)⟩ = c² sin(ω_0(T_2+T_1)) sin(ω_0(T_2−T_1)) / ω_0
Solution: First, see how far we can get before plugging in for p(t):

⟨φ_0(t), φ_1(t)⟩ = ∫_{T_1}^{T_2} √2 p(t) cos(ω_0 t) √2 p(t) sin(ω_0 t) dt
                 = 2 ∫_{T_1}^{T_2} p²(t) cos(ω_0 t) sin(ω_0 t) dt
                 = ∫_{T_1}^{T_2} p²(t) sin(2ω_0 t) dt
Next, consider the constant p(t) = c for a constant c, for T_1 ≤ t ≤ T_2:

⟨φ_0(t), φ_1(t)⟩ = c² ∫_{T_1}^{T_2} sin(2ω_0 t) dt
                 = c² [ −cos(2ω_0 t) / (2ω_0) ] evaluated from T_1 to T_2
                 = −c² [cos(2ω_0 T_2) − cos(2ω_0 T_1)] / (2ω_0)
I want the answer in terms of T_2 − T_1 (for reasons explained below), so applying the cos A − cos B identity,

⟨φ_0(t), φ_1(t)⟩ = c² sin(ω_0(T_2+T_1)) sin(ω_0(T_2−T_1)) / ω_0
We can see that there are two cases:

1. The pulse duration T_2 − T_1 is an integer number of cycles of sin(2ω_0 t). That is, ω_0(T_2 − T_1) = πk for some integer k. In this case, the right-hand sin is zero, and so the correlation is exactly zero.

2. Otherwise, the numerator is bounded above and below by +1 and −1, because it is a product of sinusoids. That is,

−c²/ω_0 ≤ ⟨φ_0(t), φ_1(t)⟩ ≤ c²/ω_0

Typically, ω_0 is a large number. For example, the carrier frequency f_0 = ω_0/(2π) could be in the MHz or GHz ranges. Certainly, when we divide by numbers on the order of 10^6 or 10^9, we’re going to get a very small inner product. For engineering purposes, φ_0(t) and φ_1(t) are orthogonal.
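A quick numerical check of this near-orthogonality is easy to run. This sketch uses made-up values (T_1 = 0, T_2 = 1 s, a 1 kHz carrier) and is an illustration, not part of the notes:

```python
import numpy as np

# With a constant unit-energy pulse p(t) = c on [T1, T2] and a carrier well
# above the pulse bandwidth, |<phi0, phi1>| is at most about c^2 / w0.
T1, T2 = 0.0, 1.0
c = 1.0 / np.sqrt(T2 - T1)          # unit-energy constant pulse
w0 = 2 * np.pi * 1000.0             # "large" carrier (made-up value)

t = np.linspace(T1, T2, 200001)
dt = t[1] - t[0]
phi0 = np.sqrt(2) * c * np.cos(w0 * t)
phi1 = np.sqrt(2) * c * np.sin(w0 * t)

inner = np.sum(phi0 * phi1) * dt    # Riemann-sum approximation of <phi0, phi1>
energy0 = np.sum(phi0**2) * dt      # approximation of ||phi0||^2

print(round(energy0, 3), abs(inner) < 1e-3)
```

Here ω_0(T_2 − T_1) = 2000π is an integer multiple of π, so the inner product is analytically zero; numerically it is tiny, while each basis function has unit energy.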
Finally, we can attempt the proof for the case of an arbitrary pulse shape p(t). In this case, we use the ‘low-pass’ assumption that the maximum frequency content of p(t) is much lower than f_0 = ω_0/(2π). This assumption allows us to assume that p(t) is nearly constant over the course of one cycle of the carrier sinusoid.
This is well-illustrated in Figure 5.15 in the Rice book (page 298). In this figure, we see a pulse modulated by a sine wave at the carrier frequency. Zooming in on any few cycles, you can see that the pulse p²(t) is largely constant across each cycle. Thus, when we integrate p²(t) sin(2ω_0 t) across one cycle, we’re going to end up with approximately zero. Proof?
Solution: How many cycles are there? Consider the period [T_1, T_2]. How many times can it be divided by π/ω_0? Let the integer number be

L = ⌊(T_2 − T_1) ω_0 / π⌋.

Then each cycle is in the period [T_1 + (i−1)π/ω_0, T_1 + iπ/ω_0], for i = 1, ..., L. Within each of these cycles, assume p²(t) is nearly constant. Then, in the same way that the earlier integral was zero, this part of the integral is zero here. The only remainder is the partial-cycle period [T_1 + Lπ/ω_0, T_2]. Thus
⟨φ_0(t), φ_1(t)⟩ ≈ p²(T_1 + Lπ/ω_0) ∫_{T_1+Lπ/ω_0}^{T_2} sin(2ω_0 t) dt
                 = −p²(T_1 + Lπ/ω_0) [cos(2ω_0 T_2) − cos(2ω_0 T_1 + 2πL)] / (2ω_0)
                 = p²(T_1 + Lπ/ω_0) sin(ω_0(T_2+T_1)) sin(ω_0(T_2−T_1)) / ω_0
Again, the inner product is bounded on either side:

−p²(T_1 + Lπ/ω_0)/ω_0 ≤ ⟨φ_0(t), φ_1(t)⟩ ≤ p²(T_1 + Lπ/ω_0)/ω_0

which for very large ω_0 is nearly zero.
24.2 Constellation
With these two basis functions, M-ary QAM is defined as an arbitrary signal set a_0, ..., a_{M−1}, where each signal space vector a_k is two-dimensional:

a_k = [a_{k,0}, a_{k,1}]^T
The signal corresponding to symbol k in (M-ary) QAM is thus

s(t) = a_{k,0} φ_0(t) + a_{k,1} φ_1(t)
     = a_{k,0} √2 p(t) cos(ω_0 t) + a_{k,1} √2 p(t) sin(ω_0 t)
     = √2 p(t) [a_{k,0} cos(ω_0 t) + a_{k,1} sin(ω_0 t)]
Note that we could also write the signal s(t) as

s(t) = √2 p(t) Re{ a_{k,0} e^{−jω_0 t} + j a_{k,1} e^{−jω_0 t} }
     = √2 p(t) Re{ e^{−jω_0 t} (a_{k,0} + j a_{k,1}) }    (39)
In many textbooks, you will see a QAM signal written in the shorthand

s_CB(t) = p(t) (a_{k,0} + j a_{k,1})

This is called ‘complex baseband’. If you do the following operation you can recover the real signal s(t) as

s(t) = √2 Re{ e^{−jω_0 t} s_CB(t) }

You will not be responsible for complex baseband notation, but you should be able to read other books and know what they’re talking about.
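A short numerical sketch can confirm that the complex baseband form reproduces the passband signal built from the two basis functions; the symbol values and carrier below are made up for illustration:

```python
import numpy as np

# Recover the real passband QAM signal from its complex baseband form
# s_CB(t) = p(t)(a_k0 + j a_k1) via s(t) = sqrt(2) Re{ e^{-j w0 t} s_CB(t) }.
a_k0, a_k1 = 0.7, -0.3               # made-up symbol coordinates
w0 = 2 * np.pi * 50.0                # made-up carrier
t = np.linspace(0.0, 1.0, 5000)
p = np.ones_like(t)                  # constant pulse, for illustration

s_cb = p * (a_k0 + 1j * a_k1)        # complex baseband signal
s = np.sqrt(2) * np.real(np.exp(-1j * w0 * t) * s_cb)

# Direct passband construction from the two basis functions:
s_direct = np.sqrt(2) * p * (a_k0 * np.cos(w0 * t) + a_k1 * np.sin(w0 * t))

print(np.allclose(s, s_direct))      # True
```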
Then, we can see that the signal space representation a_k is given by

a_k = [a_{k,0}, a_{k,1}]^T    for k = 0, ..., M−1
24.3 Signal Constellations

Figure 34: Square signal constellation for M = 64 QAM.

• See Figure 5.17 for examples of square QAM. These constellations use M = 2^a for some even integer a, and arrange the points in a grid. One such diagram for M = 64 square QAM is also given here in Figure 34.
• Figure 5.18 shows examples of constellations which use M = 2^a for some odd integer a. These are either rectangular grids, or squares with the corners cut out.
24.4 Angle and Magnitude Representation
You can plot a_k in signal space and see that it has a magnitude (distance from the origin) of |a_k| = √(a²_{k,0} + a²_{k,1}) and an angle of ∠a_k = tan^{−1}(a_{k,1}/a_{k,0}). In the continuous-time signal s(t) this is

s(t) = √2 p(t) |a_k| cos(ω_0 t − ∠a_k)
24.5 Average Energy in M-QAM
Recall that the average energy is calculated as:

E_s = (1/M) Σ_{i=0}^{M−1} |a_i|²
E_b = (1/(M log_2 M)) Σ_{i=0}^{M−1} |a_i|²

where E_s is the average energy per symbol and E_b is the average energy per bit. We’ll work in class some examples of finding E_b in different constellation diagrams.
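As one sketch of this computation for square M-QAM (assuming the standard grid with amplitude levels ±A, ±3A, ...; the function name is mine):

```python
import numpy as np

def qam_energies(M, A=1.0):
    """Average symbol and bit energy of square M-QAM on the standard grid."""
    n = int(np.sqrt(M))
    levels = A * (2 * np.arange(n) - (n - 1))   # e.g. [-3A,-A,A,3A] for n=4
    I, Q = np.meshgrid(levels, levels)
    mags_sq = I**2 + Q**2                       # |a_i|^2 for every point
    Es = mags_sq.mean()                         # average energy per symbol
    Eb = Es / np.log2(M)                        # average energy per bit
    return Es, Eb

Es4, Eb4 = qam_energies(4)       # QPSK viewed as 4-QAM: Es = 2 A^2
Es64, Eb64 = qam_energies(64)    # Es = 42 A^2 for the standard 64-QAM grid
print(Es4, Es64)
```

This reproduces the known result E_s = (2/3)(M − 1)A² for square grids: 2A² for M = 4 and 42A² for M = 64.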
24.6 Phase-Shift Keying
Some implementations of QAM limit the constellation to include only signal space vectors with equal magnitude, i.e.,

|a_0| = |a_1| = ··· = |a_{M−1}|

The points a_i for i = 0, ..., M−1 are uniformly spaced on a circle about the origin. Some examples are shown in Figure 35.

BPSK: For example, binary phase-shift keying (BPSK) is the case of M = 2. Thus BPSK is the same as bipolar (2-ary) PAM. What is the probability of error in BPSK? The same as in bipolar PAM, i.e., for equally probable symbols,

P[error] = Q(√(2 E_b / N_0)).

QPSK: M = 4 PSK is also called quadrature phase-shift keying (QPSK), and is shown in Figure 35(a). Note that a rotation of the signal space diagram doesn’t matter, so both ‘versions’ are identical in concept (although they would be slightly different implementations). Note how QPSK is the same as M = 4 square QAM.
24.7 Systems which use QAM
See [Couch 2007], Wikipedia, and the Rice book:
• Digital microwave relay, various manufacturer-specific protocols, at 6 GHz and 11 GHz.
• Dial-up modems: use an M = 16 or M = 8 QAM constellation.
• DSL. G.DMT uses multicarrier methods (OFDM, up to 256 carriers), and on each narrowband (4.3 kHz) carrier, it can send up to 2^15 = 32,768 QAM. G.Lite uses up to 128 carriers, each with up to 2^8 = 256 QAM.
• Cable modems. Downstream: 6 MHz bandwidth channel, with 64 QAM or 256 QAM. Upstream: QPSK or 16 QAM.
• 802.11a and 802.11g: adaptive modulation methods, using up to 64 QAM.
• Digital Video Broadcast (DVB): APSK used in the ETSI standard.

Figure 35: Signal constellations for (a) M = 4 PSK and (b) M = 8 and M = 16 PSK.
25 QAM Probability of Error

25.1 Overview of Future Discussions on QAM
Before we get into details about the probabilities of error, you should realize that there are two main considerations when a constellation is designed for a particular M:
1. Keeping equal distances between neighboring points tends to reduce the probability of error.
2. The highest-magnitude points (furthest from the origin) strongly (negatively) impact average bit energy.
There are also some practical considerations that we will discuss:
1. Some constellations have more complicated decision regions which require more computation to implement in a receiver.
2. Some constellations are more difficult (energy-consuming) to amplify at the transmitter.
25.2 Options for Probability of Error Expressions
In order of preference:
1. Exact formula. In a few cases, there is an exact expression for P[error] in an AWGN environment.
2. Union bound. This is a provable upper bound on the probability of error. It is not an approximation. It can be used for “worst case” analysis, which is often very useful for engineering design of systems.
3. Nearest-neighbor approximation. This is a way to get a solution that is analytically easier to handle. Typically these approximations are good at high E_b/N_0.
25.3 Exact Error Analysis
Exact probability of error formulas in N-dimensional modulations can be very difficult to find. This is because our decision regions are more complex than one threshold test. They require an integration of an N-D Gaussian pdf across an area. We needed a Q-function to get a tail probability for a 1-D Gaussian pdf. Now we need not just a tail probability, but more like a section probability, which is the volume under some part of the N-dimensional Gaussian pdf.

For example, consider M-ary PSK. Essentially, we must calculate the probability of symbol error as 1 minus the probability mass in the sector within ±π/M of the correct angle φ_i. That is,

P(symbol error) = 1 − ∫_{r ∈ R_i} (1/(2πσ²)) e^{−‖r − a_i‖²/(2σ²)} dr    (40)

This integral is a double integral, and we don’t generally have any exact expression to use to express the result in general.

Figure 36: Signal space diagram for M-ary PSK for M = 8 and M = 16.
25.4 Probability of Error in QPSK
In QPSK, the probability of error is analytically tractable. Consider the QPSK constellation diagram when Gray encoding is used. You have already calculated the decision regions for each symbol; now consider the decision region for the first bit.

The decision is made using only one dimension of the received signal vector x, specifically x_1. Similarly, the second bit decision is made using only x_2. Also, the noise contribution to each element is independent. The decisions are decoupled: x_2 has no impact on the decision about bit one, and x_1 has no impact on the decision on bit two. Since we know the bit error probability for each bit decision (it is the same as bipolar PAM) we can see that the bit error probability is also

P[error] = Q(√(2 E_b / N_0))    (41)

This is an extraordinary result: the bit rate will double in QPSK, but in theory, the bit error rate does not increase. As we will show later, the bandwidth of QPSK is identical to that of BPSK.
25.5 Union Bound
From 5510 or an equivalent class (or a Venn diagram) you may recall the probability formula that for two events E and F,

P[E ∪ F] = P[E] + P[F] − P[E ∩ F]

You can prove this from the three axioms of probability. (This holds for any events E and F!) Then, using the above formula and the first axiom of probability, we have that

P[E ∪ F] ≤ P[E] + P[F].    (42)

Furthermore, from (42) it is straightforward to show that for any list of events E_1, E_2, ..., E_n we have that

P[ ∪_{i=1}^{n} E_i ] ≤ Σ_{i=1}^{n} P[E_i]    (43)

This is called the union bound, and it is very useful across communications. If you know one inequality, know this one. It is useful when the overlaps E_i ∩ E_j are small but difficult to calculate.
25.6 Application of Union Bound
In this class, our events are typically decision events. The event E_i may represent the event that we decide H_i when some other symbol, not equal to i, was actually sent. In this case,

P[symbol error] ≤ Σ_{i=1}^{n} P[E_i]    (44)

The union bound gives a conservative estimate. This can be useful for a quick initial study of a modulation type.
Example: QPSK
First, let’s study the union bound for QPSK, as shown in Figure 37(a). Assume s_1(t) is sent. We know from previous analysis that

P[error] = Q(√(2 E_b / N_0))

So the probability of symbol error is one minus the probability that there was no error in either bit,

P[symbol error] = 1 − (1 − Q(√(2 E_b / N_0)))²    (45)

In contrast, let’s calculate the union bound on the probability of error. We define the two error events as:
Figure 37: Union bound examples. (a) is QPSK with symbols of equal energy √E_s. In (b), a_1 = −a_3 = [0, √(3E_s/2)]^T and a_2 = −a_0 = [√(E_s/2), 0]^T.
• E_2, the event that r falls on the wrong side of the decision boundary with a_2.
• E_0, the event that r falls on the wrong side of the decision boundary with a_0.

Then we write

P[symbol error | H_1] = P[E_2 ∪ E_0 ∪ E_3]
Questions:
1. Is this equation exact?
2. Is E_2 ∪ E_0 ∪ E_3 = E_2 ∪ E_0? Why or why not?

Because of the answer to question (2),

P[symbol error | H_1] = P[E_2 ∪ E_0]
But don’t be confused. The Rice book will not tell you to do this! Its strategy is to ignore redundancies, because we are only looking for an upper bound anyway. Either way produces an upper bound, and in fact, both produce nearly identical numerical results. However, it is perfectly acceptable (and in fact gives a better bound) to remove redundancies and then apply the bound.

Then, we use the union bound:

P[symbol error | H_1] ≤ P[E_2] + P[E_0]

These two probabilities are just the probability of error for a binary modulation, and both are identical, so

P[symbol error | H_1] ≤ 2 Q(√(2 E_b / N_0))
What is missing? What is the difference in this expression compared to (45)?
See Figure 38 to compare the union bound probability of error with the exact expression. Only at very low E_b/N_0 is there any noticeable difference!
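The comparison in Figure 38 is easy to reproduce numerically. This sketch (the function names are mine) implements (45) and the union bound, using the identity Q(x) = erfc(x/√2)/2:

```python
import math

def qfunc(x):
    """Gaussian tail probability Q(x) via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def qpsk_exact(ebn0_db):
    """Exact QPSK symbol error probability, 1 - (1 - Q(sqrt(2 Eb/N0)))^2."""
    ebn0 = 10 ** (ebn0_db / 10)
    q = qfunc(math.sqrt(2 * ebn0))
    return 1 - (1 - q) ** 2

def qpsk_union(ebn0_db):
    """Union bound 2 Q(sqrt(2 Eb/N0)) with the redundant event dropped."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 2 * qfunc(math.sqrt(2 * ebn0))

for db in (0, 3, 6):
    print(db, qpsk_exact(db), qpsk_union(db))
```

Since 2q − q² ≤ 2q for any probability q, the union bound always lies above the exact curve, and the gap (the q² term) shrinks rapidly as E_b/N_0 grows.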
Figure 38: For QPSK, the exact probability of symbol error expression vs. the union bound.

Example: 4-QAM with two amplitude levels
This is shown (poorly) in Figure 37(b). The amplitudes of the top and bottom symbols are √3 times the amplitude of the symbols on the right and left. (They are positioned to keep the distance between points in the signal space equal to √(2E_s).) I am calling this “2-amplitude 4-QAM” (I made it up).
What is the union bound on the probability of symbol error, given H_1? Solution: Given symbol 1, the probability is the same as above. Defining E_2 and E_0 as above, the two distances between symbol a_1 and a_2 or a_0 are the same: √(2E_s). Thus the formula for the union bound is the same.

What is the union bound on the probability of symbol error, given H_2? Solution: Now, it is

P[symbol error | H_2] ≤ 3 Q(√(2 E_b / N_0))

So, averaging over the equally probable symbols, the overall union bound on the probability of symbol error is

P[symbol error] ≤ [(2)(3) + (2)(2)] / 4 · Q(√(2 E_b / N_0)) = 2.5 Q(√(2 E_b / N_0))
How about average energy? For QPSK, the symbol energies are all equal, so E_av = E_s. For the two-amplitude 4-QAM modulation,

E_av = E_s [2(0.5) + 2(1.5)] / 4 = E_s

Thus there is no advantage to the two-amplitude QAM modulation in terms of average energy.
25.6.1 General Formula for Union Bound-based Probability of Error
This is given in Rice 6.76:

P[symbol error] ≤ (1/M) Σ_{m=0}^{M−1} Σ_{n=0, n≠m}^{M−1} P[decide H_n | H_m]

Note the ≤ sign. This means the actual P[symbol error] will be at most this value; it is probably less than this value. This is a general formula and is not necessarily the best upper bound. What do we mean? If I draw any function that is always above the actual P[symbol error] curve, I have drawn an upper bound. But I could draw lots of functions that are upper bounds, some higher than others.

In particular, some of the error events “decide H_n given H_m” may be redundant, and do not need to be included. We will talk about the concept of “neighboring” symbols to symbol i. These neighbors are the ones that are necessary to include in the union bound.
Note that the Rice book uses E_avg for what these notes write as E_s, the average symbol energy, and E_b for the average bit energy. I prefer the symbol E_s to make clear we are talking about Joules per symbol.
Given this, the pairwise probability of error, P[decide H_n | H_m], is given by:

P[decide H_n | H_m] = Q(√( d²_{m,n} / (2N_0) )) = Q(√( (d²_{m,n} / (2E_s)) · (E_s / N_0) ))    (46)

where d_{m,n} is the Euclidean distance between the symbols m and n in the signal space diagram. If we use a constant A when describing the signal space vectors (as we usually do), then, since E_s will be proportional to A² and d²_{m,n} will be proportional to A², the factors A cancel out of the expression.

Proof of (46)?
Lecture 14
Today: (1) QAM Probability of Error, (2) Union Bound and Approximations
Lecture 15
Today: (1) M-ary PSK and QAM Analysis
26 QAM Probability of Error

26.1 Nearest-Neighbor Approximate Probability of Error
As it turns out, the probability of error is often well approximated by the terms in the union bound (Lecture 14, Section 25.6.1) with the smallest d_{m,n}. This is because a higher d_{m,n} means a higher argument in the Q-function, which in turn means a lower value of the Q-function. A little extra distance means a much lower value of the Q-function. So approximately,

P[symbol error] ≈ (N_min / M) Q(√( d²_min / (2N_0) ))
P[symbol error] ≈ (N_min / M) Q(√( (d²_min / (2E_s)) · (E_s / N_0) ))    (47)

where

d_min = min_{m≠n} d_{m,n},   m = 0, ..., M−1, and n = 0, ..., M−1,

and N_min is the number of ordered pairs of symbols which are separated by distance d_min.
26.2 Summary and Examples
We’ve been formulating in class a pattern, or step-by-step process, for finding the probability of bit or symbol error in arbitrary M-ary PSK and QAM modulations. These steps include:

1. Calculate the average energy per symbol, E_s, as a function of any amplitude parameters (typically we’ve been using A).
2. Calculate the average energy per bit, E_b, using E_b = E_s / log_2 M.
3. Draw decision boundaries. It is not necessary to include redundant boundary information, that is, an error region which is completely contained in a larger error region.
4. Calculate pairwise distances between pairs of symbols which contribute to the decision boundaries.
5. Convert distances to be in terms of E_b/N_0, if they are not already, using the answer to step (1).
6. Calculate the union bound on the probability of symbol error using the formula derived in the previous lecture,

P[symbol error] ≤ (1/M) Σ_{m=0}^{M−1} Σ_{n=0, n≠m}^{M−1} P[decide H_n | H_m]
P[decide H_n | H_m] = Q(√( d²_{m,n} / (2N_0) ))

7. Approximate using the nearest-neighbor approximation if needed:

P[symbol error] ≈ (N_min / M) Q(√( d²_min / (2N_0) ))

where d_min is the shortest pairwise distance in the constellation diagram, and N_min is the number of ordered pairs (i, j) which are that distance apart, d_{i,j} = d_min (don’t forget to count both (i, j) and (j, i)!).
8. Convert the probability of symbol error into the probability of bit error, if Gray encoding can be assumed:

P[error] ≈ (1 / log_2 M) P[symbol error]
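The bounding and approximation steps above can be sketched for an arbitrary constellation. This is an illustrative implementation (the function names and the QPSK test constellation are mine, not from the notes):

```python
import math
import itertools

def qfunc(x):
    """Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def union_bound(points, N0):
    """(1/M) * sum over ordered pairs (m, n), n != m, of Q(sqrt(d^2/(2 N0)))."""
    M = len(points)
    total = 0.0
    for m, n in itertools.permutations(range(M), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(points[m], points[n]))
        total += qfunc(math.sqrt(d2 / (2 * N0)))
    return total / M

def nearest_neighbor(points, N0):
    """(N_min / M) * Q(sqrt(d_min^2/(2 N0))), counting ordered pairs."""
    M = len(points)
    d2s = [sum((a - b) ** 2 for a, b in zip(points[m], points[n]))
           for m, n in itertools.permutations(range(M), 2)]
    d2min = min(d2s)
    nmin = sum(1 for d2 in d2s if abs(d2 - d2min) < 1e-12)
    return (nmin / M) * qfunc(math.sqrt(d2min / (2 * N0)))

# QPSK with points at (+-1, +-1), so Es = 2 and d_min^2 = 4.
qpsk = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
print(union_bound(qpsk, 0.5), nearest_neighbor(qpsk, 0.5))
```

For QPSK the nearest-neighbor term recovers 2Q(√(2E_b/N_0)); the union bound adds the small diagonal-pair term on top of it.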
Example: Additional QAM / PSK Constellations
Solve for the probability of symbol error, and then also the probability of bit error, for the signal space diagrams in Figure 39. You should calculate:
• An exact expression, if one should happen to be available,
• The union bound,
• The nearest-neighbor approximation.
The plots are in terms of amplitude A, but all probability of error expressions should be written in terms of E_b/N_0.

Figure 39: Signal space diagram for some example (made up) constellation diagrams.
Lecture 16
Today: (1) QAM Examples (Lecture 15), (2) FSK

27 Frequency Shift Keying
Frequency modulation in general changes the center frequency over time:

x(t) = cos(2π f(t) t).

Now, in frequency shift keying, symbols are selected to be sinusoids with frequency selected from a set of M different frequencies {f_0, f_1, ..., f_{M−1}}.
27.1 Orthogonal Frequencies
We will want our cosine waves at these M different frequencies to be orthonormal. Consider f_k = f_c + k∆f, and thus

φ_k(t) = √2 p(t) cos(2π f_c t + 2π k ∆f t)    (48)

where p(t) is a pulse shape, which for example could be a rectangular pulse:

p(t) = 1/√T_s for 0 ≤ t ≤ T_s, and 0 otherwise.

• Does this function have unit energy?
• Are these functions orthogonal?
Solution:

⟨φ_k(t), φ_m(t)⟩ = ∫_{−∞}^{∞} (2/T_s) cos(2π f_c t + 2π k ∆f t) cos(2π f_c t + 2π m ∆f t) dt
  = (1/T_s) ∫_0^{T_s} cos(2π (k−m) ∆f t) dt + (1/T_s) ∫_0^{T_s} cos(4π f_c t + 2π (k+m) ∆f t) dt
  ≈ (1/T_s) [ sin(2π (k−m) ∆f t) / (2π (k−m) ∆f) ] evaluated from 0 to T_s
  = sin(2π (k−m) ∆f T_s) / (2π (k−m) ∆f T_s)

(The second integral is approximately zero because its integrand oscillates at roughly twice the carrier frequency.) Yes, they are orthogonal if 2π(k−m)∆f T_s is a multiple of π. (They are also approximately orthogonal if ∆f is really big, but we don’t want to waste spectrum.) For general k ≠ m, this requires that ∆f T_s = n/2, i.e.,

∆f = n · 1/(2T_s) = n · f_sy/2

for integer n. (Otherwise, no, they’re not.)
Thus we need to plug into (48) ∆f = n/(2T_s) for some integer n in order to have an orthonormal basis. What n? We will see that we either use an n of 1 or 2.
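A numerical check of the minimum orthogonal spacing ∆f = 1/(2T_s) can be run directly from the integral; the carrier and symbol period below are made-up illustration values:

```python
import numpy as np

# Direct integration check: with df = 1/(2 Ts), the cross-correlation
# sin(2 pi (k-m) df Ts) / (2 pi (k-m) df Ts) vanishes.
fc, Ts = 20.0, 1.0
df = 1.0 / (2 * Ts)                 # minimum orthogonal spacing, n = 1
t = np.linspace(0.0, Ts, 200001)
dt = t[1] - t[0]

def phi(k):
    """k-th FSK basis function with a rectangular unit-energy pulse."""
    return np.sqrt(2.0 / Ts) * np.cos(2 * np.pi * (fc + k * df) * t)

inner_01 = np.sum(phi(0) * phi(1)) * dt   # should be ~0
energy_0 = np.sum(phi(0) ** 2) * dt       # should be ~1

print(round(energy_0, 3), abs(inner_01) < 1e-3)
```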
Signal space vectors a_i are given by

a_0 = [A, 0, ..., 0]
a_1 = [0, A, ..., 0]
  ⋮
a_{M−1} = [0, 0, ..., A]

What is the energy per symbol? Show that this means that A = √E_s.
For M = 2 and M = 3 these vectors are plotted in Figure 40.

Figure 40: Signal space diagram for M = 2 and M = 3 FSK modulation.
27.2 Transmission of FSK
At the transmitter, FSK can be seen as a switch between M different carrier signals, as shown in Figure 41.

Figure 41: Block diagram of a binary FSK transmitter, as a switch between two carriers.

But usually it is generated from a single VCO, as seen in Figure 42.

Def’n: Voltage Controlled Oscillator (VCO)
A sinusoidal generator whose frequency is linearly proportional to an input voltage.

Note that we don’t need to send a square wave input into the VCO. In fact, the bandwidth will be lower when we send less steep transitions, for example, SRRC pulses.
Def’n: Continuous Phase Frequency Shift Keying
FSK with no phase discontinuity between symbols. In other words, the phase of the output signal does not change instantaneously at symbol boundaries iT_s for integer i, and thus φ(t^− + iT_s) = φ(t^+ + iT_s), where t^− and t^+ are the limiting times just to the left and to the right of 0, respectively.
There are a variety of flavors of CPFSK, which are beyond the scope of this course.

Figure 42: Block diagram of a binary FSK transmitter built from a single VCO.
27.3 Reception of FSK
FSK reception is either phase coherent or phase non-coherent. Here, there are M possible carrier frequencies, so we’d need to know and be synchronized to M different phases θ_i, one for each symbol frequency:

cos(2π f_c t + θ_0)
cos(2π f_c t + 2π ∆f t + θ_1)
  ⋮
cos(2π f_c t + 2π (M−1) ∆f t + θ_{M−1})
27.4 Coherent Reception
FSK reception can be done via a correlation receiver, just as we’ve seen for previous modulation types.
Each phase θ_k is estimated as θ̂_k by a separate phase-locked loop (PLL). In one flavor of CPFSK (called Sunde’s FSK), the carrier cos(2π f_k t + θ_k) is sent along with the transmitted signal, to aid in demodulation (at the expense of the additional energy). This is the only case where I’ve heard of coherent FSK reception being used.
As M gets high, coherent detection becomes difficult. The M PLLs must operate even though each can only synchronize when its symbol is sent, 1/M of the time (assuming equally probable symbols). Also, having M PLLs is a drawback in itself.
27.5 Noncoherent Reception
Notice that in Figure 40, the sign or phase of the sinusoid is not very important: only one symbol exists in each dimension. In noncoherent reception, we just measure the energy in each frequency.
This is more difficult than it sounds, though; we have a fundamental problem. As we know, for every frequency, there are two orthogonal functions, cosine and sine (see QAM and PSK). Since we will not know the phase of the received signal, we don’t know whether or not the energy at frequency f_k correlates highly with the cosine wave or with the sine wave. If we only correlate it with one (the sine wave, for example), and the phase makes the signal the other (a cosine wave), we would get an inner product of zero!
The solution is that we need to correlate the received signal with both a sine and a cosine wave at the frequency f_k. This will give us two inner products; let’s call them x_k^I, using the capital I to denote in-phase, and x_k^Q, with Q denoting quadrature.
Figure 43: The energy in a noncoherent FSK receiver at one frequency f_k is calculated by finding its correlation with the cosine wave (x_k^I) and sine wave (x_k^Q) at the frequency of interest, f_k, and calculating the squared length of the vector [x_k^I, x_k^Q]^T.
The energy at frequency f_k, that is,

E_{f_k} = (x_k^I)² + (x_k^Q)²

is calculated for each frequency f_k, k = 0, 1, ..., M−1. You could prove this, but the optimal detector (maximum likelihood detector) is then to decide upon the frequency with the maximum energy E_{f_k}. Is this a threshold test?
We would need a new analysis of the FSK noncoherent detector to find its analytical probability of error.
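The quadrature-pair energy computation can be sketched numerically. Here the frequencies, symbol period, and random phase are all made up, and the tone spacing is ∆f = 1/T_s (the wider n = 2 spacing, which keeps the tones orthogonal in both quadratures):

```python
import numpy as np

# Noncoherent energy detector: correlate the received signal with both
# cos and sin at each candidate frequency, then pick the largest energy.
Ts = 1.0
freqs = [10.0, 11.0, 12.0]          # candidate FSK tones, df = 1/Ts
t = np.linspace(0.0, Ts, 100001)
dt = t[1] - t[0]

rng = np.random.default_rng(0)
phase = rng.uniform(0, 2 * np.pi)   # unknown carrier phase
r = np.cos(2 * np.pi * freqs[1] * t + phase)   # symbol 1 sent, no noise

energies = []
for f in freqs:
    x_i = np.sum(r * np.cos(2 * np.pi * f * t)) * dt   # in-phase correlation
    x_q = np.sum(r * np.sin(2 * np.pi * f * t)) * dt   # quadrature correlation
    energies.append(x_i**2 + x_q**2)

decision = int(np.argmax(energies))
print(decision)   # 1, regardless of the random phase
```

Whatever the unknown phase, the correct tone's energy is (T_s/2)² times the unit amplitude squared, while the other tones' energies are near zero; this is the phase-insensitivity that makes the detector noncoherent.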
27.6 Receiver Block Diagrams
The block diagram of the noncoherent FSK receiver is shown in Figures 45 and 46 (copied from the Proakis & Salehi book). Compare this to the coherent FSK receiver in Figure 44.

Figure 44: (Proakis & Salehi) Phase-coherent demodulation of M-ary FSK signals.
Figure 45: (Proakis & Salehi) Demodulation of M-ary FSK signals for noncoherent detection.
Figure 46: (Proakis & Salehi) Demodulation and square-law detection of binary FSK signals.
27.7 Probability of Error for Coherent Binary FSK
First, let’s look at coherent detection of binary FSK.
1. What is the detection threshold line separating the two decision regions?
2. What is the distance between points in the binary FSK signal space?
What is the probability of error for coherent binary FSK? It is the same as bipolar PAM, but the symbols are spaced differently (more closely) as a function of E_b. We had that

P[error]_{2-ary} = Q(√( d²_{0,1} / (2N_0) ))

Now, the spacing between symbols has been reduced by a factor of √2/2 compared to bipolar PAM, to d_{0,1} = √(2E_b). So

P[error]_{2-Co-FSK} = Q(√( E_b / N_0 ))

For the same probability of bit error, binary FSK is about 1.5 dB better than OOK (requires 1.5 dB less energy per bit), but 1.5 dB worse than bipolar PAM (requires 1.5 dB more energy per bit).
27.8 Probability of Error for Noncoherent Binary FSK
The energy detector shown in Figure 46 uses the energy in each frequency and selects the frequency with maximum energy.
This energy is denoted r_m² in Figure 46 for frequency m and is

r_m² = r_{mc}² + r_{ms}²

This energy measure is a statistic which measures how much energy was in the signal at frequency f_m. The ‘envelope’ is a term used for the square root of the energy, so r_m is termed the envelope.
Question: What will r_m² equal when the noise is very small?
As it turns out, given the noncoherent receiver and r_{mc} and r_{ms}, the envelope r_m is an optimum (sufficient) statistic to use to decide between s_1 ... s_M.
What do they do to prove this in Proakis & Salehi? They prove it for binary noncoherent FSK. It takes quite a bit of 5510 to do this proof.
1. Define the received vector r as a length-4 vector of the correlations of r(t) with the sin and cos at each frequency f_1, f_2.

2. They formulate the conditional densities f_{r|H_i}(r|H_i). Note that this depends on θ_m, which is assumed to be uniform between 0 and 2π, and independent of the noise:

f_{r|H_i}(r|H_i) = ∫_0^{2π} f_{r,θ_m|H_i}(r, φ|H_i) dφ
               = ∫_0^{2π} f_{r|θ_m,H_i}(r|φ, H_i) f_{θ_m|H_i}(φ|H_i) dφ    (49)

Note that f_{r|θ_m,H_0}(r|φ, H_0) is a Gaussian random vector with i.i.d. components.

3. They formulate the joint probabilities f_{r∩H_0}(r ∩ H_0) and f_{r∩H_1}(r ∩ H_1).

4. Where the joint probability f_{r∩H_0}(r ∩ H_0) is greater than f_{r∩H_1}(r ∩ H_1), the receiver decides H_0. Otherwise, it decides H_1.

5. The decisions in this last step, after manipulation of the pdfs, are shown to reduce to the following (given that P[H_0] = P[H_1]): decide H_0 if

√(r_{1c}² + r_{1s}²) > √(r_{2c}² + r_{2s}²)

and decide H_1 otherwise.

The “envelope detector” can equally well be called the “energy detector”, and it often is.
The above proof is simply FYI, and is presented since it does not appear in the Rice book.
An exact expression for the probability of error can be derived as well. The proof is in Proakis & Salehi, Section 7.6.9, page 430, which is posted on WebCT. The expression for the probability of error in binary noncoherent FSK is given by

P[error]_{2-NC-FSK} = (1/2) exp(−E_b / (2N_0))    (50)
The expressions for probability of error in binary FSK (both coherent and noncoherent) are important, and
you should make note of them. You will use them to be able to design communication systems that use FSK.
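A small sketch comparing the two binary FSK expressions (the function names are mine): note that (1/2)exp(−E_b/(2N_0)) is exactly the Chernoff bound on Q(√(E_b/N_0)), so the noncoherent error rate always lies above the coherent one.

```python
import math

def p_coherent_bfsk(ebn0_db):
    """Coherent binary FSK: Q(sqrt(Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0) / math.sqrt(2))

def p_noncoherent_bfsk(ebn0_db):
    """Noncoherent binary FSK: (1/2) exp(-Eb/(2 N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.exp(-ebn0 / 2)

for db in (4, 8, 12):
    print(db, p_coherent_bfsk(db), p_noncoherent_bfsk(db))
```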
Lecture 17
Today: (1) FSK Error Probabilities, (2) OFDM

27.9 FSK Error Probabilities, Part 2

27.9.1 M-ary Non-Coherent FSK
For M-ary noncoherent FSK, the derivation in the Proakis & Salehi book, Section 7.6.9, shows that

P[symbol error] = Σ_{n=1}^{M−1} (−1)^{n+1} (M−1 choose n) · 1/(n+1) · exp(−log_2 M · (n/(n+1)) · E_b/N_0)

and

P[error]_{M-NC-FSK} = (M/2)/(M−1) · P[symbol error]

See Figure 47.

Proof summary: Our noncoherent receiver finds the energy in each frequency. These energy values no longer have a Gaussian distribution (due to the squaring of the amplitudes in the energy calculation). They are instead either Ricean (for the transmitted frequency) or Rayleigh distributed (for the “other” M−1 frequencies). The probability that the correct frequency is selected is the probability that the Ricean random variable is larger than all of the other random variables measured at the other frequencies.
Example: Probability of Error for Noncoherent M = 2 case
Use the above expressions to find P[symbol error] and P[error] for binary noncoherent FSK.
Solution:

P[symbol error] = P[bit error] = (1/2) exp(−(1/2) · E_b/N_0)
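The alternating sum can be evaluated directly. This sketch (the function name is mine) checks that the M = 2 case reduces to the binary expression above:

```python
import math

def mfsk_noncoherent(M, ebn0_db):
    """Return (P[bit error], P[symbol error]) for M-ary noncoherent FSK."""
    ebn0 = 10 ** (ebn0_db / 10)
    k = math.log2(M)
    ps = sum((-1) ** (n + 1) * math.comb(M - 1, n) / (n + 1)
             * math.exp(-k * (n / (n + 1)) * ebn0)
             for n in range(1, M))
    return (M / 2) / (M - 1) * ps, ps

pb2, ps2 = mfsk_noncoherent(2, 6.0)
pb8, _ = mfsk_noncoherent(8, 6.0)
print(pb2, pb8)   # at this Eb/N0, error decreases as M grows
```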
Figure 47: Probability of bit error for noncoherent reception of M-ary FSK, for M = 2, 4, 8, 16, 32.
27.9.2 Summary
The key insight is that the probability of error decreases for increasing M. As M → ∞, the probability of error goes to zero, for the case when E_b/N_0 > −1.6 dB. Remember this for later; it is related to the Shannon-Hartley theorem on the limit of reliable communication.
27.10 Bandwidth of FSK
Carson’s rule is used to calculate the bandwidth of FM signals. For M-ary FSK, it tells us that the approximate bandwidth is

B_T = (M−1) ∆f + 2B

where B is the one-sided bandwidth of the digital baseband signal. For the null-to-null bandwidth of raised cosine pulse shaping, B = (1+α)/T_s. Note for square wave pulses, B = 1/T_s.
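As a sketch of Carson's rule as a function (names and the example numbers are mine):

```python
# Approximate transmission bandwidth of M-ary FSK by Carson's rule,
# B_T = (M - 1) df + 2 B, with B the one-sided baseband bandwidth.
def fsk_bandwidth(M, df, Ts, alpha=None):
    # Square pulses: B = 1/Ts; raised cosine: B = (1 + alpha)/Ts per the notes.
    B = (1.0 / Ts) if alpha is None else (1 + alpha) / Ts
    return (M - 1) * df + 2 * B

# Binary FSK with df = 1/(2 Ts) = 500 Hz, square pulses, Ts = 1 ms:
print(fsk_bandwidth(2, 500.0, 1e-3))   # 500 + 2000 = 2500.0 Hz
```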
28 Frequency Multiplexing
Frequency multiplexing is the division of the total bandwidth (call it B_T) into many smaller frequency bands, and sending a lower bit rate on each one.
Specifically, if you divide your bandwidth B_T into K subchannels, you now have B_T/K bandwidth on each subchannel. Note that in the multiplexed system, each subchannel has a symbol period of T_s K, longer by a factor of K, so that each subchannel has proportionally narrower bandwidth. In HW 7, you show that the total bit rate is the same in the multiplexed system, so you don’t lose anything by this division. Furthermore, your transmission energy is the same: the energy is divided by K in each subchannel, so the sum of the energies is constant.
For each band, we might send a PSK or PAM or QAM signal on that band. Our choice is arbitrary.
28.1 Frequency Selective Fading

Frequency selectivity is primarily a problem in wireless channels. But, frequency-dependent gains are also experienced in DSL, because phone lines were not designed for higher frequency signals. For example, consider Figure 48, which shows an example frequency selective fading pattern, 10 log_10 |H(f)|², where H(f) is an example wireless channel.

Figure 48: An example frequency selective fading pattern. The whole bandwidth is divided into K subchannels (each of width B_T/K), and each subchannel's channel filter is mostly constant; one subchannel may be severely faded.

This H(f) might be experienced in an average outdoor channel. (The f is relative, i.e., take the actual frequency f̃ = f + f_c for some center frequency f_c.)
• You can see that some channels experience severe attenuation, while most experience little attenuation. Some channels experience gain.

• For a stationary transmitter and receiver and a fixed f, the channel stays constant over time. If you put down the transmitter and receiver and have them operate at a particular f′, then the loss 10 log_10 |H(f′)|² will not change throughout the communication. If the channel loss is too great, then the error rate will be too high.
Problems:

1. Wideband: The bandwidth is used for one channel. The H(f) acts as a filter, which introduces ISI.

2. Narrowband: Part of the bandwidth is used for one narrow channel, at a lower bit rate. The power is increased by a fade margin which makes the E_b/N_0 high enough for low-BER demodulation except in the most extreme fading situations.
When designing for frequency selective fading, designers may use a wideband modulation method which is
robust to frequency selective fading. For example,
• Frequency multiplexing methods (including OFDM),
• Directsequence spread spectrum (DSSS), or
• Frequencyhopping spread spectrum (FHSS).
We’re going to mention frequency multiplexing.
28.2 Benefits of Frequency Multiplexing

Now, bit errors are not an 'all or nothing' game. In frequency multiplexing, there are K parallel bitstreams, each of rate R/K, where R is the total bit rate. As a first order approximation, a subchannel either experiences a high SNR and makes no errors; or is in a severe fade, has a very low SNR, and experiences a BER of 0.5 (the worst bit error rate!). If β is the probability that a subchannel experiences a severe fade, the overall probability of error will be 0.5β.

Compare this to a single narrowband channel, which has probability β of having a 0.5 probability of error. Similarly, a wideband system with very high ISI might be completely unable to demodulate the received signal.

Frequency multiplexing is typically combined with channel coding designed to correct a small percentage of bit errors.
28.3 OFDM as an extension of FSK

This is Section 5.5 in the Rice book.

In FSK, we use a single basis function at each of different frequencies. In QAM, we use two basis functions at the same frequency. OFDM is the combination:

    φ_{0,I}(t)   = √(2/T_s) cos(2π f_c t),                  0 ≤ t ≤ T_s  (0 otherwise)
    φ_{0,Q}(t)   = √(2/T_s) sin(2π f_c t),                  0 ≤ t ≤ T_s  (0 otherwise)
    φ_{1,I}(t)   = √(2/T_s) cos(2π f_c t + 2π∆f t),         0 ≤ t ≤ T_s  (0 otherwise)
    φ_{1,Q}(t)   = √(2/T_s) sin(2π f_c t + 2π∆f t),         0 ≤ t ≤ T_s  (0 otherwise)
    ...
    φ_{M−1,I}(t) = √(2/T_s) cos(2π f_c t + 2π(M−1)∆f t),    0 ≤ t ≤ T_s  (0 otherwise)
    φ_{M−1,Q}(t) = √(2/T_s) sin(2π f_c t + 2π(M−1)∆f t),    0 ≤ t ≤ T_s  (0 otherwise)

where ∆f = 1/(2T_s). These are all orthogonal functions! We can transmit much more information than is possible in M-ary FSK. (Note we have 2M basis functions here!)
The signal on subchannel k might be represented as:

    x_k(t) = √(2/T_s) [a_{k,I}(t) cos(2π f_k t) + a_{k,Q}(t) sin(2π f_k t)]

The complex baseband signal of the sum of all K signals might then be represented as

    x_l(t) = √(2/T_s) ℜ{ Σ_{k=1}^{K} (a_{k,I}(t) + j a_{k,Q}(t)) e^{j2πk∆f t} }
    x_l(t) = √(2/T_s) ℜ{ Σ_{k=1}^{K} A_k(t) e^{j2πk∆f t} }                    (51)

where A_k(t) = a_{k,I}(t) + j a_{k,Q}(t). Does this look like an inverse discrete Fourier transform? If yes, then you can see why it is possible to use an IFFT and FFT to generate the complex baseband signal.
1. FFT implementation: There is a particular implementation of the transmitter and receiver that uses FFT/IFFT operations. This avoids having K independent transmitter chains and receiver chains. The FFT implementation (and the speed and ease of implementation of the FFT in hardware) is why OFDM is popular.
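A quick numerical sketch of why the IFFT generates the sum in (51): sampling Σ_k A_k e^{j2πk∆f t} at K points over one symbol period is exactly an inverse DFT of the subcarrier symbols A_k. This is not the Rice book's code; the names (K, A, x) and the scaling convention are our own, chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8  # number of subcarriers (assumed small value for illustration)

# Random 4-QAM symbols A_k = a_{k,I} + j a_{k,Q}, one per subcarrier
A = (2 * rng.integers(0, 2, K) - 1) + 1j * (2 * rng.integers(0, 2, K) - 1)

# Transmitter: one OFDM symbol's complex baseband samples via an IFFT.
# Scale by K so the receiver's FFT below returns A_k directly.
x = np.fft.ifft(A) * K

# Receiver: an FFT recovers the subcarrier symbols (no noise, no channel here)
A_hat = np.fft.fft(x) / K
print(np.allclose(A_hat, A))  # True
```

The scaling is a bookkeeping choice; NumPy puts the 1/K factor in `ifft`, so multiplying by K and dividing again at the receiver makes the round trip exact.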
Since the K carriers are orthogonal, the signal is like K-ary FSK. But, rather than transmitting on one of the K carriers at a given time (like FSK) we transmit information in parallel on all K channels simultaneously. An example signal space diagram for K = 3 and PAM on each channel is shown in Figure 49.

Figure 49: Signal space diagram for K = 3 subchannel OFDM with 4-PAM on each channel.
Example: 802.11a
IEEE 802.11a uses OFDM with 52 subcarriers. Four of the subcarriers are reserved for pilot tones, so effectively 48 subcarriers are used for data. Each data subcarrier can be modulated in different ways. One example is to use 16 square QAM on each subcarrier (which is 4 bits per symbol per subcarrier). The symbol rate in 802.11a is 250k OFDM symbols/sec. Thus the bit rate is

    250 × 10³ (OFDM symbols/sec) × 48 (subcarriers/OFDM symbol) × 4 (coded bits/subcarrier) = 48 Mbits/sec
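The arithmetic above can be checked directly; the variable names here are ours:

```python
# 802.11a bit-rate arithmetic: OFDM symbol rate times data subcarriers
# times coded bits per subcarrier.
symbol_rate = 250e3      # OFDM symbols per second
data_subcarriers = 48    # 52 total minus 4 pilots
bits_per_subcarrier = 4  # 16-QAM carries log2(16) = 4 bits

bit_rate = symbol_rate * data_subcarriers * bits_per_subcarrier
print(bit_rate)  # 48000000.0, i.e., 48 Mbits/sec
```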
Lecture 18
Today: Comparison of Modulation Methods
29 Comparison of Modulation Methods
This section ﬁrst presents some miscellaneous information which is important to real digital communications
systems but doesn’t ﬁt nicely into other lectures.
29.1 Diﬀerential Encoding for BPSK
Def ’n: Coherent Reception
The reception of a signal when its carrier phase is explicitly determined and used for demodulation.
For coherent reception of BPSK, we will always need some kind of phase synchronization. Typically, this means transmitting a training sequence.
For noncoherent reception of PSK, we use diﬀerential encoding (at the transmitter) and decoding (at the
receiver).
29.1.1 DPSK Transmitter

Now, consider the bit sequence {b_n}, where b_n is the nth bit that we want to send. The sequence b_n is a sequence of 0's and 1's. How do we decide which phase to send? Prior to this, we've said: send a_0 if b_n = 0, and send a_1 if b_n = 1.

Instead of setting k for a_k only as a function of b_n, in differential encoding, we also include k_{n−1}. Now,

    k_n = k_{n−1},      if b_n = 0
    k_n = 1 − k_{n−1},  if b_n = 1

Note that 1 − k_{n−1} is the complement or negation of k_{n−1}: if k_{n−1} = 1 then 1 − k_{n−1} = 0; if k_{n−1} = 0 then 1 − k_{n−1} = 1. Basically, for differential BPSK, a switch in the angle of the signal space vector from 0° to 180° or vice versa indicates a bit 1; staying at the same angle indicates a bit 0.

Note that we just have to agree on the "zero" phase. Typically k_0 = 0.
Example: Differential encoding
Let b = [1, 0, 1, 0, 1, 1, 1, 0, 0]. Assume b_0 = 0. What symbols k = [k_0, ..., k_8]^T will be sent? Solution:

    k = [1, 1, 0, 0, 1, 0, 1, 1, 1]^T

These values of k_n correspond to a symbol stream with phases:

    ∠a_k = [π, π, 0, 0, π, 0, π, π, π]^T
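The encoding rule is easy to check in code. This sketch (the function name is our own) reproduces the example above:

```python
def dpsk_encode(bits, k0=0):
    # Differential encoding: k_n = k_{n-1} when b_n = 0,
    # and k_n = 1 - k_{n-1} when b_n = 1. k0 is the agreed "zero" phase symbol.
    k, prev = [], k0
    for b in bits:
        prev = prev if b == 0 else 1 - prev
        k.append(prev)
    return k

b = [1, 0, 1, 0, 1, 1, 1, 0, 0]
print(dpsk_encode(b))  # [1, 1, 0, 0, 1, 0, 1, 1, 1], as in the example
```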
29.1.2 DPSK Receiver

Now, at the receiver, we find b_n by comparing the phase of x_n to the phase of x_{n−1}. What our receiver does is measure the statistic

    ℜ{x_n x*_{n−1}} = |x_n| |x_{n−1}| cos(∠x_n − ∠x_{n−1})

If it is less than zero, decide b_n = 1; if it is greater than zero, decide b_n = 0.
Example: Differential decoding

1. Assuming no phase shift in the above encoding example, show that the receiver will decode the original bitstream with differential decoding. Solution: Assuming φ_0 = 0,

    b̂_n = [1, 0, 1, 0, 1, 1, 1, 0, 0]^T.

2. Now, assume that all bits are shifted π radians and we receive

    ∠x′ = [0, 0, π, π, 0, π, 0, 0, 0].

What will be decoded at the receiver? Solution:

    b̂_n = [0, 0, 1, 0, 1, 1, 1, 0, 0].

Rotating all symbols by π radians only causes one bit error.
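Both parts of this example can be reproduced with the decision statistic ℜ{x_n x*_{n−1}}. A sketch, with our own function name, assuming unit-magnitude symbols and an initial reference symbol at phase 0:

```python
import numpy as np

def dpsk_decode(phases):
    # Decide b_n = 1 when Re{x_n x*_{n-1}} < 0 (a phase flip), else b_n = 0,
    # starting from an assumed reference symbol with phase 0.
    x = np.exp(1j * np.asarray(phases, dtype=float))
    prev, bits = 1.0 + 0j, []
    for xn in x:
        bits.append(1 if np.real(xn * np.conj(prev)) < 0 else 0)
        prev = xn
    return bits

angles = np.pi * np.array([1, 1, 0, 0, 1, 0, 1, 1, 1])
print(dpsk_decode(angles))           # [1, 0, 1, 0, 1, 1, 1, 0, 0]
print(dpsk_decode(angles + np.pi))   # [0, 0, 1, 0, 1, 1, 1, 0, 0]: one bit error
```

Note the second call reproduces part 2: rotating every symbol by π flips only the first decision (made against the unrotated reference), so a constant unknown phase offset costs at most one bit.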
29.1.3 Probability of Bit Error for DPSK

The probability of bit error in DPSK is slightly worse than that for BPSK:

    P[error] = (1/2) exp(−E_b/N_0)

For a constant probability of error, DPSK requires about 1 dB more E_b/N_0 than BPSK, which has probability of bit error Q(√(2E_b/N_0)). Both are plotted in Figure 50.
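The two formulas can be compared numerically. A sketch (the erfc-based form of Q is standard, but the function names are ours):

```python
import math

def Q(x):
    # Gaussian tail probability via the complementary error function
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber_bpsk(ebn0):
    # BPSK: Q(sqrt(2 Eb/N0)), with Eb/N0 in linear units
    return Q(math.sqrt(2 * ebn0))

def ber_dpsk(ebn0):
    # DPSK: (1/2) exp(-Eb/N0)
    return 0.5 * math.exp(-ebn0)

for db in (6, 8, 10):
    ebn0 = 10 ** (db / 10)
    print(db, ber_bpsk(ebn0), ber_dpsk(ebn0))  # DPSK is always the larger
```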
Figure 50: Comparison of probability of bit error for BPSK and differential BPSK.
29.2 Points for Comparison

• Linear amplifiers (transmitter complexity)
• Receiver complexity
• Fidelity (P[error]) vs. E_b/N_0
• Bandwidth efficiency

    η          α = 0    α = 0.5    α = 1
    BPSK        1.0      0.67       0.5
    QPSK        2.0      1.33       1.0
    16-QAM      4.0      2.67       2.0
    64-QAM      6.0      4.0        3.0

Table 2: Bandwidth efficiency of PSK and QAM modulation methods using raised cosine filtering as a function of α.
29.3 Bandwidth Efficiency

We've talked about measuring data rate in bits per second. We've also talked about Hertz, i.e., the quantity of spectrum our signal will use. Typically, we can scale a system to increase the bit rate by decreasing the symbol period, and correspondingly increase the bandwidth. This relationship is typically linearly proportional.

Def'n: Bandwidth efficiency
The bandwidth efficiency, typically denoted η, is the ratio of bits per second to bandwidth: η = R_b/B_T.

Bandwidth efficiency depends on the definition of "bandwidth". Since it is usually used for comparative purposes, we just make sure we use the same definition of bandwidth throughout a comparison. The key figure of merit: bits per second / Hertz, i.e., bps/Hz.
29.3.1 PSK, PAM and QAM

In these three modulation methods, the bandwidth is largely determined by the pulse shape. For root raised cosine filtering, the null-to-null bandwidth is 1 + α times the bandwidth of the case when we use pure sinc pulses. The transmission bandwidth (for a bandpass signal) is

    B_T = (1 + α)/T_s

Since T_s is seconds per symbol, we divide by log_2 M bits per symbol to get T_b = T_s / log_2 M seconds per bit, or

    B_T = (1 + α) R_b / log_2 M

where R_b = 1/T_b is the bit rate. Bandwidth efficiency is then

    η = R_b / B_T = log_2 M / (1 + α)

See Table 2 for some numerical examples.
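Table 2's entries follow directly from this formula. A small check, with our own variable names:

```python
import math

def eta(M, alpha):
    # Bandwidth efficiency for PSK/QAM with raised cosine filtering:
    # eta = log2(M) / (1 + alpha)
    return math.log2(M) / (1 + alpha)

for name, M in [("BPSK", 2), ("QPSK", 4), ("16-QAM", 16), ("64-QAM", 64)]:
    print(name, [round(eta(M, a), 2) for a in (0.0, 0.5, 1.0)])
# BPSK [1.0, 0.67, 0.5], QPSK [2.0, 1.33, 1.0], 16-QAM [4.0, 2.67, 2.0], ...
```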
29.3.2 FSK

We've said that the bandwidth of FSK is

    B_T = (M − 1)∆f + 2B

where B is the one-sided bandwidth of the digital baseband signal. For the null-to-null bandwidth of raised cosine pulse shaping, 2B = (1 + α)/T_s. So,

    B_T = (M − 1)∆f + (1 + α)/T_s = (R_b / log_2 M) {(M − 1)∆f T_s + (1 + α)}

since R_b = log_2 M / T_s. Then

    η = R_b / B_T = log_2 M / [(M − 1)∆f T_s + (1 + α)]

If ∆f = 1/T_s (required for noncoherent reception),

    η = log_2 M / (M + α)
29.4 Bandwidth Efficiency vs. E_b/N_0

For each modulation format, we have quantities of interest:

• Bandwidth efficiency, and
• Energy per bit (E_b/N_0) requirements to achieve a given probability of error.

Example: Bandwidth efficiency vs. E_b/N_0 for M = 8 PSK
What is the required E_b/N_0 for 8-PSK to achieve a probability of bit error of 10^−6? What is the bandwidth efficiency of 8-PSK when using 50% excess bandwidth?

Solution: Given in Rice Figure 6.3.5 to be about 14 dB, and 2 bps/Hz, respectively.

We can plot these (required E_b/N_0, bandwidth efficiency) pairs. See Rice Figure 6.3.6.
29.5 Fidelity (P[error]) vs. E_b/N_0

Main modulations for which we have evaluated probability of error vs. E_b/N_0:

1. M-ary PAM, including binary PAM or BPSK, OOK, DPSK.
2. M-ary PSK, including QPSK.
3. Square QAM
4. Non-square QAM constellations
5. FSK, M-ary FSK

In this part of the lecture we will break up into groups and derive: (1) the probability of error and (2) probability of symbol error formulas for these types of modulations. See also Rice pages 325-331.
• BPSK: P[symbol error] = Q(√(2 E_b/N_0)); P[error] is the same.
• OOK: P[symbol error] = Q(√(E_b/N_0)); P[error] is the same.
• DPSK: P[symbol error] = (1/2) exp(−E_b/N_0); P[error] is the same.
• M-PAM: P[symbol error] = (2(M−1)/M) Q(√((6 log_2 M / (M² − 1)) E_b/N_0)); P[error] ≈ (1/log_2 M) P[symbol error].
• QPSK: P[error] = Q(√(2 E_b/N_0)).
• M-PSK: P[symbol error] ≤ 2 Q(√(2 (log_2 M) sin²(π/M) E_b/N_0)); P[error] ≈ (1/log_2 M) P[symbol error].
• Square M-QAM: P[error] ≈ (4/log_2 M)((√M − 1)/√M) Q(√((3 log_2 M/(M − 1)) E_b/N_0)).
• 2-ary noncoherent FSK: P[symbol error] = (1/2) exp(−E_b/(2N_0)); P[error] is the same.
• M-ary noncoherent FSK: P[symbol error] = Σ_{n=1}^{M−1} ((M−1) choose n) ((−1)^{n+1}/(n+1)) exp(−(n log_2 M/(n+1)) E_b/N_0); P[error] = ((M/2)/(M−1)) P[symbol error].
• 2-ary coherent FSK: P[symbol error] = Q(√(E_b/N_0)); P[error] is the same.
• M-ary coherent FSK: P[symbol error] ≤ (M − 1) Q(√((log_2 M) E_b/N_0)); P[error] = ((M/2)/(M−1)) P[symbol error].

Table 3: Summary of probability of bit and symbol error formulas for several modulations.
29.6 Transmitter Complexity

What makes a transmitter more complicated?

• Need for a linear power amplifier
• Higher clock speed
• Higher transmit powers
• Directional antennas

29.6.1 Linear / Nonlinear Amplifiers

An amplifier uses DC power to take an input signal and increase its amplitude at the output. If P_DC is the input power (e.g., from a battery) to the amplifier, and P_out is the output signal power, the power efficiency is rated as η_P = P_out / P_DC.

Amplifiers are separated into 'classes' which describe their configuration (circuits), and as a result of their configuration, their linearity and power efficiency.

• Class A: linear amplifiers with maximum power efficiency of 50%. The output signal is a scaled-up version of the input signal. Power is dissipated at all times.
• Class B: linear amplifiers that turn on for half of a cycle (conduction angle of 180°) with maximum power efficiency of 78.5%. Two in parallel are used to amplify a sinusoidal signal, one for the positive part and one for the negative part.
• Class AB: Like class B, but each amplifier stays on slightly longer to reduce the "dead zone" at zero. That is, the conduction angle is slightly higher than 180°.
• Class C: A class C amplifier is closer to a switch than an amplifier. This generates high distortion, but the output is then bandpass filtered or tuned to the center frequency to force out spurious frequencies. Class C nonlinear amplifiers have power efficiencies around 90%. They can only amplify signals with a nearly constant envelope.

In order to double the power efficiency, battery-powered transmitters are often willing to use Class C amplifiers. They can do this if their output signal has a constant envelope. This means that, if you look at the constellation diagram, you'll see a circle. The signal must never go through the origin (envelope of zero) or even near the origin.
29.7 Offset QPSK

Figure 51: Constellation diagram (I and Q channels) for (a) QPSK without offset and (b) offset QPSK.
For QPSK, we wrote the modulated signal as

    s(t) = √2 p(t) cos(ω_0 t + ∠a(t))

where ∠a(t) is the angle of the symbol chosen to be sent at time t. It is in a discrete set,

    ∠a(t) ∈ {π/4, 3π/4, 5π/4, 7π/4}

and p(t) is the pulse shape. We could have also written s(t) as

    s(t) = √2 p(t) [a_0(t) cos(ω_0 t) + a_1(t) sin(ω_0 t)]

The problem is that when the phase changes 180 degrees, the signal s(t) will go through zero, which precludes use of a class C amplifier. See Figure 52(b) and Figure 51(a) to see this graphically.

For offset QPSK (OQPSK), we delay the quadrature channel T_s/2 with respect to the in-phase channel. Rewriting s(t),

    s(t) = √2 p(t) a_0(t) cos(ω_0 t) + √2 p(t − T_s/2) a_1(t − T_s/2) sin(ω_0 (t − T_s/2))

At the receiver, we just need to delay the sampling on the quadrature half of a symbol period with respect to the in-phase signal. The new transmitted signal takes the same bandwidth and average power, and has the same E_b/N_0 vs. probability of bit error performance. However, the envelope |s(t)| is largely constant. See Figure 52 for a comparison of QPSK and OQPSK.
Figure 52: Matlab simulation of (a-b) QPSK and (c-d) OQPSK, showing the (d) largely constant envelope of OQPSK, compared to (b) that for QPSK.
29.8 Receiver Complexity
What makes a receiver more complicated?
• Synchronization (carrier, phase, timing)
• Multiple parallel receiver chains
Lecture 19
Today: (1) Link Budgets and System Design
30 Link Budgets and System Design

As a digital communication system designer, your mission (if you choose to accept it) is to achieve:

1. High data rate
2. High fidelity (low bit error rate)
3. Low transmit power
4. Low bandwidth
5. Low transmitter/receiver complexity

But this is truly a mission impossible, because you can't have everything at the same time. So the system design depends on what the desired system really needs and what the acceptable tradeoffs are. Typically some subset of requirements is given; for example: given the bandwidth limits, the received signal power and noise power, and the bit error rate limit, what data rate can be achieved? Using which modulation?
To answer this question, it is just a matter of setting some system variables (at the given limits) and determining what that means about the values of other system variables. See Figure 53. For example, if I were given the bandwidth limits and the C/N_0 ratio, I'd be able to determine the probability of error for several different modulation types.

In this lecture, we discuss this procedure, define each of the variables we see in Figure 53, and describe what they are related to. This lecture is about system design, and it cuts across ECE classes; in particular circuits, radio propagation, antennas, optics, and the material from this class. You are expected to apply what you have learned in other classes (or learn the functional relationships that we describe here).

Figure 53: Relationships between important system design variables (rectangular boxes). Functional relationships between variables are given as circles: if the relationship is a single operator (e.g., division), the operator is drawn in; otherwise it is left as an unnamed function f(). For example, C/N_0 is shown to have a divide-by relationship with C and N_0. The effect of the choice of modulation impacts several functional relationships, e.g., the relationship between probability of bit error and E_b/N_0, which is drawn as a dotted line.
30.1 Link Budgets Given C/N_0

The received power is denoted C; it has units of Watts. What is C/N_0? It is received power divided by noise energy. It is an odd quantity, but it summarizes what we need to know about the signal and the noise for the purposes of system design.

• We often denote the received power at the receiver as P_R; in the Rice book it is typically denoted C.

• We know the probability of bit error is typically written as a function of E_b/N_0. The noise energy is N_0. The bit energy is E_b. We can write E_b = C T_b, since energy is power times time. To separate the effect of T_b, we often denote:

    E_b/N_0 = (C/N_0) T_b = (C/N_0) / R_b

where R_b = 1/T_b is the bit rate. In other words, C/N_0 = (E_b/N_0) R_b. What are the units of C/N_0? Answer: Hz, i.e., 1/s.

• Note that people often report C/N_0 in dB Hz, which is 10 log_10 (C/N_0).

• Be careful of Bytes per sec vs. bits per sec. Commonly, CS people use Bps (kBps or MBps) when describing data rate. For example, if it takes 5 seconds to transfer a 1 MB file, then software often reports that the data rate is 1/5 = 0.2 MBps or 200 kBps. But the bit rate is 8/5 Mbps or 1.6 × 10^6 bps.

Given C/N_0, we can now relate bit error rate, modulation, bit rate, and bandwidth.
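The conversion between dB-Hz and linear units, and the relation E_b/N_0 = (C/N_0)/R_b, look like this in code (the numbers here are assumed for illustration):

```python
import math

cn0_dbhz = 82.0                 # assumed C/N0 in dB-Hz
cn0 = 10 ** (cn0_dbhz / 10)     # linear units: Hz (1/s)

rb = 1.0e6                      # assumed bit rate, bits/sec
ebn0 = cn0 / rb                 # Eb/N0 = (C/N0) / Rb, dimensionless
print(round(10 * math.log10(ebn0), 6))  # 22.0: 82 dB-Hz minus 10*log10(1e6) = 60 dB
```

In dB, division becomes subtraction, which is why the 22 dB answer can also be read off directly as 82 dB-Hz − 60 dB.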
Note: We typically use Q() and Q^{−1}() to relate BER and E_b/N_0 in each direction. While you have Matlab, this is easy to calculate. If you can program it into your calculator, great. Otherwise, it's really not a big deal to pull it off of a chart or table. For your convenience, the following table and plot of Q^{−1}(x) will appear on Exam 2. I am not picky about getting lots of correct decimal places.
TABLE OF THE Q^{−1}() FUNCTION:

    Q^{−1}(1 × 10^−6)   = 4.7534    Q^{−1}(1 × 10^−4)   = 3.719     Q^{−1}(1 × 10^−2)   = 2.3263
    Q^{−1}(1.5 × 10^−6) = 4.6708    Q^{−1}(1.5 × 10^−4) = 3.6153    Q^{−1}(1.5 × 10^−2) = 2.1701
    Q^{−1}(2 × 10^−6)   = 4.6114    Q^{−1}(2 × 10^−4)   = 3.5401    Q^{−1}(2 × 10^−2)   = 2.0537
    Q^{−1}(3 × 10^−6)   = 4.5264    Q^{−1}(3 × 10^−4)   = 3.4316    Q^{−1}(3 × 10^−2)   = 1.8808
    Q^{−1}(4 × 10^−6)   = 4.4652    Q^{−1}(4 × 10^−4)   = 3.3528    Q^{−1}(4 × 10^−2)   = 1.7507
    Q^{−1}(5 × 10^−6)   = 4.4172    Q^{−1}(5 × 10^−4)   = 3.2905    Q^{−1}(5 × 10^−2)   = 1.6449
    Q^{−1}(6 × 10^−6)   = 4.3776    Q^{−1}(6 × 10^−4)   = 3.2389    Q^{−1}(6 × 10^−2)   = 1.5548
    Q^{−1}(7 × 10^−6)   = 4.3439    Q^{−1}(7 × 10^−4)   = 3.1947    Q^{−1}(7 × 10^−2)   = 1.4758
    Q^{−1}(8 × 10^−6)   = 4.3145    Q^{−1}(8 × 10^−4)   = 3.1559    Q^{−1}(8 × 10^−2)   = 1.4051
    Q^{−1}(9 × 10^−6)   = 4.2884    Q^{−1}(9 × 10^−4)   = 3.1214    Q^{−1}(9 × 10^−2)   = 1.3408
    Q^{−1}(1 × 10^−5)   = 4.2649    Q^{−1}(1 × 10^−3)   = 3.0902    Q^{−1}(1 × 10^−1)   = 1.2816
    Q^{−1}(1.5 × 10^−5) = 4.1735    Q^{−1}(1.5 × 10^−3) = 2.9677    Q^{−1}(1.5 × 10^−1) = 1.0364
    Q^{−1}(2 × 10^−5)   = 4.1075    Q^{−1}(2 × 10^−3)   = 2.8782    Q^{−1}(2 × 10^−1)   = 0.84162
    Q^{−1}(3 × 10^−5)   = 4.0128    Q^{−1}(3 × 10^−3)   = 2.7478    Q^{−1}(3 × 10^−1)   = 0.5244
    Q^{−1}(4 × 10^−5)   = 3.9444    Q^{−1}(4 × 10^−3)   = 2.6521    Q^{−1}(4 × 10^−1)   = 0.25335
    Q^{−1}(5 × 10^−5)   = 3.8906    Q^{−1}(5 × 10^−3)   = 2.5758    Q^{−1}(5 × 10^−1)   = 0
    Q^{−1}(6 × 10^−5)   = 3.8461    Q^{−1}(6 × 10^−3)   = 2.5121
    Q^{−1}(7 × 10^−5)   = 3.8082    Q^{−1}(7 × 10^−3)   = 2.4573
    Q^{−1}(8 × 10^−5)   = 3.775     Q^{−1}(8 × 10^−3)   = 2.4089
    Q^{−1}(9 × 10^−5)   = 3.7455    Q^{−1}(9 × 10^−3)   = 2.3656
[Plot: the inverse Q function, Q^{−1}(x), versus x, for x from 10^−7 to 10^0.]
30.2 Power and Energy Limited Channels

Assume the C/N_0, the maximum bandwidth, and the maximum BER are all given. Sometimes power is the limiting factor in determining the maximum achievable bit rate. Such links (or channels) are called power limited channels. Sometimes bandwidth is the limiting factor in determining the maximum achievable bit rate. In this case, the link (or channel) is called a bandwidth limited channel. You just need to try to solve the problem and see which one limits your system.

Here is a step-by-step version of what you might need to do in this case:

Method A: Start with a power-limited assumption:

1. Use the probability of error constraint to determine the E_b/N_0 constraint, given the appropriate probability of error formula for the modulation.

2. Given the C/N_0 constraint and the E_b/N_0 constraint, find the maximum bit rate. Note that

    R_b = 1/T_b = (C/N_0) / (E_b/N_0),

but be sure to express both in linear units.

3. Given the maximum bit rate, calculate the maximum symbol rate R_s = R_b / log_2 M and then compute the required bandwidth using the appropriate bandwidth formula.

4. Compare the bandwidth at maximum R_s to the bandwidth constraint: If the BW at R_s is too high, then the system is bandwidth limited; reduce your bit rate to conform to the BW constraint. Otherwise, your system is power limited, and your R_b is achievable.
Method B: Start with a bandwidth-limited assumption:

1. Use the bandwidth constraint and the appropriate bandwidth formula to find the maximum symbol rate R_s and then the maximum bit rate R_b.

2. Find the E_b/N_0 at the given bit rate by computing E_b/N_0 = (C/N_0) / R_b. (Again, make sure that everything is in linear units.)

3. Find the probability of error at that E_b/N_0, using the appropriate probability of error formula.

4. If the computed P[error] is greater than the BER constraint, then your system is power limited. Use the previous method to find the maximum bit rate. Otherwise, your system is bandwidth limited, and you have found the correct maximum bit rate.
Example: Rice 6.33
Consider a bandpass communications link with a bandwidth of 1.5 MHz and with an available C/N_0 = 82 dB Hz. The maximum bit error rate is 10^−6.

1. If the modulation is 16-PSK using the SRRC pulse shape with α = 0.5, what is the maximum achievable bit rate on the link? Is this a power limited or bandwidth limited channel?

2. If the modulation is square 16-QAM using the SRRC pulse shape with α = 0.5, what is the maximum achievable bit rate on this link? Is this a power limited or bandwidth limited channel?

Solution:

1. Try Method A. For M = 16 PSK, we can find E_b/N_0 for the maximum BER:

    10^−6 = P[error] = (2/log_2 M) Q(√(2 (log_2 M) sin²(π/M) E_b/N_0))
    10^−6 = (2/4) Q(√(2(4) sin²(π/16) E_b/N_0))
    E_b/N_0 = (1/(8 sin²(π/16))) [Q^{−1}(2 × 10^−6)]²
    E_b/N_0 = 69.84                                                      (52)

Converting C/N_0 to linear, C/N_0 = 10^{82/10} = 1.585 × 10^8. Solving for R_b,

    R_b = (C/N_0) / (E_b/N_0) = 1.585 × 10^8 / 69.84 = 2.27 × 10^6 = 2.27 Mbits/s

and thus R_s = R_b / log_2 M = 2.27 × 10^6 / 4 = 5.67 × 10^5 symbols/s. The required bandwidth for this system is

    B_T = (1 + α) R_b / log_2 M = 1.5 (2.27 × 10^6)/4 = 851 kHz

This is clearly lower than the maximum bandwidth of 1.5 MHz. So, the system is power limited, and can operate with bit rate 2.27 Mbits/s. (If B_T had come out > 1.5 MHz, we would have needed to reduce R_b to meet the bandwidth limit.)
2. Try Method A. For M = 16 (square) QAM, we can find E_b/N_0 for the maximum BER:

    10^−6 = P[error] = (4/log_2 M)((√M − 1)/√M) Q(√((3 log_2 M/(M − 1)) E_b/N_0))
    10^−6 = (4/4)((4 − 1)/4) Q(√((3(4)/15) E_b/N_0))
    E_b/N_0 = (15/12) [Q^{−1}((4/3) × 10^−6)]²
    E_b/N_0 = 27.55                                                      (53)

Solving for R_b,

    R_b = (C/N_0) / (E_b/N_0) = 1.585 × 10^8 / 27.55 = 5.75 × 10^6 = 5.75 Mbits/s

The required bandwidth for this bit rate is:

    B_T = (1 + α) R_b / log_2 M = 1.5 (5.75 × 10^6)/4 = 2.16 MHz

This is greater than the maximum bandwidth of 1.5 MHz, so we must reduce the bit rate to

    R_b = B_T log_2 M / (1 + α) = (1.5 MHz)(4/1.5) = 4 Mbits/s

In summary, we have a bandwidth-limited system with a bit rate of 4 Mbits/s.
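Both parts of this example can be verified numerically. This sketch builds Q^{−1} from Python's `statistics.NormalDist` (Q^{−1}(p) = −Φ^{−1}(p)) and follows Method A; the variable names are ours:

```python
import math
from statistics import NormalDist

def Qinv(p):
    # Q(x) = 1 - Phi(x), so Q^{-1}(p) = -Phi^{-1}(p)
    return -NormalDist().inv_cdf(p)

cn0 = 10 ** (82 / 10)                  # C/N0 = 1.585e8 (linear)

# Part 1: 16-PSK
ebn0_psk = Qinv(2e-6) ** 2 / (8 * math.sin(math.pi / 16) ** 2)  # ~69.8
rb_psk = cn0 / ebn0_psk                # ~2.27e6 bits/sec
bt_psk = 1.5 * rb_psk / 4              # (1+alpha) Rb / log2(M): ~851 kHz
print(round(rb_psk / 1e6, 2), round(bt_psk / 1e3))  # < 1.5 MHz: power limited

# Part 2: square 16-QAM
ebn0_qam = (15 / 12) * Qinv((4 / 3) * 1e-6) ** 2                # ~27.6
rb_qam = cn0 / ebn0_qam                # ~5.75e6 bits/sec
bt_qam = 1.5 * rb_qam / 4              # ~2.16 MHz > 1.5 MHz: bandwidth limited
rb_max = 1.5e6 * 4 / 1.5               # back off to 4e6 bits/sec
print(round(rb_qam / 1e6, 2), rb_max)
```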
30.3 Computing Received Power

We need to design our system for the power which we will receive. How can we calculate how much power will be received? We can do this for both wireless and wired channels. Wireless propagation models are required; this section of the class provides two examples, but there are others.

30.3.1 Free Space

'Free space' is the idealization in which nothing exists except for the transmitter and receiver; it can really only be used for deep space communications. But, this formula serves as a starting point for other radio propagation formulas. In free space, the received power is calculated from the Friis formula,

    C = P_R = P_T G_T G_R λ² / (4πR)²

where

• G_T and G_R are the antenna gains at the transmitter and receiver, respectively.
• λ is the wavelength at the signal frequency. For narrowband signals, the wavelength is nearly constant across the bandwidth, so we just use the center frequency f_c. Note that λ = c/f_c where c = 3 × 10^8 meters per second is the speed of light.
• P_T is the transmit power.
Here, everything is in linear terms. Typically people use decibels to express these numbers, and we will write [P_R]_dBm or [G_T]_dB to denote that they are given by:

    [P_R]_dBm = 10 log_10 P_R        [G_T]_dB = 10 log_10 G_T        (54)

In Lecture 2, we showed that the Friis formula, given in dB, is

    [C]_dBm = [G_T]_dB + [G_R]_dB + [P_T]_dBm + 20 log_10 (λ/(4π)) − 20 log_10 R

This says the received power in dB, [C]_dBm, decreases linearly with the log of the distance R. The Rice book writes the Friis formula as

    [C]_dBm = [G_T]_dB + [G_R]_dB + [P_T]_dBm + [L_p]_dB

where

    [L_p]_dB = 20 log_10 (λ/(4π)) − 20 log_10 R

and [L_p]_dB is called the "channel loss". (This name that Rice chose is unfortunate, because if it was positive it would be a "gain", but it is typically very negative. My opinion is, it should be called "channel gain" and have a negative gain value.)

There are also typically other losses in a transmitter / receiver system; losses in a cable, other imperfections, etc. Rice lumps these in as the term L and writes:

    [C]_dBm = [G_T]_dB + [G_R]_dB + [P_T]_dBm + [L_p]_dB + [L]_dB
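A hedged sketch of the dB-form Friis computation; the function name and example numbers are ours, not from the text:

```python
import math

def friis_received_dbm(pt_dbm, gt_db, gr_db, fc_hz, r_m):
    # [C]_dBm = [P_T]_dBm + [G_T]_dB + [G_R]_dB
    #           + 20 log10(lambda / (4 pi)) - 20 log10(R)
    lam = 3e8 / fc_hz  # wavelength at the center frequency
    lp_db = 20 * math.log10(lam / (4 * math.pi)) - 20 * math.log10(r_m)
    return pt_dbm + gt_db + gr_db + lp_db

# Example (assumed numbers): 30 dBm (1 W), 0 dB antennas, 2.4 GHz, 100 m
print(round(friis_received_dbm(30, 0, 0, 2.4e9, 100), 1))
```

A useful sanity check on the form of the formula: doubling the distance always costs exactly 20 log_10(2) ≈ 6.02 dB more loss, regardless of frequency or antenna gains.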
30.3.2 Non-free-space Channels

We don't study radio propagation path loss formulas in this course. But, a very short summary is that radio propagation on Earth is different than the Friis formula suggests. Lots of other formulas exist that approximate the received power as a function of distance, antenna heights, type of environment, etc. For example, whenever path loss is linear with the log of distance,

    [L_p]_dB = L_0 − 10 n log_10 R

for some constants n and L_0. Effectively, because of shadowing caused by buildings, trees, etc., the average loss may increase more quickly than 1/R², instead behaving more like 1/R^n.
30.3.3 Wired Channels

Typically wired channels are lossy as well, but the loss is modeled as linear in the length of the cable. For example,

    [C]_dBm = [P_T]_dBm − R [L_1m]_dB

where P_T is the transmit power, R is the cable length in meters, and L_1m is the loss per meter.
30.4 Computing Noise Energy

The noise energy N_0 can be calculated as:

    N_0 = k T_eq

where k = 1.38 × 10^−23 J/K is Boltzmann's constant and T_eq is the equivalent noise temperature in Kelvin. This is a topic covered in another course, so you are not responsible for that material here. But in short, the equivalent temperature is a function of the receiver design. T_eq is always going to be higher than the temperature of your receiver. Basically, all receiver circuits add noise to the received signal. With proper design, T_eq can be kept low.
30.5 Examples

Example: Rice 6.36
Consider a "point-to-point" microwave link. (Such links were the key component in the telephone company's long distance network before fiber optic cables were installed.) Both antenna gains are 20 dB and the transmit antenna power is 10 W. The modulation is 51.84 Mbits/sec 256 square QAM with a carrier frequency of 4 GHz. Atmospheric losses are 2 dB and other incidental losses are 2 dB. A pigeon in the line-of-sight path causes an additional 2 dB loss. The receiver has an equivalent noise temperature of 400 K and an implementation loss of 1 dB. How far away can the two towers be if the bit error rate is not to exceed 10^−8? Include the pigeon.

Neal's hint: Use the dB version of the Friis formula and subtract the mentioned dB losses: atmospheric losses, incidental losses, implementation loss, and the pigeon.

Solution: Starting with the modulation, M = 256 square QAM (log_2 M = 8, √M = 16), to achieve P[error] = 10^−8,

    10^−8 = (4/log_2 M)((√M − 1)/√M) Q(√((3 log_2 M/(M − 1)) E_b/N_0))
    10^−8 = (4/8)(15/16) Q(√((3(8)/255) E_b/N_0))
    10^−8 = (15/32) Q(√((24/255) E_b/N_0))
    E_b/N_0 = (255/24) [Q^{−1}((32/15) × 10^−8)]² = 319.0

The noise energy is N_0 = k T_eq = 1.38 × 10^−23 (J/K) × 400 (K) = 5.52 × 10^−21 J. So E_b = 319.0 × 5.52 × 10^−21 = 1.76 × 10^−18 J. Since E_b = C/R_b and the bit rate is R_b = 51.84 × 10^6 bits/sec, C = (51.84 × 10^6 1/s)(1.76 × 10^−18 J) = 9.13 × 10^−11 W, or −100.4 dBW.

Switching to finding an expression for C, the wavelength is λ = (3 × 10^8 m/s)/(4 × 10^9 1/s) = 0.075 m, so:

    [C]_dBW = [G_T]_dB + [G_R]_dB + [P_T]_dBW + 20 log_10 (λ/(4π)) − 20 log_10 R − 2 dB − 2 dB − 2 dB − 1 dB
            = 20 dB + 20 dB + 10 dBW + 20 log_10 (0.075 m/(4π)) − 20 log_10 R − 7 dB
            = −1.48 dBW − 20 log_10 R                                     (55)

Setting [C]_dBW = −100.4 dBW = −1.48 dBW − 20 log_10 R and solving for R, we find R = 88.3 km. Thus the microwave towers should be placed at most 88.3 km (about 55 miles) apart.
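The chain of computations in this example can be verified numerically; this sketch builds Q^{−1} from Python's `statistics.NormalDist`, and the variable names are ours:

```python
import math
from statistics import NormalDist

def Qinv(p):
    # Q^{-1}(p) = -Phi^{-1}(p)
    return -NormalDist().inv_cdf(p)

# Required Eb/N0 for 256 square QAM at P[error] = 1e-8
ebn0 = (255 / 24) * Qinv((32 / 15) * 1e-8) ** 2        # ~319

n0 = 1.38e-23 * 400                                     # N0 = k * Teq, Joules
c_watts = 51.84e6 * ebn0 * n0                           # C = Rb * Eb
c_dbw = 10 * math.log10(c_watts)                        # ~ -100.4 dBW

lam = 3e8 / 4e9                                         # 0.075 m
# [C]_dBW = 20 + 20 + 10 + 20 log10(lam/4pi) - 20 log10(R) - 7 dB of losses
intercept = 50 + 20 * math.log10(lam / (4 * math.pi)) - 7   # ~ -1.48 dBW
r_m = 10 ** ((intercept - c_dbw) / 20)                  # solve for R
print(round(r_m / 1e3, 1))  # roughly 88 km
```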
Lecture 20
Today: (1) Timing Synchronization Intro, (2) Interpolation Filters
31 Timing Synchronization

At the receiver, a transmitted signal arrives with an unknown delay τ. The received complex baseband signal (for QAM, PAM, or QPSK) can be written (assuming carrier synchronization) as

    r(t) = Σ_m Σ_k a_m(k) φ_m(t − kT_s − τ)                              (56)

where a_m(k) is the kth symbol sent on the mth basis function.

As a start, let's compare receiver implementations for a (mostly) continuous-time and a (more) discrete-time receiver. Figure 54 has a timing synchronization loop which controls the sampler (the ADC).

Figure 54: Block diagram for a continuous-time receiver, including analog timing synchronization (from Rice book, Figure 8.3.1).

The input signal is downconverted and then run through matched filters, which correlate the signal with φ_n(t − t_k) for each n, and for some delay t_k. For the correlation with n,

    x_n(k) = <r(t), φ_n(t − t_k)>
           = Σ_m Σ_k a_m(k) <φ_n(t − t_k), φ_m(t − kT_s − τ)>            (57)

Note that if t_k = kT_s + τ, then the correlation <φ_n(t − t_k), φ_n(t − kT_s − τ)> is highest and closest to 1. This t_k is the correct timing delay at each correlator for the kth symbol. But, these are generally unknown to the receiver until timing synchronization is complete.

Figure 55 shows a receiver with an ADC immediately after the downconverter. Here, note the ADC has nothing controlling it. Instead, after the matched filter, an interpolator corrects the sampling time problems using discrete-time processing. This interpolation is the subject of this section.
Figure 55: Block diagram for a digital receiver for QAM/PSK, including discrete-time timing synchronization.
The industry trend is more and more toward digital implementations. A 'software radio' follows the idea that as much of the radio as possible is done digitally, after the signal has been sampled. The idea is to "bring the ADC to the antenna" for purposes of reconfigurability, and reducing part counts and costs.
Another implementation is like Figure 55, but instead of using the interpolator, the timing synch control block is fed back to the ADC. But again, this requires a DAC and feedback to the analog part of the receiver, which is not preferred. Also, because of the processing delay, this mixed digital and analog feedback loop can be problematic.
First, we’ll talk about interpolation, and then, we’ll consider the control loop.
32 Interpolation
In this class, we will emphasize digital timing synchronization using an interpolation ﬁlter. For example, consider
Figure 56. In this ﬁgure, a BPSK receiver samples the matched ﬁlter output at a rate of twice per symbol,
unsynchronized with the symbol clock, resulting in samples r(nT).
Figure 56: Samples of the matched filter output (BPSK, RRC with α = 1) taken at twice the correct symbol rate (vertical lines), but with a timing error. If downsampling (by 2) results in the symbol samples r_k given by red squares, then sampling sometimes reduces the magnitude of the desired signal.
Some example sampling clocks, compared to the actual symbol clock, are shown in Figure 57. These are shown in increasing severity of the correction required of the receiver. When we say 'synchronized in rate', we mean within an integer multiple, since the sampling clock must operate at (at least) double the symbol rate.
Figure 57: (1) Sampling clock and (2-4) possible actual symbol clocks. The symbol clock may be (2) synchronized in rate and phase, (3) synchronized in rate but not in phase, or (4) synchronized neither in rate nor phase, with the sample clock.

In general, our receivers always deal with type (4) sampling clock error as drawn in Figure 57. That is, the sampling clock has neither the same exact rate nor the same phase as the actual symbol clock.
Def 'n: Incommensurate
Two clocks with periods T and T_s are incommensurate if the ratio T/T_s is irrational. In contrast, two clocks are commensurate if the ratio T/T_s can be written as n/m where n, m are integers.

For example, if T/T_s = 1/2, the two clocks are commensurate and we sample exactly twice per symbol period. As another example, if T/T_s = 2/5, we sample exactly 2.5 times per symbol, and every 5 samples the delay until the next correct symbol sample will repeat. Since clocks are generally incommensurate, we cannot count on them ever repeating.
The situation shown in Figure 56 is case (3), where T/T_s = 1/2 (the clocks are commensurate), but the sampling clock does not have the correct phase (τ is not equal to an integer multiple of T).
32.1 Sampling Time Notation

In general, for both cases (3) and (4) in Figure 57, the correct sampling times should be kT_s + τ, but no samples were taken at those instants. Instead, kT_s + τ is always μ(k)T after the most recent sampling instant, where μ(k) is called the fractional interval. We can write that

kT_s + τ = [m(k) + μ(k)]T                                                 (58)

where m(k) is an integer, the highest integer such that m(k)T ≤ kT_s + τ, and 0 ≤ μ(k) < 1. In other words,

m(k) = ⌊(kT_s + τ)/T⌋

where ⌊x⌋ is the greatest integer less than or equal to x (the Matlab floor function). This means that μ(k) is given by

μ(k) = (kT_s + τ)/T − m(k)
Example: Calculation Example
Let T_s/T = 3.1 and τ/T = 1.8. Calculate (m(k), μ(k)) for k = 1, 2, 3.

Solution:

m(1) = ⌊3.1 + 1.8⌋ = 4;        μ(1) = 0.9
m(2) = ⌊2(3.1) + 1.8⌋ = 8;     μ(2) = 0
m(3) = ⌊3(3.1) + 1.8⌋ = 11;    μ(3) = 0.1

Thus your interpolation will be done: in between samples 4 and 5; at sample 8; and in between samples 11 and 12.
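The arithmetic above is easy to check in code. Below is a minimal sketch in Python (the course materials use Matlab); the function name is mine, not from the notes.

```python
import math

def timing_indices(Ts_over_T, tau_over_T, k):
    """Return (m(k), mu(k)): m(k) = floor((k*Ts + tau)/T) and
    mu(k) = (k*Ts + tau)/T - m(k), with all times normalized by T."""
    x = k * Ts_over_T + tau_over_T
    m = math.floor(x)
    return m, x - m

# Values from the example: Ts/T = 3.1, tau/T = 1.8
for k in (1, 2, 3):
    m, mu = timing_indices(3.1, 1.8, k)
    print(k, m, round(mu, 6))   # (1, 4, 0.9), (2, 8, 0.0), (3, 11, 0.1)
```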
32.2 Seeing Interpolation as Filtering

Consider the output of the matched filter, r(t), as given in (57). The analog output of the matched filter can be represented as a function of its samples r(nT),

r(t) = Σ_n r(nT) h_I(t − nT)                                              (59)

where

h_I(t) = sin(πt/T) / (πt/T).

Why is this so? What are the conditions necessary for this representation to be accurate?

If we wanted the signal at the correct sampling times, we could have it – we just need to calculate r(t) at another set of times (not nT).

Call the correct symbol sampling times kT_s + τ for integer k, where T_s is the actual symbol period used by the transmitter. Plugging these times in for t in (59), we have that

r(kT_s + τ) = Σ_n r(nT) h_I(kT_s + τ − nT)

Now, using the (m(k), μ(k)) notation, since kT_s + τ = [m(k) + μ(k)]T,

r([m(k) + μ(k)]T) = Σ_n r(nT) h_I([m(k) − n + μ(k)]T).

Re-indexing with i = m(k) − n,

r([m(k) + μ(k)]T) = Σ_i r([m(k) − i]T) h_I([i + μ(k)]T).                  (60)
This is a ﬁlter on samples of r(), where the ﬁlter coeﬃcients are dependent on µ(k).
Note: Good illustrations are given in M. Rice, Figure 8.4.12, and Figure 8.4.13.
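The sum in (60) can be sanity-checked numerically. The sketch below (Python rather than the course's Matlab) reconstructs an off-grid sample of a bandlimited signal from its samples using a truncated sinc kernel; the test signal, window half-width, and function names are my own illustrative choices.

```python
import math

def h_sinc(t):
    """Ideal interpolation kernel h_I(t) = sin(pi t/T)/(pi t/T), with T = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def interpolate(samples, m, mu, halfwidth=40):
    """Approximate r([m + mu]T) from samples r(nT) by truncating the
    infinite sum in (60) to |i| <= halfwidth."""
    total = 0.0
    for i in range(-halfwidth, halfwidth + 1):
        n = m - i
        if 0 <= n < len(samples):
            total += samples[n] * h_sinc(i + mu)
    return total

# A bandlimited test signal, sampled at T = 1 well above its Nyquist rate
f0 = 0.1
r = [math.cos(2 * math.pi * f0 * n) for n in range(200)]

# Interpolate at t = 100.3, i.e., m(k) = 100 and mu(k) = 0.3
est = interpolate(r, 100, 0.3)
true_val = math.cos(2 * math.pi * f0 * 100.3)
print(abs(est - true_val))  # small; shrinks as halfwidth grows
```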
32.3 Approximate Interpolation Filters

Clearly, (60) is a filter. The desired sample at [m(k) + μ(k)]T is calculated by adding the weighted contribution from the signal at each sampling time. The problem is that in general, this requires an infinite sum over i from −∞ to ∞, because the sinc function has infinite support.

Instead, we use polynomial approximations for h_I(t):

• The easiest one we're all familiar with is linear interpolation (a first-order polynomial), in which we draw a straight line between the two nearest sampled values to approximate the values of the continuous signal between the samples. This isn't so great of an approximation.

• A second-order polynomial (i.e., a parabola) is actually a very good approximation. Given three points, one can determine a parabola that fits those three points exactly.

• However, the three-point fit does not result in a linear-phase filter. (To see this, note in the time domain that two samples are on one side of the interpolation point, and one on the other. This is temporal asymmetry.) Instead, we can use four points to fit a second-order polynomial, and get a linear-phase filter.

• Finally, we could use a cubic interpolation filter. Four points determine a 3rd-order polynomial, and result in a linear-phase filter.
To see results for diﬀerent order polynomial ﬁlters, see M. Rice Figure 8.24.
32.4 Implementations

Note: These polynomial filters are called Farrow filters, named after Cecil W. Farrow, of AT&T Bell Labs, who holds the US patent (1989) for the "Continuously variable digital delay circuit". These Farrow filters started to be used in the 90's and are now very common due to the dominance of digital processing in receivers.

From (60), we can see that the filter coefficients are a function of μ(k), the fractional interval. Thus we could rewrite (60) as

r([m(k) + μ(k)]T) = Σ_i r([m(k) − i]T) h_I(i; μ(k)).                      (61)

That is, the filter is h_I(i) but its values are a function of μ(k). The filter coefficients are a polynomial function of μ(k); that is, they are a weighted sum of μ(k)^0, μ(k)^1, μ(k)^2, . . . , μ(k)^p for a pth-order polynomial filter.
Example: First-order polynomial interpolation
For example, consider the linear interpolator.

r([m(k) + μ(k)]T) = Σ_{i=−1}^{0} r([m(k) − i]T) h_I(i; μ(k))

What are the filter elements h_I for a linear interpolation filter?

Solution:

r([m(k) + μ(k)]T) = μ(k) r([m(k) + 1]T) + [1 − μ(k)] r(m(k)T)

Here we have used h_I(−1; μ(k)) = μ(k) and h_I(0; μ(k)) = 1 − μ(k).

Essentially, given μ(k), we form a weighted average of the two nearest samples. As μ(k) → 1, we should take the r([m(k) + 1]T) sample exactly. As μ(k) → 0, we should take the r(m(k)T) sample exactly.
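The two-tap filter derived in this example can be written directly. A minimal Python sketch (the function name is mine):

```python
def linear_interp(r, m, mu):
    """First-order (linear) interpolation: a weighted average of the two
    nearest samples, using h_I(-1; mu) = mu and h_I(0; mu) = 1 - mu."""
    return mu * r[m + 1] + (1.0 - mu) * r[m]

r = [0.0, 1.0, 4.0, 9.0]        # samples of t^2 at t = 0, 1, 2, 3
print(linear_interp(r, 1, 0.5))  # 2.5: midpoint of the chord from (1,1) to (2,4)
print(linear_interp(r, 1, 0.0))  # 1.0: mu -> 0 returns r[m] exactly
print(linear_interp(r, 1, 1.0))  # 4.0: mu -> 1 returns r[m+1] exactly
```

Note that the true value of t^2 at t = 1.5 is 2.25, so the straight-line estimate 2.5 shows the limited accuracy of first-order interpolation.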
32.4.1 Higher-order polynomial interpolation filters

In general,

h_I(i; μ(k)) = Σ_{l=0}^{p} b_l(i) μ(k)^l

A full table of b_l(i) is given in Table 8.1 of the M. Rice handout. Note that the i indices seem backwards.
For the 2nd order Farrow ﬁlter, there is an extra degree of freedom – you can select parameter α to be in
the range 0 < α < 1. It has been shown by simulation that α = 0.43 is best, but people tend to use α = 0.5
because it is only slightly worse, and division by two is extremely easy in digital ﬁlters.
Example: 2nd-order Farrow filter
What is the Farrow filter for α = 0.5 which interpolates exactly halfway between sample points?

Solution: From the problem statement, μ = 0.5. Since μ^2 = 0.25, μ = 0.5, and μ^0 = 1, we can calculate that

h_I(−2; 0.5) = αμ^2 − αμ = 0.125 − 0.25 = −0.125
h_I(−1; 0.5) = −αμ^2 + (1 + α)μ = −0.125 + 0.75 = 0.625
h_I(0; 0.5)  = −αμ^2 + (α − 1)μ + 1 = −0.125 − 0.25 + 1 = 0.625
h_I(1; 0.5)  = αμ^2 − αμ = 0.125 − 0.25 = −0.125                          (62)

Does this make sense? Do the weights add up to 1? Does it make sense to subtract a fraction of the two more distant samples?
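The tap expressions used in the example above can be evaluated for any μ. A Python sketch (the function name is mine; the expressions are the ones in the worked example):

```python
def farrow2_taps(mu, alpha=0.5):
    """Tap values h_I(i; mu), i = -2..1, for the 2nd-order Farrow
    interpolator with free parameter alpha, per the worked example."""
    return {
        -2: alpha * mu**2 - alpha * mu,
        -1: -alpha * mu**2 + (1 + alpha) * mu,
         0: -alpha * mu**2 + (alpha - 1) * mu + 1,
         1: alpha * mu**2 - alpha * mu,
    }

taps = farrow2_taps(0.5)
print(taps)                # {-2: -0.125, -1: 0.625, 0: 0.625, 1: -0.125}
print(sum(taps.values()))  # 1.0 -- the weights do add up to one
```

The small negative outer taps mimic the negative side lobes of the sinc kernel, which is why subtracting a fraction of the two more distant samples is reasonable.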
Example: Matlab implementation of Farrow Filter
My implementation is called ece5520 lec20.m and is posted on WebCT. Be careful, as my implementation
uses a loop, rather than vector processing.
Lecture 21
Today: (1) Timing Synchronization (2) PLLs
33 Final Project Overview
Figure 58: Flow chart of ﬁnal project; timing synchronization for QPSK.
For your final project, you will implement the above receiver. It adds these blocks to your QPSK implementation:
1. Interpolation Filter
2. Timing Error Detector (TED)
3. Loop Filter
4. Voltage Controlled Clock (VCC)
5. Preamble Detector
These notes address the motivation and overall design of each block.
Compared to last year's class, you have the advantage of having access to the Rice book, Section 8.4.4, which contains much of the code for a BPSK implementation. When you put these together and implement them for QPSK, you will need to understand how and why they work, in order to fix any bugs.
33.1 Review of Interpolation Filters

Timing synchronization is necessary to know when to sample the matched filter output. We want to sample at times [n + μ]T_s, where n is the integer part and μ is the fractional offset. Often, we leave out the T_s and simply talk about the index n or fractional delay μ.
Implementations may be continuous time, discrete time, or a mix. We focus on the discrete-time solutions.

• Problem: After the matched filter, the samples may be at incorrect times, and in modern discrete-time implementations, there may be no analog feedback to the ADC.

• Solution: From samples taken at or above the Nyquist rate, you can interpolate between samples to find the desired sample.

However this solution leads to new problems:

• Problem: True interpolation requires significant calculation – the sinc filter has an infinite impulse response.

• Solution: Approximate the sinc function with a 2nd- or 3rd-order polynomial interpolation filter; it works nearly as well.

• Problem: How do you know what the correct symbol sampling time should be?

• Solution: Use a timing locked loop, analogous to a phase-locked loop.
33.2 Timing Error Detection

We consider timing error detection blocks that operate after the matched filter and interpolator in the receiver, whose output is denoted x_I(n). The job of the timing error detector is to produce a voltage error signal e(n) which is proportional to the difference between the current sampling offset (μ̂) and the correct sampling offset.

There are several possible timing error detection (TED) methods, as related in Rice 8.4.1. This is a good source for detail on many possible methods. We will talk through two common discrete-time implementations:

1. Early-late timing error detector (ELTED)

2. Zero-crossing timing error detector (ZCTED)
We'll use a continuous-time realization of a BPSK received matched filter output, x(t), to illustrate the operation of timing error detectors. You can imagine that the interpolator output x_I(n) is a sampled version of x(t), hopefully with some samples taken at exactly the correct symbol sampling times.

Figure 59(a) shows an example eye diagram of the signal x(t) and Figure 59(b) shows its derivative ẋ(t). In both, time 0 corresponds to a correct symbol sampling time.
Figure 59: A sample eye diagram of an RRC-shaped BPSK received signal (post matched filter), x(t). Here (a) shows the signal x(t) and (b) shows its derivative ẋ(t).
33.3 Early-late timing error detector (ELTED)

The early-late TED is a discrete-time implementation of the continuous-time "early-late gate" timing error detector. In general, the error e(n) is determined by the slope and the sign of the sample x(n), at time n. In particular, consider for BPSK the value of

ẋ(t) sgn{x(t)}

Since the derivative ẋ(t) is close to zero at the correct sampling time, and increases away from the correct sampling time, we can use it to indicate how far from the sampling time we might be.

Figure 59 shows that the derivative is a good indication of timing error when the bit is changing from −1, to 1, to −1. When the bit is constant (e.g., 1, to 1, to 1) the slope is close to zero. When the bit sequence goes from 1, to 1, to −1, the slope will be somewhat negative even when the sample is taken at the right time, and when the bit is changing from −1, to 1, to 1, the slope will be somewhat positive even when the sample is taken at the right time. These are imperfections in the ELTED which (hopefully) average out over many bits. In particular, we can send alternating bits during the synchronization period in order to help the ELTED more quickly converge.

The signum function sgn{x(t)} just corrects for the sign:

• A positive slope would mean we're behind for a +1 symbol, but would mean that we're ahead for a −1 symbol.

• In the opposite manner, a negative slope would mean we're ahead for a +1 symbol, but would mean that we're behind for a −1 symbol.

For a discrete-time system, you get only an approximation of the slope unless the sampling rate SPS (or N, as it is called in the Rice book) is high. We'd estimate ẋ(t) from the samples out of the interpolator and see that

e(n − 1) = {x_I(n) − x_I(n − 2)} sgn{x_I(n − 1)}

Here, we don't divide by the time duration 2T_s because it is just a scale factor, and we just need something proportional to the timing error. This factor contributes to the gain K_p of this part in the loop. This timing error detector has a theoretical gain K_p which is shown in Figure 8.4.7 of the Rice book.
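The discrete-time ELTED expression can be sketched directly. Below, x_I is assumed to be a Python list indexed by sample number (the function name and test values are mine, for illustration only):

```python
def elted_error(xI, n):
    """Early-late timing error: e(n-1) = [x_I(n) - x_I(n-2)] * sgn(x_I(n-1))."""
    sgn = 1.0 if xI[n - 1] >= 0 else -1.0
    return (xI[n] - xI[n - 2]) * sgn

# Three samples straddling a +1 symbol peak. If the peak has already
# passed, the earlier sample exceeds the later one and the error is negative.
late = [0.9, 1.0, 0.7]
print(round(elted_error(late, 2), 6))   # -0.2

# Exactly on the peak: the central difference is zero.
on_time = [0.8, 1.0, 0.8]
print(elted_error(on_time, 2))          # 0.0
```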
33.4 Zero-crossing timing error detector (ZCTED)

The zero-crossing timing error detector (ZCTED) is also described in detail in Section 8.4.1 of the Rice book. In particular, see Figure 8.4.8 of the Rice book. This error detector assumes that the sampling rate is SPS = 2. It is described here for BPSK systems.

The error is,

e(k) = x((k − 1/2)T_s + τ̂) [â(k − 1) − â(k)]                             (63)

where the â(k) term is the estimate of the kth symbol (either +1 or −1),

â(k − 1) = sgn{x((k − 1)T_s + τ̂)},                                       (64)
â(k) = sgn{x(kT_s + τ̂)}.                                                 (65)

The error detector is only nonzero at symbol sampling times. That is, if the symbol strobe is not activated at sample n, then the error e(n) = 0.

In terms of n, because every second sample is a symbol sampling time, (63) can be rewritten as

e(n) = x_I(n − 1) [sgn{x_I(n − 2)} − sgn{x_I(n)}]                         (66)

This operation is drawn in Figure 60.

Basically, if the sign changes between n − 2 and n, that indicates a symbol transition. If there was a symbol transition, the intermediate sample n − 1 should be approximately zero, if it was taken at the correct sampling time.

The theoretical gain of the ZCTED is twice that of the ELTED. The factor of two comes from the difference of signs in (66), which will be ±2 when the sign changes. In the project, you will need to look up K_p in Figure 8.17 of Rice, which is a function of the excess bandwidth of the RRC filter used, and then multiply it by 2 to find the gain K_p of your ZCTED.
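Equation (66) and the strobe gating can be sketched as follows (Python rather than Matlab; the function name and sample values are my own illustration):

```python
def zcted_error(xI, n, strobe):
    """Zero-crossing TED per (66): evaluated only when the symbol strobe
    fires at sample n; otherwise the error is zero."""
    if not strobe:
        return 0.0
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    return xI[n - 1] * (sgn(xI[n - 2]) - sgn(xI[n]))

# Two samples per symbol, a +1 -> -1 transition, sampled slightly late:
# the mid-transition sample x_I(n-1) is not quite zero.
xI = [1.0, -0.2, -1.0]
print(zcted_error(xI, 2, strobe=True))    # -0.2 * (1 - (-1)) = -0.4
print(zcted_error(xI, 2, strobe=False))   # 0.0 (no strobe, no update)
```

Note the factor of 2 from the sign difference, which is where the doubled theoretical gain mentioned above comes from.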
Figure 60: Block diagram of the zero-crossing timing error detector (ZCTED), where sgn indicates the signum function, +1 when positive and −1 when negative. The input strobe signal activates the sampler when it is the correct symbol sampling time. When this happens, and the symbol has switched, the output is ±2x_I(n − 1), which would be approximately zero if the sample clock is synchronized.
33.4.1 QPSK Timing Error Detection

When using QPSK, both the in-phase and quadrature signals are used to calculate the error term. The error is simply the sum of the two errors calculated for each signal. See (8.100) in the Rice book for the exact expression.
33.5 Voltage Controlled Clock (VCC)
We’ll use a decrementing VCC in the project. This is shown in Figure 61. (The choice of incrementing or
decrementing is arbitrary.) An example trace of the numerically controlled oscillator (NCO) which is the heart
of the VCC is shown in Figure 62.
Figure 61: Block diagram of decrementing VCC with control voltage v(n).
Figure 62: Numerically-controlled oscillator signal NCO(n).
The NCO starts at 1. At each sample n, the NCO decrements by 1/SPS + v(n). Once per symbol (on average), the NCO voltage drops below zero. When this happens, the strobe is set to high, meaning that a sample must be taken.

When the symbol strobe is activated, the fractional interval is also recalculated. It is,

μ(n) = NCO(n − 1) / W(n)                                                  (67)

When the strobe is not activated, μ(n) is kept the same, i.e., μ(n) = μ(n − 1). You will prove that (67) is the correct form for μ(n) in the HW 9 assignment.
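The decrementing NCO can be sketched as below. This is one plausible reading of the block diagram in Python (the course code is Matlab); details such as the exact underflow comparison may differ from the Rice implementation, and the names are mine.

```python
def run_vcc(num_samples, sps=2, v=0.0):
    """Decrementing NCO: start at 1, subtract W(n) = 1/sps + v(n) each
    sample; on underflow, wrap modulo 1, fire the symbol strobe, and
    record mu(n) = NCO(n-1)/W(n) per eq. (67)."""
    nco = 1.0
    strobes = []
    for n in range(num_samples):
        w = 1.0 / sps + v          # constant control voltage for this sketch
        nxt = nco - w
        if nxt < 0:                # underflow: a symbol boundary was crossed
            strobes.append((n, nco / w))   # (sample index, fractional interval)
            nxt += 1.0             # modulo-1 wrap
        nco = nxt
    return strobes

# With v = 0 and 2 samples/symbol, the strobe fires every 2 samples
# and the fractional interval stays fixed.
print(run_vcc(8)[:2])   # [(2, 0.0), (4, 0.0)]
```

A nonzero v(n) from the loop filter speeds up or slows down the strobe, which is how the loop pulls the sampling phase toward the correct value.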
33.6 Phase Locked Loops
Carrier synchronization is done using a phase-locked loop (PLL). This section is included for reference. For those of you familiar with PLLs, the operation of the timing synchronization loop may be best understood by analogy to the continuous-time carrier synchronization loop.
Figure 63: General block diagram of a phase-locked loop for carrier synchronization [Rice].
Consider a generic PLL block diagram in Figure 63. The incoming signal is a carrier wave,

A cos(2πf_c t + θ(t))

where θ(t) is the phase offset. It is a function of time, t, because it may not be constant. For example, it may include a frequency offset, and then θ(t) = Δf t + α. In this case, the phase offset is a ramp.

The job of the PLL is to estimate θ(t). We call this estimate θ̂(t). The error in the phase estimate is

θ_e(t) = θ(t) − θ̂(t)

If the PLL is perfect, θ_e(t) = 0.
33.6.1 Phase Detector

If the estimate of the phase is not perfect, the phase detector produces a voltage to correct the phase estimate. The function g(θ_e) may look something like the one shown in Figure 64.

Initially, let's ignore the loop filter.

1. If the phase estimate is too low, that is, behind the carrier, then g(θ_e) > 0 and the VCO increases the phase of the VCO output.

2. If the phase estimate is too high, that is, ahead of the carrier, then g(θ_e) < 0 and the VCO decreases the phase of the VCO output.

3. If the phase estimate is exactly correct, then g(θ_e) = 0 and the VCO output phase remains constant.

This is the effect of the 'S' curve function g(θ_e) in Figure 64.
Figure 64: Typical phase detector input-output relationship. This curve is called an 'S' curve because the shape of the curve looks like the letter 'S' [Rice].
33.6.2 Loop Filter

The loop filter is just a lowpass filter to reduce the noise. Note that in addition to the A cos(2πf_c t + θ(t)) signal term we also have channel noise coming into the system. We represent the loop filter in the Laplace or frequency domain as F(s) or F(f), respectively, for continuous time. We will be interested in discrete time later, so we could also write the Z-transform response as F(z).
33.6.3 VCO

A voltage-controlled oscillator (VCO) simply integrates the input and uses that integral as the phase of a cosine wave. The input is essentially the gas pedal for a race car driving around a track: increase the gas (voltage), and the engine (VCO) will increase the frequency of rotation around the track.

Note that if you just raise the voltage for a short period and then reduce it back (a constant plus rect function input), you change the phase of the output, but leave the frequency the same.
33.6.4 Analysis

To analyze the loop in Figure 63, we end up making simplifications. In particular, we model the 'S' curve (Figure 64) as a line. This linear approximation is generally fine when the phase error is reasonably small, i.e., θ_e ≈ 0. If we do this, we can redraw Figure 63 as it is drawn in Figure 65.
Figure 65: Phase-equivalent and linear model for the continuous-time PLL, in the Laplace domain [Rice].
In the left half of Figure 65, we write just Θ(s) and Θ̂(s) even though the physical process, as we know, is the cosine of 2πf_c t plus that phase. This phase-equivalent model of the PLL allows us to do analysis.
We can write that

Θ_e(s) = Θ(s) − Θ̂(s)
       = Θ(s) − Θ_e(s) k_p F(s) (k_0/s)
To get the transfer function of the filter (when we consider the phase estimate to be the output), some manipulation shows you that

H_a(s) = Θ̂(s)/Θ(s) = [k_p F(s) (k_0/s)] / [1 + k_p F(s) (k_0/s)] = k_0 k_p F(s) / [s + k_0 k_p F(s)]

You can use H_a(s) to design the loop filter F(s) and gains k_p and k_0 to achieve the desired PLL goals. Here those goals include:

1. Desired bandwidth.

2. Desired response to particular phase error models.
The latter deserves more explanation. For example, phase error models might be: step input; or ramp input. Designing a filter to respond to these, but not to AWGN noise (as much as possible), is a challenge. The Rice book recommends a 2nd-order proportional-plus-integrator loop filter,

F(s) = k_1 + k_2/s

This filter has a proportional part (k_1), which is just proportional to the input and responds to the input during a phase offset. The filter also has an integrator part (k_2/s), which integrates the phase input and, in case of a frequency offset, will ramp up the phase at the proper rate.

Note: An accelerating frequency change wouldn't be handled properly by the given filter. For example, if the temperature of the transmitter or receiver varied quickly, the frequency of the crystal would also change quickly. However, this is not typically fast enough to be a problem in most radios.
In this case there are four parameters to set: k_0, k_1, k_2, and k_p. Rice Section C.2 details how to set these parameters to achieve a desired bandwidth and a desired damping factor (to avoid ringing, but to converge quickly). You will, in Homework 9, design the particular loop filter to use in your final project.
33.6.5 Discrete-Time Implementations

You can alter F(s) to be a discrete-time filter using Tustin's approximation,

1/s → (T/2) (1 + z^−1)/(1 − z^−1)

Because it is only a second-order filter, its implementation is quite efficient. See Figure C.2.1 (p. 733, Rice book) for diagrams of this discrete-time PLL. Equations C.56 and C.57 (p. 736) show how to calculate the constants K_1 and K_2, given the desired natural frequency θ_n, the normalized bandwidth B_n T, damping coefficient ζ, and other gain constants K_0 and K_p.
Note that the bandwidth B_n T is the most critical factor. It is typically much less than one (e.g., 0.005). This means that the loop filter effectively averages over many samples when setting the input to the VCO.
Lecture 22
Today: (1) Exam 2 Review
• Exam 2 is Tue Apr 14.

• I will be out of town Mon-Wed. The exam will be proctored. Please come to me with questions today or tomorrow.

• Office hours: Today: 2-3:30. Friday 11-12, 3-4.
34 Exam 2 Topics
Where to study:

• Lectures covered: 8 (detection theory), 9, 11-19. Lectures 20-21 are not covered. No material from 1-7 will be directly tested, but of course you need to know some things from the start of the semester to be able to perform well on the second part of this course.

• Homeworks covered: 4-8. Please review these homeworks and know the correct solutions.

• Rice book: Chapters 5 and 6.
What topics:
1. Detection theory
2. Signal space
3. Digital modulation methods:
• Binary, bipolar vs. unipolar
• M-ary: PAM, QAM, PSK, FSK
4. Intersymbol interference, Nyquist ﬁltering, SRRC pulse shaping
5. Diﬀerential encoding and decoding for BPSK
6. Energy detection for FSK
7. Gray encoding
8. Probability of Error vs. E_b/N_0.

• Standard modulation method. Both know the formula and know how it can be derived.

• A non-standard modulation method, given the signal space diagram.

• Exact expression, union bound, nearest-neighbor approximation.
9. Bandwidth assuming SRRC pulse shaping for PAM/QAM/PSK, FSK, concept of frequency multiplexing
10. Comparison of digital modulation methods:
• Need for linear ampliﬁers (transmitter complexity)
• Receiver complexity
• Bandwidth eﬃciency
Types of questions (format):
1. Short answer, with a 1-2 sentence limit. There may be multiple correct answers, but there may be many wrong answers (or right answer / wrong reason combinations).
2. True / false or multiple choice.
3. Work out solution. Example: What is the probability of bit error vs. E_b/N_0 for this signal space diagram?
4. System design: Here are some engineering requirements for this communication system. From this list of
possible modulation, which one would you recommend? What bandwidth and bit rate would it achieve?
You will be provided the table of the Q^{-1} function, and the plot of Q^{-1}. You can use two sides of an 8.5 x 11 sheet of paper.
Lecture 23
Today: (1) Source Coding
• HW 9 is due today at 5pm. OH today 2-3pm, Monday 4-5pm.
• Final homework: HW 10, which will be due April 28 at 5pm.
35 Source Coding
This topic is a semester-long course at the graduate level in itself, but the basic ideas can be presented pretty quickly.
One good reference:
• C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October, 1948. http://www.cs.bell-labs.com/cm/ms/what/shannonday/paper.html
We have done a lot of counting of bits as our primary measure of communication systems. Our information
source is measured in bits, or in bits per second. Modulation schemes’ bandwidth eﬃciency is measured in bits
per Hertz, and energy eﬃciency is energy per bit over noise PSD. Everything is measured in bits!
But how do we measure the bits of a source (e.g., audio, video, email, ...)? Information can often be
represented in many diﬀerent ways. Images and sound can be encoded in diﬀerent ways. Text ﬁles can be
presented in diﬀerent ways.
Here are two misconceptions:

1. Using the file size tells you how much information is contained within the file.

2. Take the log_2 of the number of different messages you could send.

For example, consider a digital black & white image (not grayscale, but truly black or white).

1. You could store it as a list of pixels. Each pixel has two possibilities (possible messages), thus we could encode it in log_2 2 = 1 bit per pixel.

2. You could simply send the coordinates of the pixels of one of the colors (e.g., all black pixels).

How many bits would be used in these two representations? What would make you decide which one is more efficient?
You can see that equivalent representations can use different numbers of bits. This is the idea behind source compression. For example, .zip or .tar files represent the exact same information that was contained in the original files, but with fewer bits.

What if we had a fixed number of bits to send any image, and we used the sparse B&W image coding scheme (2.) above? Sometimes, the number of bits in the compressed image would exceed what we had allocated. This would introduce errors into the image.
Two types of compression algorithms:
• Lossless: e.g., Zip or compress.
• Lossy: e.g., JPEG, MP3
Note: Both "zip" and the unix "compress" commands use the Lempel-Ziv algorithm for source compression.
So what is the intrinsic measure of bits of text, an image, audio, or video?
35.1 Entropy

Entropy is a measure of the randomness of a random variable. Randomness and information, in non-technical language, are just two perspectives on the same thing:

• If you are told the value of a r.v. that doesn't vary that much, that telling conveys very little information to you.

• If you are told the value of a very "random" r.v., that telling conveys quite a bit of information to you.

Our technical definition of entropy of a random variable is as follows.

Def 'n: Entropy
Let X be a discrete random variable with pmf p_X(x_i) = P[X = x_i]. Here, there is a finite or countably infinite set S_X, and x ∈ S_X. We will shorten the notation by using p_i as follows:

p_i = p_X(x_i) = P[X = x_i]

where {x_1, x_2, . . .} is an ordering of the possible values in S_X. Then the entropy of X, in units of bits, is defined as,

H[X] = − Σ_i p_i log_2 p_i                                                (68)
Notes:

• H[X] is an operator on a random variable, not a function of a random variable. It returns a (deterministic) number, not another random variable. Thus it is like E[X], another operator on a random variable.

• Entropy of a discrete random variable X is calculated using the probability values of the pmf of X, the p_i. Nothing else is needed.

• The sum will be from i = 1 . . . N when |S_X| = N < ∞.

• Use that 0 log 0 = 0. This is true in the limit of x log x as x → 0^+.

• All "log" functions are log-base-2 in information theory unless otherwise noted. Keep this in mind when reading a book on information theory. The "reason" the units are bits is because of the base-2 of the log. Actually, when theorists use log_e or the natural log, they express information in "nats", short for "natural" digits.
Example: Binary r.v.
A binary (Bernoulli) r.v. has pmf,

p_X(x) = { s,      x = 1
         { 1 − s,  x = 0
         { 0,      o.w.

What is the entropy H[X] as a function of s?

Solution: Entropy is given by (68) and is:

H[X] = −s log_2 s − (1 − s) log_2 (1 − s)

The solution is plotted in Figure 66.
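The binary entropy function is easy to compute, taking care with the 0 log 0 = 0 convention. A Python sketch (the function name is mine):

```python
import math

def binary_entropy(s):
    """H[X] = -s log2 s - (1-s) log2 (1-s), using the convention 0 log 0 = 0."""
    h = 0.0
    for p in (s, 1.0 - s):
        if p > 0:
            h -= p * math.log2(p)
    return h

print(binary_entropy(0.5))   # 1.0: a fair coin flip carries one full bit
print(binary_entropy(0.0))   # 0.0: a constant r.v. conveys no information
print(binary_entropy(0.9))   # less than 1 bit: a biased coin is less random
```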
Figure 66: Entropy of a binary r.v. as a function of s = P[X = 1].
Example: Non-uniform source with five messages
Some signals are more often close to zero (e.g., audio). Model the r.v. X to have pmf

p_X(x) = { 1/16,  x = 2
         { 1/4,   x = 1
         { 1/2,   x = 0
         { 1/8,   x = −1
         { 1/16,  x = −2
         { 0,     o.w.

What is its entropy H[X]?

Solution:

H[X] = (1/2) log_2 2 + (1/4) log_2 4 + (1/8) log_2 8 + 2 (1/16) log_2 16 = 15/8 bits    (69)
Other questions:

1. Do you need to know what the symbol set S_X is?

2. Would multiplying X by 2 change its entropy?

3. Would an arbitrary one-to-one function change the entropy of X?
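These questions can be probed numerically. A Python sketch of (68) applied to the five-message source above (the helper name is mine); notice that only the probabilities enter the computation, never the symbol values themselves:

```python
import math

def entropy(pmf):
    """H[X] = -sum_i p_i log2 p_i over the pmf values, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# The five-message source from the example above
pmf = [1/16, 1/4, 1/2, 1/8, 1/16]
print(entropy(pmf))   # 1.875 = 15/8 bits, matching (69)
```

Since the symbol set never appears, multiplying X by 2 (or applying any one-to-one function) leaves the entropy unchanged.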
35.2 Joint Entropy

Def 'n: Joint Entropy
The joint entropy of two random variables X_1, X_2 with event sets S_X1 and S_X2 is defined as

H[X_1, X_2] = − Σ_{x_1 ∈ S_X1} Σ_{x_2 ∈ S_X2} p_{X1,X2}(x_1, x_2) log_2 p_{X1,X2}(x_1, x_2)    (70)
For N joint random variables, X_1, . . . , X_N, entropy is

H[X_1, . . . , X_N] = − Σ_{x_1 ∈ S_X1} · · · Σ_{x_N ∈ S_XN} p_{X1,...,XN}(x_1, . . . , x_N) log_2 p_{X1,...,XN}(x_1, . . . , x_N)
What is the entropy for N i.i.d. random variables? You can show that

H[X_1, . . . , X_N] = −N Σ_{x_1 ∈ S_{X_1}} p_{X_1}(x_1) log_2 p_{X_1}(x_1) = N H[X_1]
The entropy of N i.i.d. random variables is N times the entropy of any one of them. In addition, the entropy of any N independent (but possibly differently distributed) r.v.s is just the sum of the entropies of the individual r.v.s.
When r.v.s are not independent, the joint entropy of N r.v.s is less than N times the entropy of one of them. Intuitively, if you know some of them, then because of the dependence or correlation, the rest that you don’t know become less informative. For example, in the B&W image, since pixels are correlated in space, the joint r.v. of several neighboring pixels will have less entropy than the sum of the individual pixel entropies.
35.3 Conditional Entropy
How much additional entropy is in the joint random variables X_1, X_2 compared to just one of them? This is often an important question because it answers the question, “How much additional information do I get from both, compared to just one of them?”. We call this difference the conditional entropy, H[X_2 | X_1]:

H[X_2 | X_1] = H[X_2, X_1] − H[X_1].    (71)
What is an equation for H[X_2 | X_1] as a function of the joint probabilities p_{X_1,X_2}(x_1, x_2) and the conditional probabilities p_{X_2|X_1}(x_2 | x_1)?
Solution: Plugging in (68) for H[X_2, X_1] and H[X_1], and writing Σ_{x_1} and Σ_{x_2} for sums over x_1 ∈ S_{X_1} and x_2 ∈ S_{X_2},

H[X_2 | X_1] = − Σ_{x_1} Σ_{x_2} p_{X_1,X_2}(x_1, x_2) log_2 p_{X_1,X_2}(x_1, x_2) + Σ_{x_1} p_{X_1}(x_1) log_2 p_{X_1}(x_1)
             = − Σ_{x_1} Σ_{x_2} p_{X_1,X_2}(x_1, x_2) log_2 p_{X_1,X_2}(x_1, x_2) + Σ_{x_1} [ Σ_{x_2} p_{X_1,X_2}(x_1, x_2) ] log_2 p_{X_1}(x_1)
             = − Σ_{x_1} Σ_{x_2} p_{X_1,X_2}(x_1, x_2) [ log_2 p_{X_1,X_2}(x_1, x_2) − log_2 p_{X_1}(x_1) ]
             = − Σ_{x_1} Σ_{x_2} p_{X_1,X_2}(x_1, x_2) log_2 [ p_{X_1,X_2}(x_1, x_2) / p_{X_1}(x_1) ]
             = − Σ_{x_1} Σ_{x_2} p_{X_1,X_2}(x_1, x_2) log_2 p_{X_2|X_1}(x_2 | x_1)    (72)
Note the asymmetry – there is the joint probability multiplied by the log of the conditional probability. This
is not like either the joint or the marginal entropy.
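A small numerical sketch showing that (71) and (72) agree; the 2×2 joint pmf below is a made-up example, not from the notes:

```python
import math

# Hypothetical 2x2 joint pmf p(x1, x2); the values are made up for illustration.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def H(values):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(q * math.log2(q) for q in values if q > 0)

# Marginal pmf of X1.
p1 = {x1: sum(q for (a, _), q in p.items() if a == x1) for x1 in (0, 1)}

# (71): H[X2|X1] = H[X2,X1] - H[X1]
H_cond_71 = H(p.values()) - H(p1.values())

# (72): -sum over (x1,x2) of p(x1,x2) log2 p(x2|x1), with p(x2|x1) = p(x1,x2)/p(x1)
H_cond_72 = -sum(q * math.log2(q / p1[x1]) for (x1, _), q in p.items())

print(abs(H_cond_71 - H_cond_72) < 1e-12)  # → True
```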
We could also have multivariate conditional entropy,

H[X_N | X_{N−1}, . . . , X_1] = − Σ_{x_1 ∈ S_{X_1}} · · · Σ_{x_N ∈ S_{X_N}} p_{X_1,...,X_N}(x_1, . . . , x_N) log_2 p_{X_N|X_{N−1},...,X_1}(x_N | x_{N−1}, . . . , x_1)
which is the additional entropy (or information) contained in the Nth random variable, given the values of the
N −1 previous random variables.
35.4 Entropy Rate
Typically, we’re interested in discrete-time random processes, in which we have random variables X_1, X_2, . . .. Since there are infinitely many of them, the joint entropy of all of them may go to infinity as N → ∞. For this case, we are more interested in the rate. How many additional bits, in the limit, are needed for the average r.v. as N → ∞?
Def ’n: Entropy Rate
The entropy rate of a stationary discrete-time random process, in units of bits per random variable (a.k.a. source output), is defined as

H = lim_{N→∞} H[X_N | X_{N−1}, . . . , X_1].

It can be shown that entropy rate can equivalently be written as

H = lim_{N→∞} (1/N) H[X_1, X_2, . . . , X_N].
Example: Entropy of English text
Let X_i be the ith letter or space in a common English sentence. What is the sample space S_{X_i}? Is X_i uniform on that space?
What is H[X_i]? Solution: I had Matlab read in the text of Shakespeare’s Romeo and Juliet. See Figure 67(a). For this pmf, I calculated an entropy of H = 4.1199. The Proakis & Salehi book mentions that this value for general English text is about 4.3.
What is H[X_i, X_{i+1}]? Solution: Again, using Matlab on Shakespeare’s Romeo and Juliet, I calculated the entropy of the joint pmf of each two-letter combination. This gives me the two-dimensional pmf shown in Figure 67(b). I calculate an entropy of 7.46, which is 2 × 3.73. For the three-letter combinations, the joint entropy was 10.04 = 3 × 3.35. For four-letter combinations, the joint entropy was 11.98 = 4 × 2.99.
You can see that the average entropy in bits per letter is decreasing quickly.
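The notes use Matlab for this experiment; an equivalent sketch in Python is below. The sample string is just a short stand-in for the full text of Romeo and Juliet:

```python
import math
from collections import Counter

def block_entropy(text, n=1):
    """Empirical entropy in bits of n-letter blocks (letters and spaces only)."""
    symbols = [c for c in text.lower() if c.isalpha() or c == " "]
    blocks = ["".join(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    counts = Counter(blocks)
    total = sum(counts.values())
    return -sum((k / total) * math.log2(k / total) for k in counts.values())

sample = "but soft what light through yonder window breaks"
H1 = block_entropy(sample, 1)
H2 = block_entropy(sample, 2)
print(H1, H2 / 2)  # entropy per letter drops as the block length grows
```

On the full play, H1 and H2/2 would approach the 4.12 and 3.73 values quoted above.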
Figure 67: PMF of (a) single letters and (b) twoletter combinations (including spaces) in Shakespeare’s Romeo
and Juliet.
What is the entropy rate, H? Solution: For N = 10, we have H = 1.3 bits/letter [taken from Proakis &
Salehi Section 6.2].
35.5 Source Coding Theorem
The key connection between this mathematical deﬁnition of entropy and the bit rate that we’ve been talking
about all semester is given by the source coding theorem. It is one of the two fundamental theorems of
information theory, and was introduced by Claude Shannon in 1948.
Theorem: A source with entropy rate H can be encoded with arbitrarily small error probability, at any rate
R (bits / source output) as long as R > H. Conversely, if R < H, the error probability will be bounded away
from zero, independent of the complexity of the encoder and the decoder employed.
Proof: Using typical sequences. See Shannon’s original 1948 paper.
Notes:
• Here, an ‘error’ occurs when your compressed version of the data is not exactly the same as the original.
Example: B&W images.
• R is our 1/T_b.
• Theorem fact: Information measure (entropy) gives us a minimum bit rate.
• What is the minimum possible rate to encode English text (if you remove all punctuation)?
• The theorem does not tell us how to do it – just that it can be done.
• The theorem does not tell us how well it can be done if N is not inﬁnite. That is, for a ﬁnite source, the
rate may need to be higher.
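As an illustration of the theorem (a sketch, not from the notes): the five-message source above has a dyadic pmf, so a prefix code with codeword lengths matched to −log_2 p_i achieves the entropy bound exactly:

```python
import math

# pmf of the five-message source above, and one prefix-free (Huffman) code:
# x = 0 -> '0', x = 1 -> '10', x = -1 -> '110', x = 2 -> '1110', x = -2 -> '1111'
pmf = [1/2, 1/4, 1/8, 1/16, 1/16]
lengths = [1, 2, 3, 4, 4]

avg_len = sum(p * L for p, L in zip(pmf, lengths))
H = -sum(p * math.log2(p) for p in pmf)
print(avg_len, H)  # → 1.875 1.875: the rate meets the entropy bound
```

For non-dyadic pmfs the best prefix code has average length strictly above H but within one bit of it.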
Lecture 24
Today: (1) Channel Capacity
36 Review
Last time, we defined entropy,

H[X] = − Σ_i p_i log_2 p_i

and entropy rate,

H = lim_{N→∞} (1/N) H[X_1, X_2, . . . , X_N].
We showed that entropy can be used to quantify information. Given our information source X or {X_i}, the value of H[X] or H gives us a measure of how many bits we actually need to use to encode, without loss, the source data.
The major result was Shannon’s source coding theorem, which says that a source with entropy rate H can be encoded with arbitrarily small error probability, at any rate R (bits / source output) as long as R > H. Any lower rate than H would guarantee loss of information.
37 Channel Coding
Now, we turn to the noisy channel. This discussion of entropy allows us to consider the maximum data rate which can be carried without error on a bandlimited channel affected by additive white Gaussian noise (AWGN).
37.1 R. V. L. Hartley
Ralph V. L. Hartley (born Nov. 30, 1888) received the A.B. degree from the University of Utah in 1909. He
worked as a researcher for the Western Electric Company, involved in radio telephony. Here he developed the
“Hartley oscillator”. Afterwards, at Bell Laboratories, he developed relationships useful for determining the
capacity of bandlimited communication channels. In July 1928, he published in the Bell System Technical
Journal a paper on “Transmission of Information”.
Hartley was particularly influenced by Nyquist’s result. When transmitting a sequence of pulses, each of duration T_sy, Nyquist determined that the pulse rate was limited to two times the available channel bandwidth B,

1/T_sy ≤ 2B.
In Hartley’s 1928 paper, he considered digital transmission in pulseamplitude modulated systems. The
pulse rate was limited to 2B, as described by Nyquist. But, depending on how pulse amplitudes were chosen,
each pulse could represent more or less information.
In particular, Hartley assumed that the maximum amplitude available to the transmitter was A. Then, Hartley made the assumption that the communication system could discern between pulse amplitudes if they were separated by at least a voltage spacing of A_δ. Given that a PAM system operates from 0 to A in increments of A_δ, the number of different pulse amplitudes (symbols) is

M = 1 + A/A_δ
Note that early receivers were modiﬁed AM envelope detectors, and did not deal well with negative amplitudes.
Next, Hartley used the ‘bit’ measure to quantify the data which could be encoded using M amplitude levels,

log_2 M = log_2 (1 + A/A_δ)

Finally, Hartley quantified the data rate using Nyquist’s relationship to determine the maximum rate C, in bits per second, possible from the digital communication system,

C = 2B log_2 (1 + A/A_δ)
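Hartley's formula can be evaluated directly; the numbers below are assumed for illustration, not taken from the notes:

```python
import math

def hartley_capacity(B, A, A_delta):
    """Hartley's maximum rate C = 2B log2(1 + A/A_delta), in bits per second."""
    return 2 * B * math.log2(1 + A / A_delta)

# Assumed example: 3 kHz bandwidth, amplitudes 0..A resolvable in steps of A/7,
# i.e., M = 8 levels -> 3 bits per pulse at 2B = 6000 pulses/sec.
print(hartley_capacity(3000.0, 7.0, 1.0))  # → 18000.0 bits/sec
```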
(Note that the Hartley transform came later in his career in 1942).
37.2 C. E. Shannon
What was left unanswered by Hartley’s capacity formula was the relationship between noise and the minimum amplitude separation between symbols. Engineers would have to be conservative when setting A_δ to ensure a low probability of error.
Furthermore, the capacity formula was for a particular type of PAM system, and did not say anything
fundamental about the relationship between capacity and bandwidth for arbitrary modulation.
37.2.1 Noisy Channel
Shannon did take into account an AWGN channel, and used statistics to develop a universal bound for capacity, regardless of modulation type. In this AWGN channel, the ith symbol sample at the receiver (after the matched filter, assuming perfect synchronization) is y_i,

y_i = x_i + z_i

where X is the transmitted signal and Z is the noise in the channel. The noise term z_i is assumed to be i.i.d. Gaussian with variance P_N.
37.2.2 Introduction of Latency
Shannon’s key insight was to exchange latency (time delay) for reduced probability of error. In fact, his capacity bound considers simultaneously demodulating sequences of received symbols, y = [y_1, . . . , y_n], of length n. All n symbols are received before making a decision. This late decision will decide all values of x = [x_1, . . . , x_n] simultaneously. Further, Shannon’s proof considers the limiting case as n → ∞.
This asymptotic limit as n → ∞ allows for a proof using the statistical convergence of a sequence of random variables. In particular, we need a law called the law of large numbers (ECE 6962 topic). This law says that the following event,

(1/n) Σ_{i=1}^{n} (y_i − x_i)^2 ≤ P_N

happens with probability one, as n → ∞. In other words, as n → ∞, the measured value y will be located within an n-dimensional sphere (hypersphere) of radius √(n P_N) with center x.
37.2.3 Introduction of Power Limitation
Shannon also formulated the problem as a power-limited case, in which the average power in the desired signal x_i was limited to P. That is,

(1/n) Σ_{i=1}^{n} x_i^2 ≤ P

This combination of signal power limitation and noise power results in the fact that,

(1/n) Σ_{i=1}^{n} y_i^2 ≤ (1/n) Σ_{i=1}^{n} x_i^2 + (1/n) Σ_{i=1}^{n} (y_i − x_i)^2 ≤ P + P_N

As a result,

‖y‖^2 ≤ n(P + P_N)

This result says that the vector y, with probability one as n → ∞, is contained within a hypersphere of radius √(n(P + P_N)) centered at the origin.
37.3 Combining Two Results
The two results, together, show how many different symbols we could have uniquely distinguished within a period of n sample times. Hartley asked how many symbol amplitudes could be fit into [0, A] such that they are all separated by A_δ. Shannon’s formulation asks us how many multidimensional amplitudes x_i can be fit into a hypersphere of radius √(n(P + P_N)) centered at the origin, such that hyperspheres of radius √(n P_N) do not overlap. This is shown in P&S in Figure 9.9 (pg. 584).
This number M is the number of different messages that could have been sent in n pulses. The result of this geometrical problem is that

M = (1 + P/P_N)^(n/2)    (73)
37.3.1 Returning to Hartley
Adjusting Hartley’s formula, if we could send M messages now in n pulses (rather than 1 pulse) we would adjust capacity to be:

C = (2B/n) log_2 M

Using the M from (73) above,

C = (2B/n) (n/2) log_2 (1 + P/P_N) = B log_2 (1 + P/P_N)
37.3.2 Final Results
Finally, we can replace the noise variance P_N with its relationship to the noise two-sided PSD N_0/2, by

P_N = N_0 B
So finally we have the Shannon-Hartley Theorem,

C = B log_2 (1 + P/(N_0 B))    (74)

Typically, people also use W = B, so you’ll also see

C = W log_2 (1 + P/(N_0 W))    (75)
This result says that a communication system can operate at bit rate C (in a bandlimited channel with width W, given power limit P and noise value N_0), with arbitrarily low probability of error.
Shannon also proved that any system which operates at a bit rate higher than the capacity C will certainly incur a positive bit error rate. Any practical communication system must operate at R_b < C, where R_b is the operating bit rate.
Note that the ratio P/(N_0 W) is the signal power divided by the noise power, or signal-to-noise ratio (SNR). Thus the capacity bound is also written C = W log_2 (1 + SNR).
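A direct evaluation of (75) (the example numbers are assumptions, not from the notes):

```python
import math

def shannon_capacity(W, P, N0):
    """Shannon-Hartley capacity C = W log2(1 + P/(N0 W)) in bits/sec."""
    snr = P / (N0 * W)
    return W * math.log2(1 + snr)

# Assumed example: SNR = 15 (about 11.8 dB) in a 1 kHz channel.
print(shannon_capacity(1000.0, 15000.0, 1.0))  # → 4000.0 bits/sec, i.e., 4 bits/sec/Hz
```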
37.4 Efficiency Bound
Recall that E_b = P T_b. That is, the energy per bit is the power multiplied by the bit duration. Thus from (74),

C = W log_2 (1 + (E_b/T_b)/(N_0 W))

or, since R_b = 1/T_b,

C = W log_2 (1 + (R_b/W)(E_b/N_0))

Here, C is just a capacity limit. We know that our bit rate R_b ≤ C, so

R_b/W ≤ log_2 (1 + (R_b/W)(E_b/N_0))

Defining η = R_b/W,

η ≤ log_2 (1 + η E_b/N_0)

This expression can’t analytically be solved for η. However, you can look at it as a bound on the bandwidth efficiency as a function of the E_b/N_0 ratio. This relationship is shown in Figure 70.
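The bound can be evaluated numerically; a sketch (assuming a simple bisection search, not from the notes) finds the largest η satisfying η = log_2(1 + η E_b/N_0) at a given E_b/N_0:

```python
import math

def max_spectral_efficiency(ebn0_db, tol=1e-9):
    """Largest eta with eta <= log2(1 + eta * Eb/N0), found by bisection."""
    ebn0 = 10 ** (ebn0_db / 10.0)
    f = lambda eta: math.log2(1 + eta * ebn0) - eta  # > 0 strictly below the bound
    lo, hi = 1e-9, 60.0
    if f(hi) > 0:          # bound not reached even at eta = 60 (extremely high Eb/N0)
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# At Eb/N0 = 0 dB (ratio 1), the bound is eta = 1: log2(1 + 1) = 1 bit/sec/Hz.
print(round(max_spectral_efficiency(0.0), 6))  # → 1.0
```

Below E_b/N_0 = ln 2 (about −1.59 dB, the Shannon limit) the search collapses toward η = 0: no positive spectral efficiency satisfies the bound.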
Lecture 25
Today: (1) Channel Capacity
• Lecture Tue April 28 is canceled. Instead, please attend the Robert Graves lecture at 3:05 PM in Room 105 WEB, “Digital Terrestrial Television Broadcasting”.
• The ﬁnal project is due Tue, May 5, 5pm. No late projects are accepted, because of the late due date.
If you think you might be late, set your deadline to May 4.
• Everyone gets a 100% on the “Discussion Item” which I never implemented.
Figure 68: From the Shannon-Hartley theorem, bound on bandwidth efficiency, η, on a (a) linear plot and (b) log plot.
38 Review
Last time, we defined entropy,

H[X] = − Σ_i p_i log_2 p_i

and entropy rate,

H = lim_{N→∞} (1/N) H[X_1, X_2, . . . , X_N].
We showed that entropy can be used to quantify information. Given our information source X or {X_i}, the value of H[X] or H gives us a measure of how many bits we actually need to use to encode, without loss, the source data.
The major result was Shannon’s source coding theorem, which says that a source with entropy rate H can be encoded with arbitrarily small error probability, at any rate R (bits / source output) as long as R > H. Any lower rate than H would guarantee loss of information.
39 Channel Coding
Now, we turn to the noisy channel. This discussion of entropy allows us to consider the maximum data rate which can be carried without error on a bandlimited channel affected by additive white Gaussian noise (AWGN).
39.1 R. V. L. Hartley
Ralph V. L. Hartley (born Nov. 30, 1888) received the A.B. degree from the University of Utah in 1909. He worked as a researcher for the Western Electric Company, involved in radio telephony. Afterwards, at Bell Laboratories, he developed relationships useful for determining the capacity of bandlimited communication channels. In July 1928, he published in the Bell System Technical Journal a paper on “Transmission of Information”.
Hartley was particularly influenced by Nyquist’s sampling theorem. When transmitting a sequence of pulses, each of duration T_s, Nyquist determined that the pulse rate was limited to two times the available channel bandwidth B,

1/T_s ≤ 2B.
In Hartley’s 1928 paper, he considered digital transmission in pulseamplitude modulated systems. The
pulse rate was limited to 2B, as described by Nyquist. But, depending on how pulse amplitudes were chosen,
each pulse could represent more or less information.
In particular, Hartley assumed that the maximum amplitude available to the transmitter was A. Then, Hartley made the assumption that the communication system could discern between pulse amplitudes if they were separated by at least a voltage spacing of A_δ. Given that a PAM system operates from 0 to A in increments of A_δ, the number of different pulse amplitudes (symbols) is

M = 1 + A/A_δ
Note that early receivers were modiﬁed AM envelope detectors, and did not deal well with negative amplitudes.
Next, Hartley used the ‘bit’ measure to quantify the data which could be encoded using M amplitude levels,

log_2 M = log_2 (1 + A/A_δ)
Finally, Hartley quantified the data rate using Nyquist’s relationship to determine the maximum rate C, in bits per second, possible from the digital communication system,

C = 2B log_2 (1 + A/A_δ)
39.2 C. E. Shannon
What was left unanswered by Hartley’s capacity formula was the relationship between noise and the minimum amplitude separation between symbols. Engineers would have to be conservative when setting A_δ to ensure a low probability of error.
Furthermore, the capacity formula was for a particular type of PAM system, and did not say anything
fundamental about the relationship between capacity and bandwidth for arbitrary modulation.
39.2.1 Noisy Channel
Shannon did take into account an AWGN channel, and used statistics to develop a universal bound for capacity, regardless of modulation type. In this AWGN channel, the ith symbol sample at the receiver (after the matched filter, assuming perfect synchronization) is y_i,

y_i = x_i + z_i

where X is the transmitted signal and Z is the noise in the channel. The noise term z_i is assumed to be i.i.d. Gaussian with variance E_N = N_0/2.
39.2.2 Introduction of Latency
Shannon’s key insight was to exchange latency (time delay) for reduced probability of error. In fact, his capacity bound considers n-dimensional signaling. So the received vector is y = [y_1, . . . , y_n], of length n. These might be truly an n-dimensional signal (e.g., FSK), or they might use multiple symbols over time (recall that symbols at different multiples of T_s are orthogonal). In either case, Shannon uses all n dimensions in the constellation – the detector must use all n samples of y to make a decision. In the multiple-symbols-over-time case, this late decision will decide all values of x = [x_1, . . . , x_n] simultaneously. Further, Shannon’s proof considers the limiting case as n → ∞.
This asymptotic limit as n → ∞ allows for a proof using the statistical convergence of a sequence of random variables. In particular, we need a law called the law of large numbers. This law says that the following event,

(1/n) Σ_{i=1}^{n} (y_i − x_i)^2 ≤ E_N

happens with probability one, as n → ∞. In other words, as n → ∞, the measured value y will be located within an n-dimensional sphere (hypersphere) of radius √(n E_N) with center x.
39.2.3 Introduction of Power Limitation
Shannon also formulated the problem as an energy-limited case, in which the maximum symbol energy in the desired signal x_i was limited to E. That is,

(1/n) Σ_{i=1}^{n} x_i^2 ≤ E
This combination of signal energy limitation and noise energy results in the fact that,

(1/n) Σ_{i=1}^{n} y_i^2 ≤ (1/n) Σ_{i=1}^{n} x_i^2 + (1/n) Σ_{i=1}^{n} (y_i − x_i)^2 ≤ E + E_N

As a result,

‖y‖^2 ≤ n(E + E_N)

This result says that the vector y, with probability one as n → ∞, is contained within a hypersphere of radius √(n(E + E_N)) centered at the origin.
39.3 Combining Two Results
The two results, together, show how many different symbols we could have uniquely distinguished within a period of n sample times. Hartley asked how many symbol amplitudes could be fit into [0, A] such that they are all separated by A_δ. Shannon’s formulation asks us how many multidimensional amplitudes x_i can be fit into a hypersphere of radius √(n(E + E_N)) centered at the origin, such that hyperspheres of radius √(n E_N) do not overlap. This is shown in Figure 69.
Figure 69: Shannon’s capacity formulation simplifies to the geometrical question of: how many hyperspheres of a smaller radius √(n E_N) fit into a hypersphere of radius √(n(E + E_N))?
This number M is the number of different messages that could have been sent in n pulses. The result of this geometrical problem is that

M = (1 + E/E_N)^(n/2)    (76)
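Taking log_2 of both sides of (76) and dividing by n gives (1/2) log_2(1 + E/E_N) bits per channel dimension, which a short sketch can evaluate (the example numbers are assumed for illustration):

```python
import math

def bits_per_dimension(E, EN):
    """log2(M)/n for M = (1 + E/EN)^(n/2): half a log2 per dimension."""
    return 0.5 * math.log2(1 + E / EN)

# Assumed example: symbol energy 15 times the noise energy.
print(bits_per_dimension(15.0, 1.0))  # → 2.0 bits per dimension
```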
39.3.1 Returning to Hartley
Adjusting Hartley’s formula, if we could send M messages now in n pulses (rather than 1 pulse) we would adjust capacity to be:

C = (2B/n) log_2 M

Using the M from (76) above,

C = (2B/n) (n/2) log_2 (1 + E/E_N) = B log_2 (1 + E/E_N)
39.3.2 Final Results
Since energy is power multiplied by time, E = P T_s = P/(2B), where P is the maximum signal power and B is the bandwidth, and E_N = N_0/2, we have the Shannon-Hartley Theorem,

C = B log_2 (1 + P/(N_0 B)).    (77)

This result says that a communication system can operate at bit rate C (in a bandlimited channel with width B, given power limit P and noise value N_0), with arbitrarily low probability of error.
Shannon also proved that any system which operates at a bit rate higher than the capacity C will certainly incur a positive bit error rate. Any practical communication system must operate at R_b < C, where R_b is the operating bit rate.
Note that the ratio P/(N_0 B) is the signal power divided by the noise power, or signal-to-noise ratio (SNR). Thus the capacity bound is also written C = B log_2 (1 + SNR).
39.4 Efficiency Bound
Another way to write the maximum signal power P is to multiply it by the bit period and use it as the maximum energy per bit, i.e., E_b = P T_b. That is, the energy per bit is the maximum power multiplied by the bit duration. Thus from (77),

C = B log_2 (1 + (E_b/T_b)/(N_0 B))

or, since R_b = 1/T_b,

C = B log_2 (1 + (R_b/B)(E_b/N_0))

Here, C is just a capacity limit. We know that our bit rate R_b ≤ C, so

R_b/B ≤ log_2 (1 + (R_b/B)(E_b/N_0))

Defining η = R_b/B (the spectral efficiency),

η ≤ log_2 (1 + η E_b/N_0)

This expression can’t analytically be solved for η. However, you can look at it as a bound on the bandwidth efficiency as a function of the E_b/N_0 ratio. This relationship is shown in Figure 70. Figure 71 is the plot on a log-y axis with some of the modulation types discussed this semester.
Figure 70: From the Shannon-Hartley theorem, bound on bandwidth efficiency, η.
[Plot: bandwidth efficiency (bps/Hz) vs. E_b/N_0 (dB), with points for 4- to 1024-FSK, 4- to 1024-QAM, and 8- to 64-PSK.]
Figure 71: From the Shannon-Hartley theorem bound, with achieved bandwidth efficiencies of M-QAM, M-PSK, and M-FSK.
. .1 DPSK Transmitter . . . . . . . . . . .3. . . . .6. . . . . . .3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Power and Energy Limited Channels 30.6 Transmitter Complexity . . .6 Receiver Block Diagrams . . . . . . . . . . . .3. . . . . . . . . . .1 PSK. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1. . . . . . . . . . . . .7 Probability of Error for Coherent Binary FSK .7 Oﬀset QPSK . . 27. . .2 Beneﬁts of Frequency Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8 Probability of Error for Noncoherent Binary FSK 27. . . . .9. . . . . . . . . . . . . . . . . . . . . . 0 E 29. . . . . . . . . . . . . . . . . . . . . . . . . . 29. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Free Space . . . . . . . . . .9 FSK Error Probabilities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29. . . . .3 Probability of Bit Error for DPSK 29.1. .2 DPSK Receiver . . . . . . . 86 27 Frequency Shift Keying 27. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27. . . .4 Bandwidth Eﬃciency vs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 98 99 99 100 100 101 101 101 102 102 103 103 104 105 105 106 108 110 110 111 111 112 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30. . . . . . . . . .ECE 5520 Fall 2009 5 26 QAM Probability of Error 85 26. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Coherent Reception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Orthogonal Frequencies . . . . . . . . . 27. . . . . . . . . . . . . . . . . . . . 29. . .4 Computing Noise Energy . . . 96 28. . . . . . . . . . . . . . . . . . . . . . . 95 28. . . 27. 27.1 Link Budgets Given C/N0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Wired Channels . . . . . . . . . . . .3 Bandwidth Eﬃciency . . . . . . . . 
.1 M ary NonCoherent FSK . . . .8 Receiver Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29. . . . . .5 Fidelity (P [error]) vs. . . . . . . . . . . . . . . . . . . . .3 OFDM as an extension of FSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27. . . . . .1 Linear / Nonlinear Ampliﬁers . . .
. . . 33. . . 33 Final Project Overview 33. . . . . . . . . . . . . . . 124 . . . . . . . . . . . . . . . . . . .2 Timing Error Detection .3 Conditional Entropy . . . . . . . . . . . . . . . . . . . 35. . . . . . .1 R. . . . . . . .5 Voltage Controlled Clock (VCC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 . . . . . . . . . . . . . . . . . . . . . . . 123 . . . . . . . 37. . . . . . . . . . . V. . . . . . . . . . 134 . . . 124 . . . . . . . . . . . . . . . . . . . . . . . 37. . . . . . . . . 134 . . . . . . . . 115 .5 DiscreteTime Implementations .1 Returning to Hartley . . . . . . . . . . 128 . . . . . . . . . . . . . . . . . . . 119 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37. . . . . . . . .3 Combining Two Results . . 117 118 . . . . . . . . . . . . 35. . . . . . 37. Hartley . . . . . . . . . . . . . . . . . . . . . . . . . 113 114 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Review of Interpolation Filters . . . . . . 124 . . .1 QPSK Timing Error Detection . . . . . . . . . . . . . . 130 . . . . . . . 32. . . . . . . . . . . . . . . . 34 Exam 2 Topics 35 Source Coding 35. . . . . . . . . . . . . . . . . . . 130 . . . . . . . . . . . . . . . . . . 133 . . . . . . . . . . . . .2. . . . .4 Eﬃciency Bound . . . .4 Entropy Rate . . . . . . . . . . . 37. . . . . . . . . . . . . . . . .6. . . . . . . . . . . . . . . . . . .3 Approximate Interpolation Filters . . . . . . . . . . .6. .1 Phase Detector . . . . . . . . . . . . .1 Higher order polynomial interpolation ﬁlters . .1 Sampling Time Notation . . . . . . . . . . . . . . . . . . . . . . . . . 136 138 . . . . . . . . 38 Review . . . . . . . . . . . . . . . . . . . . . . . . .5 Examples . . . . . . . . 135 . . . . . . . .4. . . . . 125 126 127 .2 Seeing Interpolation as Filtering . . . . . . . . . . . . . . . . . . . . . . . . 131 . . . . . . . . 32. . . . . . . . . . . . . . . . . . . . . . . . .3. . . . . 
.4 Zerocrossing timing error detector (ZCTED) 33. . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 . . . . . . . . . . . . . . . . .3 Introduction of Power Limitation 37.4 Implementations . 33. . . 33. . . . . . . . . . . . . . . . . . . . . . . . . . . .6. .2 Final Results . . 135 . . . . . . . 33. . 116 . . . . . . . . . . . . . . . 112 31 Timing Synchronization 32 Interpolation 32. . . . . . . . . .3.6. .3 VCO . . . . . . . . . . . . . . Shannon . . . . . . . . . . . . . 134 . . . . . . . . . . . . . . . . . . . . . . . . 119 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37. . . . . . . . . . . . . . .2. . . . . . . . . . . . . . . . . . . . . . . . . 37. . . . . . . . . . . . 122 . . . . . . . 33. . . . . . . . . . . . . . . . . . . . . . . . . . . 33. 117 . . . . . . . . . . . . . . . . . . . . 135 . . . . . . . . . . . .2. . . . . . .2 C. .4 Analysis . . . . . . . . . . . . . . . . . . . . . . .2 Loop Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Source Coding Theorem 36 Review 37 Channel Coding 37. . . . . . . . . . . . . . . . 33. . . . . . . . . . .3 Earlylate timing error detector (ELTED) . .6 Phase Locked Loops . . . . . . . . . . . . . . . . . 33. . . . . . . . . . . . . . . 33. . . . . . . . . . . . . . . 35. . . . . .2 Joint Entropy . . . . . 32. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Noisy Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E.6. . . . . 116 . . 135 . 120 . . . . . . . . . . . . . . . . . .ECE 5520 Fall 2009 6 30. 32. 121 . . . .2 Introduction of Latency . . .4. . . . . . . . . . . . . . 33. 35.1 Entropy . . . . . 132 133 133 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . L. . . . . . . . . . . . . . . . . . .
. . . . . . . . 39. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Eﬃciency Bound . . V. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Introduction of Power Limitation 39. . . . Shannon . . .3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3. . . . . . . .2. . . . . . . . . . . L. . . . . 39. . .3 Combining Two Results .1 R. . . . . . . . .1 Returning to Hartley . . . . . . . . . . . . 39. .1 Noisy Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39. . . . . . . . . . . . . . . . . . . . .2 C. . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hartley . . 39. . . . . . . . . .2. . . . . . . . . . . . . .2 Introduction of Latency . . . . 39. . . . . . . . . . . . . . . . .2. . .2 Final Results . . . . . .ECE 5520 Fall 2009 7 138 138 139 139 139 139 140 140 141 141 39 Channel Coding 39. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39. . . . . . . . . . . . . . . . . E. . .
Lecture 1

Today: (1) Syllabus (2) Intro to Digital Communications

1 Class Organization

Textbook Few textbooks cover solely digital communications (without analog) in an introductory communications course. But graduates today will almost always encounter / be developing solely digital communication systems. So half of most textbooks are useless, and the other half is sparse and needs supplemental material. For example, the past text was the 'standard' text in the area for an undergraduate course:

J. G. Proakis and M. Salehi, Communication Systems Engineering, 2nd edition, Prentice Hall, 2001.

Students didn't like that I had so many supplemental readings. This year's text covers primarily digital communications, and does it in depth. I find it to be very well-written. However, there are few options in this area. I will provide additional readings solely to provide another presentation style or fit another learning style. Unless specified, these are optional.

Lecture Notes I type my lecture notes. I have taught ECE 5520 previously and have my lecture notes from past years. I'm constantly updating my notes, even up to the lecture time. These can be available to you at lecture, and/or after lecture online. However, you must accept these conditions:

1. Taking notes is important: I find most learning requires some writing on your part, not just watching. Please take your own notes.

2. My written notes do not and cannot reflect everything said during lecture: I answer questions and understand your perspective better after hearing your questions, and I try to tailor my approach during the lecture. If I didn't, you could just watch a recording!

2 Introduction

A digital communication system conveys discrete-time, discrete-valued information across a physical channel. Information sources might include audio, video, text, or data. They might be continuous-time (analog) signals (audio, images) and even 1-D or 2-D. Or, they may already be digital (discrete-time, discrete-valued). Our object is to convey the signals or data to another place (or time) with as faithful a representation as possible.

In this section we talk about what we'll cover in this class, and more importantly, what we won't cover.

2.1 "Executive Summary"

Here is the one-sentence version: We will study how to efficiently encode digital data on a noisy, bandwidth-limited analog medium, so that decoding the data (i.e., reception) at a receiver is simple, efficient, and high-fidelity. The key points stuffed into that one sentence are:

1. Digital information on an analog medium: We can send waveforms, i.e., real-valued, continuous-time functions, on the channel (medium). These waveforms are from a discrete set of possible waveforms. What set of waveforms should we use? Why?

2. Decoding the data: When receiving a signal (a function) in noise, none of the original waveforms will match exactly. How do you make a decision about which waveform was sent?

3. What makes a receiver difficult to realize? What choices of waveforms make a receiver simpler to implement? What techniques are used in a receiver to compensate?

4. Efficiency, Bandwidth, and Fidelity: Fidelity is the correctness of the received data (i.e., the opposite of error rate). What is the tradeoff between energy, bandwidth, and fidelity? We all want high fidelity, and low energy consumption and bandwidth usage (the costs of our communication system).

You can look at this like an impedance matching problem from circuits. For power efficiency, you want the source impedance to match the destination impedance. In digital communications, this means that we want our waveform choices to match the channel and receiver, to maximize the efficiency of the communication system.

2.2 Why not Analog?

The previous text used for this course, by Proakis & Salehi, has an extensive analysis and study of analog communication systems, such as radio and television broadcasting (Chapter 3). In the recent past, this course would study both analog and digital communication systems. Analog systems still exist and will continue to exist; however, development of new systems will almost certainly be of digital communication systems. Why?

• Fidelity
• Energy: transmit power, and device power consumption
• Bandwidth efficiency: due to coding gains
• Moore's Law is decreasing device costs for digital hardware
• Increasing need for digital information
• More powerful information security

2.3 Networking Stack

In this course, we study digital communications from bits to bits. That is, we study how to take ones and zeros from a transmitter, send them through a medium, and then (hopefully) correctly identify the same ones and zeros at the receiver. There's a lot more than this to the digital communication systems which you use on a daily basis (e.g., iPhone, WiFi, Bluetooth, wireless keyboard, wireless car key). To manage complexity, we (engineers) don't try to build a system to do everything all at once. We typically start with an application, and we build a layered network to handle the application. The 7-layer OSI stack, which you would study in a CS computer networking class, is as follows:

• Application
• Presentation (*)
• Session (*)
• Transport
• Network
• Link Layer
• Physical (PHY) Layer

(Note that there is also a 5-layer model in which the * layers are considered as part of the application layer.) ECE 5520 is part of the bottom layer, the physical layer. In fact, the physical layer has much more detail. It is primarily divided into:

• Multiple Access Control (MAC)
• Encoding
• Channel / Medium

We can control the MAC and the encoding chosen for a digital communication, but we largely can't change the properties of the medium (although there are exceptions).

2.4 Channels and Media

We can choose from a few media. Here are some media:

• EM spectra: (anything above 0 Hz) radio, microwave, mm-wave bands, light
• Acoustic: ultrasound
• Transmission lines, waveguides, optical fiber, coaxial cable, wire pairs, ...
• Disk (data storage applications)

2.5 Encoding / Decoding Block Diagram

[Figure 1: Block diagram of a single-user digital communication system, including (top) transmitter, (middle) channel, and (bottom) receiver. The transmitter chain is: information source, source encoder, channel encoder, modulator, up-conversion. The receiver chain is: down-conversion, demodulator, channel decoder, source decoder, information output. Other components: digital-to-analog converter, analog-to-digital converter, synchronization.]

Notes:

• Information source comes from higher networking layers. It may be continuous or packetized.

• Source encoding: Finding a compact digital representation for the data source. Includes sampling of continuous-time signals, and quantization of continuous-valued signals. Also includes compression of those sources (lossy, or lossless). What are some compression methods that you're familiar with? We present an introduction to source encoding at the end of this course.

• Channel encoding refers to redundancy added to the signal such that any bit errors can be corrected. A channel decoder, because of the redundancy, can correct some bit errors. We will not study channel encoding, but it is a topic in (ECE 6520) Coding Theory.

Why do we do both source encoding (which compresses the signal as much as possible) and also channel encoding (which adds redundancy to the signal)? Because of Shannon's source-channel coding separation theorem. He showed that (given enough time) we can consider them separately without additional loss. And separation, like layering, reduces complexity to the designer.

• Modulation refers to the digital-to-analog conversion which produces a continuous-time signal that can be sent on the physical channel. It is analogous to impedance matching: just as impedance matching ensured optimal power transfer, proper matching of a modulation to a channel allows optimal information transfer. Modulation and demodulation will be the main focus of this course.

• Channels: See above for examples. Typical models are additive noise, or a linear filtering channel.

2.6 Channels

A channel can typically be modeled as a linear filter with the addition of noise.

[Figure 2: Linear filter and additive noise channel model: the transmitted signal passes through an LTI filter h(t), and noise is added to produce the received signal.]

The linear filtering of the channel results from the physics and EM of the medium. For example, attenuation in telephone wires varies by frequency. Wideband wireless channels display multipath, due to multiple time-delayed reflections, diffractions, and scattering of the signal off of the objects in the environment. Narrowband wireless channels experience fading that varies quickly as a function of frequency. All of these can be modeled as linear filters. The filter may be constant, or time-invariant, if the medium, and the TX and RX, do not move or change. However, for mobile radio, the channel may change very quickly over time. Even for stationary TX and RX, movement of cars, people, trees, etc. in the environment may change the channel slowly over time.

The noise comes from a variety of sources, but predominantly:

1. Thermal background noise: Due to the physics of living above 0 Kelvin. Well modeled as Gaussian, and white; thus it is referred to as additive white Gaussian noise (AWGN).

2. Interference from other transmitted signals. These other transmitters, whose signals we cannot completely cancel, we lump into the 'interference' category. These may result in a non-Gaussian noise distribution, or a non-white noise spectral density.

In this course, we will focus primarily on the AWGN channel, but we will mention what variations exist for particular channels, and how they are addressed.
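As a concrete (toy) version of the Figure 2 model, the sketch below filters a transmitted sequence with a fixed LTI channel and adds white Gaussian noise. The two-tap channel and noise level here are invented purely for illustration, not taken from the notes:

```python
import numpy as np

def channel(x, h, noise_std, rng):
    """LTI filter (the medium) followed by additive white Gaussian noise."""
    filtered = np.convolve(x, h)                        # linear filtering by the channel
    noise = rng.normal(0.0, noise_std, filtered.shape)  # AWGN
    return filtered + noise

rng = np.random.default_rng(0)
x = np.array([1.0, -1.0, 1.0, 1.0, -1.0])  # a short antipodal symbol sequence
h = np.array([1.0, 0.5])                   # hypothetical two-tap multipath channel
y = channel(x, h, noise_std=0.1, rng=rng)
```

With `noise_std = 0` and a single unity tap, the channel passes the signal through unchanged; the multipath tap smears each symbol into its neighbor, which previews the intersymbol interference discussed later in the course.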
2.7 Topic: Random Processes

Random things in a communication system:

• Noise in the channel
• Signal (bits)
• Channel filtering, attenuation, and fading
• Device frequency, phase, and timing offsets

These random signals often pass through LTI filters, and are sampled. We want to build the best receiver possible despite the impediments. Optimal receiver design is something that we study using probability theory.

Noise and attenuation of the channel will cause bit errors to be made by the demodulator and even the channel decoder. This may be tolerated, or a higher-layer networking protocol (e.g., TCP/IP) can determine that an error occurred and then re-request the data. We have to tolerate errors.

2.8 Topic: Frequency Domain Representations

To fit as many signals as possible onto a channel, we often split the signals by frequency. The concept of sharing a channel is called multiple access (MA). Separating signals by frequency band is called frequency-division multiple access (FDMA). For the wireless channel, this is controlled by the FCC (in the US) and called spectrum allocation. There is a tradeoff between frequency requirements and time requirements, which will be a major part of this course. The Fourier transform of our modulated, transmitted signal is used to show that it meets the spectrum allocation limits of the FCC.

2.9 Topic: Orthogonality and Signal Spaces

To show that signals sharing the same channel don't interfere with each other, we need to show that they are orthogonal. Signals in different frequency bands are orthogonal. This means, in short, that a receiver can uniquely separate them. We can also employ multiple orthogonal signals in a single transmitter and receiver, in order to provide multiple independent means (dimensions) on which to modulate information. We will study orthogonal signals, and learn an algorithm to take an arbitrary set of signals and output a set of orthogonal signals with which to represent them. We'll use signal spaces to show graphically the results, as in the example in Figure 3.

[Figure 3: Example signal space diagram for M-ary Phase Shift Keying, for (a) M = 8 and (b) M = 16. Each point is a vector which can be used to send a 3- or 4-bit sequence.]
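The frequency-band orthogonality claimed above can be illustrated numerically: two sinusoids at different harmonic frequencies have zero inner product over one period, while a sinusoid's inner product with itself gives its energy. A minimal sketch:

```python
import numpy as np

# Two sinusoids at distinct harmonic frequencies are orthogonal over one period:
# the integral of their product is zero, so a receiver can separate what was
# modulated onto each one.
T = 1.0
N = 10000
t = np.arange(N) / N * T          # uniform samples of one period
dt = T / N
s1 = np.sin(2 * np.pi * 1 * t / T)
s2 = np.sin(2 * np.pi * 2 * t / T)

inner_12 = np.sum(s1 * s2) * dt   # approx. 0  (orthogonal)
inner_11 = np.sum(s1 * s1) * dt   # approx. T/2 (energy of s1 over one period)
```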
2.10 Related classes

1. Prerequisites: (ECE 5510) Random Processes; (ECE 3500) Signals and Systems
2. Signal Processing: (ECE 5530) Digital Signal Processing
3. Electromagnetics: EM Waves, (ECE 5320-5321) Microwave Engineering, (ECE 5324) Antenna Theory, (ECE 5411) Fiber-optic Systems
4. Breadth: (ECE 5325) Wireless Communications
5. Devices and Circuits: (ECE 3700) Fundamentals of Digital System Design, (ECE 5720) Analog IC Design
6. Networking: (ECE 5780) Embedded System Design, (CS 5480) Computer Networks
7. Advanced Classes: (ECE 6590) Software Radio, (ECE 6520) Information Theory and Coding, (ECE 6540) Estimation Theory

Lecture 2

Today: (1) Power, Energy, dB (2) Time-domain concepts (3) Bandwidth, Fourier Transform

Two of the biggest limitations in communications systems are (1) energy / power, and (2) bandwidth. Today's lecture provides some tools to deal with power and energy, and starts the review of tools to analyze frequency content and bandwidth.

3 Power and Energy

Recall that energy is power times time. Use the units: energy is measured in Joules (J); power is measured in Watts (W), which is the same as Joules/second (J/sec). Also, recall that our standard in signals and systems is to define our signals, such as x(t), as voltage signals (V). When we want to know the power of a signal, we assume it is being dissipated in a 1 Ohm resistor, so x(t)^2 is the power dissipated at time t (since power is equal to the voltage squared divided by the resistance).

A signal x(t) has energy defined as

    E = \int_{-\infty}^{\infty} x^2(t) \, dt

For some signals, E will be infinite because the signal is non-zero for an infinite duration of time (it is always on). These signals we call power signals, and we compute their power as

    P = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x^2(t) \, dt

The signal with finite energy is called an energy signal.
3.1 Discrete-Time Signals

In this book, we refer to discrete samples of the sampled signal x as x(n). You may be more familiar with the x[n] notation. But Matlab uses parentheses also, so we'll follow the Rice text notation. In particular, whenever you see a function of t (or perhaps τ) it is a continuous-time function; whenever you see a function of n (or k, l, m) it is a discrete-time function. I'm sorry this is not more obvious in the notation.

In either case, for discrete-time signals, energy and power are defined as:

    E = \sum_{n=-\infty}^{\infty} x^2(n)          (1)

    P = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x^2(n)          (2)

3.2 Decibel Notation

We often use a decibel (dB) scale for power; we use the unit decibel, 10 log10(·), which is then abbreviated as dB in the SI system. If P_lin is the power in Watts, then

    [P]_{dBW} = 10 \log_{10} P_{lin}

Note: why is the capital B used? Either the lowercase 'b' in the SI system is reserved for bits, or it referred to the name 'Bell'; so when the 'bel' was first proposed as log10(·), it was capitalized.

Decibels are more general — they can apply to other unitless quantities as well, such as a gain (loss) L(f) through a filter H(f):

    [L(f)]_{dB} = 10 \log_{10} |H(f)|^2          (3)

Note that (3) could also be written as:

    [L(f)]_{dB} = 20 \log_{10} |H(f)|          (4)

Be careful with your use of 10 vs. 20 in the dB formula. Only use 20 as the multiplier if you are converting from voltage to power; our standard is to consider power gains and losses, not voltage gains and losses.

Remember these two dB numbers:

• 3 dB: This means the number is double in linear terms.
• 10 dB: This means the number is ten times in linear terms.

And maybe this one:

• 1 dB: This means the number is a little over 25% more (multiply by 5/4) in linear terms.

With these three numbers, you can quickly convert losses or gains between linear and dB units without a calculator. Essentially, just convert any dB number into a sum of multiples of 10, 3, and 1.

Example: Convert dB to linear values:
1. 30 dBW
2. 33 dBm
3. 20 dB
4. 4 dB

Example: Convert linear values to dB:

1. 0.2 W
2. 40 mW

Example: Convert power relationships to dB:
Convert the expression to one which involves only dB terms.

1. P_{r,lin} = P_{t,lin} G_{t,lin} G_{r,lin} \frac{\lambda^2}{(4\pi d)^2}, where P_{t,lin} is the transmit power (W) and P_{r,lin} is the received power (W), G_{t,lin} and G_{r,lin} are the linear gains in the antennas, λ is the wavelength (m), and d is the path length (m). This is the Friis free space path loss formula.

2. P_{y,lin} = 100 P_{x,lin}

3. P_{o,lin} = P_{t,lin} G_{connector,lin} L_{cable,lin}^{-d}, where P_{o,lin} is the received power in a fiber-optic link, d is the cable length (typically in units of km), L_{cable,lin} is a loss in a 1 km cable, and G_{connector,lin} is the gain in any connectors.

These last two are what we will need in Section 6.4, when we discuss link budgets.

4 Time-Domain Concept Review

4.1 Periodicity

Def'n: Periodic (continuous-time)
A signal x(t) is periodic if x(t) = x(t + T_0) for some constant T_0 ≠ 0, for all t ∈ R. The smallest such constant T_0 > 0 is the period. If a signal is not periodic it is aperiodic.

Periodic signals have Fourier series representations.

Def'n: Periodic (discrete-time)
A DT signal x(n) is periodic if x(n) = x(n + N_0) for some integer N_0 ≠ 0, for all integers n. The smallest positive integer N_0 is the period.
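Stepping back to the decibel examples of Section 3.2: the conversions are easy to verify numerically. A small sketch (the helper names are mine, not from the text):

```python
import math

def to_db(power_ratio):
    """Linear power ratio -> dB (10 log10; use 20 log10 only for voltage ratios)."""
    return 10.0 * math.log10(power_ratio)

def to_linear(db_value):
    """dB -> linear power ratio."""
    return 10.0 ** (db_value / 10.0)

p_30dBW = to_linear(30.0)        # 30 dBW -> 1000 W
p_33dBm = to_linear(33.0) / 1e3  # 33 dBm -> about 2 W (dBm is relative to 1 mW)
g_20dB  = to_linear(20.0)        # 20 dB  -> a factor of 100
p_02W   = to_db(0.2)             # 0.2 W  -> about -7 dBW
p_40mW  = to_db(40.0)            # 40 mW  -> about 16 dBm
```

Note how the mnemonic rules come out: 33 dBm = 30 + 3 dBm, i.e., 1 W doubled; and -7 dBW = -10 + 3 dBW, i.e., one tenth of a Watt, doubled.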
4.2 Impulse Functions

Def'n: Impulse Function
The (Dirac) impulse function δ(t) is the function which makes

    \int_{-\infty}^{\infty} x(t) \delta(t) \, dt = x(0)          (5)

true for any function x(t) which is continuous at t = 0.

We are defining a function by its most important property, the 'sifting property'. Is there another definition which is more familiar?

Solution:

    \delta(t) = \lim_{T \to 0} \begin{cases} \frac{1}{2T}, & -T \le t \le T \\ 0, & \text{o.w.} \end{cases}

You can visualize δ(t) here as an infinitely high, infinitesimally wide pulse at the origin, with area one. This is why it 'pulls out' the value of x(t) in the integral in (5).

Other properties of the impulse function:

• Time scaling,
• Symmetry,
• Sifting at arbitrary time t_0.

The continuous-time unit step function is

    u(t) = \begin{cases} 1, & t \ge 0 \\ 0, & \text{o.w.} \end{cases}

Example: Sifting Property
What is \int_{-\infty}^{\infty} \frac{\sin(\pi t)}{\pi t} \delta(1 - t) \, dt?

The discrete-time impulse function (also called the Kronecker delta or δ_K) is defined as:

    \delta(n) = \begin{cases} 1, & n = 0 \\ 0, & \text{o.w.} \end{cases}

(There is no need to get complicated with the math; this is well defined.) Also,

    u(n) = \begin{cases} 1, & n \ge 0 \\ 0, & \text{o.w.} \end{cases}
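The limiting definition above can be sanity-checked numerically: replace δ(t) by a unit-area pulse of half-width T, and the integral approaches x(0) as T shrinks. A sketch (the helper `sift` is mine, not from the notes):

```python
import numpy as np

# Approximate the Dirac delta by a unit-area pulse of half-width T and height
# 1/(2T); the sifting integral then approaches x(0) as T -> 0.
def sift(x, T, dt=1e-7):
    t = np.arange(-T, T, dt)
    height = 1.0 / (2.0 * T)      # pulse area = height * 2T = 1
    return np.sum(x(t) * height) * dt

val = sift(np.cos, T=1e-3)        # cos is continuous at 0, and cos(0) = 1
```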
5 Bandwidth

Bandwidth is another critical resource for a digital communications system; we have various definitions to quantify it. In short, it isn't easy to describe a signal in the frequency domain with a single number. And, in the end, a system will be designed to meet a spectral mask required by the FCC or system standard.

Table 1: Frequency Transforms

• Continuous-time, periodic: Fourier Series, x(t) ↔ a_k.
• Continuous-time, aperiodic: Laplace Transform, x(t) ↔ X(s); Fourier Transform, x(t) ↔ X(jω), with
    X(j\omega) = \int_{t=-\infty}^{\infty} x(t) e^{-j\omega t} dt, \quad x(t) = \frac{1}{2\pi} \int_{\omega=-\infty}^{\infty} X(j\omega) e^{j\omega t} d\omega
• Discrete-time, periodic: Discrete Fourier Transform (DFT), x(n) ↔ X[k], with
    X[k] = \sum_{n=0}^{N-1} x(n) e^{-j\frac{2\pi}{N} kn}, \quad x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{j\frac{2\pi}{N} nk}
• Discrete-time, aperiodic: z-Transform, x(n) ↔ X(z); Discrete-Time Fourier Transform (DTFT), x(n) ↔ X(e^{jΩ}), with
    X(e^{j\Omega}) = \sum_{n=-\infty}^{\infty} x(n) e^{-j\Omega n}, \quad x(n) = \frac{1}{2\pi} \int_{\Omega=-\pi}^{\pi} X(e^{j\Omega}) e^{j\Omega n} d\Omega

Intuitively, bandwidth is the maximum extent of our signal's frequency domain characterization, call it X(f). A baseband signal's absolute bandwidth is often defined as the W such that X(f) = 0 for all f except for the range −W ≤ f ≤ W. Other definitions for bandwidth are:

• 3-dB bandwidth: B_{3dB} is the value of f such that |X(f)|^2 = |X(0)|^2 / 2.

• 90% bandwidth: B_{90%} is the value which captures 90% of the energy in the signal:

    \int_{-B_{90\%}}^{B_{90\%}} |X(f)|^2 df = 0.90 \int_{-\infty}^{\infty} |X(f)|^2 df

As a motivating example, I mention the square-root raised cosine (SRRC) pulse, which has the following desirable Fourier transform:

    H_{RRC}(f) = \begin{cases}
        \sqrt{T_s}, & 0 \le |f| \le \frac{1-\alpha}{2T_s} \\
        \sqrt{\frac{T_s}{2}\left[1 + \cos\left(\frac{\pi T_s}{\alpha}\left(|f| - \frac{1-\alpha}{2T_s}\right)\right)\right]}, & \frac{1-\alpha}{2T_s} \le |f| \le \frac{1+\alpha}{2T_s} \\
        0, & \text{o.w.}
    \end{cases}          (6)

where α is a parameter called the "rolloff factor". We can actually analyze this using the properties of the Fourier transform and many of the standard transforms you'll find in a Fourier transform table. The SRRC and other pulse shapes are discussed in Appendix A, and we will go into more detail later on. The purpose so far is to motivate practicing up on frequency transforms.
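Equation (6) is straightforward to code up directly; a sketch (the parameter defaults are my own choices), which also makes the absolute bandwidth (1 + α)/(2Ts) visible:

```python
import math

def h_rrc(f, Ts=1.0, alpha=0.5):
    """Square-root raised cosine frequency response, per the piecewise
    definition: flat out to (1-alpha)/(2Ts), a cosine taper out to
    (1+alpha)/(2Ts), and zero beyond (the absolute bandwidth)."""
    af = abs(f)
    f1 = (1.0 - alpha) / (2.0 * Ts)
    f2 = (1.0 + alpha) / (2.0 * Ts)
    if af <= f1:
        return math.sqrt(Ts)
    if af <= f2:
        return math.sqrt(Ts / 2.0 * (1.0 + math.cos(math.pi * Ts / alpha * (af - f1))))
    return 0.0
```

With Ts = 1 and α = 0.5, the response is √Ts = 1 at DC, rolls through √(Ts/2) at f = 1/(2Ts), and reaches zero at f = 0.75; larger α buys a gentler rolloff at the cost of more bandwidth.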
5.1
Continuoustime Frequency Transforms
Notes about continuoustime frequency transforms: 1. You are probably most familiar with the Laplace Transform. To convert it to the Fourier tranform, we replace s with jω, where ω is the radial frequency, with units radians per second (rad/s).
ECE 5520 Fall 2009
18
2. You may prefer the radial frequency representation, but also feel free to use the rotational frequency f (which has units of cycles per second, or Hz). Frequency in Hz is more standard for communications; you should use it for intuition. In this case, just substitute ω = 2πf. You could write X(j2πf) as the notation for this, but typically you'll see it abbreviated as X(f). Note that the definition of the Fourier transform in the f domain loses the 1/(2π) factor in the inverse Fourier transform definition:

X(j2πf) = ∫_{t=−∞}^{∞} x(t) e^{−j2πft} dt
x(t) = ∫_{f=−∞}^{∞} X(j2πf) e^{j2πft} df    (7)
3. The Fourier series is limited to purely periodic signals. Both the Laplace and Fourier transforms are not limited to periodic signals.
4. Note that e^{jα} = cos(α) + j sin(α). See Table 2.4.4 in the Rice book.

Example: Square Wave
Given a rectangular pulse x(t) = rect(t/Ts),
x(t) = { 1, −Ts/2 < t ≤ Ts/2;  0, o.w. }
What is the Fourier transform X(f)? Calculate it both from the definition and from a table.

Solution:
Method 1: From the definition:
X(jω) = ∫_{t=−Ts/2}^{Ts/2} e^{−jωt} dt
      = (1/(−jω)) e^{−jωt} evaluated from t = −Ts/2 to Ts/2
      = (1/(−jω)) (e^{−jωTs/2} − e^{jωTs/2})
      = (2/ω) sin(ωTs/2)
      = Ts sin(ωTs/2) / (ωTs/2)
This uses the fact that (−1/(2j))(e^{−jα} − e^{jα}) = sin(α). While it is sometimes convenient to replace sin(πx)/(πx) with sinc(x), it is confusing because sinc(x) is sometimes defined as sin(πx)/(πx) and sometimes defined as sin(x)/x. No standard definition for 'sinc' exists! Rather than make a mistake because of this, the Rice book always writes out the expression fully. I will try to follow suit.

Method 2: From the tables and properties: write x(t) = g(2t), where
g(t) = { 1, |t| < Ts;  0, o.w. }    (8)
From the table, G(jω) = 2Ts sin(ωTs)/(ωTs). From the scaling property, X(jω) = (1/2) G(jω/2). So, again,
X(jω) = Ts sin(ωTs/2) / (ωTs/2)
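The two methods agree, and the result is easy to cross-check numerically. Here is a sketch (Python/NumPy, not part of the original notes; the pulse width and test frequency are arbitrary) that integrates e^{−jωt} over [−Ts/2, Ts/2] and compares against Ts·sin(ωTs/2)/(ωTs/2):

```python
import numpy as np

Ts, w = 2.0, 1.3                        # arbitrary pulse width (s) and radian frequency
t = np.linspace(-Ts / 2, Ts / 2, 200001)
dt = t[1] - t[0]
vals = np.exp(-1j * w * t)
# trapezoid rule; by symmetry the imaginary part should integrate to ~0
numeric = dt * (vals[0] / 2 + vals[1:-1].sum() + vals[-1] / 2)
analytic = Ts * np.sin(w * Ts / 2) / (w * Ts / 2)
print(numeric, analytic)
```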
See Figure 4(a).

Figure 4: (a) Fourier transform X(j2πf) = Ts sin(πfTs)/(πfTs) of the rect pulse, and (b) power vs. frequency, 20 log10(|X(j2πf)|/Ts), in dB.

5.1.1 Fourier Transform Properties

See Table 2.4.3 in the Rice book. Assume that F{x(t)} = X(jω).
1. Duality property:
   x(jω) = F{X(−t)}
   x(−jω) = F{X(t)}
   (Confusing. It says that you can go backwards on the Fourier transform; just remember to flip the result around the origin.)
   Question: What if Y(jω) were a rect function? What would the inverse Fourier transform y(t) be?
2. Time shift property: F{x(t − t0)} = e^{−jωt0} X(jω).
3. Scaling property: for any real a ≠ 0,
   F{x(at)} = (1/|a|) X(jω/a)
4. Convolution property: if, additionally, y(t) has Fourier transform Y(jω), then
   F{x(t) ⋆ y(t)} = X(jω) · Y(jω)
5. Modulation property:
   F{x(t) cos(ω0 t)} = (1/2) X(ω − ω0) + (1/2) X(ω + ω0)
6. Parseval's theorem: The energy calculated in the frequency domain is equal to the energy calculated in the time domain:
   ∫_{t=−∞}^{∞} |x(t)|² dt = ∫_{f=−∞}^{∞} |X(f)|² df = (1/2π) ∫_{ω=−∞}^{∞} |X(jω)|² dω
   So do whichever one is easiest! Or, check your answer by doing both.

5.1.2 Linear Time Invariant (LTI) Filters

If a (deterministic) signal x(t) is input to an LTI filter with impulse response h(t), the output signal is
y(t) = h(t) ⋆ x(t)
Using the convolution property above,
Y(jω) = X(jω) · H(jω)

5.1.3 Examples

Example: Applying FT Properties
If w(t) has the Fourier transform
W(jω) = jω / (1 + jω)
find X(jω) for the following waveforms:
1. x(t) = w(2t + 2)
2. x(t) = e^{−jt} w(t − 1)
3. x(t) = 2 ∂w(t)/∂t
4. x(t) = w(1 − t)
Solution: To be worked out in class.

Lecture 3

Today: (1) Bandpass Signals (2) Sampling
Solution: From the Examples at the end of Lecture 2:
1. Let x(t) = z(2t) where z(t) = w(t + 2). Then Z(jω) = e^{j2ω} W(jω), and X(jω) = (1/2) Z(jω/2). So
   X(jω) = (1/2) e^{jω} (jω/2) / (1 + jω/2)
   Alternatively, let x(t) = r(t + 1) where r(t) = w(2t). Then R(jω) = (1/2) W(jω/2), and X(jω) = e^{jω} R(jω). Again,
   X(jω) = (1/2) e^{jω} (jω/2) / (1 + jω/2)
2. Let x(t) = e^{−jt} z(t) where z(t) = w(t − 1). Then Z(jω) = e^{−jω} W(jω), and X(jω) = Z(j(ω + 1)). So
   X(jω) = e^{−j(ω+1)} j(ω + 1) / (1 + j(ω + 1))
3. X(jω) = 2jω W(jω) = 2jω · jω/(1 + jω) = −2ω²/(1 + jω)
4. Let x(t) = y(−t), where y(t) = w(1 + t). Then X(jω) = Y(−jω), where Y(jω) = e^{jω} W(jω). So
   X(jω) = e^{−jω} (−jω) / (1 − jω)

6 Bandpass Signals

Def'n: Bandpass Signal
A bandpass signal x(t) has Fourier transform X(jω) = 0 for all ω such that |ω ± ωc| > W/2, where W/2 < ωc. Equivalently, a bandpass signal x(t) has Fourier transform X(f) = 0 for all f such that |f ± fc| > B/2, where B/2 < fc. This is shown in Fig. 5.

Figure 5: Bandpass signal with center frequency fc and bandwidth B.

Realistically, X(jω) will not be exactly zero outside of the bandwidth; there will be some 'sidelobes' out of the main band.

6.1 Upconversion

We can take a baseband signal xC(t) and 'upconvert' it to be a bandpass signal by multiplying with a cosine at the desired center frequency, as shown in Fig. 6.
Figure 6: Modulator block diagram.

Example: Modulated Square Wave
We dealt last time with the square wave x(t) = rect(t/Ts). This time, we will look at:
x(t) = rect(t/Ts) cos(ωc t)
where ωc is the center frequency in radians/sec (fc = ωc/(2π) is the center frequency in Hz). Note that I use "rect" as follows:
rect(t) = { 1, −1/2 < t ≤ 1/2;  0, o.w. }
What is X(jω)?
Solution:
X(jω) = (1/2π) F{rect(t/Ts)} ⋆ F{cos(ωc t)}
X(jω) = (1/2π) [Ts sin(ωTs/2)/(ωTs/2)] ⋆ π[δ(ω − ωc) + δ(ω + ωc)]
X(jω) = (Ts/2) sin[(ω − ωc)Ts/2]/[(ω − ωc)Ts/2] + (Ts/2) sin[(ω + ωc)Ts/2]/[(ω + ωc)Ts/2]
The plot of X(j2πf) is shown in Fig. 7.

Figure 7: Plot of X(j2πf) for the modulated square wave example.

6.2 Downconversion of Bandpass Signals

How will we get back the desired baseband signal xC(t) at the receiver?
1. Multiply (again) by the carrier 2 cos(ωc t). This is shown in Fig. 8.
2. Input the result to a lowpass filter (LPF) with cutoff frequency ωc.
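These two steps can be sketched numerically. The following is an illustrative simulation (Python/NumPy, not from the notes; the carrier frequency, window length, and FFT-based ideal LPF are all arbitrary choices) of upconversion followed by the multiply-and-filter receiver:

```python
import numpy as np

fs, fc = 1000.0, 50.0                    # sample rate and carrier frequency (Hz)
t = np.arange(1000) / fs                 # a 1-second window
xc = np.cos(2 * np.pi * 2 * t)           # baseband message: 2 Hz cosine
x = xc * np.cos(2 * np.pi * fc * t)      # upconversion: multiply by the carrier

x1 = x * 2 * np.cos(2 * np.pi * fc * t)  # step 1: multiply (again) by 2 cos(wc t)

# step 2: an ideal LPF, implemented here by zeroing FFT bins above the cutoff.
X1 = np.fft.fft(x1)
freqs = np.fft.fftfreq(len(x1), 1 / fs)
X1[np.abs(freqs) > fc / 2] = 0.0         # keep only the baseband copy
x2 = np.fft.ifft(X1).real

print(np.max(np.abs(x2 - xc)))           # recovered signal matches the message
```

The double-frequency components of x1 land at 2fc ± 2 Hz and are removed by the filter, leaving exactly the baseband message.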
Figure 8: Demodulator block diagram.

Assume the general modulated signal is
x(t) = xC(t) cos(ωc t)
X(jω) = (1/2) XC(j(ω − ωc)) + (1/2) XC(j(ω + ωc))
What happens after the multiplication with the carrier in the receiver?
Solution:
X1(jω) = (1/2π) X(jω) ⋆ 2π[δ(ω − ωc) + δ(ω + ωc)]
X1(jω) = (1/2) XC(j(ω − 2ωc)) + (1/2) XC(jω) + (1/2) XC(jω) + (1/2) XC(j(ω + 2ωc))
There are components at 2ωc and −2ωc which will be canceled by the LPF. (Does the LPF need to be ideal?) After the LPF,
X2(jω) = XC(jω)
so x2(t) = xC(t).

7 Sampling

A common statement of the Nyquist sampling theorem is that a signal can be sampled at twice its bandwidth. But the theorem really has to do with signal reconstruction from a sampled signal.

Theorem: (Nyquist Sampling Theorem.) Let xc(t) be a baseband, continuous signal with bandwidth B (in Hz), i.e., Xc(jω) = 0 for all |ω| ≥ 2πB. Let xc(t) be sampled at multiples of T, where 1/T ≥ 2B, to yield the sequence {xc(nT)}_{n=−∞}^{∞}. Then
xc(t) = 2BT Σ_{n=−∞}^{∞} xc(nT) sin(2πB(t − nT)) / (2πB(t − nT))    (9)
Proof: Not covered.

Notes:
• This is an interpolation procedure.
• Given {xn}_{n=−∞}^{∞}, how would you find the maximum of x(t)?
• This is only precise when X(jω) = 0 for all |ω| ≥ 2πB.
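The interpolation formula (9) can be exercised directly. Below is a sketch (Python/NumPy; the bandwidth, test signal, and truncation length of the infinite sum are arbitrary choices) that samples a 3 Hz cosine at 1/T = 2B = 10 Hz and reconstructs it between sample instants:

```python
import numpy as np

B = 5.0                       # assumed bandwidth (Hz); a 3 Hz cosine is within it
T = 1.0 / (2 * B)             # sample at the Nyquist rate, so 2BT = 1
n = np.arange(-2000, 2001)    # truncate the infinite sum in (9)
x_n = np.cos(2 * np.pi * 3 * n * T)

def reconstruct(t):
    # x(t) = 2BT * sum_n x(nT) sin(2*pi*B*(t - nT)) / (2*pi*B*(t - nT))
    arg = 2 * np.pi * B * (t - n * T)
    kernel = np.where(arg == 0, 1.0,
                      np.sin(arg) / np.where(arg == 0, 1.0, arg))
    return 2 * B * T * np.sum(x_n * kernel)

t_test = np.array([0.013, 0.07, -0.24])   # off-grid times near the center
x_hat = np.array([reconstruct(t) for t in t_test])
print(x_hat, np.cos(2 * np.pi * 3 * t_test))
```

The small residual error comes only from truncating the sum, which in practice is why finite-length interpolation is never quite perfect.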
7.1 Aliasing Due To Sampling

Essentially, sampling is the multiplication of an impulse train (at period T) with the desired signal x(t):
xsa(t) = x(t) Σ_{n=−∞}^{∞} δ(t − nT)
xsa(t) = Σ_{n=−∞}^{∞} x(nT) δ(t − nT)
What is the Fourier transform of xsa(t)?
Solution: In the frequency domain, this is a convolution:
Xsa(jω) = (1/2π) X(jω) ⋆ (2π/T) Σ_{n=−∞}^{∞} δ(ω − 2πn/T)
        = (1/T) Σ_{n=−∞}^{∞} X(j(ω − 2πn/T))   for all ω    (10)
        = (1/T) X(jω)   for |ω| < 2πB
This is shown graphically in the Rice book in Figure 2.12. The Fourier transform of the sampled signal is many copies of X(jω), strung at integer multiples of 2π/T, as shown in Fig. 9.

Figure 9: The effect of sampling on the frequency spectrum, in terms of frequency f in Hz.

Example: Sinusoid sampled above and below the Nyquist rate
Consider two sinusoidal signals sampled at 1/T = 25 Hz:
x1(nT) = sin(2π5nT)
x2(nT) = sin(2π20nT)
What are the two frequencies of the sinusoids, and what is the Nyquist rate? Which of them does the Nyquist theorem apply to? Draw the spectra of the continuous signals x1(t) and x2(t), and indicate the spectra of the sampled signals. Figure 10 shows what happens when the Nyquist interpolation formula is applied to each signal (whether or not it is valid). What observations would you make about Figure 10(b), compared to Figure 10(a)?

Example: Square vs. round pulse shape
Consider the square pulse considered before, x1(t) = rect(t/Ts). Also consider a parabola pulse
Figure 10: Sampled (a) x1(nT) and (b) x2(nT) are interpolated (—) using the Nyquist interpolation formula.

(The parabola pulse doesn't really exist in the wild – I'm making it up for an example.)
x2(t) = { 1 − (2t/Ts)², −Ts/2 ≤ t ≤ Ts/2;  0, o.w. }
What happens to x1(t) and x2(t) when they are sampled at rate 1/T? In the Matlab code EgNyquistInterpolation.m we set Ts = 1/5 and T = 1/25. Then, the sampled pulses are interpolated using (9). Even though we've sampled at a pretty high rate, the reconstructed signals will not be perfect representations of the original, in particular for x1(t). See Figure 11.
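Returning to the sinusoid example in Section 7.1: the 20 Hz sinusoid sampled at 25 Hz produces exactly the samples of a −5 Hz sinusoid, which takes only a couple of lines to confirm (Python/NumPy sketch, not from the notes):

```python
import numpy as np

fs = 25.0                                  # sampling rate (Hz)
n = np.arange(200)
x2 = np.sin(2 * np.pi * 20 * n / fs)       # 20 Hz sinusoid, sampled above Nyquist
alias = -np.sin(2 * np.pi * 5 * n / fs)    # 20 Hz aliases to 20 - 25 = -5 Hz
print(np.max(np.abs(x2 - alias)))
```

This is why the interpolated x2(nT) in Figure 10(b) traces out a 5 Hz sinusoid (with a sign flip) rather than the original 20 Hz signal.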
7.2
Connection to DTFT
Recall (10). We calculated the Fourier transform of the product by doing the convolution in the frequency domain. Instead, we can calculate it directly from the formula.
Figure 11: Sampled (a) x1(nT) and (b) x2(nT) are interpolated (—) using the Nyquist interpolation formula.

Solution:
Xsa(jω) = ∫_{t=−∞}^{∞} Σ_{n=−∞}^{∞} x(nT) δ(t − nT) e^{−jωt} dt
        = Σ_{n=−∞}^{∞} x(nT) ∫_{t=−∞}^{∞} δ(t − nT) e^{−jωt} dt    (11)
        = Σ_{n=−∞}^{∞} x(nT) e^{−jωnT}

The discrete-time Fourier transform of xsa(t), which I denote DTFT{xsa(t)}, is:
DTFT{xsa(t)} = Xd(e^{jΩ}) = Σ_{n=−∞}^{∞} x(nT) e^{−jΩn}

Essentially, the difference between the DTFT and the Fourier transform of the sampled signal is the relationship
Ω = ωT = 2πfT
But this defines the relationship only for the Fourier transform of the sampled signal. How can we relate this to the FT of the continuous-time signal? A: Using (10). We have that Xd(e^{jΩ}) = Xsa(jΩ/T). Then, plugging into (10),
Xd(e^{jΩ}) = Xsa(jΩ/T) = (1/T) Σ_{n=−∞}^{∞} X(j(Ω − 2πn)/T)    (12)
If x(t) is sufficiently bandlimited, Xd(e^{jΩ}) ∝ X(jΩ/T) in the interval −π < Ω < π. This relationship holds for the Fourier transform of the original, continuous-time signal in −π < Ω < π if and only if the original signal x(t) is sufficiently bandlimited, so that the Nyquist sampling theorem applies.

Notes:
• Most things in the DTFT table in Table 2.8 (page 68) are analogous to continuous functions which are not bandlimited. So don't expect the last form to apply to any continuous function.
• The DTFT is periodic in Ω with period 2π.
• Don't be confused: The DTFT is a continuous function of Ω.
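The proportionality Xd(e^{jΩ}) ≈ (1/T) X(jΩ/T) can be checked on a Gaussian pulse, which is effectively bandlimited: x(t) = e^{−πt²} has X(f) = e^{−πf²}. (Python/NumPy sketch; the pulse, sample period, and test frequency are arbitrary choices, not from the notes.)

```python
import numpy as np

T = 0.05                          # sample period; 1/T = 20 Hz, far beyond the pulse energy
n = np.arange(-200, 201)
x_n = np.exp(-np.pi * (n * T) ** 2)

f = 1.0                           # test frequency (Hz); Omega = 2*pi*f*T
dtft = np.sum(x_n * np.exp(-2j * np.pi * f * n * T))
analytic = np.exp(-np.pi * f ** 2) / T    # (1/T) X(f)
print(dtft.real, analytic)
```

The aliased copies at f ± 1/T = f ± 20 Hz are numerically negligible for this pulse, so the two values agree to machine precision.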
Lecture 4 Today: (1) Example: Bandpass Sampling (2) Orthogonality / Signal Spaces I
7.3
Bandpass sampling
Bandpass sampling is the use of sampling to perform downconversion via frequency aliasing. We assume that the signal is truly a bandpass signal (it has no DC component). Then, we sample with rate 1/T and rely on aliasing to produce an image of the spectrum at a relatively low frequency (less than one-half the sampling rate).

Example: Bandpass Sampling
This is Example 2.6.2 (pp. 60-65) in the Rice book. In short, we have a bandpass signal in the frequency band 450 Hz to 460 Hz. The center frequency is 455 Hz. We want to sample the signal. We have two choices:
1. Sample at more than twice the maximum frequency.
2. Sample at more than twice the bandwidth.
The first gives a sampling frequency of more than 920 Hz. The latter gives a sampling frequency of more than 20 Hz. There is clearly a benefit to the second choice. However, traditionally, we would need to downconvert the signal to baseband in order to sample it at the lower rate. (See Lecture 3 notes.)
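As a numerical illustration of aliased downconversion (Python/NumPy sketch; 140 samples/sec is the rate used in the worked solution that follows): sampling a 455 Hz tone at 140 samples/sec yields exactly the samples of a 35 Hz tone, since 455 = 3·140 + 35.

```python
import numpy as np

fc, fs = 455.0, 140.0             # bandpass center frequency and sampling rate (Hz)
n = np.arange(300)
x = np.cos(2 * np.pi * fc * n / fs)
image = np.cos(2 * np.pi * 35.0 * n / fs)   # the alias lands at 35 Hz
print(np.max(np.abs(x - image)))

# discrete-time center frequency, in radians/sample
Oc = (2 * np.pi * fc / fs) % (2 * np.pi)
print(Oc, np.pi / 2)
```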
What is the sampled spectrum, and what is the bandwidth of the sampled signal in radians/sample, if the sampling rate is:
1. 1000 samples/sec?
2. 140 samples/sec?

Solution:
1. Sample at 1000 samples/sec: the center frequency of the sampled signal is Ωc = 2π(455/1000) = (455/500)π radians/sample. The bandwidth is 2π(10/1000) = π/50 radians/sample.
2. Sample at 140 samples/sec: copies of the entire frequency spectrum will appear in the frequency domain at multiples of 140 Hz (2π140 radians/sec). The discrete-time frequency is modulo 2π, so the center frequency of the sampled signal is:
Ωc = (2π × 455/140) mod 2π = (13π/2) mod 2π = π/2
The value π/2 radians/sample is equivalent to 35 cycles/second. The bandwidth is 2π(10/140) = π/7 radians/sample of sampled spectrum. The signal is very sparse – it occupies only π/7 out of the full 2π of sampled spectrum.

8 Orthogonality

From the first lecture, we talked about how digital communications is largely about using (and choosing some combination of) a discrete set of waveforms to convey information on the (continuous-time, real-valued) channel. At the receiver, the idea is to determine which waveforms were sent and in what quantities. This choice of waveforms is partially determined by bandwidth, which we have covered. It is also determined by orthogonality. Loosely, orthogonality of waveforms means that some combination of the waveforms can be uniquely separated at the receiver. This is the critical concept needed to understand digital receivers. To build a better framework for it, we'll start with something you're probably more familiar with: vector multiplication.

8.1 Inner Product of Vectors

We often use an idea called inner product, which you're familiar with from vectors (as the dot product). If we have two vectors x and y,
x = [x1, x2, ..., xn]^T,   y = [y1, y2, ..., yn]^T
then we can take their inner product as:
x^T y = Σ_i xi yi
Note the transpose T 'in-between' an inner product. The inner product is also denoted ⟨x, y⟩. Finally, the norm of a vector is the square root of the inner product of a vector with itself:
‖x‖ = √⟨x, x⟩
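In NumPy, the inner product and norm just defined are one-liners (an illustrative sketch with arbitrary vectors, not from the notes):

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([-4.0, 3.0])     # chosen to be orthogonal to x

print(x @ y)                  # inner product x^T y = sum_i x_i y_i -> 0
print(np.sqrt(x @ x))         # norm ||x|| = sqrt(<x, x>) -> 5
# Since <x, y> = ||x|| ||y|| cos(theta), a zero inner product means
# theta = 90 degrees: the vectors point in "cross-directions".
```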
A norm is always indicated with the ‖·‖ symbols outside of the vector.

Example: Example Inner Products
Let x = [1, 0]^T, y = [0, 1]^T, and z = [1, 1]^T. What are:
• ⟨x, y⟩?
• ⟨x, z⟩?
• ⟨y, z⟩?
Recall that the inner product between two vectors can also be written as x^T y = ‖x‖ ‖y‖ cos θ, where θ is the angle between vectors x and y. In words, the inner product is a measure of 'same-directionness': when positive, the two vectors are pointing in a similar direction; when zero, they're at cross-directions; when negative, they are pointing in somewhat opposite directions.

8.2 Inner Product of Functions

The inner product is not limited to vectors. It is also applied to the 'space' of square-integrable real functions, i.e., those functions x(t) for which
∫_{−∞}^{∞} x²(t) dt < ∞
The 'space' of square-integrable complex functions, i.e., those functions x(t) for which
∫_{−∞}^{∞} |x(t)|² dt < ∞
also has an inner product. These inner products and norms are:
⟨x(t), y(t)⟩ = ∫_{−∞}^{∞} x(t) y(t) dt    (real functions)
⟨x(t), y(t)⟩ = ∫_{−∞}^{∞} x(t) y*(t) dt   (complex functions)
‖x(t)‖² = ∫_{−∞}^{∞} x²(t) dt     (real functions)
‖x(t)‖² = ∫_{−∞}^{∞} |x(t)|² dt   (complex functions)

Example: Sine and Cosine
Let
x(t) = { cos(2πt), 0 < t ≤ 1;  0, o.w. }
y(t) = { sin(2πt), 0 < t ≤ 1;  0, o.w. }
What are ‖x(t)‖ and ‖y(t)‖? What is ⟨x(t), y(t)⟩?

Solution: Using cos² x = (1/2)(1 + cos 2x),
‖x(t)‖² = ∫_0^1 cos²(2πt) dt = ∫_0^1 (1/2)(1 + cos(4πt)) dt = (1/2)[t + sin(4πt)/(4π)]_0^1 = 1/2
so ‖x(t)‖ = √(1/2). The solution for ‖y(t)‖ also turns out to be √(1/2). Using sin 2x = 2 cos x sin x, the inner product is
⟨x(t), y(t)⟩ = ∫_0^1 cos(2πt) sin(2πt) dt = ∫_0^1 (1/2) sin(4πt) dt = [−cos(4πt)/(8π)]_0^1 = −(1/(8π))(1 − 1) = 0
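The integrals just computed are easy to verify numerically (Python/NumPy sketch, not part of the original notes):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
dt = t[1] - t[0]
x = np.cos(2 * np.pi * t)
y = np.sin(2 * np.pi * t)

def integrate(v):
    # trapezoid rule over [0, 1]
    return dt * (v[0] / 2 + v[1:-1].sum() + v[-1] / 2)

norm_x_sq = integrate(x * x)   # should be 1/2
inner_xy = integrate(x * y)    # should be 0 (orthogonal)
print(norm_x_sq, inner_xy)
```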
8.3 Definitions

Def'n: Orthogonal
Two signals x(t) and y(t) are orthogonal if ⟨x(t), y(t)⟩ = 0.

Def'n: Orthonormal
Two signals x(t) and y(t) are orthonormal if they are orthogonal and they both have norm 1, i.e., ‖x(t)‖ = 1 and ‖y(t)‖ = 1.
(a) Are x(t) and y(t) from the sine/cosine example above orthogonal? (b) Are they orthonormal?
Solution: (a) Yes. (b) No. They could be orthonormal if they'd been scaled by √2.
Figure 12: Two Walsh-Hadamard functions.

Example: Walsh-Hadamard 2 Functions
Let
x1(t) = { A, 0 < t ≤ 0.5;  −A, 0.5 < t ≤ 1;  0, o.w. }
x2(t) = { A, 0 < t ≤ 1;  0, o.w. }
These are shown in Fig. 12. (a) Are x1 (t) and x2 (t) orthogonal? (b) Are they orthonormal? Solution: (a) Yes. (b) Only if A = 1.
8.4
Orthogonal Sets
Def'n: Orthogonal Set
M signals x1(t), ..., xM(t) are mutually orthogonal, or form an orthogonal set, if ⟨xi(t), xj(t)⟩ = 0 for all i ≠ j.

Def'n: Orthonormal Set
M mutually orthogonal signals x1(t), ..., xM(t) form an orthonormal set if ‖xi‖ = 1 for all i.
Figure 13: Walsh-Hadamard 2-length and 4-length functions in image form. Each function is a row of the image, where crimson red is +1 and black is −1.
Figure 14: Walsh-Hadamard 64-length functions in image form.

Do the 64 WH functions form an orthonormal set? Why or why not?
Solution: Yes. Every pair i, j with i ≠ j agrees half of the time and disagrees half of the time, so their product is +1 half the time and −1 half the time; the terms cancel and the inner product is zero. The norm of the function in row i is 1, since the amplitude is always +1 or −1.

Example: CDMA and Walsh-Hadamard
This example is just for your information (not for any test). The set of 64-length Walsh-Hadamard sequences
is used in CDMA (IS-95) in two ways:
1. Reverse Link, Modulation: For each group of 6 bits to be sent, the modulator chooses one of the 64 possible 64-length functions.
2. Forward Link, Channelization or Multiple Access: The base station assigns each user a single WH function (channel). It uses 64 functions (channels). The data on each channel is multiplied by its Walsh function so that it is orthogonal to the data received by all other users.
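The orthonormality of the length-64 Walsh-Hadamard set is easy to confirm by building the matrix with the Sylvester recursion (Python/NumPy sketch, not from the notes; dividing by √64 gives each row unit norm, as discussed above):

```python
import numpy as np

H = np.array([[1.0]])
while H.shape[0] < 64:
    # Sylvester construction: H_{2n} = [[H, H], [H, -H]]
    H = np.block([[H, H], [H, -H]])

W = H / np.sqrt(64.0)        # rows now have unit norm
G = W @ W.T                  # Gram matrix: inner products of every row pair
print(np.max(np.abs(G - np.eye(64))))
```

The Gram matrix being the identity says exactly that distinct rows have inner product 0 and each row has norm 1.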
Lecture 5 Today: (1) Orthogonal Signal Representations
9
Orthonormal Signal Representations
Last time we deﬁned the inner product for vectors and for functions. We talked about orthogonality and orthonormality of functions and sets of functions. Now, we consider how to take a set of arbitrary signals and represent them as vectors in an orthonormal basis. What are some common orthonormal signal representations? • Nyquist sampling: sinc functions centered at sampling times. • Fourier series: complex sinusoids at diﬀerent frequencies • Sine and cosine at the same frequency • Wavelets And, we will come up with others. Each one has a limitation – only a certain set of functions can be exactly represented in a particular signal representation. Essentially, we must limit the set of possible functions to a set. That is, some subset of all possible functions. We refer to the set of arbitrary signals as: S = {s0 (t), s1 (t), . . . , sM −1 (t)} For example, a transmitter may be allowed to send any one of these M signals in order to convey information to a receiver. We’re going to introduce another set called an orthonormal basis: B = {φ0 (t), φ1 (t), . . . , φN −1 (t)} Each function in B is called a basis function. You can think of B as an unambiguous and useful language to represent the signals from our set S. Or, in analogy to color printers, a color printer can produce any of millions of possible colors (signal set), only using black, red, blue, and green (basis set) (or black, cyan, magenta, yellow).
9.1 Orthonormal Bases

Def'n: Span
The span of the set B is the set of all functions which are linear combinations of the functions in B. The span is referred to as Span{B} and is
Span{B} = { Σ_{k=0}^{N−1} ak φk(t) : a0, ..., a_{N−1} ∈ R }

Def'n: Orthogonal Basis
For any arbitrary set of signals S = {si(t)}_{i=0}^{M−1}, the orthogonal basis B = {φi(t)}_{i=0}^{N−1} is an orthogonal set of the smallest size N for which si(t) ∈ Span{B} for every i = 0, ..., M − 1. The value of N is called the dimension of the signal set.

Def'n: Orthonormal Basis
The orthonormal basis is an orthogonal basis in which each basis function has norm 1.

Notes:
• The orthonormal basis is not unique. If you can find one basis, you can use, for example, {−φk(t)}_{k=0}^{N−1} as another orthonormal basis.
• All orthogonal bases will have size N: no smaller basis can include all signals si(t) in its span, and a larger set would not be a basis by definition.

An orthonormal basis for an arbitrary signal set tells us:
• how to build the receiver,
• how to represent the signals in 'signal space', and
• how to quickly analyze the error rate of the scheme.

9.2 Synthesis

Consider one of the signals in our signal set, si(t). The signal set might be a set of signals we can use to transmit particular bits (or sets of bits), as the Walsh-Hadamard functions are used in the IS-95 reverse link. Given that si(t) is in the span of our basis B, it can be represented as a linear combination of the basis functions:
si(t) = Σ_{k=0}^{N−1} a_{i,k} φk(t)
The particular constants a_{i,k} are calculated using the inner product:
a_{i,k} = ⟨si(t), φk(t)⟩
The projection of signal i onto basis function k is defined as
a_{i,k} φk(t) = ⟨si(t), φk(t)⟩ φk(t)
Then the signal is equal to the sum of its projections onto the basis functions. Why is this?
Solution: Since si(t) ∈ Span{B}, we know there are some constants {a_{i,k}}_{k=0}^{N−1} such that
si(t) = Σ_{k=0}^{N−1} a_{i,k} φk(t)
Taking the inner product of both sides with φj(t),
⟨si(t), φj(t)⟩ = Σ_{k=0}^{N−1} a_{i,k} ⟨φk(t), φj(t)⟩ = a_{i,j}
since ⟨φk(t), φj(t)⟩ is 1 for k = j and 0 otherwise.

So, now we can represent a signal by a vector,
si = [a_{i,0}, a_{i,1}, ..., a_{i,N−1}]^T
This vector and the basis functions completely represent each signal. Plotting {si}i in an N-dimensional grid is termed a constellation diagram. Generally, this space that we're plotting in is called signal space. We can also synthesize any of our M signals in the signal set by adding the proper linear combination of the N bases. See Figure 5.5 in Rice (page 254), which shows a block diagram of how a transmitter would synthesize one of the M signals to send, based on an input bitstream. By choosing one of the M signals, we convey information, specifically log2 M bits. (Generally, we choose M to be a power of 2 for simplicity.)

Energy: Energy can be calculated in signal space as
Energy{si(t)} = ∫_{−∞}^{∞} si²(t) dt = Σ_{k=0}^{N−1} a²_{i,k}

Example: Position-shifted pulses
Plot the signal space diagram for the signals
s0(t) = u(t) − u(t − 1)
s1(t) = u(t − 1) − u(t − 2)
s2(t) = u(t) − u(t − 2)
given the orthonormal basis
φ0(t) = u(t) − u(t − 1)
φ1(t) = u(t − 1) − u(t − 2)
What are the signal space vectors, s0, s1, and s2?
Solution: They are s0 = [1, 0]^T, s1 = [0, 1]^T, and s2 = [1, 1]^T. They are plotted in the signal space diagram in Figure 15.
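The projections a_{i,k} = ⟨si, φk⟩ for this example can be computed by discretizing the signals, so that the inner product becomes a sum times dt (Python/NumPy sketch, not from the notes; the grid step is an arbitrary choice):

```python
import numpy as np

dt = 0.001
t = np.arange(0.0, 2.0, dt)
u = lambda tau: (t >= tau).astype(float)     # unit step sampled on the grid

phi0 = u(0) - u(1)
phi1 = u(1) - u(2)
s = [u(0) - u(1), u(1) - u(2), u(0) - u(2)]  # s0, s1, s2

inner = lambda f, g: np.sum(f * g) * dt      # discretized <f, g>
vecs = [np.array([inner(si, phi0), inner(si, phi1)]) for si in s]
print(vecs)   # expect approximately [1,0], [0,1], [1,1]
```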
Figure 15: Signal space diagram for the position-shifted signals example.

Example: Amplitude-shifted signals
Now consider
s0(t) = 1.5[u(t) − u(t − 1)]
s1(t) = 0.5[u(t) − u(t − 1)]
s2(t) = −0.5[u(t) − u(t − 1)]
s3(t) = −1.5[u(t) − u(t − 1)]
and the orthonormal basis φ1(t) = u(t) − u(t − 1). What are the signal space vectors for the signals {si(t)}? What are their energies?
Solution: They are s0 = [1.5], s1 = [0.5], s2 = [−0.5], and s3 = [−1.5]. See Figure 16. The energies are just the squared magnitudes of the vectors: 2.25, 0.25, 0.25, and 2.25, respectively.

Figure 16: Signal space diagram for the amplitude-shifted signals example.

Notes:
• We will find out later that it is the distances between the points in signal space which determine the bit error rate performance of the receiver. The distance between points i and j is
  d_{i,j} = √( Σ_{k=0}^{N−1} (a_{i,k} − a_{j,k})² )   for i, j in {0, ..., M − 1}
• Although different ON bases can be used, the energy and distance between points will not change.

9.3 Analysis

At a receiver, our job will be to analyze the received signal (a function) and to decide which of the M possible signals was sent. This is the task of analysis. It turns out that an orthonormal basis makes our analysis very straightforward and elegant. We won't receive exactly what we sent
– there will be additional functions added to the signal function we sent:
• Thermal noise
• Interference from other users
• Self-interference
We won't get into these problems today. But we might say that if we send signal m, i.e., sm(t) from our signal set, then we receive
r(t) = sm(t) + w(t)
where w(t) is the sum of all of the additive signals that we did not intend to receive. Now, w(t) might not (probably will not) be in the span of our basis B, so r(t) would not be in Span{B} either. What is the best approximation to r(t) in the signal space? Specifically, what is r̂(t) ∈ Span{B} such that the energy of the difference between r̂(t) and r(t) is minimized:
r̂(t) = argmin_{r̂(t) ∈ Span{B}} ∫_{−∞}^{∞} |r̂(t) − r(t)|² dt    (13)

Solution: Since r̂(t) ∈ Span{B}, it can be represented as a vector in signal space,
x = [x0, x1, ..., x_{N−1}]^T
and the synthesis equation is
r̂(t) = Σ_{k=0}^{N−1} xk φk(t)
If you plug this expression for r̂(t) into (13) and then find the minimum with respect to each xk, you'll see that the minimum error is at
xk = ∫_{−∞}^{∞} r(t) φk(t) dt
that is, xk = ⟨r(t), φk(t)⟩, for k = 0, ..., N − 1.

Example: Analysis using a Walsh-Hadamard 2 basis
See Figure 17. Let s0(t) = φ0(t) and s1(t) = φ1(t). What is r̂(t)?
Solution:
r̂(t) = { 1, 0 ≤ t < 1;  −1/2, 1 ≤ t < 2;  0, o.w. }

Figure 17: Signal and basis functions for the analysis example.

Lecture 6

Today: (1) Multivariate Distributions

10 Multi-Variate Distributions

For two random variables X1 and X2:
• Joint CDF: F_{X1,X2}(x1, x2) = P[{X1 ≤ x1} ∩ {X2 ≤ x2}]. It is the probability that both events happen simultaneously.
• Joint pmf: P_{X1,X2}(x1, x2) = P[{X1 = x1} ∩ {X2 = x2}]. It is the probability that both events happen simultaneously.
• Joint pdf: f_{X1,X2}(x1, x2) = ∂²/(∂x1 ∂x2) F_{X1,X2}(x1, x2)
The pdf and pmf integrate / sum to one, and are non-negative. The CDF is non-negative and non-decreasing, with lim_{xi→−∞} F_{X1,X2}(x1, x2) = 0 and lim_{x1,x2→+∞} F_{X1,X2}(x1, x2) = 1.

Note: Two errors in the Rice book dealing with the definition of the CDF. Eqn 4.9 should be:
F_X(x) = P[X ≤ x] = ∫_{−∞}^{x} f_X(t) dt
and Eqn 4.15 should be:
F_X(x) = P[X ≤ x]

To find the probability of an event B ∈ S, you integrate:
• Discrete case: P[B] = Σ_{(x1,x2)∈B} P_{X1,X2}(x1, x2)
• Continuous case: P[B] = ∫∫_{(x1,x2)∈B} f_{X1,X2}(x1, x2) dx1 dx2

The marginal distributions are:
• Marginal pmf: P_{X2}(x2) = Σ_{x1∈S_{X1}} P_{X1,X2}(x1, x2)
• Marginal pdf: f_{X2}(x2) = ∫_{x1∈S_{X1}} f_{X1,X2}(x1, x2) dx1

Two random variables X1 and X2 are independent iff for all x1 and x2:
• P_{X1,X2}(x1, x2) = P_{X1}(x1) P_{X2}(x2)
• f_{X1,X2}(x1, x2) = f_{X1}(x1) f_{X2}(x2)

10.1 Random Vectors

Def'n: Random Vector
A random vector (R.V.) is a list of multiple random variables X1, X2, ..., Xn:
X = [X1, X2, ..., Xn]^T
Here are the models of R.V.s:
1. The CDF of R.V. X is F_X(x) = F_{X1,...,Xn}(x1, ..., xn) = P[X1 ≤ x1, ..., Xn ≤ xn].
2. The pmf of a discrete R.V. X is P_X(x) = P_{X1,...,Xn}(x1, ..., xn) = P[X1 = x1, ..., Xn = xn].
3. The pdf of a continuous R.V. X is f_X(x) = f_{X1,...,Xn}(x1, ..., xn) = ∂^n/(∂x1 ··· ∂xn) F_X(x).

10.2 Conditional Distributions

Given an event B ∈ S which has P[B] > 0, the joint probability conditioned on event B is:
• Discrete case: P_{X1,X2|B}(x1, x2) = { P_{X1,X2}(x1, x2)/P[B], (x1, x2) ∈ B;  0, o.w. }
• Continuous case: f_{X1,X2|B}(x1, x2) = { f_{X1,X2}(x1, x2)/P[B], (x1, x2) ∈ B;  0, o.w. }
Given r.v.s X1 and X2:
• Discrete case: The conditional pmf of X1 given X2 = x2, where P_{X2}(x2) > 0, is P_{X1|X2}(x1|x2) = P_{X1,X2}(x1, x2)/P_{X2}(x2).
• Continuous case: The conditional pdf of X1 given X2 = x2, where f_{X2}(x2) > 0, is f_{X1|X2}(x1|x2) = f_{X1,X2}(x1, x2)/f_{X2}(x2).

Def'n: Bayes' Law
Bayes' Law is a reformulation of the definition of the conditional pdf. It is written either as
f_{X1,X2}(x1, x2) = f_{X2|X1}(x2|x1) f_{X1}(x1)
or
f_{X1|X2}(x1|x2) = f_{X2|X1}(x2|x1) f_{X1}(x1) / f_{X2}(x2)

10.3 Simulation of Digital Communication Systems

A simulation of a digital communication system is often used to estimate a bit error rate. Each bit can either be demodulated without error, or with error. Simulations run many bits, say N, through a model of the communication system, and count the number of bits that are in error. The simulation of one bit is thus a Bernoulli trial: trial Ei is in error (Ei = 1) with probability pe (the true bit error rate) and correct (Ei = 0) with probability 1 − pe. What type of random variable is Ei?

Let S = Σ_{i=1}^{N} Ei, and assume that {Ei} are independent and identically distributed (i.i.d.).
1. What type of random variable is S?
2. What is the pmf of S?
3. What are the mean and variance of S?

Solution: Ei is called a Bernoulli r.v., and S is called a Binomial r.v., with pmf
P_S(s) = (N choose s) pe^s (1 − pe)^{N−s}
The mean of S is the same as the mean of the sum of {Ei}:
E[S] = E[Σ_{i=1}^{N} Ei] = Σ_{i=1}^{N} E[Ei] = Σ_{i=1}^{N} [(1 − pe)·0 + pe·1] = N pe
We can find the variance of S the same way:
Var[S] = Var[Σ_{i=1}^{N} Ei] = Σ_{i=1}^{N} Var[Ei] = Σ_{i=1}^{N} [(1 − pe)(0 − pe)² + pe(1 − pe)²] = N pe(1 − pe)
if x1 = 0 x1 = 1 o.
to get fX2 (x2 ) = 0. Use whichever seems more convenient for you.d. .ECE 5520 Fall 2009 41 Example: Joint distribution of X1 .5δ(x1 ) + 0.X2 (x1 . What is the joint p. and fN (n) = √ 1 2πσ 2 e−n 2 /(2σ 2 ) where σ 2 is some known variance. 0. of (X1 . and the r.X2 (x1 .5fN (x2 ) + 0. Writing fX1 (x1 ) = 0. x2 ) = fX1 . So break this into two cases: fX2 X1 (x2 0)fX1 (0). x2 ) = down the pdf of X2 .w.w. It is necessary to use Bayes’ Law. X2 ) ? Since X1 and X2 are NOT independent. fX1 . X2 ) ? Solution: 1.d. 1.X2 (x1 . X1 is independent of N with 0.5.5fN (x2 ). fX1 . x1 = 1 0. the pdf of the sum is the convolution of the pdfs of the inputs. What is the pdf of X2 ? 2. x2 ) = fX2 X1 (x2 x1 )fX1 (x1 ) Given a value of X1 (either 0 or 1) we can write fX1 .5fN (x2 − 1). of (X1 .X2 (x1 . x1 = 0 fX2 X1 (x2 1)fX1 (1).5.5fN (x2 − 1)δ(x1 − 1) These last two lines are completely equivalent.5fN (x2 )δ(x1 ) + 0. x1 = 0 0. o. x1 = 1 .5δ(x1 − 1) we convolve this with fN (n) above. What is the joint p.5fN (x2 − 1) 1 2 2 2 2 √ e−x2 /(2σ ) + e−(x2 −1) /(2σ ) fX2 (x2 ) = 2 2 2πσ 2. o. x2 ) = 0. o.v.s are added.v. 0. When two independent r. See Figure 18. x1 = 1 0.w.f. x1 = 0 PX1 (x1 ) = 0. we cannot simply multiply the marginal pdfs together. X2 Consider the channel model X2 = X1 + N .f.
Figure 18: Joint pdf of X1 and X2, the input and output (respectively) of the example additive noise communication system.

10.5 Expectation

Def'n: Expected Value (Joint)
The expected value of a function g(X1, X2) of random variables X1 and X2 is given by:
1. Discrete:

    E[g(X1, X2)] = Σ_{x1 ∈ S_X1} Σ_{x2 ∈ S_X2} g(x1, x2) P_{X1,X2}(x1, x2)

2. Continuous:

    E[g(X1, X2)] = ∫∫ g(x1, x2) f_{X1,X2}(x1, x2) dx1 dx2

Typical functions g(X1, X2) are:
• Mean of X1 or X2: g(X1, X2) = X1 or g(X1, X2) = X2 will result in the means µ_X1 and µ_X2.
• Variance (or 2nd central moment) of X1 or X2: g(X1, X2) = (X1 − µ_X1)^2 or g(X1, X2) = (X2 − µ_X2)^2. Often denoted σ_X1^2 and σ_X2^2.
• Covariance of X1 and X2: g(X1, X2) = (X1 − µ_X1)(X2 − µ_X2).
• Expected value of the product of X1 and X2, also called the 'correlation' of X1 and X2: g(X1, X2) = X1 X2.

10.6 Gaussian Random Variables

For a single Gaussian r.v. X with mean µ_X and variance σ_X^2, we have the pdf

    f_X(x) = (1/sqrt(2π σ_X^2)) e^(−(x − µ_X)^2 / (2σ_X^2))

Consider Y to be Gaussian with mean 0 and variance 1. Then the CDF of Y is denoted

    F_Y(y) = P[Y ≤ y] = Φ(y)

For X, which has non-zero mean and non-unit variance, we can write its CDF as

    F_X(x) = P[X ≤ x] = Φ((x − µ_X)/σ_X)
You can prove this by showing that the event X ≤ x is the same as the event

    (X − µ_X)/σ_X ≤ (x − µ_X)/σ_X

Since the left-hand side is a unit-variance, zero-mean Gaussian r.v., we can write the probability of this event using the unit-variance, zero-mean Gaussian CDF.

10.6.1 Complementary CDF

The probability that a unit-variance, zero-mean Gaussian r.v. X exceeds some value x is one minus the CDF, that is, 1 − Φ(x). This is so common in digital communications that it is given its own name, Q(x):

    Q(x) = P[X > x] = 1 − Φ(x)

What is Q(x) in integral form?

    Q(x) = ∫_x^∞ (1/sqrt(2π)) e^(−w^2/2) dw

For a Gaussian r.v. X with mean µ_X and variance σ_X^2,

    P[X > x] = Q((x − µ_X)/σ_X) = 1 − Φ((x − µ_X)/σ_X)

10.6.2 Error Function

In math, in some texts, and in Matlab, the Q(x) function is not used. Instead, there is a function called erf(x):

    erf(x) ≜ (2/sqrt(π)) ∫_0^x e^(−t^2) dt

Example: Relationship between Q(·) and erf(·)
What is the functional relationship between Q(·) and erf(·)?
Solution: Substituting t = u/sqrt(2) (and thus dt = du/sqrt(2)),

    erf(x) = (2/sqrt(2π)) ∫_0^{sqrt(2) x} e^(−u^2/2) du
           = 2 ∫_0^{sqrt(2) x} (1/sqrt(2π)) e^(−u^2/2) du
           = 2 [ Φ(sqrt(2) x) − 1/2 ]

Equivalently,

    Φ(sqrt(2) x) = (1/2) erf(x) + 1/2

Finally, let y = sqrt(2) x, so that

    Φ(y) = (1/2) erf(y/sqrt(2)) + 1/2
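The Φ–erf relationship above is easy to encode and check. The notes give the Matlab version later; here is an equivalent Python sketch using the standard library's math.erf:

```python
import math

# Q(y) = 1 - Phi(y), written in terms of erf per the derivation above.
def Q(y):
    return 0.5 - 0.5 * math.erf(y / math.sqrt(2.0))

def Phi(y):
    return 0.5 * math.erf(y / math.sqrt(2.0)) + 0.5

print(Q(0.0))             # 0.5: half of the unit Gaussian lies above its mean
print(Q(1.0))             # ≈ 0.1587
print(Q(1.0) + Phi(1.0))  # ≈ 1.0, since Q = 1 - Phi
```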
Or in terms of Q(·),

    Q(y) = 1 − Φ(y) = 1/2 − (1/2) erf(y/sqrt(2))    (14)

You should go to Matlab and create a function Q(y) which implements:

    function rval = Q(y)
    rval = 1/2 - 1/2 .* erf(y./sqrt(2));

10.7 Examples

Example: Probability of Error in Binary Example
As in the previous example, we have a model system in which the receiver sees X2 = X1 + N. Here, X1 ∈ {0, 1} with equal probabilities, and N is independent of X1 and zero-mean Gaussian with variance 1/4. The receiver decides as follows:
• If X2 ≤ 1/3, then decide that the transmitter sent a '0'.
• If X2 > 1/3, then decide that the transmitter sent a '1'.
1. Given that X1 = 1, what is the probability that the receiver decides that a '0' was sent?
2. Given that X1 = 0, what is the probability that the receiver decides that a '1' was sent?

Solution:
1. Given that X1 = 1, since X2 = X1 + N, it is clear that X2 is also a Gaussian r.v., with mean 1 and variance 1/4. The probability that the receiver decides '0' is the probability that X2 ≤ 1/3:

    P[error | X1 = 1] = P[X2 ≤ 1/3]
                      = P[ (X2 − 1)/sqrt(1/4) ≤ (1/3 − 1)/sqrt(1/4) ]
                      = 1 − Q((−2/3)/(1/2)) = 1 − Q(−4/3)

2. Given that X1 = 0, X2 is Gaussian with mean 0 and variance 1/4. The probability that the receiver decides '1' is the probability that X2 > 1/3:

    P[error | X1 = 0] = P[X2 > 1/3]
                      = P[ X2/sqrt(1/4) > (1/3)/sqrt(1/4) ]
                      = Q((1/3)/(1/2)) = Q(2/3)
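The two conditional error probabilities in this example can be verified by simulation against the Q-function answers (note 1 − Q(−4/3) = Q(4/3)). A Python sketch, equivalent to what the notes' Matlab examples do:

```python
import math
import random

# Monte Carlo check of the threshold-at-1/3 receiver with sigma = 1/2.
def Q(y):
    return 0.5 - 0.5 * math.erf(y / math.sqrt(2.0))

random.seed(2)
sigma = 0.5          # sqrt(1/4)
trials = 200000
thr = 1.0 / 3.0

# P[decide '1' | X1 = 0]: X2 = N exceeds the threshold
err_given_0 = sum(random.gauss(0.0, sigma) > thr for _ in range(trials)) / trials
# P[decide '0' | X1 = 1]: X2 = 1 + N falls at or below the threshold
err_given_1 = sum(1.0 + random.gauss(0.0, sigma) <= thr for _ in range(trials)) / trials

print(err_given_0, Q(2 / 3))  # both ≈ 0.2525
print(err_given_1, Q(4 / 3))  # both ≈ 0.0912
```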
10.8 Gaussian Random Vectors

Def'n: Multivariate Gaussian R.V.
An n-length R.V. X is multivariate Gaussian with mean µ_X and covariance matrix C_X if it has the pdf

    f_X(x) = (1 / sqrt((2π)^n det(C_X))) exp( −(1/2) (x − µ_X)^T C_X^{−1} (x − µ_X) )

where det(·) is the determinant of the covariance matrix, and C_X^{−1} is the inverse of the covariance matrix.

If the elements of X were independent random variables, the pdf would be the product of the individual pdfs (as with any random vector), and in this case the pdf would be:

    f_X(x) = (1 / sqrt((2π)^n ∏_i σ_i^2)) exp( −Σ_{i=1}^n (x_i − µ_Xi)^2 / (2 σ_Xi^2) )

Any linear combination of jointly Gaussian random variables is another jointly Gaussian random variable. For example, if we have a matrix A and we let a new random vector Y = AX, then Y is also a Gaussian random vector with mean A µ_X and covariance matrix A C_X A^T.

Section 4.3 spends some time with 2-D Gaussian random vectors, which is the dimension with which we spend most of our time in this class.

10.8.1 Envelope

The envelope of a 2-D random vector is its distance from the origin (0, 0). Specifically, if the vector is (X1, X2), the envelope R is

    R = sqrt(X1^2 + X2^2)    (15)

If both X1 and X2 are i.i.d. Gaussian with mean 0 and variance σ^2, the pdf of R is

    f_R(r) = (r/σ^2) exp(−r^2/(2σ^2)), r ≥ 0;  0, o.w.

This is the Rayleigh distribution. If instead X1 and X2 are independent Gaussian with means µ1 and µ2, respectively, and identical variances σ^2, the envelope R would now have a Rice (a.k.a. Rician) distribution:

    f_R(r) = (r/σ^2) exp(−(r^2 + s^2)/(2σ^2)) I_0(r s/σ^2), r ≥ 0;  0, o.w.

where s^2 = µ1^2 + µ2^2, and I_0(·) is the zeroth order modified Bessel function of the first kind. Both pdfs are sometimes needed to derive probability of error formulas. Note that both the Rayleigh and Rician pdfs can be derived from the Gaussian distribution and (15) using the transformation of random variables methods studied in ECE 5510.

Lecture 7

Today: (1) Matlab Simulation (2) Random Processes (3) Correlation and Matched Filter Receivers
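The Rayleigh envelope result is easy to check by simulation. A standard property of the Rayleigh distribution (not derived in these notes) is that its mean is σ sqrt(π/2); the following Python sketch draws i.i.d. Gaussian pairs and checks the envelope's mean against it:

```python
import math
import random

# Monte Carlo check: the envelope R = sqrt(X1^2 + X2^2) of two i.i.d.
# zero-mean Gaussians is Rayleigh, with mean sigma * sqrt(pi/2).
random.seed(3)
sigma = 1.0
trials = 200000
mean_R = sum(
    math.hypot(random.gauss(0, sigma), random.gauss(0, sigma))
    for _ in range(trials)
) / trials
print(mean_R)  # ≈ sigma * sqrt(pi/2) ≈ 1.2533
```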
11 Random Processes

A random process X(t, s) is a function of time t and outcome (realization) s. Outcome s lies in sample space S. A random sequence X(n, s) is a sequence of random variables indexed by time index n and realization s ∈ S. The outcome or realization is included because there could be many ways to record X, even at the same time; for example, multiple receivers would record different noisy signals even of the same transmission. Typically, we omit the "s" when writing the name of the random process or random sequence, and abbreviate it to X(t) or X(n), respectively.

11.1 Autocorrelation and Power

Def'n: Mean Function
The mean function of the random process X(t, s) is

    µ_X(t) = E[X(t, s)]

Note the mean is taken over all possible realizations s. If you record one signal over all time t, you don't have anything to average to get the mean function µ_X(t).

Def'n: Autocorrelation Function
The autocorrelation function of a random process X(t) is

    R_X(t, τ) = E[X(t) X(t − τ)]

The autocorrelation of a random sequence X(n) is

    R_X(n, k) = E[X(n) X(n − k)]

Def'n: Wide-sense stationary (WSS)
A random process is wide-sense stationary (WSS) if
1. µ_X = µ_X(t) = E[X(t)] is independent of t.
2. R_X(t, τ) depends only on the time difference τ and not on t. We then denote the autocorrelation function as R_X(τ).
A random sequence is wide-sense stationary (WSS) if
1. µ_X = µ_X(n) = E[X(n)] is independent of n.
2. R_X(n, k) depends only on k and not on n. We then denote the autocorrelation function as R_X(k).

The power of a signal is given by R_X(0).

A random process is ergodic if its time averages are equivalent to its ensemble averages. For an ergodic, wide-sense stationary random process, we can estimate the autocorrelation function from one realization of a power-type signal x(t):

    R_X(τ) = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} x(t) x(t − τ) dt

This is not a critical topic for this class, but we now must define "ergodic". An example that helps demonstrate the difference between time averages and ensemble averages is the non-ergodic process of opinion polling. Consider what happens when a pollster takes either a time average or an ensemble average:
• Time Average: The pollster asks the same person, every day for N days, whether or not she will vote for person X. The time average is taken by dividing the total number of "Yes"s by N.
• Ensemble Average: The pollster asks N different people, on the same day, whether or not they will vote for person X. The ensemble average is taken by dividing the total number of "Yes"s by N.
Will they come up with the same answer? Perhaps, but probably not. This makes the process non-ergodic.

Power Spectral Density
We have seen that for a WSS random process X(t) (and for a random sequence X(n)) we compute the power spectral density as

    S_X(f) = F{R_X(τ)}
    S_X(e^{jΩ}) = DTFT{R_X(k)}

11.1.1 White Noise

Let X(n) be a WSS random sequence with autocorrelation function

    R_X(k) = E[X(n) X(n − k)] = σ^2 δ(k)

This says that each element of the sequence X(n) has zero covariance with every other sample of the sequence: for m ≠ n, X(n) is uncorrelated with X(m).
• Does this mean it is independent of every other sample?
• What is the PSD of the sequence?
This sequence is commonly called 'white noise' because it has equal parts of every frequency (analogy to light).

Let X(t) be a WSS random process with autocorrelation function

    R_X(τ) = E[X(t) X(t − τ)] = σ^2 δ(τ)

• What is the PSD of the process?
• What is the power of the signal?
Realistically, thermal noise is not constant in frequency, because as frequency goes very high (e.g., 10^15 Hz), the power spectral density goes to zero. The Rice book (4.1.2) has a good analysis of the physics of thermal noise. For continuous-time signals, white noise is a commonly used approximation for thermal noise.

12 Correlation and Matched-Filter Receivers

We've been talking about how our digital transmitter can transmit, at each time, one of a set of signals {s_i(t)}, i = 1, ..., M. This transmission conveys to us which of M messages we want to send, that is, log2 M bits of information. At the receiver, we receive signal s_i(t) scaled and corrupted by noise:

    r(t) = b s_i(t) + n(t)

How do we decide which signal i was transmitted?
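Before turning to receivers, the white-noise autocorrelation model R_X(k) = σ^2 δ(k) discussed above can be checked from a single realization (valid here because the sequence is ergodic). A Python sketch of the Matlab-style experiment:

```python
import random

# Estimate the autocorrelation R_X(k) of a white Gaussian sequence:
# R_hat(0) should be near sigma^2, and R_hat(k) near zero for k != 0.
random.seed(4)
sigma2 = 2.0   # hypothetical noise power
N = 100000
x = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(N)]

def R_hat(k):
    """Time-average estimate of E[X(n) X(n-k)]."""
    return sum(x[n] * x[n - k] for n in range(k, N)) / (N - k)

print(R_hat(0))  # ≈ 2.0
print(R_hat(5))  # ≈ 0.0 (uncorrelated samples)
```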
12.1 Correlation Receiver

What we've been talking about is the inner product,

    <r(t), s_i(t)> = ∫_{−∞}^{∞} r(t) s_i(t) dt

But we want to build a receiver that does the minimum amount of calculation. If the s_i(t) are non-orthogonal, then we can reduce the amount of correlation (and thus multiplication and addition) done in the receiver by instead correlating with the basis functions, {φ_k(t)}, k = 1, ..., N, shown in Figure 5.6 in Rice (page 226). Correlation with the basis functions gives

    x_k = <r(t), φ_k(t)> = ∫_{−∞}^{∞} r(t) φ_k(t) dt

for k = 1, ..., N. As notation,

    x = [x_1, x_2, ..., x_N]^T

Now that we've done these N correlations (inner products), we can compute the estimate of the received signal as

    r̂(t) = Σ_k x_k φ_k(t)

This is the 'correlation receiver'.

Noise-free channel: Since r(t) = b s_i(t) + n(t), if n(t) = 0 and b = 1, then x = a_i, where a_i is the signal space vector for s_i(t).

Noisy channel: Now, let b = 1 but consider n(t) to be a white Gaussian random process with zero mean and PSD S_N(f) = N_0/2. Define

    n_k = <n(t), φ_k> = ∫_{−∞}^{∞} n(t) φ_k(t) dt

What is x in terms of a_i and the noise {n_k}? What are the mean and covariance of {n_k}?

Solution:

    x_k = ∫ r(t) φ_k(t) dt = ∫ [s_i(t) + n(t)] φ_k(t) dt = a_{i,k} + ∫ n(t) φ_k(t) dt = a_{i,k} + n_k

First, n_k is zero mean:

    E[n_k] = ∫ E[n(t)] φ_k(t) dt = 0
Next we can show that n_1, ..., n_N are i.i.d. by calculating the autocorrelation R_n(m, k):

    R_n(m, k) = E[n_k n_m]
              = ∫∫ E[n(t) n(τ)] φ_k(t) φ_m(τ) dτ dt
              = ∫∫ (N_0/2) δ(t − τ) φ_k(t) φ_m(τ) dτ dt
              = (N_0/2) ∫ φ_k(t) φ_m(t) dt
              = (N_0/2) δ_{k,m} = N_0/2, if m = k;  0, o.w.

Is {n_k} WSS? Is it a Gaussian random sequence? (Why?) What is the pdf of x_k? What is the pdf of x?

Solution: The x_k are independent, because x_k = a_{i,k} + n_k, a_{i,k} is a deterministic constant, and the noise components are independent. Then, since n_k is Gaussian,

    f_Xk(x_k) = (1/sqrt(2π(N_0/2))) e^(−(x_k − a_{i,k})^2 / (2(N_0/2)))

And, since the {X_k} are independent,

    f_x(x) = ∏_{k=1}^N f_Xk(x_k) = (1/[2π(N_0/2)]^{N/2}) e^(−Σ_{k=1}^N (x_k − a_{i,k})^2 / (2(N_0/2)))

An example is in ece5520_lec07.m.

12.2 Matched Filter Receiver

Above we said that

    x_k = ∫_{t=−∞}^{∞} r(t) φ_k(t) dt

But there really are finite limits. Let's say that the signal has a duration T, and then rewrite the integral as

    x_k = ∫_{t=0}^{T} r(t) φ_k(t) dt

This can be written as

    x_k = ∫_{t=0}^{T} r(t) h_k(T − t) dt
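The claim that each correlator output equals x_k = a_{i,k} + n_k can be illustrated with a tiny discrete-time sketch. The notes use Matlab (ece5520_lec07.m); this Python version uses a hypothetical two-function orthonormal basis and hypothetical amplitudes:

```python
import math
import random

# Discrete-time sketch of the correlation receiver: project the received
# samples onto two orthonormal basis "functions".
random.seed(5)
L = 100  # samples per symbol period

# Two orthonormal length-L vectors (unit energy, zero inner product)
phi0 = [1.0 / math.sqrt(L)] * L
phi1 = [(1.0 if n < L // 2 else -1.0) / math.sqrt(L) for n in range(L)]

a_i = [1.0, -0.5]  # hypothetical transmitted signal space vector
s = [a_i[0] * phi0[n] + a_i[1] * phi1[n] for n in range(L)]
r = [s[n] + random.gauss(0.0, 0.01) for n in range(L)]  # low noise

# Correlate r with each basis function: x_k = <r, phi_k>
x = [sum(r[n] * phi[n] for n in range(L)) for phi in (phi0, phi1)]
print(x)  # ≈ [1.0, -0.5], i.e. x_k = a_{i,k} + (small) n_k
```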
where h_k(t) = φ_k(T − t). (Plug in (T − t) in this formula and the T's cancel out and only positive t is left.) This is the output of a convolution, taken at time T:

    x_k = r(t) ⋆ h_k(t) |_{t=T}

Or equivalently,

    x_k = r(t) ⋆ φ_k(T − t) |_{t=T}

This is Rice Section 5.1.4. Notes:
• The x_k can be seen as the output of a 'matched' filter at time T.
• These are just two different physical implementations: correlation or matched filter reception. We might, for example, have a physical filter with the impulse response φ_k(T − t), and thus it is easy to do a matched filter implementation.
• This works at time T. The output for other times will be different in the correlation and matched filter.
• It may be easier to 'see' why the correlation receiver works.
Try out the Matlab code, correlation_and_matched_filter_rx.m, which is posted on WebCT.

12.3 Amplitude

Before, we said b = 1. A real channel attenuates! The job of our receiver will be to amplify the signal by approximately 1/b. This effectively multiplies the noise as well, as shown in Figure 19. In general, textbooks will assume that the automatic gain control (AGC) works properly, and then will assume the noise signal is multiplied accordingly.

Figure 19: A block diagram of the receiver blocks discussed in lectures 7 & 8: the downconverter (which multiplies the received bandpass signal r'(t) by the carrier), the AGC, the correlation or matched filter (producing x), and the optimal detector (producing the decision).

Lecture 8

Today: (1) Optimal Binary Detection

12.4 Review

At a digital transmitter, we can transmit, at each time, one of a set of signals {s_i(t)}, i = 0, ..., M − 1. This transmission conveys log2 M bits of information. At the receiver, we assume we receive signal s_i(t) corrupted by noise:

    r(t) = s_i(t) + n(t)

How do we decide which signal i ∈ {0, ..., M − 1} was transmitted? We split the task into downconversion, gain control, correlation or matched filter reception, and detection.
12.5 Correlation Receiver

Our signal set can be represented by the orthonormal basis functions {φ_k(t)}, k = 0, ..., K − 1. Correlation with the basis functions gives

    x_k = <r(t), φ_k(t)> = a_{i,k} + n_k,  for k = 0, ..., K − 1

where:
• x_k = a_{i,k} + n_k,
• a_{i,k} is a deterministic constant, and
• the {n_k} are i.i.d. Gaussian.
In general, the pdf of x is multivariate Gaussian with each component x_k independent. We denote vectors:

    x = [x_0, x_1, ..., x_{K−1}]^T
    a_i = [a_{i,0}, a_{i,1}, ..., a_{i,K−1}]^T
    n = [n_0, n_1, ..., n_{K−1}]^T

The joint pdf of x is

    f_X(x) = ∏_{k=0}^{K−1} f_Xk(x_k) = (1/[2π(N_0/2)]^{K/2}) e^(−Σ_{k=0}^{K−1} (x_k − a_{i,k})^2 / (2(N_0/2)))    (16)

We showed that there are two ways to implement this: the correlation receiver, and the matched filter receiver. Both result in the same output x. In an example Matlab simulation, ece5520_lec07.m, we simulated sending a_i = [1, 1]^T and receiving r(t) (and thus x) in noise. We simulated this in the Matlab code correlation_and_matched_filter_rx.m.

13 Optimal Detection

We receive r(t) and use the matched filter or correlation receiver to calculate x. Given the signal space diagram of the transmitted signals and the received signal space vector x, how exactly do we decide which s_i(t) was sent? Consider this in signal space, as shown in Figure 20. Hopefully, x and a_i should be close, since i was actually sent. Now, what rules should we have to decide on i? This is the topic of detection. Optimal detection uses rules designed to minimize the probability that a symbol error will occur.

Figure 20: Signal space diagram of a PAM system with an X to mark the receiver output x.

13.1 Overview

We'll show two things:
• If each symbol i is equally probable, then the decision will be: pick the i with a_i closest to x.
• If symbols aren't equally probable, then we'll need to shift the decision boundaries.
Detection theory is a major learning objective of this course. So even though it is somewhat theoretical, detection is applicable in a wide variety of problems:
• Medical applications: Does an image show a tumor? Does a blood sample indicate a disease? Does a breathing sound indicate an obstructed airway?
• Radar applications: Obstruction detection, motion detection, signal detection.
• Communications applications: Receiver design, for example.

13.2 Bayesian Detection

When we say 'optimal detection' in the Bayesian detection framework, we mean that we want the smallest probability of error. The probability of error is denoted P[error]. By error, we mean that a different signal was detected than the one that was sent.

At the start of every detection problem, we list the events that could have occurred, i.e., the symbols that could have been sent. We follow all detection and statistics textbooks and label these classes H_i, and write:

    H_0: r(t) = s_0(t) + n(t)
    H_1: r(t) = s_1(t) + n(t)
    ...
    H_{M−1}: r(t) = s_{M−1}(t) + n(t)

This must be a complete listing of events. That is, the events satisfy H_0 ∪ H_1 ∪ ... ∪ H_{M−1} = S, where ∪ means union, and S is the complete event space.

14 Binary Detection

Let's just say for now that there are only two signals, s_0(t) and s_1(t). Further, we'll have only scalars x, a_0 and a_1, and n, instead of vectors, and only one basis function φ_0(t). We need to decide from X whether s_0(t) or s_1(t) was sent. When we talk about them as random variables, we'll substitute N for n and X for x. The event listing is:

    H_0: r(t) = s_0(t) + n(t)
    H_1: r(t) = s_1(t) + n(t)

Equivalently,

    H_0: X = a_0 + N
    H_1: X = a_1 + N

We use the law of total probability to say that

    P[error] = P[error ∩ H_0] + P[error ∩ H_1]

where the ∩ ('cap') means 'and'. Then, using Bayes' Law,

    P[error] = P[error | H_0] P[H_0] + P[error | H_1] P[H_1]    (17)
14.1 Decision Region

We're making a decision based only on X. Over some set R_0 of values of X, we'll decide that H_0 happened (s_0(t) was sent). Over a different set R_1 of values, we'll decide H_1 occurred (that s_1(t) was sent). We can't be indecisive, so:
• There is no overlap: R_0 ∩ R_1 = ∅.
• There are no values of x disregarded: R_0 ∪ R_1 = S.

14.2 Formula for Probability of Error

So the probability of error is

    P[error] = P[X ∈ R_1 | H_0] P[H_0] + P[X ∈ R_0 | H_1] P[H_1]    (18)

The probability that X is in R_1 is one minus the probability that it is in R_0, since the two are complementary sets:

    P[error] = (1 − P[X ∈ R_0 | H_0]) P[H_0] + P[X ∈ R_0 | H_1] P[H_1]
             = P[H_0] − P[X ∈ R_0 | H_0] P[H_0] + P[X ∈ R_0 | H_1] P[H_1]

Now note that the probabilities that X ∈ R_0 are integrals over the event (region) R_0:

    P[error] = P[H_0] − ∫_{x ∈ R_0} f_{X|H_0}(x|H_0) P[H_0] dx + ∫_{x ∈ R_0} f_{X|H_1}(x|H_1) P[H_1] dx
             = P[H_0] + ∫_{x ∈ R_0} [ f_{X|H_1}(x|H_1) P[H_1] − f_{X|H_0}(x|H_0) P[H_0] ] dx    (19)

We've got a lot of things in the expression in (19), but the only thing we can change is the region R_0. Everything else is determined by the time we get to this point. So the question is, how do you pick R_0 to minimize (19)?

14.3 Selecting R_0 to Minimize Probability of Error

We can see what the integrand looks like. Figure 21(a) shows the conditional probability density functions. Figure 21(b) shows the joint densities (the conditional pdfs multiplied by the bit probabilities P[H_0] and P[H_1]). Figure 21(c) shows the full integrand of (19), the difference between the joint densities.

We can pick R_0 however we want: we just say what region of x belongs to R_0, and the integral in (19) will integrate over it. Which x's should we include in the region? Should we include x which has a positive value of the integrand? Or should we include the parts of x which have a negative value of the integrand?

Solution: Select R_0 to be all x such that the integrand is negative. Then R_0 is the area in which

    f_{X|H_0}(x|H_0) P[H_0] > f_{X|H_1}(x|H_1) P[H_1]

If P[H_0] = P[H_1], then this is the region in which X is more probable given H_0 than given H_1.
Figure 21: The (a) conditional p.d.f.s (likelihood functions) f_{X|H_0}(x|H_0) and f_{X|H_1}(x|H_1), (b) joint p.d.f.s f_{X|H_0}(x|H_0) P[H_0] and f_{X|H_1}(x|H_1) P[H_1], and (c) difference between the joint p.d.f.s, f_{X|H_1}(x|H_1) P[H_1] − f_{X|H_0}(x|H_0) P[H_0], which is the integrand in (19).
Rearranging the terms, we decide H_0 whenever

    f_{X|H_1}(x|H_1) / f_{X|H_0}(x|H_0) < P[H_0] / P[H_1]    (20)

The left-hand side is called the likelihood ratio. The right-hand side is a threshold. Whenever x indicates that the likelihood ratio is less than the threshold, we'll decide H_0, i.e., that s_0(t) was sent. Otherwise, we'll decide H_1, i.e., that s_1(t) was sent. Equation (20) is a very general result, applicable no matter what conditional distributions x has.

14.4 Log-Likelihood Ratio

For the Gaussian distribution, the math gets much easier if we take the log of both sides. Why can we do this?
Solution: 1. Both sides are positive. 2. The log() function is strictly increasing.
Now, the log-likelihood ratio test (for deciding H_0) is

    log [ f_{X|H_1}(x|H_1) / f_{X|H_0}(x|H_0) ] < log ( P[H_0] / P[H_1] )

14.5 Case of a_0 = 0, a_1 = 1 in Gaussian noise

In this example, assume for a minute that a_0 = 0 and a_1 = 1, with n ~ N(0, σ_N^2). What is:
1. The log of a Gaussian pdf?
2. The log-likelihood ratio?
3. The decision regions for x?

Solution: What is the log of a Gaussian pdf?

    log f_{X|H_0}(x|H_0) = log [ (1/sqrt(2π σ_N^2)) e^(−x^2/(2σ_N^2)) ]
                         = −(1/2) log(2π σ_N^2) − x^2/(2σ_N^2)    (21)

The log f_{X|H_1}(x|H_1) term will be the same, but with (x − 1)^2 instead of x^2. Continuing with the log-likelihood ratio, we decide H_0 if

    log f_{X|H_1}(x|H_1) − log f_{X|H_0}(x|H_0) < log(P[H_0]/P[H_1])
    [x^2 − (x − 1)^2] / (2σ_N^2) < log(P[H_0]/P[H_1])
    x^2 − (x − 1)^2 < 2σ_N^2 log(P[H_0]/P[H_1])
    2x − 1 < 2σ_N^2 log(P[H_0]/P[H_1])
    x < 1/2 + σ_N^2 log(P[H_0]/P[H_1])
In the end result, there is a simple test for x. If it is below the decision threshold,

    x < 1/2 + σ_N^2 log(P[H_0]/P[H_1])

decide that s_0(t) was sent, i.e., decide H_0. If it is above the decision threshold,

    x > 1/2 + σ_N^2 log(P[H_0]/P[H_1])

decide that s_1(t) was sent, i.e., decide H_1. Rather than writing both inequalities each time, we use the following notation:

    x ≷_{H_0}^{H_1} 1/2 + σ_N^2 log(P[H_0]/P[H_1])

(decide H_1 if x is greater, H_0 if x is less). This completely describes the detector receiver. For simplicity, we also write

    x ≷_{H_0}^{H_1} γ,  where  γ = 1/2 + σ_N^2 log(P[H_0]/P[H_1])    (22)

14.6 General Case for Arbitrary Signals

If, instead of a_0 = 0 and a_1 = 1, we had arbitrary values for them (the signal space representations of s_0(t) and s_1(t)), we could have derived the result in the last section the same way. As long as a_0 < a_1, we'd still have

    x ≷_{H_0}^{H_1} γ,  but now  γ = (a_0 + a_1)/2 + (σ_N^2/(a_1 − a_0)) log(P[H_0]/P[H_1])    (23)

14.7 Equiprobable Special Case

If symbols are equally likely, P[H_1] = P[H_0], then P[H_0]/P[H_1] = 1 and the logarithm of the fraction is zero. So then

    γ = (a_0 + a_1)/2

The decision above says that if x is closer to a_0, decide that s_0(t) was sent; and if x is closer to a_1, decide that s_1(t) was sent. The boundary is exactly halfway in between the two signal space vectors. This receiver is also called a maximum likelihood detector, because we only decide which likelihood function is higher (neither is scaled by the prior probabilities P[H_0] or P[H_1]).

14.8 Examples

Example: When H_1 becomes less likely, which direction will the optimal threshold move, towards a_0 or towards a_1?
Solution: Towards a_1.

Example: Let a_0 = −1, a_1 = 1, σ_N^2 = 0.1, P[H_1] = 0.4, and P[H_0] = 0.6. What is the decision threshold for x?
Solution: From (23),

    γ = 0 + (0.1/2) log(0.6/0.4) = 0.05 log 1.5 ≈ 0.0203

Example: Can the decision threshold be higher than both a_0 and a_1 in this binary, one-dimensional signalling receiver?

Example: In this example, calculate the probability of error from (18), given all of the above constants and the optimal threshold γ. Starting from

    P[error] = P[x ∈ R_1 | H_0] P[H_0] + P[x ∈ R_0 | H_1] P[H_1]

we can use the decision regions in (22) to write

    P[error] = P[x > γ | H_0] P[H_0] + P[x < γ | H_1] P[H_1]

What is the first probability, given that x|H_0 is Gaussian with mean a_0 and variance σ_N^2? What is the second probability, given that x|H_1 is Gaussian with mean a_1 and variance σ_N^2? What is then the overall probability of error?
Solution:

    P[x > γ | H_0] = Q((γ − a_0)/σ_N)
    P[x < γ | H_1] = 1 − Q((γ − a_1)/σ_N) = Q((a_1 − γ)/σ_N)
    P[error] = P[H_0] Q((γ − a_0)/σ_N) + P[H_1] Q((a_1 − γ)/σ_N)

14.9 Review of Binary Detection

We did the following to prove some things about the optimal detector:
• We wrote the formula for the probability of error.
• We found the decision regions which minimized the probability of error.
• We used the log operator to show that, for the Gaussian error case, the decision regions are separated by a single threshold.
• We showed the formula for that threshold, both in the equiprobable symbol case and in the general case.
Given a_0, a_1, σ_N^2, P[H_1], and P[H_0], you should be able to calculate the optimal decision threshold γ.

Lecture 9

Today: (1) Finish Lecture 8 (2) Intro to M-ary PAM
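The worked threshold example, and the resulting probability of error, can be reproduced in a few lines. A Python sketch (the course uses Matlab) with the same constants a_0 = −1, a_1 = 1, σ_N^2 = 0.1, P[H_0] = 0.6, P[H_1] = 0.4:

```python
import math

# Reproduce the worked example using (23) and the Q-function expression.
def Q(y):
    return 0.5 - 0.5 * math.erf(y / math.sqrt(2.0))

a0, a1 = -1.0, 1.0
var_N = 0.1
p0, p1 = 0.6, 0.4

# Optimal threshold, equation (23)
gamma = (a0 + a1) / 2 + var_N / (a1 - a0) * math.log(p0 / p1)
print(gamma)  # ≈ 0.0203

# Overall probability of error
sigma_N = math.sqrt(var_N)
p_err = p0 * Q((gamma - a0) / sigma_N) + p1 * Q((a1 - gamma) / sigma_N)
print(p_err)  # small, since the signals are ~6 sigma apart
```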
Sample exams (2007-8) and solutions posted on WebCT. Of course, by putting these sample exams up, you know the actual exam won't contain the same exact problems. There is no guarantee this exam will be the same level of difficulty as either past exam.

Notes on 2008 sample exam:
• Problem 1(b) was not material covered this semester.
• Problem 3 is not on the topic of detection theory; it is on the probability topic.
• Problem 6 is on the topic of detection theory. There is a typo: f_N(n) should be f_N(t).
• Problem 6 typo: x_1(t) = (1/2)(3t^2 − 1) for −1 ≤ t ≤ 1, and x_1(t) = 0 o.w.

Notes on 2007 sample exam:
• Problem 2 should have said "with period Tsa" rather than "at rate Tsa". Also, the "x"s on the RHS of the given expressions should be "t"s.
• The book that year used ϕ_i(t) instead of φ_i(t) as the notation for a basis function, and α_i instead of s_i as the signal space vector for signal s_i(t).

15 Pulse Amplitude Modulation (PAM)

Def'n: Pulse Amplitude Modulation (PAM)
M-ary Pulse Amplitude Modulation is the use of M scaled versions of a single basis function p(t) = φ_0(t) as a signal set:

    s_i(t) = a_i p(t),  for i = 0, ..., M − 1

where a_i is the amplitude of waveform i. Note we're calling it p(t) instead of φ_0(t), both for simplicity and since there's only one basis function. Each sent waveform (a.k.a. "symbol") conveys k = log2 M bits of information.

15.1 Baseband Signal Examples

When you compose a signal of a number of symbols, you'll just add each time-delayed signal to the transmitted waveform. Instead of just sending one symbol, we are sending one symbol every Ts units of time. That makes our symbol rate 1/Ts symbols per second. The resulting signal, which we'll call s(t) (without any subscript), is

    s(t) = Σ_n a(n) p(t − n Ts)

where a(n) ∈ {a_0, ..., a_{M−1}} is the nth symbol transmitted, and Ts is the symbol period (not the sample period!).

Note: Keep in mind that when s(t) is to be transmitted on a bandpass channel, it is modulated with a carrier cos(2πfc t):

    x(t) = ℜ{ s(t) e^{j2πfc t} }
Rice Figures 5.2.3 and 5.2.6 show continuous-time and discrete-time realizations, respectively, of PAM transmitters and receivers. Rice Figure 5.2.1 shows the signal space diagram of M-ary PAM for different values of M.

Example: Binary Bipolar PAM
Figure 22 shows a binary PAM signal set using amplitudes a_0 = −A, a_1 = A. It shows a signal set using the square pulse

    p(t) = 1/sqrt(Ts), 0 ≤ t < Ts;  0, o.w.

Figure 22: Example signal for binary bipolar PAM example for A = sqrt(Ts).

Example: Binary Unipolar PAM
Unipolar binary PAM uses the amplitudes a_0 = 0, a_1 = A. This is also called On-Off Keying (OOK). If the y-axis in Figure 22 were scaled such that the minimum was 0 instead of −1, it would represent unipolar PAM. The average energy per symbol has decreased.

Example: 4-ary PAM
A 4-ary PAM signal set using amplitudes a_0 = −3A, a_1 = −A, a_2 = A, a_3 = 3A is shown in Figure 23. It shows a signal set using the same square pulse,

    φ_0(t) = p(t) = 1/sqrt(Ts), 0 ≤ t < Ts;  0, o.w.

Typical M-ary PAM (for all even M) uses the following amplitudes:

    −(M − 1)A, −(M − 3)A, ..., −A, A, ..., +(M − 3)A, +(M − 1)A

15.2 Average Bit Energy in M-ary PAM

Assuming that each symbol s_i(t) is equally likely to be sent, we want to calculate the average bit energy E_b, which is 1/log2 M times the symbol energy E_s:

    E_s = E[ ∫_{−∞}^{∞} a_i^2 p^2(t) dt ] = E[a_i^2] ∫_{−∞}^{∞} φ_0^2(t) dt = E[a_i^2]    (24)
Figure 23: Example signal for 4-ary PAM example.

Let {a_0, ..., a_{M−1}} = {−(M − 1)A, −(M − 3)A, ..., (M − 1)A}, as described above to be the typical M-ary PAM case. Then

    E[a_i^2] = (A^2/M) [ (1 − M)^2 + (3 − M)^2 + ... + (M − 1)^2 ]
             = (A^2/M) Σ_{m=1}^{M} (2m − 1 − M)^2
             = (A^2/M) · M(M^2 − 1)/3
             = A^2 (M^2 − 1)/3

So

    E_s = ((M^2 − 1)/3) A^2   and   E_b = ((M^2 − 1)/(3 log2 M)) A^2

Lecture 10

Today: Review for Exam 1

16 Topics for Exam 1

1. Fourier transform of signals, properties
2. Bandpass signals
3. Sampling and aliasing
4. Orthogonality / Orthonormality
5. Correlation and Covariance
6. Signal space representation
7. Correlation / matched filter receivers
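Returning to the average symbol energy derived above: the identity E[a_i^2] = A^2 (M^2 − 1)/3 can be checked by brute force over the amplitude set. A short Python sketch (the course's code is Matlab):

```python
import math

# Check E[a_i^2] = A^2 (M^2 - 1)/3 for the typical M-ary PAM amplitudes
# -(M-1)A, -(M-3)A, ..., (M-1)A, each equally likely.
def avg_symbol_energy(M, A):
    amps = [(2 * m - 1 - M) * A for m in range(1, M + 1)]
    return sum(a * a for a in amps) / M

for M in (2, 4, 8):
    A = 1.0
    Es = avg_symbol_energy(M, A)
    print(M, Es, (M * M - 1) / 3 * A * A)  # the two values agree
    print(M, Es / math.log2(M))            # corresponding Eb
```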
Make sure you know how to do each HW problem in HWs 1-3. It will be good to know the 'tricks' of how each problem is done, since they will reappear. If anything is confusing, ask me. Read over the lecture notes. You may have one side of an 8.5×11 sheet of paper for notes. You will be provided with Table 2.4 and Table 2.8.

17 Additional Problems

17.1 Spectrum of Communication Signals

1. Bateman Problem 1.4.
2. Haykin & Moyer Problem 2.8.
3. Haykin & Moyer Problem 2.9: An otherwise ideal mixer (multiplier) generates an internal DC offset which sums with the baseband signal prior to multiplication with the cos(2πfc t) carrier. How does this affect the spectrum of the output signal?
4. A signal x(t) of finite energy is applied to a square law device whose output y(t) is defined by y(t) = x^2(t). The spectrum of x(t) is bandlimited to −W ≤ ω ≤ W. Prove that the spectrum of y(t) is bandlimited to −2W ≤ ω ≤ 2W.
5. Consider the pulse shape

    x(t) = cos(πt/Ts), −Ts/2 < t < Ts/2;  0, o.w.

(a) Draw a plot of the pulse shape x(t).
(b) Find the Fourier transform of x(t).

17.2 Sampling and Aliasing

1. Assume that x(t) is a bandlimited signal with X(f) = 0 for |f| > W. When x(t) is sampled at a rate 1/T = 2W, its samples X_n are nonzero only for n = 0 and n = ±1 (X_n = 0, o.w.). What was the original continuous-time signal x(t)? Or, if x(t) cannot be determined, why not?

17.3 Orthogonality and Signal Space

1. (From L. Couch 2007, Problem 2-49.) Three functions are shown in Figure P2-49.
(a) Show that these functions are orthogonal over the interval (−4, 4).
(b) Find the scale factors needed to scale the functions to make them into an orthonormal set over the interval (−4, 4).
(c) Express the waveform

    w(t) = 1, 0 ≤ t ≤ 4;  0, o.w.

in signal space using the orthonormal set found in part (b).
17.4 Random Processes, PSD

1. (From Couch 2007, Problem 2-80.) An RC lowpass filter is the circuit shown in Figure 24. Its transfer function is

    H(f) = Y(f)/X(f) = 1/(1 + j2πRCf)

Given that the PSD of the input signal is flat and constant at 1, i.e., S_X(f) = 1, design an RC lowpass filter that will attenuate this signal by 20 dB at 15 kHz. That is, find the value of RC to satisfy the design specifications.

Figure 24: An RC lowpass filter.

17.5 Correlation / Matched Filter Receivers

1. Rice Example 5.1.1 on Page 219. Also Rice 5.13, 5.14, 5.16, 5.23, 5.26.
2. A binary communication system uses the following equally-likely signals:

    s_1(t) = cos(πt/T), −T/2 ≤ t ≤ T/2;  0, o.w.
    s_2(t) = −cos(πt/T), −T/2 ≤ t ≤ T/2;  0, o.w.

At the receiver, a signal x(t) = s_i(t) + n(t) is received.
(a) Is s_1(t) an energy-type or power-type signal?
(b) What is the energy or power (depending on the answer to (a)) in s_1(t)?
(c) Describe in words and a block diagram the operation of a correlation receiver.
3. Let a binary 1-D communication system be described by two possible signals, and noise with different variance depending on which signal is sent:

    H_0: r = a_0 + n_0
    H_1: r = a_1 + n_1

where a_0 = −1 and a_1 = 1, n_0 is zero-mean Gaussian with variance 0.1, and n_1 is zero-mean Gaussian with variance 0.3.
(a) What is the conditional pdf of r given H_0?
(b) What is the conditional pdf of r given H_1?
2. (Proakis & Salehi) Suppose that two signal waveforms s1(t) and s2(t) are orthogonal over the interval (0, T). A sample function n(t) of a zero-mean, white noise process is correlated with s1(t) and s2(t) to yield

n1 = ∫₀ᵀ s1(t) n(t) dt
n2 = ∫₀ᵀ s2(t) n(t) dt

Prove that E[n1 n2] = 0.

3. Proakis & Salehi 7.16. Prove that when a sinc pulse gT(t) = sin(πt/T)/(πt/T) is passed through its matched filter, the output is the same sinc pulse.

Lecture 11

Today: (1) M-ary PAM Intro from Lecture 9 notes, (2) Probability of Error in PAM

18 Probability of Error in Binary PAM

This is in Section 6.2 of Rice.

18.1 Signal Distance

We mentioned, when talking about signal space diagrams, a distance between vectors,

d_{i,j} = ‖a_i − a_j‖ = [ Σ_{k=1}^{N} (a_{i,k} − a_{j,k})² ]^{1/2}    (25)

For the 1-D, two signal case, there is only d_{1,0},

d_{1,0} = |a_1 − a_0|    (26)

18.2 BER Function of Distance, Noise PSD

We have consistently used a1 > a0. In the Gaussian noise case (with equal variance of noise in H0 and H1), and with P[H0] = P[H1], we came up with threshold γ = (a0 + a1)/2. We had also called the variance of the noise component nk as σN². How are N0/2 and σN² related? Recall from lecture 7 that the variance of nk came out to be N0/2. So:

σN² = N0/2

The probability of error is

P[error] = P[x > γ|H0] P[H0] + P[x < γ|H1] P[H1]
which simplifies, using the Q function, to

P[x > γ|H0] = Q( (γ − a0)/σN )
P[x < γ|H1] = 1 − Q( (γ − a1)/σN ) = Q( (a1 − γ)/σN )    (27)

Substituting γ = (a0 + a1)/2,

P[x > γ|H0] = Q( ((a1 − a0)/2) / σN )
P[x < γ|H1] = Q( ((a1 − a0)/2) / σN )    (28)

So both are equal, and again with P[H0] = P[H1] = 1/2,

P[error] = Q( ((a1 − a0)/2) / σN )    (29)

We'd assumed a1 > a0; if we had also calculated for the case a1 < a0, we'd have seen that in general, the following formula works:

P[error] = Q( |a1 − a0| / (2σN) )

Then, using (25) and (26),

P[error] = Q( d_{0,1} / (2√(N0/2)) ) = Q( √( d²_{0,1} / (2N0) ) )    (30)

This expression, (30), is one that we will see over and over again in this class.

18.3 Binary PAM Error Probabilities

For binary PAM (M = 2), it takes one symbol to encode one bit. Thus 'bit' and 'symbol' are interchangeable. This will be important as we talk about binary PAM. In signal space, our signals s0(t) = a0 p(t) and s1(t) = a1 p(t) are just s0 = a0 and s1 = a1. Now we can use (30) to compute the probability of error.

For bipolar signaling, when A0 = −A and A1 = A,

P[error] = Q( √( 4A² / (2N0) ) ) = Q( √( 2A² / N0 ) )

For unipolar signaling, when A0 = 0 and A1 = A,

P[error] = Q( √( A² / (2N0) ) )
However, this doesn't take into consideration that the unipolar signaling method uses only half of the energy to transmit. Now: what is the average energy of bipolar and unipolar signaling?

Solution: For the bipolar signalling, A²m equals A² always, so Eb = Es = A². For the unipolar signalling, A²m equals A² with probability 1/2 and zero with probability 1/2; since half of the bits are exactly zero, they take no signal energy. Thus Eb = Es = A²/2.

Now, we rewrite the probability of error expressions in terms of Eb. Using (24), for bipolar signaling (A² = Eb),

P[error] = Q( √( 2Eb / N0 ) )

For unipolar signaling (A² = 2Eb),

P[error] = Q( √( 2Eb / (2N0) ) ) = Q( √( Eb / N0 ) )

These probabilities of error are shown in Figure 25.

Figure 25: Theoretical probability of error in binary PAM signalling, unipolar vs. bipolar, as a function of the Eb/N0 ratio (dB).

Discussion:

• Even for the same bit energy Eb, the bit error rate of bipolar PAM beats that of unipolar PAM.
• What is the difference in Eb/N0 for a constant probability of error?
• What is the difference in terms of probability of error for a given Eb/N0?

Lecture 12

Today: (0) Exam 1 Return, (1) Multi-Hypothesis Detection Theory, (2) Probability of Error in M-ary PAM
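These two expressions are easy to check numerically. Below is a short sketch (the function names are mine, not from the notes) using the standard identity Q(x) = (1/2) erfc(x/√2):

```python
import math

def qfunc(x):
    # Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber_bipolar(ebn0):
    # bipolar (antipodal) binary PAM: P[error] = Q(sqrt(2 Eb/N0))
    return qfunc(math.sqrt(2.0 * ebn0))

def ber_unipolar(ebn0):
    # unipolar (on-off) binary PAM: P[error] = Q(sqrt(Eb/N0))
    return qfunc(math.sqrt(ebn0))

for db in (0, 4, 8, 12):
    ebn0 = 10.0 ** (db / 10.0)
    print(db, ber_bipolar(ebn0), ber_unipolar(ebn0))
```

Note that ber_unipolar at twice the Eb/N0 equals ber_bipolar, which is exactly the 3 dB gap between the two curves in Figure 25.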
ECE 5520 Fall 2009 66

19 Detection with Multiple Symbols

When we only had s0(t) and s1(t), we saw that our decision was based on which of the following was higher:

f_{X|H0}(x|H0) P[H0]  or  f_{X|H1}(x|H1) P[H1]

For M-ary signals, we'll see that we need to consider the highest of all M possible events Hm. The hypotheses are

H0 : r(t) = s0(t) + n(t)
H1 : r(t) = s1(t) + n(t)
· · ·
HM−1 : r(t) = s_{M−1}(t) + n(t)

which have joint probabilities,

H0 : f_{X|H0}(x|H0) P[H0]
H1 : f_{X|H1}(x|H1) P[H1]
· · ·
HM−1 : f_{X|HM−1}(x|HM−1) P[HM−1]

We will find which of these joint probabilities is highest. If P[H0] = · · · = P[HM−1], then we only need to find the i that makes the likelihood f_{X|Hi}(x|Hi) maximum: maximum likelihood detection. (While equiprobable signals is sometimes not the case for M = 2 binary detection, it is very rare in higher-M communication systems.) For this class, we'll only consider the case of equally probable signals. So, the decision is,

Symbol Decision = argmax_i f_{X|Hi}(x|Hi)

In practice, it is better to maximize the log of the likelihood rather than the likelihood directly. For Gaussian (conditional) r.v.s with equal variances σN²,

log f_{X|Hi}(x|Hi) = −(1/2) log(2πσN²) − (x − ai)² / (2σN²)

This is maximized when (x − ai)² is minimized. Essentially, this is a (squared) distance between x and ai. So: find the ai which is closest to x.

The decision between two neighboring signal vectors ai and ai+1 is a threshold test on

γ_{i,i+1} = (ai + ai+1)/2

deciding H_{i+1} if r > γ_{i,i+1} and Hi if r < γ_{i,i+1}. As an example: 4-ary PAM. See Figure 26.

Figure 26: Signal space representation and decision thresholds (γ12, γ23, γ34) for 4-ary PAM with signal points a1, a2, a3, a4.
20 M-ary PAM Probability of Error

20.1 Symbol Error

The probability that we don't get the symbol correct is the probability that x does not fall within the range between the thresholds in which it belongs. Assume neighboring symbols ai are spaced by 2A; then the decision threshold is always A away from the symbol values. Also, the noise standard deviation is σN = √(N0/2).

For the symbols i in the 'middle',

P(symbol error|Hi) = 2 Q( A / √(N0/2) )

For the symbols i on the 'sides',

P(symbol error|Hi) = Q( A / √(N0/2) )

So overall,

P(symbol error) = (2(M−1)/M) Q( A / √(N0/2) )

20.1.1 Symbol Error Rate and Average Bit Energy

How does this relate to the average bit energy Eb? From last lecture,

Es = ((M²−1)/3) A²,  so  Eb = Es / log2 M = ((M²−1)/(3 log2 M)) A²

which means that

A = √( (3 log2 M / (M²−1)) Eb )

So

P(symbol error) = (2(M−1)/M) Q( √( (6 log2 M / (M²−1)) (Eb/N0) ) )    (31)

Equation (31) is plotted in Figure 27.

Figure 27: Theoretical probability of symbol error in M-ary PAM (M = 2, 4, 8, 16) vs. the Eb/N0 ratio (dB).
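Equation (31) is easy to evaluate numerically; this is how curves like Figure 27 are generated. A minimal sketch (the function names are mine):

```python
import math

def qfunc(x):
    # Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2))

def pam_ser(M, ebn0):
    # Eq. (31): P[symbol error] = 2(M-1)/M * Q(sqrt(6 log2(M)/(M^2-1) * Eb/N0))
    k = math.log2(M)
    return 2.0 * (M - 1) / M * qfunc(math.sqrt(6.0 * k / (M ** 2 - 1) * ebn0))
```

For M = 2 this reduces to Q(√(2 Eb/N0)), the bipolar binary PAM result; for a fixed Eb/N0 the symbol error rate grows with M, as in Figure 27.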
20.2 Bit Errors and Gray Encoding

For binary PAM, there are only two symbols; one will be assigned binary 0 and the other binary 1. When you make one symbol error (decide H0 or H1 in error), it will cause one bit error. For M-ary PAM, bits and symbols are not synonymous. Instead, we have to carefully assign bit codes to symbols 1 . . . M.

Example: Bit coding of M = 4 symbols

The two options shown in Figure 28 both assign 2 bits to each symbol in unique ways. Is one better or worse than the other?

Figure 28: Two options for assigning bits to symbols. Option 1: "00" "11" "01" "10". Option 2: "00" "01" "11" "10".

Solution: One will lead to a higher bit error rate than the other. The key is to recall the model for noise: it will not shift a signal uniformly across symbols. It will tend to leave the received signal x close to the original signal ai, so the neighbors of ai will be more likely than distant signal space vectors. At least at high Eb/N0, if neighboring symbols are assigned bit codes differing in only one bit, as in Option 2 in Figure 28, the errors will tend to be just one bit flipped. Thus Gray encoding will only change one bit across boundaries.

Example: Bit coding of M = 8 symbols

Assign three bits to each symbol such that any two nearest neighbors are different in only one bit (Gray encoding).

Solution: Here is one solution: "110" "111" "101" "100" "000" "001" "011" "010".

Figure 29: Gray encoding for 8-ary PAM.

20.2.1 Bit Error Probabilities

How many bit errors are caused by a symbol error in M-ary PAM?

• One, in almost all cases, if Gray encoding is used.
• Maybe more – up to log2 M in the worst case. To see how much more, we need to study further the probability that x will jump more than one decision region.

Then,

P[error] ≈ (1 / log2 M) P[symbol error]    (32)

We will show that this approximation is very good. As of now, if you have to estimate a bit error rate for M-ary PAM, use (32). We will also show examples in multi-dimensional signalling (e.g., QAM) when this is not a good approximation.
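One standard way to generate a labeling with this property is the binary-reflected Gray code, g(i) = i XOR (i >> 1); this is a common construction, not necessarily the exact assignment drawn in Figure 29. A sketch that also verifies the one-bit-apart property:

```python
def gray(i):
    # binary-reflected Gray code: consecutive integers differ in one bit
    return i ^ (i >> 1)

def one_bit_apart(labels):
    # True if every pair of adjacent symbol labels differs in exactly one bit
    return all(bin(a ^ b).count("1") == 1 for a, b in zip(labels, labels[1:]))

codes = [gray(i) for i in range(8)]  # a left-to-right labeling for 8-ary PAM
```

The natural binary labeling [0, 1, 2, 3, ...] fails the check (e.g., 1 and 2 differ in two bits), which is exactly why it gives a higher bit error rate.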
Lecture 13

Today: (1) Pulse Shaping / ISI, (2) N-Dim Detection Theory

21 Intersymbol Interference

Consider two extremes for the 1-D PAM pulse shape:

1. The spectrum of the ideal 1-D PAM system with a square pulse: it occupies too much spectrum.
2. The time domain of a signal which is close to a rect function in the frequency domain: it occupies too much time.

We don't want either: (1) occupies too much spectrum, and (2) occupies too much time. If we try (1) and use FDMA (frequency division multiple access), then the interference is out-of-band interference. If we try (2) and put symbols right next to each other in time, our own symbols experience interference called intersymbol interference. Essentially, we want to compromise between (1) and (2) and experience only a small amount of both.

Filters will be seen in:

1. Transmitter: Baseband filters come from the limited bandwidth and speed of digital circuitry. RF filters are used to limit the spectral output of the signal (to meet out-of-band interference requirements).
2. Channel: Wired channels have (non-ideal) bandwidth – the attenuation of a cable or twisted pair is a function of frequency. Waveguides have bandwidth limits (and modes). The wideband radio channel is a sum of time-delayed impulses.
3. Receiver: Bandpass filters are used to reduce interference and the noise power entering the receiver.

Note that the 'matched filter' receiver is also a filter!

21.1 Multipath Radio Channel

(Background knowledge.) The measured radio channel is a filter; call its impulse response h(τ, t). If s(t) is transmitted, then r(t) = s(t) ⋆ h(τ, t) will be received. Typically, the radio channel is modeled as

h(τ, t) = Σ_{l=1}^{L} α_l e^{jφ_l} δ(τ − τ_l)    (33)

where α_l and φ_l are the amplitude and phase of the l-th multipath component, and τ_l is its time delay. Essentially, each reflection, diffraction, or scattering adds an additional impulse to (33) with a particular time delay (depending on the extra path length compared to the line-of-sight path) and complex amplitude (depending on the losses and phase shifts experienced along the path). The amplitudes of the impulses tend to decay over time, but multipath with delays of hundreds of nanoseconds often exists in indoor links, and delays of several microseconds often exist in outdoor links.
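The tapped-delay-line model (33) can be sketched in discrete time: each path contributes one complex tap. The three (amplitude, phase, delay-in-samples) triples below are hypothetical values chosen only for illustration:

```python
import numpy as np

# hypothetical multipath components: (alpha_l, phi_l in rad, tau_l in samples)
paths = [(1.0, 0.0, 0), (0.4, 1.2, 3), (0.2, -0.8, 7)]

h = np.zeros(8, dtype=complex)
for alpha, phi, delay in paths:
    h[delay] += alpha * np.exp(1j * phi)   # alpha_l * e^{j phi_l} at delay tau_l

s = np.ones(4)            # a crude rectangular transmitted pulse
r = np.convolve(s, h)     # received signal: a sum of delayed, scaled echoes
```

With a symbol much shorter than the largest delay, the echoes of one symbol land on top of later symbols – which is exactly the intersymbol interference problem above.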
The channel in (33) has infinite bandwidth (why?), so it isn't really what we'd see if we looked just at a narrow band of the channel.

Example: 2400–2480 MHz Measured Radio Channel

The 'delay profile' is shown in Figure 30 for an example radio channel. Actually, what we observe is

PDP(t) = r(t) ⋆ x(t) = (x(t) ⋆ h(t)) ⋆ x(t)
PDP(t) = R_X(t) ⋆ h(t)
PDP(f) = H(f) S_X(f)    (34)

Figure 30: Multipath delay profiles (amplitude vs. delay, ns) measured on two example links: (a) 13 to 43, and (b) 14 to 43.

Note that 200 ns is 0.2 µs, which is 1/(5 MHz). At this symbol rate, each symbol would fully bleed into the next symbol. How about for a 5 kHz symbol rate? Why or why not?

Other solutions:

• OFDM (Orthogonal Frequency Division Multiplexing)
• Direct-Sequence Spread Spectrum / CDMA
• Adaptive Equalization

21.2 Transmitter and Receiver Filters

Example: Bipolar PAM Signal Input to a Filter

How does this affect the correlation receiver and detector? Symbols sent over time are no longer orthogonal – one symbol 'leaks' into the next (or beyond).

22 Nyquist Filtering

Key insight:

• We don't need the leakage to be zero at all times.
• The value out of the matched filter (correlator) is taken only at times nTs, for integer n.

Nyquist filtering has the objective of making the sum of 'leakage' or ISI from one symbol be exactly 0, at the matched filter output, at all other sampling times nTs.

In more mathematical language, we need p(t) such that {φn(t) = p(t − nTs)}, n = . . . , −1, 0, 1, . . ., forms an orthonormal basis. Basically, what we need are pulse shapes (waveforms) that, when shifted in time by Ts, are orthogonal to each other, i.e., have an inner product of zero. Where have you seen this condition before? This condition, that there is an inner product of zero, is the condition for orthogonality of two waveforms.

Theorem: Nyquist Filtering

A necessary and sufficient condition for x(t) to satisfy

x(nTs) = { 1, n = 0;  0, o.w. }

is that its Fourier transform X(f) satisfy

Σ_{m=−∞}^{∞} X(f + m/Ts) = Ts

Proof: on page 833, Appendix A, of the Rice book.

X(f) could be:

• X(f) = Ts rect(f Ts), i.e., exactly constant within the −1/(2Ts) < f < 1/(2Ts) band and zero outside.
• It may bleed into the next 'channel', but then the sum · · · + X(f − 1/Ts) + X(f) + X(f + 1/Ts) + · · · must be constant across all f.

If X(f) only bleeds into one neighboring channel (that is, X(f) = 0 for all |f| > 1/Ts), then we can denote the difference between the ideal rect function and our X(f) as

∆(f) = X(f) − Ts rect(f Ts)

and rewrite the Nyquist filtering condition as

∆(1/(2Ts) − f) = −∆(1/(2Ts) + f),   for −1/Ts ≤ f < 1/Ts

that is, X(f) is allowed to bleed into other frequency bands, but the transition must be symmetric about f = 1/(2Ts) so that the neighboring frequency-shifted copy of X(f) compensates and the sum is constant. See Figure 31. The square-root raised cosine, shown in Figure 32, accomplishes this. But lots of other functions also accomplish this; the Nyquist filtering theorem provides an easy means to come up with others.
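We can verify the time-domain side of the theorem numerically for the simplest case: x(t) = sinc(t/Ts), whose spectrum is the ideal rect and therefore satisfies the folded-spectrum condition. Its samples at t = nTs should be 1 at n = 0 and 0 everywhere else (a sketch, not from the notes):

```python
import numpy as np

Ts = 1.0
n = np.arange(-20, 21)

# x(t) = sinc(t/Ts) has X(f) = Ts * rect(f Ts), the simplest Nyquist spectrum;
# the theorem then requires x(n Ts) = 1 at n = 0 and 0 elsewhere (zero ISI).
samples = np.sinc(n * Ts / Ts)        # np.sinc(u) = sin(pi u)/(pi u)
expected = (n == 0).astype(float)
print(np.allclose(samples, expected))   # True
```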
Figure 31: The Nyquist filtering criterion – 1/Ts frequency-shifted copies of X(f) must add up to a constant.

Bateman presents this whole condition as: "A Nyquist channel response is characterized by the transfer function having a transition band that is symmetrical about a frequency equal to 0.5 × 1/Ts."

Activity: Do-it-yourself Nyquist filter. Take a sheet of paper and fold it in half, and then in half the other direction. Cut a function in the thickest side (the edge that you just folded). Leave a tiny bit so that it is not completely disconnected into two halves. Unfold. Drawing a horizontal line for the frequency f axis, the middle of the paper is 0.5/Ts, and the vertical axis is |X(f)|. In other words, you have produced a zero-ISI filter.

22.1 Raised Cosine Filtering

H_RC(f) = { Ts,  0 ≤ |f| ≤ (1−α)/(2Ts);
            (Ts/2) [1 + cos( (πTs/α) (|f| − (1−α)/(2Ts)) )],  (1−α)/(2Ts) ≤ |f| ≤ (1+α)/(2Ts);
            0,  o.w. }    (35)

22.2 Square-Root Raised Cosine Filtering

But we usually need to design a system with two identical filters, one at the transmitter and one at the receiver, which in series produce a zero-ISI filter. That is, we need |H(f)|² to meet the Nyquist filtering condition, so each filter is the square root of (35):

H_RRC(f) = { √Ts,  0 ≤ |f| ≤ (1−α)/(2Ts);
             √( (Ts/2) [1 + cos( (πTs/α) (|f| − (1−α)/(2Ts)) )] ),  (1−α)/(2Ts) ≤ |f| ≤ (1+α)/(2Ts);
             0,  o.w. }    (36)

23 M-ary Detection Theory in N-dimensional Signal Space

We are going to start to talk about QAM, a modulation with two-dimensional signal vectors, and later even higher-dimensional signal vectors. We have developed M-ary detection theory for 1-D signal vectors, and now we will extend that to N-dimensional signal vectors.

Setup:

• Transmit: one of M possible symbols, a0, . . . , aM−1.
Figure 32: Two SRRC pulses (basis functions φi(t) vs. time t/Ts), offset in time by a multiple of Ts.

• Receive: the symbol plus noise:

H0 : X = a0 + n
· · ·
HM−1 : X = aM−1 + n

• Assume: n is multivariate Gaussian; each component ni is independent with zero mean and variance σN² = N0/2.
• Assume: symbols are equally likely.
• Question: What are the optimal decision regions?

When symbols are equally likely, the optimal decision turns out to be given by the maximum likelihood receiver,

î = argmax_i log f_{X|Hi}(x|Hi)    (37)

Here,

f_{X|Hi}(x|Hi) = (2πσN²)^{−N/2} exp( − Σ_j (x_j − a_{i,j})² / (2σN²) )

which can also be written as

f_{X|Hi}(x|Hi) = (2πσN²)^{−N/2} exp( − ‖x − ai‖² / (2σN²) )

So when we want to solve (37) we can simplify quickly to:

î = argmax_i [ log (2πσN²)^{−N/2} − ‖x − ai‖² / (2σN²) ]
î = argmax_i − ‖x − ai‖² / (2σN²)
î = argmin_i ‖x − ai‖²
î = argmin_i ‖x − ai‖    (38)
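The decision rule (38) is one line of code: pick the constellation point nearest (in Euclidean distance) to the received vector. A sketch (the constellation values are hypothetical):

```python
import numpy as np

def ml_detect(x, constellation):
    # Eq. (38): with equal priors and white Gaussian noise, decide the
    # signal-space point closest to the received vector x.
    d = np.linalg.norm(constellation - x, axis=1)
    return int(np.argmin(d))

# a hypothetical 2-D (QPSK-like) constellation
pts = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])
print(ml_detect(np.array([0.9, 1.2]), pts))   # 0: closest to [1, 1]
```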
Again: just find the ai in the signal space diagram which is closest to x.

Pairwise Comparisons

When is x closer to ai than to aj, for some other signal space point j?

Solution: The two decision regions are separated by a straight line. (Note: replace "line" with plane in 3-D, or subspace in N-D.) To find this line:

1. Draw a line segment connecting ai and aj.
2. Draw a point in the middle of that line segment.
3. Draw the perpendicular bisector of the line segment through that point.

Why? Solution: Try to find the locus of points x which satisfy

‖x − ai‖² = ‖x − aj‖²

You can do this by using the inner product to represent the magnitude squared operator:

(x − ai)ᵀ(x − ai) = (x − aj)ᵀ(x − aj)

Then use FOIL (multiply out), cancel, and reorganize to find a linear equation in terms of x. This is left as an exercise.

Decision Regions

Each pairwise comparison results in a linear division of space. The combined decision region Ri is the space which is the intersection of all pairwise decision regions. (All conditions must be satisfied.)

Example: Optimal Decision Regions

See Figure 33. Draw the optimal decision regions.

Lecture 14

Today: (1) QAM & PSK, (2) QAM Probability of Error

24 Quadrature Amplitude Modulation (QAM)

Quadrature Amplitude Modulation (QAM) is a two-dimensional signalling method which uses the in-phase and quadrature components (cosine and sine waves, respectively) as the two dimensions. Thus QAM uses two basis functions:

φ0(t) = √2 p(t) cos(ω0 t)
φ1(t) = √2 p(t) sin(ω0 t)

where p(t) is a pulse shape (like the ones we've looked at previously) with support on T1 ≤ t ≤ T2; that is, p(t) is only non-zero within that window. This definition of the orthonormal basis specifically considers a pulse shape at a frequency ω0. Previously, we've separated frequency up-conversion and down-conversion from pulse shaping. We include it here because
Figure 33: Example signal space diagrams, (a)–(d).
it is critical to see how, with the same pulse shape p(t), we can have two orthogonal basis functions. (This is not intuitive!) We have two restrictions on p(t) that make these two basis functions orthonormal:

• p(t) is unit-energy.
• p(t) is 'low pass'; that is, it has low frequency content compared to ω0 t.

24.1 Showing Orthogonality

Let's show that the basis functions are orthogonal. We'll need these cosine / sine function relationships:

sin(2A) = 2 cos A sin A
cos A − cos B = −2 sin( (A+B)/2 ) sin( (A−B)/2 )

First, see how far we can get before plugging in for p(t):

⟨φ0(t), φ1(t)⟩ = ∫_{T1}^{T2} √2 p(t) cos(ω0 t) · √2 p(t) sin(ω0 t) dt
= 2 ∫_{T1}^{T2} p²(t) cos(ω0 t) sin(ω0 t) dt
= ∫_{T1}^{T2} p²(t) sin(2ω0 t) dt

As a first example, consider the constant pulse p(t) = c, for T1 ≤ t ≤ T2. Are the two bases orthogonal? Show that

⟨φ0(t), φ1(t)⟩ = c² sin(ω0(T2 + T1)) sin(ω0(T2 − T1)) / ω0

Solution: Let p(t) = c:

⟨φ0(t), φ1(t)⟩ = c² ∫_{T1}^{T2} sin(2ω0 t) dt
= c² [ −cos(2ω0 t) / (2ω0) ]_{T1}^{T2}
= −c² ( cos(2ω0 T2) − cos(2ω0 T1) ) / (2ω0)

I want the answer in terms of T2 − T1 (for reasons explained below); applying the second identity above with A = 2ω0T2 and B = 2ω0T1,

= c² sin(ω0(T2 + T1)) sin(ω0(T2 − T1)) / ω0

We can see that there are two cases:
1. The pulse duration is an integer number of half-periods, that is, ω0(T2 − T1) = πk for some integer k. Then the right-hand sin(·) factor is zero, and so the correlation is exactly zero.

2. Otherwise, the numerator is bounded above and below by +1 and −1, so the inner product is bounded on either side:

−c²/ω0 ≤ ⟨φ0(t), φ1(t)⟩ ≤ c²/ω0

which for very large ω0 is nearly zero. Typically, ω0 is a large number – carrier frequencies could be in the MHz or GHz ranges – so when we divide by numbers on the order of 10⁶ or 10⁹, we end up with approximately zero. For engineering purposes, φ0(t) and φ1(t) are orthogonal.

Finally, we can attempt the proof for the case of arbitrary pulse shape p(t). Here, we use the 'low-pass' assumption: that the maximum frequency content of p(t) is much lower than 2πω0. This assumption allows us to assume that p(t) is nearly constant over the course of one cycle of the carrier sinusoid. This is well-illustrated in Figure 5.15 in the Rice book (page 298): there, we see a pulse modulated by a sine wave, and zooming in on any few cycles, you can see that the pulse p²(t) is largely constant across each cycle.

Proof? Solution: How many cycles are there? Consider the period [T1, T2]. How many times can it be divided by π/ω0? Let the integer number be L = ⌊(T2 − T1) ω0 / π⌋. Then cycle i occupies [T1 + (i−1)π/ω0, T1 + iπ/ω0], for i = 1, . . . , L. Within each of these cycles, assume p²(t) is nearly constant. Then, because the other factor is a sinusoid, integrating p²(t) sin(2ω0 t) across one full cycle gives zero, in the same way that the earlier integral was zero over full periods. The only remainder is the partial cycle [T1 + Lπ/ω0, T2]. Then,

⟨φ0(t), φ1(t)⟩ ≈ p²(T1 + Lπ/ω0) ∫_{T1+Lπ/ω0}^{T2} sin(2ω0 t) dt
= −p²(T1 + Lπ/ω0) [ cos(2ω0 T2) − cos(2ω0 T1 + 2πL) ] / (2ω0)
= p²(T1 + Lπ/ω0) sin(ω0(T2 + T1)) sin(ω0(T2 − T1)) / ω0

Again, the inner product is bounded on either side:

−p²(T1 + Lπ/ω0)/ω0 ≤ ⟨φ0(t), φ1(t)⟩ ≤ p²(T1 + Lπ/ω0)/ω0

which, for very large ω0, is nearly zero.
For example, when ω0(T2 − T1) = πk for some integer k, the inner product is exactly zero, in the same way that the earlier integral was zero; otherwise, dividing by a very large ω0, we're going to get a very small inner product. Thus φ0(t) and φ1(t) are orthogonal.

24.2 Constellation

With these two basis functions, M-ary QAM is defined as an arbitrary signal set a0, . . . , aM−1, where each signal space vector ak is two-dimensional:

ak = [ a_{k,0}, a_{k,1} ]ᵀ
The signal corresponding to symbol k in (M-ary) QAM is thus

s(t) = a_{k,0} φ0(t) + a_{k,1} φ1(t)
= √2 p(t) [ a_{k,0} cos(ω0 t) + a_{k,1} sin(ω0 t) ]

Note that we could also write the signal s(t) as

s(t) = √2 p(t) R{ e^{−jω0 t} ( a_{k,0} + j a_{k,1} ) }    (39)

In many textbooks, you will see a QAM signal written in the shorthand

s_CB(t) = p(t) ( a_{k,0} + j a_{k,1} )

This is called 'complex baseband'. If you do the following operation you can recover the real signal s(t):

s(t) = √2 R{ e^{−jω0 t} s_CB(t) }

You will not be responsible for complex baseband notation, but you should be able to read other books and know what they're talking about.

Then, we can see that the signal space representation ak is given by

ak = [ a_{k,0}, a_{k,1} ]ᵀ   for k = 0, . . . , M − 1

24.3 Signal Constellations

• See Figure 5.17 of Rice for examples of square QAM. These constellations use M = 2^a for some even integer a, and arrange the points in a grid. One such diagram, for M = 64 square QAM, is also given here in Figure 34.

Figure 34: Square signal constellation for 64-QAM.

• Figure 5.18 of Rice shows examples of constellations which use M = 2^a for some odd integer a. These are either rectangular grids, or use squares with the corners cut out.
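A square M-QAM grid like Figure 34 can be generated programmatically. A sketch, using the common odd-integer grid convention (±1, ±3, . . . per dimension – a convenient choice, with any amplitude scaling A applied separately); on this grid the average symbol energy works out to 2(M−1)/3:

```python
import numpy as np

def square_qam(M):
    # Square M-QAM points on the odd-integer grid; assumes M is a
    # perfect square (4, 16, 64, ...).
    m = int(round(np.sqrt(M)))
    assert m * m == M
    level = np.arange(-(m - 1), m, 2)        # e.g. [-7, -5, ..., 5, 7] for m = 8
    I, Q = np.meshgrid(level, level)
    return np.stack([I.ravel(), Q.ravel()], axis=1).astype(float)

pts = square_qam(64)
es = np.mean(np.sum(pts ** 2, axis=1))   # average symbol energy: 2(M-1)/3 = 42
```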
24.4 Angle and Magnitude Representation

You can plot ak in signal space and see that it has a magnitude (distance from the origin) of

|ak| = √( a²_{k,0} + a²_{k,1} )

and angle of

∠ak = tan⁻¹( a_{k,1} / a_{k,0} )

In the continuous-time signal s(t) this is

s(t) = √2 p(t) |ak| cos(ω0 t + ∠ak)

24.5 Average Energy in M-QAM

Recall that the average energy is calculated as:

Es = (1/M) Σ_{i=0}^{M−1} ‖ai‖²
Eb = (1/(M log2 M)) Σ_{i=0}^{M−1} ‖ai‖²

where Es is the average energy per symbol and Eb is the average energy per bit. We'll work in class some examples of finding Eb in different constellation diagrams.

24.6 Phase-Shift Keying

Some implementations of QAM limit the constellation to include only signal space vectors with equal magnitude, i.e.,

|a0| = |a1| = · · · = |aM−1|

The points ai for i = 0, . . . , M − 1 are uniformly spaced on the unit circle. Some examples are shown in Figure 35.

BPSK

For example, binary phase-shift keying (BPSK) is the case of M = 2. Thus BPSK is the same as bipolar (2-ary) PAM. Note that the rotation of the signal space diagram doesn't matter, so both 'versions' are identical in concept (although they would be slightly different implementations). What is the probability of error in BPSK? The same as in bipolar PAM, for equally probable symbols,

P[error] = Q( √( 2Eb / N0 ) )

QPSK

M = 4 PSK is also called quadrature phase shift keying (QPSK), and is shown in Figure 35(a). Note how QPSK is the same as M = 4 square QAM.

24.7 Systems which use QAM

See [Couch 2007], wikipedia, and the Rice book:

• Digital microwave relay, various manufacturer-specific protocols: 6 GHz and 11 GHz.
• Dial-up modems: use an M = 16 or M = 8 QAM constellation.
• DSL. G.DMT uses multicarrier (up to 256 carriers) methods (OFDM), and on each narrowband (4.3 kHz) carrier, it can send up to 2^15 QAM (32,768 QAM). G.Lite uses up to 128 carriers, each with up to 2^8 = 256 QAM.
• Cable modems. Downstream: 6 MHz bandwidth channel, with 64 QAM or 256 QAM. Upstream: QPSK or 16 QAM.
• 802.11a and 802.11g: Adaptive modulation method, uses up to 64 QAM.
• Digital Video Broadcast (DVB): APSK used in ETSI standard.

Figure 35: Signal constellations for (a) M = 4 PSK and (b) M = 8 and M = 16 PSK.

25 QAM Probability of Error

25.1 Overview of Future Discussions on QAM

Before we get into details about the probabilities of error, you should realize that there are two main considerations when a constellation is designed for a particular M:

1. Points with equal distances separating them and their neighboring points tend to reduce probability of error.
2. The highest-magnitude points (furthest from the origin) strongly (negatively) impact average bit energy.

There are also some practical considerations that we will discuss:

1. Some constellations are more difficult (energy-consuming) to amplify at the transmitter.
2. Some constellations have more complicated decision regions which require more computation to implement in a receiver.
25.2 Options for Probability of Error Expressions

In order of preference:

1. Exact formula. In a few cases, the probability of error is analytically tractable, and there is an exact expression for P[error] in an AWGN environment.

2. Union bound. This is a provable upper bound on the probability of error. It is not an approximation. It can be used for "worst case" analysis, which is often very useful for engineering design of systems.

3. Nearest-neighbor approximation. This is a way to get a solution that is analytically easier to handle. Typically these approximations are good at high Eb/N0.

25.3 Exact Error Analysis

Exact probability of error formulas in N-dimensional modulations can be very difficult to find. This is because our decision regions are more complex than one threshold test. They require an integration of an N-D Gaussian pdf across an area. We needed a Q function to get a tail probability for a 1-D Gaussian pdf; now we need not just a tail probability, but more like a section probability, which is the volume under some part of the N-dimensional Gaussian pdf.

For example, consider M-ary PSK. Essentially, we must calculate the probability of symbol error as one minus the probability that r stays in the sector within ±π/M of the correct angle φi,

P(symbol error) = 1 − ∫_{r∈Ri} (1/(2πσ²)) e^{−‖r − ai‖²/(2σ²)} dr    (40)

This integral is a double integral, and we don't generally have any exact expression to use to express the result in general.

Figure 36: Signal space diagram for M-ary PSK for M = 8 and M = 16.

25.4 Probability of Error in QPSK

In QPSK, the noise contribution to each element of the received signal vector x is independent. Consider the QPSK constellation diagram; you have already calculated the decision regions for each symbol. The first bit decision is made using only one dimension of x, specifically x1; similarly, the second bit decision is made using only x2. The decisions are decoupled – x2 has no impact on the decision about bit one, and x1 has no impact on the decision on bit two. Since we know the bit error probability for each bit decision (it is the same as bipolar PAM), we can see that the bit error probability is also

P[error] = Q( √( 2Eb / N0 ) )    (41)

This is an extraordinary result – the bit rate will double in QPSK, but in theory, the bit error rate does not increase. As we will show later, the bandwidth of QPSK is identical to that of BPSK.

So the probability of symbol error is one minus the probability that there were no errors in either bit,

P[symbol error] = 1 − ( 1 − Q(√(2Eb/N0)) )²    (45)

25.5 Union Bound

From 5510 or an equivalent class (or a Venn diagram) you may recall the probability formula, that for two events E and F,

P[E ∪ F] = P[E] + P[F] − P[E ∩ F]    (42)

You can prove this from the three axioms of probability. From (42) and the first axiom of probability, we have that

P[E ∪ F] ≤ P[E] + P[F]

(This holds for any events E and F!) Then, it is straightforward to show that for any list of sets E1, E2, . . . , En,

P[ ∪_{i=1}^{n} Ei ] ≤ Σ_{i=1}^{n} P[Ei]    (43)

This is called the union bound, and it is very useful across communications. If you know one inequality, know this one. The union bound gives a conservative estimate; it is useful when the overlaps Ei ∩ Ej are small but difficult to calculate. In this class, our events are typically decision events: the event Ei may represent the event that we decide Hi when some other symbol not equal to i was actually sent. In this case,

P[symbol error] ≤ Σ_{i=1}^{n} P[Ei]    (44)

25.6 Application of Union Bound

Example: QPSK

First, let's calculate the union bound on the probability of error for QPSK, using the above formula. Assume s1(t) is sent, as shown in Figure 37(a).
We define the error events as:

• E2, the event that r falls on the wrong side of the decision boundary with a2.
• E0, the event that r falls on the wrong side of the decision boundary with a0.

Then we write

P[symbol error|H1] = P[E2 ∪ E0 ∪ E3]

Questions:

1. Is this equation exact?
2. Is E2 ∪ E0 ∪ E3 = E2 ∪ E0? Why or why not?

Because of the answer to question (2),

P[symbol error|H1] = P[E2 ∪ E0]

But don't be confused: the Rice book will not tell you to do this! His strategy is to ignore redundancies. Either way produces an upper bound, and in fact, both produce nearly identical numerical results. However, it is perfectly acceptable (and in fact a better bound) to remove redundancies and then apply the bound. Then, we use the union bound,

P[symbol error|H1] ≤ P[E2] + P[E0]

These two probabilities are just the probability of error for a binary modulation, so

P[symbol error|H1] ≤ 2 Q( √( 2Eb / N0 ) )

What is missing / what is the difference in this expression compared to (45)?

Figure 37: Union bound examples. In (a), QPSK with symbols of equal energy Es. In (b), a1 = −a3 = [0, √(3Es/2)]ᵀ and a2 = −a0 = [√(Es/2), 0]ᵀ.

See Figure 38 for the union bound probability of error, compared to the exact expression. Only at very low Eb/N0 is there any noticeable difference!

Figure 38: For QPSK, the exact probability of symbol error expression vs. the union bound.

Example: 4-QAM with two amplitude levels

This is shown (poorly) in Figure 37(b). I am calling this "2-amplitude 4-QAM" (I made it up). The amplitudes of the top and bottom symbols are √3 times the amplitude of the symbols on the right and left. (They are positioned to keep the distance between neighboring points in the signal space equal to √(2Es).)

What is the union bound on the probability of symbol error, given H1?

Solution: Given symbol 1, the probability is the same as above. Defining E2 and E0 as above, the two distances, between symbol a1 and a2 or a0, are the same: √(2Es). Thus the formula for the union bound is the same,

P[symbol error|H1] ≤ 2 Q( √( 2Eb / N0 ) )

What is the union bound on the probability of symbol error, given H2?

Solution: Now, there are three error events at the same distance, and it is

P[symbol error|H2] ≤ 3 Q( √( 2Eb / N0 ) )

So, overall, the union bound on the probability of symbol error is

P[symbol error] ≤ ((3·2 + 2·2)/4) Q( √( 2Eb / N0 ) )

How about average energy? For QPSK, the symbol energies are all equal, so Eav = Es. For the 2-amplitude 4-QAM modulation,

Eav = ( 2(0.5) + 2(1.5) )/4 · Es = Es

Thus there is no advantage to the two-amplitude QAM modulation in terms of average energy.

25.6.1 General Formula for Union Bound-based Probability of Error

This is given in Rice 6.76:

P[symbol error] ≤ (1/M) Σ_{m=0}^{M−1} Σ_{n=0, n≠m}^{M−1} P[decide Hn|Hm]

Note the ≤ sign. This means the actual P[symbol error] will be at most this value; it is probably less than this value. This is a general formula and is not necessarily the best upper bound. What do we mean?
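As an aside, the claim that the QPSK union bound and exact expression in Figure 38 are nearly identical can be checked numerically. A sketch (not from the notes) – the bound exceeds the exact value (45) by exactly Q²(√(2Eb/N0)), which is negligible except at very low Eb/N0:

```python
import math

def qfunc(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def qpsk_exact(ebn0):
    # eq. (45): exact QPSK symbol error probability
    q = qfunc(math.sqrt(2.0 * ebn0))
    return 1.0 - (1.0 - q) ** 2

def qpsk_union(ebn0):
    # union bound with the redundant event removed: 2 Q(sqrt(2 Eb/N0))
    return 2.0 * qfunc(math.sqrt(2.0 * ebn0))
```

Expanding (45) gives 2q − q² with q = Q(√(2Eb/N0)), so the union bound 2q is indeed an upper bound, tight whenever q is small.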
. Section 2.n dm. Nmin d2 min Q P [symbol error] ≈ M 2N0 Nmin d2 Es min P [symbol error] ≈ Q (47) M 2Es N0 where dmin = min dm. But I could draw lots of functions that are upper bounds. M − 1. . some higher than others. then. .ECE 5520 Fall 2009 85 function that is always above the actual P [symbol error] formula. . In particular. A little extra distance means a much lower value of the Qfunction. These neighbors are the ones that are necessary to include in the union bound. If we use a constant A when describing the signal space vectors (as we usually do). I have drawn an upper bound. The Rice book uses Eb where I use Eb to denote average bit energy. Note that the Rice book uses Eavg where I use Es to denote average symbol energy. Proof of (46)? Lecture 14 Today: (1) QAM Probability of Error. and Nmin is the number of pairs of symbols which are separated by distance dmin . and do not need to be included. and n = 0. M − 1.n .1) with the smallest dm.n Es = Q (46) P [decide Hn Hm ] = Q 2N0 2Es N0 where dm. (2) Union Bound and Approximations Lecture 15 Today: (1) Mary PSK and QAM Analysis 26 26. So approximately. Given this. decide Hn Hm may be redundant.n is the Euclidean distance between the symbols m and n in the signal space diagram.n will be proportional to A2 . P [decide Hn Hm ]. the factors A will cancel out of the expression. m=n m = 0. This is because higher dm. the probability of error is often well approximated by the terms in the Union Bound (Lecture 14. . the pairwise probability of error. which in turn means a lower value of the Qfunction. .n .6. since Es will be proportional to 2 A2 and dm.n means a higher argument in the Qfunction. .1 QAM Probability of Error NearestNeighbor Approximate Probability of Error As it turns out. is given by: 2 2 dm. for some of the error events. . We will talk about the concept of “neighboring” symbols to symbol i. . I prefer Es to make clear we are talking about Joules per symbol.
26.2 Summary and Examples
We've been formulating in class a pattern, or step-by-step process, for finding the probability of bit or symbol error in arbitrary M-ary PSK and QAM modulations. These steps include:

1. Draw decision boundaries.
2. Calculate the average energy per symbol, Es, as a function of any amplitude parameters (typically we've been using A).
3. Calculate the average energy per bit, Eb, using Eb = Es/log2 M.
4. Calculate pairwise distances between pairs of symbols which contribute to the decision boundaries. It is not necessary to include redundant boundary information, that is, an error region which is completely contained in a larger error region.
5. Convert distances to be in terms of Eb/N0, if they are not already.
6. Calculate the union bound on probability of symbol error, using the formula derived in the previous lecture and the distances from step (4):
P[symbol error] ≤ (1/M) Σ_{m=0..M−1} Σ_{n=0..M−1, n≠m} P[decide Hn | Hm],  where  P[decide Hn | Hm] = Q(√(d²_{m,n}/(2N0)))
7. Approximate using the nearest-neighbor approximation if needed:
P[symbol error] ≈ (Nmin/M) Q(√(d²min/(2N0)))
where dmin is the shortest pairwise distance in the constellation diagram, and Nmin is the number of ordered pairs (i, j) with d_{i,j} = dmin (don't forget to count both (i, j) and (j, i)!).
8. Convert probability of symbol error into probability of bit error, if Gray encoding can be assumed:
P[error] ≈ (1/log2 M) P[symbol error]

Example: Additional QAM / PSK Constellations
Solve for the probability of symbol error in the signal space diagrams in Figure 39. You should calculate:
• An exact expression, if one should happen to be available,
• The union bound, and
• The nearest-neighbor approximation.
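The recipe above can be sanity-checked numerically on the "two-amplitude 4-QAM" example from the previous lecture (Es normalized to 1; this is a sketch, not part of the notes' derivation):

```python
import itertools, math

# Two-amplitude 4-QAM, Es = 1: top/bottom at sqrt(3 Es / 2),
# right/left at sqrt(Es / 2).
pts = [(0.0, math.sqrt(1.5)), (0.0, -math.sqrt(1.5)),
       (math.sqrt(0.5), 0.0), (-math.sqrt(0.5), 0.0)]

Eav = sum(x * x + y * y for x, y in pts) / len(pts)   # average symbol energy

dists = [math.dist(pts[m], pts[n])
         for m, n in itertools.permutations(range(4), 2)]
dmin = min(dists)
Nmin = sum(1 for d in dists if abs(d - dmin) < 1e-9)  # ordered pairs at dmin
coeff = Nmin / len(pts)   # the factor multiplying Q(sqrt(2 Eb / N0))
```

This confirms Eav = Es = 1 (no average-energy advantage), dmin = √(2Es), and Nmin = 10 ordered pairs, i.e., the coefficient (3·2 + 2·2)/4 = 2.5 from the union bound above.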
Figure 39: Signal space diagram for some example (made up) constellation diagrams.
The plots are in terms of amplitude A, but all probability of error expressions should be written in terms of Eb/N0: find, first, the probability of symbol error, and then also the probability of bit error.

Lecture 16
Today: (1) QAM Examples (Lecture 15), (2) FSK

27 Frequency Shift Keying
Frequency modulation in general changes the center frequency over time:
x(t) = cos(2π f(t) t)
Now, in frequency shift keying, symbols are selected to be sinusoids with frequency selected among a set of M different frequencies {f0, f1, ..., f_{M−1}}.

27.1 Orthogonal Frequencies
We will want our cosine waves at these M different frequencies to be orthonormal. Consider fk = fc + k∆f, and thus
φk(t) = √2 p(t) cos(2π fc t + 2π k ∆f t)    (48)
where p(t) is a pulse shape, which for example could be a rectangular pulse:
p(t) = 1/√Ts,  0 ≤ t ≤ Ts;   0, o.w.

• Does this function have unit energy?
• Are these functions orthogonal?

Solution: For general k ≠ m,
⟨φk(t), φm(t)⟩ = ∫_0^Ts (2/Ts) cos(2π fc t + 2π k ∆f t) cos(2π fc t + 2π m ∆f t) dt
= (1/Ts) ∫_0^Ts cos(2π(k−m)∆f t) dt + (1/Ts) ∫_0^Ts cos(4π fc t + 2π(k+m)∆f t) dt
≈ (1/Ts) [ sin(2π(k−m)∆f t) / (2π(k−m)∆f) ]_0^Ts
= sin(2π(k−m)∆f Ts) / (2π(k−m)∆f Ts)
where the double-frequency term integrates to approximately zero. Yes, they are orthogonal if 2π(k−m)∆f Ts is a multiple of π. (They are also approximately orthogonal if ∆f is really big, but we don't want to waste spectrum.) For this to hold for every k ≠ m, this requires that ∆f Ts = n/2, i.e.,
∆f = n/(2Ts) = n (fsy/2)
where fsy = 1/Ts is the symbol rate and n is an integer.
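The orthogonality condition can be checked numerically. A sketch, with made-up values for fc, Ts, and the integration grid:

```python
import math

def inner_product(k, m, df, fc=25.0, Ts=1.0, N=50000):
    """Midpoint-rule approximation of <phi_k, phi_m>, with
    phi_k(t) = sqrt(2/Ts) cos(2 pi (fc + k df) t) on [0, Ts].
    fc, Ts, and N are arbitrary demo values."""
    dt = Ts / N
    total = 0.0
    for i in range(N):
        t = (i + 0.5) * dt
        total += (2 / Ts) * math.cos(2 * math.pi * (fc + k * df) * t) \
                          * math.cos(2 * math.pi * (fc + m * df) * t) * dt
    return total

print(inner_product(0, 0, df=0.5))  # ~1: unit energy
print(inner_product(0, 1, df=0.5))  # ~0: orthogonal at df = 1/(2 Ts)
print(inner_product(0, 1, df=0.3))  # clearly nonzero for other spacings
```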
. For M = 2 and M = 3 these vectors are plotted in Figure 40. . Signal f1(t) Signal f2(t) Switch FSK Out Figure 41: Block diagram of a binary FSK transmitter. What n? We will see that we either use an n of 1 or 2. . 0] a1 = [0. . FSK can be seen as a switch between M diﬀerent carrier signals. bandwidth will be lower when we send less steep transitions. . no they’re not. Def ’n: Voltage Controlled Oscillator (VCO) A sinusoidal generator with frequency that linearly proportional to an input voltage. . Note that we don’t need to sent square wave input into the VCO. Signal space vectors ai are given by a0 = [A. as shown in Figure 41. In fact. SRRC pulses.) 1 Thus we need to plug into (48) for ∆f = n 2Ts for some integer n in order to have an orthonormal basis. 0. as seen in Figure 42. .ECE 5520 Fall 2009 89 for integer n. But usually it is generated from a single VCO. . A. . .2 Transmission of FSK At the transmitter. aM −1 = [0. 27. A] √ What is the energy per symbol? Show that this means that A = Es . 0. M=2 FSK M=3 FSK Figure 40: Signal space diagram for M = 2 and M = 3 FSK modulation. (Otherwise. . . for example. . . 0] . .
Def'n: Continuous Phase Frequency Shift Keying
FSK with no phase discontinuity between symbols. In other words, the phase of the output signal φk(t) does not change instantaneously at symbol boundaries iTs for integer i, and thus φ(t⁻ + iTs) = φ(t⁺ + iTs), where t⁻ and t⁺ are the limiting times just to the left and to the right of 0.

There are a variety of flavors of CPFSK, which are beyond the scope of this course. In one flavor of CPFSK (called Sunde's FSK), the carrier cos(2π fk t + θk) is sent with the transmitted signal, to aid in demodulation (at the expense of the additional energy). This is the only case where I've heard of using coherent FSK reception.

27.3 Reception of FSK
FSK reception is either phase coherent or phase non-coherent.

27.4 Coherent Reception
FSK reception can be done via a correlation receiver, just as we've seen for previous modulation types. Here, there are M possible carrier frequencies, so we'd need to know and be synchronized to M different phases θi, one for each symbol frequency:
cos(2π fc t + θ0)
cos(2π fc t + 2π ∆f t + θ1)
...
cos(2π fc t + 2π(M−1) ∆f t + θ_{M−1})
Each phase θk is estimated to be θ̂k by a separate phase-locked loop (PLL). This is more difficult than it sounds: these M PLLs must operate even though they can only synchronize when their symbol is sent, 1/M of the time (assuming equally-probable symbols). As M gets high, having M PLLs is a drawback, and coherent detection becomes difficult.

27.5 Noncoherent Reception
Notice that in Figure 40, only one symbol exists in each dimension, so the sign, or phase, of the sinusoid is not very important. In noncoherent reception, we just measure the energy in each frequency. We have a fundamental problem, though: since we will not know the phase of the received signal, we don't know whether or not the energy at frequency fk correlates highly with the cosine wave or with the sine wave. If we only correlate it with one (the sine wave, for example), and the phase makes the signal the other (a cosine wave), we would get an inner product of zero! The solution is that we need to correlate the received signal with both a sine and a cosine wave at the frequency fk. As we know, for every frequency there are two orthogonal functions, cosine and sine (see QAM and PSK). This will give us two inner products: let's call them x_k^I, using the capital I to denote in-phase, and x_k^Q, with Q denoting quadrature.
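The single-VCO transmitter of Section 27.2 can be sketched as a phase accumulator: integrating the instantaneous frequency makes the phase, and hence the waveform, continuous across symbol boundaries. All numerical values below are illustrative:

```python
import math

def cpfsk_phase(bits, f0, f1, Ts, fs):
    """VCO-style FSK: integrate the instantaneous frequency into a running
    phase, so the waveform is continuous at symbol boundaries."""
    phase = 0.0
    samples = []
    for b in bits:
        f = f1 if b else f0
        for _ in range(int(Ts * fs)):
            phase += 2 * math.pi * f / fs   # d(phase) = 2 pi f dt
            samples.append(math.cos(phase))
    return samples

s = cpfsk_phase([0, 1, 1, 0], f0=1.0, f1=2.0, Ts=1.0, fs=100.0)
```

Consecutive samples never jump by more than 2π f1/fs, even at bit boundaries, unlike the output of a switch between two free-running oscillators.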
Figure 43: The energy in a noncoherent FSK receiver at one frequency fk is calculated by finding the received signal's correlation with the cosine wave (x_k^I) and sine wave (x_k^Q) at the frequency of interest, and calculating the squared length of the vector [x_k^I, x_k^Q]^T.

The energy at frequency fk, that is,
E_{fk} = (x_k^I)² + (x_k^Q)²
is calculated for each frequency fk, k = 0, 1, ..., M−1. The optimal detector (maximum likelihood detector) is then to decide upon the frequency with the maximum energy E_{fk}. (You could prove this.) Is this a threshold test? We would need a new analysis of the FSK noncoherent detector to find its analytical probability of error.

27.6 Receiver Block Diagrams
The block diagram of the noncoherent FSK receiver is shown in Figures 45 and 46 (copied from the Proakis & Salehi book). Compare this to the coherent FSK receiver in Figure 44.

Figure 44: (Proakis & Salehi) Phase-coherent demodulation of M-ary FSK signals.
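A sketch of the energy detector of Figure 43, using made-up frequencies with ∆f = 1/Ts spacing (the n = 2 choice, assumed here so that the tones stay orthogonal for both the cosine and sine correlators). Whatever the unknown carrier phase, the correct frequency still has the maximum energy:

```python
import math

def detect_fsk(r, freqs, fs):
    """Energy detector: correlate r with both cos and sin at each candidate
    frequency, and return the index with maximal (x_I)^2 + (x_Q)^2."""
    energies = []
    for f in freqs:
        xi = sum(r[n] * math.cos(2 * math.pi * f * n / fs) for n in range(len(r)))
        xq = sum(r[n] * math.sin(2 * math.pi * f * n / fs) for n in range(len(r)))
        energies.append(xi * xi + xq * xq)
    return energies.index(max(energies))

fs, Ts = 1000, 1.0
freqs = [10.0, 11.0, 12.0]                  # spaced by delta_f = 1/Ts
for theta in [0.0, 1.0, math.pi / 2, 2.5]:  # unknown carrier phase
    r = [math.cos(2 * math.pi * freqs[2] * n / fs + theta)
         for n in range(int(fs * Ts))]
    assert detect_fsk(r, freqs, fs) == 2    # max-energy frequency found
```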
Figure 45: (Proakis & Salehi) Demodulation of M-ary FSK signals for noncoherent detection.

Figure 46: (Proakis & Salehi) Demodulation and square-law detection of binary FSK signals.
27.7 Probability of Error for Coherent Binary FSK
First, let's look at coherent detection of binary FSK.
1. What is the distance between points in the binary FSK signal space?
2. What is the probability of error for coherent binary FSK?

It is the same as bipolar PAM, but the symbols are spaced differently (more closely) as a function of Eb. We had that
P[error]_{2-ary} = Q(√(d²_{0,1}/(2N0)))
Now, the spacing between symbols has reduced by a factor of √2/2 compared to bipolar PAM, to d_{0,1} = √(2Eb). So
P[error]_{2-Co-FSK} = Q(√(Eb/N0))
For the same probability of bit error, binary FSK is about 1.5 dB better than OOK (OOK requires 1.5 dB more energy per bit), but 1.5 dB worse than bipolar PAM (it requires 1.5 dB more energy per bit than bipolar PAM).

27.8 Probability of Error for Noncoherent Binary FSK
The energy detector shown in Figure 46 uses the energy in each frequency and selects the frequency with maximum energy. This energy is denoted r²m in Figure 46 for frequency m, and is
r²m = r²mc + r²ms
This energy measure is a statistic which measures how much energy was in the signal at frequency fm, given the noncoherent receiver and rmc and rms. The 'envelope' is a term used for the square root of the energy, so rm is termed the envelope.

Question: What will r²m equal when the noise is very small?

As it turns out, the envelope rm is an optimum (sufficient) statistic to use to decide between s1, ..., sM. Note that this depends on θm, which is assumed to be uniform between 0 and 2π, and independent of the noise.

What do they do to prove this in Proakis & Salehi? They prove it for binary noncoherent FSK. It takes quite a bit of 5510 to do this proof.
1. Define the received vector r as a 4-length vector of the correlation of r(t) with the sin and cos at each frequency f1, f2.
2. They formulate the prior probabilities f_{r|Hi}(r|Hi), marginalizing over the unknown phase:
f_{r|Hi}(r|Hi) = ∫_0^{2π} f_{r,θm|Hi}(r, φ|Hi) dφ = ∫_0^{2π} f_{r|θm,Hi}(r|φ, Hi) f_{θm|Hi}(φ|Hi) dφ    (49)
Note that f_{r|θm,Hi}(r|φ, Hi) is a 2-D Gaussian random vector with i.i.d. components.
3. They formulate the joint probabilities f_{r∩H0}(r ∩ H0) and f_{r∩H1}(r ∩ H1) as well.
4. Where the joint probability f_{r∩H0}(r ∩ H0) is greater than f_{r∩H1}(r ∩ H1), the receiver decides H0. Otherwise, it decides H1.
5. The decisions in this last step, after manipulation of the pdfs, are shown to reduce to this decision (given that P[H0] = P[H1]): decide H0 if
r²1c + r²1s > r²2c + r²2s
and decide H1 otherwise. The "envelope detector" can equally well be called the "energy detector", and it often is.

Proof Summary: Our noncoherent receiver finds the energy in each frequency, and the resulting expression for the probability of error in binary noncoherent FSK is
P[error]_{2-NC-FSK} = (1/2) exp(−Eb/(2N0))    (50)
The proof is in Proakis & Salehi, Section 7.6.9, page 430, which is posted on WebCT. The above proof is simply FYI, and is presented since it does not appear in the Rice book. The expressions for probability of error in binary FSK (both coherent and noncoherent) are important, and you should make note of them. You will use them to be able to design communication systems that use FSK.

Lecture 17
Today: (1) FSK Error Probabilities, (2) OFDM

27.9 FSK Error Probabilities, Part 2

27.9.1 M-ary Non-Coherent FSK
For M-ary noncoherent FSK, the energy values no longer have a Gaussian distribution (due to the squaring of the amplitudes in the energy calculation). They instead are either Ricean (for the transmitted frequency) or Rayleigh distributed (for the "other" M − 1 frequencies). The probability that the correct frequency is selected is the probability that the Ricean random variable is larger than all of the other random variables measured at the other frequencies. An exact expression for the probability of error can be derived: Proakis & Salehi, Section 7.6.9, shows that
P[symbol error] = Σ_{n=1}^{M−1} (−1)^{n+1} (M−1 choose n) (1/(n+1)) exp(−(n/(n+1)) log2(M) Eb/N0)
and
P[error]_{M-nc-FSK} = ((M/2)/(M−1)) P[symbol error]
See Figure 47.

Example: Probability of Error for Noncoherent M = 2 case
Use the above expressions to find the P[symbol error] and P[error] for binary noncoherent FSK.
Solution:
P[symbol error] = P[bit error] = (1/2) e^{−(1/2) Eb/N0}
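The alternating-sum expression is straightforward to evaluate directly. A sketch (the Eb/N0 values are arbitrary):

```python
import math

def pe_bit_mary_nc_fsk(M, ebn0_db):
    """Probability of bit error for M-ary noncoherent FSK: the exact
    alternating-sum symbol error expression, times (M/2)/(M-1)."""
    ebn0 = 10 ** (ebn0_db / 10)
    ps = sum((-1) ** (n + 1) * math.comb(M - 1, n) / (n + 1)
             * math.exp(-(n / (n + 1)) * math.log2(M) * ebn0)
             for n in range(1, M))
    return (M / 2) / (M - 1) * ps

# M = 2 collapses to the binary result (1/2) exp(-Eb / (2 N0)):
print(pe_bit_mary_nc_fsk(2, 8))
```

At a fixed Eb/N0 the bit error rate falls as M grows, which is the behavior plotted in Figure 47.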
Figure 47: Probability of bit error for noncoherent reception of M-ary FSK, for M = 2, 4, 8, 16, and 32.

27.9.2 Summary
The key insight is that probability of error decreases for increasing M. As M → ∞, the probability of error goes to zero, for the case when Eb/N0 > −1.6 dB. Remember this for later: it is related to the Shannon-Hartley theorem on the limit of reliable communication.

27.10 Bandwidth of FSK
Carson's rule is used to calculate the bandwidth of FM signals. For M-ary FSK, it tells us that the approximate bandwidth is
BT = (M − 1)∆f + 2B
where B is the one-sided bandwidth of the digital baseband signal. Note that for square wave pulses, B = 1/Ts; for the null-to-null bandwidth of raised-cosine pulse shaping, 2B = (1 + α)/Ts.

28 Frequency Multiplexing
Frequency multiplexing is the division of the total bandwidth (call it BT) into many smaller frequency bands, and sending a lower bit rate on each one.

Specifically, divide your BT bandwidth into K subchannels; you now have BT/K bandwidth on each subchannel. (Our choice of K is arbitrary.) For each band, we might send a PSK or PAM or QAM signal on that band. In the multiplexed system, each subchannel has a symbol period of Ts K, longer by a factor of K, so that each subchannel has narrower bandwidth. Furthermore, your transmission energy is the same: the energy would be divided by K in each subchannel, so the sum of the energy is constant. In HW 7, you show that the total bit rate is the same in the multiplexed system, so you don't lose anything by this division.

28.1 Frequency Selective Fading
Frequency selectivity is primarily a problem in wireless channels. (Frequency-dependent gains are also experienced in DSL, because phone lines were not designed for higher frequency signals.) For example, consider Figure 48, which shows an example frequency selective fading pattern, 10 log10 |H(f)|², where H(f) is an
example wireless channel. (The f̃ in the plot is relative; take the actual frequency f = f̃ + fc for some center frequency fc.)

Figure 48: An example frequency selective fading pattern. The whole bandwidth is divided into K subchannels of width BT/K each; one subchannel is severely faded.

• You can see that some channels experience severe attenuation, while most experience little attenuation; some channels even experience gain. This H(f) might be experienced in an average outdoor channel.
• For stationary transmitter and receiver and stationary channel, the channel stays constant over time. If you put down the transmitter and receiver and have them operate at a particular f′, then the loss 10 log10 |H(f′)|² will not change throughout the communication.

The H(f) acts as a filter, which introduces ISI. When designing for frequency selective fading, designers may use:
• A wideband modulation method which is robust to frequency selective fading, either direct-sequence spread spectrum (DSSS) or frequency-hopping spread spectrum (FHSS), or
• Frequency multiplexing methods (including OFDM).

Compare the two extremes:
1. Narrowband: Part of the bandwidth is used for one narrow channel. The power is increased by a fade margin which makes the Eb/N0 high enough for low-BER demodulation except in the most extreme fading situations. Problem: if the channel loss is too great, then the error rate will be too high.
2. Wideband: The whole bandwidth is used for one channel. Problem: a wideband system with very high ISI might be completely unable to demodulate the received signal.

28.2 Benefits of Frequency Multiplexing
In frequency multiplexing, there are K parallel bitstreams, each of rate R/K, where R is the total bit rate. As a first order approximation, a subchannel either experiences a high SNR and makes no errors, or is in a severe fade, has a very low SNR, and experiences a BER of 0.5, the worst bit error rate. Thus bit errors are not an 'all or nothing' game.
If β is the probability that a subchannel experiences a severe fade, the overall probability of error will be 0.5β. Compare this to a single narrowband channel, which has probability β that it will have a 0.5 probability of error. Frequency multiplexing is typically combined with channel coding designed to correct a small percentage of bit errors.

28.3 OFDM as an extension of FSK
This is Section 5.5 in the Rice book. In FSK, we use a single basis function at each of M different frequencies. In QAM, we use two basis functions at the same frequency. OFDM is the combination:
φ_{0,I}(t) = √(2/Ts) cos(2π fc t), 0 ≤ t ≤ Ts;  0 o.w.
φ_{0,Q}(t) = √(2/Ts) sin(2π fc t), 0 ≤ t ≤ Ts;  0 o.w.
φ_{1,I}(t) = √(2/Ts) cos(2π fc t + 2π ∆f t), 0 ≤ t ≤ Ts;  0 o.w.
φ_{1,Q}(t) = √(2/Ts) sin(2π fc t + 2π ∆f t), 0 ≤ t ≤ Ts;  0 o.w.
...
φ_{M−1,I}(t) = √(2/Ts) cos(2π fc t + 2π(M−1) ∆f t), 0 ≤ t ≤ Ts;  0 o.w.
φ_{M−1,Q}(t) = √(2/Ts) sin(2π fc t + 2π(M−1) ∆f t), 0 ≤ t ≤ Ts;  0 o.w.
where ∆f = 1/Ts (the n = 2 case from Section 27.1, needed so that the sine as well as the cosine functions at the different frequencies are all mutually orthogonal). These are all orthogonal functions! We can transmit much more information than possible in M-ary FSK. (Note we have 2M basis functions here!)

The signal on subchannel k might be represented as:
x_k(t) = √(2/Ts) [a_{k,I}(t) cos(2π fk t) + a_{k,Q}(t) sin(2π fk t)]
The complex baseband signal of the sum of all K signals might then be represented as
x_l(t) = √(2/Ts) Σ_{k=1}^{K} (a_{k,I}(t) + j a_{k,Q}(t)) e^{j2πk∆f t} = √(2/Ts) Σ_{k=1}^{K} A_k(t) e^{j2πk∆f t}    (51)
where A_k(t) = a_{k,I}(t) + j a_{k,Q}(t). Does this look like an inverse discrete Fourier transform? If yes, then you can see why it is possible to use an IFFT and FFT to generate the complex baseband signal.
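Sampling (51) at t = n Ts/N turns the sum over subcarriers into an inverse DFT of the symbols A_k, which is why one IFFT (transmitter) and one FFT (receiver) per OFDM symbol suffice. A toy sketch with a deliberately naive, slow DFT:

```python
import cmath

def idft(X):
    """Inverse DFT (what an IFFT computes, just slowly): time samples of
    (1/N) sum_k X[k] exp(j 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]

def dft(x):
    """Forward DFT: the receiver's per-symbol FFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

# K = 8 subcarriers, each carrying one 4-QAM symbol A_k = a_I + j a_Q:
symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1+1j, 1-1j, 1+1j]
time_samples = idft(symbols)   # transmitter: one IFFT per OFDM symbol
recovered = dft(time_samples)  # receiver: one FFT per OFDM symbol
```

The receiver's forward DFT recovers every subcarrier symbol exactly (up to floating-point error), with no cross-talk between subcarriers.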
The FFT implementation (and the speed and ease of implementation of the FFT in hardware) is why OFDM is popular. Rather than transmitting on one of the K carriers at a given time (like FSK), we transmit information in parallel on all K channels simultaneously. Since the K carriers are orthogonal, the signal is like K-ary FSK; but this avoids having K independent transmitter chains and receiver chains. An example signal space diagram for K = 3 and PAM on each channel is shown in Figure 49.

Figure 49: Signal space diagram for K = 3 subchannel OFDM with 4-PAM on each channel.

Example: 802.11a
IEEE 802.11a uses OFDM with 52 subcarriers. Four of the subcarriers are reserved for pilot tones, so effectively 48 subcarriers are used for data. Each data subcarrier can be modulated in different ways; one example is to use 16 square QAM on each subcarrier (which is 4 bits per symbol per subcarrier). The symbol rate in 802.11a is 250k/sec. Thus the bit rate is
250 × 10³ (OFDM symbols/sec) × 48 (subcarriers/OFDM symbol) × 4 (coded bits/subcarrier) = 48 Mbits/sec

Lecture 18
Today: Comparison of Modulation Methods

29 Comparison of Modulation Methods
This section first presents some miscellaneous information which is important to real digital communications systems but doesn't fit nicely into other lectures.

29.1 Differential Encoding for BPSK
Def'n: Coherent Reception
The reception of a signal when its carrier phase is explicitly determined and used for demodulation.
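The 802.11a bit-rate arithmetic as a one-line check:

```python
ofdm_symbol_rate = 250e3     # OFDM symbols per second
data_subcarriers = 48        # 52 subcarriers minus 4 pilot tones
bits_per_subcarrier = 4      # 16-QAM: 4 coded bits per subcarrier
bit_rate = ofdm_symbol_rate * data_subcarriers * bits_per_subcarrier
print(bit_rate / 1e6)        # 48.0 (Mbit/s)
```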
For coherent reception of PSK, we've said, the receiver will always need some kind of phase synchronization in BPSK. Typically, this means transmitting a training sequence.

For noncoherent reception of PSK, we use differential encoding (at the transmitter) and decoding (at the receiver). Basically, a switch in the angle of the signal space vector from 0° to 180° or vice versa indicates a bit 1, while staying at the same angle indicates a bit 0.

29.1.1 DPSK Transmitter
Now, consider the bit sequence {bn}, where bn is the nth bit that we want to send. The sequence bn is a sequence of 0's and 1's. How do we decide which phase to send? Prior to this, we've said, for BPSK, send a0 if bn = 0, and send a1 if bn = 1. Instead of setting kn for ak only as a function of bn, in differential encoding we also include kn−1:
kn = kn−1 if bn = 0;   kn = 1 − kn−1 if bn = 1.
Note that 1 − kn−1 is the complement or negation of kn−1: if kn−1 = 1 then 1 − kn−1 = 0, and if kn−1 = 0 then 1 − kn−1 = 1. Typically k0 = 0.

Example: Differential encoding
Let b = [1, 0, 1, 0, 1, 1, 1, 0]^T, and assume k0 = 0. What symbols k = [k1, ..., k8]^T will be sent?
Solution: k = [1, 1, 0, 0, 1, 0, 1, 1]^T
These values of kn correspond to a symbol stream with phases:
∠ak = [π, π, 0, 0, π, 0, π, π]^T

29.1.2 DPSK Receiver
Now, at the receiver, we find bn by comparing the phase of xn to the phase of xn−1. Note that we have to just agree on the "zero" phase. What our receiver does is to measure the statistic
ℜ{xn x*_{n−1}} = |xn| |xn−1| cos(∠xn − ∠xn−1)
If it is less than zero, decide bn = 1; if it is greater than zero, decide bn = 0.

Example: Differential decoding
1. Assuming no phase shift in the above encoding example, show that the receiver will decode the original bitstream with differential decoding.
Solution: b̂ = [1, 0, 1, 0, 1, 1, 1, 0]^T
2. Now, assume that all bits are shifted π radians, and we receive ∠x′ = [0, 0, π, π, 0, π, 0, 0]^T. What will be decoded at the receiver?
Solution: b̂ = [0, 0, 1, 0, 1, 1, 1, 0]^T
Rotating all symbols by π radians only causes one bit error.

29.1.3 Probability of Bit Error for DPSK
The probability of bit error in DPSK is slightly worse than that for BPSK:
P[error] = (1/2) exp(−Eb/N0)
For a constant probability of error, DPSK requires about 1 dB more Eb/N0 than BPSK, which has probability of bit error Q(√(2Eb/N0)). Both are plotted in Figure 50.

Figure 50: Comparison of probability of bit error for BPSK and differential BPSK.

29.2 Points for Comparison
• Linear amplifiers (transmitter complexity)
• Receiver complexity
• Fidelity (P[error]) vs. Eb/N0
• Bandwidth efficiency
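Returning to the differential encoding and decoding examples above: since the phases are binary, both rules reduce to XORs. A sketch reproducing the two examples, including the receiver's assumed reference phase of 0 for the symbol before the first bit:

```python
def diff_encode(bits, k0=0):
    """k_n = k_{n-1} XOR b_n: a bit 1 flips the transmitted phase."""
    k = [k0]
    for b in bits:
        k.append(k[-1] ^ b)
    return k

def diff_decode(k):
    """b_n = k_n XOR k_{n-1}: compare each symbol to the previous one."""
    return [k[n] ^ k[n - 1] for n in range(1, len(k))]

b = [1, 0, 1, 0, 1, 1, 1, 0]
k = diff_encode(b)      # [0, 1, 1, 0, 0, 1, 0, 1, 1], as in the example
# Channel rotates every symbol by pi (flip each k_n); the receiver assumes
# reference phase 0 for the symbol before the first bit:
k_rx = [0] + [x ^ 1 for x in k[1:]]
errors = sum(d != t for d, t in zip(diff_decode(k_rx), b))  # one bit error
```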
Table 2: Bandwidth efficiency η (bps/Hz) of PSK and QAM modulation methods using raised cosine filtering, as a function of α.

  η        α = 0   α = 0.5   α = 1
  BPSK     1.0     0.67      0.5
  QPSK     2.0     1.33      1.0
  16-QAM   4.0     2.67      2.0
  64-QAM   6.0     4.0       3.0

29.3 Bandwidth Efficiency
We've talked about measuring data rate in bits per second. We've also talked about Hertz, i.e., the quantity of spectrum our signal will use. The key figure of merit is bits per second per Hertz, i.e., bps/Hz.

Def'n: Bandwidth efficiency
The bandwidth efficiency, typically denoted η, is the ratio of bits per second to bandwidth:
η = Rb/BT
Bandwidth efficiency depends on the definition of "bandwidth". Since it is usually used for comparative purposes, we just make sure we use the same definition of bandwidth throughout a comparison.

29.3.1 PSK, PAM and QAM
In these three modulation methods, the bandwidth is largely determined by the pulse shape. For root raised cosine filtering, the null-null bandwidth is 1 + α times the bandwidth of the case when we use pure sinc pulses. The transmission bandwidth (for a bandpass signal) is
BT = (1 + α)/Ts
Since Ts is seconds per symbol, we divide by log2 M bits per symbol to get Tb = Ts/log2 M seconds per bit, or
BT = (1 + α) Rb / log2 M
where Rb = 1/Tb is the bit rate. Bandwidth efficiency is then
η = Rb/BT = log2 M / (1 + α)
See Table 2 for some numerical examples. This relationship is typically linearly proportional: we can scale a system, i.e., increase the bit rate by decreasing the symbol period, and correspondingly increase the bandwidth.

29.3.2 FSK
We've said that the bandwidth of FSK is
BT = (M − 1)∆f + 2B
where B is the one-sided bandwidth of the digital baseband signal. For the null-to-null bandwidth of raised-cosine pulse shaping, 2B = (1 + α)/Ts. So,
BT = (M − 1)∆f + (1 + α)/Ts = (Rb/log2 M) {(M − 1)∆f Ts + (1 + α)}
since Rb = log2 M / Ts. Then
η = Rb/BT = log2 M / ((M − 1)∆f Ts + (1 + α))
If ∆f = 1/Ts (as required for noncoherent reception),
η = log2 M / (M + α)

29.4 Bandwidth Efficiency vs. Eb/N0
For each modulation format, we have two quantities of interest:
• Bandwidth efficiency, and
• Energy per bit (Eb/N0) requirements to achieve a given probability of error.
We can plot these (required Eb/N0, bandwidth efficiency) pairs. See Rice Figure 6.6.

Example: Bandwidth efficiency vs. Eb/N0 for M = 8 PSK
What is the required Eb/N0 for 8-PSK to achieve a probability of bit error of 10⁻⁶? What is the bandwidth efficiency of 8-PSK when using 50% excess bandwidth?
Solution: Given in Rice Figure 6.5 to be about 14 dB, and η = log2(8)/(1 + 0.5) = 2 bps/Hz.

29.5 Fidelity (P[error]) vs. Eb/N0
The main modulations for which we have evaluated probability of error vs. Eb/N0 are:
1. M-ary PAM, including binary PAM or BPSK, and OOK.
2. M-ary PSK, including QPSK and DPSK.
3. Square QAM.
4. Non-square QAM constellations.
5. M-ary FSK.
In this part of the lecture we will break up into groups and derive (1) the probability of bit error and (2) the probability of symbol error formulas for these types of modulations. See also Rice pages 325-331.
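The two bandwidth efficiency formulas from Section 29.3, in a short sketch checked against Table 2:

```python
import math

def eta_psk_qam(M, alpha):
    """Bandwidth efficiency (bps/Hz) for PSK/PAM/QAM with
    BT = (1 + alpha) Rb / log2(M)."""
    return math.log2(M) / (1 + alpha)

def eta_fsk(M, alpha, df_Ts=1.0):
    """Bandwidth efficiency for M-ary FSK with Carson's-rule bandwidth;
    df_Ts = delta_f * Ts (1.0, i.e. delta_f = 1/Ts, for noncoherent)."""
    return math.log2(M) / ((M - 1) * df_Ts + 1 + alpha)

print(eta_psk_qam(16, 0.5))  # 2.67 bps/Hz, as in Table 2
print(eta_fsk(8, 0.5))       # log2(8) / (8 + 0.5), about 0.35 bps/Hz
```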
Table 3: Summary of probability of bit and symbol error formulas for several modulations.

  Name          P[symbol error]                                          P[error] (bit)
  BPSK          = Q(√(2Eb/N0))                                           same
  OOK           = Q(√(Eb/N0))                                            same
  DPSK          = (1/2) exp(−Eb/N0)                                      same
  M-PAM         = (2(M−1)/M) Q(√((6 log2 M / (M²−1)) Eb/N0))             ≈ (1/log2 M) P[symbol error]
  QPSK          ≈ 2 Q(√(2Eb/N0))                                         = Q(√(2Eb/N0))
  M-PSK         ≤ 2 Q(√(2 (log2 M) sin²(π/M) Eb/N0))                     ≈ (1/log2 M) P[symbol error]
  Square M-QAM  ≈ (4(√M − 1)/√M) Q(√((3 log2 M / (M−1)) Eb/N0))          ≈ (1/log2 M) P[symbol error]
  2-nonco-FSK   = (1/2) exp(−Eb/(2N0))                                   same
  M-nonco-FSK   = Σ_{n=1}^{M−1} (−1)^{n+1} (M−1 choose n) (1/(n+1))      = ((M/2)/(M−1)) P[symbol error]
                  exp(−(n/(n+1)) log2(M) Eb/N0)
  2-co-FSK      = Q(√(Eb/N0))                                            same
  M-co-FSK      ≤ (M − 1) Q(√(log2(M) Eb/N0))                            = ((M/2)/(M−1)) P[symbol error]

29.6 Transmitter Complexity
What makes a transmitter more complicated?
• Need for a linear power amplifier
• Higher clock speed
• Higher transmit powers
• Directional antennas

29.6.1 Linear / Nonlinear Amplifiers
An amplifier uses DC power to take an input signal and increase its amplitude at the output. If PDC is the input power (e.g., from battery) to the amplifier, and Pout is the output signal power, the power efficiency is rated as ηP = Pout/PDC.

Amplifiers are separated into 'classes', which describe their configuration (circuits), and, as a result of their configuration, their linearity and power efficiency.
• Class A: linear amplifiers with maximum power efficiency of 50%. The output signal is a scaled-up version of the input signal. Power is dissipated at all times.
• Class B: linear amplifiers which turn on for half of a cycle (conduction angle of 180°), with maximum power efficiency of 78.5%. Two in parallel are used to amplify a sinusoidal signal, one for the positive part and one for the negative part.
• Class AB: like class B, but each amplifier stays on slightly longer (conduction angle slightly higher than 180°) to reduce the "dead zone" at zero.
• Class C: A class C amplifier is closer to a switch than an amplifier. This generates high distortion, but the output is then bandpass filtered, or tuned to the center frequency, to force out spurious frequencies. Class C nonlinear amplifiers have power efficiencies around 90%, but can only amplify signals with a nearly constant envelope.

In order to double the power efficiency, battery-powered transmitters are often willing to use Class C amplifiers. They can do this if their output signal has constant envelope: the signal must never go through the origin (envelope of zero), or even near the origin.

29.7 Offset QPSK
For QPSK, we wrote the modulated signal as
s(t) = √2 p(t) cos(ω0 t + ∠a(t))
where ∠a(t) is the angle of the symbol chosen to be sent at time t, and p(t) is the pulse shape. The angle is in a discrete set:
∠a(t) ∈ {π/4, 3π/4, 5π/4, 7π/4}
We could have also written s(t) as
s(t) = √2 p(t) [a0(t) cos(ω0 t) + a1(t) sin(ω0 t)]
The problem is that when the phase changes 180 degrees, the signal s(t) will go through zero, which precludes use of a class C amplifier. See Figure 52(b) and Figure 51(a) to see this graphically.

For offset QPSK (OQPSK), we delay the quadrature channel Ts/2 with respect to the in-phase channel. Rewriting s(t),
s(t) = √2 p(t) a0(t) cos(ω0 t) + √2 p(t − Ts/2) a1(t − Ts/2) sin(ω0 (t − Ts/2))
At the receiver, we just need to delay the sampling on the quadrature half of a symbol period with respect to the in-phase signal. The new transmitted signal takes the same bandwidth and average power, and has the same Eb/N0 vs. probability of bit error performance. However, the envelope |s(t)| is largely constant; if you look at the constellation diagram, you'll see a circle, and the signal never goes through the origin. See Figure 52 for a comparison of QPSK and OQPSK.

Figure 51: Constellation diagram for (a) QPSK and (b) OQPSK.
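The "never through the origin" property can be illustrated with straight-line transitions between constellation points, a crude stand-in for pulse-shaped transitions (the symbol sequences below are made up):

```python
def interp_path(points, pts_per_seg=50):
    """Linearly interpolate between consecutive complex constellation
    points: a crude stand-in for pulse-shaped transitions."""
    path = []
    for a, b in zip(points, points[1:]):
        for i in range(pts_per_seg):
            path.append(a + (b - a) * (i / pts_per_seg))
    return path

# QPSK: I and Q may both flip at the same instant, so a 180-degree
# symbol change drives the signal through the origin.
qpsk_min = min(abs(z) for z in interp_path([1+1j, -1-1j, 1-1j, -1+1j]))

# OQPSK: stagger the Q updates by Ts/2, so only one of I, Q changes at
# a time and the envelope never approaches zero.
i_seq = [1, -1, 1, -1]
q_seq = [1, -1, -1, 1]
states = [complex(i_seq[0], q_seq[0])]
for n in range(1, 4):
    states.append(complex(i_seq[n], q_seq[n - 1]))  # I updates first...
    states.append(complex(i_seq[n], q_seq[n]))      # ...then Q, Ts/2 later
oqpsk_min = min(abs(z) for z in interp_path(states))
```

Here qpsk_min is 0 (the path crosses the origin), while oqpsk_min stays at the magnitude of the non-transitioning component.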
Figure 52: Matlab simulation of (a-b) QPSK and (c-d) OQPSK, showing in (d) the largely constant envelope of OQPSK, compared to (b) that of QPSK.

29.8 Receiver Complexity
What makes a receiver more complicated?
• Synchronization (carrier, phase, timing)
• Multiple parallel receiver chains

Lecture 19
Today: (1) Link Budgets and System Design

30 Link Budgets and System Design
As a digital communication system designer, your mission (if you choose to accept it) is to achieve:
1. High data rate
2. High fidelity (low bit error rate)
3. Low transmit power
4. Low bandwidth
5. Low transmitter/receiver complexity

But this is truly a mission impossible, because you can't have everything at the same time. So the system design depends on what the desired system really needs and what the acceptable tradeoffs are. Typically some subset of the requirements is given; for example, given the bandwidth limits, the received signal power and noise power, and the bit error rate limit, what data rate can be achieved? Using which modulation?

This lecture is about system design, and it cuts across ECE classes, in particular circuits, optics, antennas, radio propagation, and the material from this class. You are expected to apply what you have learned in other classes (or learn the functional relationships that we describe here). To answer a question like the one above, it is just a matter of setting some system variables (at the given limits) and determining what that means about the values of the other system variables. In this lecture, we discuss this procedure, and define each of the variables we see in Figure 53.

Figure 53: Relationships between important system design variables (rectangular boxes). Functional relationships between variables are given as circles: if the relationship is a single operator (e.g., division), the operator is drawn in; otherwise it is left as an unnamed function f(). For example, C/N0 is shown to have a divide-by relationship with C and N0. The effect of the choice of modulation impacts several functional relationships, for example, the relationship between probability of bit error and Eb/N0, which is drawn as a dotted line.

30.1 Link Budgets Given C/N0

We often denote the received power at a receiver by PR, but in the Rice book it is typically denoted C; it has units of Watts. What is C/N0? It is the received power divided by the noise energy. It is an odd quantity, but it summarizes what we need to know about the signal and the noise for the purposes of system design. For example, if I was given the bandwidth limits and the C/N0 ratio, I'd be able to determine the probability of error for several different modulation types.
• We know the probability of bit error is typically written as a function of Eb/N0. The bit energy is Eb; the noise energy is N0. We can write Eb = C Tb, since energy is power times time. To separate the effect of Tb, we often denote

Eb/N0 = C Tb / N0 = (C/N0) / Rb

where Rb = 1/Tb is the bit rate. Given C/N0, we can now relate bit error rate, modulation, bit rate, and bandwidth.

• What are the units of C/N0? Answer: C is in Watts and N0 is in Watts/Hz (i.e., Joules), so C/N0 has units of Hz, i.e., 1/s.

• Note that people often report C/N0 in dB Hz, which is 10 log10 (C/N0).

• Be careful of Bytes per sec vs bits per sec. CS people use Bps (kBps or MBps) when describing data rate. For example, if it takes 5 seconds to transfer a 1 MB file, then software often reports that the data rate is 1/5 = 0.2 MBps or 200 kBps. But the bit rate is 8/5 Mbps or 1.6 × 10^6 bps.

• Note: We typically use Q(·) and Q−1(·) to relate BER and Eb/N0 in each direction. While you have Matlab, this is easy to calculate. If you can program it into your calculator, great. Otherwise, it's really not a big deal to pull it off of a chart or table. For your convenience, the following table/plots of Q−1(x) will appear on Exam 2. I am not picky about getting lots of correct decimal places.

TABLE OF THE Q−1(·) FUNCTION:

x          Q−1(x)  |  x          Q−1(x)  |  x          Q−1(x)
1 × 10−1   1.2816  |  1 × 10−2   2.3263  |  1 × 10−3   3.0902
1.5 × 10−1 1.0364  |  1.5 × 10−2 2.1701  |  1.5 × 10−3 2.9677
2 × 10−1   0.8416  |  2 × 10−2   2.0537  |  2 × 10−3   2.8782
3 × 10−1   0.5244  |  3 × 10−2   1.8808  |  3 × 10−3   2.7478
4 × 10−1   0.2533  |  4 × 10−2   1.7507  |  4 × 10−3   2.6521
5 × 10−1   0       |  5 × 10−2   1.6449  |  5 × 10−3   2.5758
                   |  6 × 10−2   1.5548  |  6 × 10−3   2.5121
                   |  7 × 10−2   1.4758  |  7 × 10−3   2.4573
                   |  8 × 10−2   1.4051  |  8 × 10−3   2.4089
                   |  9 × 10−2   1.3408  |  9 × 10−3   2.3656
1 × 10−4   3.7190  |  1 × 10−5   4.2649  |  1 × 10−6   4.7534
1.5 × 10−4 3.6153  |  1.5 × 10−5 4.1735  |  1.5 × 10−6 4.6708
2 × 10−4   3.5401  |  2 × 10−5   4.1075  |  2 × 10−6   4.6114
3 × 10−4   3.4316  |  3 × 10−5   4.0128  |  3 × 10−6   4.5264
4 × 10−4   3.3528  |  4 × 10−5   3.9444  |  4 × 10−6   4.4652
5 × 10−4   3.2905  |  5 × 10−5   3.8906  |  5 × 10−6   4.4172
6 × 10−4   3.2389  |  6 × 10−5   3.8461  |  6 × 10−6   4.3776
7 × 10−4   3.1947  |  7 × 10−5   3.8082  |  7 × 10−6   4.3439
8 × 10−4   3.1559  |  8 × 10−5   3.7750  |  8 × 10−6   4.3145
9 × 10−4   3.1214  |  9 × 10−5   3.7455  |  9 × 10−6   4.2884
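If you don't have the table handy, Q−1 is easy to compute by bisection, since Q(x) = (1/2) erfc(x/√2) is monotone decreasing. A minimal Python sketch (the function names are mine; the course itself uses Matlab):

```python
import math

def Q(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Qinv(p):
    """Invert Q by bisection; Q is monotone decreasing on [-10, 10]."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Q(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, Qinv(1e-3) returns approximately 3.0902 and Qinv(1e-6) approximately 4.7534, matching the table.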
Figure: The inverse Q function, Q−1(x), plotted versus x on a logarithmic axis from 10−7 to 10^0.

30.2 Power and Energy Limited Channels

Assume the C/N0, the maximum bandwidth, and the maximum BER are all given. Sometimes power is the limiting factor in determining the maximum achievable bit rate. Such links (or channels) are called power limited channels. Sometimes bandwidth is the limiting factor in determining the maximum achievable bit rate; in this case, the link (or channel) is called a bandwidth limited channel. You just need to try to solve the problem and see which one limits your system. Here is a step-by-step version of what you might need to do in this case:

Method A: Start with a power-limited assumption:

1. Use the probability of error constraint to determine the Eb/N0 constraint, given the appropriate probability of error formula for the modulation.
2. Given the C/N0 constraint and the Eb/N0 constraint, find the maximum bit rate. Note that Rb = 1/Tb = (C/N0)/(Eb/N0), but be sure to express both in linear units.
3. Given the maximum bit rate, calculate the maximum symbol rate Rs = Rb / log2 M and then compute the required bandwidth using the appropriate bandwidth formula.
4. Compare the bandwidth at maximum Rs to the bandwidth constraint: if the bandwidth at Rs is too high, then the system is bandwidth limited; reduce your bit rate to conform to the BW constraint. Otherwise, your system is power limited, and your Rb is achievable.

Method B: Start with a bandwidth-limited assumption:
1. Use the bandwidth constraint and the appropriate bandwidth formula to find the maximum symbol rate Rs and then the maximum bit rate Rb.
2. Find the Eb/N0 at the given bit rate by computing Eb/N0 = (C/N0)/Rb. (Again, make sure that everything is in linear units.)
3. Find the probability of error at that Eb/N0, using the appropriate probability of error formula.
4. If the computed P[error] is greater than the BER constraint, then your system is power limited. Use the previous method to find the maximum bit rate. Otherwise, your system is bandwidth limited, and you have found the correct maximum bit rate.

Example: Rice 6.33
Consider a bandpass communications link with a bandwidth of 1.5 MHz and with an available C/N0 = 82 dB Hz. The maximum bit error rate is 10−6.

1. If the modulation is 16-PSK using the SRRC pulse shape with α = 0.5, what is the maximum achievable bit rate on the link? Is this a power limited or bandwidth limited channel?
2. If the modulation is square 16-QAM using the SRRC pulse shape with α = 0.5, what is the maximum achievable bit rate on this link? Is this a power limited or bandwidth limited channel?

Solution:
1. Try Method A. For M = 16 PSK, we can find the Eb/N0 for the maximum BER using the appropriate probability of error formula:

10−6 = P[error] = (2/log2 M) Q( sqrt( 2 (log2 M) sin²(π/M) (Eb/N0) ) )
10−6 = (2/4) Q( sqrt( 2(4) sin²(π/16) (Eb/N0) ) )
Eb/N0 = [Q−1(2 × 10−6)]² / (8 sin²(π/16)) = 69.84   (52)

Converting C/N0 to linear, C/N0 = 10^(82/10) = 1.585 × 10^8. Solving for Rb,

Rb = (C/N0) / (Eb/N0) = 1.585 × 10^8 / 69.84 = 2.27 Mbits/s

and thus Rs = Rb / log2 M = 2.27 × 10^6 / 4 = 5.67 × 10^5 symbols/s. The required bandwidth for this system is

BT = (1 + α) Rb / log2 M = 1.5 (2.27 × 10^6) / 4 = 851 kHz

This is clearly lower than the maximum bandwidth of 1.5 MHz. So, the system is power limited, and can operate with bit rate 2.27 Mbits/s. (If BT had come out greater than 1.5 MHz, we would have needed to reduce Rb to meet the bandwidth limit.)

2. Try Method A. For M = 16 (square) QAM,

10−6 = P[error] = (4/log2 M) ((√M − 1)/√M) Q( sqrt( (3 log2 M / (M − 1)) (Eb/N0) ) )
10−6 = (4/4)(3/4) Q( sqrt( (3(4)/15) (Eb/N0) ) )
Eb/N0 = (15/12) [Q−1((4/3) × 10−6)]² = 27.55   (53)

Solving for Rb,

Rb = (C/N0) / (Eb/N0) = 1.585 × 10^8 / 27.55 = 5.75 Mbits/s

The required bandwidth for this bit rate is:

BT = (1 + α) Rb / log2 M = 1.5 (5.75 × 10^6) / 4 = 2.16 MHz

This is greater than the maximum bandwidth of 1.5 MHz, so we must reduce the bit rate to

Rb = BT log2 M / (1 + α) = (1.5 MHz)(4) / 1.5 = 4 Mbits/s

In summary, we have a bandwidth limited system with a bit rate of 4 Mbits/s.

30.3 Computing Received Power

We need to design our system for the power which we will receive. How can we calculate how much power will be received? We can do this for both wireless and wired channels; this section of the class provides two examples.

30.3.1 Free Space

'Free space' is the idealization in which nothing exists except for the transmitter and receiver. Wireless propagation models are required; the free space model is one, but there are others. Free space can really only be used for deep space communications. But this formula serves as a starting point for other radio propagation formulas. In free space, the received power is calculated from the Friis formula,

C = PR = PT GT GR λ² / (4πR)²

where

• GT and GR are the antenna gains at the transmitter and receiver, respectively.
• λ is the wavelength at the signal frequency. For narrowband signals, the wavelength is nearly constant across the bandwidth, so we just use the center frequency fc. Note that λ = c/fc where c = 3 × 10^8 meters per second is the speed of light.
• PT is the transmit power.

Here, everything is in linear terms. Typically people use decibels to express these numbers, and we will write [PR]dBm or [GT]dB to denote that they are given by:

[PR]dBm = 10 log10 PR,    [GT]dB = 10 log10 GT    (54)

In lecture 2, we showed that the Friis formula, given in dB, is

[C]dBm = [GT]dB + [GR]dB + [PT]dBm + 20 log10 (λ/(4π)) − 20 log10 R

This says the received power, C, is linearly proportional to the log of the distance R. The Rice book writes the Friis formula as

[C]dBm = [GT]dB + [GR]dB + [PT]dBm + [Lp]dB

where

[Lp]dB = +20 log10 (λ/(4π)) − 20 log10 R

and [Lp]dB is called the "channel loss". (This name that Rice chose is unfortunate, because if it was positive, it would be a "gain". Effectively, it should be called "channel gain" and have a negative gain value; it is typically very negative.)

30.3.2 Non-free-space Channels

We don't study radio propagation path loss formulas in this course. But a very short summary is that radio propagation on Earth is different than the Friis formula suggests. Because of shadowing caused by buildings, trees, etc., the average loss may increase more quickly than 1/R²; instead, it may be more like 1/R^n. Lots of other formulas exist that approximate the received power as a function of distance, antenna heights, type of environment, etc. For example, whenever path loss is linear with the log of distance,

[Lp]dB = L0 − 10 n log10 R,

for some constants n and L0. There are also typically other losses in a transmitter / receiver system: losses in a cable, other imperfections, etc. Rice lumps these in as the term L and writes:

[C]dBm = [GT]dB + [GR]dB + [PT]dBm + [Lp]dB + [L]dB

30.3.3 Wired Channels

Typically wired channels are lossy as well, but the loss is modeled as linear in the length of the cable. For example,

[C]dBm = [PT]dBm − R [L1m]dB

where PT is the transmit power, R is the cable length in meters, and L1m is the loss per meter.
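As a sanity check on the two forms of the Friis formula, here is a sketch (Python rather than Matlab; the function names are mine) of the linear and dB versions, which should agree exactly:

```python
import math

def friis_linear(pt_w, gt, gr, fc_hz, r_m):
    """Free-space received power C = PT*GT*GR*lambda^2 / (4*pi*R)^2,
    all quantities in linear units."""
    lam = 3e8 / fc_hz
    return pt_w * gt * gr * lam ** 2 / (4 * math.pi * r_m) ** 2

def friis_db(pt_db, gt_db, gr_db, fc_hz, r_m):
    """Same budget in dB form:
    [C] = [GT] + [GR] + [PT] + 20*log10(lambda/(4*pi)) - 20*log10(R)."""
    lam = 3e8 / fc_hz
    return (gt_db + gr_db + pt_db +
            20 * math.log10(lam / (4 * math.pi)) - 20 * math.log10(r_m))
```

Doubling the distance R reduces the received power by a factor of 4 (6 dB), as the 1/R² dependence predicts.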
30.4 Computing Noise Energy

The noise energy N0 can be calculated as:

N0 = k Teq

where k = 1.38 × 10−23 J/K is Boltzmann's constant and Teq is called the equivalent noise temperature in Kelvin. Basically, all receiver circuits add noise to the received signal. The equivalent temperature is a function of the receiver design; Teq is always going to be higher than the temperature of your receiver, but with proper design, Teq can be kept low. This is a topic covered in another course, and so you are not responsible for that material here.

30.5 Examples

Example: Rice 6.36
Consider a "point-to-point" microwave link. (Such links were the key component in the telephone company's long distance network before fiber optic cables were installed.) Both antenna gains are 20 dB and the transmit antenna power is 10 W. The modulation is 51.84 Mbits/sec 256 square QAM with a carrier frequency of 4 GHz. Atmospheric losses are 2 dB and other incidental losses are 2 dB. A pigeon in the line-of-sight path causes an additional 2 dB loss. The receiver has an equivalent noise temperature of 400 K and an implementation loss of 1 dB. How far away can the two towers be if the bit error rate is not to exceed 10−8? Include the pigeon.

Solution: Starting with the modulation, M = 256 square QAM (log2 M = 8, √M = 16). To achieve P[error] = 10−8,

10−8 = (4/log2 M) ((√M − 1)/√M) Q( sqrt( (3 log2 M / (M − 1)) (Eb/N0) ) )
10−8 = (4/8)(15/16) Q( sqrt( (24/255) (Eb/N0) ) )
Eb/N0 = (255/24) [Q−1((32/15) × 10−8)]² = 319.0

The noise energy is N0 = k Teq = 1.38 × 10−23 (J/K) × 400 (K) = 5.52 × 10−21 J. So Eb = 319.0 × 5.52 × 10−21 = 1.76 × 10−18 J. Since Eb = C/Rb and the bit rate Rb = 51.84 × 10^6 bits/sec,

C = (51.84 × 10^6 1/s)(1.76 × 10−18 J) = 9.13 × 10−11 W

or −100.4 dBW.

Next, switching to finding an expression for C from the link budget: the wavelength is λ = (3 × 10^8 m/s)/(4 × 10^9 1/s) = 0.075 m. Neal's hint: use the dB version of the Friis formula and subtract the mentioned dB losses: atmospheric losses, incidental losses, implementation loss, and the pigeon. So:

[C]dBW = [GT]dB + [GR]dB + [PT]dBW + 20 log10 (λ/(4π)) − 20 log10 R − 2 dB − 2 dB − 2 dB − 1 dB
       = 20 dB + 20 dB + 10 dBW + 20 log10 (0.075 m / (4π)) − 20 log10 R − 7 dB
       = −1.48 dBW − 20 log10 R   (55)
Plugging in [C]dBW = −100.4 dBW = −1.48 dBW − 20 log10 R and solving for R, we find R = 88.3 km. Thus microwave towers should be placed at most 88.3 km (about 55 miles) apart.

Lecture 20

Today: (1) Timing Synchronization Intro, (2) Interpolation Filters

31 Timing Synchronization

At the receiver, a transmitted signal arrives with an unknown delay τ. The received complex baseband signal (for QAM, PAM, or QPSK) can be written (assuming carrier synchronization) as

r(t) = Σ_k Σ_m a_m(k) φ_m(t − kTs − τ)   (56)

where a_m(k) is the mth transmitted symbol component at symbol time k.

As a start, let's compare receiver implementations for a (mostly) continuous-time and a (more) discrete-time receiver. Figure 54 has a timing synchronization loop which controls the sampler (the ADC). The input signal is downconverted and then run through matched filters, which correlate the signal with φ_n(t − t_k) for each n, and for some delay t_k:

x_n(k) = ⟨r(t), φ_n(t − t_k)⟩ = Σ_k Σ_m a_m(k) ⟨φ_m(t − kTs − τ), φ_n(t − t_k)⟩   (57)

This t_k is the correct timing delay at each correlator for the kth symbol. Note that if t_k = kTs + τ, then the correlation ⟨φ_n(t − t_k), φ_n(t − kTs − τ)⟩ is highest and closest to 1. But these delays are generally unknown to the receiver until timing synchronization is complete.

Figure 54: Block diagram for a continuous-time receiver, including analog timing synchronization (from the Rice book, Figure 8.3.1).

Figure 55 shows a receiver with an ADC immediately after the downconverter. Here, note the ADC has nothing controlling it. Instead, after the matched filter, an interpolator corrects the sampling time problems using discrete-time processing. This interpolation is the subject of this section.
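The claim that the correlation is highest (and near 1) only when the correlator delay matches kTs + τ can be checked numerically for a sinc pulse. A rough sketch (Python rather than Matlab, my function names; the inner-product integral is approximated by a Riemann sum):

```python
import math

def sinc_pulse(t):
    """Unit-energy sinc pulse with T = 1: phi(t) = sin(pi t)/(pi t)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def corr_at_offset(dt, span=50.0, step=0.01):
    """Approximate <phi(t), phi(t - dt)> by a Riemann sum over [-span, span]."""
    s, t = 0.0, -span
    while t <= span:
        s += sinc_pulse(t) * sinc_pulse(t - dt) * step
        t += step
    return s
```

At zero timing offset the correlation is approximately 1; at half a symbol of offset it drops (to about 0.64 for this pulse), and at a full symbol offset it is approximately 0.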
Figure 55: Block diagram for a digital receiver for QAM/PSK, including discrete-time timing synchronization: the ADC output r(nTsa) is matched filtered, passed through a digital interpolator producing r(kTsy), and the timing synch control drives the interpolator.

Another implementation is like Figure 55 but, instead of the interpolator, the timing synch control block is fed back to the ADC. But this requires a DAC and feedback to the analog part of the receiver, which is not preferred. Also, because of the processing delay, this digital and analog feedback loop can be problematic. The industry trend is more and more towards digital implementations. A 'software radio' follows the idea that as much of the radio as possible is done in digital; the idea is to "bring the ADC to the antenna" for purposes of reconfigurability, and for reducing part counts and costs.

In this class, we will emphasize digital timing synchronization using an interpolation filter. First, we'll talk about interpolation, and then, we'll consider the control loop.

32 Interpolation

For example, consider Figure 56. In this figure, a BPSK receiver samples the matched filter output at a rate of twice per symbol, unsynchronized with the symbol clock, i.e., with a timing error compared to the actual symbol clock. If downsampling (by 2) results in the symbol samples r_k given by the red squares, then sampling sometimes reduces the magnitude of the desired signal.

Figure 56: Samples of the matched filter output (BPSK, RRC with α = 1) taken at twice the correct symbol rate (vertical lines).

Some example sampling clocks, compared to the actual symbol clock, are shown in Figure 57. When we say 'synchronized in rate', we mean within an integer multiple, since the sampling clock must operate at (at least) double the symbol rate.
These are shown in increasing degrees of severity of the correction required at the receiver.
Figure 57: (1) Sampling clock and (2-4) possible actual symbol clocks. The symbol clock may be (2) synchronized in rate and phase with the sample clock, (3) synchronized in rate but not in phase, or (4) synchronized neither in rate nor phase.

Def'n: Incommensurate
Two clocks with rates T and Ts are incommensurate if the ratio T/Ts is irrational. In contrast, two clocks are commensurate if the ratio T/Ts can be written as n/m where n, m are integers.

For example, let T/Ts = 1/2. That is, the two clocks are commensurate and we sample exactly twice per symbol period. As another example, if T/Ts = 2/5, we sample exactly 2.5 times per symbol, and every 5 samples the delay until the next correct symbol sample will repeat. The situation shown in Figure 56 is case (3), where T/Ts = 1/2 (the clocks are commensurate), but the sampling clock does not have the correct phase (τ is not equal to an integer multiple of T). Since clocks are generally incommensurate, we cannot count on them ever repeating. In general, our receivers always deal with type (4) sampling clock error as drawn in Figure 57.

32.1 Sampling Time Notation

In general, for both cases (3) and (4) in Figure 57, the correct sampling times should be kTs + τ, where Ts is the actual symbol period used by the transmitter, but no samples were taken at those instants. Instead, kTs + τ is always µ(k)T after the most recent sampling instant. We can write that

kTs + τ = [m(k) + µ(k)]T

where m(k) is an integer, the highest integer n such that nT < kTs + τ, and 0 ≤ µ(k) < 1; µ(k) is called the fractional interval. That is,

m(k) = ⌊(kTs + τ)/T⌋   (58)

where ⌊x⌋ is the greatest integer less than function (the Matlab floor function). This means that µ(k) is given by

µ(k) = (kTs + τ)/T − m(k)

Example: Calculation Example
Let Ts/T = 3.1 and τ/T = 1.8. Calculate (m(k), µ(k)) for k = 1, 2, 3.
Solution:

m(1) = ⌊3.1 + 1.8⌋ = 4,      µ(1) = 0.9
m(2) = ⌊2(3.1) + 1.8⌋ = 8,   µ(2) = 0
m(3) = ⌊3(3.1) + 1.8⌋ = 11,  µ(3) = 0.1

Thus your interpolation will be done: in between samples 4 and 5; at sample 8; and in between samples 11 and 12.

32.2 Seeing Interpolation as Filtering

Consider the output of the matched filter, r(t) as given in (56). The analog output of the matched filter could be represented as a function of its samples r(nT),

r(t) = Σ_n r(nT) hI(t − nT)   (59)

where

hI(t) = sin(πt/T) / (πt/T)

Why is this so? What are the conditions necessary for this representation to be accurate? If we wanted the signal at the correct sampling times, we could have it; we just need to calculate r(t) at another set of times (not nT). Call the correct symbol sampling times kTs + τ for integer k. Plugging these times in for t in (59), we have that

r(kTs + τ) = Σ_n r(nT) hI(kTs + τ − nT)

Now, since kTs + τ = [m(k) + µ(k)]T, using the (m(k), µ(k)) notation,

r([m(k) + µ(k)]T) = Σ_n r(nT) hI([m(k) − n + µ(k)]T)

Re-indexing with i = m(k) − n,

r([m(k) + µ(k)]T) = Σ_i r([m(k) − i]T) hI([i + µ(k)]T)   (60)

This is a filter on samples of r(·), where the filter coefficients are dependent on µ(k). The desired sample at [m(k) + µ(k)]T is calculated by adding the weighted contribution from the signal at each sampling time. Note: Good illustrations are given in M. Rice, Figures 8.4.12 and 8.4.13.

32.3 Approximate Interpolation Filters

Clearly, (60) is a filter. The problem is that in general, this requires an infinite sum over i from −∞ to ∞, because the sinc function has infinite support. Instead, we use polynomial approximations for hI(t):

• The easiest one we're all familiar with is linear interpolation (a first-order polynomial), in which we draw a straight line between the two nearest sampled values to approximate the values of the continuous signal between the samples. This isn't so great of an approximation.
• A second-order polynomial (i.e., a parabola) is actually a very good approximation. Given three points, one can determine a parabola that fits those three points exactly. However, the three-point fit does not result in a linear-phase filter. (To see this, note in the time domain that two samples are on one side of the interpolation point, and one on the other. This is temporal asymmetry.) Instead, we can use four points to fit a second-order polynomial, and get a linear-phase filter.

• Finally, we could use a cubic interpolation filter. Four points determine a 3rd order polynomial, and result in a linear-phase filter.

To see results for different order polynomial filters, see M. Rice Figure 8.4.24.

32.4 Implementations

Note: These polynomial filters are called Farrow filters, named after Cecil W. Farrow of AT&T Bell Labs, who holds the US patent (1989) for the "Continuously variable digital delay circuit". Farrow filters started to be used in the 90's and are now very common due to the dominance of digital processing in receivers.

From (60), we can see that the filter coefficients are a function of µ(k), the fractional interval. Thus we could rewrite (60) as

r([m(k) + µ(k)]T) = Σ_i r([m(k) − i]T) hI(i; µ(k))   (61)

That is, the filter is hI(i), but its values are a function of µ(k).

Example: First order polynomial interpolation
For example, consider the linear interpolator,

r([m(k) + µ(k)]T) = Σ_{i=−1}^{0} r([m(k) − i]T) hI(i; µ(k))

What are the filter elements hI for a linear interpolation filter?

Solution:

r([m(k) + µ(k)]T) = µ(k) r([m(k) + 1]T) + [1 − µ(k)] r(m(k)T)

Here we have used hI(−1; µ(k)) = µ(k) and hI(0; µ(k)) = 1 − µ(k). Note that the i indices seem backwards. Essentially, given µ(k), we form a weighted average of the two nearest samples. As µ(k) → 0, we should take the r(m(k)T) sample exactly; as µ(k) → 1, we should take the r([m(k) + 1]T) sample exactly.

32.4.1 Higher order polynomial interpolation filters

In general, the filter coefficients are a polynomial function of µ(k); that is, they are a weighted sum of µ(k)^0, µ(k)^1, µ(k)^2, ..., µ(k)^p for a pth order polynomial filter:

hI(i; µ(k)) = Σ_{l=0}^{p} b_l(i) µ(k)^l

A full table of b_l(i) is given in Table 8.1 of the M. Rice handout.
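Putting Section 32.1 and the interpolators together, here is a minimal sketch (Python rather than Matlab; the function names are mine). It computes (m(k), µ(k)) as in the earlier calculation example, applies the first-order interpolator, and generates the piecewise-parabolic (2nd order Farrow) taps with free parameter α, matching the coefficient polynomials used in the example that follows:

```python
import math

def frac_timing(Ts_over_T, tau_over_T, k):
    """(m(k), mu(k)) such that k*Ts + tau = (m(k) + mu(k)) * T (Sec. 32.1)."""
    x = k * Ts_over_T + tau_over_T
    m = math.floor(x)
    return m, x - m

def interp_linear(r, m, mu):
    """First-order interpolator: h_I(-1) = mu, h_I(0) = 1 - mu."""
    return mu * r[m + 1] + (1.0 - mu) * r[m]

def farrow2_taps(mu, alpha=0.5):
    """Piecewise-parabolic (2nd order Farrow) taps h_I(i; mu), i = -2..1."""
    return [
        alpha * mu ** 2 - alpha * mu,             # h_I(-2)
        -alpha * mu ** 2 + (1 + alpha) * mu,      # h_I(-1)
        -alpha * mu ** 2 + (alpha - 1) * mu + 1,  # h_I(0)
        alpha * mu ** 2 - alpha * mu,             # h_I(1)
    ]
```

frac_timing reproduces the calculation example (m = 4, 8, 11 with µ = 0.9, 0, 0.1); farrow2_taps(0.5) gives [−0.125, 0.625, 0.625, −0.125], the halfway-interpolation taps derived in the 2nd order Farrow example; and the taps sum to 1 for any µ.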
For the 2nd order Farrow filter, there is an extra degree of freedom: you can select the parameter α to be in the range 0 < α < 1. It has been shown by simulation that α = 0.43 is best, but people tend to use α = 0.5 because it is only slightly worse, and division by two is extremely easy in digital filters.

Example: 2nd order Farrow filter
What is the Farrow filter for α = 0.5 which interpolates exactly halfway between sample points?

Solution: From the problem statement, µ = 0.5. Since µ² = 0.25,

hI(−2; 0.5) = αµ² − αµ = 0.125 − 0.25 = −0.125
hI(−1; 0.5) = −αµ² + (1 + α)µ = −0.125 + 0.75 = 0.625
hI(0; 0.5) = −αµ² + (α − 1)µ + 1 = −0.125 − 0.25 + 1 = 0.625
hI(1; 0.5) = αµ² − αµ = 0.125 − 0.25 = −0.125   (62)

Does this make sense? Do the weights add up to 1? Does it make sense to subtract a fraction of the two more distant samples?

Example: Matlab implementation of Farrow Filter
My implementation is called ece5520 lec20.m and is posted on WebCT. Be careful, as my implementation uses a loop, rather than vector processing.

Lecture 21

Today: (1) Timing Synchronization (2) PLLs

33 Final Project Overview

Figure 58: Flow chart of the final project. The received signal s[n] is mixed with 2 cos(Ω0 n) and 2 sin(Ω0 n), matched filtered (h[n] = p(nT)) and downsampled by N/2 on the I and Q branches, then interpolated to produce rI(k) and rQ(k). The timing error detector, loop filter, and VCC (with control voltage v(n), producing the symbol strobe and µ) close the timing loop; the symbol decision produces all bits, and the preamble detector separates the DATA bits.
For your final project, you will implement the above receiver, that is, timing synchronization for QPSK. Compared to last year's class, you have the advantage of having access to the Rice book, which contains much of the code for a BPSK implementation. When you put these blocks together and implement them for QPSK, you will need to understand how and why they work, in order to fix any bugs. The project adds these blocks to your QPSK implementation:

1. Interpolation Filter
2. Timing Error Detector (TED)
3. Loop Filter
4. Voltage Controlled Clock (VCC)
5. Preamble Detector

These notes address the motivation and overall design of each block.

33.1 Review of Interpolation Filters

Timing synchronization is necessary to know when to sample the matched filter output. We want to sample at times [n + µ]Ts, where n is the integer part and µ is the fractional offset.

• Problem: After the matched filter, the samples may be at incorrect times, and in modern discrete-time implementations, there may be no analog feedback to the ADC.
• Solution: From samples taken at or above the Nyquist rate, you can interpolate between samples to find the desired sample.

However, this solution leads to new problems:

• Problem: True interpolation requires significant calculation: the sinc filter has infinite impulse response.
• Solution: Approximate the sinc function with a 2nd or 3rd order polynomial interpolation filter. Often, it works nearly as well.
• Problem: How do you know when the correct symbol sampling time should be?
• Solution: Use a timing locked loop, analogous to a phase locked loop.

33.2 Timing Error Detection

We consider timing error detection blocks that operate after the matched filter and interpolator in the receiver. Implementations may be continuous time, discrete time, or a mix; we focus on the discrete time solutions. Often, we leave out the Ts and simply talk about the index n or the fractional delay µ. The job of the timing error detector is to produce a voltage error signal e(n) which is proportional to the offset between the current sampling offset µ̂ and the correct sampling offset.

There are several possible timing error detection (TED) methods, as related in Section 8.4.1 of the Rice book, which is a good source for detail on many possible methods. We will talk through two common discrete-time implementations:

1. Early-late timing error detector (ELTED)
2. Zero-crossing timing error detector (ZCTED)

We'll use a continuous-time realization of a BPSK received matched filter output, x(t), to illustrate the operation of timing error detectors. You can imagine that the interpolator output xI(n) is a sampled version of x(t), hopefully with some samples taken at exactly the correct symbol sampling times.

Figure 59: A sample eye diagram of an RRC-shaped BPSK received signal (post matched filter). Here (a) shows the signal x(t) and (b) shows its derivative ẋ(t). In both, time 0 corresponds to a correct symbol sampling time.

33.3 Early-late timing error detector (ELTED)

The early-late TED is a discrete-time implementation of the continuous-time "early-late gate" timing error detector. Figure 59(a) shows an example eye diagram of the signal x(t) and Figure 59(b) shows its derivative ẋ(t). Since the derivative ẋ(t) is close to zero at the correct sampling time, and increases away from the correct sampling time, we can use it to indicate how far from the sampling time we might be. Figure 59 shows that the derivative is a good indication of timing error when the bit is changing, e.g., from −1 to +1 or from +1 to −1. When the bit is constant (e.g., +1 to +1) the slope is close to zero. In general, at time n, the error e(n) is determined by the slope and the sign of the sample x(n). In particular, consider for BPSK the value of ẋ(t) sgn{x(t)}.
The signum function sgn{x(t)} just corrects for the sign:

• A positive slope would mean we're behind for a +1 symbol, but would mean that we're ahead for a −1 symbol.
• In the opposite manner, a negative slope would mean we're ahead for a +1 symbol, but would mean that we're behind for a −1 symbol.

For a discrete-time system, we'd estimate ẋ(t) from the samples out of the interpolator and see that

e(n − 1) = {xI(n) − xI(n − 2)} sgn{xI(n − 1)}

Here, we don't divide by the time duration 2Ts because it is just a scale factor, and we just need something proportional to the timing error. Note that you get only an approximation of the slope unless the sampling rate SPS (or N, as it is called in the Rice book) is high.

There are imperfections in the ELTED which (hopefully) average out over many bits: when the bit sequence has a transition on only one side of the current bit, the slope will be somewhat positive or somewhat negative even when the sample is taken at the right time. In the project, we can send alternating bits during the synchronization period in order to help the ELTED more quickly converge. This timing error detector has a theoretical gain Kp, shown in a figure in the Rice book, which is a function of the excess bandwidth of the RRC filter used.

33.4 Zero-crossing timing error detector (ZCTED)

The zero-crossing timing error detector (ZCTED) is also described in detail in Section 8.4 of the Rice book. It is described here for BPSK systems. This error detector assumes that the sampling rate is SPS = 2, so every second sample is a symbol sampling time. The error is

e(k) = x((k − 1/2)Ts + τ̂) [â(k − 1) − â(k)]   (63)

where the â(k) term is the estimate of the kth symbol (either +1 or −1),

â(k − 1) = sgn{x((k − 1)Ts + τ̂)}   (64)
â(k) = sgn{x(kTs + τ̂)}   (65)

The error detector output is nonzero only at symbol sampling times. If there was a symbol transition, that is, the sign changes between samples n − 2 and n, then the intermediate sample n − 1 should be approximately zero, if it was taken at the correct sampling time. In terms of n, (63) can be rewritten as

e(n) = xI(n − 1) [sgn{xI(n − 2)} − sgn{xI(n)}]   (66)

This operation is drawn in Figure 60. If the symbol strobe is not activated at sample n, then the error is e(n) = 0. The difference of signs in (66) will be ±2 when the sign changes, which indicates a symbol transition; this factor of two comes from the difference of signs in (66) and contributes to the gain Kp of this part of the loop. The theoretical gain of the ZCTED is twice that of the ELTED. In the project, you will need to look up Kp (see Figure 8.17 of Rice) and then multiply it by 2 to find the gain Kp of your ZCTED.
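Both error detectors reduce to one-line computations on the interpolator output. A sketch (Python rather than Matlab, my function names; xI is a list of interpolator outputs and n indexes a symbol sampling time):

```python
def elted_error(xI, n):
    """Early-late TED: e(n-1) = (x_I(n) - x_I(n-2)) * sgn(x_I(n-1))."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    return (xI[n] - xI[n - 2]) * sgn(xI[n - 1])

def zcted_error(xI, n):
    """Zero-crossing TED: e(n) = x_I(n-1) * (sgn(x_I(n-2)) - sgn(x_I(n)))."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    return xI[n - 1] * (sgn(xI[n - 2]) - sgn(xI[n]))
```

For the ZCTED, a +1 to −1 transition whose midpoint sample is nonzero (late or early sampling) produces a nonzero error scaled by 2, while a constant bit sequence produces zero error.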
The input strobe signal activates the sampler when it is the correct symbol sampling time. When the sign changes, the symbol has switched, and the ZCTED output is ±2 xI(n − 1), which would be approximately zero if the sample clock is synchronized.

Figure 60: Block diagram of the zero-crossing timing error detector (ZCTED).

33.4.4.1 QPSK Timing Error Detection

When using QPSK, both the in-phase and quadrature signals are used to calculate the error term. The error is simply the sum of the two errors calculated for each signal. See (8.100) in the Rice book for the exact expression.

33.4.5 Voltage Controlled Clock (VCC)

We'll use a decrementing VCC in the project. (The choice of incrementing or decrementing is arbitrary.) This is shown in Figure 61. An example trace of the numerically-controlled oscillator (NCO), which is the heart of the VCC, is shown in Figure 62.

Figure 61: Block diagram of the decrementing VCC with control voltage v(n). When the symbol strobe is high, the fractional interval is changed to µ = NCO(n − 1)/W(n).

Figure 62: Numerically-controlled oscillator signal NCO(n).
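A minimal sketch of the decrementing NCO inside the VCC may help (Python rather than the project's Matlab; the decrement W(n) = 1/SPS + v(n) and the fractional-interval update follow (67) and the surrounding description, while the function shape is our own):

```python
def vcc_step(nco, mu, v, sps=2):
    """One sample of a decrementing, modulo-1 NCO (a sketch).

    The strobe fires when the decrement would take the NCO below zero;
    at that instant the fractional interval is mu = NCO(n-1)/W(n), (67).
    """
    w = 1.0 / sps + v          # W(n): decrement per sample
    strobe = nco < w           # this decrement would underflow
    if strobe:
        mu = nco / w           # recalculate fractional interval, (67)
        nco = nco - w + 1.0    # modulo-1 wrap
    else:
        nco = nco - w          # mu is kept the same
    return nco, mu, strobe
```

With v(n) = 0 the strobe fires once every SPS samples; a positive v(n) makes the NCO underflow sooner, advancing the symbol clock, and a negative v(n) retards it.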
The NCO starts at 1. At each sample n, the NCO decrements by W(n) = 1/SPS + v(n). Once per symbol, (on average) the NCO voltage drops below zero. When this happens, the strobe is set to high, meaning that a sample must be taken. When the symbol strobe is activated, the fractional interval is also recalculated,

µ(n) = NCO(n − 1)/W(n)   (67)

When the strobe is not activated, µ(n) is kept the same: µ(n) = µ(n − 1). You will prove that (67) is the correct form for µ(n) in the HW 9 assignment.

33.6 Phase Locked Loops

Carrier synchronization is done using a phase-locked loop (PLL). This section is included for reference: for those of you familiar with PLLs, the operation of the timing synchronization loop may be best understood by analogy to the continuous-time carrier synchronization loop. Consider a generic PLL block diagram in Figure 63. The incoming signal is a carrier wave, A cos(2πfc t + θ(t)), where θ(t) is the phase offset. It is a function of time, t, because it may not be constant. For example, it may include a frequency offset, and then the phase offset is a ramp, θ(t) = ∆f t + α. The job of the PLL is to estimate θ(t). We call this estimate θ̂(t). The error in the phase estimate is

θe(t) = θ(t) − θ̂(t)

If the PLL is perfect, θe(t) = 0.

33.6.1 Phase Detector

If the estimate of the phase is not perfect, the phase detector produces a voltage to correct the phase estimate. Initially, let's ignore the loop filter. The function g(θe) may look something like the 'S' curve shown in Figure 64:

1. If the phase estimate is exactly correct, then g(θe) = 0 and the VCO output phase remains constant.
2. If the phase estimate is too high, that is, ahead of the carrier, then g(θe) < 0 and the VCO decreases the phase of the VCO output.
3. If the phase estimate is too low, that is, behind the carrier, then g(θe) > 0 and the VCO increases the phase of the VCO output.

This is the effect of the 'S' curve function g(θe) in Figure 64.
Figure 63: General block diagram of a phase-locked loop for carrier synchronization [Rice]. The input A cos(2πfc t + θ(t)) and the VCO output cos(2πfc t + θ̂(t)) enter the phase detector, which outputs g(θ(t) − θ̂(t)); the loop filter produces v(t), and the VCO integrates it: θ̂(t) = k0 ∫ v(x) dx.
33.6.2 Loop Filter

The loop filter is just a lowpass filter to reduce the noise. Note that in addition to the A cos(2πfc t + θ(t)) signal term, we also have channel noise coming into the system. We represent the loop filter in the Laplace or frequency domain as F(s) or F(f), respectively. We will be interested in discrete time later, so we could also write the Z-transform response as F(z).

33.6.3 VCO

A voltage-controlled oscillator (VCO) simply integrates the input and uses that integral as the phase of a cosine wave. The input is essentially the gas pedal for a race car driving around a track: increase the gas (voltage), and the engine (VCO) will increase the frequency of rotation around the track. Note that if you just raise the voltage for a short period and then reduce it back (a constant plus rect function input), you change the phase of the output, but leave the frequency the same.

33.6.4 Analysis

To analyze the loop in Figure 63, we end up making simplifications. In particular, we model the 'S' curve (Figure 64) as a line, i.e., g(θe) ≈ kp θe. This linear approximation is generally fine when the phase error is reasonably small, i.e., near θe = 0. If we do this, we can redraw Figure 63 as it is drawn in Figure 65. In the left half of Figure 65, we write just Θ(s) and Θ̂(s) even though the physical process, as we know, is the cosine of 2πfc t plus that phase. This phase-equivalent model of the PLL allows us to do analysis.

Figure 64: Typical phase detector input-output relationship, g(θe) vs. θe. This curve is called an 'S' curve, because its shape looks like the letter 'S' [Rice].

Figure 65: Phase-equivalent and linear model for the continuous-time PLL, in the Laplace domain [Rice]: Θ(s) enters the gain kp, then F(s), producing V(s), and k0/s produces Θ̂(s).
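The point that a constant-plus-rect input changes the output phase but not the final frequency can be seen in a short sketch (Python; the rectangular integration step and the names are our own):

```python
def vco_phase(v, k0=1.0, dt=1.0):
    """Sketch of the VCO: the phase is k0 times the running integral
    of the input voltage (rectangular integration)."""
    phase, out = 0.0, []
    for vi in v:
        phase += k0 * vi * dt
        out.append(phase)
    return out
```

Feeding a constant 0.1 with a brief pulse to 0.3 in the middle returns to a per-sample phase increment of 0.1 (the original frequency), but with the accumulated phase permanently shifted by the extra area of the pulse.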
We can write that

Θe(s) = Θ(s) − Θ̂(s) = Θ(s) − Θe(s) kp F(s) (k0/s)

To get the transfer function (when we consider the phase estimate to be the output), some manipulation shows you that

Ha(s) = Θ̂(s)/Θ(s) = [kp F(s)(k0/s)] / [1 + kp F(s)(k0/s)] = k0 kp F(s) / (s + k0 kp F(s))

You can use Ha(s) to design the loop filter F(s) and gains kp and k0 to achieve the desired PLL goals. Here those goals include:

1. Desired bandwidth.
2. Desired response to particular phase error models.

The latter deserves more explanation. For example, phase error models might be: a step input, or a ramp input (a frequency offset). Designing a filter to respond to these, but not to AWGN noise (as much as possible), is a challenge. The Rice book recommends a 2nd-order proportional-plus-integrator loop filter,

F(s) = k1 + k2/s

This filter has a proportional part (k1), which responds to the input during a phase offset, and an integrator part (k2/s), which integrates the phase input and, in case of a frequency offset, will ramp up the phase at the proper rate.

Note: An accelerating frequency change wouldn't be handled properly by the given filter. For example, if the temperature of the transmitter or receiver varied quickly, the frequency of the crystal would also change quickly. However, this is not typically fast enough to be a problem in most radios.

33.6.5 Discrete-Time Implementations

You can alter F(s) to be a discrete-time filter using Tustin's approximation,

1/s → (T/2) (1 + z⁻¹)/(1 − z⁻¹)

Because it is only a second-order filter, its implementation is quite efficient. In this case there are four parameters to set: k1, k2, k0, and kp. Rice Section C.2 details how to set these parameters to achieve a desired normalized bandwidth Bn T and a desired damping coefficient ζ (to avoid ringing, but to converge quickly), given the desired natural frequency and the gain constants K0 and Kp. Equations C.56 and C.57 (p. 736) show how to calculate the constants K1 and K2. See Appendix C of the Rice book (p. 733) for diagrams of this discrete-time PLL. You will design the particular loop filter to use in your final project in Homework 9.
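To see why the integrator path matters for a frequency offset, here is a hedged sketch of a linearized discrete-time loop with a proportional-plus-integrator filter. The gains below are illustrative assumptions only, not the design values from Rice Section C.2, and the 'S' curve is replaced by its linear approximation:

```python
def pll_track(theta, k1=0.1, k2=0.01):
    """Linearized discrete-time PLL with a proportional-plus-integrator
    loop filter. Returns the phase errors e(n) = theta(n) - theta_hat(n)."""
    theta_hat, integ, err = 0.0, 0.0, []
    for th in theta:
        e = th - theta_hat      # linearized phase detector, g(e) ~ e
        integ += k2 * e         # integrator path accumulates the error
        v = k1 * e + integ      # proportional-plus-integrator output
        theta_hat += v          # "VCO": accumulate the correction
        err.append(e)
    return err
```

For a ramp phase input (a frequency offset), the integrator converges to the per-sample phase increment, so the steady-state error goes to zero; a proportional-only filter would leave a constant residual error.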
Note that the normalized bandwidth Bn T is the most critical factor. It is typically much less than one (e.g., 0.005). This means that the loop filter effectively averages over many samples when setting the input to the VCO.

Lecture 22

Today: (1) Exam 2 Review

• Exam 2 is Tue Apr 14. The exam will be proctored.
• I will be out of town Mon-Wed. Please come to me with questions today or tomorrow.
• Office hours: Today 2-3:30, Friday 11-12, Monday 4-5pm.

34 Exam 2 Topics

Where to study:

• Lectures covered: 8 (detection theory) to 19. Lectures 20, 21 are not covered. No material from 1-7 will be directly tested, but of course you need to know some things from the start of the semester to be able to perform well on the second part of this course.
• Homeworks covered: 4-8. Please review these homeworks and know the correct solutions.
• Rice book: Chapters 5 and 6.

What topics:

1. Detection theory
2. Signal space
3. Digital modulation methods:
   • Binary, bipolar vs. unipolar
   • M-ary: PAM, QAM, PSK, FSK
4. Intersymbol interference, Nyquist filtering, SRRC pulse shaping
5. Differential encoding and decoding for BPSK
6. Energy detection for FSK
7. Gray encoding
8. Probability of Error vs. Eb/N0:
   • Exact expression, union bound, nearest-neighbor approximation
   • Standard modulation methods: both know the formula and know how it can be derived.
   • A non-standard modulation method, given its signal space diagram.
9. Bandwidth:
   • Bandwidth assuming SRRC pulse shaping for PAM/QAM/PSK
   • FSK, and the concept of frequency multiplexing
10. Comparison of digital modulation methods:
   • Need for linear amplifiers (transmitter complexity)
   • Receiver complexity
   • Bandwidth efficiency

Types of questions (format):

1. True/false or multiple choice. There may be multiple correct answers, but there may be many wrong answers (or right answer / wrong reason combinations).
2. Short answer, with a 1-2 sentence limit.
3. Work out a solution. Example: What is the probability of bit error vs. Eb/N0 for this signal space diagram?
4. System design: Here are some engineering requirements for this communication system. From this list of possible modulations, which one would you recommend? What bandwidth and bit rate would it achieve?

You will be provided the table of the Q function, and the plot of Qinv. You can use two sides of an 8.5 x 11 sheet of paper.

Lecture 23

Today: (1) Source Coding

• HW 9 is due today at 5pm. OH today 2-3pm.
• Final homework: HW 10, which will be due April 28 at 5pm.

35 Source Coding

This topic is a semester-long course at the graduate level in itself, but the basic ideas can be presented pretty quickly. One good reference:

• C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October, 1948. http://www.cs.bell-labs.com/cm/ms/what/shannonday/paper.html

We have done a lot of counting of bits as our primary measure of communication systems. Our information source is measured in bits, or in bits per second. Modulation schemes' bandwidth efficiency is measured in bits per Hertz, and energy efficiency is energy per bit over noise PSD. Everything is measured in bits! But information can often be represented in many different ways: images and sound can be encoded in different ways, and text files can be presented in different ways.
But how do we measure the bits of a source (e.g., email, an image, audio, or video)? Here are two misconceptions:

1. Using the file size tells you how much information is contained within the file.
2. Compressing a file (e.g., into a .zip) removes some of its information.

For example, the .zip or .tar files represent the exact same information that was contained in the original files, but with fewer bits. This is the idea behind source compression. Two types of compression algorithms:

• Lossless: e.g., Zip or compress. Note: Both "zip" and the unix "compress" commands use the Lempel-Ziv algorithm for source compression.
• Lossy: e.g., JPEG, MP3. This would introduce errors into the image or audio.

You can see that equivalent representations can use different numbers of bits. For example, consider a digital black & white image (not grayscale, but truly black or white):

1. You could store it as a list of pixels. Each pixel has two possibilities (possible messages), thus we could encode it in log2 2, or one bit per pixel.
2. You could simply send the coordinates of the pixels of one of the colors (e.g., all black pixels).

How many bits would be used in these two representations? What would make you decide which one is more efficient? What if we had a fixed number of bits to send any image, and we used the sparse B&W image coding scheme (2.)? Sometimes, the number of bits in the compressed image would exceed what we had allocated. This would introduce errors into the image.

So what is the intrinsic measure of bits of text, audio, images, or video?

35.1 Entropy

Entropy is a measure of the randomness of a random variable. Randomness and information, in non-technical language, are just two perspectives on the same thing:

• If you are told the value of a r.v. that doesn't vary that much, that telling conveys very little information to you.
• If you are told the value of a very "random" r.v., that telling conveys quite a bit of information to you.

Our technical definition of entropy of a random variable is as follows.

Def'n: Entropy
Let X be a discrete random variable with pmf pX(xi) = P[X = xi]. Here, there is a finite or countably infinite set SX, and x ∈ SX. Then the entropy of X, in units of bits, is defined as,

H[X] = − Σi pi log2 pi   (68)
We will shorten the notation by using pi as follows:

pi = pX(xi) = P[X = xi]

where {x1, x2, . . .} is an ordering of the possible values in SX.

Notes:

• H[X] is an operator on a random variable, not a function of a random variable. It returns a (deterministic) number. In this it is like E[X], another operator on a random variable.
• Entropy of a discrete random variable X is calculated using the probability values of the pmf of X, pi. Nothing else is needed.
• The sum will be from i = 1 . . . N when |SX| = N < ∞.
• Use that 0 log 0 = 0. This is true in the limit of x log x as x → 0+.
• All "log" functions are log-base-2 in information theory unless otherwise noted. The "reason" the units are bits is because of the base-2 of the log. Actually, when theorists use loge, the natural log, they express information in "nats," short for "natural" digits. Keep this in mind when reading a book on information theory.

Example: Binary r.v.
A binary (Bernoulli) r.v. has pmf,

pX(x) = { s,      x = 1
        { 1 − s,  x = 0
        { 0,      o.w.

What is the entropy H[X] as a function of s?
Solution: Entropy is given by (68) and is:

H[X] = −s log2 s − (1 − s) log2(1 − s)

The solution is plotted in Figure 66.

Figure 66: Entropy of a binary r.v. as a function of s = P[X = 1].

Example: Non-uniform source with five messages
Some signals are more often close to zero (e.g., audio). Model the r.v. X to have pmf,

pX(x) = { 1/16,  x = 2
        { 1/4,   x = 1
        { 1/2,   x = 0
        { 1/8,   x = −1
        { 1/16,  x = −2
        { 0,     o.w.
What is its entropy H[X]?
Solution:

H[X] = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + 2 (1/16) log2 16 = 15/8 bits   (69)

Other questions:

1. Do you need to know what the symbol set SX is?
2. Would multiplying X by 2 change its entropy?
3. Would an arbitrary one-to-one function change the entropy of X?

35.2 Joint Entropy

Def'n: Joint Entropy
The joint entropy of two random variables X1, X2 with event sets SX1 and SX2 is defined as

H[X1, X2] = − Σ_{x1 ∈ SX1} Σ_{x2 ∈ SX2} pX1,X2(x1, x2) log2 pX1,X2(x1, x2)   (70)

For N joint random variables X1, . . . , XN, the entropy is

H[X1, . . . , XN] = − Σ_{x1 ∈ SX1} · · · Σ_{xN ∈ SXN} pX1,...,XN(x1, . . . , xN) log2 pX1,...,XN(x1, . . . , xN)

What is the entropy for N i.i.d. random variables? You can show that

H[X1, . . . , XN] = −N Σ_{x1 ∈ SX1} pX1(x1) log2 pX1(x1) = N H(X1)

The entropy of N i.i.d. random variables is N times the entropy of any one of them. In addition, the entropy of any N independent (but possibly with different distributions) r.v.s is just the sum of the entropies of each individual r.v.

When r.v.s are not independent, the joint entropy of N r.v.s is less than N times the entropy of one of them, because of the dependence or correlation. Intuitively, if you know some of them, the rest that you don't know become less informative. For example, in the B&W image, the joint r.v. of several neighboring pixels will have less entropy than the sum of the individual pixel entropies, since pixels are correlated in space.

35.3 Conditional Entropy

How much additional entropy is in the joint random variables X1, X2 compared just to one of them? This is often an important question because it answers the question, "How much additional information do I get from both, compared to just one of them?" We call this difference the conditional entropy, H[X2|X1]:

H[X2|X1] = H[X2, X1] − H[X1].   (71)
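These quantities are easy to check numerically. A short sketch (in Python rather than the Matlab used elsewhere in these notes):

```python
from math import log2

def entropy(pmf):
    """H[X] = -sum_i pi log2 pi, using the convention 0 log 0 = 0, (68)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

# The five-message source above, (69): 15/8 = 1.875 bits.
h5 = entropy([1/2, 1/4, 1/8, 1/16, 1/16])

# (71) with X2 = X1 for a fair bit: the joint pmf puts 1/2 on (0,0)
# and 1/2 on (1,1), so H[X2|X1] = H[X2,X1] - H[X1] = 1 - 1 = 0 bits;
# X2 adds no information once X1 is known.
h_cond = entropy([1/2, 1/2]) - entropy([1/2, 1/2])
```

Note that multiplying X by 2, or applying any one-to-one function, leaves the pmf values untouched, so the entropy is unchanged.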
What is an equation for H[X2|X1] as a function of the joint probabilities pX1,X2(x1, x2) and the conditional probabilities pX2|X1(x2|x1)?

Solution: Plugging in (68) for H[X2, X1] and H[X1],

H[X2|X1] = − Σ_{x1} Σ_{x2} pX1,X2(x1, x2) log2 pX1,X2(x1, x2) + Σ_{x1} pX1(x1) log2 pX1(x1)
         = − Σ_{x1} Σ_{x2} pX1,X2(x1, x2) log2 pX1,X2(x1, x2) + Σ_{x1} Σ_{x2} pX1,X2(x1, x2) log2 pX1(x1)
         = − Σ_{x1} Σ_{x2} pX1,X2(x1, x2) (log2 pX1,X2(x1, x2) − log2 pX1(x1))
         = − Σ_{x1} Σ_{x2} pX1,X2(x1, x2) log2 [pX1,X2(x1, x2) / pX1(x1)]
         = − Σ_{x1} Σ_{x2} pX1,X2(x1, x2) log2 pX2|X1(x2|x1)   (72)

Note the asymmetry: there is the joint probability multiplied by the log of the conditional probability. We could also have multivariate conditional entropy,

H[XN|XN−1, . . . , X1] = − Σ_{x1} · · · Σ_{xN} pX1,...,XN(x1, . . . , xN) log2 pXN|XN−1,...,X1(xN|xN−1, . . . , x1)

which is the additional entropy (or information) contained in the Nth random variable, given the values of the N − 1 previous random variables.

35.4 Entropy Rate

Typically, we're interested in discrete-time random processes, in which we have random variables X1, X2, . . .. Since there are infinitely many of them, the joint entropy of all of them may go to infinity as N → ∞. For this case, we are more interested in the rate: how many additional bits, in the limit, are needed for the average r.v. as N → ∞?

Def'n: Entropy Rate
The entropy rate of a stationary discrete-time random process, in units of bits per random variable (a.k.a. source output), is defined as

H = lim_{N→∞} H[XN|XN−1, . . . , X1].

It can be shown that the entropy rate can equivalently be written as

H = lim_{N→∞} (1/N) H[X1, X2, . . . , XN].

Example: Entropy of English text
Let Xi be the ith letter or space in a common English sentence. What is the sample space SXi? Is Xi uniform on that space?
What is H[Xi]?
Solution: I had Matlab read in the text of Shakespeare's Romeo and Juliet and compute the relative frequency of each letter (and the space). For this pmf, I calculated an entropy of H = 4.1199. The Proakis & Salehi book mentions that this value for general English text is about 4.3 bits/letter [taken from Proakis & Salehi Section 6.2]. See Figure 67(a).

What is H[Xi, Xi+1]?
Solution: Again using Matlab on Shakespeare's Romeo and Juliet, I calculated the entropy of the joint pmf of each two-letter combination. This gives me the two-dimensional pmf shown in Figure 67(b). I calculate a joint entropy of 7.46, which is 2 · 3.73.

What is the entropy rate, H?
Solution: For the three-letter combinations, the joint entropy was 10.04 = 3 · 3.35. For four-letter combinations, the joint entropy was 11.98 = 4 · 2.99. You can see that the average entropy in bits per letter is decreasing quickly. Continuing to N = 10, we have H ≈ 1.3 bits/letter.

Example: B&W images
A similar calculation can be made across the pixels of the B&W image: because neighboring pixels are correlated, the entropy rate is less than one bit per pixel.

Figure 67: PMF of (a) single letters and (b) two-letter combinations (including spaces) in Shakespeare's Romeo and Juliet.

35.5 Source Coding Theorem

The key connection between this mathematical definition of entropy and the bit rate that we've been talking about all semester is given by the source coding theorem. It is one of the two fundamental theorems of information theory, and was introduced by Claude Shannon in 1948.

Theorem: A source with entropy rate H can be encoded with arbitrarily small error probability at any rate R (bits / source output) as long as R > H. Conversely, if R < H, the error probability will be bounded away from zero, independent of the complexity of the encoder and the decoder employed.

Proof: Using typical sequences. See Shannon's original 1948 paper.

Notes:
• Here, an 'error' occurs when your compressed version of the data is not exactly the same as the original.
• R is our 1/Tb.
• Theorem fact: the information measure (entropy) gives us a minimum bit rate.
• What is the minimum possible rate to encode English text (if you remove all punctuation)?
• The theorem does not tell us how to do it, just that it can be done.
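The Romeo and Juliet letter-entropy estimates above can be reproduced for any text string with a few lines; this is a Python sketch of the same empirical calculation (the original was done in Matlab, whose details are not shown in these notes):

```python
from collections import Counter
from math import log2

def text_entropy(text, n=1):
    """Empirical entropy (in bits) of the length-n blocks of a text."""
    blocks = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

Here text_entropy(t, 2)/2 is the average bits per letter measured over two-letter blocks; for dependent letters it falls below text_entropy(t, 1), just as 7.46/2 = 3.73 fell below 4.1199 above.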
• The theorem does not tell us how well it can be done if N is not infinite. That is, for a finite source, the rate may need to be higher.

Lecture 24

Today: (1) Channel Capacity

36 Review

Last time, we defined entropy,

H[X] = − Σi pi log2 pi

and entropy rate,

H = lim_{N→∞} (1/N) H[X1, X2, . . . , XN]

We showed that entropy can be used to quantify information. Given our information source X or {Xi}, the value of H[X] or H gives us a measure of how many bits we actually need to use to encode the source data without loss. The major result was Shannon's source coding theorem, which says that a source with entropy rate H can be encoded with arbitrarily small error probability at any rate R (bits / source output) as long as R > H. Any lower rate than H would guarantee loss of information.

37 Channel Coding

Now, we turn to the noisy channel, which is affected by additive white Gaussian noise (AWGN). This discussion of entropy allows us to consider the maximum data rate which can be carried without error on a bandlimited channel.

37.1 R. V. L. Hartley

Ralph V. L. Hartley (born Nov. 30, 1888) received the A.B. degree from the University of Utah in 1909. He worked as a researcher for the Western Electric Company, involved in radio telephony; here he developed the "Hartley oscillator". Afterwards, at Bell Laboratories, he developed relationships useful for determining the capacity of bandlimited communication channels. In July 1928, he published in the Bell System Technical Journal a paper on "Transmission of Information". (The Hartley transform came later in his career, in 1942.)

Hartley was particularly influenced by Nyquist's result. When transmitting a sequence of pulses, each of duration Tsy, Nyquist determined that the pulse rate was limited to two times the available channel bandwidth B,

1/Tsy ≤ 2B.

In his 1928 paper, Hartley considered digital transmission in pulse-amplitude modulated systems. The pulse rate was limited to 2B, as described by Nyquist. But, depending on how pulse amplitudes were chosen, each pulse could represent more or less information. Hartley assumed that the maximum amplitude available to the transmitter was A, and made the assumption that the communication system could discern between pulse amplitudes if they were separated by at least a voltage spacing of Aδ. Given that a PAM system operates from 0 to A in increments of Aδ, the number of different pulse amplitudes (symbols) is

M = 1 + A/Aδ

(Note that early receivers were modified AM envelope detectors, and did not deal well with negative amplitudes.) Next, Hartley used the 'bit' measure to quantify the data which could be encoded using M amplitude levels,

log2 M = log2(1 + A/Aδ)

Finally, Hartley quantified the data rate using Nyquist's relationship to determine the maximum rate C, in bits per second, possible from the digital communication system,

C = 2B log2(1 + A/Aδ)

Engineers would have to be conservative when setting Aδ to ensure a low probability of error. Further, the capacity formula was for a particular type of PAM system, and did not say anything fundamental about the relationship between capacity and bandwidth for arbitrary modulation.

37.2 C. E. Shannon

What was left unanswered by Hartley's capacity formula was the relationship between noise and the minimum amplitude separation between symbols. Shannon used statistics to develop a universal bound for capacity, regardless of modulation type.

37.2.1 Noisy Channel

Shannon did take into account an AWGN channel. In this channel, the ith symbol sample at the receiver (after the matched filter, assuming perfect synchronization) is yi,

yi = xi + zi

where xi is the transmitted signal and zi is the noise in the channel. The noise term zi is assumed to be i.i.d. Gaussian with variance PN.

37.2.2 Introduction of Latency

Shannon's key insight was to exchange latency (time delay) for reduced probability of error. In particular, his capacity bound considers simultaneously demodulating sequences of received symbols, y = [y1, . . . , yn], of length n. All n symbols are received before making a decision; this late decision will decide all values of x = [x1, . . . , xn] simultaneously. Shannon's proof considers the limiting case as n → ∞. This asymptotic limit allows for a proof using the statistical convergence of a sequence of random variables. In particular, we need a law called the law of large numbers (ECE 6962 topic). This law says that the following event,

(1/n) Σ_{i=1}^{n} (yi − xi)² ≤ PN

happens with probability one, as n → ∞. In other words, as n → ∞, the measured value y will be located within an n-dimensional sphere (hypersphere) of radius √(n PN) with center x.

37.2.3 Introduction of Power Limitation

Shannon also formulated the problem as a power-limited case, in which the average power in the desired signal xi was limited to P. That is,

(1/n) Σ_{i=1}^{n} xi² ≤ P

This combination of signal power limitation and noise power results in the fact that, with probability one as n → ∞,

(1/n) Σ yi² ≤ (1/n) Σ xi² + (1/n) Σ (yi − xi)² ≤ P + PN

so ||y||² ≤ n(P + PN). This result says that the vector y, within a period of n sample times, is contained within a hypersphere of radius √(n(P + PN)) centered at the origin.

37.3 Combining Two Results

The two results, together, show how many different symbols we could have uniquely distinguished. Hartley asked how many symbol amplitudes could be fit into [0, A] such that they are all separated by Aδ. Shannon's formulation asks how many multidimensional amplitudes xi can be fit into a hypersphere of radius √(n(P + PN)) centered at the origin, such that hyperspheres of radius √(n PN) do not overlap. This is shown in P&S in Figure 9.9 (pg. 584). The result of this geometrical problem is that

M = (1 + P/PN)^(n/2)   (73)

This number M is the number of different messages that could have been sent in n pulses.

37.3.1 Returning to Hartley

Adjusting Hartley's formula, if we could send M messages now in n pulses (rather than in 1 pulse) we would adjust capacity to be:

C = (2B/n) log2 M

Using the M from (73) above,

C = (2B/n) (n/2) log2(1 + P/PN) = B log2(1 + P/PN)

37.3.2 Final Results

Finally, we can replace the noise variance PN with the relationship between the noise two-sided PSD N0/2 and the bandwidth B, PN = N0 B. So finally we have the Shannon-Hartley Theorem,

C = B log2(1 + P/(N0 B))   (74)

Typically, people also use W = B, so you'll also see

C = W log2(1 + P/(N0 W))   (75)

This result says that a communication system can operate at bit rate C (in a bandlimited channel with width W, given power limit P and noise value N0), with arbitrarily low probability of error. Shannon also proved that any system which operates at a bit rate higher than the capacity C will certainly incur a positive bit error rate. Any practical communication system must operate at Rb < C, where Rb is the operating bit rate. Note that the ratio P/(N0 W) is the signal power divided by the noise power, or signal-to-noise ratio (SNR). Thus the capacity bound is also written C = W log2(1 + SNR).

Lecture 25

Today: (1) Channel Capacity

• The lecture Tue April 28 is canceled. Instead, please attend the Robert Graves lecture, 3:05 PM in Room 105 WEB.
• The final project is due Tue, May 5, 5pm. No late projects are accepted, because of the late due date. If you think you might be late, set your deadline to May 4.
• Everyone gets a 100% on the "Discussion Item" which I never implemented.

37.4 Efficiency Bound

C is just a capacity limit. Recall that Eb = P Tb; that is, the energy per bit is the power multiplied by the bit duration. Thus from (74),

C = W log2(1 + (Eb/Tb)/(N0 W))

or, since Rb = 1/Tb,

C = W log2(1 + (Rb/W)(Eb/N0))

We know that our bit rate Rb ≤ C, so

Rb/W ≤ log2(1 + (Rb/W)(Eb/N0))

Defining η = Rb/W,

η ≤ log2(1 + η Eb/N0)

This expression can't be solved analytically for η. Instead, you can look at it as a bound on the bandwidth efficiency as a function of the Eb/N0 ratio. This relationship is shown in Figure 68.
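Both the capacity formula (75) and the efficiency bound can be evaluated numerically. Since η ≤ log2(1 + η Eb/N0) cannot be solved in closed form, a bisection search can find the boundary (a sketch; the search bracket and tolerance are arbitrary assumptions):

```python
from math import log2

def capacity(P, N0, W):
    """Shannon-Hartley: C = W log2(1 + P/(N0*W)) bits per second, (75)."""
    return W * log2(1.0 + P / (N0 * W))

def max_efficiency(ebn0, tol=1e-9):
    """Largest eta > 0 satisfying eta <= log2(1 + eta*ebn0), by bisection."""
    f = lambda eta: log2(1.0 + eta * ebn0) - eta   # > 0 below the bound
    lo, hi = 1e-9, 64.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

At Eb/N0 = 1 (0 dB) the bound gives η = 1 bit/s/Hz; for Eb/N0 below ln 2 (about −1.6 dB) no positive η satisfies the bound, which is the familiar Shannon limit.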
Figure 68: From the Shannon-Hartley theorem, the bound on bandwidth efficiency, η, as a function of the Eb/N0 ratio (dB), on a (a) linear plot and (b) log plot.
the source data. L. The major result was the Shannon’s source coding theorem. L. . When transmitting a sequence of pulses. This discussion of entropy allows us to consider the maximum data rate which can be carried without error on a bandlimited channel. XN ]. H = lim N →∞ 1 H[X1 . . 30. 1888) received the A. X2 . he published in the Bell System Technical Journal a paper on “Transmission of Information”.B. Ts In Hartley’s 1928 paper. . V. as described by Nyquist. The pulse rate was limited to 2B. 39 Channel Coding Now. 39. degree from the University of Utah in 1909. and entropy rate. Hartley Ralph V. we turn to the noisy channel. Any lower rate than H would guarantee loss of information. Hartley assumed that the maximum amplitude available to the transmitter was A. and did not deal well with negative amplitudes.ECE 5520 Fall 2009 138 38 Review H[X] = − pi log2 pi i Last time. each pulse could represent more or less information. Given our information source X or {Xi }. Hartley was particularly inﬂuenced by Nyquist’s sampling theorem. In particular. which says that a source with entropy rate H can be encoded with arbitrarily small error probability. if they were at separated by at least a voltage spacing of Aδ . we deﬁned entropy. at Bell Laboratories. Hartley used the ‘bit’ measure to quantify the data which could be encoded using M amplitude levels. But. without loss. at any rate R (bits / source output) as long as R > H. Hartley made the assumption that the communication system could discern between pulse amplitudes. he developed relationships useful for determining the capacity of bandlimited communication channels. which is aﬀected by additive White Gaussian noise (AWGN). he considered digital transmission in pulseamplitude modulated systems. Afterwards. the number of diﬀerent pulse amplitudes (symbols) is M =1+ A Aδ Note that early receivers were modiﬁed AM envelope detectors. each of duration Ts . 
the value of H[X] or H gives us a measure of how many bits we actually need to use to encode. depending on how pulse amplitudes were chosen. 1 ≤ 2B.1 R. involved in radio telephony. In July 1928. Hartley (born Nov. . log2 M = log2 1 + A Aδ . Given that a PAM system operates from 0 to A in increments of Aδ . Nyquist determined that the pulse rate was limited to two times the available channel bandwidth B. He worked as a researcher for the Western Electric Company. N We showed that entropy can be used to quantify information. Next. Then.
Furthermore. . . In other words.2. yn ].2. In this AWGN channel. . This law says that the following event. this late decision will decide all values of x = [x1 . we need a law called the law of large numbers. his capacity bound considers ndimensional signaling. in which the maximum symbol energy in the desired signal xi was limited to E. the ith symbol sample at the receiver (after the matched ﬁlter. of length n. This asymptotic limit as n → ∞ allows for a proof using the statistical convergence of a sequence of random variables.2 Introduction of Latency Shannon’s key insight was to exchange latency (time delay) for reduced probability of error. n 1 2 xi ≤ E n i=1 . or they might use multiple symbols over time (recall that symbols at diﬀerent multiples of Ts are orthogonal). yi = xi + zi where X is the transmitted signal and Z is the noise in the channel.ECE 5520 Fall 2009 139 Finally. C = 2B log2 1 + A Aδ 39. in bits per second.2 C. . the capacity formula was for a particular type of PAM system. Hartley quantiﬁed the data rate using Nyquist’s relationship to determine the maximum rate C. the measured value y will be located √ within an ndimensional sphere (hypersphere) of radius nEN with center x. Shannon uses all n dimensions in the constellation – the detector must use all n samples of y to make a decision.. E. and used statistics to develop a universal bound for capacity. FSK). 39. In the multiple symbols over time.d. regardless of modulation type. . xn ] simultaneously. Shannon What was left unanswered by Hartley’s capacity formula was the relationship between noise and the minimum amplitude separation between symbols. as n → ∞. That is.2.i. In either case. In fact. Shannon’s proof considers the limiting case as n → ∞.1 Noisy Channel Shannon did take into account an AWGN channel.3 Introduction of Power Limitation Shannon also formulated the problem as a energylimited case. 
Engineers would have to be conservative when setting Aδ to ensure a low probability of error, and Hartley's formula did not say anything fundamental about the relationship between capacity and bandwidth for arbitrary modulation.

In Shannon's formulation, the noise term zi is assumed to be i.i.d. Gaussian with variance EN = N0/2, so the received vector is y = [y1, . . . , yn]. To see what happens as n → ∞, we need a law called the law of large numbers. This law says that the following event happens with probability one as n → ∞:

    (1/n) Σ_{i=1}^{n} (yi − xi)^2 ≤ EN.

In other words, with probability one, the measured value y will be located within an n-dimensional sphere (hypersphere) of radius sqrt(n EN) with center x.
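The concentration of the per-dimension noise energy around EN can be seen numerically. This sketch (parameter values are illustrative, and the fixed seed is only for reproducibility) draws i.i.d. Gaussian noise and checks that (1/n) Σ (yi − xi)^2 is close to EN for large n:

```python
import random

random.seed(1)  # deterministic for the sketch

# Hypothetical parameters (not from the notes)
N0 = 2.0
E_N = N0 / 2.0          # noise variance per dimension, E_N = N0/2
n = 100_000             # number of dimensions (symbol samples)

x = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]     # transmitted samples
z = [random.gauss(0.0, E_N ** 0.5) for _ in range(n)]   # i.i.d. Gaussian noise
y = [xi + zi for xi, zi in zip(x, z)]                   # received samples

# Law of large numbers: (1/n) * sum((y_i - x_i)^2) concentrates near E_N,
# so y lies essentially within a hypersphere of radius sqrt(n * E_N) about x.
avg_noise_energy = sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / n
```

Increasing n tightens the concentration, which is exactly why Shannon's proof works in the limit n → ∞.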
This combination of the signal energy limitation and the noise energy results in the fact that, with probability one as n → ∞,

    (1/n) Σ_{i=1}^{n} yi^2 ≤ (1/n) Σ_{i=1}^{n} xi^2 + (1/n) Σ_{i=1}^{n} (yi − xi)^2 ≤ E + EN,

that is, ||y||^2 ≤ n(E + EN). This result says that the vector y, within a period of n sample times, is contained within a hypersphere of radius sqrt(n(E + EN)) centered at the origin.

39.3 Combining Two Results

The two results, together, show how many different symbols we could have uniquely distinguished. Hartley asked how many symbol amplitudes could be fit into [0, A] such that they are all separated by Aδ. Shannon's formulation instead asks how many multidimensional amplitudes xi can be fit into a hypersphere of radius sqrt(n(E + EN)) centered at the origin, such that hyperspheres of radius sqrt(n EN) do not overlap. This is shown in Figure 69.

Figure 69: Shannon's capacity formulation simplifies to the geometrical question of: how many hyperspheres of a smaller radius sqrt(n EN) fit into a hypersphere of radius sqrt(n(E + EN))? This number M is the number of different messages that could have been sent in n pulses.

The result of this geometrical problem is that

    M = (1 + E/EN)^(n/2).                                              (76)

39.3.1 Returning to Hartley

Adjusting Hartley's formula: if we can send M messages in n pulses (rather than in one pulse), we adjust the capacity to

    C = (2B/n) log2 M.

Using the M from (76) above,

    C = (2B/n) (n/2) log2(1 + E/EN) = B log2(1 + E/EN).
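The sphere-packing count and the resulting per-Hertz capacity can be computed directly. A short sketch (function names and the example SNR are illustrative) shows that M grows exponentially in n while C/B stays fixed:

```python
import math

def shannon_M(E, E_N, n):
    """Number of distinguishable messages in n pulses: M = (1 + E/E_N)^(n/2)."""
    return (1.0 + E / E_N) ** (n / 2.0)

def capacity_per_hz(E, E_N):
    """C/B = (2/n) * log2(M) = log2(1 + E/E_N); note n cancels out."""
    return math.log2(1.0 + E / E_N)

# Example with hypothetical ratio E/E_N = 3
M = shannon_M(3.0, 1.0, n=10)     # (1 + 3)^5 = 1024 messages in 10 pulses
eta = capacity_per_hz(3.0, 1.0)   # log2(4) = 2 bits/sec/Hz
```

The cancellation of n in capacity_per_hz mirrors the derivation above: latency buys reliability, not rate.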
39.3.2 Final Results

Since energy is power multiplied by time, E = P Ts = P/(2B), where P is the maximum signal power and B is the bandwidth. With EN = N0/2, we have the Shannon-Hartley Theorem:

    C = B log2(1 + P/(N0 B)).                                          (77)

This result says that a communication system can operate at bit rate C, in a bandlimited channel with width B, given power limit P and noise value N0, with arbitrarily low probability of error. Note that the ratio P/(N0 B) is the signal power divided by the noise power, i.e., the signal-to-noise ratio (SNR), so the capacity bound is also written C = B log2(1 + SNR).

We know that our bit rate Rb ≤ C, and any practical communication system must operate at Rb < C. However, C is not just a capacity limit: Shannon also proved that any system which operates at a bit rate higher than the capacity C will certainly incur a positive bit error rate. This relationship is shown in Figure 70; Figure 71 is the same plot on a log-y axis with some of the modulation types discussed this semester.

39.4 Efficiency Bound

Another way to write the maximum signal power P is to multiply it by the bit period and use it as the maximum energy per bit. That is, the energy per bit is Eb = P Tb, where Rb = 1/Tb is the operating bit rate. Thus from (77),

    C = B log2(1 + (Eb/Tb)/(N0 B)) = B log2(1 + (Rb/B)(Eb/N0)).

Since Rb ≤ C,

    Rb/B ≤ log2(1 + (Rb/B)(Eb/N0)).

Defining η = Rb/B (the spectral efficiency),

    η ≤ log2(1 + η Eb/N0).

This expression can't be solved analytically for η. However, you can look at it as a bound on the bandwidth efficiency as a function of the Eb/N0 ratio.
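Although the bound can't be solved for η in closed form, it can be inverted for Eb/N0: for a target η, the minimum required Eb/N0 is (2^η − 1)/η. A small sketch (the parameter values are illustrative, not from the notes) evaluates (77) and this inverse, and shows numerically that as η → 0 the required Eb/N0 approaches ln 2 ≈ 0.693, the familiar −1.59 dB Shannon limit:

```python
import math

def shannon_capacity(B, P, N0):
    """Shannon-Hartley theorem, eq. (77): C = B * log2(1 + P/(N0*B)) bits/sec."""
    return B * math.log2(1.0 + P / (N0 * B))

def min_ebn0(eta):
    """Smallest Eb/N0 satisfying eta <= log2(1 + eta * Eb/N0),
    obtained by solving at equality: Eb/N0 = (2**eta - 1) / eta."""
    return (2.0 ** eta - 1.0) / eta

# Illustrative numbers: B = 1 MHz, P and N0 chosen so that SNR = 15
C = shannon_capacity(1e6, 15.0, 1.0 / 1e6)   # log2(16) = 4, so C = 4 Mbps

# As eta -> 0, the required Eb/N0 approaches ln(2) ~ 0.693 (about -1.59 dB)
limit = min_ebn0(1e-6)
```

Doubling the spectral efficiency more than doubles the required Eb/N0 (e.g., min_ebn0(2.0) = 1.5 but min_ebn0(4.0) = 3.75), which is the power cost of bandwidth-efficient modulation visible in Figure 71.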
η. and MFSK. . bound on bandwidth eﬃciency. 10 1 Bandwidth Efficiency (bps/Hz) 10 0 1024−QAM 256−QAM 64−QAM 64−PSK 16−QAM 32−PSK 16−PSK 8−PSK 4−QAM 4−FSK 16−FSK 64−FSK 10 −1 256−FSK 10 −10 −2 1024−FSK 0 10 Eb/N0 (dB) 20 30 Figure 71: From the ShannonHartley theorem bound with achieved bandwidth eﬃciencies of MQAM. MPSK.ECE 5520 Fall 2009 142 14 12 Bits/second per Hertz 10 8 6 4 2 0 0 5 10 15 20 Eb/N0 ratio (dB) 25 30 Figure 70: From the ShannonHartley theorem.