Lecture 4
Digital Design of Signal Processing Systems, John Wiley & Sons by Dr. Shoab A. Khan 2
Synchronous Digital Logic at RTL
At RTL a synchronous digital design consists of
combinational logic blocks and synchronously clocked
registers
[Figure: five combinational blocks (1 to 5) separated by clocked registers; inputs x1[n] to x4[n] are 16 bits wide, internal signals a to g are 18 to 20 bits wide, and the output is out]
Hard Real-Time Systems
A hard real-time system constrains the design to process the input
samples at the input sampling rate and to produce output samples at
the output sampling rate
Examples:
Communication Systems
Multimedia Systems
Biomedical Systems
[Figure: real-time environment: analog input, A/D, discrete-time processing on GPP/DSP/ASIC/FPGA, D/A, analog output]
System for discrete 1-D Signal
For 1-D real-time signals only the amplitude varies in
time
Examples: voice, ECG, and signals from temperature and pressure
sensors
[Figure: plot of a discrete 1-D signal, amplitude versus sample index]
A Discrete 2-D Signal: An Image
[Figure: a discrete 2-D image with pixel indices on both axes]
A Discrete time Video
[Figure: video as a sequence of image frames along the time axis]
Digital Design of Real-time Systems
A digital communication receiver is a good example of a real-time system
It digitizes an analog signal at IF
For high data rates, all these operations are performed in a digital
design
For digital design, a top level framework is important
[Figure: digital communication receiver: mixer, A/D, digital down converter, filter, demodulator, FEC, source decoder and D/A; frequencies labeled 70 MHz and 512 MHz]
Kahn Process Networks (KPN)
A system implementing a streaming application is best represented by a
KPN
A KPN consists of autonomously running components
Each component takes its inputs from FIFOs on its input edges
[Figure: a KPN node with inputs x1[n] (in 1) and x2[n] (in 2) and output y[n] (out)]
Example of a KPN
Four processors P1,P2,P3 and P4
A Processor fires if it has enough tokens/data at its input FIFOs
Example: P3 fires if there are input samples in FIFO4 and
FIFO3
[Figure: KPN with an input and four processors P1 to P4 connected through FIFO 1 to FIFO 5]
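The firing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: the `Process` class, the use of `deque` as a FIFO, the addition performed by P3, and the assumption that P3 writes to FIFO 5 are all hypothetical.

```python
from collections import deque


class Process:
    """A KPN-style node that fires only when every input FIFO holds a token."""

    def __init__(self, name, inputs, output, func):
        self.name = name
        self.inputs = inputs      # list of deques used as input FIFOs
        self.output = output      # deque used as the output FIFO
        self.func = func          # computation applied to one token per input

    def can_fire(self):
        return all(len(fifo) > 0 for fifo in self.inputs)

    def fire(self):
        # Reads follow FIFO order: always take the oldest token.
        tokens = [fifo.popleft() for fifo in self.inputs]
        self.output.append(self.func(*tokens))


# Assumed wiring: P3 reads FIFO 3 and FIFO 4 and writes FIFO 5.
fifo3, fifo4, fifo5 = deque(), deque(), deque()
p3 = Process("P3", [fifo3, fifo4], fifo5, lambda a, b: a + b)

fifo3.extend([1, 2, 3])
fired_before = p3.can_fire()   # FIFO 4 is still empty, so P3 cannot fire
fifo4.extend([10, 20])
while p3.can_fire():
    p3.fire()
result = list(fifo5)
```

After the loop one token is left waiting in FIFO 3, because P3 needs a matching token in FIFO 4 before it can fire again.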
KPN for Modeling Streaming Applications
Code Example
N = 4*16;
K = 4*32;
% processing block 2
for i=1:N
Building Blocks in JPEG compression
[Figure: JPEG building blocks (partial): Quantization, Entropy Coding and Sink stages connected through FIFO 4 and FIFO 5]
JPEG MATLAB Code for Mapping as KPN
imageRGB = imread('peppers.png'); % default image
figure, imshow(imageRGB);
title('Original Color Image');
ImageFIFO = rgb2gray(imageRGB);
figure, imshow(ImageFIFO);
title('Original Gray Scale Image');
fun = @zigzag_runlength;
zigzagRLFIFO = blkproc(QuanFIFO, [BLK_SZ BLK_SZ], fun);
% variable length coding using Huffman tables
JPEGoutFIFO = VLC_huffman(zigzagRLFIFO, Huffman_dict);
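The MATLAB listing calls a `zigzag_runlength` function whose body is not shown. As a rough illustration of what such a stage typically does, here is a Python sketch that zig-zag scans one quantized block and codes it as (zero run, value) pairs. The pair format and the end-of-block marker are assumptions for this sketch, not the book's implementation.

```python
def zigzag_indices(n):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        s = r + c
        # Odd anti-diagonals run top to bottom, even ones bottom to top.
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_runlength(block):
    """Zig-zag scan a square block, then code it as (zero_run, value) pairs."""
    scan = [block[r][c] for r, c in zigzag_indices(len(block))]
    pairs, run = [], 0
    for v in scan:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))   # assumed end-of-block marker; trailing zeros dropped
    return pairs


# A small quantized block with a few non-zero low-frequency coefficients.
block = [[5, 3, 0, 0],
         [2, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
order = zigzag_indices(4)[:6]
coded = zigzag_runlength(block)
```

A 4 x 4 block is used only to keep the example short; JPEG works on 8 x 8 blocks, and the same functions apply unchanged.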
JPEG KPN
[Figure: JPEG KPN: ImageFIFO, DCT block, DCT_ImageFIFO, Quantize, Quat_ImageFIFO, Zig-zag & Run Length Coding, Zig_zagRLFIFO, Variable Length Coding, JPEG_outFIFO carrying the bit stream 011010001011101...]
Example: MPEG Encoding
[Figure: (a) frame sequence I P P P I P P P; (b) Y, Cb and Cr components of a frame]
Steps in Video Encoding
Video is coded as a mixture of
Intra-Frames (I-Frames)
The I-Frame and all other coded frames are also decoded, because the decoded frames are needed for coding a P-Frame.
For P-Frame
frame
The motion vector is also coded and transmitted with the coded block
KPN Architecture for MPEG Encoder
[Figure: MPEG encoder KPN with a Controller, inverse quantizer (Q-1, Quant inv), inverse DCT (DCT-1), motion vector search (mv search) and motion vector memory (mv mem), connected through FIFOs]
Limitations of KPN
Reads must follow FIFO order
MPSoC
A Multi-processor System on Chip (MPSoC) employs some form of
KPN
Local memory in each PE enables the design to counter the key
limitations of KPN
The PE (P) first copies the data from FIFO to local memory (M) and then
processes it and writes the results in the memory
The results are then copied in FIFO for passing to the next PE
[Figure: MPSoC tiles, each with a processor (P), a memory (M), and MC and CC units, connected through FIFOs to a NOC]
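The copy, process and copy-back sequence of a PE can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the block size and the block-reversal kernel are hypothetical and chosen only because reversal needs out-of-order access, which a pure FIFO read cannot provide.

```python
from collections import deque


def pe_step(in_fifo, out_fifo, block_size, kernel):
    """One PE iteration: FIFO -> local memory -> process -> FIFO."""
    if len(in_fifo) < block_size:
        return False                      # not enough tokens, PE does not fire
    # Copy a block from the input FIFO into local memory (M).
    local_mem = [in_fifo.popleft() for _ in range(block_size)]
    # Local memory allows random access to the copied data.
    results = kernel(local_mem)
    out_fifo.extend(results)              # pass the results to the next PE
    return True


def reverse_block(mem):
    """Hypothetical kernel that reads its data out of arrival order."""
    return [mem[i] for i in range(len(mem) - 1, -1, -1)]


fifo_in, fifo_out = deque(range(8)), deque()
while pe_step(fifo_in, fifo_out, 4, reverse_block):
    pass
output = list(fifo_out)
```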
Case Study
GMSK mapping on KPN
Processing Elements (PEs) of AES, FEC, Framer, Modulator & mixer
are shown
FIFOs and PEs implement the transmitter as a modified KPN
Each PE has its own memory to counter KPN limitations
[Figure: GMSK transmitter as a modified KPN: serial data enters an ASP, followed by AES (encryption and round key, with AES and local memories), FEC encoding, a framer, NRZ and GMSK stages with selection logic, and a mixer (e^{jω0n}); PEs exchange 8-bit words through dual-port memories (Port A/Port B) and signal completion with aes_done and fec_done; clocks shown are circuit clk, serial clk and the output bit clock]
Sample by Sample Processing
The last PE performs sample by sample GMSK
modulation
This PE does not require any FIFO
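\[ x(t) = \sqrt{2 P_c}\,\cos\bigl(2\pi f_c t + \varphi_s(t)\bigr) \]
\[ \varphi_s(t) = 2\pi f_d \int_{-\infty}^{t} \sum_{k} a_k\, g(\tau - kT)\, d\tau \]
[Figure: GMSK modulator: binary data, NRZ mapping, pulse shaping with a Gaussian filter, integrator, cos(.) and sin(.) blocks producing I[n] and Q[n] from the integrated phase, and mixing with e^{jω0n} (cos(ω0n)) to form the GMSK signal]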
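The sample-by-sample structure of this modulator can be sketched in Python as below. This is an illustrative baseband model only: the samples per symbol, the BT product, the filter span, the modulation index `h` and the tap normalization are all assumed values, and the final mixing with the carrier is left out.

```python
import math


def gaussian_taps(bt, sps, span):
    """Sampled Gaussian pulse, normalized to unit sum (BT and span assumed)."""
    sigma = math.sqrt(math.log(2)) / (2 * math.pi * bt) * sps  # in samples
    half = span * sps // 2
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def gmsk_baseband(bits, sps=8, bt=0.3, span=4, h=0.5):
    """Return I[n], Q[n] for a bit sequence, one output pair per sample."""
    taps = gaussian_taps(bt, sps, span)
    # NRZ mapping: 0 -> -1, 1 -> +1, each symbol held for sps samples.
    nrz = [2 * b - 1 for b in bits for _ in range(sps)]
    delay = [0.0] * len(taps)          # FIR delay line of the Gaussian filter
    theta = 0.0                        # integrator state (phase)
    i_out, q_out = [], []
    for x in nrz:                      # one input sample in, one I/Q pair out
        delay = [x] + delay[:-1]
        shaped = sum(t * d for t, d in zip(taps, delay))
        theta += math.pi * h * shaped / sps   # integrator
        i_out.append(math.cos(theta))
        q_out.append(math.sin(theta))
    return i_out, q_out


i_sig, q_sig = gmsk_baseband([1, 0, 1, 1, 0, 0, 1, 0])
```

Each loop iteration consumes one sample and produces one output pair, which is why this PE needs only a short delay line and an accumulator rather than a FIFO. The output has constant envelope, since I[n]^2 + Q[n]^2 = 1 for every sample.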
Design Options
A PE in a KPN can be implemented using one of the following design options,
listed in order of decreasing flexibility:
Programmable CPU
Programmable DSP
Application Specific Instruction Set Processor (ASIP)
Application Specific Processor (ASP)
Graphical Representation of DSP Algorithms
Block Diagram
A block diagram consists of functional blocks connected
with directed edges
A directed edge represents the flow of data from a source
block to a destination block
[Figure: block diagram with clocked registers (clk), multipliers by h0, h1 and h2, and adders producing y[n]]
Signal Flow Graph
A signal flow graph (SFG) is a simplified version of a
block diagram.
Multiplication by a constant and delays are
represented by edges.
The nodes represent addition, subtraction and input and output
(I/O) operations.
[Figure: a filter with coefficients h0, h1 and h2 and output y[n], drawn as a block diagram and as the equivalent signal flow graph]
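Assuming the figure shows a 3-tap FIR filter, the computation that both the block diagram and the SFG describe is y[n] = h0 x[n] + h1 x[n-1] + h2 x[n-2]. A minimal Python sketch follows; the function name and the coefficient values are illustrative only.

```python
def fir3(x, h):
    """Direct-form 3-tap FIR: y[n] = h0*x[n] + h1*x[n-1] + h2*x[n-2]."""
    d1 = d2 = 0                      # the two delay edges of the graph
    y = []
    for xn in x:
        # Coefficient edges feed the adder nodes.
        y.append(h[0] * xn + h[1] * d1 + h[2] * d2)
        d1, d2 = xn, d1              # shift the delay line
    return y


impulse_response = fir3([1, 0, 0, 0], [2, 3, 5])
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the delay edges are connected correctly.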
Dataflow Graph or Data Dependency Graph
A DSP algorithm is described by a directed graph G = <V, E>
A node in V represents a computational unit or, in a
hierarchical design, a sub-graph
[Figure: DFG with nodes A, B and C connected by edges e1, e2 and e3]
Scope of subclasses of graph representation
KPN is the most general representation
For more specific representations use
Dynamic DFG (DDFG)
[Figure: nested classes of graphs, from most general to most specific: KPN, DDFG, CSDFG, SDFG, HSDF]
Synchronous Dataflow Graph
The number of tokens consumed by a node on each of
its input edges, and the number of tokens produced on its
output edges, are known a priori
[Figure: an edge from source node S (execution time TS), which produces PS tokens per firing, to destination node D (execution time TD), which consumes CD tokens per firing]
DFG and Topology Matrix
[Figure: (a) an SDFG with nodes A, B and C (labeled A,2, B,1 and C,1) and edges e1, e2 and e3; (b) its topology matrix]
Example: IIR Filter as an SDFG
[Figure: IIR filter from x[n] to y[n] drawn as an SDFG with an adder node and two multiplier nodes; every edge produces and consumes one token]
Consistent and Inconsistent SDFG
An SDFG is consistent if it is correctly constructed.
An SDFG is inconsistent
if some of its nodes starve for data at their inputs and so cannot fire
The SDFG of Fig. (b) is an example
[Figure: two SDFGs over nodes A, B and C: (a) consistent, (b) inconsistent]
Balanced Firing Equations
To design a scheduler that makes all the nodes fire synchronously,
a repetition vector is computed
The repetition vector consists of the values f_1 . . . f_N
The scheduler makes node i fire f_i times for synchronous operation
The following set of equations is solved
\[
\begin{aligned}
f_S P_S &= f_D C_D \\
f_1 p_1 &= f_2 c_2 \\
f_2 p_2 &= f_3 c_3 \\
&\;\;\vdots \\
f_{N-1} p_{N-1} &= f_N c_N
\end{aligned}
\]
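The balance equations can be solved mechanically by propagating firing ratios along the edges. The Python sketch below is one possible way to do this, not the book's procedure; the function name, the edge tuple layout `(src, dst, produced, consumed)` and the example rates are all hypothetical. It assumes a connected graph and Python 3.9 or later.

```python
from fractions import Fraction
from math import gcd, lcm


def repetition_vector(nodes, edges):
    """Solve f_src * p = f_dst * c for every edge (src, dst, p, c).

    Returns the smallest positive integer solution as a dict, or None when
    the balance equations contradict each other (inconsistent SDFG).
    """
    f = {nodes[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, p, c in edges:
            if src in f and dst not in f:
                f[dst] = f[src] * p / c
                changed = True
            elif dst in f and src not in f:
                f[src] = f[dst] * c / p
                changed = True
            elif src in f and dst in f and f[src] * p != f[dst] * c:
                return None
    scale = lcm(*(v.denominator for v in f.values()))
    ints = {n: int(v * scale) for n, v in f.items()}
    g = gcd(*ints.values())
    return {n: v // g for n, v in ints.items()}


# Assumed rates: A produces 2, B consumes 3; B produces 1, C consumes 2.
chain = [("A", "B", 2, 3), ("B", "C", 1, 2)]
consistent = repetition_vector(["A", "B", "C"], chain)

# A feedback edge with mismatched rates makes the graph inconsistent.
bad = repetition_vector(["A", "B", "C"], chain + [("C", "A", 1, 1)])
```

For the chain, A fires 3 times and produces 6 tokens, B fires twice to consume them and produces 2 tokens, and C fires once. With the extra feedback edge no set of firing counts satisfies every equation, so the function returns `None`.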
HW Mapping and Scheduling
Self Timed
Minimum buffer size
pattern of firing
Self-timed Firing
Example of self-timed firing where a node fires as soon as it has
enough tokens on its input
[Figure: an SDFG with nodes A, B and C (labeled A,1, B,2 and C,2) and its self-timed firing schedule]
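Self-timed firing can be simulated with a small time-stepped loop. The Python sketch below is an illustration under stated assumptions: the edge rates are hypothetical (the rates in the figure cannot be read reliably), a source node with no inputs is allowed to fire whenever it is idle, and produced tokens become visible when a firing completes.

```python
def self_timed(nodes, edges, steps):
    """Simulate self-timed firing for `steps` time units.

    nodes: {name: execution_time}; edges: list of (src, dst, p, c).
    A node starts firing as soon as it is idle and every input edge holds
    at least c tokens. Returns the list of (start_time, node) firings.
    """
    tokens = [0] * len(edges)
    busy_until = {n: 0 for n in nodes}
    pending = []                       # (finish_time, node)
    log = []
    for t in range(steps):
        # Complete firings that end now and release their output tokens.
        done = [node for finish, node in pending if finish == t]
        pending = [item for item in pending if item[0] != t]
        for k, (src, _dst, produced, _consumed) in enumerate(edges):
            tokens[k] += produced * done.count(src)
        # Start every node that is idle and has enough input tokens.
        for node, exec_time in nodes.items():
            ins = [k for k, e in enumerate(edges) if e[1] == node]
            ready = all(tokens[k] >= edges[k][3] for k in ins)
            if busy_until[node] <= t and ready:
                for k in ins:
                    tokens[k] -= edges[k][3]
                busy_until[node] = t + exec_time
                pending.append((t + exec_time, node))
                log.append((t, node))
    return log


# Execution times follow the node labels A,1 B,2 C,2; rates are assumed.
nodes = {"A": 1, "B": 2, "C": 2}
edges = [("A", "B", 1, 2), ("B", "C", 1, 1)]
schedule = self_timed(nodes, edges, 6)
```

In this run B first fires at time 2, once A has completed two firings, and C first fires at time 4, when B's first output token arrives.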
Example: CD quality sound to DAT conversion
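[Figure: multi-rate chain CD, A, B, C, D, DAT with (produced, consumed) token rates (1, 1), (2, 3), (4, 7), (5, 7) and (4, 1) on successive edges; annotation 40x4]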
Each node fires as soon as it gets enough samples at its input port
This requires buffers of sizes 1, 4, 10, 11 and 4
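The edge rates of this chain can be checked with exact rational arithmetic. The short Python sketch below assumes the rates are read from the figure as (produced, consumed) pairs on successive edges, and assumes the usual 44.1 kHz CD sampling rate; neither value is stated in the text of the slide.

```python
from fractions import Fraction

# (produced, consumed) rates on successive edges CD->A->B->C->D->DAT,
# as read from the figure (assumption).
rates = [(1, 1), (2, 3), (4, 7), (5, 7), (4, 1)]

ratio = Fraction(1)
firings = [Fraction(1)]               # firings relative to the CD node
for p, c in rates:
    ratio *= Fraction(p, c)
    firings.append(firings[-1] * p / c)

cd_rate_hz = 44_100                   # assumed CD sampling rate
dat_rate_hz = cd_rate_hz * ratio

# The final denominator clears every fraction in this chain.
scale = ratio.denominator
repetition = [int(f * scale) for f in firings]
```

The overall ratio is 160/147, which takes 44.1 kHz to 48 kHz. In one full iteration node D fires 40 times and produces 4 tokens per firing, which may be what the 40x4 annotation in the figure refers to.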
Real-time Processing
Dedicated Fully Parallel Implementation
clock frequency = sample frequency
Each operator in the DFG is mapped on related
hardware unit, e.g. addition, multiplication and delay
on adder, multiplier and register respectively
Most designs: time multiplexing
clock frequency >= 2 x sample frequency
clock frequency / sample frequency = number of clock
cycles available per sample
Pipeline / Parallel Designs
clock frequency < sample frequency
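The relation between clock frequency and sample frequency can be turned into a quick budget check. The Python sketch below is illustrative: the function names and the example frequencies are hypothetical, and the case where the clock lies between one and two times the sample rate, which the list does not cover, is grouped with the fully parallel case as an assumption.

```python
def cycles_per_sample(clock_hz, sample_hz):
    """Number of whole clock cycles available to process one sample."""
    return clock_hz // sample_hz


def design_style(clock_hz, sample_hz):
    """Classify a design by its clock-to-sample-rate ratio."""
    if clock_hz < sample_hz:
        return "pipeline/parallel"
    if clock_hz >= 2 * sample_hz:
        return "time-multiplexed"
    # Ratio of 1 (and, by assumption, anything below 2).
    return "fully parallel"


audio_budget = cycles_per_sample(100_000_000, 48_000)
audio_style = design_style(100_000_000, 48_000)
if_style = design_style(70_000_000, 70_000_000)
fast_style = design_style(200_000_000, 512_000_000)
```

With a 100 MHz clock and 48 kHz samples there are 2083 cycles per sample, so a single multiplier can be shared across many operations. When the sample rate exceeds the clock rate, several samples must be processed in each cycle.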
Dedicated Fully Parallel Architecture
clock freq = sampling freq
One-to-one mapping
[Figure: DFG representation with operators OP2 and OP3, and Step 2, its one-to-one mapping onto hardware]
Example: Dedicated Implementation
[Figure: DFG representation with inputs a[n], b[n] and c[n], intermediate signals d[n] and e[n], a delayed term d[n-1] and output out[n]; its one-to-one mapping onto 8-bit adder/subtractor units and a multiplier with a 16-bit product, shown first with the algorithmic register only and then with added pipeline registers]
HW Components for Fully Dedicated Architecture
Adder
Multiplier
Barrel Shifter
Register File
Embedded Functional Units in FPGAs
Xilinx Family
DSP48 in Virtex4
[Figure: two cascaded DSP48 slices, each with an 18 x 18 bit multiplier (inputs A and B), a 48-bit adder/subtractor with input C and output P, and cascade ports BCIN/BCOUT and PCIN/PCOUT; the cascade path includes a wire shift right by 17 bits]
Example: One to one mapping using embedded
multipliers and DSP48 blocks
[Figure: second-order IIR filter from x[n] to y[n] with feedback coefficients a1 and a2, feedforward coefficients b0, b1 and b2, delayed states w[n-1] and w[n-2], 16-bit operands, 32-bit products and sums, and a quantizer Q producing wq[n] from w[n]]
Implementation in RTL Verilog
module iir(xn, clk, rst, yn);
// x[n] is in Q1.15 format
input signed [15:0] xn;
input clk, rst;
// w[n] in Q2.30 format with one redundant sign bit
assign wfn = wn_1*a1+wn_2*a2;
/* throw away the redundant sign bit, keep the
16 MSBs and add x[n] to get w[n] in Q1.15 format */
endmodule
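The fixed-point step described in the Verilog comments can be modeled in Python to check the bit selection. This is a sketch under stated assumptions: the helper names are hypothetical, the coefficients are taken to be Q1.15 like the data, and 16-bit wraparound stands in for whatever overflow handling the full module uses.

```python
def to_q15(x):
    """Quantize a float in [-1, 1) to a Q1.15 integer."""
    return max(-32768, min(32767, round(x * 32768)))


def wrap16(v):
    """Wrap an integer to signed 16 bits, like a 16-bit register."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def iir_feedback_step(xn, wn_1, wn_2, a1, a2):
    """w[n] = x[n] + a1*w[n-1] + a2*w[n-2], all operands in Q1.15."""
    wfn = wn_1 * a1 + wn_2 * a2        # Q2.30 sum of two products
    # Dropping the redundant sign bit and keeping 16 MSBs selects bits
    # [30:15], i.e. an arithmetic shift right by 15 and a 16-bit wrap.
    return wrap16(xn + wrap16(wfn >> 15))


a1, a2 = to_q15(0.5), to_q15(-0.25)
w = iir_feedback_step(to_q15(0.25), to_q15(0.5), to_q15(0.5), a1, a2)
w_float = w / 32768
```

With these inputs the exact result is 0.25 + 0.5 * 0.5 - 0.25 * 0.5 = 0.375, and the fixed-point model reproduces it without rounding error because every value is a power-of-two fraction.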
Synthesis on Spartan 3
Synthesis Results
Number of IOs: 50
Synthesis on Virtex-4 Device
Multiplication and addition operations in the DFG are
now mapped on DSP48 blocks
Design Options
When multiple design options are available for the HW blocks
Choose the fastest blocks for the critical path
Design Options
[Figure: a DFG with coefficients a1, a2, b1, b2 and c1 computing y[n] along Path 1 and Path 2, and its mapping onto multiplier types M1 and M3 and adders A1 and A2]
Homogeneous SDF (HSDF)
[Figure: SDFG to HSDF conversion: (a) an SDFG consisting of three nodes A, B and C; (b) an equivalent HSDF graph with replicated copies of the nodes]
Computing the two-dimensional DCT in the JPEG
algorithm is a good example of a
multi-rate DFG
[Figure: multi-rate DFG for the 2-D DCT: an 8x8 block BT (64 tokens) feeds a node computing Ci = 1D-DCT(BT(i,:)) on 8 tokens per firing; the resulting 8x8 block CT (64 tokens) feeds a node computing Fj = 1D-DCT(CT(:,j)) on 8 tokens per firing]
An example of a CSDFG. Node A takes 1 time unit in its first firing and produces 2
tokens; in its second firing it takes 3 time units and produces 4 tokens. Node B consumes
1 token in 2 time units in its first firing and 3 tokens in 4 time units in its second firing,
producing 4 and 7 tokens respectively. The behavior of node C can be inferred similarly
from the figure
CDFG implementing conditional firing
[Figure: CDFG that tests a==0 and, depending on the True or False outcome, fires different operations (multiply, subtract, add) on the operands a, b and e]
A dataflow graph is a good representation for
mapping onto a run-time reconfigurable platform
Mathematical transformation changing a DFG to
meet design goals
[Figure: a transformation maps a DFG over nodes A, B and C from <Vi,Ei> to <Vt,Et>]
Performance Measures
Iteration Period
Sample period and Throughput
Latency
Power Dissipation
Power Dissipation
Power is a major consideration
Especially for battery operated devices
There are two classes of power dissipation in a
digital circuit
Static power dissipation is due to the leakage current in
digital logic
Dynamic power dissipation is due to all the switching
activity
• Major source of power dissipation
• Stop the clock in areas that are not active
• The dynamic power consumption in a CMOS circuit is
\[ P_d = \alpha C V_{DD}^2 f \]
α and C are the switching activity and the physical capacitance of the design
VDD is the supply voltage
f is the clock frequency.
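The formula shows why clock gating and voltage scaling are effective. The Python sketch below evaluates it for a few cases; the numbers are hypothetical and chosen only to show how the result scales.

```python
def dynamic_power(alpha, cap_farads, vdd_volts, freq_hz):
    """Dynamic power of a CMOS design: P_d = alpha * C * VDD^2 * f."""
    return alpha * cap_farads * vdd_volts ** 2 * freq_hz


# Hypothetical operating points.
base = dynamic_power(alpha=0.2, cap_farads=1e-9, vdd_volts=1.2, freq_hz=100e6)
# Stopping the clock in inactive areas lowers the switching activity.
gated = dynamic_power(alpha=0.1, cap_farads=1e-9, vdd_volts=1.2, freq_hz=100e6)
# Halving the supply voltage cuts dynamic power by a factor of four.
low_v = dynamic_power(alpha=0.2, cap_farads=1e-9, vdd_volts=0.6, freq_hz=100e6)
```

Halving the switching activity halves the dynamic power, while halving VDD reduces it to a quarter because of the squared term.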
Latency and sampling period
Latency is the time delay for the design to
produce an output y[n] in response to an input
x[n]
Sampling period is the time difference between
the arrivals of two successive samples
[Figure: input signal x[n] enters a fully dedicated architecture that produces output signal y[n]; time axes marked in multiples of the sample period T show the latency between an input sample and the corresponding output]
Mapping Multi-rate DFG in Hardware
Sequential Mapping
Parallel Mapping
[Figure: (a) multi-rate to single-rate edge between nodes A and B; (b) sequential synthesis; (c) parallel synthesis; clocks shown are clkA, clkB, clkG and 3xclkG]
[Figure: (a) a multi-rate DFG with nodes A and B; (b) its hardware realization with N-bit registers R1 to R6, a clock and an enable signal]
[Figure: (a) hypothetical DFG with different types of nodes A, B, C, D and E; (b) its HW realization, clocked by clkG and clkg]
Summary
The chapter defines synchronous digital design to process digital signals
It defines 1-D, 2-D and video digital signals