You are on page 1of 22

Chapter 18: Programmable DSPs

Keshab K. Parhi and Viktor Owall


DSP Applications
DSP applications are often real time
but with a wide variety of sample rates
• High rates
– Radar
– Video
• Medium rates
– Audio
– Speech
• Low rates
– Weather
– Finance
Chap. 18 2
...with different demands on

• numeric representation
– float or fixed
– and nmber of bits
• Throughput/speed
• Power/energy dissipation
• Cost
Chap. 18 3
DSP features
Fast Multiply/Accumulate (MAC)
• FIR x(n)
D D D
• FFT
h0 h1 h2 h3
• etc.

y(n)
• Multiple Access Memories
• Specialized addressing modes
• Specialized execution control (loops)
• Specialized interfaces, e.g. AD/DA

Chap. 18 4
Addressing Modes
• Implied addressing
P=X*Y; operation sets location
• Immediate data
AX0=1234
• Memory direct
R1=Mem[101]
• Register direct
sub R1, R2
• Register indirect
A0=A0+ *R5
• Register indirect with increment/decrement
A0=A0+ *R5++
Chap. 18
A0=A0+ *R5-- 5
Standard DSP Alternatives
PCs or Workstations
• Non-real time
• low requirements
General purpose microprocessors
• slower for DSP applications
• might be one µproc. there anyway
Custom
• perfomance
• low cost at volume
• High development cost
Chap. 18 6
Standard Processors vs. Special Purpose

Algorithm

Standard Special
Processor Processor Cores Purpose
Domain Specific
Processors • High Calculation Capacity
• Programmable • Low Power
etc.
• Low Design cost • User defined Interface
• Standard Interface • Variable Wordlength
• Good supply of tools • Low Price at Volume

Chap. 18 7
Architectural Partitioning
Local busses
Processor Processor
Core Core
ASIC and
Dist.
Distributed
MEM memory
Main Main to decrease
MEM MEM ASIC data transfers

MIPS intensive
Conflicting req. algorithms in
• Throughput Flexibility by dedicated HW
• Flexibility using programmable to increase
• Power Consumption processor core throughput and
• Time to market save power
Chap. 18 8
Fixed point DSP
24 24
Motorola DSP56000x
X0
Operand X1
Registers Y0
Y1
• Usually DSP has single cycle
24 24 multiplier, may be pipelined

• Double wordlength out

56 56
+ guard bits
ALU
Shifter
56
A (56) 24
B (56)
24
• scaling
56 56

Accumulators Shifter/
Limiter • Altenative is mult
24 24 with reduced
wordlength output,
Chap. 18 e.g. 24 9
Memory Structures, von Neuman

Processor Core

Addresss bus

Data bus

Memory

Chap. 18 10
Memory Structures, Harvard
Original Harvard
Processor Core • one data
• one program

Addresss bus 2

Data bus 2

Addresss bus 2

Data bus 2

Memory A Memory B
Chap. 18 11
TI Processors, high speed

Chap. 18 12
TI Processors, low power

Chap. 18 13
TI, C64

Chap. 18 14
TI, C55

Chap. 18 15
Processor Architectures
SIMD –
Single Instruction Multiple Data

Program

Processor Processor Processor Processor

Chap. 18 16
Processor Architectures
MIMD –
Multiple Instruction Multiple Data

Program Program Program Program

Processor Processor Processor Processor

Chap. 18 17
Processor Architectures
VLIW –
Very Long Instruction Words
Control Unit
VLIW Instruction

Functional Functional Functional Functional


Unit Unit Unit Unit

Chap. 18 18
Split Processors

Functional units
can be split into
submodules, e.g.
for images (8bits)
TI320C80,
1 RISC
4 x 32bit DSP which
can be split into
8bit modules
Chap. 18 19
Vector Processors

Chap. 18 20
Low Power MMAC
Multiplier Multiple Accumulator

V. Sundararajan and K.K. Parhi, "A


Novel Multiply Multiple Accumulator
Component for Low Power PDSP
Design", Proc. of 2000 IEEE Int. Conf.
on Acoustics, Speech and Signal
Processing, Vol. 6, pp. 3247-3250,
Istanbul, June 2000

Chap. 18 21
Low Power MMAC
Schedule 16-tap FIR 4 acc. MMAC

Chap. 18 22

You might also like