
DIGITAL SIGNAL PROCESSORS

Dr. Hariharan Muthusamy,
School of Mechatronic Engineering,
Universiti Malaysia Perlis (UniMAP)

CONTENTS
Understanding of the key issues underlying general-purpose and
special-purpose processors for DSP, and of the
impact of DSP algorithms on the hardware and software
architectures of these processors.
How key DSP algorithms are implemented for real-time
execution on general-purpose digital signal processors.
Real-time processing means that results are produced as soon as
possible, but always within specified time limits.
Stream processing: data is processed one sample at a
time (e.g. digital filtering).
Block processing: fixed blocks of data points are
processed at a time (e.g. FFT and correlation).
Implementation of DSP algorithms requires both hardware
and software.

Hardware: an array of processors, standard microprocessors,
DSP chips or microprogrammed special-purpose devices.
Software: low-level assembly language code or microcode
native to the DSP hardware, and/or code in an efficient
high-level language, such as C or C++.
DSP processors: general purpose and special purpose
Fixed-point devices: Texas Instruments (TMS320C54x) and
Motorola DSP563x processors
Floating-point devices: Texas Instruments (TMS320C4x) and
Analog Devices (ADSP21xxx SHARC)
Two types of special-purpose hardware:
Algorithm-specific digital signal processors, designed for
efficient execution of specific DSP algorithms
Application-specific digital signal processors, designed for
specific applications
Both general-purpose and special-purpose processors can be
designed with single chips or with individual blocks of
multipliers, ALUs, memories and so on.

General-purpose processors available today are based on the
von Neumann concept, where operations are performed
sequentially.
Characteristics of the generic hardware architecture:
Multiple bus structure with separate memory spaces for
data and program instructions.
I/O ports provide a means of passing data to and from
external devices such as the ADC and DAC, or of passing
digital data to other processors.
Arithmetic units for logical and arithmetic operations,
which include an ALU, a hardware multiplier and shifters.
Why is such an architecture necessary?
DSP algorithms involve repetitive arithmetic operations
(multiply, add), memory accesses, and heavy data flow through the CPU.
Standard microprocessors are not well suited to this workload.
Important goal: to optimize both the hardware architecture
and the instruction set for DSP operations.

In digital signal processors, this is achieved by making
extensive use of the concepts of parallelism:
Harvard architecture
Pipelining
Fast, dedicated hardware multiplier/accumulator
Special instructions dedicated to DSP
Replication
On-chip memory/cache
Extended parallelism: SIMD, VLIW and static
superscalar processing.

Harvard Architecture

The program and data memories lie in two separate spaces,
permitting a full overlap of instruction fetch and execution.

Pipelining

Pipelining is a technique which allows two or more operations to overlap
during execution. A task is broken down into a
number of distinct subtasks which are then overlapped during
execution. It is used in DSP processors to increase speed. In a
perfect pipeline, the average time per instruction is given by
(time per instruction, non-pipelined) / (number of pipeline stages)
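
The perfect-pipeline formula can be checked with a few lines of Python. This is only a sketch: it assumes all stages take equal time and that there are no stalls, and the function names and the 120 ns / 3-stage figures are illustrative, not taken from any particular processor.

```python
def ideal_time_per_instruction(t_nonpipelined_ns, stages):
    """Average time per instruction in a perfect pipeline."""
    if stages < 1:
        raise ValueError("stages must be >= 1")
    return t_nonpipelined_ns / stages


def total_time(n_instructions, t_nonpipelined_ns, stages):
    """Total time for n instructions, including the pipeline fill.

    The first instruction needs all `stages` stage-times; each later
    instruction completes one stage-time after its predecessor.
    """
    stage_time = t_nonpipelined_ns / stages
    return (stages + n_instructions - 1) * stage_time


# Hypothetical processor: 120 ns per instruction without pipelining, 3 stages.
print(ideal_time_per_instruction(120.0, 3))   # average time in the ideal case
print(total_time(100, 120.0, 3))              # 100 instructions, with fill time
```

For long instruction sequences the fill time becomes negligible, and the average time per instruction approaches the ideal value.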

Hardware multiplier-accumulator

The basic numerical operations in DSP are multiplications and
additions. These operations are time consuming without dedicated hardware support.
To make real-time DSP possible, a fast, dedicated hardware
multiplier-accumulator (MAC) using fixed- or floating-point
arithmetic is mandatory.
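
A rough calculation shows why MAC speed matters for real-time work. The sketch below assumes an N-tap FIR filter needs one multiply-accumulate per tap for every input sample; the 100-tap filter and 48 kHz sampling rate are hypothetical figures chosen only for illustration.

```python
def mac_rate_required(num_taps, fs_hz):
    """MAC operations per second for an N-tap FIR filter at sampling rate fs."""
    return num_taps * fs_hz


def time_per_mac_ns(num_taps, fs_hz):
    """Upper bound on the time available for each MAC, in nanoseconds."""
    return 1e9 / mac_rate_required(num_taps, fs_hz)


# Hypothetical example: 100 taps at 48 kHz.
print(mac_rate_required(100, 48000.0))   # MACs per second
print(time_per_mac_ns(100, 48000.0))     # time budget per MAC in ns
```

This budget ignores input/output and loop overhead, so the real requirement on the MAC unit is tighter still.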

Benefits of Special Instructions

They lead to more compact code, which takes up less space in
memory.
They lead to an increase in the speed of execution of DSP
algorithms.

Special Instructions

Instructions that support basic DSP operations
Instructions that reduce the overhead in instruction
loops
Application-oriented instructions

Replication

Replication involves using two or more basic units, for
example more than one ALU, multiplier or
memory unit.
In DSP, the norm is to have one CPU, with one or more
arithmetic elements replicated.

On-chip memory/cache
DSP chips operate so fast that slow, inexpensive memories
are unable to keep up.
To overcome this problem, many DSP chips contain fast on-chip
data RAMs or ROMs.
In such processors, slow external memories may be used to
hold the program code.
At initialization, the code may be transferred to the fast,
internal memory for full-speed execution.

Extended Parallelism: SIMD, VLIW and static
superscalar processing

The aim is to increase both the number of instructions executed in each
cycle and the number of operations performed per instruction,
to enhance performance.
Three techniques are normally used to increase the
computational performance of DSP processors:
single instruction, multiple data (SIMD), very long
instruction word (VLIW) and superscalar processing.


SIMD processing is used to increase the number of operations
performed per instruction.
A SIMD processor has multiple data paths and multiple execution units.
A single instruction may be issued to the multiple execution
units to process blocks of data simultaneously, and in this way
the number of operations performed in one cycle is increased.
For example, the processor may perform two separate arithmetic
operations simultaneously with a single instruction.
Examples: Lucent DSP16000, Texas Instruments TMS320C62x and
Analog Devices TigerSHARC (ADSP-TS001)

Very long instruction word (VLIW) processing is an important approach
for substantially increasing the number of instructions that are
processed per cycle.
A VLIW is essentially a concatenation of several short
instructions and requires multiple execution units, running in
parallel, to carry out the instructions in a single cycle (e.g. Texas
Instruments TMS320C62x).
Superscalar processing is another technique for increasing the
instruction rate of a DSP processor by exploiting instruction-level
parallelism.
Superscalar refers to computer architectures that enable
multiple instructions to be executed in one cycle.
In superscalar DSP processors, multiple execution units are
provided and several instructions may be issued to the units for
concurrent execution (e.g. Analog Devices TigerSHARC).

General purpose DSP processors


They are high-speed microprocessors, with hardware
architectures and instruction sets optimized for DSP
operations.
These processors make extensive use of parallelism, Harvard
architecture, pipelining and dedicated hardware wherever
possible to perform time-consuming operations, such as
shifting/scaling, multiplication and so on.
There is a never-ending quest to find better ways to perform DSP
operations, in terms of computational efficiency, ease of
implementation, cost, power consumption, size and
application-specific needs.
Fixed-point digital signal processors
First-generation fixed-point DSP processors: Texas
Instruments TMS320C1x.
Second-generation fixed-point DSPs: Texas Instruments
TMS320C5x, Motorola DSP5600x, ADSP21xx and DSP16xx
families.

Special instructions in second-generation DSPs include a
multiply-and-accumulate with data move instruction.
This can be combined with a repeat instruction to execute an FIR
filter with considerable time savings.
Their bit-reversed addressing capability is useful in FFTs.
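
Bit-reversed addressing can be illustrated in software. The sketch below computes the index order that a radix-2 FFT needs for a power-of-two block length; the function names are hypothetical, and a DSP chip would generate these addresses in hardware rather than with a loop like this.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest `num_bits` bits of `index`."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)  # move the lowest bit across
        index >>= 1
    return result


def bit_reversed_order(n):
    """Bit-reversed index order for an n-point block (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    num_bits = n.bit_length() - 1
    return [bit_reverse(i, num_bits) for i in range(n)]


print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

For an 8-point transform, index 3 (binary 011) maps to 6 (binary 110), and so on.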
Third-generation fixed-point DSPs: TMS320C54x,
DSP563x and DSP16000.
They have more data paths, larger on-chip memory and an instruction
cache.
They are low power and have a power management facility.
Fourth-generation fixed-point DSP processors: TMS320C62x,
with a VLIW architecture, large program and data cache memories,
simplicity and high computational performance.

SELECTING DIGITAL SIGNAL PROCESSORS


Architectural features: size of on-chip memory, special
instructions, and I/O capability.
Execution speed: the two main units of measurement
are the clock speed of the processor, in MHz, and the number of
instructions performed, in millions of instructions per
second (MIPS).
Type of arithmetic: fixed-point or floating-point arithmetic.
Floating point: the natural choice for applications with wide
and variable dynamic range requirements.
Fixed point: low-cost, high-volume applications.
Floating-point processors are more expensive than fixed-point ones,
although the cost difference has fallen significantly in
recent years.
Wordlength: has a significant impact on signal quality. The longer the
data word, the lower the errors introduced by digital signal processing.
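
The effect of wordlength on error can be seen with a simple uniform quantizer. This is a sketch under stated assumptions: samples are signed fixed-point fractions in the range [-1, 1), values are rounded to the nearest level, and the function names and test signal are illustrative only.

```python
def quantize(x, bits):
    """Round x to the nearest level of a signed fixed-point word of `bits` bits."""
    scale = 1 << (bits - 1)                 # e.g. 128 levels per unit for 8 bits
    q = round(x * scale)
    q = max(-scale, min(scale - 1, q))      # saturate to the representable range
    return q / scale


def max_quantization_error(bits, num_points=1000):
    """Worst-case rounding error over a ramp from -0.9 to 0.9."""
    worst = 0.0
    for i in range(num_points):
        x = -0.9 + 1.8 * i / (num_points - 1)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst


for bits in (8, 16, 24):
    print(bits, max_quantization_error(bits))
```

The worst-case error is bounded by half a quantization step, so each extra bit of wordlength halves it.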

Implementation of DSP algorithms on general-purpose
digital signal processors

FIR digital filter
Non-recursive N-point FIR filters

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k)$$

To see how the FIR filter works, consider the simple case of N = 3:

y(n) = h(0)x(n) + h(1)x(n-1) + h(2)x(n-2)
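
The N = 3 case can be evaluated directly in a few lines. This is a minimal sketch: the coefficient and input values are made up for illustration, and samples before the start of the input are assumed to be zero.

```python
def fir_output(h, x, n):
    """Evaluate y(n) = sum over k of h(k) * x(n - k), with x(m) = 0 for m < 0."""
    acc = 0.0
    for k in range(len(h)):
        if 0 <= n - k < len(x):
            acc += h[k] * x[n - k]
    return acc


h = [0.5, 0.25, 0.25]        # hypothetical coefficients h(0), h(1), h(2)
x = [1.0, 2.0, 3.0, 4.0]     # hypothetical input samples x(0) .. x(3)
y = [fir_output(h, x, n) for n in range(len(x))]
print(y)  # [0.5, 1.25, 2.25, 3.25]
```

For example, y(2) = 0.5(3.0) + 0.25(2.0) + 0.25(1.0) = 2.25.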
A digital FIR notch filter satisfying the following specifications
is to be implemented on the second-generation
fixed-point DSP processor, TMS320C50:
notch frequency = 1.875 kHz, attenuation at notch frequency
= 60 dB, passband edge frequencies = 1.575 kHz and 2.175 kHz,
passband ripple = 0.01 dB, and sampling frequency = 7.5 kHz.
A complete FIR filter program has at least four essential parts:
Initialization: initialize the system; this may include setting
up a coefficient table.
Input section: this may include reading the input
sample, x(n), from an ADC via a serial port.
Inner loop computation: execution of the FIR equation to
obtain y(n).
Output section: this may include shifting/rounding the result of the
inner loop computation and sending it, e.g. to the DAC via a
serial port.
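
The four parts can be sketched as a sample-by-sample filter in Python. This is not TMS320C50 code: it assumes 16-bit Q15 fixed-point data, passes the input sample in as an argument instead of reading an ADC, and returns the output instead of writing to a DAC. The class and function names are hypothetical.

```python
def saturate16(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(-32768, min(32767, value))


class FirQ15:
    def __init__(self, coeffs_q15):
        # Initialization: set up the coefficient table and clear the delay line.
        self.h = list(coeffs_q15)
        self.delay = [0] * len(self.h)

    def process(self, sample_q15):
        # Input section: the new sample enters the delay line and the
        # oldest sample drops off the end (the "data move").
        self.delay = [sample_q15] + self.delay[:-1]
        # Inner loop: multiply-accumulate; each product is in Q30 format.
        acc = 0
        for hk, xk in zip(self.h, self.delay):
            acc += hk * xk
        # Output section: round the Q30 accumulator to Q15 and saturate.
        return saturate16((acc + (1 << 14)) >> 15)


fir = FirQ15([16384, 16384])             # two taps of 0.5 in Q15
print([fir.process(s) for s in (16384, 16384, 0)])  # [8192, 16384, 8192]
```

With an input of 0.5 (16384 in Q15), the two-tap averager outputs 0.25, then 0.5, then 0.25 once the input returns to zero.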
IIR digital filtering
Canonic second-order section:
w(n) = SF1 x(n) - a1 w(n-1) - a2 w(n-2)
y(n) = b0 w(n) + b1 w(n-1) + b2 w(n-2)
where x(n) represents the input data, w(n) represents the
internal node, y(n) is the filter output sample and SF1 is a
scale factor.
Direct form second-order IIR filter:
y(n) = b0 x(n) + b1 x(n-1) + b2 x(n-2) - a1 y(n-1) - a2 y(n-2)
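
The two second-order structures can be compared numerically. The sketch below uses floating-point arithmetic and zero initial conditions, and the coefficient values are hypothetical; with SF1 = 1 both forms should give the same output.

```python
def biquad_canonic(x, b, a, sf1=1.0):
    """Canonic section: one shared delay line w(n-1), w(n-2)."""
    b0, b1, b2 = b
    a1, a2 = a
    w1 = w2 = 0.0
    out = []
    for xn in x:
        wn = sf1 * xn - a1 * w1 - a2 * w2
        out.append(b0 * wn + b1 * w1 + b2 * w2)
        w2, w1 = w1, wn
    return out


def biquad_direct(x, b, a):
    """Direct form: separate delays for past inputs and past outputs."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out


b = (0.2, 0.4, 0.2)     # hypothetical b0, b1, b2
a = (-0.5, 0.25)        # hypothetical a1, a2 (poles inside the unit circle)
impulse = [1.0] + [0.0] * 7
print(biquad_canonic(impulse, b, a))
print(biquad_direct(impulse, b, a))
```

The canonic form needs only two delay elements instead of four, which is why it is attractive when memory is limited.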

FFT processing
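The discrete Fourier transform (DFT) of a finite data sequence
x(n), n = 0, 1, ..., N-1, is defined as

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1$$

where $W_N = e^{-j 2\pi / N}$.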
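
The DFT sum can be evaluated directly as a check on the definition. This sketch assumes the usual twiddle factor W_N = exp(-j2π/N) and makes no attempt at speed: it needs on the order of N squared complex multiplications, which is the cost that the FFT reduces.

```python
import cmath


def dft(x):
    """Direct evaluation of X(k) = sum over n of x(n) * W_N^(n*k)."""
    n_points = len(x)
    w = cmath.exp(-2j * cmath.pi / n_points)   # twiddle factor W_N
    return [
        sum(x[n] * w ** (n * k) for n in range(n_points))
        for k in range(n_points)
    ]


print(dft([1, 0, 0, 0]))   # unit impulse: every X(k) is close to 1
print(dft([1, 1, 1, 1]))   # constant input: X(0) close to 4, the rest close to 0
```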

Why special purpose?


DSP operations are computationally intensive. In wide-bandwidth
applications where the input/output data rates are
high, general-purpose signal processors cannot perform the
required computations fast enough.
For a given application, most general-purpose DSPs contain
many on-chip resources that are redundant, for example
addressing modes, instructions and I/O peripherals.
In special-purpose DSPs, the hardware is optimized to execute
a specific algorithm or to perform certain functions in a specific
application.
This leads to greater utilization of on-chip resources and
increased speed of operation.

Special-purpose hardware can be implemented as a single-chip
product or realized as blocks of individual
components.
