
Architecture of DSP Processors

BASICS OF PROCESSORS

Although fundamentally related, DSP processors differ significantly from general-purpose processors (GPPs) such as the Intel Pentium. Some of the most common functions performed in the digital domain are signal filtering, convolution and the fast Fourier transform. In mathematical terms, each of these functions reduces to a series of dot products. There are mainly three types of architecture employed for these processors.
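The dot-product claim can be made concrete with a short sketch. The function name and the use of a wide accumulator below are illustrative choices, not taken from any particular processor:

```c
/* An N-tap FIR filter output is the dot product of the coefficient
 * vector h and the most recent N input samples x. */
long long dot_product(const int *h, const int *x, int n)
{
    long long acc = 0;                  /* wide accumulator, as in a MAC unit */
    for (int i = 0; i < n; i++)
        acc += (long long)h[i] * x[i];  /* one multiply-accumulate per tap */
    return acc;
}
```

With h = {1, 2, 3} and x = {4, 5, 6}, dot_product(h, x, 3) returns 1*4 + 2*5 + 3*6 = 32. Filtering, convolution and the FFT butterfly all repeat this multiply-accumulate pattern, which is why DSP hardware is built around it.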

1. Von Neumann architecture:

It refers to computer architectures that use the same data storage for their instructions and data (in contrast to the Harvard architecture).
A von Neumann architecture computer has five parts: an arithmetic-logic unit, a control unit, a memory, some form of input/output, and a bus that provides a data path between these parts.
A von Neumann Architecture computer performs or emulates the following sequence of steps:
1. Fetch the next instruction from memory at the address in the program counter.
2. Add 1 to the program counter.
3. Decode the instruction using the control unit. The control unit commands the rest of the
computer to perform some operation. The instruction may change the address in the program
counter, permitting repetitive operations. The instruction may also change the program counter only
if some arithmetic condition is true, giving the effect of a decision, which can be made as
complex as desired by the preceding arithmetic and logic.

4. Go back to step 1.
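The four steps above can be emulated directly. The toy machine below is a sketch under the assumption of a two-word instruction format (opcode, operand address); its opcode names and memory layout are invented for illustration, but program and data share one memory array exactly as the architecture requires:

```c
/* Toy von Neumann machine: one array holds both instructions and data. */
enum { LOAD = 0, ADD = 1, STORE = 2, HALT = 3 };

static int mem[16] = {
    /* program */
    LOAD, 10,       /* acc = mem[10]  */
    ADD, 11,        /* acc += mem[11] */
    STORE, 12,      /* mem[12] = acc  */
    HALT, 0,
    0, 0,
    /* data */
    7, 5, 0         /* mem[10], mem[11], mem[12] */
};

void run(void)
{
    int pc = 0, acc = 0;
    for (;;) {
        int op  = mem[pc];          /* 1. fetch at the program counter */
        int arg = mem[pc + 1];
        pc += 2;                    /* 2. advance the program counter  */
        switch (op) {               /* 3. decode and execute           */
        case LOAD:  acc = mem[arg];  break;
        case ADD:   acc += mem[arg]; break;
        case STORE: mem[arg] = acc;  break;
        case HALT:  return;
        }                           /* 4. loop back to the fetch step  */
    }
}
```

After run() finishes, mem[12] holds 7 + 5 = 12. Note that every iteration touches mem at least twice (instruction word and operand), which is the traffic behind the bottleneck discussed next.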

Von Neumann computers spend a lot of time moving data to and from memory, and this
slows the computer (a problem called the von Neumann bottleneck). So engineers often
separate the bus into two or more buses, usually one for instructions and the other for data.
This type of architecture is cheap, requiring fewer pins than the Harvard architecture, and
simple to use because the programmer can place instructions or data anywhere throughout the
available memory. But it does not permit multiple simultaneous memory accesses.

2. Harvard Architecture:

The term Harvard Architecture originally referred to computer architectures that used
separate data storage for their instructions and data (in contrast to the Von Neumann
architecture). The term originated from the Harvard Mark I relay-based computer, which stored
instructions on punched tape and data in relay latches.

The term Harvard Architecture is usually used now to refer to a particular computer
architecture design philosophy where separate data paths exist for the transfer of instructions and
data.
All computers consist primarily of two parts: the CPU, which processes data, and the memory,
which holds the data. The memory in turn has two aspects to it: the data itself, and the location
where it is found, known as the address. Both are important to the CPU, as many common
instructions boil down to something like "take the data at this address and add it to the data at
that address", without actually knowing what the data itself is.
In recent years the speed of the CPU has grown many times in comparison to the memory it
talks to, so care must be taken to reduce the number of memory accesses in order to maintain
performance. If, for instance, every instruction run in the CPU requires an access to memory, the
computer gains nothing from increased CPU speed, a problem referred to as being memory
bound.

Memory can be made much faster, but only at high cost. The solution then is to provide a
small amount of very fast memory known as a cache. As long as the memory the CPU needs is in
the cache, the performance hit is very much less than when the cache has to turn around and
get the data from the main memory. Tuning the cache is an important aspect of computer design.
The Harvard architecture refers to one particular solution to this problem: instructions and
data are stored in separate caches to improve performance. However, this has the disadvantage of
halving the amount of cache available to either one, so it works best only if the CPU reads
instructions and data at about the same frequency.
The Harvard architecture requires two memory buses. This makes it expensive to bring off
the chip: for example, a DSP using 32-bit words and a 32-bit address space requires at least
64 pins for each memory bus, a total of 128 pins if the Harvard architecture is brought off the
chip. The result is very large chips, which are difficult to design into a circuit.

The true Harvard architecture dedicates one bus to fetching instructions, with the other
available to fetch operands. This is inadequate for DSP operations, which usually involve at least
two operands. So DSP Harvard architectures usually permit the 'program' bus to be used also for
access to operands. Note that it is often necessary to fetch three things, the instruction plus two
operands, and the basic Harvard architecture is inadequate to support this: so DSP Harvard
architectures often also include a cache memory which can be used to store instructions that
will be reused, leaving both Harvard buses free for fetching operands. This extension, Harvard
architecture plus cache, is sometimes called the extended Harvard architecture or Super Harvard
ARChitecture (SHARC).
General Architecture of DSP Processors:-
Every general DSP Processor core is composed of the Data Path, the Control Path and the
Address Generation Unit (AGU). The Memory Subsystem is located outside the processor core.
These in turn are built up of various modules. A basic DSP processor supports both RISC and CISC
instructions. RISC instructions use the general registers for operands and write results back to the
Register File (RF). CISC instructions use the memory subsystem to compute over vector elements, as in the
case of convolution; they read from memory and write to the accumulator special
registers located in the Multiplication and Accumulation (MAC) unit. The memory bus is distributed
to the memory and to the DSP core Data Path components, the MAC and the RF. However, more
components can be connected to the memory bus. This depends on the choice of
instruction set, which specifies all the operands required to perform each instruction. If there
are instructions that the ALU performs by fetching operands from the memory subsystem, then the
memory bus would also be connected to the ALU, and so on. The Control Path generates the
control signals for all components in the core, keeps track of the Program Counter and has a
stack to service subroutines.
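The text names the AGU without detailing its operations, so as an assumed but typical example: one common AGU duty is modulo (circular) addressing, which updates a delay-line pointer in parallel with the data path so that samples never need to be copied:

```c
/* Modulo (circular) address post-increment, a typical AGU operation.
 * The pointer wraps inside a buffer of buf_len words instead of
 * running off the end; the data path never waits for this update. */
int agu_post_increment(int addr, int step, int buf_len)
{
    addr += step;
    if (addr >= buf_len)    /* wrap around the circular buffer */
        addr -= buf_len;
    return addr;
}
```

For example, in an 8-word buffer, stepping by 2 from address 6 wraps to address 0. Dedicated address arithmetic like this is one reason a DSP core can sustain one MAC per cycle.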
Apart from the DSP core, the DSP Processor will also contain a Direct Memory Access
(DMA) controller, a Memory Management Unit (MMU), a Timer and an Interrupt Controller. The DMA
handles data transfer in parallel with core execution. The MMU is used to ensure memory-access
reliability and efficiency. The Timer is used to check the execution limit of a service routine. The
Interrupt Controller handles the processor core interrupts.
Arithmetic and Logic unit:-
As the name describes, the ALU does all the logic and arithmetic computations. In some
cases the ALU may also cover shifting and rotation operations. There are two ways to design the
ALU: it can be a part of the MAC, or it can be an individual
module. The former reduces the silicon cost but makes the design more complicated to design and
test. The latter increases the silicon cost, supports parallel execution if required and
reduces the design and verification time. Taking present silicon costs into consideration,
the latter design is preferred for the reasons mentioned above.
The figure below gives a brief introduction to the ALU components in a general DSP core. The ALU
normally performs single-precision computations; iterative instructions are never assigned to the
ALU in a general DSP Processor core. The pre-processing stage of the ALU performs operand
selection depending on the micro-operations of the ALU. The pre-processed operands are then
sent to the kernel, the major computation block in the ALU, which contains the
adder, the shifting unit and the logic unit. After the computations the data is sent to the post-processing
unit, which selects the output. The ALU also writes flags that are used for determining the
processor status and for conditional-execution instructions.
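A minimal sketch of the flag write-back follows, with generic flag names assumed rather than taken from any specific core:

```c
/* Illustrative status flags an ALU might write after an addition;
 * the names (zero, negative, overflow) are generic conventions. */
struct flags { int zero; int negative; int overflow; };

int alu_add(int a, int b, struct flags *f)
{
    long long wide = (long long)a + b;  /* compute in extra width      */
    int result = (int)wide;
    f->zero     = (result == 0);        /* Z: result is exactly zero   */
    f->negative = (result < 0);         /* N: sign bit of the result   */
    f->overflow = (wide != result);     /* V: result did not fit in an int */
    return result;
}
```

A conditional-execution instruction then simply tests one of these stored bits, so a branch or conditional move costs no extra arithmetic.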
