
ISSN: 0976-8491 (Online) | ISSN: 2229-4333 (Print) | IJCST Vol. 4, Issue 2, April - June 2013

Advanced Digital Signal Processing (ADSP) for Decreasing Power Dissipation

¹Nandinii Alla, ²Suresh Badugu
¹,²Dept. of ECE, K L University, Vaddeswaram, Guntur, AP, India

Abstract
This paper presents the design of the core of a fixed-point, general-purpose Advanced Digital Signal Processing (ADSP) processor and its instruction set. The processor employs parallel processing techniques and specialized addressing models to speed up the processing of multimedia applications. The ADSP has a dual MAC structure with one enhanced MAC that provides a SIMD (Single Instruction, Multiple Data) unit consisting of four parallel data paths optimized for accelerating multimedia applications. The SIMD unit performs four multimedia-oriented 16-bit operations every clock cycle. This accelerates computationally intensive procedures such as video and audio decoding. The ADSP uses a memory bank of four memories to provide multiple accesses to source data in each clock cycle. Various techniques have been developed for reducing the power consumption of VLSI designs, including voltage scaling, switched-capacitance reduction, clock gating, power-down techniques, threshold-voltage control, multiple supply voltages, and dynamic voltage and frequency scaling. These low-power techniques have proven effective, at some cost, and are applicable to multimedia/digital signal processing designs. Among them, a promising direction for significantly reducing power consumption is to reduce the dynamic power, which dominates total power dissipation. The proposed design is intended to decrease the switching (or dynamic) power dissipation, which comprises a significant portion of the total power dissipation in integrated circuits.

Keywords
VLSI Design, Multimedia/Digital Signal Processing, Power Dissipation, SIMD

I. Introduction
Digital Signal Processing (DSP) is the mathematical manipulation of an information signal to modify or improve it in some way. It is characterized by the representation of discrete-time, discrete-frequency, or other discrete-domain signals by a sequence of numbers or symbols, and by the processing of these signals. Digital signal processing and analog signal processing are subfields of signal processing. DSP includes subfields such as audio and speech signal processing, sonar and radar signal processing, sensor array processing, spectral estimation, statistical signal processing, digital image processing, signal processing for communications, control of systems, biomedical signal processing, and seismic data processing.
The goal of DSP is usually to measure, filter, and/or compress continuous real-world analog signals. The first step is usually to convert the signal from analog to digital form by sampling and then digitizing it with an Analog-to-Digital Converter (ADC), which turns the analog signal into a stream of numbers. Often, however, the required output is another analog signal, which requires a Digital-to-Analog Converter (DAC). Even though this process is more complex than analog processing and has a discrete value range, applying computational power to digital signal processing offers many advantages over analog processing in many applications, such as error detection and correction in transmission, as well as data compression.
DSP algorithms have long been run on standard computers, on specialized processors called digital signal processors, and on purpose-built hardware such as Application-Specific Integrated Circuits (ASICs). Today, additional technologies are used for digital signal processing, including more powerful general-purpose microprocessors, Field-Programmable Gate Arrays (FPGAs), digital signal controllers (mostly for industrial applications such as motor control), and stream processors, among others.

A. Signal Sampling
With the increasing use of computers, the use of and need for digital signal processing have increased. To use an analog signal on a computer, it must be digitized with an analog-to-digital converter. Sampling is usually carried out in two stages, discretization and quantization. In the discretization stage, the space of signals is partitioned into equivalence classes, and the signal is replaced by a representative signal of the corresponding equivalence class. In the quantization stage, the representative signal values are approximated by values from a finite set.
The Nyquist–Shannon sampling theorem states that a signal can be exactly reconstructed from its samples if the sampling frequency is greater than twice the highest frequency in the signal, although exact reconstruction requires an infinite number of samples. In practice, the sampling frequency is often significantly higher than twice the signal's limited bandwidth.
Some (continuous-time) periodic signals become non-periodic after sampling, and some non-periodic signals become periodic after sampling. In general, for a periodic signal with period T to be periodic (with period N) after sampling with sampling interval T_s, the following must be satisfied:

$$ N \, T_s = k \, T $$

where k is an integer.

B. DSP Domains
In DSP, engineers usually study digital signals in one of the following domains: time domain (one-dimensional signals), spatial domain (multidimensional signals), frequency domain, and wavelet domains. They choose the domain in which to process a signal by making an informed guess (or by trying different possibilities) as to which domain best represents the essential characteristics of the signal. A sequence of samples from a measuring device produces a time- or spatial-domain representation, whereas a discrete Fourier transform produces the frequency-domain information, that is, the frequency spectrum. Autocorrelation is defined as the cross-correlation of the signal with itself over varying intervals of time or space.

C. Multimedia Processor
A multimedia processor is an application-specific DSP processor that performs a number of multimedia algorithms. The following classes of DSP algorithms might be referred to as multimedia types:


• Speech coding and decoding
• Speech recognition
• Speech identification
• High-fidelity audio encoding and decoding
• Modem algorithms
• Audio mixing and editing
• Voice synthesis
• Image compression and decompression
• Image compositing
A general-purpose Advanced DSP (ADSP) processor should, of course, cover all of the above. Naturally, no processor can meet the needs of all, or even most, applications, which is why it is the designer's task to find the optimal trade-off between functional coverage and performance, cost, integration, power consumption, and other factors.

II. The Processor Design Flow
This section outlines the design flow of a DSP processor in general, with some explanations specific to the designed one. The schematic of the design flow is shown in Fig. 1.

Fig. 1: The DSP Processor Design Flow

III. A Top-Level Data Path Architecture Design
According to the design specification, we have designed an Advanced DSP processor (ADSP). This is a DSP processor that has a special architecture and hardware features to accelerate media applications. The data have a fixed-point representation. The general structure of the processor is a Harvard architecture, with separate memories for programs and for data. Most DSP applications require high performance in repetitive, data-intensive computation. This work aims to design an efficient architecture for a general-purpose multimedia processor and concentrates on:
• Fast Multiply-Accumulate (MAC) operations (most DSP algorithms, including filtering and transforms, are multiplication-intensive).
• A multiple-memory-access architecture (this can be very efficient where operations can be accelerated by reading multiple data items in the same instruction cycle).
• Specialized addressing models (efficient data management and special data types in DSP applications).
Designers should also not neglect an efficient control path and the input/output organization; this work did not concentrate on them.
We settled on a dual MAC (DMAC) architecture. The top-level data path architecture is shown in Fig. 2. Initially, each MAC had the same structure. Each MAC operates on data from the memory and from the register file. The data length is 16 bits, and the same applies to the memory.

Fig. 2: A Top-Level Data Path Architecture

Each parallel data path provides an eight-by-eight-bit multiplication followed by a 20-bit accumulation. The hardware structure of the parallel and the serial data paths is the same. The only differences are the computational bit length and the extra hardware for performing special instructions such as PSAD and PDOT. Each parallel data path has a final 20-bit result and each serial data path has a 40-bit result. These bit lengths are obtained by adding guard bits to the native result length (16 bits for an 8 × 8-bit product, 32 bits for a 16 × 16-bit product) to prevent overflow errors during hardware loops. For large loops, and given the general-purpose orientation of this ADSP processor, we found that four guard bits for the final result in the parallel data paths and eight guard bits in the serial ones are sufficient.
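As an illustration of the guard-bit sizing described for the data paths, the following Python sketch models a multiply-accumulate step with a fixed-width accumulator and computes how many worst-case products fit before the accumulator wraps. It is only a sketch under stated assumptions: the function names (`mac`, `safe_iterations`, `simd_mac`) are hypothetical, operands are treated as unsigned, and wraparound on overflow is assumed; the paper does not specify signedness or overflow behaviour.

```python
ACC_BITS_PARALLEL = 20  # 16-bit product + 4 guard bits (widths taken from the text)
ACC_BITS_SERIAL = 40    # 32-bit product + 8 guard bits (widths taken from the text)


def mac(acc, a, b, acc_bits):
    """One multiply-accumulate step; the accumulator wraps at acc_bits (assumed behaviour)."""
    return (acc + a * b) & ((1 << acc_bits) - 1)


def safe_iterations(operand_bits, acc_bits):
    """Number of worst-case unsigned products that fit in acc_bits without wrapping."""
    max_product = ((1 << operand_bits) - 1) ** 2
    return ((1 << acc_bits) - 1) // max_product


def simd_mac(accs, xs, ys):
    """Four parallel 8x8-bit MACs, one per lane, each with its own 20-bit accumulator."""
    return [mac(acc, x, y, ACC_BITS_PARALLEL) for acc, x, y in zip(accs, xs, ys)]


if __name__ == "__main__":
    print(safe_iterations(8, ACC_BITS_PARALLEL))   # parallel data path
    print(safe_iterations(16, ACC_BITS_SERIAL))    # serial data path
    print(simd_mac([0, 0, 0, 0], [1, 2, 3, 4], [5, 6, 7, 8]))
```

Under these assumptions, each guard bit doubles the number of worst-case accumulations that can be performed safely, which is why longer hardware loops call for more guard bits.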


Fig. 3: The Six Data Paths ADSP Structure

IV. Control Path
A simplified version of the control path is shown in Fig. 4. It contains the programmable Finite State Machine (FSM), or Program Flow Controller, the program memory, and the instruction decoder.
The Program Flow Controller reads the flag registers and status signals from the processor. It determines the next PC address for program memory addressing according to the execution of the current instruction. When the next instruction is addressed, the current instruction is passed from the program memory to the instruction decoder. The program memory is 32 bits wide and 64 kW (kilowords) deep.

A. Pipeline Structure
There are several strategies for pipeline design; the main difficulty is choosing the number of pipeline stages. With the designed hardware, an instruction is executed in the following order:
• fetching the instruction
• decoding the instruction
• calculating the valid operand addresses
• performing the operation (execution)
• writing the final result
We have divided instruction execution into six clock cycles (see Fig. 5). First, we fetch the instruction, decode it, and calculate the valid operand address (fetch operand). The next two cycles execute the operation. Finally, the data is stored in the last step.

Fig. 5: Pipeline Principle

where:
FI: fetch instruction
DI: decode instruction
FO: fetch operand
E1: first execution cycle
E2: second execution cycle
ST: store the result

B. Data Memory
According to the design specification, all memories should be single-port SRAM only, which makes the design easier to port to different silicon processes.
The data memory address space should be large enough to cover all functional purposes. Four different 16-bit data accesses are supported in parallel. We have divided the memory bank into four memories (M0, M1, M2, M3); see Fig. 6.

Fig. 6: Memory Structure
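To make the benefit of splitting the data memory into four single-port memories concrete, the sketch below counts how many cycles a group of accesses needs when each bank can serve one access per cycle. The paper does not state how addresses map to M0 to M3, so the low-order interleaving in `bank_of` is purely an assumption for illustration, and both function names are hypothetical.

```python
NUM_BANKS = 4  # M0..M3, as in the text


def bank_of(addr):
    """Hypothetical mapping: consecutive word addresses go to consecutive banks."""
    return addr % NUM_BANKS


def cycles_needed(addrs):
    """Cycles to serve all accesses when each single-port bank serves one access per cycle."""
    counts = [0] * NUM_BANKS
    for addr in addrs:
        counts[bank_of(addr)] += 1
    return max(counts) if addrs else 0


if __name__ == "__main__":
    print(cycles_needed([8, 9, 10, 11]))  # four accesses, four different banks
    print(cycles_needed([0, 4, 8, 12]))   # four accesses, all in the same bank
```

With this assumed mapping, four accesses to consecutive addresses complete in a single cycle, whereas four accesses that fall into the same bank are serialized. This is the kind of conflict that specialized addressing models would need to avoid.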

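The six-stage pipeline of the Pipeline Structure subsection (FI, DI, FO, E1, E2, ST) can also be visualized with a short sketch. It assumes an idealized pipeline that issues one instruction per clock cycle with no stalls or hazards; the paper does not describe hazard handling, so this is an illustration of the overlap principle only, and the function names are hypothetical.

```python
STAGES = ["FI", "DI", "FO", "E1", "E2", "ST"]  # stage names from the text


def total_cycles(n_instructions):
    """Cycles to complete n instructions in an ideal, stall-free pipeline."""
    if n_instructions <= 0:
        return 0
    return n_instructions + len(STAGES) - 1


def pipeline_table(n_instructions):
    """Per instruction, the stage occupied in each clock cycle ('' when idle)."""
    total = total_cycles(n_instructions)
    table = []
    for i in range(n_instructions):
        row = [""] * total
        for s, name in enumerate(STAGES):
            row[i + s] = name  # instruction i enters the pipeline in cycle i
        table.append(row)
    return table


if __name__ == "__main__":
    for row in pipeline_table(3):
        print(" ".join(f"{stage:>2}" for stage in row))
```

Under these assumptions a single instruction takes six cycles, while each additional instruction adds only one cycle, because different instructions occupy different stages at the same time.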
Fig. 4: Control Path Structure

V. Conclusion
The proposed digital signal processor is intended to decrease the switching (or dynamic) power dissipation, which comprises a significant portion of the total power dissipation in integrated circuits. The designed processor should show good performance for 8/16-bit convolution-based algorithms. Motion estimation and compensation (MPEG) applications should also be handled very well because of the architectural improvements made especially for them. The designed addressing models make it possible to process multidimensional media algorithms with a high level of flexibility in data accesses.
The high level of orthogonality of the instruction set and the architecture allows designers to use all six available data paths simultaneously. In this way, six different results are calculated at the same time: four loop results written to the memories and two auxiliary register operations. The SIMD parallel data paths can process 8-bit media data with high performance for applications that do not require high-quality colour, up to five significant bits per pixel. To process data with high-quality colours, programmers should use the DMAC mode. The designed instruction set is quite flexible with respect to future improvements and changes.

Nandinii Alla was born in Guntur, Guntur Dt, Andhra Pradesh, India. She is pursuing a B.Tech in ECE at K L University, Vaddeswaram, Andhra Pradesh, India. Her research interests include Digital Signal Processing.

Badugu Suresh received his B.Tech in ECE from Prakasam Engineering College, Kandukur, in 2008 and his M.Tech in Telematics and Signal Processing from NIT Rourkela in 2010. His areas of interest are signal processing, image processing, and communication systems. He is presently working as an Assistant Professor at K L University, Vaddeswaram, Andhra Pradesh. He has 3 years of teaching experience and 4 international journal publications. His native place is Ongole.

International Journal of Computer Science And Technology | www.ijcst.com | p. 110
