Director
CEERI
Pilani – 333 031
(Rajasthan)
Architecting for VLSI Implementation
For a particular logic specification, there are many different possible logic
implementations.
These different logic implementations may differ widely in their cost, speed
Specifying Logic
c CEERI, Pilani 2
4. Through Programming Language Statements.
Implementing Logic
Speed of Operation.
Power Consumption.
Design Time.
Design Cost.
Product Cost.
Upgradability.
[Figure: network of "+" blocks with inputs A, B, C, D, E and output Z]
Architecture #2 (Combinational) for Logic Implementation
[Figure: Architecture #2, combinational network of "+" blocks with inputs A, B, C, D, E and output Z]
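The trade-off between two combinational arrangements of the same "+" blocks can be sketched in a few lines of Python. This is only an illustration: it assumes "+" means addition and that the two arrangements are a linear chain and a balanced tree, which the extracted figures do not confirm. The names chain_sum and tree_sum are hypothetical.

```python
def chain_sum(values):
    """Linear chain: each adder waits for the result of the previous one."""
    total, depth = values[0], 0
    for v in values[1:]:
        total += v
        depth += 1  # one more adder on the critical path
    return total, depth


def tree_sum(values):
    """Balanced tree: adders on the same level operate in parallel."""
    level, depth = list(values), 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd element passes through to the next level
            nxt.append(level[-1])
        level, depth = nxt, depth + 1
    return level[0], depth


inputs = [1, 2, 3, 4, 5]  # stand-ins for A, B, C, D, E (assumed values)
print(chain_sum(inputs))  # (15, 4): four adder delays on the critical path
print(tree_sum(inputs))   # (15, 3): three adder delays on the critical path
```

Both arrangements use four adders for five inputs, so under these assumptions the difference is in speed, not in adder count.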
Architecture #3 (Sequential) for Logic Implementation
[Figure: Architecture #3, a 2:1 mux, a "+" block and a register R; inputs A, B, C, D, E; Select and Control signals]
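A sequential implementation reuses one "+" block over several steps. The sketch below is a loose behavioural model, assuming that the control unit selects one operand per step and that the adder accumulates into register R. The function name and the clearing of R before the first step are assumptions, not details taken from the figure.

```python
def sequential_sum(operands):
    """Accumulate the operands one per step through a single shared adder."""
    r = 0       # register R, assumed cleared before the first step
    steps = 0
    for select in range(len(operands)):  # control drives Select
        r = r + operands[select]         # the one adder is reused every step
        steps += 1
    return r, steps


print(sequential_sum([1, 2, 3, 4, 5]))  # (15, 5): one adder, five steps
```

Compared with a combinational network, this trades time (one step per operand) for hardware (a single adder).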
Sequential Architectures
Next step should be taken only when the logic function of the previous
The stepping can be asynchronous/self-timed/synchronous (with a timing
Architecture #4 (Pipelined; Synchronous) for Logic Implementation
[Figure: Architecture #4, pipelined network of "+" blocks with inputs A, B, C, D and output Z]
Architecture #5 (Pipelined; Synchronous) for Logic Implementation
[Figure: Architecture #5, pipelined network of "+" blocks with inputs A, B, C, D and output Z]
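The behaviour of a synchronous pipeline can be sketched as follows. The model assumes a two-stage split (stage 1 forms A+B and C+D, stage 2 adds the two partial sums) with one pipeline register between the stages. The exact stage boundaries in the figures are not recoverable, so this split and the name pipelined_sum are assumptions.

```python
def pipelined_sum(stream):
    """Two-stage pipeline over a stream of (A, B, C, D) tuples."""
    stage1 = None   # pipeline register between stage 1 and stage 2
    outputs = []
    cycles = 0
    # One extra idle cycle (None) drains the last item out of the pipeline.
    for item in list(stream) + [None]:
        cycles += 1
        # Stage 2 consumes what stage 1 latched on the previous clock edge.
        if stage1 is not None:
            outputs.append(stage1[0] + stage1[1])
        # Stage 1 latches new partial sums on this clock edge.
        stage1 = None if item is None else (item[0] + item[1], item[2] + item[3])
    return outputs, cycles


print(pipelined_sum([(1, 2, 3, 4), (5, 6, 7, 8)]))  # ([10, 26], 3)
```

Each result still takes two clock periods to emerge, but once the pipeline is full a new result appears every clock period.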
Pipelined Architectures
They can be coarse-grained or fine-grained.
The pipeline can be balanced (all pipeline stages have identical delays) or
Mixed Architectures.
Architecture #6 (Control-Programmable; Sequential) for Logic Implementation
[Figure: Architecture #6, a 2:1 mux, an ALU and a register R; inputs A, B, C, D, E; Select, Control and Op_Select signals]
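Replacing the fixed "+" block with an ALU makes the function performed at each step a matter of control. A minimal sketch, assuming the control unit issues one (Op_Select, Select) pair per step and that R starts at zero; the operation names and the function run_control_program are hypothetical.

```python
import operator

# Operations the ALU is assumed to offer; Op_Select picks one per step.
ALU_OPS = {
    "add": operator.add,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def run_control_program(control_program, operands):
    """Apply one ALU operation per step to register R and a selected operand."""
    r = 0  # register R, assumed cleared before the first step
    for op_select, select in control_program:
        r = ALU_OPS[op_select](r, operands[select])
    return r


operands = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}  # assumed values
add_all = [("add", name) for name in "ABCDE"]
print(run_control_program(add_all, operands))  # 15
```

The same hardware computes a different function when only the control sequence changes, for example [("add", "A"), ("xor", "B")].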
implemented.
Architecture #7 (Instruction-set Based; Programmable; Sequential) for
Logic Implementation
[Figure: Architecture #7, Memory connected to CPU]
[Figure: CPU block diagram with State Sequencer, Control Generator, Instruction Decoder, Instruction Register and Execution Unit]
Each instruction in the instruction set specifies a soft gate (or virtual gate)
with an appropriate logic function and its connectivity to other ‘soft gates’
(through operand address specification).
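The 'soft gate' view of an instruction can be illustrated with a toy interpreter. In this sketch each instruction names a logic function and three operand addresses, and the operand addresses are what wire the soft gates together. The four-field instruction format, the gate names and the function run_soft_gates are assumptions made for illustration, not a format given in the slides.

```python
# Logic functions available as soft gates (an assumed, minimal instruction set).
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def run_soft_gates(program, memory):
    """Execute instructions of the form (gate, src1, src2, dst) in order."""
    for gate, src1, src2, dst in program:
        # Operand addresses play the role of wires between gates.
        memory[dst] = GATES[gate](memory[src1], memory[src2])
    return memory


# A one-bit full adder expressed as a network of five soft gates.
full_adder = [
    ("XOR", "a", "b", "t1"),
    ("XOR", "t1", "cin", "sum"),
    ("AND", "a", "b", "t2"),
    ("AND", "t1", "cin", "t3"),
    ("OR", "t2", "t3", "cout"),
]

result = run_soft_gates(full_adder, {"a": 1, "b": 1, "cin": 0})
print(result["sum"], result["cout"])  # 0 1
```

Changing the program changes the logic network without touching the hardware that executes it.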
The equivalent logic network of ‘virtual gates’ (‘soft gates’) can be easily
The instruction set (which defines the ‘soft gates’) acts as a hardware-
tion of interconnections amongst ‘soft gates’).
Excellent support for data structuring and program structuring at assembly
language level.
Variable instruction lengths and many different instruction formats greatly
Increased complexity of the control part which occupies a large part of the
neck.
Architecture #8 (Instruction-set Based; Programmable; Sequential) for
Logic Implementation
[Figure: Architecture #8, CPU with separate Instruction Memory and Data Memory]
Architecture #9 (Instruction-set Based; Programmable; Pipelined) for Logic
Implementation
[Figure: Architecture #9, pipelined datapath. Stages: Instruction Fetch; Instruction Decode / Register Fetch; Compute Address / Execute; Memory Access; Write Back. Blocks shown include PC, +4 adder, Instruction Memory, IR, Registers, S-Ex (Imm), A, Data Memory and LMD.]
RISC Architectures
These will drastically reduce the complexity of the control part, thereby
releasing chip area for more resources in the execution unit, including larger
register files.
Load-Store architectures :
Easier pipelining of instruction execution.
A much larger fraction of chip area becomes available for execution unit
resources (e.g. a larger register file, more powerful operational units, more
buses), which can lead to enhanced performance.
Generation 1
F D E F D E F D E
Time
Generation 2
F D E Instruction 1
F D E Instruction 2
F D E Instruction 3
Time
Generation 3
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
Time
D Decode E Execute
Generation 4
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
F D A R E W
Time
F Fetch Instruction R Read Operands
D Decode E Execute
Generation 5
F D A R E E E E W
F D A R E E E E E E W
F D A R E E E E W
F D A E E E E E W
F D A E E E E E W
F D A E E E E E W
F D E E E E E W
F D E E E W
F D E E E E W
F E E E E E W
F E E E E E W
F E E E E E W
Dataflow Model
Time
F Fetch Instruction R Read Operands
D Decode E Execute
How many clock cycles does it take to complete 1 instruction (cycles per
instruction or CPI) ?
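As a rough guide to the CPI question, the sketch below compares an idealised non-pipelined processor with an idealised pipeline. It assumes every stage takes one clock cycle and that there are no stalls or hazards, which real processors do not achieve. The function names are hypothetical.

```python
def total_cycles(n_instructions, n_stages, pipelined):
    """Clock cycles to finish a run of instructions under ideal assumptions."""
    if n_instructions == 0:
        return 0
    if pipelined:
        # Fill the pipeline once, then complete one instruction per cycle.
        return n_stages + (n_instructions - 1)
    # Without overlap, each instruction occupies the machine for all stages.
    return n_stages * n_instructions


def cpi(n_instructions, n_stages, pipelined):
    """Average cycles per instruction for the run."""
    return total_cycles(n_instructions, n_stages, pipelined) / n_instructions


print(cpi(3, 3, pipelined=False))     # 3.0
print(cpi(1000, 6, pipelined=True))   # 1.005
```

For long instruction streams the ideal pipelined CPI approaches 1 regardless of the number of stages.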
What is the maximum clock speed at which the processor can run ?
0.5 MHz (in 1971) to 3.5 GHz (in 2004) (Improvement = 7000 times)
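The quoted improvement factor can be checked directly, and clock speed can be tied to CPI through the standard relation: execution time = instruction count x CPI / clock frequency. That relation is textbook material and is not stated on the slide; the function names below are hypothetical.

```python
def clock_improvement(f_old_hz, f_new_hz):
    """Ratio of the new clock frequency to the old one."""
    return f_new_hz / f_old_hz


def execution_time(n_instructions, cpi, f_clock_hz):
    """Seconds to run a program: instruction count x CPI / clock frequency."""
    return n_instructions * cpi / f_clock_hz


print(clock_improvement(0.5e6, 3.5e9))  # 7000.0
```

A higher clock frequency therefore shortens execution time only if CPI and instruction count do not grow to offset it.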
Besides digital functions, a SoC typically also integrates some analog and/or
mixed signal and/or RF functions on a single chip.
The boundary between what functions must necessarily be done in analog (or
can be better done in analog) and what functions are better done as digital has
been fairly clear and stable for quite some time.
However, only more recently has the boundary between what digital functions
are better done in hardware and what functions are better done in software
been sought to be defined, in view of the speed-power-cost, time-to-market
and system upgradability of the proposed solution.
Memory provides the means of building a soft logic network (represented by
software) as opposed to the hard logic network (represented by hardware).
Each software logic gate receives its configuration as well as inputs from
memory via memory bus and stores its result in the memory via memory bus.
A hardware logic gate by contrast receives its inputs directly from the output of
a preceding hardware logic gate over a short wire.
Thus, there is typically an overhead of four memory transfers per logic
operation when using software logic gates as opposed to hardware logic gates.
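One way to arrive at the figure of four transfers is to count bus accesses for a single two-input software gate: one to fetch its configuration (the instruction), two to read its inputs and one to store its result. That breakdown is an assumption consistent with the text above. The class CountingMemory and the function soft_gate are hypothetical.

```python
class CountingMemory:
    """Memory model that counts every transfer over the memory bus."""

    def __init__(self, contents):
        self.cells = dict(contents)
        self.transfers = 0

    def read(self, addr):
        self.transfers += 1
        return self.cells[addr]

    def write(self, addr, value):
        self.transfers += 1
        self.cells[addr] = value


def soft_gate(mem, pc):
    """Execute one two-input software gate stored at address pc."""
    op, src1, src2, dst = mem.read(pc)   # transfer 1: fetch the configuration
    a = mem.read(src1)                   # transfer 2: first input
    b = mem.read(src2)                   # transfer 3: second input
    result = a & b if op == "AND" else a | b
    mem.write(dst, result)               # transfer 4: store the result


mem = CountingMemory({0: ("AND", "x", "y", "z"), "x": 1, "y": 1})
soft_gate(mem, 0)
print(mem.cells["z"], mem.transfers)  # 1 4
```

A hardware AND gate performs the same operation with no memory transfers at all.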
These memory transfers occur over the memory bus and the [illegible] buses
internal to the memory and are, therefore, very slow as well as power consuming,
owing to the large capacitances associated with memory buses (tens of
[illegible]) and [illegible] buses (several [illegible]).
So, software logic, though very flexible, is both slow and very power consuming.
Besides, there isn’t much concurrency in software logic. Classical von Neumann
CPU architectures of software logic have no concurrency.
Superscalars (with multiple pipelines) have still higher concurrencies (10-15).
However, this is nowhere close to the concurrency in hardware logic, which can
be massive.
For these reasons, software logic provides a very flexible but slow and
power-hungry logic implementation option, whereas hardware logic provides a
totally rigid but fast and low-power logic implementation option.
Software logic is, however, faster to design and less expensive to implement in
many situations.
Hence, one needs to carefully partition one’s system logic into software logic
and hardware logic.
Very often in the past, performance (speed) has been the sole criterion for
deciding what portion of the system logic should be implemented in hardware.
Besides the standard hardware and software options, there is another option
that draws upon the strengths of both hardware logic and software logic to
provide a solution that optimally mixes the benefits of both approaches in the
context of a given application or class of applications: the Application
Specific Instruction Set Processor (ASIP).
ASIP Architectures
Thus, while there is all the flexibility afforded by this approach,
performance may be inadequate and power consumption excessive.
[Figure: ASIP block diagram. Recoverable block labels include host PC, comm. controller, program memory, PC logic, fetched parameter register, instruction register, instruction decoder, address unit, control sequencer, addr. gen., parameter RAM, data RAM, speech sample RAM and output reg.]
Reconfigurable Computing
FPGA blocks of SoCs or on SoC platforms provide a low cost means of
implementing Application Specific Instruction Set Processor (ASIP) ideas and,
indeed, dynamically reconfigurable instruction sets and implementation
architectures, particularly where there are repetitive functions or long-running
loops.
Acknowledgment