2/50
Generic Hardware Architecture (1/2)
[Figure: combinational logic and a state register (D/Q) connected by a feedback path]
4/50
HW/SW Distinction?
[Figure: combinational logic]
5/50
System Models
6/50
FSM for Controller Model
[Figure: data-flow graph computing sqrt(a^2 + b^2) with two square operations, an add, and a sqrt; intermediate values t2, t3, t4]
8/50
Finite-State Machine with Datapath
[Figure: FSMD state diagram; labels: start, S1, output]
9/50
Hierarchical Concurrent FSM
[Figure: hierarchical concurrent FSM with states A and D and substates B, C, E, F, G; transition labels a(P)/c and b]
10/50
Program-State Machine (PSM)
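[Figure: program-state machine with states A and D, substates B and C, and transitions e1 and e2; one state contains the code below]

    int idx, max;
    max = 0;
    for (idx = 0; idx < 20; idx++)
    {
        if (A[idx] > max)
        {
            max = A[idx];
        }
    }
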
Transition-On-Completion, Transition-Immediately
11/50
System Architecture
12/50
Controller Architecture
13/50
Controllers and Scheduling
[Figure: two-state design with states s1 and s2; transition labels –/ x = a + b and –/ y = c + d]
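
The two-state design can be sketched in Python as a controller that performs one addition per state. This is only an illustration: it assumes the transition leaving s1 carries x = a + b and the transition leaving s2 carries y = c + d, and the function name and register dictionary are hypothetical.

```python
def run_two_state_controller(a, b, c, d):
    """Step a hypothetical two-state controller through one schedule.

    Assumption: leaving s1 performs x = a + b and leaving s2 performs
    y = c + d, so each cycle needs only one adder.
    """
    regs = {}
    state = "s1"
    trace = []
    for _cycle in range(2):
        trace.append(state)
        if state == "s1":
            regs["x"] = a + b   # -/ x = a + b
            state = "s2"
        else:
            regs["y"] = c + d   # -/ y = c + d
            state = "s1"
    return regs, trace
```

Calling run_two_state_controller(1, 2, 3, 4) visits s1 and then s2, so the two additions are spread over two cycles rather than done in parallel.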
14/50
Distributed Controllers
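[Figure: distributed controllers M1 (states s1, s2, s3) and M2 (states t1, t2) coupled through signal w; transition labels: i1=0/–, i1=1/–, –/ w = 0, –/ w = 1, w=0/–, w=1/–]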
16/50
Datapath Architecture
17/50
Data Operators
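HDL code:

    if x = '0' then
        reg1 <= a;
    else
        reg1 <= b;
    end if;

[Figure: register-transfer view; a multiplexer controlled by sel passes a (input 0) or b (input 1) to the D input of a register]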
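
A small Python model of this register transfer may help relate the HDL to the hardware. It is a sketch only: the class and method names are hypothetical, and one call to clock() stands for one clock edge.

```python
class MuxRegister:
    """Register whose D input is driven by a 2-to-1 multiplexer."""

    def __init__(self, initial=0):
        self.q = initial

    def clock(self, x, a, b):
        # Combinational part: select a when x is 0, otherwise b.
        d = a if x == 0 else b
        # Sequential part: the register captures d on the clock edge.
        self.q = d
        return self.q
```

The if/else in the HDL maps to the multiplexer, and the assignment to reg1 maps to the register update.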
18/50
Hardware Resource Sharing
[Figure: hardware resource sharing; adders (+), multiplexers (mux), and a register (D Q) operating on inputs a, b, c]
19/50
Logic Pipeline
[Figure: logic pipeline with two control blocks]
21/50
FSMD Architecture
22/50
CISC Architecture
23/50
RISC Architecture
24/50
VLIW Architecture
[Figure: VLIW architecture, including a memory block]
26/50
Co-Design Descriptions
27/50
Hierarchy Description
Structural hierarchy:
[Figure: Processor and Memory connected by a data bus and control lines]
Behavioral hierarchy:
Behavior P is composed of behavior Q and behavior R …
28/50
Behavioral Decomposition
29/50
Concurrency Classifications
Data-driven concurrency:
Operation execution depends only upon the availability of
data; the degree of concurrency is limited by data
dependencies
Pipelined concurrency:
An extension to data-driven concurrency by dividing
operations into groups (stages), which operate on different
data sets concurrently
Control-driven concurrency:
Also referred to as thread-level parallelism; explicit constructs are
used to specify concurrent execution of multiple control tasks
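
To make the data-driven case concrete, the sketch below computes the earliest step at which each operation can fire, given only its data dependencies. The graph (two squares, an add, a sqrt) mirrors the sqrt(a^2 + b^2) example from earlier in these slides. The operation names and the function asap_levels are hypothetical, and the graph is assumed to be acyclic.

```python
def asap_levels(deps):
    """Return the earliest step at which each operation can fire.

    deps maps an operation name to the operations whose results it
    consumes; an empty list means it needs only primary inputs.
    """
    levels = {}

    def level(op):
        if op not in levels:
            preds = deps.get(op, [])
            levels[op] = 1 + max((level(p) for p in preds), default=0)
        return levels[op]

    for op in deps:
        level(op)
    return levels


# Hypothetical operation names for sqrt(a^2 + b^2).
graph = {
    "square_a": [],
    "square_b": [],
    "add": ["square_a", "square_b"],
    "sqrt": ["add"],
}
levels = asap_levels(graph)
```

Operations that share a level have no data dependency between them and can run concurrently. Here only the two squares share a level, which is the sense in which data dependencies limit the degree of concurrency.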
30/50
Communication
31/50
Examples of Communication
[Figure: behaviors B1 and B2 communicating through channel C]

Behavior B1:
    int x;
    ...
    C.send(x);
    ...

Channel C:
    void send(int d) {...}
    int receive(void) {...}

Behavior B2:
    int y;
    ...
    y = C.receive();
    ...
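
The send/receive pattern can be tried out in Python with two threads and a blocking queue. This is a sketch under assumptions: the slides do not say whether the channel is buffered, so the one-slot queue is a choice made here, and run_b1_b2 is a hypothetical helper.

```python
import queue
import threading


class Channel:
    """Blocking point-to-point channel with send/receive methods."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)  # assumed: a single-slot buffer

    def send(self, d):
        self._q.put(d)         # blocks while the slot is full

    def receive(self):
        return self._q.get()   # blocks until a value is available


def run_b1_b2(x):
    c = Channel()
    result = {}

    def b1():
        c.send(x)                   # C.send(x);

    def b2():
        result["y"] = c.receive()   # y = C.receive();

    threads = [threading.Thread(target=b2), threading.Thread(target=b1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["y"]
```

B2 is started first on purpose: receive() blocks until B1 has sent, so the result does not depend on thread start order.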
32/50
Timing
34/50
Control-dependent Synchronization
behavior X
begin
    Q();
    fork A(); B(); C(); join;
    R();
end behavior X;

[Figure: Q runs first, then A, B, and C run concurrently; the join is the synchronization point before R]
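
The fork/join construct corresponds closely to starting and joining threads. A minimal Python sketch, assuming each behavior simply records its name (the log and helper names are hypothetical):

```python
import threading


def behavior_x():
    log = []
    lock = threading.Lock()

    def record(name):
        with lock:
            log.append(name)

    record("Q")                                  # Q();
    workers = [threading.Thread(target=record, args=(n,)) for n in "ABC"]
    for w in workers:                            # fork
        w.start()
    for w in workers:                            # join: synchronization point
        w.join()
    record("R")                                  # R();
    return log
```

The order of A, B, and C in the log can vary from run to run, but Q always comes first and R always comes last, because R starts only after all three threads have been joined.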
35/50
Data-dependent Synchronization
[Figure: three forms of data-dependent synchronization between concurrent behaviors A (A1, A2) and B (B1, B2): synchronization by common event (e), synchronization by status detection (entered A2), and synchronization by common variable (x:=0, x:=1, x=1)]
36/50
Co-design Methodology
[Figure: co-design flow; labels: compilation, high-level synthesis, interface synthesis, implementation model, manufacturing]
37/50
System Specification
[Figure: control-flow view of the specification; behaviors B0, B3, B4, B5, B6, B7 (atomic behaviors), with B6 and B4 coupled through variable shared and event sync]

B6()
{
    int local;
    ...
    shared = local+1;
    signal(sync);
}

B4()
{
    int local;
    wait(sync);
    local = shared-1;
    ...
}
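
The wait/signal pair between B6 and B4 can be mimicked with a Python threading.Event. This sketch is not the deck's notation: it uses the form local = shared-1 that appears in the later system models, and the state dictionary and function name are hypothetical.

```python
import threading


def run_b6_b4(b6_local):
    sync = threading.Event()
    state = {"shared": None, "b4_local": None}

    def b6():
        state["shared"] = b6_local + 1            # shared = local+1;
        sync.set()                                # signal(sync);

    def b4():
        sync.wait()                               # wait(sync);
        state["b4_local"] = state["shared"] - 1   # local = shared-1;

    t4 = threading.Thread(target=b4)
    t6 = threading.Thread(target=b6)
    t4.start()
    t6.start()
    t4.join()
    t6.join()
    return state
```

Because B4 blocks on the event, it reads shared only after B6 has written it, whichever thread starts first.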
38/50
Allocation and Partitioning
39/50
System Model After A & P
[Figure: system model after allocation and partitioning; labels: B3, B4, B6, B7, B4_ctrl, sync, B4_start, B4_done]

B4()
{
    int local;
    wait(B4_start);
    wait(sync);
    local = shared-1;
    ...
    signal(B4_done);
}

B6()
{
    int local;
    ...
    shared = local+1;
    signal(sync);
}
40/50
Allocation & Partition Issues (1/2)
41/50
Allocation & Partition Issues (2/2)
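[Figure: video encoder block diagram at macroblock level; input: video data; blocks: ME/MC, subtractor (-), DCT, Quantize, VLC to bitstream, inverse quantize (Quantize^-1), IDCT, adder (+), reference frame, MC]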
42/50
Example of Partitioning
[Figure: sequential behaviors vs. parallel behaviors; label f3()]

[Figure: processes P1, P2, and P3 with data d1 and d2; processors M1 and M2]

        M1   M2
    P1   5    5
    P2   5    6
    P3   –    5
44/50
First Design
[Figure: schedule of the first design over a time axis from 5 to 20; M1 runs P1 and P2, M2 runs P3, the network carries d2]
Time = 19
45/50
Second Design
[Figure: schedule of the second design over a time axis from 5 to 20; M1 runs P1, M2 runs P2 and P3, the network carries d1]
Time = 18
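
The two designs can be compared with a small makespan calculation. The sketch below rests on several assumptions that the extracted text does not confirm: the table entries are execution times, P3 consumes d1 from P1 and d2 from P2, data that stays on one processor costs nothing to move, and network transfers do not contend with each other. The transfer times d1 = 8 and d2 = 4 are not given anywhere in the slides. They are placeholder values chosen only because they are consistent with the stated totals of 19 and 18 under this simple model.

```python
def makespan(mapping, order, exec_time, edges, transfer_time):
    """Finish time of the last process for a given mapping.

    mapping:       process -> processor
    order:         processes in a dependency-respecting order; processes
                   sharing a processor run back to back in this order
    exec_time:     (process, processor) -> execution time
    edges:         list of (producer, consumer, data_name)
    transfer_time: data_name -> time to cross the network
    """
    finish = {}
    machine_free = {}
    for p in order:
        m = mapping[p]
        ready = machine_free.get(m, 0)
        for src, dst, data in edges:
            if dst == p:
                # Only data crossing between processors pays the delay.
                delay = transfer_time[data] if mapping[src] != m else 0
                ready = max(ready, finish[src] + delay)
        finish[p] = ready + exec_time[(p, m)]
        machine_free[m] = finish[p]
    return max(finish.values())


exec_time = {("P1", "M1"): 5, ("P1", "M2"): 5,
             ("P2", "M1"): 5, ("P2", "M2"): 6,
             ("P3", "M2"): 5}
edges = [("P1", "P3", "d1"), ("P2", "P3", "d2")]   # assumed dependencies
transfer_time = {"d1": 8, "d2": 4}                 # assumed, not from the slides

first = makespan({"P1": "M1", "P2": "M1", "P3": "M2"},
                 ["P1", "P2", "P3"], exec_time, edges, transfer_time)
second = makespan({"P1": "M1", "P2": "M2", "P3": "M2"},
                  ["P1", "P2", "P3"], exec_time, edges, transfer_time)
```

With these assumed values, moving P2 next to P3 removes the d2 transfer from the critical path, which is why the second mapping finishes earlier even though P2 runs more slowly on M2.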
46/50
Static vs. Dynamic Partitioning
47/50
Scheduling
48/50
System Model after Scheduling
[Figure: system model after scheduling on PE0 and PE1; atomic behaviors B1, B3, B4, B6, B7; events B6_start, sync, B3_start]

B1()
{
    signal(B6_start);
    ...
}

B3()
{
    wait(B3_start);
    ...
}

B7()
{
    stmt;
    ...
}

B4()
{
    int local;
    wait(sync);
    local = shared-1;
    ...
    signal(B3_start);
}

B6()
{
    int local;
    wait(B6_start);
    ...
    shared = local+1;
    signal(sync);
}
49/50
Discussions
50/50