Professional Documents
Culture Documents
0 1 1 ex = , e y = , eh = 1 1 0
The two conditions to be satisfied for the scheduling and the projection vector to be permissible are:
S T d 0, S T e 0,
for all dependence vectors e. Verifying these conditions we get the following for each of the cases: (i) S = [1 0]
T
d = [1
0 T 0] p = 1 pT e 1 1 0
ex
STe 0
1 1
ey eh
In this case STd =1
All S T e 0
ex
pT e 1
1 0
STe 1
2 1
ey eh
d = [1
0 T 0] p = 1
1 0 S T d = [1 2] = 1 , S T e x = [1 2] = 2 is < 0 0 1 1 1 S T e y = [1 2] = 1 is < 0 , S T eh = [1 2] =1 1 0 0 1 Since S T e < 0 for e= and e= this case is not permissible. Note that these edges 1 1 cannot be simply reversed to make S T e 0 since the functionality of the nodes in the dependence graph are unknown and it is not clear if there are precedence constraints along the edges. (b) The systolic architectures for the permissible designs (i) and (iii) are given below.
h0 D .........X3X2X1X0 .........Y3Y2Y1Y0 D P0 D P1 D P2 h1 D h2 D
Figure1.
S = [1
0]
d = [1
h1 D
0]
h0 D .........X3X2X1X0 .........Y3Y2Y1Y0 2D
Figure2.
h2 D D
D P0
D P1 2D
P2 2D
S = [1
1]
d = [1 0]
9.
x0 x1 x2 x3 x4 x5
j
w4 w3 w2 w1 y4 y5 y6 w0
(0,0)
(0,1)
s1 S = , S T e + y x Tx s 2 From the reduced graph we see that H H : s1 + H H 0 H Y : Y H 0 X X : s2 + X X 0 X Y : Y X 0 Y Y : s1 s2 + Y Y 4 If we use linear scheduling, (i.e. H = X = Y = 0 ) and choose ST=(4 0) then it will maximize the HUE (hardware utilization time)
(b)
1 If d = , S T = (4 0 pT e 1 -1 0
0 0) p = HUE = 1 / 4 1
ex ey eh
STe 0
4 4
w1 4D 4D P1
w2 4D 4D P2
w3 4D 4D P3
w4 4D
Y :Y0 Y1 Y2 Y3........
P4
0]
d = [1
0]
(c)
1 If d = , S T = (4 1
1 0) p = HUE = 1 / 4 1
ex ey eh
pT e 1 0 1
STe 0 4 4
Y 4D 4D P1
Y 4D 4D P2
Y 4D 4D P3
Y 4D
P0 X
P4
Figure 4. S = 4
0]
d = [1
1]
10.
i
The dependence vectors are given by: 1 2 0 ex = , e y = , eh = 0 0 1 (a) Since there is a delay on each edge, we can choose S = [1 1]
T
S = [1 1]
ex ey eh
2D Y X h D D
2D
2D
2D
2D
D h D h D
D h D
T T
D h D
d = [0 1] 1 1 1 (b) If d = , S T = (1 1) p = or p = choosing p= 1 1 1
Figure 5. S = 1
1]
1 1
ex ey eh
pT e 1 2 -1
STe 1
2 1
2D
2D D D
2D D D
T
2D D D
T
2D
1]
d = [1 1]
15. Y(n) is defined by the following equation: y (n) = ax(n) + bx(n 2) + cx(n 4) + dx(n 6)
x0
x1
x2
x3
x4
x5
x6
x7
d c b a
y6
y7
y8
In the given DG the inputs move along the j axis, the weights along the i axis, and the outputs move along the diagonal. Note that the dependence vectors are given by: 0 2 1 ex = , e y = , eh = 1 1 0 (a) For such a filter the RIA is given by
H (i, j ) = H (i 1, j ) Y (i, j ) = Y (i 2, j + 1) + X (i, j ) H (i, j ) X (i, j ) = X (i, j 1)
(1 0) (0,0)
(2, -1)
(0,0)
(0, 1)
The scheduling is done by applying the following formulae s1 S = , S T e + y x Tx s 2 From the reduced graph we see that
H H : s1 + H H 0 H Y : Y H 0 X X : s2 + X X 0 X Y : Y X 0 Y Y : 2( s1 ) s2 + Y Y 2
If we use linear scheduling, (i.e. H = X = Y = 0 ), some possible solutions are given by: 2 1 2 S = Or S = Or S = . 2 0 1 We choose ST= (1 0) since we need to reduce the number of delays in the systolic architecture. 1 0 (b) If d = , S T = (1 0) p = 0 1 (c) Then we have:
pT e 1
ex ey eh
STe 0
2 1
-1 0
a D Y X 2D P0 2D
b D 2D P1
c D 2D P2
d D
P3
Figure 7. S = 1
0]
d = [1 0]
(d) For the above systolic architecture the Hardware Utilization Efficiency is
1 = =1 STd 1 1