
Birla Institute of Technology & Science, Pilani

Work Integrated Learning Programmes Division


First Semester 2023-2024

Comprehensive Examination
(EC-3 Regular)

Course No : ESZG626/MELZG651/SEZG626/SSZG626
Course Title : HARDWARE SOFTWARE CO-DESIGN
Nature of Exam : Open Book
Weightage : 40%
No. of Pages : 5
Duration : 3 Hours
No. of Questions : 4
Date of Exam :
Note to Students:
1. Please follow all the Instructions to Candidates given on the cover page of the answer book.
2. All parts of a question should be answered consecutively. Each answer should start from a fresh page.
3. Assumptions made if any, should be stated clearly at the beginning of your answer.

Q1. Modeling, Compiler, Analysis

A. Write C code to find the sum of 8 numbers. These numbers are stored in memory (M).
B. Using compiler methods, convert the C program into assembly language instructions for the
above simple (trivial) instruction set processor. Based on the problems faced, suggest newer
instructions which would simplify the problem. Use them to arrive at the final solution.
C. If each instruction takes 1 clock cycle, enumerate the number of clock cycles taken for the
application of finding the average of 8 numbers.
…. (3 + 4 + 3) = 10 marks
Q2. Hardware / Software Partitioning
Algorithm 1 shows the pseudo code of a greedy algorithm for HW/SW partitioning. The algorithm
starts with a partition where all objects are realized in hardware. Then, objects are migrated to
software as long as the performance requirement is satisfied (function SatisfiesPerformance)
and the cost of the new partitioning is lower (function f). If an object is migrated, the algorithm also
tries to migrate all of its successor nodes (function Successors).

Algorithm 1 Pseudo code for a greedy HW/SW partitioner

P = {{}, O}    // all objects in HW
procedure Partitioning(P)
    repeat
        Pold = P
        for all oi ∈ HW do
            AttemptMove(P, oi)
        end for
    until P == Pold
end procedure

procedure AttemptMove(P, ox)
    if SatisfiesPerformance(Move(P, ox)) AND (f(Move(P, ox)) < f(P)) then
        P = Move(P, ox)
        for all oy ∈ Successors(ox) do
            AttemptMove(P, oy)
        end for
    end if
end procedure

Apply the algorithm to the sequence graph shown in Fig. 1. The function SatisfiesPerformance(P)
should return TRUE if P satisfies the latency bound L = 7. To determine the latency of a
partitioning, you have to construct a valid schedule. The execution times of the start and end nodes of
the sequence graph are 0; all other node execution times are given in Fig. 1, split into HW (dHW)
and SW (dSW). For a communication between HW and SW, a delay of 0.5 per edge has to be
accounted for. For HW nodes there are no resource constraints, i.e., all ready nodes can be executed
in parallel. The SW nodes have to share one processor.

The function f determines the cost. For a SW node the cost is 0, and for a HW node the cost is 1.
….10 marks
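
The greedy procedure of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch: Fig. 1 is not reproduced here, so the graph, the execution times and the names (SUCC, D_HW, D_SW, ORDER, latency, cost, partition) are hypothetical and are not the data of the question. The schedule is built by a simple list scheduler that visits nodes in a fixed topological order, which gives one valid schedule but not necessarily the shortest one. The start and end nodes (execution time 0) are left out for brevity.

```python
# Hypothetical 4-node sequence graph (NOT the graph of Fig. 1).
SUCC = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}  # node -> successors
D_HW = {"a": 1, "b": 1, "c": 1, "d": 1}  # assumed HW execution times
D_SW = {"a": 2, "b": 3, "c": 2, "d": 2}  # assumed SW execution times
ORDER = ["a", "b", "c", "d"]             # a topological order of the graph
COMM = 0.5                               # delay per HW/SW edge (from the question)
L_BOUND = 7                              # latency bound L (from the question)


def latency(sw):
    """Latency of one valid schedule for the partition with SW set `sw`."""
    finish = {}
    cpu_free = 0.0  # time at which the single processor becomes free
    for n in ORDER:
        ready = 0.0
        for p in ORDER:
            if n in SUCC[p]:
                # An edge that crosses the HW/SW boundary costs COMM.
                delay = COMM if (p in sw) != (n in sw) else 0.0
                ready = max(ready, finish[p] + delay)
        if n in sw:
            # SW nodes share one processor and run one after another.
            start = max(ready, cpu_free)
            finish[n] = start + D_SW[n]
            cpu_free = finish[n]
        else:
            # HW nodes have no resource constraint.
            finish[n] = ready + D_HW[n]
    return max(finish.values())


def cost(sw):
    """Cost f: 1 per HW node, 0 per SW node."""
    return len(ORDER) - len(sw)


def satisfies_performance(sw):
    return latency(sw) <= L_BOUND


def attempt_move(sw, node):
    """Try to move `node` to SW; on success, try its successors too."""
    if node in sw:
        return sw  # already in SW, so the cost could not decrease
    candidate = sw | {node}
    if satisfies_performance(candidate) and cost(candidate) < cost(sw):
        sw = candidate
        for succ in SUCC[node]:
            sw = attempt_move(sw, succ)
    return sw


def partition():
    sw = frozenset()  # start with every object in HW
    while True:
        old = sw
        for n in ORDER:
            if n not in sw:
                sw = attempt_move(sw, n)
        if sw == old:
            return sw
```

With the hypothetical data above, partition() moves a, b and d to software (latency 7, cost 1), while moving c as well would raise the latency to 9 and is rejected. The result depends on the order in which nodes are visited, which is the usual limitation of a greedy partitioner.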

Q3. Prototyping and Emulation


A. Design a 3-color traffic light for a single junction (no crossroads, only a single path) with
Red, Green and Orange lights. Each color stays ON for 15 sec (assume a positive
edge-triggered clock).
i. The design should include: the state diagram, next-state table, flip-flop
transition table, logic minimization through K-map or other known methods,
logic expressions for the flip-flop inputs, and the counter implementation.
B. How many EZ-CLBs are required for the design?
C. What clock frequency would you provide to the EZ-CLB for the traffic light to function?
Assume: (i) the intrinsic delay of the 4-input LUT = 5 ns and the FF CLK → Q delay = 1 ns; (ii) there is
zero interconnect or IO delay.
…. (4 + 4 + 2) = 10 marks
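
One possible way to check a design for this question is sketched below in Python. It is a sketch under stated assumptions, not a model answer: the state encoding, the light order (Red, then Green, then Orange), the use of D flip-flops, the choice of one state change per clock edge, and a single LUT level between flip-flops with zero setup time are all assumptions that the question does not fix.

```python
# Assumed encoding (Q1 Q0): 00 = Red, 01 = Green, 10 = Orange, 11 unused.
NAMES = {(0, 0): "Red", (0, 1): "Green", (1, 0): "Orange"}


def next_state(q1, q0):
    """D flip-flop inputs after minimization, with state 11 as don't-care."""
    d1 = q0                      # D1 = Q0
    d0 = int(not (q1 or q0))     # D0 = NOT(Q1 OR Q0)
    return d1, d0


def run(cycles):
    """Light shown in each clock cycle, starting from Red."""
    state = (0, 0)
    shown = []
    for _ in range(cycles):
        shown.append(NAMES[state])
        state = next_state(*state)
    return shown


# Timing figures from the question, in nanoseconds.
T_LUT_NS = 5       # intrinsic delay of the 4-input LUT
T_CLK_Q_NS = 1     # flip-flop CLK -> Q delay
T_MIN_NS = T_LUT_NS + T_CLK_Q_NS   # shortest usable clock period
F_MAX_MHZ = 1000 / T_MIN_NS        # upper limit on the clock frequency

# If the state advances on every clock edge, 15 s per color needs this clock.
ON_TIME_S = 15
F_LIGHT_HZ = 1 / ON_TIME_S
```

Under these assumptions the logic limits the clock to roughly 166.7 MHz, far above the 1/15 Hz needed when each clock edge advances the light by one color. A faster clock would need an additional counter to hold each state for 15 sec.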

Q4. Analysis for Binary Partitioning


(a) For the graph below with tasks T1, T2, T3, T4 and T5, the HW/SW partitioning for these tasks has
been provided in the table. Enumerate (tabulate) the total number of ways in which the partitioning can
be done. Assume any number of components can be used any number of times.
….. 2 Marks
(b) If we were to use only the HW components and one SW processing element for T5 only, in how many
ways can it be done? Give one configuration for best timing and one configuration for best area.
….. 2 Marks
(c) If we were to use only the SW components and no HW components, in how many ways can it be done?
Give one configuration for best timing and one configuration for best area.
….. 2 Marks
(d) If we were to use a mix of SW components and HW components, give one configuration for best timing
and one configuration for best area.
….. 2 Marks
(e) Plot the timing (Y axis) vs. area (X axis) chart for the results of (b), (c) and (d) above. Provide your analysis
of the results and your conclusions from the exercise.
….. 2 Marks
…. (2 + 2 + 2 + 2 + 2) = 10 marks

***********