Strong points:
The authors show that Value Prediction (VP) can be leveraged to reduce the
aggressiveness of the out-of-order engine by executing many instructions in-order, either
in the frontend or at the commit stage. This reduces the number of physical register
file (PRF) ports required by the out-of-order engine, which in turn yields significant
silicon area savings, significant power savings in the scheduler and the register file, and
a shorter register file access time.
Since prediction and validation are done in-order, banking the register file can greatly
decrease the number of PRF ports required by the value prediction hardware. In this way,
the authors obtain performance on par with a wide processor using VP, while using a
smaller out-of-order engine and a PRF of similar complexity to that of a processor without VP.
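
The port-count argument can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes that registers are allocated round-robin, so that consecutive in-order instructions fall into different banks, and the function name `ports_per_bank` and the widths used are hypothetical.

```python
import math


def ports_per_bank(group_width: int, n_banks: int) -> int:
    """Ports each bank needs when `group_width` consecutive in-order
    instructions are spread round-robin over `n_banks` banks.

    Assumption (not from the paper): each instruction needs one port
    access and bank assignment is strictly round-robin.
    """
    if group_width < 1 or n_banks < 1:
        raise ValueError("group_width and n_banks must be positive")
    # The busiest bank receives ceil(group_width / n_banks) accesses.
    return math.ceil(group_width / n_banks)


if __name__ == "__main__":
    for banks in (1, 2, 4, 8):
        print(f"{banks} bank(s): {ports_per_bank(8, banks)} port(s) per bank")
```

Under these assumptions, an 8-wide in-order group needs 8 ports on a monolithic file but only 2 per bank with 4 banks. An out-of-order access stream offers no such guarantee, which is why the in-order property matters to the argument.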
Experimental results indicate that the proposed technique, EOLE, achieves a speedup of 5%
or more on a few benchmarks and a 10% performance improvement on one benchmark.
The authors have extensively analyzed the impact of instruction queue size and issue width
on their proposed architecture. They have also provided an in-depth analysis of the
hardware complexity and proposed a few solutions for mitigating the hardware cost.
Weak points:
The authors claim that the register file in the OoO engine would be less likely to become a
temperature hotspot than in a conventional design. This is not always true. The claim rests
on the proposal's distributed register file organization, with one file servicing
reads from the OoO engine and the other servicing reads from the late-execution or
validation and training stage. However, temperature hotspot formation also depends on
how often the register file is accessed. A very frequent (and unusual) access pattern can
also lead to a local temperature increase. Without experimental results from temperature
simulations, the authors' claim cannot be substantiated.
The authors have not provided any experimental results on energy or power savings,
although they claim such savings when discussing hardware complexity. Performance
benefits alone are not convincing, as other factors (area, power, etc.) may be
severely affected, leading to lower overall benefits.
Points of disagreement:
If instructions were scheduled so well that all stall cycles from RAW dependencies were
removed, Value Prediction would have no benefit at all, because the actual
results would always be computed before the predicted value became useful. Admittedly,
perfect scheduling is not possible in current processors because of their complex pipelines
and memory hierarchies. Thus, an analysis of the efficacy of instruction scheduling and of
the amount of value prediction actually required is needed to establish the impact of
Value Prediction in the proposed architecture.
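
The point can be made concrete with a toy dataflow model. This is a minimal sketch under strong assumptions that are mine, not the paper's: unlimited issue width, every prediction correct (no misprediction penalty), and a consumer of a predicted producer never waits for it. The names `finish_times` and `total_cycles` are hypothetical.

```python
def finish_times(latencies, deps, predicted=frozenset()):
    """Completion cycle of each instruction in the dataflow limit.

    latencies[i] is the execution latency of instruction i.
    deps[i] lists the producers of instruction i (indices below i).
    Producers in `predicted` are assumed to be value-predicted
    correctly, so their consumers do not wait for them.
    """
    done = []
    for i, latency in enumerate(latencies):
        waited_on = [done[p] for p in deps[i] if p not in predicted]
        start = max(waited_on, default=0)
        done.append(start + latency)
    return done


def total_cycles(latencies, deps, predicted=frozenset()):
    """Cycle at which the last instruction completes."""
    return max(finish_times(latencies, deps, predicted))


if __name__ == "__main__":
    lat = [3, 3, 3, 3]
    chain = [[], [0], [1], [2]]      # serial RAW chain
    independent = [[], [], [], []]   # no RAW dependencies to stall on

    print(total_cycles(lat, chain))                          # no VP
    print(total_cycles(lat, chain, predicted={0, 1, 2}))     # with VP
    print(total_cycles(lat, independent))                    # no VP
    print(total_cycles(lat, independent, predicted={0, 1, 2}))
```

In this model the serial chain drops from 12 cycles to 3 when its producers are predicted, while the stall-free stream takes 3 cycles either way. The benefit of VP therefore depends entirely on how many RAW stalls the scheduler leaves behind, which is the quantity the requested analysis would need to measure.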