Professional Documents
Culture Documents
9/22/00
Distributed control structures Implicit renaming of registers (dispatched pointers) WAR and WAW hazards eliminated by register renaming Results broadcast to all reservation stations for RAW
Store Buffers
Add1 Add2 Add3 Mult1 Mult2
FP adders
Reservation Stations
To Mem FP multipliers
Reservation stations
Serve as destinations for information surrogate registers Serve as temporary holding places for register values Serve as distributed scheduling points for information
j
0 F0 0 0 F0 0
k
R1 F2 R1 R1 F2 R1 Vj
Busy Addr
Yes Yes No Yes Yes No 80 72 80 72
Fu
Mult1 Mult2
Reservation Stations:
Name Busy Op Add1 No Add2 No Add3 No Mult1 Yes Multd Mult2 Yes Multd R1 72
S2 Qj
F0 F4 F4 R1 R1
0 F0 0 R1 Loop
R1 F2 R1 #8
Clock
9
F0 F2
Fu Load2
F4 F6 F8
Mult2
F10 F12
...
F30
Dataflow graph constructed completely in hardware CS252/Kubiatowicz 9/22/00 Renaming detaches early iterations from registers Lec 7.6
Data flows along arcs in Tokens. When two tokens arrive at compute box, box fires and produces new token. Split operations produce copies of tokens
9/22/00
/
X(0) Y
* +
X
All instructions before that have committed their state No following instructions (including the interrupting instruction) have modified any state.
Need way to resynchronize execution with instruction stream (I.e. with issue-order)
Easiest way is with in-order completion (i.e. reorder buffer) Other Techniques (Smith paper): Future File, History Buffer
CS252/Kubiatowicz Lec 7.8
9/22/00
Reorder Buffer
Result
Valid
FP Regs
Reorder Table
Res Stations
Res Stations
FP Adder
FP Adder
Newest
Reorder Buffer
F0 LD F0,10(R2) N
ROB1
Oldest
Registers
Dest Dest Reservation Stations
FP adders
9/22/00
Newest
Reorder Buffer
F10 F0 ADDD F10,F4,F0 LD F0,10(R2) N N
ROB2 ROB1
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 Dest Reservation Stations
FP adders
9/22/00
Newest
Reorder Buffer
F2 F10 F0
N N N
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 Dest 3 DIVD ROB2,R(F6) Reservation Stations
FP adders
9/22/00
Reorder Buffer
F0 F4 -F2 F10 F0
ADDD F0,F4,F6 LD F4,0(R3) BNE F2,<> DIVD F2,F10,F6 ADDD F10,F4,F0 LD F0,10(R2)
Newest
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 6 ADDD ROB5, R(F6) Dest 3 DIVD ROB2,R(F6)
FP adders
9/22/00
Reservation Stations
FP multipliers
Reorder Buffer
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 6 ADDD ROB5, R(F6) Dest 3 DIVD ROB2,R(F6)
FP adders
9/22/00
Reservation Stations
FP multipliers
Reorder Buffer
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 6 ADDD M[10],R(F6) Dest 3 DIVD ROB2,R(F6)
FP adders
9/22/00
Reservation Stations
Reorder Buffer
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 Dest 3 DIVD ROB2,R(F6) Reservation Stations
FP adders
9/22/00
Reorder Buffer
What about memory hazards???
Oldest
Registers
Dest 2 ADDD R(F4),ROB1 Dest 3 DIVD ROB2,R(F6) Reservation Stations
FP adders
9/22/00
Need buffer to keep track of all outstanding stores to memory, in program order.
Keep track of address (when becomes available) and value (when becomes available) FIFO ordering: will retire stores from this buffer in program order
When issuing a load, record current head of store queue (know which stores are ahead of you). When have address for load, check store queue:
If any store prior to load is waiting for its address, stall load. If load address matches earlier store address (associative lookup), then we have a memory-induced RAW hazard: store value available return value store value not available return ROB number of source Otherwise, send out request to memory
Actual stores commit in order, so no worry about WAR/WAW hazards through memory.
9/22/00 CS252/Kubiatowicz Lec 7.20
Memory Disambiguation:
FP Op Queue
Done?
ROB7 ROB6 ROB5
Newest
Reorder Buffer
LD ST LD ST
N N N Y
Oldest
Registers
Dest Dest Reservation Stations
FP adders
9/22/00
FP multipliers
Technique for both precise interrupts/exceptions and speculation: in-order completion or commit
This is why reorder buffers in all new processors
9/22/00 CS252/Kubiatowicz Lec 7.22
Administrative
Solutions for Prereq Quiz up on web site
Everyone who was going to take it has.
9/22/00
9/22/00
FP Mult FP Mult
Registers
SCOREBOARD
Rename Table
9/22/00
Memory
Executionoperate on operands
The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard
Busy
No No No No No
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
F4
P4
F6
P6
F8 F10 F12
P8 P10 P12
...
F30
P30
Renamed Scoreboard 1
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
Yes No No No No
Op
Load
dest Fi
P32
S1 Fj
S2 Fk
R2
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Yes
F4
P4
F6
P32
F8 F10 F12
P8 P10 P12
...
F30
P30
9/22/00
Each instruction allocates free register Similar to single-assignment compiler transformation CS252/Kubiatowicz
Lec 7.29
Renamed Scoreboard 2
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
Yes Yes No No No
Op
Load Load
dest Fi
P32 P34
S1 Fj
S2 Fk
R2 R3
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Yes Yes
F4
P4
F6
P32
F8 F10 F12
P8 P10 P12
...
F30
P30
9/22/00
Renamed Scoreboard 3
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
Yes Yes Yes No No
Op
Load Load Multd
dest Fi
P32 P34 P36
S1 Fj
S2 Fk
R2 R3 P4
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Yes Yes Yes
P34
Int2
No
F4
P4
F6
P32
F8 F10 F12
P8 P10 P12
...
F30
P30
9/22/00
Renamed Scoreboard 4
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No Yes Yes Yes No
Op
Load Multd Sub
dest Fi
P34 P36 P38
S1 Fj
S2 Fk
R3 P4 P34
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Yes Yes No
P34 P32
Int2 Int2
No Yes
F4
P4
F6
P32
F8 F10 F12
P38 P10 P12
...
F30
P30
9/22/00
Renamed Scoreboard 5
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P32
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 6
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P32
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 7
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P32
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 8
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P32
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 9
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Multd Divd
P36 P40
P34 P36
P4 P32 Mult1
Yes No
Yes Yes
F4
P4
F6
P32
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 10
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
Notice that P32 not listed in Rename Table Still live. Must not be reallocated by accident CS252/Kubiatowicz 9/22/00
Lec 7.38
Renamed Scoreboard 11
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 12
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 13
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes Yes Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
P4 P34 P32
Mult1
Yes Yes No
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 14
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Multd Divd
P36 P40
P34 P36
P4 P32 Mult1
Yes No
Yes Yes
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 15
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Multd Divd
P36 P40
P34 P36
P4 P32 Mult1
Yes No
Yes Yes
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 16
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No Yes No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Multd Divd
P36 P40
P34 P36
P4 P32 Mult1
Yes No
Yes Yes
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 17
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No No No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Divd
P40
P36
P32
Mult1
Yes
Yes
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Renamed Scoreboard 18
Instruction status:
Instruction LD F6 LD F2 MULTD F0 SUBD F8 DIVD F10 ADDD F6 j 34+ 45+ F2 F6 F0 F8 k R2 R3 F4 F2 F6 F2
Busy
No No No No Yes
Op
dest Fi
S1 Fj
S2 Fk
FU Qj
FU Qk
Fj? Rj
Fk? Rk
Divd
P40
P36
P32
Mult1
Yes
Yes
F4
P4
F6
P42
F8 F10 F12
P38 P40 P12
...
F30
P30
9/22/00
Two Questions:
How do we manage the free list? How does Explicit Register Renaming mix with Precise Interupts?
9/22/00 CS252/Kubiatowicz Lec 7.47
P60 P62
Oldest
Physical register file larger than ISA register file On issue, each instruction that modifies a register is allocated new physical register from freelist
9/22/00 CS252/Kubiatowicz Lec 7.48
P60 P62
F0 P0 LD P32,10(R2) N
Oldest
Note that physical register P0 is dead (or not live) past the point of this load.
When we go to commit the load, we free up
CS252/Kubiatowicz Lec 7.49
9/22/00
P60 P62
F10 P10 ADDD P34,P4,P32 N F0 P0 LD P32,10(R2) N
Oldest
9/22/00
P60 P62
Oldest
P32 P36 P4 F6 F8 P34 P12 P14 P16 P18 P20 P22 P24 p26 P28 P30 P38 P40 P44 P48
9/22/00
P60 P62
P0 P10
P32 P36 P4 F6 F8 P34 P12 P14 P16 P18 P20 P22 P24 p26 P28 P30 P38 P40 P44 P48
9/22/00
P60 P62
P60 P62
Oldest
P60 P62
Summary #1
Reorder Buffer
In-order issue, Out-of-order execution, In-order commit Holds results until they can be commited in order Serves as source of info until instructions committed Provides support for precise exceptions/Speculation: simply throw out instructions later than excepted instruction.
Memory Disambiguation:
Tracking of RAW hazards through memory Keep program-order queue of stores When have address for load, check store queue: If any store prior to load is waiting for its address, stall load. If load address matches earlier store address (associative lookup), then we have a memory-induced RAW hazard: Otherwise, send out request to memory
9/22/00 CS252/Kubiatowicz Lec 7.54
Summary #2
Explicit Renaming: more physical registers than needed by ISA.
Separates renaming from scheduling Opens up lots of options for resolving RAW hazards Rename table: tracks current association between architectural registers and physical registers Potentially complicated rename table management
9/22/00