Lecture 5: Lecture 5: Superblocks Superblocks

COS 598C COS 598C – – Advanced Compilers Advanced Compilers
Prof. David August
Department of Computer Science
Princeton University

Prof. David August COS 598C - Advanced Compilers
Regions Regions
- Region: A collection of operations that are treated as a
single unit by the compiler
- Examples
- Basic block
- Procedure
- Body of a loop
- Properties
- Connected subgraph of operations
- Control flow is the key parameter that defines regions
- Hierarchically organized
- Problem
- Basic blocks are too small (3-5 operations)
- Hard to extract sufficient parallelism
- Procedure control flow too complex for many compiler xforms
- Plus only parts of a procedure are important (90/10 rule)

Prof. David August COS 598C - Advanced Compilers
Regions (2) Regions (2)
- Want
- !ntermediate sized regions with simple control flow
- Bigger basic blocks would be ideal !!
- Separate important code from less important
- Optimize frequently executed code at the expense of the rest
- Solution
- Define new region types that consist of multiple BBs
- Profile information used in the identification
- Sequential control flow (sorta)
- Pretend the regions are basic blocks

Prof. David August COS 598C - Advanced Compilers
Region Type 1 Region Type 1 – – Trace Trace
- Trace - Linear collection of basic
blocks that tend to execute in
sequence
- ¨Likely control flow path"
- Acyclic (outer backedge ok)
- Side entrance - branch into the
middle of a trace
- Side exit - branch out of the
middle of a trace
- Compilation strategy
- Compile assuming path occurs
100¾ of the time
- Patch up side entrances and
exits afterwards
- Notivated by scheduling (i.e.,
trace scheduling)
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10

Prof. David August COS 598C - Advanced Compilers
Linearizing Linearizing a Trace a Trace
BB2
BB4
BB6
BB5
BB1
BB3
80
20 (side exit)
10 (side exit)
90
10 (entry count)
90 (entry/
exit count)
10 (exit count)
80
20 (side entrance)
10 (side entrance)

Prof. David August COS 598C - Advanced Compilers
Intelligent Trace Layout for I Intelligent Trace Layout for I- -Cache Performance Cache Performance
BB2
BB4
BB6
BB5
BB1
BB3
trace1
trace 2
trace 3
The rest
Intraprocedural code placement
Procedure positioning
Procedure splitting
Procedure view
Trace view

Prof. David August COS 598C - Advanced Compilers
Issues With Selecting Traces Issues With Selecting Traces
- Acyclic
- Cannot go past a backedge
- Trace length
- Longer = better ?
- Not always !
- On-trace / off-trace transitions
- Naximize on-trace
- Ninimize off-trace
- Compile assuming on-trace is
100¾ (ie single BB)
- Penalty for off-trace
- Tradeoff (heuristic)
- Length
- Likelihood remain within the
trace
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10

Prof. David August COS 598C - Advanced Compilers
Trace Selection Algorithm Trace Selection Algorithm
i = 0;
mark all BBs unvisited
while (there are unvisited nodes) do
seed = unvisited BB with largest execution freq
trace[i] += seed
mark seed visited
current = seed
/* Grow trace forward */
while (1) do
next = best_successor_of(current)
if (next == 0) then break
trace[i] += next
mark next visited
current = next
endwhile
/* Grow trace backward analogously */
i++
endwhile
!
Prof. David August COS 598C - Advanced Compilers
Best Successor/Predecessor Best Successor/Predecessor
Node weight vs. edge weight
- edge more accurate
THRESHOLD
- controls off-trace
probability
- 60-/0¾ found best
Notes on this algorithm
- BB only allowed in 1 trace
- Cumulative probability
ignored
- Nin weight for seed to be
chose (ie executed 100
times)
best_successor_of(BB)
e = control flow edge with highest
probability leaving BB
if (e is a backedge) then
return 0
endif
if (probability(e) <= THRESHOLD) then
return 0
endif
d = destination of e
if (d is visited) then
return 0
endif
return d
endprocedure
"
Prof. David August COS 598C - Advanced Compilers
Class Problem 1 Class Problem 1
BB2
BB4
BB6
BB5
BB1
BB3
20
80
100
450
20
80
BB3
BB6
BB6
51 49
49
10
41
10
41
Find the traces. Assume
a threshold probability
of 60%.
#
Prof. David August COS 598C - Advanced Compilers
Class Problem 2 Class Problem 2
BB1
100
BB2 BB3
BB5 BB6 BB4
BB7
BB8
40
135 100
35
75
25
25
50 10 5
60
15
100
Find the traces. Assume
a threshold probability of 60%.

Prof. David August COS 598C - Advanced Compilers
Traces are Nice, But … Traces are Nice, But …
- Treat trace as a big BB
- Transform trace ignoring side
entrance/exits
- !nsert fixup code
- AKA bookkeeping
- Side entrance fixup is more
painful
- Sometimes not possible so
transform not allowed
- Solution
- Eliminate side entrances
- The superblock is born
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10

Prof. David August COS 598C - Advanced Compilers
Region Type 2 Region Type 2 – – Superblock Superblock
- Superblock - Linear collection of
basic blocks that tend to
execute in sequence in which
control flow may only enter at
the first BB
- ¨Likely control flow path"
- Acyclic (outer backedge ok)
- Trace with no side entrances
- Side exits still exist
- Superblock formation
- 1. Trace selection
- 2. Eliminate side entrances
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10

Prof. David August COS 598C - Advanced Compilers
Tail Duplication Tail Duplication
- To eliminate all side entrances
replicate the ¨tail" portion of the
trace
- !dentify first side entrance
- Replicate all BB from the target
to the bottom
- Redirect all side entrances to
the duplicated BBs
- Copy each BB only once
- Nax code expansion = 2x-1
where x is the number of BB in
the trace
- Adjust profile information
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10

Prof. David August COS 598C - Advanced Compilers
Superblock Formation Superblock Formation
BB2
BB4
BB6
BB5
BB1
BB3
80
20
10
90
10
90
10
80 20
10
BB2
BB4
BB6
BB5’
BB1
BB3
80
20
8
72
10
64.8
7.2
80 20
26
BB6’
BB4’
18
2.8
25.2
2

Prof. David August COS 598C - Advanced Compilers
Issues with Issues with Superblocks Superblocks
- Central tradeoff
- Side entrance elimination
- Compiler complexity
- Compiler effectiveness
- Code size increase
- Apply intelligently
- Nost frequently executed BBs
are converted to SBs
- Set upper limit on code
expansion
- 1.0 - 1.10x are typical code
expansion ratios from SB
formation
BB2
BB4
BB6
BB5’
BB1
BB3
80
20
8
72
10
64.8
7.2
80 20
26
BB6’
BB4’
18
2.8
25.2
2

Prof. David August COS 598C - Advanced Compilers
Class Problem 3 Class Problem 3
BB2
BB4
BB7
BB5
BB1
BB3
20
80
100
450
20
80
BB6
BB8
BB9
51 49
49
10
41
10
41
Create the superblocks, trace
threshold is 60%

David August Issues With Selecting Traces Trace Selection Algorithm 10 2 3 > < = 90 80 BB2 80 BB1 20 BB3 20 BB4 /$ 8 8 ) $ ! ! $ $ $ & $ 10 BB5 10 90 BB6 10 *( " (7 2 3 3 " & i = 0. David August COS 598C . mark all BBs unvisited while (there are unvisited nodes) do seed = unvisited BB with largest execution freq trace[i] += seed mark seed visited current = seed /* Grow trace forward */ while (1) do next = best_successor_of(current) if (next == 0) then break trace[i] += next mark next visited current = next endwhile /* Grow trace backward analogously */ i++ endwhile COS 598C .Advanced Compilers Prof.Advanced Compilers Prof.Advanced Compilers Prof. David August COS 598C . David August .Linearizing a Trace Intelligent Trace Layout for I-Cache Performance 10 (entry count) BB1 BB1 80 90 (entry/ exit count) BB2 80 BB4 10 (side exit) 90 BB6 BB5 10 (side entrance) BB3 BB5 10 (exit count) The rest 20 (side exit) BB3 BB4 20 (side entrance) BB6 trace 3 BB2 Intraprocedural code placement Procedure positioning Procedure splitting trace1 trace 2 Trace view Procedure view COS 598C .Advanced Compilers Prof.

Advanced Compilers ! Prof. 25 BB7 75 BB8 100 2 10 COS 598C .Best Successor/Predecessor > 2 9 : . David August COS 598C . BB1 20 80 BB2 20 BB3 80 BB4 51 49 BB5 10 41 450 BB3 49 41 BB6 10 BB6 BB6 COS 598C .Advanced Compilers Prof. A BB1 90 80 BB2 80 20 BB3 20 BB4 10 BB5 90 BB6 10 ./1 3 $ ? $@ 7 ( ( > * 9 8 " & *( ( Class Problem 1 best_successor_of(BB) e = control flow edge with highest probability leaving BB if (e is a backedge) then return 0 endif if (probability(e) <= THRESHOLD) then return 0 endif d = destination of e if (d is visited) then return 0 endif return d endprocedure 100 Find the traces. . 10 100 BB1 60 BB2 50 BB4 25 10 5 BB5 15 35 40 BB3 135 BB6 100 ) . Assume a threshold probability of 60%.Advanced Compilers " Prof. David August Class Problem 2 Traces are Nice. But … 2 2 Find the traces. David August COS 598C . Assume a threshold probability of 60%.Advanced Compilers # Prof. . David August .

20 BB3 20 BB4 10 BB5 10 90 80 BB2 80 BB4 10 BB1 20 BB3 20 0 BB1 90 80 BB2 4 3 " 2 .Advanced Compilers Prof. COS 598C .Region Type 2 – Superblock . David August COS 598C . David August .2 BB3 .2 2.Advanced Compilers Prof.Advanced Compilers Prof.8 BB6’ 25.2 2. 0 9 . *(6 ** : :( .8 BB6’ 25. *2 : B : 8 90 BB6 10 < B $* BB5 10 90 BB6 10 10 C COS 598C . David August COS 598C .2 20 BB3 ! 8 .Advanced Compilers Prof.8 BB2 80 20 BB4 8 72 BB4’ 2 18 BB5’ 26 BB6 7. $3 10 Tail Duplication 2 4 5 . 9 10 BB1 80 64. 5 & 80 .8 80 80 10 BB1 20 BB2 20 BB4 8 BB4’ 2 18 BB5’ 26 BB6 7. David August Superblock Formation Issues with Superblocks 10 BB1 90 80 BB2 80 BB4 10 BB5 10 BB6 10 72 90 20 BB3 20 64.

Advanced Compilers Prof. David August . trace threshold is 60% BB2 20 51 BB3 80 BB4 49 10 BB5 41 BB6 49 450 BB7 10 BB8 41 BB9 COS 598C .Class Problem 3 100 BB1 20 80 Create the superblocks.

Sign up to vote on this title
UsefulNot useful