

ECE-256a
Graph Partitioning

Malgorzata Marek-Sadowska
Electrical and Computer Engineering Department
Engineering I, room 4111

• C.M. Fiduccia and R.M. Mattheyses, "A Linear-Time Heuristic for Improving Network Partitions", Design Automation Conference, 1982, pp. 175-181.
• B.W. Kernighan and S. Lin, "An Efficient Heuristic Procedure for Partitioning Graphs", Bell System Technical Journal, Vol. 49, Feb. 1970, pp.

ECE 256A 1 ECE 256A 2

• Partition the gates between two regions so that
– Capacity (# of gates allowed) on each side is not exceeded
– Cost (for example, the number) of wires across the cut is minimized
• Classical problem: bi-partitioning
Minimize the number of crossings

[Figure: netlist with 1M gates divided by a cut line]

• Circuit has capacity constraint ≤ 500K gates/partition
• I/O constraints ≤ 200 pins (connections on the boundary)

[Figure: "Bad" partition: 800K gates / 200K gates, 1K pins. "Good" partition: 550K gates / 450K gates, 190 pins.]


GRAPH PARTITIONING
G: a graph of n nodes, each of size s(i)
P: a positive integer such that ∀i 0 < s(i) ≤ P
C = (cij), i,j = 1,2,..n: a weighted connectivity matrix
A k-way partition of G: a set of non-empty, disjoint subsets of G: V1, V2..Vk such that ∪ Vi = V, the node set of G
A partition is admissible if |Vi| ≤ P
Cost = cost of external edges, between partitions.

Exact solutions
Enumerate all partitions.
Suppose ∀i s(i) = 1 and kP = n (write p = P).
There are:
  $\binom{n}{p}$ ways to choose the 1st subset
  $\binom{n-p}{p}$ ways to choose the 2nd subset
  ...
The ordering of sets is not important, so the number of different partitions is
  $\frac{1}{k!}\binom{n}{p}\binom{n-p}{p}\cdots\binom{2p}{p}\binom{p}{p}$
For n = 40, P = 10, k = 4: # cases > $10^{20}$
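The count above is easy to check numerically. The following is a minimal sketch, assuming unit-size nodes and n = kp as on the slide; the function name `count_partitions` is made up for this illustration.

```python
from math import comb, factorial


def count_partitions(n: int, p: int) -> int:
    """Ways to split n unit-size nodes into k = n/p unordered subsets of size p."""
    if p <= 0 or n % p != 0:
        raise ValueError("n must be a positive multiple of p")
    k = n // p
    ordered = 1
    for i in range(k):
        ordered *= comb(n - i * p, p)  # choices for the (i+1)-th subset
    return ordered // factorial(k)     # the order of the subsets does not matter


# The slide's case: n = 40, P = 10, k = 4.
print(count_partitions(40, 10) > 10**20)  # True
```

Even this modest instance is out of reach for enumeration, which is why the rest of the lecture turns to heuristics.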


Heuristics which did not work:
1. Random solutions
Low probability of finding a good solution.
Experiments with 2-way partitions of 32-node graphs indicate 2-5 optimal partitions out of $\frac{1}{2}\binom{32}{16}$ partitions; the probability of success on any trial is less than $10^{-7}$.
2. Max Flow-Min Cut
No way to control the sizes of partitions.
3. Clustering - difficulties in systematic assignments of nodes that do not obviously belong to any particular set.

2-way partition
Let c (# of cells) = 2n.
C = (cij) is a cost matrix; i,j = 1,2,..2n.
cii = 0 ∀i
We wish to partition S (the set of 2n cells and 2-pin interconnects described by C) into 2 sets, A and B, each containing n cells. So ∀i s(i) = 1, |A| = |B| = n.
External cost $T = \sum_{(a,b) \in A \times B} c_{ab}$
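As a small illustration of the two slides above, the sketch below evaluates the external cost T for a given 2-way partition and counts the balanced 2-way partitions a random trial has to choose from. The 4-cell cost matrix is made-up example data, cells are numbered from 0 for convenience, and the function names are not from the lecture.

```python
from math import comb


def external_cost(c, A, B):
    """T = sum of c[a][b] over all pairs (a, b) in A x B."""
    return sum(c[a][b] for a in A for b in B)


def balanced_bipartitions(n: int) -> int:
    """Number of ways to split 2n cells into two unordered blocks of n cells."""
    return comb(2 * n, n) // 2


# Hypothetical symmetric cost matrix for 4 cells (c[i][i] = 0).
c = [
    [0, 1, 0, 2],
    [1, 0, 3, 0],
    [0, 3, 0, 1],
    [2, 0, 1, 0],
]
print(external_cost(c, [0, 1], [2, 3]))  # 5

# 32-node graphs: at most 5 optimal partitions among all balanced ones.
total = balanced_bipartitions(16)
print(total)             # 300540195
print(5 / total < 1e-7)  # True
```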


Solution approach

[Flowchart: Produce initial solution (maybe random) → Swap some gates across cut-line to improve cost (one pass of improvement) → Evaluate stop criterion: is partition good? If not good yet, do another pass; if good, done.]

Outline of the Kernighan-Lin method.

1. Start with any partition A, B of S.
2. Try to decrease the initial external cost T by a series of interchanges of subsets of A and B.
3. When no further improvement is possible, the resulting partition A´, B´ is locally minimum with respect to the algorithm.
4. The process may be repeated with different starting partitions.


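Let A*, B* be a minimum cost 2-way partition; A, B is an arbitrary 2-way partition.
∃X ⊂ A ∧ ∃Y ⊂ B with |X| = |Y| ≤ n/2 such that
A* = A - X + Y
B* = B - Y + X

[Figure: A containing X and B containing Y => A* containing Y and B* containing X]

How to identify X and Y?

External cost of a ∈ A: $E_a = \sum_{y \in B} c_{ay}$
Internal cost of a ∈ A: $I_a = \sum_{x \in A} c_{ax}$
Similarly, for b ∈ B:
$E_b = \sum_{x \in A} c_{bx}$
$I_b = \sum_{y \in B} c_{by}$
$D_z = E_z - I_z$ ∀z ∈ S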


Lemma. Consider any a∈A, b∈B. If a and b are interchanged, the gain (cost reduction) is $D_a + D_b - 2c_{ab}$.

[Figure: a in A and b in B => b in A and a in B]

Let z be the total cost due to all connections between A and B that do not involve a or b.
Then:
$T = z + E_a + E_b - c_{ab}$
Exchange a and b:
$T' = z + I_a + I_b + c_{ab}$
gain = old cost - new cost = $T - T' = D_a + D_b - 2c_{ab}$

K&L Improvement Procedure

1. Start with any partition.
2. Identify a1 in A and b1 in B so that swapping them will give maximum gain.
3. Swap a1 and b1 and lock them in place, so they can't be swapped again.
4. Continue identifying a2, b2, ..etc.

[Figure: A and B with a1, b1 marked; after the swap, A - a1 + b1 and B - b1 + a1]
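The lemma can be verified numerically on a small instance. The sketch below computes the D values from their definition and compares the predicted gain of every possible single swap with the actual change in external cost. The cost matrix is made-up example data, cells are numbered from 0, and the function names are chosen for this illustration only.

```python
def external_cost(c, A, B):
    """T = sum of c[a][b] over all pairs (a, b) in A x B."""
    return sum(c[a][b] for a in A for b in B)


def d_values(c, A, B):
    """D_z = E_z - I_z for every cell z in A or B."""
    D = {}
    for own, other in ((A, B), (B, A)):
        for z in own:
            external = sum(c[z][y] for y in other)
            internal = sum(c[z][x] for x in own)  # c[z][z] is 0
            D[z] = external - internal
    return D


def swap_gain(c, D, a, b):
    """Gain predicted by the lemma for interchanging a and b."""
    return D[a] + D[b] - 2 * c[a][b]


# Hypothetical symmetric cost matrix for 4 cells.
c = [
    [0, 1, 0, 2],
    [1, 0, 3, 0],
    [0, 3, 0, 1],
    [2, 0, 1, 0],
]
A, B = [0, 1], [2, 3]
D = d_values(c, A, B)
T = external_cost(c, A, B)

for a in A:
    for b in B:
        new_A = [x for x in A if x != a] + [b]
        new_B = [y for y in B if y != b] + [a]
        actual = T - external_cost(c, new_A, new_B)  # old cost - new cost
        print(a, b, swap_gain(c, D, a, b), actual)
```

For this data the swap of cells 0 and 2 has predicted and actual gain 3, while the swaps (0, 3) and (1, 2) both have gain -2.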

K&L Algorithm: critical ideas

• Gain
– Gain is the change in cost that results from swapping one gate in the A-side with one gate in the B-side
– Compute it as $\sum_{\text{cut}} c_{ij}$ (before swap) $-$ $\sum_{\text{cut}} c_{ij}$ (after swap), so a positive gain means the cut cost went down
• Greedy decision
– Make the best next swap
– Do this swap even if it's negative
  • Biggest positive gain
  • Smallest (closest to zero) negative gain
• Do all n swaps

K&L: Picking Swap Sequence

• Facts:
– Gain from doing k swaps in sequence is $G_k = \sum_{i=1}^{k} g_i$
– $G_k$ is not a monotonic function of k

[Figure: plot of G_k against the number of swaps k (1 to n) for the swap sequence (a1,b1), (a2,b2), (a3,b3), ..., (ak,bk), ..., (an,bn); the best sequence ends at the k where G_k peaks]
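Because G_k is not monotonic, the best prefix has to be found by scanning all partial sums. A minimal sketch, with a made-up gain sequence and a hypothetical helper name `best_prefix`:

```python
from itertools import accumulate


def best_prefix(gains):
    """Return (k, G_k) maximizing the partial sum G_k = g_1 + ... + g_k.

    k = 0 with G = 0 means that no prefix of swaps improves the partition.
    """
    best_k, best_G = 0, 0
    for k, G in enumerate(accumulate(gains), start=1):
        if G > best_G:
            best_k, best_G = k, G
    return best_k, best_G


# Made-up gains: the partial sums are 3, 2, 6, 4, 0.
print(best_prefix([3, -1, 4, -2, -4]))  # (3, 6)
```

Note that the partial sum dips after the second swap before reaching its peak at k = 3, which is why the negative-gain swap is still worth making. The full sequence sums to 0 here because swapping all n pairs only exchanges the names of A and B.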

K&L: Doing the swaps

• Interpretation:
– We will do only those k swaps, since they maximize gain

[Figure: only the swaps (a1,b1), (a2,b2), (a3,b3), ..., (ak,bk) are applied to A and B, giving the improved result]

The exchange algorithm.

1. Compute D values for all elements of S.
2. Choose ai∈A and bi∈B such that $g_i = D_{a_i} + D_{b_i} - 2c_{a_i b_i}$ is maximal.
3. Move ai to B, bi to A and lock them. Store gi, (ai,bi).
4. If A, B have any movable (unlocked) elements do
   a. Update D values
   b. go to step 2
   else go to 5.
5. Choose k to maximize the partial sum $G = \sum_{i=1}^{k} g_i$.


6. Move the first k elements from the sequence (a1,b1)(a2,b2)..(ak,bk)…(an,bn) to the other side of the partitions.
7. Treat the resulting partition as a new partition and repeat the process until the partitions can not be improved (k = 0).

A Linear-Time Heuristic for Improving Network Partitions

Fiduccia and Mattheyses algorithm.

Problem:
Given a network consisting of a set of modules connected by a set of nets, the mincut partitioning problem is to find a partition of the set of modules into 2 blocks A and B such that the number of nets having modules in both blocks is minimal.
In general, size constraints are imposed on A and B.
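For comparison with the net-based formulation just stated, the complete Kernighan-Lin exchange algorithm (steps 1 to 7) can be sketched for a weighted graph as follows. This is an unoptimized illustration, not the authors' implementation: the pair search in step 2 is a plain scan over all unlocked pairs, and the D update in step 4a uses the standard rule D'x = Dx + 2c(x,a) - 2c(x,b) for x in A and D'y = Dy + 2c(y,b) - 2c(y,a) for y in B, which is not spelled out on the slides. The cost matrix is made-up example data and cells are numbered from 0.

```python
def cut_cost(c, A, B):
    """External cost T: total weight of edges between A and B."""
    return sum(c[a][b] for a in A for b in B)


def kl_pass(c, A, B):
    """One pass of the exchange algorithm; returns (new_A, new_B, G)."""
    A, B = list(A), list(B)
    free_a, free_b = set(A), set(B)

    # Step 1: D_z = E_z - I_z for every cell (c[z][z] is 0).
    D = {}
    for own, other in ((A, B), (B, A)):
        for z in own:
            D[z] = sum(c[z][y] for y in other) - sum(c[z][x] for x in own)

    swaps, gains = [], []
    while free_a and free_b:
        # Step 2: unlocked pair with maximal g = D_a + D_b - 2 c_ab.
        g, a, b = max(
            ((D[a] + D[b] - 2 * c[a][b], a, b) for a in free_a for b in free_b),
            key=lambda t: t[0],
        )
        # Step 3: tentatively exchange the pair and lock it.
        free_a.remove(a)
        free_b.remove(b)
        swaps.append((a, b))
        gains.append(g)
        # Step 4a: update D for the remaining unlocked cells.
        for x in free_a:
            D[x] += 2 * c[x][a] - 2 * c[x][b]
        for y in free_b:
            D[y] += 2 * c[y][b] - 2 * c[y][a]

    # Step 5: choose k maximizing the partial sum of gains.
    best_k, best_G, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_G:
            best_k, best_G = k, running

    # Step 6: actually exchange only the first k pairs.
    moved_a = [a for a, _ in swaps[:best_k]]
    moved_b = [b for _, b in swaps[:best_k]]
    new_A = [z for z in A if z not in moved_a] + moved_b
    new_B = [z for z in B if z not in moved_b] + moved_a
    return new_A, new_B, best_G


def kernighan_lin(c, A, B):
    """Step 7: repeat passes until a pass yields no improvement."""
    while True:
        A, B, G = kl_pass(c, A, B)
        if G <= 0:
            return A, B


# Hypothetical symmetric cost matrix for 4 cells; the initial cut cost is 5.
c = [
    [0, 1, 0, 2],
    [1, 0, 3, 0],
    [0, 3, 0, 1],
    [2, 0, 1, 0],
]
A, B = kernighan_lin(c, [0, 1], [2, 3])
print(cut_cost(c, A, B))  # 2
```

On this tiny instance the first pass already reaches the optimum cut of 2 and the second pass confirms that no prefix of swaps has positive gain.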


The network consists of C modules (cells) and N nets.
* A net is defined as a set consisting of at least two cells.
* Each cell is contained in at least one net.
* n(i) denotes the # of cells in net(i).
* Any 2 cells which share a net are called neighbors.
* Each cell is assumed to have size s(i).
* p(i) denotes the number of pins in cell(i).
At input: nets are presented one at a time, in any order, each net being completely given before the next one is started.

$P = \sum_i p(i)$ = total # pins
P is the measure of the input and can be interpreted as a "size" of the network.
C is O(P) and N is O(P).

Input routine
Cells are identified by integers 1..C; nets are numbered sequentially 1,2..N as they are entered.

[Figure: CELL array, one entry per cell 1..C, each pointing to that cell's net-list; NET array, one entry per net, each pointing to that net's cell-list, e.g. net(2)]


Net-list input:

For each net n = 1,..N do
    For each (cell, pin) pair (i,j) on net n do
        if net n is not at the front of the net-list for cell i
        then insert cell i into the cell-list of net n and insert net n into the net-list of cell i
    end for
end for
O(P) will suffice to do this work.

Cutstate of a net: { cut, uncut }
cut: has at least one cell in each side of the partition;
uncut: all cells of a net are on one side of the partition.
Cutset of a partition ≡ set of nets which are cut.
The size |X| of a block of cells X is the sum of the sizes s(i) of its constituent cells.
User can specify 0 < r < 1 and a mincut partition with $\frac{|A|}{|A| + |B|} \approx r$ is sought.
Some cells may be preassigned to a specific side of a partition.
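The input routine and the cutset definition above can be sketched as follows. This is an illustration under a few assumptions: cells and nets are numbered from 0 rather than 1, Python lists stand in for the linked lists of the slides (so "front of the net-list" becomes "most recently appended entry"), and the netlist is made-up data in which cell 2 has two pins on net 1.

```python
def build_lists(nets, num_cells):
    """Build the cell-list of every net and the net-list of every cell.

    nets[n] lists the cell id of every pin on net n, so a cell with several
    pins on the same net appears several times.
    """
    cell_list = [[] for _ in nets]             # cell_list[n]: cells on net n
    net_list = [[] for _ in range(num_cells)]  # net_list[i]: nets on cell i
    for n, pins in enumerate(nets):
        for i in pins:
            # Each net is read completely before the next one starts, so
            # looking at the latest entry is enough to detect a repeated cell.
            if not net_list[i] or net_list[i][-1] != n:
                cell_list[n].append(i)
                net_list[i].append(n)
    return cell_list, net_list


def cutset(cell_list, side):
    """Set of nets that have at least one cell on each side of the partition."""
    return {n for n, cells in enumerate(cell_list)
            if len({side[i] for i in cells}) == 2}


# Hypothetical netlist: 4 cells, 4 nets.
nets = [[0, 1], [1, 2, 2], [2, 3], [0, 3]]
cell_list, net_list = build_lists(nets, 4)
print(cell_list)  # [[0, 1], [1, 2], [2, 3], [0, 3]]
print(net_list)   # [[0, 3], [0, 1], [1, 2], [2, 3]]
print(sorted(cutset(cell_list, "AABB")))  # [1, 3]
```

Every pin is visited once and handled in constant time, which matches the O(P) bound stated for the input routine.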


Basic Idea: move cells, one at a time, from one block to the other such that the cutset is minimized.
Base cell (cell to be moved) chosen based on balance condition and cutset.
Gain(i) of cell(i) = # nets by which the cutset decreases when cell(i) moves.
-p(i) ≤ g(i) ≤ +p(i)
After a move, a cell is locked in its new block for the remainder of the pass. Only free cells can move. Stop: no free cells or balancing criterion. The best partition encountered during the pass is returned.

Computational effort:
* select the base cell
* move it
* adjust the gains of its free neighbors
Naive approach: $O(P^2)$ gain computations per pass.

Cell gains:
[Figure: cells annotated with example gains +2, +1, 0, -1]
-p(i) ≤ g(i) ≤ p(i)
pmax = max { p(i) | cell(i) is initially free }

[Figure: BUCKET array indexed from +pmax down to -pmax; each entry heads a list of cell numbers; a CELL array (1, 2, 3, ...) points to the free cells' positions in the lists; MAXGAIN marks the highest non-empty entry]

k-th entry of the BUCKET contains a doubly linked list of free cells with gains equal to k.
One BUCKET for block A, the other one for block B.
Base cell moved → removed from its bucket list and placed on Free_Cell_List, used to reinitialize the BUCKET for the next pass.
MAXGAIN - index to keep track of max gain/BUCKET.

* The total amount of work required to maintain each BUCKET array is O(P) per pass.
Initialization: O(pmax) + O(f) = O(P)
g - total # of gain adjustments
O(g) - work to move all free cells to their bucket lists.
Later we will show that g = O(P).
R - sum of all amounts by which MAXGAIN is reset.
The total time/pass used to search down for a non-empty bucket and to remove a cell of highest gain is O(R + pmax) + O(f) = O(R) + O(P).
Later, we will see that R = O(g).
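A minimal sketch of one BUCKET array with its MAXGAIN index is shown below. It is an assumption-laden stand-in rather than the structure from the paper: each bucket is a Python dict used as an insertion-ordered set, which gives the same constant-time insert and remove as a doubly linked list with a per-cell pointer. The class and method names are hypothetical.

```python
class GainBucket:
    """Free cells of one block, grouped by gain in the range -pmax..+pmax."""

    def __init__(self, pmax):
        self.pmax = pmax
        # lists[k + pmax] holds the free cells whose gain is k.
        self.lists = [dict() for _ in range(2 * pmax + 1)]
        self.gain = {}         # current gain of every cell in the structure
        self.maxgain = -pmax   # MAXGAIN: upper bound on the best gain present

    def insert(self, cell, gain):
        self.lists[gain + self.pmax][cell] = None
        self.gain[cell] = gain
        if gain > self.maxgain:
            self.maxgain = gain  # MAXGAIN is reset upward

    def remove(self, cell):
        g = self.gain.pop(cell)
        del self.lists[g + self.pmax][cell]

    def adjust(self, cell, delta):
        """Move a cell to the bucket matching its new gain."""
        g = self.gain[cell]
        self.remove(cell)
        self.insert(cell, g + delta)

    def best(self):
        """Return a free cell of highest gain, or None if no cell is left."""
        # Search down from MAXGAIN for a non-empty bucket.
        while self.maxgain > -self.pmax and not self.lists[self.maxgain + self.pmax]:
            self.maxgain -= 1
        bucket = self.lists[self.maxgain + self.pmax]
        return next(iter(bucket)) if bucket else None


b = GainBucket(pmax=2)
b.insert("x", 1)
b.insert("y", -1)
b.insert("z", 2)
print(b.best())   # z
b.remove("z")
print(b.best())   # x
```

The downward search in `best` is the only non-constant step; its total cost over a pass is what the quantity R on the slide accounts for.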

(A, B) is balanced when
  rW - smax ≤ |A| ≤ rW + smax
where
  W = |A| + |B|
  r = |A| / (|A| + |B|)
  smax = the size of the largest cell which is initially free
* Initial pass needed to establish the balance.
* The tolerance of smax makes it possible to maintain the balance.

The basic idea:
1. Consider the first cell (if any) of highest gain from each BUCKET, rejecting it if the move causes imbalance. If neither block has a qualifying cell, no more moves will be attempted.
2. Choose a cell of highest gain, breaking ties by choosing the best balance.
3. This is the base cell; remove it from the bucket list; place it on the FREE CELL List.
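The balance test used when a candidate base cell is accepted or rejected can be written down directly from the criterion above. The sketch below assumes block sizes are tracked as plain numbers; the function names and the example sizes are illustrative only.

```python
def is_balanced(size_a, size_b, r, smax):
    """Check rW - smax <= |A| <= rW + smax with W = |A| + |B|."""
    W = size_a + size_b
    return r * W - smax <= size_a <= r * W + smax


def move_keeps_balance(size_a, size_b, cell_size, from_a, r, smax):
    """Would moving a cell of the given size out of A (or out of B) stay balanced?"""
    if from_a:
        return is_balanced(size_a - cell_size, size_b + cell_size, r, smax)
    return is_balanced(size_a + cell_size, size_b - cell_size, r, smax)


# Made-up sizes: W = 10, r = 0.5, largest free cell has size 1.
print(is_balanced(5, 5, 0.5, 1))                  # True
print(move_keeps_balance(5, 5, 1, True, 0.5, 1))  # True  (|A| becomes 4)
print(move_keeps_balance(4, 6, 1, True, 0.5, 1))  # False (|A| would become 3)
```

Because the tolerance is smax, a balanced partition always allows at least one move of any free cell in one of the two directions.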


Computing and maintaining cell gains.
Given a partition (A,B), the distribution of net n is the pair (A(n), B(n)), where
  A(n) = # cells of net n in A
  B(n) = # cells of net n in B
It can be computed in O(P) for all nets.
A net is critical, if there exists a cell on it, which if moved would change the net's cutstate: A(n) or B(n) is 0 or 1.

[Figure: the four critical-net cases with the gain contribution of each cell. A(n) = 1: +1, 0, 0. A(n) = 1, B(n) = 1: +1, +1. B(n) = 1: 0, 0, +1. B(n) = 0: -1, -1.]

Useful observations:
* gain of a cell depends on its critical nets
* a net which is not critical before or after a move cannot influence the gains of any of its cells.
g(i) = FS(i) - TE(i)
FS(i) = # nets which contain cell i as their only cell in the "From" block
TE(i) = # nets which contain cell i and have an empty "To" block.

Compute cell gains:

for each cell i do
    g(i) := 0;
    F := the "from" block of cell(i)
    T := the "to" block of cell(i)
    for each net n on cell i do
        if F(n) = 1 then g(i)++
        if T(n) = 0 then g(i)--
    end for
end for

Initialization of all cell gains requires O(P) work.

Critical nets:

[Figure: nets drawn across blocks A and B for the cases A(n) = 1, A(n) = 0, B(n) = 1, B(n) = 0]

A net is critical before the move if and only if:
F(n) = 1 or T(n) = 0 or T(n) = 1
A net is critical after the move if and only if:
T(n) = 1 or F(n) = 0 or F(n) = 1
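The gain initialization above translates almost line by line into code. The sketch below assumes 0-based cell and net numbers, a `side` sequence holding "A" or "B" for every cell, and the `cell_list` / `net_list` layout described for the input routine; the small netlist is made-up example data.

```python
def initial_gains(cell_list, net_list, side):
    """Return g(i) = FS(i) - TE(i) for every cell i."""
    # Distribution of every net: number of its cells in A and in B.
    count = [{"A": 0, "B": 0} for _ in cell_list]
    for n, cells in enumerate(cell_list):
        for i in cells:
            count[n][side[i]] += 1

    gains = []
    for i, nets in enumerate(net_list):
        F = side[i]                      # "from" block of cell i
        T = "B" if F == "A" else "A"     # "to" block of cell i
        g = 0
        for n in nets:
            if count[n][F] == 1:  # cell i is the only cell of net n in F
                g += 1
            if count[n][T] == 0:  # net n has no cell in T yet
                g -= 1
        gains.append(g)
    return gains


# Hypothetical network: 5 cells, 4 nets.
cell_list = [[0, 1, 2], [1, 3], [2, 3, 4], [0, 4]]
net_list = [[0, 3], [0, 1], [0, 2], [1, 2], [2, 3]]
print(initial_gains(cell_list, net_list, "AABBA"))  # [-1, 1, 1, 1, 0]
```

Each (cell, net) incidence is touched a constant number of times, in line with the O(P) initialization cost stated on the slide.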

Nets critical before and after the move
A net is critical, if there exists a cell on it, which if moved would change the net's cutstate: A(n) or B(n) is 0 or 1.

[Figure: the four critical-net cases with the gain contribution of each cell. A(n) = 1: +1, 0, 0. A(n) = 1, B(n) = 1: +1, +1. B(n) = 1: 0, 0, +1. B(n) = 0: -1, -1.]

F(n) = 1 before the move ≡ F(n) = 0 after
T(n) = 1 after the move ≡ T(n) = 0 before

Move base cell and update neighbors' gains:
F := "from" block of base cell
T := "to" block of base cell
"Move cell" = lock it and complement its block;
for each net n on the base cell do
    if T(n) = 0 then increment gains of all free cells on net(n)
    else if T(n) = 1 then decrement gain of the only T cell on net(n), if it is free
    decrement F(n)    /* change distribution */
    increment T(n)
    if F(n) = 0 then decrement gains of all free cells on net(n)
    else if F(n) = 1 then increment gain of the only F cell on net(n), if it is free
end for

If a net has n cells → O(n) work/update.
* No more than 4 update operations/net are performed during 1 pass.
LF(n): locked cells of net(n) on the "from" side
FF(n): free cells of net(n) on the "from" side
LT(n): locked cells of net(n) on the "to" side
FT(n): free cells of net(n) on the "to" side
T(n) = 0 requires LT(n) = FT(n) = 0
T(n) = 1 requires LT(n) = 1 ∧ FT(n) = 0 or LT(n) = 0 ∧ FT(n) = 1; the update is performed only if LT(n) = 0

In terms of these counts:
    /* check for critical nets before the move */
    if LT(n) = 0
        then if FT(n) = 0 then update gains
        else if FT(n) = 1 then update gains
    /* change the net distribution to reflect the move */
    decrement FF(n)
    increment LT(n)
    /* check for critical nets after the move */
    if LF(n) = 0
        then if FF(n) = 0 then update gains
        else if FF(n) = 1 then update gains
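The move-and-update step can be sketched and checked against a from-scratch recomputation of the gains. This follows the plain F(n) / T(n) version of the update, without the LF/FF/LT/FT bookkeeping that skips nets with locked cells. Cell and net numbers are 0-based, `side` is a mutable list of "A"/"B", and the netlist and move sequence are made-up example data; all function names are hypothetical.

```python
def distribution(cell_list, side):
    """count[n]['A'] and count[n]['B']: number of cells of net n in each block."""
    count = [{"A": 0, "B": 0} for _ in cell_list]
    for n, cells in enumerate(cell_list):
        for i in cells:
            count[n][side[i]] += 1
    return count


def compute_gains(net_list, side, count):
    """g(i) = FS(i) - TE(i), computed from scratch."""
    gains = []
    for i, nets in enumerate(net_list):
        F = side[i]
        T = "B" if F == "A" else "A"
        gains.append(sum((count[n][F] == 1) - (count[n][T] == 0) for n in nets))
    return gains


def move_and_update(base, cell_list, net_list, side, count, gains, locked):
    """Lock the base cell, move it to the other block, update neighbor gains."""
    F = side[base]
    T = "B" if F == "A" else "A"
    locked.add(base)
    side[base] = T
    for n in net_list[base]:
        free = [i for i in cell_list[n] if i not in locked]
        # Critical before the move?
        if count[n][T] == 0:
            for i in free:
                gains[i] += 1
        elif count[n][T] == 1:
            for i in free:
                if side[i] == T:   # the only T cell, if it is free
                    gains[i] -= 1
        # Change the distribution to reflect the move.
        count[n][F] -= 1
        count[n][T] += 1
        # Critical after the move?
        if count[n][F] == 0:
            for i in free:
                gains[i] -= 1
        elif count[n][F] == 1:
            for i in free:
                if side[i] == F:   # the only F cell, if it is free
                    gains[i] += 1


cell_list = [[0, 1, 2], [1, 3], [2, 3, 4], [0, 4]]
net_list = [[0, 3], [0, 1], [0, 2], [1, 2], [2, 3]]
side = list("AABBA")
count = distribution(cell_list, side)
gains = compute_gains(net_list, side, count)
locked = set()

move_and_update(0, cell_list, net_list, side, count, gains, locked)
print([gains[i] for i in range(5) if i not in locked])  # [2, 0, 1, 2]
```

Only the gains of free cells are meaningful after a move; the entry of a locked cell is left stale because that cell cannot move again in the same pass.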

Example

[Figure: example network of cells connected by nets n1, n2, n3, n4]

Initial situation:
  g(A) = -1    F(n1) = 2, T(n1) = 0
  g(B) = 0     F(n2) = 2, T(n2) = 1
  g(C) = 1     F(n3) = 2, T(n3) = 1
  g(D) = 1     F(n4) = 2, T(n4) = 0
For each net of the base cell (T(n) before the move, the updated gain, then F(n) and T(n) after the move):
  n1: T(n1) = 0, g(A) = 0, F(n1) = 1, T(n1) = 1
  n2: T(n2) = 1, g(D) = 0, F(n2) = 1, T(n2) = 2
  n3: T(n3) = 1, g(C) = 0, F(n3) = 1, T(n3) = 2
  n4: T(n4) = 0, g(G) = 0, F(n4) = 1, T(n4) = 1

After both blocks A and B have served as the "T" side for a net n, no further operations will occur for n. Such a net has locked cells on both sides.
The B side having 0 or 1 cell can cause an update only on the first move in the sequence. Afterwards LB(n) > 0. Updates can occur only for FA(n) = 1 and FA(n) = 0, each only once.
1 more update for B = F.

So, we have:
total # of gain adjustments/pass is O(f)
f is # of initially free cells.
Thus, g = O(f) = O(P)
Each time a net is updated, the gain of any cell on that net can be incremented at most twice, so during 1 update the value of MAXGAIN can be reset at most to MAXGAIN + 2.
So R is O(N) = O(P).
* The total work required to initialize and maintain cell gains is O(P) per pass.

After each "pass", we find k, which maximizes
  $g_{max} = \sum_{i=1}^{k} g_i$
If gmax > 0, exchange a1,a2,..,ak with b1,..,bk.
Note that the partial sum $\sum g_i$ may be < 0 in early stages.

Example:
