You are on page 1of 45

Multistage switching fabrics - part II

Paolo Giaccone

Version: June 22, 2009

Switching Architectures - Part 2 -
c 2009 - Paolo Giaccone 1

Outline
• Clos network
– strictly non blocking: Clos theorem
– rearrangeable: Slepian Duguid theorem
– Paull’s matrix and Paull’s algorithm
– non-interruptible rearrangeable networks

• Recursive construction
– Benes network (p = 2), looping algorithm

– p= N
– Cantor network

• Self routing
– Banyan networks

• Minimum complexity networks

Switching Architectures - Part 2 -
c 2009 - Paolo Giaccone 2

Clos networks

Switching Architectures - Part 2 -
c 2009 - Paolo Giaccone 3

Clos networks
• three stage networks
– mi : number of inputs for modules at stage i
– ni : number of outputs for modules at stage i
– ri : number of modules at stage i
– Mi = {1, 2, . . . , ri } is the set of modules identifiers belonging to i-th stage
• Exactly one link between two modules in successive stages
– r1 = m2 , r2 = n1 = m3 , r3 = m2

Switching Architectures - Part 2 -
c 2009 - Paolo Giaccone 4

Clos network N × M Clos network with N = m1 r1 and M = r3 n3 m1 × n 1 m2 × n 2 m3 × n 3 1 1 1 N M r1 r2 r3 Switching Architectures .Paolo Giaccone 5 . c 2009 .Part 2 .

to find a new symbol available in the II-stage. Hence.Paolo Giaccone 6 . In the worst case. a symmetric network with m1 = n3 = n is SNB if and only if r2 ≥ 2n − 1 Proof: Assume that module i of the I-stage should be connected to module j of the III-stage. it should be r2 > (m1 − 1) + (n3 − 1) which implies r2 ≥ m1 + n3 − 1. Hence.Part 2 . c 2009 . They are all distinct. there are already m1 − 1 symbols in the i-th row of P and n3 − 1 symbols in the j -th column. Switching Architectures . a new symbol should be added in Pij of Paull’s matrix P . SNB Clos networks Clos Theorem: A Clos network is SNB if and only if the number of second stage switches r2 satisfies: r2 ≥ m1 + n3 − 1 In particular.

c 2009 .Part 2 . with m1 = n3 = p.Paolo Giaccone 7 . the smallest Clos network is built with r2 = 2p − 1 • total complexity: CSN B (N ) = qC(p×(2p−1))+(2p−1)C(q×q)+qC((2p−1)×p) = (2p−1)(2pq+q 2 ) • approximated complexity (assume r2 = 2p): CSN B (N ) ≈ qC(p × (2p)) + 2pC(q × q) + qC((2p) × p) = 4p2 q + 2pq 2 p × (2p − 1) q×q (2p − 1) × p 1 1 1 N N q 2p − 1 q Switching Architectures . r1 = r3 = q with N = pq • thanks to Clos Theorem. Complexity of a SNB Clos network • consider a symmetric Clos network.

Part 2 . the switching configurations of all the II-stage modules) • matrix P = [Pij ] of size r1 × r3 – Pij is a set of II-stage modules. Pij ⊆ M2 – if k ∈ Pij means that II-stage module k is connected to I-stage module i and III-stage module j – feasibility conditions ∗ each row with at most m1 symbols ∗ each column with at most n3 symbols ∗ each element with at most min{m1 . n3 } symbols ∗ each k ∈ M2 appears at most once for each row and for each column Switching Architectures . c 2009 . i. Paull’s matrix • describes the state of the active interconnections present in a Clos network (i.Paolo Giaccone 8 .e.e..

Configuring a SNB Clos network • when an input of I-stage module i should be connected to an output of III-stage module j .Paolo Giaccone 9 . c 2009 .Part 2 . find any II-stage module k such that the connections i → k and k → j are both free – such connection always exists thanks to the Clos theorem – in Paull’s matrix P . this operation corresponds to find any available symbol in both i-th row and j -th column Switching Architectures .

n3 } In particular. later in the course Switching Architectures . Rearrangeable non-blocking Clos networks Slepian-Duguid Theorem: A Clos network is REAR if and only if the number of second stage switches r2 satisfies: r2 ≥ max{m1 . a symmetric network with m1 = n3 = n is SNB if and only r2 ≥ n Proof: It will be proved using the Birchkoff von Neumann theorem. c 2009 .Paolo Giaccone 10 .Part 2 .

c 2009 . with m1 = n3 = p. the smallest Clos network is built with r2 = p • total complexity: CREAR (N ) = qC(p × p) + pC(q × q) + qC(p × p) = 2qp2 + pq 2 p×p q×q p×p 1 1 1 N N q p q Switching Architectures . r1 = r3 = q with N = pq • thanks to Slepian Duguid Theorem. Complexity of a REAR Clos network • consider a symmetric Clos network.Part 2 .Paolo Giaccone 11 .

2p − 1 CSN B (N ) = CREAR (N ) p which means: CREAR (N ) ≤ CSN B (N ) < 2CREAR (N ) Note that. c 2009 . Clos complexity comparison By setting q = N/p in the formulas of the Clos networks complexity: N2   2 2 CSN B = (2p − 1) 2N + 2 ≈ 4pN + N p p 1 2 CREARR = 2pN + N p and hence. to be advantageous with respect to the crossbar. it should be: CREAR (N ) < N 2 CSN B (N ) < N 2 Switching Architectures .Part 2 .Paolo Giaccone 12 .

Clos complexity comparison 1e+08 1e+07 Number of crosspoints 1e+06 100000 10000 SNB N=10 REAR N=10 SNB N=100 1000 REAR N=100 SNB N=1000 REAR N=1000 100 SNB N=10000 REAR N=10000 10 1 10 100 1000 10000 p Switching Architectures .Part 2 .Paolo Giaccone 13 . c 2009 .

5 0 1 10 100 1000 10000 p Switching Architectures . Clos complexity comparison 2 1.Paolo Giaccone 14 .Part 2 . c 2009 .5 N=10 N=100 CSNB/CREARR N=1000 N=10000 1 0.

Minimum complexity for REARR Clos network • minimum of CREARR obtained for p̂: r 2 ∂CREARR N N = 2N − 2 = 0 ⇒ p̂ = ∂p p 2 Hence.Part 2 . the Clos network degenerates into a N × N crossbar.Paolo Giaccone 15 . CREAR (p = N ) = 2N 2 Switching Architectures . CREAR (p = 1) = N 2 • for p = N . c 2009 . CREARR < Ccrossbar = N 2 • for p = 1. the Clos network degenerates into two tandem N × N crossbars. the minimum complexity is: opt √ √ √ CREARR = 2 2N N = Θ(N N ) opt • for any N > 8.

r3 } − 1 – for each new connection. the number of II-stage modules to reconfigure is at most two Switching Architectures . Configuring a REAR Clos network • Paull’s algorithm – incremental algorithm. used to add one connection at one time and reconfigure the network if needed – will be also used to support rate guarantees in input queued switches – (Paull’s Theorem) for each new connection.Paolo Giaccone 16 . c 2009 .Part 2 . the number of connections needed to be rearranged is at most min{r1 .

hence.b)−path b a a b b a or b a b a a b (b. and b is available in column j of P . b)-path (or a (b.Paolo Giaccone 17 . use module a for the new connection. and use a (or a b for the (b. two cases are possible – it exists a II-stage module a which is available in both row i and column j of P . ? a a b b a (a. without any rearrangment: Pij = Pij ∪ a – otherwise. there should be two II-stage modules a and b such that a is available in row i. Find an (a. Paull’s algorithm • Given Paull’s matrix P = [Pij ] and a new connection to add in Pij .a)−path b a b a a b Switching Architectures . a)-path) for the new connection: Pij = Pij ∪ a. a)-path) starting from Pij . c 2009 . Now swap a with b in such path.Part 2 .

n3 } + 1 but the reconfiguration algorithm is different from the classical Paull’s one (see paper by Hwang. Non-interruptible rearrangeable networks • NI-REAR network – rearrangeable non-blocking – no connection is taken down before its rerouted path has already connected ∗ exploits double path • Theorem: A Clos network is NI-REAR if r2 ≥ max{m1 . Lin. Lioubimov.Part 2 .Paolo Giaccone 18 . 2006) Switching Architectures . c 2009 . n3 } + 2 • note that it is sufficient r2 ≥ max{m1 .

Part 2 . Paul algorithm reconfigures the connections through just two modules (A and B) of the II stage • trick: use two additional reserved modules (R1 and R2) in the middle stage.Paolo Giaccone 19 . Configuration of NI-REAR networks • in case of reconfiguration. just to create the double-paths A A R1 B R2 B R1 R1 R2 R2 Switching Architectures . c 2009 .

c 2009 .Paolo Giaccone 20 .Part 2 . Recursive construction of switching networks Switching Architectures .

p = 2) = 3N 2 /4+6N √ – for keeping the same “aspect ratio”. Recursive construction of switching networks • main idea to exploit recursively – to build a REAR Clos network. use a REAR Clos network for each module – to build a SNB Clos network.Part 2 . CSN B (N. use p = N Switching Architectures . keep small p CREAR (N ) = 2qp2 + pq 2 CSN B (N ) = (2p − 1)q(2p + q) CREAR (N. c 2009 . p = 2) = N 2 /2+4N vs. use a SNB Clos network for each module • many ways to factorize the network – for small complexity.Paolo Giaccone 21 .

c 2009 . recursively factorized with p = 2. exploiting only 2 × 2 modules • N = 2n for some n 1 1 N/2 x N/2 N N N/2 x N/2 N/2 N/2 Switching Architectures .Part 2 . Benes network • Clos network. REAR.Paolo Giaccone 22 .

Example of Benes networks 4x4 8x8 16x16 Switching Architectures .Part 2 .Paolo Giaccone 23 . c 2009 .

by setting k = log2 N − 1: S(N ) = 2 log2 N − 1 Switching Architectures . . . by setting k = log2 N − 1 and considering C2 = 4: N C(N ) = N (log2 N − 1)C2 + C2 = 4N log2 N − 2N 2 The number of stages satisfies:     N N S(N ) = 2 + S = 2k + S for k = 0. . . . Benes network complexity The number of crosspoints satisfies:     N N C(N ) = N C2 + 2C = kN C2 + 2k C for k = 0. log2 N − 1 2 2k Now. . c 2009 . log2 N − 1 2 2k and again.Paolo Giaccone 24 .Part 2 . . .

Benes network configuration Two algorithms: • Paull’s algorithm applied recursively • Looping algorithm – equivalent to Paull’s algorithm using a particular sequence of switching requests – all the switching requests should be known in advance to avoid reconfigurations Switching Architectures .Paolo Giaccone 25 .Part 2 . c 2009 .

t. c 2009 .Part 2 . then T (n) = Θ(nlogb a log2 n) Switching Architectures . ∀n ≥ n0 : cg(n) ≤ f (n) ≤ c′ g(n) – f (n) = O(g(n)) means that ∃ c > 0.Paolo Giaccone 26 . Master method for recurrence equations • Landau notation for f (n). f (n) = O(nlogb a−ǫ ). n0 s. c′ > 0. b ≥ 1 – if ∃ ǫ > 0 s. then T (n) = Θ(nlogb a ) – if f (n) = Θ(nlogb a ). a ≥ 1. g(n) > 0 – f (n) = Θ(g(n)) means that ∃ c.t. ∀n ≥ n0 : f (n) ≤ cg(n) f (n) – f (n) ∼ g(n) means that lim =1 n→∞ g(n) • Master method to solve T (n) = aT (n/b) + f (n).t. n0 s.

a = 3. a = b = 2. p = 2 (i. then f (n) = Θ(N ) = O(nlog2 3−ǫ ) with ǫ = 0.5. then f (n) = Θ(N ). C(N ) = Θ(N log2 N ) • SNB Clos network. hence. factorized recursively. factorized recursively with factor 2 • REAR Clos network. p = 2 – C(N ) = 2N C2 + 3C(N/2) – using master method. b = 2. Benes network) – C(N ) = N C2 + 2C(N/2) – using master method.Part 2 . c 2009 .58 ) Switching Architectures . C(N ) = Θ(N log2 3 ) ≈ Θ(N 1.e..Paolo Giaccone 27 . Clos networks. hence. factorized recursively.

58 Switching Architectures . large N ) C(N ) ≈ 3log2 n 2n C(2) = nlog2 3 N C(2) = 4N (log2 N )1. For convenient factorization. c 2009 .Part 2 .. .. √ REAR Clos network. since 1/2 + 1/22 + .e. ..+n/2 C 2n/2 If we set k = log2 n. assume N √ √   2 k  k  C(N ) = 3 N C( N ) = 3 2n/2 C 2n/2 = 3k 2n/2+n/2 +. factorized recursively with factor N = 2n and n = 2k . + 1/2k ≈ 1 for large k (i.Paolo Giaccone 28 .

..Paolo Giaccone 29 . large N ) C(N ) ≈ 6log2 n 2n C(2) = nlog2 6 N C(2) = 4N (log2 N )2. . factorized recursively with factor N For convenient factorization. + 1/2k ≈ 1 for large k (i. since 1/2 + 1/22 + . we assume √ that r2 = 2 N and then: √ √ √ √ √ √ √ √ C(N ) = N C( N × 2 N ) + 2 N C( N ) + N C(2 N × N ) √ √ √ Since C( N × 2 N ) = 2C( N ). For a better layout. c 2009 .e.Part 2 . assume N = 2n and n = 2k . √ √ n/2  n/2  k n/2+n/22 +. √ SNB Clos network.58 Switching Architectures .. .+n/2k  n/2k  C(N ) = 6 N C( N ) = 6 2 C 2 =6 2 C 2 If we set k = log2 n.

based on recursive factorization • m parallel N × N Benes networks. Cantor network • SNB network.Paolo Giaccone 30 .Part 2 . each with 2 log2 N − 1 stages – we will show that Cantor network is SNB if m = log2 N • each input/output is connected to the corresponding m inputs/outputs of the Benes networks through a (1 × m)-multiplexer/(m × 1)-demultiplexer Switching Architectures . c 2009 .

c 2009 .Part 2 .Paolo Giaccone 31 . 8 × 8 Cantor network Switching Architectures .

c 2009 .e.Paolo Giaccone 32 .minimum complexity • A(k) (with 1 ≤ k ≤ log2 N ) is the (worst case) number of 2 × 2 modules in the Benes networks reachable without rearrangement at stage k by an input of the Cantor network – A(1) = m – A(2) = 2A(1) − 1 – A(3) = 2A(2) − 2 – A(k) = 2A(k − 1) − 2k−2 = 2k−1 A(1) − (k − 1)2k−2 • for the middle stage: A(log2 N ) = 21 N m − 41 (log2 N − 1)N . Cantor network . for SNB. which corresponds also the the number of central stage modules reachable by the outputs • in total N m/2 modules in the middle stage.m = log2 N ) is sufficient to provide SNB Switching Architectures . it is sufficient: 2A(log2 N ) > N m/2 ⇒ 2N m − (log2 N − 1)N > N m • m > log2 N − 1 (i..Part 2 .

minimum complexity Switching Architectures .Part 2 . c 2009 . Cantor network .Paolo Giaccone 33 .

c 2009 .Part 2 .Paolo Giaccone 34 . Cantor network • Complexity: C(N ) = N C(1 × m) + m(4N log2 N − 2N ) + N C(m × 1) = 2N m + m(4N log2 N − 2N ) = 2N log2 N + 4N (log2 N )2 − 2N log2 N ∼ 4N (log2 N )2 • Control: as SNB Clos Switching Architectures .

c 2009 .Part 2 .58 • SNB ⇒ C(N ) = 4N (log2 N )2 (Cantor) Switching Architectures .58 – SNB ⇒ C(N ) = 4N (log2 N )2.summary • p=2 – REAR ⇒ C(N ) = 4N log2 N (Benes) – SNB ⇒ C(N ) = Θ(N 1.Paolo Giaccone 35 . Recursive factorization .58 ) √ • p= N – REAR ⇒ C(N ) = 4N (log2 N )1.

then the switch has been built with the minimum theoretical number of modules – if log2 S ∼ log2 X . X = N ! N! – for a N × M switch (N > M ). Minimum number of modules • S is the total number of possible states of a switching network – if a switch is built with Cn modules n × n. equivalently. then S = (n!)Cn – if a switch is built with C2 modules 2 × 2. X = (N − M )! • Property: if a switch is non-blocking. log2 S ≥ log2 X – note that the viceversa does not hold – if S = X . c 2009 . then S = 2C2 • X is the total number of possible switching requests – for a N × N switch.Paolo Giaccone 36 . then S ≥ X or. then the switch is asymptotically optimal Switching Architectures .Part 2 .

Paolo Giaccone 37 . Ĉ2 Ĉ2 = log2 X ⇒ Ĉn = log2 n! • minimum complexity in terms of crosspoints Ĉnp 2 n log2 X Ĉ2p = 4 log2 X p Ĉn = log2 n! • in general. reducing the number of modules implies a more complex routing algorithm Switching Architectures .Part 2 . for n > 2. c 2009 . it is not easy to design and/or configure a network which satisfies the minimum number of modules • usually. Minimum number of modules • minimum number Ĉn of n × n modules to satisfy all the possible switching requests log2 S = log2 X ⇒ log2 (n!)Ĉn = log2 X log2 X Ĉn = log2 n! • with respect to a network with 2 × 2 modules.

44N • N × N crossbar N2 – number of possible states: Scrossbar = 2 ⇒ log2 Scrossbar = N 2 • N × N Benes network (asymptotically optimal) – number of possible states: SBenes = 2(2 log2 N −1)(N/2) = 2N log N −N/2 – taking logarithms: log2 SBenes = N log2 N − 0. Optimality for N × N switches built with 2 × 2 modules • using Stirling approximation (for N large enough):  N √ N N ! ∼ 2πN e 1 log2 N ! ∼ N log2 N − N log2 e + log2 N + 1. c 2009 .32 2 X = N! ⇒ log2 X ∼ N log2 N − 1.Part 2 .5N ∼ log2 X • N × N Cantor network – log2 SCantor = (log2 N ) log2 SBenes Switching Architectures .Paolo Giaccone 38 .

04858e+06 Switching Architectures . c 2009 .Part 2 .Paolo Giaccone N 39 . Complexity for N × N switches built with 2 × 2 modules 1e+08 1e+07 1e+06 100000 log10 S 10000 X Benes Cantor 1000 SNB REAR 100 Crossbar 10 1 4 16 64 256 1024 4096 16384 65536262144 1.

Switch with output grouping • switch N × N • outputs are divided into k groups (assume k is divisor of N ) – a single connection request from a specific input to any output inside a specific group • number of possible switching requests: N! X =   k N ! k • minimum number of 2 × 2 modules   N N N Ĉ2 ∼ N log2 N − 1. Ĉ2 ∼ N log2 N for k = N (usual switch) Switching Architectures .Part 2 .e. c 2009 .e...44 ∼ N log2 k k k k – i.Paolo Giaccone 40 .44N − k log2 − 1. Ĉ2 = 0 for k = 1 (null switch) – i.

c 2009 . Asymmetric switches • switch N × αN with α ∈ (0.Part 2 .t. 1) chosen s.Paolo Giaccone 41 . αN is integer • number of possible switching requests: N! N! X= = (N − αN )! ((1 − α)N )! • minimum number of 2 × 2 modules  Ĉ2 ∼ N log2 N − (1 − α)N log2 (1 − α)N ) ∼ αN log2 N • minimum complexity decreases by factor α with respect to the usual Benes network – this does not imply that such network is simple to design and/or configure! Switching Architectures .

Paolo Giaccone 42 . Banyan networks • self-routing N × N switch – header of the packet drives the routing path • complexity Θ(N log2 N ) • unique path from each input to each output • based on the Benes network input id output id self−routing binary string module 1 module 2 module 3 edge 1 edge 2 edge 3 edge 4 Switching Architectures .Part 2 . c 2009 .

Examples of Banyan networks Baseline network Banyan network Shuffle exchange (Omega) network Flip (inverse shuffle exchange) network Switching Architectures . c 2009 .Paolo Giaccone 43 .Part 2 .

e. Blocking in Banyan networks • Property: if self-routing addresses satisfy both conditions: – strictly monotone outputs. i. no idle inputs between any two active inputs then the self-routing is non-blocking • in general. i. Banyan networks are blocking – number of 2 × 2 modules in a Banyan network: (N/2) log2 N – number of possible states in a Banyan network: SBanyan = 2(N/2) log2 N – taking logarithm: 1 log2 SBanyan = (N/2) log2 N ∼ log2 N ! 2 • the probability that a random input-output permutation is non-blocking is 2−(N/2) log2 N which goes to zero very quickly by increasing N Switching Architectures .e.Part 2 . output destinations are increasing at the inputs – compact monotone inputs.Paolo Giaccone 44 . c 2009 .

c 2009 .Part 2 . Batcher-Banyan network Two switching phases: • (1) self-sorting Batcher network – transform any switching request into a non-blocking switching request for Banyan network • (2) self-routing Banyan network • final complexity = Θ(N (log2 N )2 ) Compact and monotone output destinations Sorting network Self routing network (Batcher) (Banyan) Switching Architectures .Paolo Giaccone 45 .