
Bridging in the Data Center With or Without Spanning Tree

BRKDCT-2927

BRKDCT-2927_c2

© 2009 Cisco Systems, Inc. All rights reserved.

Cisco Public

Overview
Transparent bridging (data plane)
Spanning Tree Protocol (control plane)
How it works, how it fails
Stability features
Application to DC design
Future of bridging


Transparent Bridging
Data Plane


Ethernet
Set the Expectations

Physical layer
Coax cable, repeaters, hubs

Broadcast medium
Any frame is seen by the whole LAN, unmodified

Plug and play (literally!)
No cooperation expected from the host

Protocols were developed with Ethernet behavior in mind


Transparent Bridging
Looks Like Ethernet for End Devices

Layer 2:
Terminates Layer 1
Can take decisions based on frame content

Transparent to Ethernet clients implies:
Create a broadcast domain
Forward frames unmodified
Be plug and play


Bridges Segment the Collision Domain


By Terminating Layer 1

[Figure: hosts A, B and C attached to a repeater, then the same hosts attached to a bridge]

Bridge: fewer collisions, full duplex possible


Bridges Filter Frames


By Taking Decisions Based on Frame Content
Bridges learn MAC addresses independently
Build a filtering database (not a routing table!)
Increase overall bandwidth available

[Figure: a bridge that has learned addresses A and B filters a frame with destination B]


Why Not a Routing Table?


Frames with unknown destination address *must* be flooded
=> need support for flooding

There is no cooperation from the hosts

No hierarchy in the MAC addresses
No subnet
Only host routes would be possible => not scalable


Extreme Hierarchical Network Example


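[Figure: a hierarchical network, 32 layers deep, connecting 4 billion hosts. Two points of the hierarchy carry the callouts "It might be acceptable to have 4 billion routes here" and "But not here".]

Routers: 3 summary routes per device
Bridges: 4 billion host routes per device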
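
The counts on this slide can be checked with a little arithmetic. The sketch below assumes the hierarchy is a binary tree with hosts at the leaves (an assumption; the slide only states 32 layers and 4 billion hosts). Under that assumption a router needs one summary per child plus one default route toward its parent, while a bridge needs one entry per host.

```python
# Assumption: the 32-layer hierarchy is a binary tree with hosts at the leaves.
LAYERS = 32
FANOUT = 2

hosts = FANOUT ** LAYERS          # number of leaves in the tree
routes_per_router = FANOUT + 1    # one summary per child + one default toward the parent
entries_per_bridge = hosts        # flat MAC addresses cannot be summarized

print(f"hosts: {hosts:,}")
print(f"routes per router: {routes_per_router}")
print(f"entries per bridge: {entries_per_bridge:,}")
```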

Forwarding Decision
Fundamental Difference Between Routing and Bridging

Routing:
If an entry exists in the forwarding table, forward
Else, drop

Bridging:
If an entry exists in the filtering database, drop the frame on every port except the learned one (filtering is only an optimization)
Else, flood
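
The bridging rule can be sketched as a small learning bridge. This is a minimal illustration, not a model of any real switch: the class name, port names and MAC labels are made up for the example, and aging of entries is left out.

```python
from typing import Dict, List, Set


class LearningBridge:
    """Toy transparent bridge: learns source addresses, filters or floods."""

    def __init__(self, ports: List[str]) -> None:
        self.ports = list(ports)
        self.fdb: Dict[str, str] = {}  # filtering database: MAC -> port

    def receive(self, in_port: str, src: str, dst: str) -> Set[str]:
        """Return the set of ports the frame is sent out of."""
        self.fdb[src] = in_port               # learn where the source lives
        out_port = self.fdb.get(dst)
        if out_port is None:
            # Unknown destination: flood on every port except the ingress.
            return {p for p in self.ports if p != in_port}
        if out_port == in_port:
            return set()                      # destination is on the same segment
        return {out_port}                     # known destination: filter


bridge = LearningBridge(["p1", "p2", "p3"])
first = bridge.receive("p1", src="A", dst="B")   # B unknown: flooded
second = bridge.receive("p2", src="B", dst="A")  # A was learned on p1
third = bridge.receive("p1", src="A", dst="B")   # B is now known on p2
```

Note that the frame is never dropped for lack of an entry: the database only narrows the set of output ports.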


No Routing Table Consequences


Routing: notion of location associated with addresses
Equal Cost Multipathing (ECMP), Reverse Path Forwarding Check (RPFC)

[Figure: hosts A and B connected through routers R1 and R2, with routes "To A" and "To B"]

Bridging: flooding requires a tree

[Figure: hosts A and B connected through bridges B1 and B2]

Failure to Provide a Tree Is Catastrophic


A loop will result in network-wide flooding
Can have an impact on CPU (low-end platforms)
No Time To Live (TTL) field in frames

[Figure: bridging loop]

Failure domain = bridging domain
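
Why a loop is catastrophic without a TTL can be seen in a small simulation. The sketch below floods one broadcast frame through a set of bridges and counts the copies in flight at each step; the topology and names are hypothetical, and address learning is deliberately ignored since a broadcast is always flooded.

```python
from typing import Dict, List, Optional, Set, Tuple


def flood_steps(links: List[Tuple[str, str]], start: str, max_steps: int) -> List[int]:
    """Return the number of frame copies in flight after each flooding step."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Each copy is (bridge holding the copy, bridge it arrived from).
    in_flight: List[Tuple[str, Optional[str]]] = [(start, None)]
    history: List[int] = []
    for _ in range(max_steps):
        next_flight = []
        for bridge, came_from in in_flight:
            for n in neighbors.get(bridge, ()):
                if n != came_from:            # flood everywhere except the ingress
                    next_flight.append((n, bridge))
        history.append(len(next_flight))
        in_flight = next_flight
        if not in_flight:                     # the flood has died out
            break
    return history


triangle = [("A", "B"), ("B", "C"), ("C", "A")]   # physical loop
tree = [("A", "B"), ("B", "C")]                    # one link blocked

loop_history = flood_steps(triangle, "A", max_steps=6)
tree_history = flood_steps(tree, "A", max_steps=6)
```

On the tree the flood stops after reaching the last bridge. On the triangle the copies never disappear; only the `max_steps` cap ends the simulation, because nothing in the frame limits its lifetime.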



So Why Bridging?
Some protocols require it
IP uses it: subnet concept linked to Layer 2

[Figure: hosts 172.28.192.1, .2, .3 and .4 on one subnet]


Extend a Subnet across Devices


For port density (not enough ports on a device)
For provisioning flexibility (add devices without changing L3 network configuration)
For redundancy (NIC teaming)
Virtual machine mobility

[Figure: one subnet extended across several devices, with hosts 172.28.192.1 and .2 through .7]

Section Summary
Bridging is complementary to routing
Bridging is flexible
Bridging's main weaknesses are:
Failure domain = bridging domain (not scalable)
A tree is required => no multipathing

Those limitations are caused by historic constraints in the data plane
STP not mentioned yet (control plane)


Spanning Tree Protocol


Control Plane


STP Goals
Enforce a tree (at all times)
Spanning eventually
In a plug and play fashion
Notify learning function of topology changes


STP Information
Bridges exchange information using Bridge Protocol Data Units (BPDUs)
This information can be compared
Bridges propagate a degraded version of the best information they ever received

[Figure: bridges A:1, B:2 and C:3 exchanging BPDUs; A's information starts at cost 0 and is relayed with higher costs (10, 12, 13). A is better than B, B is better than C.]
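
The idea of comparable information that degrades as it is relayed can be sketched with tuples. This is a simplified model, assuming a BPDU reduced to three fields compared in order (root identifier, cost to the root, sender identifier) with lower values being better; real BPDUs carry more fields, and the names here are chosen for the example.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    root_id: int     # lower is better
    root_cost: int   # cost to reach the root, lower is better
    sender_id: int   # tie breaker


def relay(best: Bpdu, my_id: int, link_cost: int) -> Bpdu:
    """Return the degraded copy a bridge propagates: same root, higher cost."""
    return Bpdu(best.root_id, best.root_cost + link_cost, my_id)


# Each bridge initially claims to be the root.
a_own = Bpdu(root_id=1, root_cost=0, sender_id=1)
b_own = Bpdu(root_id=2, root_cost=0, sender_id=2)
c_own = Bpdu(root_id=3, root_cost=0, sender_id=3)

best_in_network = min([c_own, b_own, a_own])      # tuple comparison, field by field
relayed_by_b = relay(a_own, my_id=2, link_cost=10)
```

Here `relayed_by_b` is worse than what A sends itself, but still better than anything B or C could originate, which is why every bridge ends up agreeing on A as the root.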

STP Strategy For Building a Tree


The bridge with the best information is the root
A bridge keeps its best path to the root forwarding
Alternate paths to the root are blocked

Root bridge: best information in the network
Root port: best path to the root
Alternate port: alternate path to the root
Designated port: best information on this segment

[Figure: root bridge A with bridges B and C; each port is marked as root, alternate or designated]
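
The role assignment can be sketched for a single bridge. The function below is an illustration under simplifying assumptions, not the standard's state machine: information is a (root, cost, sender) tuple with lower values better, every link has the same hypothetical cost, and `None` stands for a port on which no BPDU has been received.

```python
from typing import Dict, NamedTuple, Optional


class Info(NamedTuple):
    root_id: int     # lower is better
    root_cost: int
    sender_id: int


def port_roles(my_id: int, received: Dict[str, Optional[Info]],
               link_cost: int = 10) -> Dict[str, str]:
    """Assign a role to each port from the best BPDU received on it."""
    own = Info(my_id, 0, my_id)
    # Cost of reaching the root through each port that heard a BPDU.
    paths = {port: Info(i.root_id, i.root_cost + link_cost, i.sender_id)
             for port, i in received.items() if i is not None}
    best_port = min(paths, key=paths.get) if paths else None
    if best_port is None or own < paths[best_port]:
        return {port: "designated" for port in received}   # this bridge is the root

    # Information this bridge advertises on its other ports.
    advertised = Info(paths[best_port].root_id, paths[best_port].root_cost, my_id)
    roles = {}
    for port, info in received.items():
        if port == best_port:
            roles[port] = "root"
        elif info is None or advertised < info:
            roles[port] = "designated"    # best information on this segment
        else:
            roles[port] = "alternate"     # someone else is better: block
    return roles


# Triangle A(1), B(2), C(3), with A as the root.
roles_a = port_roles(1, {"to_b": Info(1, 10, 2), "to_c": Info(1, 10, 3)})
roles_b = port_roles(2, {"to_a": Info(1, 0, 1), "to_c": Info(1, 10, 3)})
roles_c = port_roles(3, {"to_a": Info(1, 0, 1), "to_b": Info(1, 10, 2)})
roles_c_silent = port_roles(3, {"to_a": Info(1, 0, 1), "to_b": None})
```

The last call shows the fragile part of the scheme: if C hears nothing on the port facing B, that port becomes designated and the only blocked port in the triangle is gone.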



STP Stability: What Can STP Do Wrong?


Failure to create a spanning topology
Loss of connectivity. Local issue, simple to troubleshoot, similar to most L3 failures.

Failure to create a tree topology, i.e. introducing a loop
The real issue!

Failure to notify the learning function
Temporary black holing for some addresses


How Can STP Open a Loop?


Fundamental difference between bridging and routing:
Router: no control message => no forwarding
Bridge: no control message => no blocking

A port that fails to receive BPDUs goes designated (forwarding)
Most STP failures are related to BPDUs being lost or not acted upon
Extra care must be taken before putting a designated port to forwarding


Unidirectional Link Failure


BPDUs Lost
A link only transmits traffic in one direction
BPDUs are dropped
A clockwise loop opens

[Figure: bridges A, B and C in a triangle with port roles and costs 10, 12 and 13. Callouts: "BPDU lost because of unidirectional link failure" and "BPDU ignored by B (worse information)"; the resulting loop is marked.]

Brain Dead Bridge


BPDUs Ignored
C does not process BPDUs (CPU)
C still forwards traffic (ASIC)
Traffic loops in both directions

[Figure: bridges A, B and C in a triangle; BPDUs are ignored and not relayed by C, and the loop is marked]


Layer 2 Features and STP Enhancements


EtherChannels
BPDUguard, RootGuard
Dispute mechanism
Bridge Assurance


EtherChannel
Minor Change in the Data Plane
Bundle several physical links into a logical one
No blocked port (redundancy not handled by STP)
Per-frame (not per-VLAN) load balancing

Control protocols like PAgP (Port Aggregation Protocol) and LACP (Link Aggregation Control Protocol) handle the bundling process and monitor the health of the link
Limited to parallel links between two switches

[Figure: parallel links between switches A and B, shown with and without bundling; the channel looks like a single link to STP]
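
Per-frame load balancing over a bundle can be sketched as a hash over frame fields. The hash below (CRC32 over the source and destination MAC) is an assumption made for the illustration; actual platforms use their own hash inputs and functions, and the link names are hypothetical.

```python
import zlib
from typing import List


def pick_member(links: List[str], src_mac: str, dst_mac: str) -> str:
    """Choose the physical link of the bundle that carries this frame."""
    if not links:
        raise ValueError("the channel has no active member link")
    key = f"{src_mac}->{dst_mac}".encode()
    # The same address pair always maps to the same link, so frames of one
    # conversation are not reordered.
    return links[zlib.crc32(key) % len(links)]


bundle = ["p1", "p2"]
chosen = pick_member(bundle, "00:00:00:00:00:0a", "00:00:00:00:00:0b")
chosen_again = pick_member(bundle, "00:00:00:00:00:0a", "00:00:00:00:00:0b")

# If a member link fails, traffic is rehashed over the survivors without
# any STP reconvergence: STP still sees one logical link.
after_failure = pick_member(["p2"], "00:00:00:00:00:0a", "00:00:00:00:00:0b")
```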

Rootguard/BPDUguard
Enforce a Policy
Rootguard: prevents a port from accepting better information
BPDUguard: shuts down a port that receives a BPDU
Not stability features per se
Enforce a security policy
Restrict STP's freedom
Trade-off between stability and connectivity
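
Both guards amount to a policy check applied when a BPDU arrives. The sketch below is a simplified illustration: the function name and the returned action strings are made up for the example, and information is again modeled as a (root, cost, sender) tuple with lower values better.

```python
from typing import NamedTuple


class Info(NamedTuple):
    root_id: int     # lower is better
    root_cost: int
    sender_id: int


def on_bpdu(guard: str, advertised: Info, received: Info) -> str:
    """Return the action a guarded port takes when a BPDU arrives."""
    if guard == "bpduguard":
        return "shutdown"      # any BPDU at all violates the policy
    if guard == "rootguard" and received < advertised:
        return "block"         # better information is refused
    return "accept"            # plain STP behavior


mine = Info(root_id=1, root_cost=10, sender_id=2)
better = Info(root_id=0, root_cost=0, sender_id=9)
worse = Info(root_id=1, root_cost=20, sender_id=9)
```

The trade-off from the slide is visible here: a port with `rootguard` gives up connectivity through a legitimately better bridge in exchange for a predictable root placement.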


Dispute Mechanism
Protects Against Unidirectional Links
There can be only one designated port on a LAN
RSTP (Rapid Spanning Tree) and MST (Multiple Spanning Trees) advertise a role in their BPDUs
A designated port with worse information is a problem
No loop!

[Figure: the triangle of the unidirectional link example, with labels Designated:12 and Designated:13. Callouts: "BPDU lost because of unidirectional link failure" and "Worse designated BPDU: B detects a dispute"; B's port is marked as disputed.]

Dispute Mechanism
Protects Against Bundling Errors
A channel is a single logical link from STP's perspective
A single BPDU is sent on a single physical port

[Figure: bridges A:10 and B:20 joined by links p1 and p2, bundled as po1 on A. Panel "Without Dispute Mechanism": cases "p1 & p2 bundled on B" and "p1 & p2 not bundled on B", the latter with a half loop. Panel "With Dispute Mechanism": Designated:10 on p1, Designated:12 on p2; callout "Worse designated BPDU: A detects a dispute"; po1 is disputed.]
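
In both dispute scenarios the check is the same: a port that believes it is designated receives a BPDU that also claims the designated role but carries worse information. A minimal sketch follows, assuming a BPDU reduced to (root, cost, sender) plus the advertised role; the function name and result strings are illustrative only.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    root_id: int     # lower is better
    root_cost: int
    sender_id: int
    role: str        # role advertised by the sending port


def check_designated_port(mine: Bpdu, received: Bpdu) -> str:
    """React to a BPDU received on a port that is currently designated."""
    if received[:3] < mine[:3]:
        return "yield"        # the peer is better: normal STP, stop being designated
    if received.role == "designated":
        return "disputed"     # the peer is worse yet claims designated: stop forwarding
    return "forwarding"       # the peer is a root or alternate port, as expected


b_port = Bpdu(root_id=1, root_cost=10, sender_id=2, role="designated")
worse_designated = Bpdu(root_id=1, root_cost=13, sender_id=3, role="designated")
worse_root = Bpdu(root_id=1, root_cost=13, sender_id=3, role="root")
better_designated = Bpdu(root_id=1, root_cost=5, sender_id=3, role="designated")
```

A peer that sends worse designated information has evidently not heard this port's BPDUs, which is exactly the symptom of a unidirectional link or a bundling mismatch.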


Bridge Assurance
Identify and configure network ports vs. edge ports
On p2p network ports:
Send periodic BPDUs, regardless of role
Expect periodic BPDUs, regardless of role
If no BPDU is received, the port goes inconsistent

[Figure: bridges A:10 and B:20. A network port sends periodic BPDUs (Designated:10) and expects BPDUs; B answers with Root:12, a worse root BPDU that does not trigger a dispute. An edge port does not expect BPDUs.]

Bridge Assurance
The Ultimate Brain Dead Detection Mechanism
Introduces a behavior closer to L3:
A network port with no peer does not transmit traffic

[Figure: a brain dead bridge connected to bridges A and B; the ports facing it are Bridge Assurance inconsistent (no BPDU received)]
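
The Bridge Assurance rule can be sketched as a timeout check. The timeout constant, function name and state strings below are assumptions for the illustration; the slides do not give the actual timer values.

```python
from typing import Optional

BA_TIMEOUT = 6.0  # seconds; hypothetical value chosen for this sketch


def port_state(port_type: str, seconds_since_bpdu: Optional[float]) -> str:
    """Return the state of a port under Bridge Assurance.

    seconds_since_bpdu is None when no BPDU was ever received on the port.
    """
    if port_type == "edge":
        return "forwarding"        # edge ports do not expect BPDUs
    if seconds_since_bpdu is None or seconds_since_bpdu > BA_TIMEOUT:
        return "inconsistent"      # silent peer: do not transmit traffic
    return "per STP role"          # peer is alive: STP decides as usual


healthy = port_state("network", seconds_since_bpdu=1.5)
brain_dead_peer = port_state("network", seconds_since_bpdu=30.0)
never_heard = port_state("network", seconds_since_bpdu=None)
host_port = port_state("edge", seconds_since_bpdu=None)
```

This inverts the default discussed earlier: on a network port, silence now leads to blocking instead of forwarding.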


Data Center Network Design Examples


STP Features at Work


Redundancy Handled by STP


[Figure: data center design with redundancy handled by STP. Legend: N = network port, E = edge port, B = BPDUguard, R = Rootguard, L = Loopguard; a normal port type is also listed. Tiers: Data Center Core, Aggregation (STP root and backup root, HSRP active and standby) and Access. Layer labels: Layer 3; Layer 2 (STP + Bridge Assurance); Layer 2 (STP + BA + Rootguard); Layer 2 (STP + BPDUguard). Aggregation ports are marked N or N R, access uplinks N (some with L), access edge ports E B.]


Protecting Against Access Failures
Where Can a Loop Be Open?

The access layer is blocking the loops
A loop can only be open if an access bridge puts both its uplinks to forwarding

[Figure: two aggregation switches (ports marked N and N R) above an access switch with two uplinks (N); one uplink carries the callout "This port could introduce a loop". Legend: N = network port, R = Root Guard.]


An Uplink Must Go to Designated Role For a Loop to Occur


Only root ports and designated ports can be forwarding
There is at most one root port per bridge
This means that a loop can only be open if an access uplink takes the designated role

[Figure: two aggregation/access examples, each with a loop and at least one designated uplink]

Protecting Against Access Failures
Designated Silent Access Uplink

Uplink is designated
Uplink does not send BPDUs
Bridge Assurance prevents the loop

[Figure: aggregation ports marked N R (network port, Root Guard) facing an access switch whose uplink is a designated port (problem); Bridge Assurance blocks the aggregation port]


Protecting Against Access Failures
Designated Access Uplink, Worse BPDU

Uplink is designated
Uplink sends worse designated information
Dispute mechanism prevents the loop

[Figure: aggregation ports marked N R (network port, Root Guard) facing an access switch whose uplink is a designated port (problem) sending worse information; the dispute mechanism blocks the aggregation port]


Protecting Against Access Failures
Designated Access Uplink, Better BPDU

Uplink is designated
Uplink sends better designated information
Root Guard forbids this scenario

[Figure: aggregation ports marked N R (network port, Root Guard) facing an access switch whose uplink is a designated port (problem) sending better information; Root Guard blocks the aggregation port]


Protecting Against Access Failures
Two Root Access Uplinks

Two root ports on a bridge would be a severe bug
There is a limit to what can be done in the control plane

[Figure: aggregation ports marked N R (network port, Root Guard) facing an access switch with a second root port (problem)]


Virtual Port Channel (vPC)


Introduces some changes to the data plane
Provides load balancing
Does not rely on STP for redundancy
Limited to a pair of switches

[Figure: three views of a dual-homed switch: redundancy handled by STP (one blocked port), redundancy handled by vPC (vPC domain), and the STP view of vPC]

vPC Data Center Example


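[Figure: the same data center design with a vPC domain at the aggregation layer. Legend: N = network port, E = edge port, B = BPDUguard, R = Rootguard, L = Loopguard; a normal port type is also listed. Tiers: Data Center Core, Aggregation (vPC domain, STP root and backup root, HSRP active and standby) and Access. Layer labels: Layer 3; Layer 2 (STP + Bridge Assurance); Layer 2 (STP + BA + Rootguard); Layer 2 (STP + BPDUguard). Aggregation ports are marked N or N R, access uplinks N (one with L), access edge ports E B.]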


Fixing STP Problems


By Fixing the Data Plane


Mac-in-Mac (802.1ah) Model


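Introduced for service providers
Create more services
Solve MAC address table scalability issues

[Figure: hosts A and B in user space, connected through provider bridges to Backbone Edge Bridges. The edge bridges hold the mappings A -> X and B -> Y; the user frame (A, B) crosses the backbone space encapsulated as (X, Y)(A, B) through Backbone Bridges (X, W, Z, Y) and leaves as (A, B).]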


Mac-in-Mac Scalability
Backbone Edge Bridges (BEB) are able to:
map MAC addresses between user and backbone spaces
encapsulate/decapsulate frames

BEBs only need to learn a subset of the MAC addresses
Backbone Bridges are regular bridges
They only see backbone space addresses
Now, let's assume that the backbone bridges are not bridges but new special devices
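
The mapping and encapsulation roles of a BEB can be sketched with two frame types. This is a deliberately reduced model: the class and field names are invented for the example, the real 802.1ah header carries more than two addresses, and flooding of unknown destinations is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserFrame:
    src: str          # user-space MAC address
    dst: str
    payload: bytes


@dataclass(frozen=True)
class BackboneFrame:
    b_src: str        # backbone address of the ingress BEB
    b_dst: str        # backbone address of the egress BEB
    inner: UserFrame  # user frame carried unchanged


class BackboneEdgeBridge:
    def __init__(self, address: str) -> None:
        self.address = address
        self.remote: Dict[str, str] = {}   # user MAC -> backbone address

    def encapsulate(self, frame: UserFrame) -> Optional[BackboneFrame]:
        b_dst = self.remote.get(frame.dst)
        if b_dst is None:
            return None   # unknown destination; flooding is omitted in this sketch
        return BackboneFrame(self.address, b_dst, frame)

    def decapsulate(self, frame: BackboneFrame) -> UserFrame:
        # Learn which BEB the user source sits behind.
        self.remote[frame.inner.src] = frame.b_src
        return frame.inner


x, y = BackboneEdgeBridge("X"), BackboneEdgeBridge("Y")
x.remote["B"] = "Y"                       # mapping B -> Y already learned on X
original = UserFrame("A", "B", b"hello")
in_backbone = x.encapsulate(original)
delivered = y.decapsulate(in_backbone)
```

Backbone bridges between X and Y would only ever look at `b_src` and `b_dst`, so their tables scale with the number of edge bridges and not with the number of hosts.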


Application: Routing Backbone Frames


Backbone addresses are limited in number =>
They can be propagated by a control protocol
A routing table is possible in the backbone!

[Figure: hosts A and B in user space, attached to backbone devices X and Y (with W in between); next generation bridges in the backbone space hold routes "To X" and "To Y"]

ECMP, RPFC etc. now possible in the backbone
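
Once every backbone device knows the topology, computing several equal-cost next hops becomes straightforward. The sketch below assumes unit link costs and a hypothetical four-node backbone; it uses a breadth-first search from the destination, which is one simple way to do it and not a description of any specific protocol.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def ecmp_next_hops(links: List[Tuple[str, str]], source: str,
                   destination: str) -> Set[str]:
    """Return every neighbor of source that lies on a shortest path."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Hop count from every node to the destination.
    dist = {destination: 0}
    queue = deque([destination])
    while queue:
        node = queue.popleft()
        for n in neighbors.get(node, ()):
            if n not in dist:
                dist[n] = dist[node] + 1
                queue.append(n)

    if source not in dist:
        return set()   # destination unreachable: a router drops, it does not flood
    return {n for n in neighbors.get(source, ()) if dist.get(n) == dist[source] - 1}


# Two equal-length paths between X and Y: through W and through Z.
backbone = [("X", "W"), ("W", "Y"), ("X", "Z"), ("Z", "Y")]
next_hops = ecmp_next_hops(backbone, "X", "Y")
```

With a spanning tree one of the two paths would be blocked; with a routing table both can carry traffic.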



Adding a TTL
Frames are encapsulated unchanged in a new frame format in the backbone
The encapsulation can carry a TTL
A link state protocol allows determining the exact hop count

[Figure: the user frame (A, B) enters the backbone at X with the mapping A -> X and TTL 2; it travels as (X, Y) with the TTL shown as 2, then 1, and leaves the backbone as (A, B)]
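
The effect of a TTL in the encapsulation can be sketched as follows. The convention used here is an assumption for the example: each transit device decrements the TTL and discards the frame when the result reaches zero. Field and function names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class BackboneFrame:
    b_src: str
    b_dst: str
    ttl: int
    inner: bytes   # encapsulated user frame, carried unchanged


def forward(frame: BackboneFrame) -> Optional[BackboneFrame]:
    """Transit behavior: decrement the TTL, discard the frame at zero."""
    if frame.ttl <= 1:
        return None
    return replace(frame, ttl=frame.ttl - 1)


def hops_in_loop(frame: BackboneFrame) -> int:
    """Count how many times a frame caught in a loop is forwarded."""
    hops = 0
    current = forward(frame)
    while current is not None:
        hops += 1
        current = forward(current)
    return hops


frame = BackboneFrame("X", "Y", ttl=2, inner=b"user frame")
after_one_hop = forward(frame)
looping = hops_in_loop(BackboneFrame("X", "Y", ttl=5, inner=b"user frame"))
```

Unlike the plain Ethernet frame, a looping backbone frame is forwarded a bounded number of times, so a transient loop no longer floods the network indefinitely.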

Upcoming Technologies
By introducing a new data plane in the backbone, the advantages of Layer 3 can be added to Layer 2
The backbone addresses are not seen by L2 users; they represent a location, aggregating several devices

[Figure: PC A in user space behind backbone device X. Global PC A address = X.A, i.e. backbone address (location) followed by MAC address (ID).]

The plug and play aspect of L2 can be maintained


Data Center Ethernet (DCE), TRILL (Transparent Interconnection of Lots of Links)

Goal: replace the current transparent bridging model
Add multipathing
Introduce L3-like stability for bridging

New frame format, using a compact backbone address to minimize overhead
Note: DCE offers other properties (like lossless Ethernet) not relevant to this presentation


Conclusion
L2 is desirable for its flexibility (as a complement to L3)
Transparent bridging has some scalability issues
Several stability features have been developed in the control plane => they will never be enough to match L3
The final solution will be injecting L3 elements into the data plane


References, Related Sessions


BRKDCT-2961, Evolution of Hierarchical Network Design for the Data Center
BRKDCT-2981, Overview of L2MP technologies
Data Center Design: IP Network Infrastructure
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DC_3_0/DC-3_0_IPInfra.html


Interested in Data Center?


Discover the Data Center of the Future
Cisco booth: #617
See a simulated data center and discover the benefits, including investing to save, energy efficiency and innovation.

Data Center Booth
Come by and see what's happening in the world of Data Center: demos, social media activities, bloggers, author signings
Demos include:
Unified Computing Systems
Cisco on Cisco Data Center Interactive Tour
Unified Service Delivery for Service Providers
Advanced Services

Interested in Data Center?


Data Center Super Session
Data Center Virtualization Architectures, Road to Cloud Computing (UCS)
Wednesday, July 1, 2:30-3:30 pm, Hall D
Speakers: John McCool and Ed Bugnion

Panel: 10 Gig LOM
Wednesday 08:00 AM Moscone S303

Panel: Next Generation Data Center
Wednesday 04:00 PM Moscone S303

Panel: Mobility in the DC Data
Thursday 08:00 AM Moscone S303

Please Visit the Cisco Booth in the World of Solutions


See the technology in action
Data Center and Virtualization
DC1 Cisco Unified Computing System
DC2 Data Center Switching: Cisco Nexus and Catalyst
DC3 Unified Fabric Solutions
DC4 Data Center Switching: Cisco Nexus and Catalyst
DC5 Data Center 3.0: Accelerate Your Business, Optimize Your Future
DC6 Storage Area Networking: MDS
DC7 Application Networking Systems: WAAS and ACE


Complete Your Online Session Evaluation


Give us your feedback and you could win fabulous prizes. Winners announced daily.
Receive 20 Passport points for each session evaluation you complete.
Complete your session evaluation online now (open a browser through our wireless network to access our portal) or visit one of the Internet stations throughout the Convention Center.

Don't forget to activate your Cisco Live Virtual account for access to all session material, communities, and on-demand and live activities throughout the year. Activate your account at the Cisco booth in the World of Solutions or visit www.ciscolive.com.

Appendix: LoopGuard
Transition a Port to Designated Under Scrutiny
Initial state: p1 on bridge A (A:10) is designated and sends the best information to p2 on bridge B (B:20)

Case 1: p1 stops sending BPDUs
p2 ages out p1's information and becomes designated
p2's transition to forwarding is prevented by LoopGuard

Case 2: p1 starts sending worse information (A:30)
p2 becomes designated
The transition is authorized by LoopGuard
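
The two cases differ only in why p2 stops believing p1: silence versus an explicit worse BPDU. A minimal sketch of that decision follows, using the priorities from the slide (lower is better); the function name and result strings are made up for the illustration.

```python
from typing import Optional


def loopguard_decision(my_priority: int, received_priority: Optional[int]) -> str:
    """Decide what a root or alternate port protected by LoopGuard does next.

    received_priority is None when the stored information aged out because
    no BPDU arrived any more.
    """
    if received_priority is None:
        return "blocked by LoopGuard"   # silence is not proof that the peer is worse
    if my_priority < received_priority:
        return "designated"             # the peer explicitly sent worse information
    return "unchanged"                  # the peer is still better


steady_state = loopguard_decision(my_priority=20, received_priority=10)   # A:10
p1_silent = loopguard_decision(my_priority=20, received_priority=None)
p1_worse = loopguard_decision(my_priority=20, received_priority=30)       # A:30
```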
