
Advanced VPC Operation and Troubleshooting

BRKCRS-3146

Dmitry Goloubev Technical Leader, Tech services

Follow us on Twitter for real time updates of the event:

@ciscoliveeurope, #CLEUR

Housekeeping

We value your feedback: don't forget to complete your online session evaluations after each session, and the Overall Conference Evaluation, which will be available online from Thursday
Visit the World of Solutions and Meet the Engineer
Visit the Cisco Store to purchase your recommended readings
Please switch off your mobile phones
After the event, don't forget to visit Cisco Live Virtual: www.ciscolivevirtual.com


2011 Cisco and/or its affiliates. All rights reserved.

Cisco Public

Goals
Understand general concepts of the Virtual Port Channel (VPC) feature on Nexus 7000
Review the impact of VPC on bridging and routing
Learn how to troubleshoot VPC


At the network level, VPC makes it possible to build a PortChannel to 2 separate switches, virtualizing the network building block
[Figure: the same topology shown without VPC, with VPC, and as its logical equivalent]

No blocked ports, more usable bandwidth, load-sharing
Distribution switch or link failure does not mean reconvergence


VPC components at a glance


2 active control planes
2 configs, 2 points of management
2 active data planes
Primary-Secondary notion for some aspects of operation
Control messages and data frames flow between the two peers via the Peer-Link
Peer-Link is an 802.1Q trunk
Control messages are carried by CFS over the Peer-Link

[Figure: VPC domain with a Primary and a Secondary peer, each with an active control plane and an active data plane, connected by the Peer-Link and the Peer Keepalive link, with a VPC toward the attached device]


Agenda

Initialization & Redundancy considerations
Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations


Stages of VPC initialization


1. VPC manager starts
2. Peer-keepalive comes up (receives keepalives from the peer)
3. Peer-link comes up (data is not passing through yet, just CFS)
4. Primary/Secondary Role resolved
5. Global Consistency check
6. Peer-link is up for data
7. SVIs brought up (VPC + 10 sec)
8. VPCs brought up (SVI + 30 sec)

16:34:06 %VPC-5-VPCM_ENABLED: vPC Manager enabled
16:34:07 %VPC-5-PEER_KEEP_ALIVE_STATUS: In domain 2, peer keep-alive status changed to enabled
16:34:17 %ETHPORT-5-IF_UP: Interface port-channel2 is up in Layer3
16:34:19 %VPC-3-VPC_PEER_LINK_BRINGUP_FAILED: vPC peer-link bringup failed (vPC peer is not reachable over cfs)
16:34:19 %ETHPORT-5-IF_UP: Interface port-channel1 is up in mode trunk
16:34:23 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to primary
16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started
16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans
16:34:33 %INTERFACE_VLAN-5-UPDOWN: Line Protocol on Interface vlan 4, changed state to up
16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs
16:34:41 %VPC-3-VPC_BRINGUP_FAILED: vPC 102 bringup failed (Peer-link state is not UP)
16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs
16:35:13 %VPC-5-VPC_UP: vPC 102 is up
16:35:13 %ETHPORT-5-IF_UP: Interface port-channel102 is up in mode trunk

In this log port-channel2 is the Peer-Keepalive link and port-channel1 is the Peer-Link
Timers are adjustable in VPC domain configuration context
  SVI: delay restore interface-vlan
  VPC: delay restore
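The two delay timers can be read straight off the syslog timestamps in the bring-up sequence above. The following is a minimal Python sketch, assuming the log lines are available as plain text in the form shown; the helper name `timer_durations` is hypothetical and not part of NX-OS.

```python
from datetime import datetime

# Syslog excerpt copied from the bring-up sequence above.
LOG = """\
16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started
16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans
16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs
16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs
"""


def timer_durations(log_text):
    """Return {timer_name: seconds} for every *_TIMER_START / *_TIMER_EXPIRED pair."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, _, rest = line.partition(" ")
        mnemonic = rest.split(":", 1)[0]       # e.g. %VPC-5-VPC_RESTORE_TIMER_START
        name = mnemonic.split("-", 2)[-1]      # e.g. VPC_RESTORE_TIMER_START
        when = datetime.strptime(stamp, "%H:%M:%S")
        if name.endswith("_TIMER_START"):
            started[name[: -len("_START")]] = when
        elif name.endswith("_TIMER_EXPIRED"):
            key = name[: -len("_EXPIRED")]
            if key in started:
                durations[key] = int((when - started[key]).total_seconds())
    return durations


print(timer_durations(LOG))
```

For this excerpt the sketch reports 10 seconds for the SVI bring-up delay and 30 seconds for the VPC restore timer, matching stages 7 and 8.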

VPC consistency checking


Certain configuration mistakes could lead to loops or blackholing [when STP config is inconsistent]
Others might cause undesirable forwarding implications on specific interfaces [inconsistent ACL, SVIs]
Consistency checking prevents network-wide issues (type 1) and warns about possible forwarding oddities (type 2)

Inconsistency Type: Type 1 / Global
  Action: Vlans suspended on peer-link, VPCs up with respective vlans suspended
  Example of inconsistency: Rapid-PVST STP on one peer, MST STP on another
Inconsistency Type: Type 1 / Interface
  Action: Vlans suspended on respective VPC
  Example of inconsistency: MTU mismatch, STP guard config mismatch
Inconsistency Type: Type 2
  Action: Syslog message
  Example of inconsistency: SVI is up on one peer, down on another

Nexus# sh vpc consistency-parameters global
Name                                     Type  Local Value                 Peer Value
---------------------------------------  ----  --------------------------  --------------------------
STP Mode                                 1     Rapid-PVST                  Rapid-PVST
STP Disabled                             1     None                        None
STP MST Region Name                      1     ""                          ""
STP MST Region Revision                  1     0                           0
STP MST Region Instance to VLAN Mapping  1
STP Loopguard                            1     Disabled                    Disabled
STP Bridge Assurance                     1     Enabled                     Enabled
STP Port Type, Edge BPDUFilter,          1     Normal, Disabled, Disabled  Normal, Disabled, Disabled
  Edge BPDUGuard
STP MST Simulate PVST                    1     Enabled                     Enabled
Interface-vlan admin up                  2     101                         101
Interface-vlan routing                   2     1,101                       1,101

Nexus# sh vpc consistency-parameters interface port-channel 1
Name                   Type  Local Value  Peer Value
---------------------  ----  -----------  -----------
lag-id                 1     [(7f9b, ...  [(7f9b, ...
mode                   1     active       active
STP Port Type          1     Default      Default
STP Port Guard         1     None         None
STP MST Simulate PVST  1     Default      Default
Native Vlan            1     1            1
Port Mode              1     trunk        trunk
MTU                    1     1500         1500
Duplex                 1     full         full
Speed                  1     10 Gb/s      10 Gb/s
Allowed VLANs          -     101          101
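The type/action rules on this slide can be expressed as a small comparison over (name, type, local value, peer value) rows. This is an illustrative Python sketch only: the function `check_consistency` and the row format are assumptions made for the example, and the sample mismatches are taken from the "Example of inconsistency" column rather than from real device output.

```python
# Action taken per inconsistency type, as listed on the slide.
ACTIONS = {
    "type1_global": "vlans suspended on peer-link, VPCs up with respective vlans suspended",
    "type1_interface": "vlans suspended on respective VPC",
    "type2": "syslog message",
}


def check_consistency(params, scope):
    """Compare local and peer values.

    params: list of (name, type, local_value, peer_value) rows
    scope:  "global" or "interface" (which consistency table the rows came from)
    Returns a list of (name, category, action) for every mismatch.
    """
    findings = []
    for name, ptype, local, peer in params:
        if local == peer:
            continue  # consistent parameter, nothing to report
        category = "type2" if ptype == 2 else f"type1_{scope}"
        findings.append((name, category, ACTIONS[category]))
    return findings


# Hypothetical mismatches modelled on the slide's examples.
SAMPLE_GLOBAL = [
    ("STP Mode", 1, "Rapid-PVST", "MST"),
    ("STP Loopguard", 1, "Disabled", "Disabled"),
    ("Interface-vlan admin up", 2, "101", ""),
]

for finding in check_consistency(SAMPLE_GLOBAL, "global"):
    print(finding)
```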

Graceful Consistency check


VPC Type 1 inconsistency suspends all vlans on the corresponding VPC on both peers
This triggers a forwarding interruption during config changes (for example while changing MTU on a VPC)
As of 4.2(8) and 5.2(1) VPC supports Graceful Consistency Check
Graceful consistency check brings down interfaces on the secondary peer upon inconsistency; the primary peer keeps forwarding traffic
Enabled by default

Nexus(config-vpc-domain)# graceful consistency-check

Nexus# show vpc brief
vPC domain id                : 1
Peer status                  : peer adjacency formed ok
vPC keep-alive status        : peer is alive
vPC role                     : secondary
...
Graceful Consistency Check   : Enabled

vPC status
----------------------------------------------------------------------------
id     Port     Status  Consistency  Reason       Active vlans
------ -------- ------- ------------ ------------ ------------
1      Po1      down*   failed       vPC type-1   2-10

VPC behavior at initialization


Peer-Keepalives must be heard before the Peer-Link is brought up
VPC control plane must be able to communicate with the peer over the peer-link
  Negotiate LACP/STP operating roles for the chassis
  Wait for per-port peer parameters and handshake to bring up vPC ports
Performs peer parameters consistency check on each VPC bringup
Will not bring up VPCs if only one of the two VPC peers comes up (for example after a power outage)


VPC Reload Restore

Allows VPCs to be brought up after a timeout if the peer is presumed dead

Default timeout 360 sec


Assumes primary role for STP and LACP
Nexus(config)# vpc domain 1
Nexus(config-vpc-domain)# reload restore ?
  <CR>
  delay  Duration to wait before assuming peer dead and restoring vpcs
Nexus(config-vpc-domain)# reload restore delay ?
  <240-3600>  Time-out for restoring vPC links (in seconds)


VPC auto-recovery (replaces Reload-Restore as of NXOS 5.2.1)


Auto-recovery addresses cases of multiple failures. For example:
  Peer-link fails and after a while the primary switch (or keepalive link) fails
  Both VPC peers are reloaded and only one comes back up

How it works
  If the Peer-link is down on the secondary switch, 3 consecutive missing peer-keepalives will trigger auto-recovery
  After reload (role is "none established"), if the auto-recovery timer (240 sec) expires while peer-link and peer-keepalive are still down, auto-recovery kicks in
  Switch assumes primary role
  VPCs are brought up bypassing consistency checks

Nexus(config)# vpc domain 1
Nexus(config-vpc-domain)# auto-recovery
Nexus# sh vpc | i recovery
Auto-recovery status : Enabled (timeout = 240 seconds)

Failure types compared for Reload restore and Auto recovery:
  After reload only a single peer comes up
  Peer-link fails, then eventually the complete primary switch fails

Troubleshooting VPC: initialization


Always start with sh vpc: it gives ~90% of all information needed for initial situation assessment

vpc1# sh vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                   : 1
Peer status                     : peer adjacency formed ok
vPC keep-alive status           : peer is alive
Configuration consistency status: success
Type-2 consistency reason       : Consistency Check Not Performed
vPC role                        : primary
Number of vPCs configured       : 1
Peer Gateway                    : Disabled
Dual-active excluded VLANs      : -

vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po100  up     1,101

vPC status
----------------------------------------------------------------------
id     Port     Status  Consistency  Reason    Active vlans
------ -------- ------- ------------ --------- ------------
1      Po1      up      success      success   101

How to read it
  Peer status: CFS can communicate with the peer
  vPC keep-alive status: we hear peer-keepalives
  Configuration consistency status: configs are compatible
  vPC role: Master/Slave for certain apps
  vPC Peer-link status: Peer-Link is up with expected vlans
  vPC status: Vlans are active on VPCs

Peer status issue: check if the peer-link is up, check if the remote end is also configured as peer-link, then look at CFS. Note the peer-link will fully come up when 1) peer-keepalive is up and 2) peers can talk via CFS over the peer-link
Peer-keepalive issue: check sh vpc keepalive, check that the outgoing interface is up and in the correct vrf, check the route to the destination (in the correct vrf), ping the remote, and check the same on the remote peer
Role issue: check sh vpc role on both sides. Note that the peer that has been up/active the longest will remain operational-active even if the other peer has a better priority. This is done to minimize traffic disruption. If the role is "none established" it means the VPC came up after reload/new config, and VPCs will not come up before the role is resolved or reload-restore/auto-recovery kicks in
Consistency issues: check sh vpc consistency global|interface
Vlans not up: check if the respective vlan is allowed on the peer-link, check syslog for other causes: sh log log | inc VLANS
Always keep track of the situation on both peers
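The first-pass reading of sh vpc described on this slide can be sketched as a parser for the "name : value" header lines plus a comparison against the healthy values. The Python below is a rough illustration under the assumption that the command output has been captured as text; `parse_vpc_summary` and `first_checks` are hypothetical helper names, and the expected strings are simply the ones shown in the healthy output on this slide.

```python
import re

# Header lines as they appear in the healthy `sh vpc` output on this slide.
SAMPLE = """\
vPC domain id                   : 1
Peer status                     : peer adjacency formed ok
vPC keep-alive status           : peer is alive
Configuration consistency status: success
vPC role                        : primary
"""

# Values that indicate a healthy state for the first three checks.
EXPECTED = {
    "Peer status": "peer adjacency formed ok",
    "vPC keep-alive status": "peer is alive",
    "Configuration consistency status": "success",
}


def parse_vpc_summary(text):
    """Turn 'name : value' lines into a dict; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*:\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def first_checks(fields):
    """Return a list of fields that deserve a closer look."""
    problems = [
        f"{name}: {fields.get(name, 'missing')}"
        for name, healthy in EXPECTED.items()
        if fields.get(name) != healthy
    ]
    if fields.get("vPC role") == "none established":
        problems.append("vPC role: none established")
    return problems


print(first_checks(parse_vpc_summary(SAMPLE)))
```

An empty list means none of the header fields points at a peer status, keepalive, consistency or role issue; the per-VPC tables still need to be read separately, and the same check should be run on both peers.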

VPC redundancy model


Process restartability
  Processes checkpoint their runtime state
  Crashing process is restarted statefully by the NXOS system manager
Supervisor redundancy
  HA-policy will trigger supervisor switchover in response to excessive process crashing, software, hardware or diagnostic failure
VPC redundancy
  Devices dual-attached to the VPC domain are protected against single switch failure (power, hardware, maintenance etc)

[Figure: VPC Domain with Switch 1 and Switch 2, each running Process 1, Process 2 ... Process X on an Active supervisor backed by a Standby (SSO) supervisor]

VPC Keepalive link


Heartbeat between vPC peers to prevent a dual-active scenario
Keepalives are sent every second by default on UDP port 3200
3 second hold timeout on peer-link loss: how long keepalives are ignored after peer-link loss
5 second keepalive timeout (starts after the hold timeout following peer-link down): how long to wait for a keepalive before declaring failure
Use a dedicated link, although NXOS does not enforce this: only IP connectivity is verified
Management interface can be used as keepalive link, but do not connect the interfaces together directly (only the active supervisor management interface is up)


Handling Peer-link failure flow


Peer-link failure
Ignore keepalives for hold-timeout (3 sec)
Am I primary?
  primary: Done
  2ndary: Start keepalive timeout timer (default 5 sec)
    Received Keepalive?
      yes: Primary is alive. Bring down all VPC ports. Done
      no: Keepalive timeout expired?
        no: keep waiting for a keepalive
        yes: Primary is gone. Become primary. Done

Note: If primary fails completely once the VPCs are down on secondary, VPCs will stay down until primary recovers
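The flow above can be condensed into a single decision function. This is a simplified Python model, not NX-OS code: the function name, the return strings and the idea of passing keepalive arrival times as a list are assumptions made for illustration; only the 3 second hold timeout and 5 second keepalive timeout come from the slides.

```python
HOLD_TIMEOUT = 3        # seconds during which keepalives are ignored after peer-link loss
KEEPALIVE_TIMEOUT = 5   # seconds to wait for a keepalive once the hold timeout is over


def on_peer_link_failure(role, keepalive_times):
    """Model the reaction of one peer to a peer-link failure.

    role:            "primary" or "secondary"
    keepalive_times: seconds after peer-link loss at which keepalives arrived
    """
    if role == "primary":
        return "done"  # the primary takes no further action in this flow
    window_end = HOLD_TIMEOUT + KEEPALIVE_TIMEOUT
    for arrival in sorted(keepalive_times):
        # Keepalives inside the hold timeout are ignored.
        if HOLD_TIMEOUT <= arrival <= window_end:
            return "primary is alive: bring down all VPC ports"
    return "primary is gone: become primary"


print(on_peer_link_failure("secondary", [0.5, 1.5, 4.0]))
print(on_peer_link_failure("secondary", [0.5, 1.5]))
```

In this model the secondary needs at most hold timeout plus keepalive timeout, 8 seconds with defaults, to decide that the primary is gone.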

Handling Peer-link failure flow with Auto-recovery


Peer-link Down?
  no: keep monitoring
  yes: Am I primary?
    primary: Done
    2ndary: Received Keepalive?
      yes: Primary is alive. Bring down all VPC ports, then continue with the missed-keepalive check
      no: continue with the missed-keepalive check
    NEW: Missed 3 Keepalives in a row?
      no: keep checking keepalives
      yes: Primary is gone. Become primary. Bypass consistency checks. Bring up VPCs. Done

Note: Unlike in the previous case the keepalive status is always checked, not only for keepalive hold + keepalive timeout seconds after peer-link failure
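The difference auto-recovery makes on the secondary, continuous keepalive monitoring with a "3 missed in a row" trigger, can be sketched as follows. This Python model is an assumption-laden illustration: the function name, the boolean-per-interval input and the return strings are invented for the example and do not correspond to any NX-OS interface.

```python
def secondary_with_auto_recovery(keepalive_seen):
    """Model the secondary peer while the peer-link is down.

    keepalive_seen: iterable of booleans, one per keepalive interval,
                    True if a keepalive from the primary was received.
    """
    missed_in_a_row = 0
    vpcs_down = False
    for seen in keepalive_seen:
        if seen:
            missed_in_a_row = 0
            vpcs_down = True          # primary is alive: VPC ports are brought down
        else:
            missed_in_a_row += 1
            if missed_in_a_row == 3:  # auto-recovery trigger from the slide
                return "become primary, bypass consistency checks, bring up VPCs"
    if vpcs_down:
        return "primary is alive: VPC ports stay down"
    return "still waiting"


print(secondary_with_auto_recovery([True, True, False, False, False]))
print(secondary_with_auto_recovery([True, False, False, True, False]))
```

The second call shows why the misses must be consecutive: a single received keepalive resets the counter and the VPC ports stay down.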


If Peer-link and Keepalive both fail


while primary peer is still alive

Dual-active situation
There will be 2 primary switches sending independent BPDUs
VPC Port-channels on upstream/downstream switches will be error-disabled by EtherChannel Misconfiguration Guard after ~90 seconds
http://www.cisco.com/en/US/tech/tk389/tk213/technologies_tech_note09186a008009448d.shtml
If a Nexus 7000/5000 is on the other end of the VPC there is no error-disable, as NXOS does not support EtherChannel Guard
Depending on remote configuration (presence of VPC, peer-switch etc) there can be different outcomes, ranging from no impact to STP dispute, to STP state cycling between dispute, blocking and forwarding.
Split vlan
Provision redundancy for the keepalive link, and make sure it doesn't share a datapath with the peer-link


What to do if only 1 peer is operational and VPCs are down


due to power issue, hardware failure on the 2nd peer etc
VPC(s) will be down if they had to flap or the current peer was reloaded (because the consistency check couldn't be performed without the 2nd peer)
Non-issue with auto-recovery, but what if current NXOS version < 5.2?
Possible actions
  Recover the 2nd peer, or remove VPC config from the port-channel(s)
    vpc(config-if)# no vpc 123
  or, in case of many VPCs, remove the VPC config
    vpc# sh run vpc > bootflash:myvpc.conf
    vpc(config)# no feature vpc

vpc# sh vpc
...
Peer status                       : peer link is down
vPC keep-alive status             : Suspended (Destination IP not reachable)
Configuration consistency status  : failed
Configuration inconsistency reason: Consistency Check Not Performed
vPC role                          : none established
...
vPC status
----------------------------------------------------------------------
id     Port     Status  Consistency     Reason                           Active vlans
------ -------- ------- --------------- -------------------------------- ------------
102    Po102    down    Not Applicable  Consistency Check Not Performed

Troubleshooting VPC peer-keepalives


Nexus# show vpc peer-keepalive
vPC keep-alive status           : peer is alive
--Send status                   : Success
--Last send at                  : 2009.06.19 00:41:15 589 ms
--Sent on interface             : Eth2/35
--Receive status                : Success
--Last receive at               : 2009.06.19 00:41:14 580 ms
--Received on interface         : Eth2/35
--Last update from peer         : (1) seconds, (9) msec

vPC Keep-alive parameters
--Destination                   : 7.7.7.77
--Keepalive interval            : 1000 msec
--Keepalive timeout             : 5 seconds
--Keepalive hold timeout        : 3 seconds
--Keepalive vrf                 : v1
--Keepalive udp port            : 3200
--Keepalive tos                 : 192

Nexus# show vpc statistics peer-keepalive
vPC keep-alive status           : peer is alive
vPC keep-alive statistics
----------------------------------------------------
peer-keepalive tx count:          9773
peer-keepalive rx count:          8985
average interval for peer rx:     991
Count of peer state changes:      0

Peer-keepalive is only essential at the time when the peer-link goes down or comes up
At any other time a peer-keepalive failure will only trigger a syslog message
Peer-keepalives might be affected by extreme control plane load (check CPU utilization & COPP)
Count of peer state changes is the number of keepalive state transitions: the closer to 0, the better
Only reception of keepalive packets at IP level is required
Generic routing/switching connectivity troubleshooting might be needed if packets are lost (make sure there is a route/arp in the correct VRF)

Cisco Fabric Services CFS


CFS messaging

Transport mechanism for control-plane messaging between VPC peers
Uses
  Consistency validation
  MAC address synchronization
  vPC member port status signalling
  IGMP snooping synchronization
  vPC status signalling
VPC CFS messages are encapsulated in Ethernet frames and delivered to the peer via the peer-link

Nexus# sh cfs application
----------------------------------------------
 Application    Enabled   Scope
----------------------------------------------
 arp            Yes       Physical-eth
 stp            Yes       Physical-eth
 vpc            Yes       Physical-eth
 igmp           Yes       Physical-eth
 l2fm           Yes       Physical-eth
 ...

VPC: CFS troubleshooting


Cisco Fabric Services: transport of control messages between VPC peers

Nexus# show cfs status
Distribution : Enabled
Distribution over IP : Disabled
IPv4 multicast address : 239.255.70.83
IPv6 multicast address : ff15::efff:4653
Distribution over Ethernet : Enabled

Nexus# show cfs peers
Physical Fabric
---------------------------------------------
 Switch WWN               IP Address
---------------------------------------------
 20:00:00:1b:54:c2:42:41  10.48.73.222  [Local]
                          Nexus
 20:00:00:1b:54:c2:42:44  0.0.0.0

Total number of entries = 2

Remote peer should be seen

Nexus# show cfs internal ethernet-peer statistics | i Trans|Rece
Number of Segments Transmitted            : 218
Number of Acks Transmitted                : 223
Maximum Segment Size Transmitted          : 0
Number of Transmission Timeouts           : 0
Number of segments in Transmit Queue      : 0
Number of segments in Re-Transmit Queue   : 0
Total Number of Segments Received         : 441
Number of Acks Received                   : 217
Number of Duplicate Messages Received     : 0
Number of Unexpected Segments Received    : 0
Number of fragmented segments Received    : 2
Number of duplicate fragments Received    : 0
Number of unfragmented segments Received  : 210
Number of Received Segments Dropped       : 0
Number of Unreliable segments Transmitted : 1
Number of Unreliable segments Received    : 1

TX/RX counters should move when VPC is active or coming up

Nexus# sh cfs internal notification log name vpc
Sun Nov 14 15:27:22 2010: Peer add 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:05:25 2010: Peer gone 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:08:03 2010: Peer add 20:00:00:1b:54:c2:42:44

Shows timestamps for when CFS communication for VPC was interrupted (peer-reload, peer-link issues etc)

Swapping Primary Secondary roles


Sometimes it is preferred for operational reasons to have a specific switch as primary
VPCs are down for ~1 minute after primary changes to secondary

Approach
1. Change role priority
2. Bounce peer-link

vpc1(config)# vpc domain 2
vpc1(config-vpc-domain)# role priority 60
Warning:
!!:: vPCs will be flapped on current primary vPC switch while attempting role change ::!!
Note:
--------:: Change will take effect after user has re-initd the vPC peer-link ::--------
vpc1(config-vpc-domain)# int po1
vpc1(config-if)# shut
....
vpc1(config-if)# no shut
...
21:28:34 %VPC-5-ROLE_PRIORITY_CFGD: In domain 2, vPC role priority changed to 60
21:28:34 %VPC-5-SYSTEM_PRIO_CFGD: In domain 2, vPC system priority changed to 32667
21:28:36 %ETHPORT-5-IF_DOWN_NONE: Interface port-channel102 is down (None)
21:28:36 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to secondary
21:35:40 %VPC-5-VPC_PEER_LINK_UP: vPC Peer-link is up


VPC operational considerations


from troubleshooting perspective
VPC troubleshooting is often part of the investigation of a larger scale event: connectivity issues following power-outage, upgrade, migration, major changes etc
Datacenter connectivity being impacted usually implies lots of pressure (time and otherwise)
Always know the current situation before trying to recover
  Trying to fix a non-issue risks making things worse
  At minimum collect the state of the system before trying anything drastic
When traffic forwarding is concerned, basic information on interfaces, VPC states, STP states, MAC addresses and L3 routes/ARPs is essential. It takes a minute to collect: just paste this into the shell on both peers

term len 0
sh int
sh vpc
sh port-channel summary
sh spanning-tree
sh mac address-table
sh routing vrf all
sh ip arp vrf all

sh tech detail is preferred (though it takes ~10 minutes to collect, depending on CPU load and number of linecards)
Note: if VDCs are used, best practice is to collect sh tech detail from both the main VDC and the VDC in question
sh tech brief is a faster alternative

VPC config considerations


VPC Domain # must be unique for each Layer2-adjacent VPC domain, otherwise issues with multicast forwarding and LACP negotiation of cross-VPC links may arise
Set logging level for vpc to 5: makes VPC operation easier to follow
Use LACP for the peer-link (channel-group <x> mode active): more resilient to separate link failures (fiber/sfp going bad) or switch control-plane failures
Use auto-recovery (if available, use reload restore if not): useful for cases of multiple failures, more graceful recovery


Spanning Tree in VPC domain

STP runs on both switches (2 active control planes) but only the primary switch drives STP of VPCs. Port state changes are communicated to the secondary via CFS messages.
For non-VPC ports the domain appears as 2 bridges
Peer-link is part of STP. BPDU handling is modified such that the Peer-link will not be blocked (similar to MST implementation of IST)
Non-VPC ports are managed independently by the local STP process on each switch

[Figure: Primary and Secondary switches, each running its own STP process]


STP behavior upon VPC primary failure


1. Primary switch (STP root) fails
2. Secondary switch becomes operational primary and STP root

STP root port doesn't change, nor do any STP port states for VPCs; forwarding continues
Depending on control plane load it might take a few seconds for the Op-primary to start sending BPDUs.
This might cause STP reconvergence on connected switches, hence increasing hello time or using the peer-switch feature might be considered in large deployments

[Figure: Primary (ROOT) fails; Secondary (Backup) becomes OP-Primary and ROOT]


STP behavior upon VPC primary recovery

1. Left switch comes back up
2. Peer-Link comes back up
3. VPC role is resolved as Operational-secondary
4. Left switch has better STP priority, so it becomes STP root
5. STP root port of the right switch will change and that will trigger SYNC: all non-edge STP ports will be temporarily blocked
Once sync is complete ports will resume forwarding

[Figure: left switch returns as OP-Secondary and becomes ROOT; right switch (OP-Primary, Secondary, Backup) goes through SYNC]


VPC Peer-Switch feature


Both VPC switches originate BPDUs with preconfigured information. This allows the same BPDU to be kept when the primary fails/recovers: no extra SYNC is required, so the short interruption in forwarding described on the previous slide is avoided
Both left and right switches consider themselves root
Both left and right switches send BPDUs all the time: no need to raise hello time, and STP Bridge Assurance can be enabled on VPCs

Primary (ROOT):
spanning-tree vlan 1-1000 priority 8192
vpc domain 1
  peer-switch

Secondary (ROOT):
spanning-tree vlan 1-1000 priority 8192
vpc domain 1
  peer-switch


VPC Peer-Switch feature


Primary (ROOT):

left# sh span vlan 101
VLAN0101
  Spanning tree enabled protocol rstp
  Root ID    Priority    8293
             Address     0023.04ee.be01
             This bridge is the root
  ...
  Bridge ID  Priority    8293   (priority 8192)
             Address     0023.04ee.be01
  ...
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1              Desg FWD 1         128.4096 (vPC) P2p
Po100            Root FWD 2         128.4195 (vPC peer-link)

left# sh vpc role | i mac
vPC system-mac                  : 00:23:04:ee:be:01
vPC local system-mac            : 00:1b:54:c2:42:43

Secondary (ROOT):

right# sh span vlan 101
VLAN0101
  Spanning tree enabled protocol rstp
  Root ID    Priority    8293
             Address     0023.04ee.be01
             This bridge is the root
  ...
  Bridge ID  Priority    8293   (priority 8192)
             Address     0023.04ee.be01
  ...
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1              Desg FWD 1         128.4096 (vPC) P2p
Po100            Desg FWD 2         128.4195 (vPC peer-link)

In Peer-Switch mode bridge-ID comes from system-mac as opposed to local mac in normal mode
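The "Priority 8293 (priority 8192)" value in the peer-switch output is the configured priority plus the VLAN number carried as the extended system ID. A small Python sketch of that arithmetic, with a hypothetical helper name:

```python
def bridge_priority(configured_priority, vlan_id):
    """Per-VLAN bridge priority: configured priority plus the VLAN ID (extended system ID)."""
    if configured_priority % 4096 or not 0 <= configured_priority <= 61440:
        raise ValueError("configured priority must be a multiple of 4096 between 0 and 61440")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("vlan_id must be between 1 and 4094")
    return configured_priority + vlan_id


# VLAN 101 with 'spanning-tree vlan 1-1000 priority 8192' from the peer-switch config
print(bridge_priority(8192, 101))
```

With priority 8192 on VLAN 101 this gives 8293, the value both peers advertise as root.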

STP inconsistencies
When STP detects certain abnormal situations it will mark ports as inconsistent and block them to prevent forwarding loops
- Root: Root Guard feature detected inconsistency (unwanted bridge tries to become root)
- Loop: Loop Guard feature detected inconsistency (port becomes designated because no BPDUs are being received)
- Bridge Assurance (BA): no BPDUs are received from the remote side
- VPC Peer-link: any of the above inconsistencies happened on the VPC peer-link

%STP-2-VPC_PEER_LINK_INCONSIST_BLOCK: vPC peer-link detected BPDU receive timeout blocking port-channel11 VLAN0121.


Handling Peer-Link STP inconsistencies on Primary switch

When peer-link STP inconsistency is detected on the primary switch, the link will be put in inconsistent STP state (effectively blocking state)
BPDUs are not sent on the peer-link when it is inconsistent. This is to allow the secondary switch to detect the inconsistency and react

[Figure: inconsistency detected on the peer-link at the Primary switch]


Handling Peer-Link STP inconsistencies on Secondary switch

1. When peer-link STP inconsistency is detected on the secondary switch, the peer link will be put in inconsistent STP state (effectively blocking state)
2. Respective vlans or MST instances are also blocked on all VPCs

This behavior depends on STP Bridge Assurance on the peer-link (default) as a way to signal the inconsistency to the secondary peer
With BA disabled on the Peer-link any inconsistency on the Primary will lead to Peer-link flap

[Figure: inconsistency on the peer-link seen at both Primary and Secondary]

STP troubleshooting: PES/SPS & BPDU redirection


Primary VPC peer controls the port states on the secondary peer by means of SPS (set-port-state) messages
Changes in STP information are synchronized between peers using PES (port-event-sync) messages

nexus# sh spanning-tree internal info vpc | exc 0$
...
======= CFSoe Statistics =========================
Total PES Msgs sent               : 4
Total SPS Msgs sent               : 4
Total MCS Msgs sent               : 8
Total PES Response Msgs received  : 4
Total SPS Response Msgs received  : 4
Total Response Msgs received      : 8

Constantly incrementing SPS/PES counters might indicate STP instability or constant reconvergence. Use sh spanning detail and debug spanning-tree events to find the reason for reconvergences

BPDUs are sent to VPCs out of the primary switch. If the VPC leg connected to the primary is down, BPDUs are sent over the peer-link and sent out by the secondary

nexus# sh system internal frame traffic | i BPDU
Ingress BPDUs qualified for redirection    42
Ingress BPDUs redirected to peer           42
Egress BPDUs qualified for redirection     0
Egress BPDUs dropped due to remote down    0
Egress BPDUs redirected to peer            0

STP troubleshooting
vpc1# sh spanning-tree vlan 4
VLAN0004
  Spanning tree enabled protocol rstp
  Root ID    Priority    32772
             Address     0018.ba88.4a00
             Cost        2
             Port        4096 (port-channel1)
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32772  (priority 32768 sys-id-ext 4)
             Address     68bd.abd7.51c2
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po1              Root FWD 1         128.4096 (vPC peer-link) Network P2p
Po102            Root FWD 1         128.4197 (vPC) P2p

It is possible to see a situation where there are 2 root ports: the peer-link and some VPC
This happens when the STP root is behind the VPC and the BPDU is received by the peer. This does not indicate any issue

vpc1# sh spanning-tree vlan 4 detail | i "^ Port|BPDU"
 Port 4096 (port-channel1, vPC Peer-link) of VLAN0004 is root forwarding
   BPDU: sent 46416, received 46418
 Port 4197 (port-channel102, vPC) of VLAN0004 is root forwarding
   BPDU: sent 0, received 0

Peer link is running STP
On the other end of the peer-link po1 is designated

STP troubleshooting
Looking at BPDUs live

vpc1# debug spanning-tree bpdu_tx tree 101
14:20:37.556707 stp: RSTP(101): transmitting RSTP BPDU on port-channel100
14:20:37.556750 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel100 enc_type 1 len 42
14:20:37.556834 stp: RSTP(101): transmitting RSTP BPDU on port-channel1
14:20:37.556863 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel1 enc_type 2 len 36

This output can be easily limited to the necessary Vlan/Interface, but it doesn't dump the BPDU

vpc1# debug spanning-tree all
14:22:23.560147 stp: RSTP(1): transmitting RSTP BPDU on port-channel100
14:22:23.560169 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 1 port port-channel100 enc_type 2 len 36
14:22:23.560219 stp: BPDU TX: vb 1 vlan 1 port port-channel100 len 36 ->0180c2000000 CFG P:0000 V:02 T:02 F:78 R:80:01:00:1b:54:c2:42:43 00000002 B:80:01:00:1b:54:c2:42:44 9063 A:0000 M:0014 H:0002 F:000f

Very chatty: use debug logfile <file> to redirect output to a file
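The BPDU dump printed by debug spanning-tree all can be decoded offline. The Python sketch below is an interpretation, not a documented format: it assumes that R: is the root bridge ID followed by the root path cost, B: is the sending bridge ID followed by the port ID, and A/M/H/F are message age, max age, hello time and forward delay in hexadecimal seconds, which is consistent with the standard BPDU field order and with the 20/2/15 second timers shown elsewhere in this session. All function and key names are hypothetical.

```python
import re

# BPDU dump copied from the 'debug spanning-tree all' output above.
DUMP = ("CFG P:0000 V:02 T:02 F:78 R:80:01:00:1b:54:c2:42:43 00000002 "
        "B:80:01:00:1b:54:c2:42:44 9063 A:0000 M:0014 H:0002 F:000f")

# Assumed layout: R:<root id> <cost> B:<bridge id> <port id> A: M: H: F:
PATTERN = re.compile(
    r"R:(?P<root>[0-9a-f:]{23}) (?P<cost>[0-9a-f]{8}) "
    r"B:(?P<bridge>[0-9a-f:]{23}) (?P<port>[0-9a-f]{4}) "
    r"A:(?P<age>[0-9a-f]{4}) M:(?P<max_age>[0-9a-f]{4}) "
    r"H:(?P<hello>[0-9a-f]{4}) F:(?P<fwd>[0-9a-f]{4})"
)


def split_bridge_id(text):
    """Split an 8-byte bridge ID into (priority, extended system ID, MAC)."""
    octets = text.split(":")
    prio_field = int(octets[0] + octets[1], 16)
    return prio_field & 0xF000, prio_field & 0x0FFF, ":".join(octets[2:])


def decode_bpdu(dump):
    match = PATTERN.search(dump)
    if match is None:
        raise ValueError("no BPDU fields found")
    return {
        "root": split_bridge_id(match["root"]),
        "root_path_cost": int(match["cost"], 16),
        "bridge": split_bridge_id(match["bridge"]),
        "port_id": match["port"],  # left undecoded
        "message_age": int(match["age"], 16),
        "max_age": int(match["max_age"], 16),
        "hello_time": int(match["hello"], 16),
        "forward_delay": int(match["fwd"], 16),
    }


for key, value in decode_bpdu(DUMP).items():
    print(key, value)
```

Under these assumptions the dump describes a BPDU for VLAN 1 with root priority 32768, root MAC 00:1b:54:c2:42:43, root path cost 2, and timers of 20, 2 and 15 seconds.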

Alternatively use ethanalyzer to capture and dump BPDUs. Beware: BPDUs received by the other peer and redirected to the primary will not be seen in the expected way because of extra encapsulation

Looking at past events

nexus# sh spanning-tree internal event-history tree 0 interface port-channel 50
VDC02 MST0000 <port-channel50>
0) Transition at 497772 usecs after Tue Oct 20 17:42:01 2009
   State: FWD Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]
1) Transition at 661395 usecs after Tue Oct 20 17:42:01 2009
   State: FWD Role: Root Age: 4 Inc: no [STP_PORT_ROLE_CHANGE]
2) Transition at 17741 usecs after Tue Oct 20 17:42:03 2009
   State: BLK Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]
...

Layer2 stability features recap


Feature: UDLD
  Condition: Detects if a link becomes unidirectional, i.e. the link cannot carry BPDUs both ways, which causes loops
  Works on: Physical port
  Effect: Error-disables unidirectional links
  Note: Useful on port-channels to take out broken links; the alternative is fast-timers PAGP/LACP

Feature: Bridge Assurance (BA)
  Condition: Expects to receive a BPDU every hello_time from the peer, i.e. cases of dead control plane on the remote side, also BPDU loss
  Works on: Logical port
  Effect: Blocks port at STP level (BA-inconsistent state)
  Note: Main protection mechanism where supported; the alternative is Loop Guard

Feature: Dispute
  Condition: Checks the remote port role in the received BPDU; the role should not be designated in a BPDU received on a designated port. Cases of unidirectional communication
  Works on: Logical port
  Effect: Blocks port at STP level (Disputed state)
  Note: Complements BA, on by default. Somewhat overlaps with UDLD, but not as effective on port-channels. Only works with RSTP/MST BPDUs

Feature: Loop Guard
  Condition: Doesn't allow a port to take the designated role if it stopped receiving BPDUs. Unidirectional communication, control plane issues on the remote side
  Works on: Logical port
  Effect: Blocks port at STP level (Loop-inconsistent)
  Note: Superseded by BA + Dispute; use with PVST+ or when BA is not supported


Bridge assurance, Dispute & UDLD


BA is enabled by default on Peer-Link; it is not recommended for VPCs unless the Peer-Switch feature is also operational
Dispute is enabled by default (for both RSTP and MST on VPC)
UDLD [normal mode] is recommended to take bad links out of channels (otherwise LACP takes ~100 sec vs ~20 with UDLD)

Recommendation
  Preferred: BA + UDLD + Dispute (on all inter-switch links when using Peer-Switch) when all switches support this (Nexus 7000/5000 and Cat6500/VSS do support it)
  Without Peer-Switch, BA should be kept only on Peer-Link (no BA or Loop Guard on VPCs); use UDLD + Dispute
  If the preferred config is not supported, use Loop Guard + UDLD (supported by all Cisco switches)
  Supported features can potentially be mixed and matched per switch, but do understand which cases each feature covers in which combination

Agenda

Initialization & Redundancy considerations


Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations


Special case for forwarding


[Figure: PC A and PC B attached through access switches to the two upstream switches; steps 1-5 below are marked on the diagram]

1. PC A sends a packet to PC B
2. MAC B is not known by left switch: flood
3. MAC B is not known by right switch: flood
4. B receives duplicate frames
5. MAC A will be learned on the wrong port on the lower access switch, blackholing traffic to A

Frames received on Peer-Link must not be flooded out of VPCs



Special case for forwarding: VPC way

[Figure: PC A sending to PC B through the VPC pair, steps marked on the diagram]

MAC B is not known by left switch: flood
Frames received from Peer-Link are never sent out of a VPC (except VPCs without operational ports on the ingress switch)
  Egress port ASICs will drop the frame
  Frame is still flooded to devices that are solely connected to the egress switch
This rule (called VPC check) stands for all traffic (L2, L3, unicast, multicast, broadcast, flooded, etc.) on Nexus 7000 (Nexus 3000/5000 VPC have a similar rule, but a different implementation)
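The VPC check rule can be illustrated with a minimal Python sketch. This is a model of the decision only, not of the hardware implementation; the class `EgressPort`, its fields, and the port names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressPort:
    name: str
    is_vpc_member: bool                   # port belongs to a VPC port-channel
    vpc_up_on_ingress_peer: bool = False  # same VPC has operational ports on the peer that first received the frame


def vpc_check_allows(received_on_peer_link: bool, port: EgressPort) -> bool:
    """Return True if a frame may be sent out of `port`."""
    if not received_on_peer_link:
        return True   # the rule only concerns frames that crossed the Peer-Link
    if not port.is_vpc_member:
        return True   # devices solely connected to this switch still get the frame
    # VPC member: allowed only if the VPC has no operational ports on the ingress switch
    return not port.vpc_up_on_ingress_peer


def flood_targets(received_on_peer_link, ports):
    return [p.name for p in ports if vpc_check_allows(received_on_peer_link, p)]


ports = [
    EgressPort("Po50", is_vpc_member=True, vpc_up_on_ingress_peer=True),
    EgressPort("Po22", is_vpc_member=True, vpc_up_on_ingress_peer=False),
    EgressPort("Eth1/5", is_vpc_member=False),
]
print(flood_targets(True, ports))
```

In this example a frame arriving from the Peer-Link is kept off Po50, because the ingress peer could deliver it on its own member of that VPC, while Po22 (down on the ingress peer) and the single-attached Eth1/5 still receive it.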


Summary: VPC traffic forwarding with Nexus 7000


Topologies where VPC forwarding rules will have implications


[Figure labels: routed links; OSPF over vlan 2; one peer with SVI 1 up and SVI 2 down, the other peer with SVI 1 up and SVI 2 up]

Packets arriving at the left switch with the destination MAC of the right switch will be dropped
With peer-gateway enabled, adjacencies may not come up
This issue is not specific to OSPF; it is the same for any routing protocol
Use routed links to connect routers

Configuration and operational state of SVI interfaces for vlans present on VPCs should be consistent
Otherwise a packet arriving at the left switch for a destination on a VPC in vlan 2 will have to cross the Peer-Link and will be dropped by the right switch
Add a routed cross-link between peers

Frames received from Peer-Link are never sent out of a VPC (except VPCs without operational ports on the ingress switch)

Verifying whether frame will be sent to peer-link


Verify where the destination MAC address of the frame points to

Nexus# show mac address-table vlan 35
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+------+---------------
+ 35       0007.b400.0101    dynamic   0         False  False  Po1
G 35       0007.b400.0102    static              False  False  sup-eth1(R)
G 35       001b.54c2.4241    static              False  False  sup-eth1(R)
* 35       001b.54c2.4244    static              False  False  vPC Peer-Link
+ 35       0012.da65.9ec0    dynamic   0         False  False  Po1

If a frame arrives at this switch in vlan 35 destined to 001b.54c2.4244, it will be sent to the Peer-Link
If this MAC address belongs to one of the L3 SVI interfaces of the peer switch, and the IP destination of the frame is behind a VPC, and this VPC has active links on this (local) switch, then the frame will be dropped by the peer switch
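When many MAC addresses need checking, the lookup above can be scripted. The following is a minimal Python sketch that parses saved `show mac address-table` text and reports whether a given MAC points to the Peer-Link. The function names and the regular expression are assumptions written for the column layout shown on this slide; other NX-OS releases may format the table differently.

```python
import re

# Optional flag (*, G or +), VLAN, MAC, entry type, then the remaining columns.
ROW = re.compile(
    r"^(?P<flag>[*G+])?\s*(?P<vlan>\d+)\s+"
    r"(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(?P<type>\S+)\s+(?P<rest>.*)$"
)


def parse_mac_table(text):
    """Return {(vlan, mac): {"flag", "type", "ports"}} for each table row."""
    entries = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # legend, header and separator lines do not match
        # The Ports column is whatever follows the last True/False (NTFY) field.
        ports = re.split(r"\b(?:True|False)\b", m.group("rest"))[-1].strip()
        entries[(int(m.group("vlan")), m.group("mac"))] = {
            "flag": m.group("flag") or "",
            "type": m.group("type"),
            "ports": ports,
        }
    return entries


def goes_to_peer_link(entries, vlan, mac):
    entry = entries.get((vlan, mac.lower()))
    return entry is not None and "peer-link" in entry["ports"].lower()


SAMPLE = """\
+ 35 0007.b400.0101 dynamic 0 False False Po1
G 35 0007.b400.0102 static False False sup-eth1(R)
G 35 001b.54c2.4241 static False False sup-eth1(R)
* 35 001b.54c2.4244 static False False vPC Peer-Link
+ 35 0012.da65.9ec0 dynamic 0 False False Po1
"""

entries = parse_mac_table(SAMPLE)
print(goes_to_peer_link(entries, 35, "001b.54c2.4244"))
```

A MAC that resolves to the Peer-Link is only the first half of the check; whether the peer then drops the frame still depends on the VPC state described above.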

MAC address learning


[Figure: PC A and PC B; MAC A entries on both switches, steps 1-3 marked on the diagram]

1. MAC A is learned on the lower VPC
2. MAC A is learned on the Peer-Link
3. A frame destined to A arriving at the right switch will be sent to the Peer-Link

Traffic should prefer local links when available (traffic locality rule)


MAC address learning: VPC way

[Figure: PC A and PC B; MAC A entries on both switches, CFS message between the peers, steps 1-3 marked on the diagram]

1. MAC A is learned on the lower VPC. MAC addresses are never learned from traffic on the Peer-Link
2. Left switch sends a CFS message to the right switch telling it about MAC A learned on the lower VPC. Right switch updates its MAC address table
3. A frame destined to A arriving at the right switch will be sent to the lower VPC
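The learning behaviour on this slide can be modelled in a few lines of Python. This is a toy sketch under stated assumptions: the class `VpcPeer`, the `on_sync_message` method, and the `vpc-` port naming are hypothetical, and the direct method call merely stands in for the CFS message.

```python
class VpcPeer:
    def __init__(self, name):
        self.name = name
        self.mac_table = {}  # (vlan, mac) -> port
        self.peer = None

    def receive_frame(self, vlan, src_mac, ingress_port):
        if ingress_port == "peer-link":
            return  # MAC addresses are never learned from traffic on the Peer-Link
        self.mac_table[(vlan, src_mac)] = ingress_port
        if self.peer is not None and ingress_port.startswith("vpc-"):
            # Stand-in for the CFS message that tells the peer about the new MAC.
            self.peer.on_sync_message(vlan, src_mac, ingress_port)

    def on_sync_message(self, vlan, mac, vpc_port):
        # The peer installs the MAC on its own member of the same VPC.
        self.mac_table[(vlan, mac)] = vpc_port


left, right = VpcPeer("left"), VpcPeer("right")
left.peer, right.peer = right, left

left.receive_frame(50, "MAC_A", "vpc-lower")   # learned locally, synced to the peer
right.receive_frame(50, "MAC_A", "peer-link")  # flooded copy: must not overwrite the entry
print(right.mac_table[(50, "MAC_A")])
```

Because the right switch ends up pointing MAC A at its local VPC member rather than at the Peer-Link, return traffic stays local, which is the traffic locality rule.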


Troubleshooting Layer 2

[Figure: host 91.0.0.10 (0013.1908.e246) on Po50, Vlan 50; VPC Po22, Vlan 20, towards 20.1.2.3]

MAC addresses should point to expected ports in expected vlans (path towards source)
The ports should be in STP forwarding mode
Hardware MAC address table should be consistent with software table
In sh hardware mac address-table <n>, <n> is the linecard slot number
sh system internal pixm info ltl finds the port# for a given index

nexus# sh mac address-table address 0013.1908.e246 vlan 50
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+-----------------
* 50       0013.1908.e246    dynamic   0         F      F    Po50
nexus# sh spanning-tree vlan 50 interface port-channel 50
Mst Instance     Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0002          Desg FWD 200       128.4145 (vPC) P2p
nexus# sh hardware mac address-table 2 address 0013.1908.e246 vlan 50
Valid| PI | BD    | MAC           | Index  | Stat| SW | Modi| Age | Tmr |
     |    |       |               |        | ic  |    | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1      1    161     0013.1908.e246  0x00a36  0     3    0     141   1
nexus# sh system internal pixm info ltl 0x00a36 | i Eth.*,
0x0a36 Eth2/36,

nexus# sh mac address-table address 0021.55e0.66c2 vlan 20
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 20       0021.55e0.66c2    dynamic   660       F      F    Po22
nexus# sh spanning-tree vlan 20 interface port-channel 22
Mst Instance     Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0000          Desg FWD 200       128.4117 (vPC) Network P2p
nexus# sh hardware mac address-table 1 address 0021.55e0.66c2 vlan 20
Valid| PI | BD    | MAC           | Index  | Stat| SW | Modi| Age | Tmr |
     |    |       |               |        | ic  |    | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1      1    18      0021.55e0.66c2  0x00a32  0     2    0     103   1
nexus# sh system internal pixm info ltl 0x00a32 | i Eth.*,
0x0a32 Eth1/13, Eth1/14,

Troubleshooting Layer 3
[Figure: host 91.0.0.10 (0013.1908.e246) on Po50, Vlan 50; VPC Po22, Vlan 20, towards 20.1.2.3]

Is there a route to the destination?
nexus# sh routing ip 20.1.2.3
...
20.1.2.3/32, ubest/mbest: 1/0
    *via 20.1.1.240, Vlan20, [1/0], 03:48:59, static

Is the next hop resolved?
nexus# sh ip arp 20.1.1.240
Address         Age       MAC Address     Interface
20.1.1.240      00:02:17  0021.55e0.66c2  Vlan20

Looking at module 2 because this is where the packets in question should be received
nexus# sh forwarding ip route 20.1.2.3 module 2
...
------------------+------------------+---------------------
Prefix            | Next-hop         | Interface
------------------+------------------+---------------------
20.1.2.3/32         20.1.1.240         Vlan20

Is the adjacency consistent with ARP?
nexus# sh forwarding adjacency 20.1.1.240 module 2
IPv4 adjacency information
next-hop        rewrite info    interface
--------------  --------------- -------------
20.1.1.240      0021.55e0.66c2  Vlan20

Router MAC must have the Gateway flag in order for the packet to be L3 switched
nexus# sh int vl 20 | i address
  Hardware is EtherSVI, address is 0023.ac66.1a42
nexus# sh mac address-table address 0023.ac66.1a42 vlan 20
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+-----------------
G 20       0023.ac66.1a42    static              F      F    sup-eth1(R)

Where given packet will be load-balanced


For equal-cost routes

nexus# sh routing hash 91.0.0.10 20.1.2.3
Load-share parameters used for software forwarding:
load-share mode: address source-destination port source-destination
Universal-id seed: 0xcdb5769f
Hash for VRF "default"
Hashing to path *20.1.1.3 (hash: 0x2a), for route:
20.1.2.3/32, ubest/mbest: 2/0
    *via 20.1.1.3, Vlan20, [1/0], 00:01:37, static
    *via 20.1.1.240, Vlan20, [1/0], 16:32:42, static

Load-balancing is configurable under ip load-sharing address in default VDC and affects all VDCs

For port-channels

nexus# sh port-channel load-balance forwarding-path interface port-channel 22 dst-ip 20.1.2.3 src-ip 91.0.0.10 vlan 20 module 2
Missing params will be substituted by 0's.
Module 2: Load-balance Algorithm: source-dest-ip-vlan
RBH: 0 Outgoing port id: Ethernet1/14

Load-balancing is configurable under port-channel load-balance in default VDC and affects all VDCs

Use sh port-channel rbh-distribution to see which link sends traffic for which of the 8 available load-balancing buckets
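To see why the bucket count matters, the sketch below spreads 8 buckets over a set of member links. The round-robin assignment is purely an assumption for illustration; the real mapping is whatever sh port-channel rbh-distribution reports. The function names and the third link name are hypothetical.

```python
from collections import Counter


def assign_buckets(links, buckets=8):
    """Map each bucket number to a member link (round-robin, an assumed policy)."""
    if not links:
        raise ValueError("port-channel has no operational members")
    return {b: links[b % len(links)] for b in range(buckets)}


def buckets_per_link(assignment):
    return Counter(assignment.values())


two_links = buckets_per_link(assign_buckets(["Eth1/13", "Eth1/14"]))
three_links = buckets_per_link(assign_buckets(["Eth1/13", "Eth1/14", "Eth1/15"]))
print(two_links)
print(three_links)
```

With 8 buckets, two links get 4 buckets each, while three links cannot be served evenly (3, 3 and 2 buckets), so some imbalance between members is expected whenever the member count does not divide 8.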


Datapath Drops
#1 command to look for hardware packet drops
Not every drop listed here is an actual data packet drop
Run several times to see if any counters increase at a rate similar to the traffic loss
To clear counters, use clear statistics module-all device all

nexus# sh hardware internal errors all
----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
|------------------------------------------------------------------------|
| Device:R2D2 Role:MAC                                                    |
|------------------------------------------------------------------------|
Instance:7
ID Name Value Ports
------------
28688 aric_no_port_select_error 0000000000000002 1,3,5,7 I2
...
|------------------------------------------------------------------------|
| Device:Ashburton Role:MAC Mod: 1                                        |
|------------------------------------------------------------------------|
Instance:0
3629 Egress Port-1 VSL Dropped Packet Count 0000000853635833 5
3630 Egress Port-2 VSL Dropped Packet Count 0000000857893046 3
...
|------------------------------------------------------------------------|
| Device:Naxos Role:MAC SECURITY                                          |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
------------
106 m1_fab_p25_txq_tc0_drop_count 00000000000012af 2
...
|------------------------------------------------------------------------|
| Device:Metropolis Role:REWR                                             |
|------------------------------------------------------------------------|
Instance:1
ID Name Value Ports
------------
70 Krypton input controller zero portsel cnt 0000000000000038 18,20,22,24,26,28,30,32
...
|------------------------------------------------------------------------|
| Device:Lamira Role:L3                                                   |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
------------
93 CL2 Invalid Pkt count 00000008759cb9cb 1-32 I1
...
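Comparing snapshots by eye is error-prone, so the "run several times" advice can be scripted. The sketch below is an assumption-laden helper, not a Cisco tool: it expects rows of the form `<id> <name> <16-digit value> <ports>` as shown above, and it treats the value as hexadecimal because some values contain hex digits (for example 00000000000012af). Function names and the sample text are hypothetical.

```python
import re

COUNTER = re.compile(r"^\s*(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<value>[0-9a-fA-F]{16})\b")


def parse_counters(text):
    """Return {(device, instance, id, name): value} from one saved snapshot."""
    counters, device, instance = {}, None, None
    for line in text.splitlines():
        dev = re.search(r"Device:(\S+)", line)
        if dev:
            device, instance = dev.group(1), None
            continue
        inst = re.match(r"\s*Instance:(\d+)", line)
        if inst:
            instance = int(inst.group(1))
            continue
        m = COUNTER.match(line)
        if m:
            key = (device, instance, int(m.group("id")), m.group("name"))
            counters[key] = int(m.group("value"), 16)  # assumption: hex values
    return counters


def increased(before, after):
    """Counters that grew between two snapshots, with their deltas."""
    return {k: v - before.get(k, 0) for k, v in after.items() if v > before.get(k, 0)}


SNAP1 = """\
| Device:Naxos Role:MAC SECURITY |
Instance:0
106 m1_fab_p25_txq_tc0_drop_count 00000000000012af 2
| Device:Metropolis Role:REWR |
Instance:1
70 Krypton input controller zero portsel cnt 0000000000000038 18,20
"""
SNAP2 = SNAP1.replace("00000000000012af", "00000000000012b4")

delta = increased(parse_counters(SNAP1), parse_counters(SNAP2))
for key, diff in delta.items():
    print(key, diff)
```

Only counters whose growth tracks the observed traffic loss are worth chasing; counters that stay flat between snapshots can be set aside.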

Agenda

Initialization & Redundancy considerations


Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations


1st hop redundancy with VPC


[Figure: PC A and PC B send to the HSRP virtual MAC; packet labels "MAC_A vMAC IP A IP B" and "MAC_B vMAC IP B IP A"]
Left peer: Router MAC1 0001.0002.0003, Virtual MAC 0000.0c07.ac00
Right peer: Router MAC2 0005.0006.0007, Virtual MAC 0000.0c07.ac00

Each of the VPC peers will L3 forward packets destined to its respective Router MAC address
HSRP/VRRP/GLBP is used for 1st hop redundancy
Both switches will L3 switch packets to the vMAC address as long as one of them is HSRP active or HSRP standby. If both switches are in the HSRP listen state, they will not L3 switch packets to the vMAC

First hop redundancy troubleshooting


Standby peer:
Interface Vlan1
  ip address 1.1.1.252/24
  hsrp 1
    ip 1.1.1.254

Active peer:
Interface Vlan1
  ip address 1.1.1.253/24
  hsrp 1
    ip 1.1.1.254

Nexus# sh hsrp brief
Interface   Grp Prio P State    Active addr   Standby addr  Group addr
Vlan1       1   100    Standby  1.1.1.253     local         1.1.1.254

Nexus# sh mac address-table address 0000.0c07.ac01
   VLAN     MAC Address      Type      age   Secure NTFY    Ports
---------+-----------------+--------+-----+------+------+----------
G 1        0000.0c07.ac01    static          False  False  sup-eth1(R)

Nexus2# sh hsrp brief
Interface   Grp Prio P State    Active addr   Standby addr  Group addr
Vlan1       1   100    Active   local         1.1.1.252     1.1.1.254

Nexus2# sh mac address-table address 0000.0c07.ac01
   VLAN     MAC Address      Type      age   Secure NTFY    Ports
---------+-----------------+--------+-----+------+------+----------
G 1        0000.0c07.ac01    static          False  False  sup-eth1(R)

Both peers will L3 forward packets destined to the vMAC address as long as either peer in the VPC domain is in active or standby state for the corresponding group
Virtual MAC address (vMAC) will be installed in both peers
G (gateway) flag must be present on any MAC address for which the Nexus is expected to L3 forward packets
Only the active peer will respond to ARP for the VIP

1st hop issue with some devices


[Figure: PC A sends "MAC_A vMAC IP A IP B"; the left switch forwards "Router MAC1 MAC_B IP A IP B" to Server B; Server B replies with "MAC_B Router MAC1 IP B IP A"; steps 1-5 marked, X on the dropped frame]
Left peer: Router MAC1 0001.0002.0003, Virtual MAC 0000.0c07.ac00
Right peer: Router MAC2 0005.0006.0007, Virtual MAC 0000.0c07.ac00

1. PC A sends a packet to Server B
2. Left VPC switch will receive the packet and forward it to Server B; note that the source MAC of the outgoing packet will be that of Router1
3. Server B, responding to PC A, will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)
4. If the frame from B to A is load-balanced to the right switch, the MAC address of Router1 will point to the Peer-Link and this is where the frame will be sent
5. Left switch will receive the frame from the Peer-Link and drop it

Why? Frames received from Peer-Link are never sent out of a VPC, except VPCs without operational ports on the ingress switch: egress port ASICs will drop the frame (VPC check)

Peer-Gateway : the workaround


[Figure: Server B replies to PC A with "MAC_B Router MAC1 IP B IP A"; the reply is load-balanced to the right switch]
Left peer: Router MAC1 0001.0002.0003, Router MAC2 0005.0006.0007, Virtual MAC 0000.0c07.ac00
Right peer: Router MAC2 0005.0006.0007, Router MAC1 0001.0002.0003, Virtual MAC 0000.0c07.ac00

With peer-gateway, both peers will install each other's router MACs in the L2 table, which will allow them to L3 forward traffic destined to either Router MAC
Server B, responding to PC A, will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)
Right switch will forward the packet towards the destination

Peer-Gateway : the implications


[Figure: top device sends "MAC_B Router MAC1 IP TOP IP LEFT, TTL 1"; steps 1-2 marked]
Left peer: Router MAC1 0001.0002.0003, Router MAC2 0005.0006.0007, Virtual MAC 0000.0c07.ac00
Right peer: Router MAC2 0005.0006.0007, Router MAC1 0001.0002.0003, Virtual MAC 0000.0c07.ac00

1. Top device attempts to establish an OSPF adjacency with the left switch
2. If peer-gateway is enabled in the VPC domain and the OSPF unicast packet is load-balanced to the right switch, this packet will be dropped

Why?
  Right switch will try to L3-switch the unicast packet (because RouterMAC1 is marked as a gateway MAC and the destination IP is not local)
  As the packet has TTL==1, it will be dropped
  The same applies to any other protocol that uses unicast packets with TTL==1 entering the right switch but destined to the left switch (or vice versa)

Routing protocol peering with devices attached to the VPC domain via an SVI interface is not supported
A routed interface should be used in this case
There is a peer-gateway exclude-vlan command to turn off peer-gateway on certain vlans
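The TTL==1 interaction can be traced with a small decision sketch in Python. It is a simplified model of the reasoning on this slide, not of the forwarding hardware; the function name, the return labels, and the IP addresses are hypothetical.

```python
def classify(dst_mac, dst_ip, ttl, *, local_router_mac, peer_router_mac,
             local_ips, peer_gateway):
    """Decide what a VPC peer does with a unicast IP packet (simplified)."""
    gateway_macs = {local_router_mac}
    if peer_gateway:
        gateway_macs.add(peer_router_mac)  # peer's router MAC is also a gateway MAC
    if dst_mac not in gateway_macs:
        return "bridge"    # plain L2 lookup, e.g. towards the Peer-Link
    if dst_ip in local_ips:
        return "receive"   # packet is for this switch itself
    if ttl <= 1:
        return "drop-ttl"  # routing would decrement TTL to 0
    return "route"


ROUTER_MAC1, ROUTER_MAC2 = "0001.0002.0003", "0005.0006.0007"
# OSPF unicast from the top device to the LEFT switch, hashed to the RIGHT switch.
packet = dict(dst_mac=ROUTER_MAC1, dst_ip="10.0.0.1", ttl=1)
right_switch = dict(local_router_mac=ROUTER_MAC2, peer_router_mac=ROUTER_MAC1,
                    local_ips={"10.0.0.2"})

with_pg = classify(**packet, **right_switch, peer_gateway=True)
without_pg = classify(**packet, **right_switch, peer_gateway=False)
print(with_pg, without_pg)
```

With peer-gateway the right switch claims the packet for routing and the TTL check kills it; without peer-gateway the packet is bridged to the Peer-Link and reaches the left switch's own control plane.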

VPC Agenda

Initialization & Redundancy considerations


Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations


IP Multicast with VPC


[Figure: RP and Source S1 upstream; Primary and Secondary VPC peers acting as DR and Proxy-DR; Receiver behind the VPC sending an IGMP join; CFS:IGMP between the peers. State shown: (*,G) with VPC in the OIF list on both peers; (S1,G) with VPC on the DR and (S1,G) null on the Proxy-DR]

Receiver sends IGMP report (join)
Access switch sends join to right VPC peer
Right VPC peer creates (*,G) and adds the VPC to the OIF list (as proxy-DR)
IGMP is encapsulated in CFS and sent to the left peer
Left peer (DR) creates (*,G), adding the VPC to the OIF list
DR (left peer) sends a PIM Join to the RP
Once (S1,G) traffic starts arriving, the VPC peers will resolve which one will be the forwarder for that (S,G): the peer with the best metric to the source, or the primary in a tie (this mechanism is specific to PIM in VPC mode; normally PIM would use assert)
Only the forwarder will have OIFs populated in (S,G); the non-forwarder won't have VPC SVIs in its OIF list
Forwarder will send a copy of the frame to the Peer-Link for receivers single-connected to the other peer

Goal is to allow the peer receiving source traffic to forward it to receivers behind the VPC without crossing the Peer-Link (VPC check will drop such traffic otherwise)

IP Multicast with VPC: Prebuilt-SPT


[Figure: RP and Source S1 upstream; Secondary and Primary peers, DR and New DR; Receiver behind the VPC. State shown: (*,G) with VPC on both peers; (S1,G) with VPC on the DR, (S1,G) null on the other peer]

In case of DR failure, the proxy-DR becomes DR and copies the OIF list from (*,G) to (S,G), but it will also need to pull traffic from the RP/source, which delays recovery
With ip pim pre-build-spt, the proxy-DR will also send a PIM Join to the source/RP to draw the traffic
Traffic pulled by the proxy-DR will be dropped until it becomes DR: provision uplink and replication bandwidth accordingly


IP Multicast with VPC: source behind VPC


[Figure: RP upstream; Source S1 behind VPC1; Receiver behind VPC2; Primary peer as DR and Secondary peer as Proxy-DR. State shown: (*,G) and (S1,G) with VPC2 in the OIF list on both peers]

When the source is behind a VPC, both DR and Proxy-DR will add OIFs for the group to (S,G)
This is because either peer can receive source traffic and needs to be able to send it to receivers behind VPCs without crossing the Peer-Link (to avoid the traffic being dropped by VPC check)

When VPC is configured on an N7K-F248XP-25 linecard (F2), there is no proxy-DR function (due to hardware specifics). The packet will be bridged to the DR over the Peer-Link (VPC check is modified accordingly for L3 multicast packets on F2 linecards)

Forwarder election in VPC


Peers exchange metrics over CFS for each new source
The peer that has the better metric to the source, or the primary in a tie, will be the forwarder

VPC1# sh ip pim internal vpc rpf
Source: 10.0.1.1
  Pref/Metric: 110/21
  Source role: primary
  Forwarding state: Win (forwarding)

For sources behind a VPC, both peers will forward, as they have no control over which one will get the traffic

VPC1# sh ip pim internal vpc rpf
Source: 1.1.1.1
  Pref/Metric: 0/0
  Source role: primary
  Forwarding state: Win-force (forwarding)
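The election rule can be written down as a short Python sketch. It assumes that "better" means a numerically lower preference and then a lower metric, as is usual for route comparison; the type `RpfInfo` and the function name are hypothetical and this is not the PIM implementation.

```python
from typing import NamedTuple


class RpfInfo(NamedTuple):
    pref: int         # route preference towards the source
    metric: int       # route metric towards the source
    is_primary: bool  # VPC role of this peer


def local_forwards(local: RpfInfo, peer: RpfInfo, source_behind_vpc: bool = False) -> bool:
    """Return True if the local peer forwards traffic for this source."""
    if source_behind_vpc:
        return True  # both peers forward ("Win-force"): either one may receive the traffic
    mine, theirs = (local.pref, local.metric), (peer.pref, peer.metric)
    if mine != theirs:
        return mine < theirs  # assumption: lower pref/metric is better
    return local.is_primary   # tie: the primary wins


vpc1 = RpfInfo(pref=110, metric=21, is_primary=True)
vpc2 = RpfInfo(pref=110, metric=30, is_primary=False)
print(local_forwards(vpc1, vpc2), local_forwards(vpc2, vpc1))
```

For any source not behind a VPC exactly one peer wins, which is what keeps receivers behind the VPC from getting duplicate streams.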


VPC multicast: following packet flow


Control plane state for this group: where did the information come from, is the entry stable (uptime), what is the RPF interface?

Nexus# show ip mroute 239.1.2.3
(*, 239.1.2.3/32), uptime: 06:46:05, igmp pim ip static
  Incoming interface: Vlan36, RPF nbr: 36.0.0.3
  Outgoing interface list: (count: 2)
    Ethernet2/43, uptime: 03:01:36, static
    Vlan37, uptime: 06:46:05, igmp
(33.0.0.33/32, 239.1.2.3/32), uptime: 06:46:05, ip pim mrib
  Incoming interface: Vlan36, RPF nbr: 36.0.0.3
  Outgoing interface list: (count: 2)
    Ethernet2/43, uptime: 03:01:36, mrib
    Vlan37, uptime: 06:46:04, mrib

Where are the receivers on this vlan?

Nexus# show ip igmp snooping groups vlan 37
Type: S - Static, D - Dynamic, R - Router port
Vlan  Group Address  Ver  Type  Port list
37    */*                 R     Vlan37
37    239.1.2.3      v2   D     Eth2/8

Is traffic being switched for this group? Are packets being switched by this entry?
Counters are updated once per ~1 minute; aps is the average packet size; sw-pkts are packets forwarded in software

Nexus# show ip mroute 239.1.2.3 summary software-forwarded
Total number of routes: 3
Total number of (*,G) routes: 1
Total number of (S,G) routes: 1
Total number of (*,G-prefix) routes: 1
Group count: 1, rough average sources per group: 1.0

Group: 239.1.2.3/32, Source count: 1
Source      packets   bytes      aps  pps  bit-rate     oifs
(*,G)       0         0          0    0    0.000 bps    2
  sw-pkts: 0
33.0.0.33   5046908   252345396  49   200  80.053 kbps  2
  sw-pkts: 1

Following the flow: forwarding information


This is platform independent forwarding information
Slot 1 holds the ingress linecard entry, slot 2 the egress linecard entry
Counters are updated once per ~1 minute
Counters between ingress and egress do not have to match, as the information is not collected at exactly the same time, a receiver might join after the entry was created, etc.

Nexus# show forwarding multicast route group 239.1.2.3
slot 1
=======
(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:0 Bytes:0
    Ethernet2/43 Outgoing Packets:N/A Bytes:N/A

(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
  Received Packets: 5723369 Bytes: 366295616
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:0 Bytes:0
    Ethernet2/43 Outgoing Packets:N/A Bytes:N/A

slot 2
=======
(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:5725816 Bytes:366452224
    Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816

(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
  Received Packets: 0 Bytes: 0
  Number of Outgoing Interfaces: 2
  Outgoing Interface List Index: 4
    Vlan37 Outgoing Packets:5725816 Bytes:366452224
    Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816


When traffic arrives via VPC

How to find which slot receives the (S,G) flow when the ingress interface is a port-channel scattered across several modules?
Use show forwarding multicast route group <g> source <s>

Nexus# show forwarding multicast route group 239.1.1.1 source 1.0.1.2 | i Received|slot
slot 1
  Received Packets: 0 Bytes: 0
slot 2
  Received Packets: 727203 Bytes: 487290999
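If the output has to be checked for many flows, picking the ingress slot can be automated. This is a minimal Python sketch that assumes the filtered output format shown above (a "slot N" line followed by "Received Packets" lines); the function name is hypothetical.

```python
import re


def received_per_slot(output):
    """Return {slot: received packets} from the filtered command output."""
    totals, slot = {}, None
    for raw in output.splitlines():
        line = raw.strip()
        m = re.match(r"slot\s+(\d+)", line)
        if m:
            slot = int(m.group(1))
            totals.setdefault(slot, 0)
            continue
        m = re.match(r"Received Packets:\s*(\d+)\s+Bytes:\s*(\d+)", line)
        if m and slot is not None:
            totals[slot] += int(m.group(1))
    return totals


OUTPUT = """\
slot 1
  Received Packets: 0 Bytes: 0
slot 2
  Received Packets: 727203 Bytes: 487290999
"""

totals = received_per_slot(OUTPUT)
ingress_slot = max(totals, key=totals.get)
print(totals, ingress_slot)
```

Here slot 2 is the module actually receiving the flow, so that is the module to inspect first for drops.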


Are there drops in forwarding path?


Start looking from the ingress module

Nexus# show hardware internal errors module 1
----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
...
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 Mod: 1                                            |
| Last cleared @ Thu Apr 8 12:57:37 2010                                  |
| Device Statistics Category :: ERROR                                     |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
------------
259 L3 Fib Miss Pkt ctr 0000000000000007 1-32 I1
262 L3 Non-Rpf Drop Pkt ctr 0000000000125617 1-32 I1
319 NF2 V4 IPMAC Lkup Error 0000000000272277 1-32 I1
455 Exception cause: DROP (Unicast) 0000000000025510 1-32 I1
465 Exception cause: DROP (Multicast) 0000000000226148 1-32 I1

Always take several snapshots and look for drops that grow coherently with the [suspected] multicast traffic drops
There are always some drops shown by the above command; this doesn't always mean that actual network packets are dropped. Some of these are diag packets, some are packets that are dropped on blocked ports, extra floods, etc.

Review & Summary


Infrastructure
  Redundancy at process, supervisor, port-channel, chassis, VPC level
  Both peers are needed to bring up VPCs; autorecovery/reload-restore can change this
  Peer-Keepalive + Role define behavior during VPC failovers

Forwarding
  Traffic locality (VPC check) + no learning on Peer-Link
  No blocking ports (generally), but common L2 stability mechanisms are still important (LACP active, UDLD, BA, Dispute)
  Interfacing with L3 requires separate links + cross link

Troubleshooting
  Layered: always take basic info, narrow down to a layer/issue type before trying to recover
  Data plane: troubleshoot each peer like a normal switch, paying attention to nuances like VPC check, dual-DR and Router-MACs



Thank you.
