Troubleshooting Cisco Catalyst 3750, 3560, 2960-S and 2960-X Series Switches
John Wu
BU Escalation
Administrators spend most of their time…
Security-related configuration
Traffic optimization
Other
BRKCRS-3141 © 2014 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• Product Overview
• Areas of Troubleshooting
– PHY Local Link
– CPU
– Memory
– Port ASIC
– Power Over Ethernet
– Stacking
– Advanced Examples
• Tools and Tricks
Product Overview
3750-X 3560-X
2960-X/2960-XR
3750v2 3560-E
2960-S/SF
3560v2
Catalyst 3750-X: Architecture Overview
[Block diagram. Labels: Switch Fabric; Stack PHY; two stack cables (StackWise, StackWise Plus); three Port ASICs, each with a TCAM; CPU with SDRAM, Flash, Serial and 10/100; four 12-port PHYs (12X1G each; 24X1G POE); Modular PHY (10G or 1G). Numbered callouts 1 to 4 mark components on the slide.]
Switch Components: Troubleshooting Areas of Focus
[Architecture block diagram (Switch Fabric, Stack PHY, Port ASICs with TCAM, CPU, PHYs) annotated with troubleshooting questions: Stack cables, Stack errors, CPU High?, TCAM Resources?, Buffers running out?, QoS Issue?, Interface?, No PoE?]
Before We Start
• The outputs in this presentation are from the 3750 and 3750-X
• Troubleshooting the 2960, 3560, and 3750 series switches is basically the same
– Differences are called out
• Caution!
– debug and show platform commands follow in the slides
– Excessive debug output to the console may disable the switch
– show platform commands are intended for in-depth troubleshooting by Cisco engineers
– Use debug and show platform commands only as advised by TAC
• Troubleshooting Basics
– Check the syslog for warnings and errors
– Use common sense
– Some troubleshooting techniques impact switch operation
Troubleshooting Link Issues
[Block diagram callout: Link issues]
When are we concerned about link issues?
• Connectivity
• Traffic
Troubleshooting Link Issues
• Is the link up/up?
• Are packets being sent and received ?
• Are there any errors ?
• Collect data multiple times to confirm
Switch# show interface GigabitEthernet 1/0/1
GigabitEthernet1/0/1 is up, line protocol is up (connected)
....
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Output queue: 0/40 (size/max)
7539 packets input, 9856 bytes, 0 no buffer
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 4059 multicast, 0 pause input
0 input packets with dribble condition detected
3508 packets output, 3560 bytes, 0 underruns
0 output errors, 0 collisions, 4 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
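The advice to collect the data multiple times can be sketched as a small script that compares two captures of the counters above. This is only an illustration, not a Cisco tool: the function names and the second snapshot's values are made up, and the script assumes the "<count> <label>" layout shown in the output.

```python
import re

# Labels as they appear in the "show interface" output above.
ERROR_LABELS = ("input errors", "CRC", "runts", "giants",
                "output errors", "collisions", "interface resets")

def parse_error_counters(text):
    """Return {label: count} for every '<count> <label>' pair found in text."""
    counters = {}
    for label in ERROR_LABELS:
        match = re.search(r"\b(\d+) " + re.escape(label) + r"\b", text)
        if match:
            counters[label] = int(match.group(1))
    return counters

def incrementing(before, after):
    """Return {label: increase} for counters that grew between two snapshots."""
    return {label: after[label] - before[label]
            for label in after
            if label in before and after[label] > before[label]}

# First snapshot: counter lines copied from the output above.
SNAPSHOT_1 = """
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 4 interface resets
"""
# Second snapshot: hypothetical values, for illustration only.
SNAPSHOT_2 = """
0 runts, 0 giants, 0 throttles
12 input errors, 12 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 4 interface resets
"""

if __name__ == "__main__":
    print(incrementing(parse_error_counters(SNAPSHOT_1),
                       parse_error_counters(SNAPSHOT_2)))
```

A counter that is non-zero but static (such as the 4 interface resets) is not reported; only counters that are still incrementing point to an active problem.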
Ethernet Controller Stats
Details about errors
- Clear stats with the clear controllers ethernet-controller command
PHY Registers
Information from PHY registers
False carrier and symbol errors are reported
Link Issues: What Kind of Errors?
Switch# show int gi1/0/1 counters
Link Issues: Link Not Coming Up
• Check configured duplex and speed on both switch and host
• Upgrade the NIC drivers on the host to the latest version
• Try a different cable/NIC and switchport to exclude faulty hardware
SFP Link Issue Prevention – EEM TCL Use Case
• 1000Base-T (copper) and 100Base-FX SFPs have embedded PHYs, allowing speed and duplex to be configured on their respective interface
1. However, these settings are cleared as soon as the SFP is unplugged
2. If the same SFP type is re-inserted, its configuration is not recovered
• LinkUpApplyConfig is a Tcl policy (EEM) that monitors an SFP link-up event
• Speed and duplex settings (in startup-config) are automatically re-applied to the SFP interface
LinkUpApplyConfig.tcl can be downloaded at the following hyperlink:
https://supportforums.cisco.com/docs/DOC-23267
Link Issues: Checking Physical Cabling
• TDR feature helps determine possible cabling issues
Switch Hardware Components: CPU
• CPU Functions
- Runs the IOS
- Processes Control Plane traffic (LACP / STP / CDP / etc.)
- Processes packets that are not switched in Hardware
- Communicates with controller(s)
CPU: Troubleshooting Processes
• High CPU utilization is due to
• Processes taking up resources
• Forwarded Network Traffic
• Normal CPU utilization varies by
• Switch Model
• Number of connected ports, modules, switches in stack
• Feature set (LANBASE, IP BASE, IP SERVICES)
CPU: Troubleshooting Processes
• Is high CPU an issue?
• HW-switched traffic is not impacted by high CPU
• It is an issue when symptoms such as the following occur because control packets are not sent or received
• Spanning Tree Protocol (STP) reconverges
CPU: Troubleshooting Processes
Use “show process cpu history” to display the history of CPU utilization
CPU: Troubleshooting Processes
• Configure the CPU threshold
(config)# process cpu threshold type {total | process | interrupt}
\ rising percentage interval seconds [falling fall-percentage interval seconds]
Syslog message
*Mar 1 01:03:15.601: %SYS-1-CPURISINGTHRESHOLD: Threshold: Process CPU Utilisation
(Total/Intr): 18%/0%, Top 3 processes(Pid/Util): 4/10%, 75/1%, 164/0%
CPU: Troubleshooting Processes
Use “show process cpu sorted” to display current CPU utilization
Switch# show processes cpu sorted
CPU utilization for five seconds: 43%/7%; one minute: 28%; five minutes: 22%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
196 897835293 538983117 1665 6.05% 6.74% 10.05% 0 IP Input
102 46542612 69782387 666 2.33% 1.79% 1.61% 0 hpm main process
8 7967710 67451 118127 2.33% 0.29% 0.19% 0 Licensing Auto U
141 48894294 114699852 426 1.24% 1.01% 1.18% 0 Hulc LED Process
68 45347109 1374466 32992 1.24% 0.85% 0.86% 0 Adjust Regions
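In the header line, the five-second figure is written as total/interrupt: the first number is total CPU utilization and the second is the share spent at interrupt level, which is typically traffic punted to the CPU. The split can be sketched as follows; the function name is hypothetical and the regular expression assumes the header format shown above.

```python
import re

HEADER = re.compile(
    r"five seconds: (\d+)%/(\d+)%; one minute: (\d+)%; five minutes: (\d+)%")

def cpu_breakdown(header_line):
    """Split the 'show processes cpu' header into its components."""
    match = HEADER.search(header_line)
    if match is None:
        raise ValueError("not a CPU utilization header line")
    total, interrupt, one_min, five_min = map(int, match.groups())
    return {
        "total": total,                  # all CPU use over the last 5 seconds
        "interrupt": interrupt,          # portion spent at interrupt level
        "process": total - interrupt,    # portion spent in IOS processes
        "one_minute": one_min,
        "five_minutes": five_min,
    }

if __name__ == "__main__":
    line = ("CPU utilization for five seconds: 43%/7%; "
            "one minute: 28%; five minutes: 22%")
    print(cpu_breakdown(line))
```

A high process share points at the per-process table (for example IP Input), while a high interrupt share suggests looking at the CPU queues instead.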
16 CPU Queues and Port ASIC queues
16 different CPU Queues
Packets to CPU Queues first stored on port ASIC
Switch# show platform port-asic stats drop
Supervisor TxQueue Drop Statistics
Queue 0: 0
Queue 1: 0
Queue 2: 0
Queue 3: 0
Queue 4: 0
Queue 5: 0
Queue 6: 0
Queue 7: 1000
Queue 8: 0
Queue 9: 0
Queue 10: 0
Queue 11: 0
Queue 12: 0
Queue 13: 0
Queue 14: 0
Queue 15: 0
(These are packets dropped before reaching the CPU)

Switch# show controllers cpu-interface
cpu-queue-frames retrieved dropped
----------------- ---------- ----------
rpc 132917740 0
stp 31879262 0
ipc 10746915 0
routing protocol 267 0
L2 protocol 424610 0
remote console 1121711 0
sw forwarding 0 0
host 345 0
broadcast 13931 0
cbt-to-spt 0 0
igmp snooping 0 0
icmp 0 0
logging 0 0
rpf-fail 0 0
dstats 132935598 0
cpu heartbeat 82903147 0
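To find which CPU queue is receiving traffic, the show controllers cpu-interface counters can be captured twice and compared. The sketch below assumes each row is a queue name followed by the retrieved and dropped counters; the helper names and the second snapshot's values are hypothetical.

```python
import re

# A row is a queue name followed by at least two counters.
ROW = re.compile(r"^\s*([A-Za-z][A-Za-z0-9\- ]*?)\s+(\d+)\s+(\d+)")

def parse_cpu_queues(text):
    """Return {queue name: (retrieved, dropped)} from cpu-interface output."""
    queues = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            queues[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return queues

def busiest_queue(before, after):
    """Return (queue, increase) for the queue whose retrieved count grew most."""
    deltas = {name: after[name][0] - before[name][0]
              for name in after if name in before}
    name = max(deltas, key=deltas.get)
    return name, deltas[name]

# First snapshot: rows copied from the output above.
BEFORE = """
cpu-queue-frames retrieved dropped
----------------- ---------- ----------
stp 31879262 0
routing protocol 267 0
sw forwarding 0 0
host 345 0
"""
# Second snapshot: hypothetical values, for illustration only.
AFTER = """
cpu-queue-frames retrieved dropped
----------------- ---------- ----------
stp 31879300 0
routing protocol 267 0
sw forwarding 0 0
host 9345 0
"""

if __name__ == "__main__":
    print(busiest_queue(parse_cpu_queues(BEFORE), parse_cpu_queues(AFTER)))
```

The queue that grows fastest between captures indicates which of the per-queue sections that follow is relevant.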
16 CPU Queues & CPU Buffer Pools
• Each queue reserves buffers for specific traffic
• CPU buffer pools are named RxQ0 to RxQ15
Switch# show buffer | in RxQ
RxQ0 buffers (rpc)
RxQ1 buffers (stp)
RxQ2 buffers (ipc)
RxQ3 buffers (routing protocol)
RxQ4 buffers (L2 protocol)
RxQ5 buffers (remote console)
RxQ6 buffers (sw forwarding)
RxQ7 buffers (host)
RxQ8 buffers (broadcast)
RxQ9 buffers (cbt-to-spt)
RxQ10 buffers (igmp snooping)
RxQ11 buffers (icmp)
RxQ12 buffers (logging)
RxQ13 buffers (rpf-fail)
CPU: Software Forwarding Queue (Q6)
• For traffic that hardware cannot process
- SW forwarding performance is much lower than HW
[Screenshot callouts: To debug any CPU Q; Incoming physical interface; SMAC of the host sending the traffic]
CPU: Routing Protocol Queue (Q3)
• Receives all traffic for routing protocols (BGP, OSPF, EIGRP, HSRP, etc.)
CPU: Host Queue (Q7)
• Used for all unicast traffic sent to the switch
– TACACS, SSH, telnet, ping, SNMP
CPU: ICMP Queue (Q11)
• Receives all traffic for which an ICMP message needs to be generated
– Excluding PING
– Routed port only
CPU: ICMP Queue (Q11)
ICMP Unreachables Example
• Causes
– High amount of traffic is dropped because of a “deny” statement in an ACL
– CPU is interrupted to send ICMP unreachable packets back to the source
• Solution
– Disable ICMP unreachables on the ingress interface
CPU: Best Practices
• Storm Control can help to protect CPU
(config-if)#storm-control broadcast level level[.level]
(config-if)#storm-control action ?
shutdown Shutdown this interface if a storm occurs
trap Send SNMP trap if a storm occurs
Switch Hardware Components: Memory
Troubleshooting Memory Utilization
• Syslog messages are the most common indication
%SYS-2-MALLOCFAIL: Memory allocation of 1028 bytes failed from 0x601617A4, pool Processor, alignment 0
-Process= "IP Input", ipl= 2, pid= 21
%PLATFORM_RPC-0-RESOURCE_CRASH: System is unable to allocate memory for RPC
• Switch not accessible, any CLI output becomes “show process memory”, etc.
Switch# show memory statistics
Troubleshooting Memory Utilization
Two main causes of Processor memory problems
- Process does not release the memory after use
- Process does not limit the amount of memory it allocates
Is any process steadily increasing held memory?
Switch# show processes memory sorted
PID TTY Allocated Freed Holding Getbufs Retbufs Process
0 0 74539888 23738156 47199076 0 0 *Init*
0 0 3399716 17490880 1590292 10657136 553112 *Dead*
65 0 712620 27424 594488 0 0 Stack Mgr Notifi
324 0 19794764 19262624 539264 0 0 hulc running con
11 0 228060 14940 226488 0 0 ARP Input
Reference:
http://www.cisco.com/en/US/docs/ios/12_2s/feature/guide/fs_memnt.html
Memory Leak Example (using show commands)
Switch#show clock
09:34:41.300 UTC Wed Apr 3 2013
switch#show proc mem sorted
Processor Pool Total: 78964596 Used: 36942892 Free: 42021704
I/O Pool Total: 12574720 Used: 8583916 Free: 3990804
PID TTY Allocated Freed Holding Getbufs Retbufs Process
204 0 25509496 21347536 3957470 0 0 HTTP CORE

Switch#show clock
11:34:41.300 UTC Wed Apr 3 2013
Switch#show proc mem 204
Process ID: 204
Process Name: HTTP CORE
Total Memory Held: 4175420 bytes
Processor memory Holding = 4175420 bytes
pc = 0x015E5430, size = 2143156, count = 5265
pc = 0x0166F148, size = 1643716, count = 11583
pc = 0x01685C18, size = 231660, count = 351
pc = 0x004F4B60, size = 35136, count = 61
pc = 0x015E55C8, size = 27976, count = 351
pc = 0x015DB614, size = 25720, count = 351
pc = 0x01141F34, size = 2408, count = 2
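The two readings for HTTP CORE can be turned into a rough growth rate. The sketch below assumes the 3957470-byte reading was taken at 09:34:41 and the 4175420-byte reading at 11:34:41 (the slide does not state this pairing explicitly), and that growth is linear; the function name is hypothetical.

```python
from datetime import datetime

def leak_rate(t0, held0, t1, held1):
    """Average growth of held memory in bytes per second between two samples."""
    return (held1 - held0) / (t1 - t0).total_seconds()

# Timestamps and values from the slide; their pairing is an assumption.
T0 = datetime(2013, 4, 3, 9, 34, 41)
T1 = datetime(2013, 4, 3, 11, 34, 41)
HELD_0 = 3957470           # Holding from "show proc mem sorted"
HELD_1 = 4175420           # Total Memory Held from "show proc mem 204"
PROCESSOR_FREE = 42021704  # Processor pool free bytes

RATE = leak_rate(T0, HELD_0, T1, HELD_1)
DAYS_UNTIL_EXHAUSTED = PROCESSOR_FREE / RATE / 86400

if __name__ == "__main__":
    print(f"growth: {HELD_1 - HELD_0} bytes in 2 hours ({RATE:.2f} bytes/s)")
    print(f"free processor memory would last about {DAYS_UNTIL_EXHAUSTED:.1f} days")
```

A single pair of samples is only suggestive; a process that keeps growing across several captures is the real sign of a leak.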
IOS Memory Leak Detector
• Inbuilt Memory Leak Detector can be used to detect memory leaks
show memory debug leaks [chunks | largest | lowmem | summary]
Switch#show memory debug leak
Adding blocks for GD...
I/O memory
Address Size Alloc_pc PID Alloc-Proc Name
Processor memory
Address Size Alloc_pc PID Alloc-Proc Name
28D91E8 72 16EF0B4 0 *Dead* SSH2 String
28D927C 124 16EF0B4 0 *Dead* SSH2 String
28D9358 68 16EF0B4 0 *Dead* SSH2 String
Reference:
http://www.cisco.com/en/US/partner/docs/ios/fundamentals/configuration/guide/cf_mem-leak-detect.html
Troubleshooting: I/O Memory Buffers
• I/O memory for incoming CPU bound packets
• Classified into two major pools:
- Public Buffer Pools (pre-assigned based on the packet size)
- Interface buffer pools (pre-assigned for 15 CPU queues)
• Look for incrementing Failures and No Memory in show buffers output
• show memory debug leak can detect I/O memory leaks as well
Switch# show buffers
Buffer elements:
1679 in free list (500 max allowed)
27109526 hits, 0 misses, 1641 created
Public buffer pools:
Small buffers, 104 bytes (total 50, permanent 50, peak 181 @ 3w5d):
49 in free list (20 min, 150 max allowed)
129877853 hits, 141 misses, 390 trims, 390 created
0 failures (0 no memory)
Middle buffers, 600 bytes (total 25, permanent 25, peak 94 @ 7w0d):
25 in free list (10 min, 150 max allowed)
616791 hits, 54 misses, 162 trims, 162 created
0 failures (0 no memory)
:(truncated)
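Checking every pool for failures by eye is tedious, so the failures and no memory counters can be pulled out per pool. This is a sketch that assumes each pool header sits on one line, as in real CLI output rather than the wrapped slide text; the helper name and the non-zero Middle values are hypothetical.

```python
import re

POOL = re.compile(r"^\s*([\w ]+?) buffers, (\d+) bytes")
FAILURES = re.compile(r"(\d+) failures \((\d+) no memory\)")

def buffer_failures(text):
    """Return {pool name: (failures, no_memory)} from 'show buffers' output."""
    result = {}
    pool = None
    for line in text.splitlines():
        header = POOL.match(line)
        if header:
            pool = header.group(1)
            continue
        counters = FAILURES.search(line)
        if counters and pool is not None:
            result[pool] = (int(counters.group(1)), int(counters.group(2)))
    return result

# Small pool copied from the output above; Middle pool failure
# counters are hypothetical, for illustration only.
SAMPLE = """
Small buffers, 104 bytes (total 50, permanent 50, peak 181 @ 3w5d):
49 in free list (20 min, 150 max allowed)
129877853 hits, 141 misses, 390 trims, 390 created
0 failures (0 no memory)
Middle buffers, 600 bytes (total 25, permanent 25, peak 94 @ 7w0d):
25 in free list (10 min, 150 max allowed)
616791 hits, 54 misses, 162 trims, 162 created
3 failures (1 no memory)
"""

if __name__ == "__main__":
    for name, (failures, no_memory) in buffer_failures(SAMPLE).items():
        print(name, failures, no_memory)
```

As with the interface counters, what matters is whether the values increment between captures.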
Switch Hardware Components: Port ASIC Issues
Is it a “performance” problem?
‒ Packet drops on a port
Troubleshooting ASIC Issues : HW Forwarding
• L2 and L3 forwarding decision based on
- Routing/Switching tables
- ACL Redirection (PBR/WCCP)
Switch# show plat forward <src intf> <srcmac> <dstmac> [ip <srcip> <dstip>
<protocol>]
Troubleshooting ASIC Issues: Software Forwarding
ASIC cannot process data packets and forwards them to CPU using Q6
Switch# clear controllers cpu
Switch# show controllers cpu-interface | include sw forwarding
sw forwarding 71558 0 0 0 0
Switch# debug platform cpu-queues software-fwd-q
SW-FWD-Q:Consumed by SW-Bridging: Remote Port Blocked L3If:Vlan101 L2If:GigabitEthernet1/0/2
DI:0x2FD, LT:7, Vlan:101 SrcGPN:2, SrcGID:2, ACLLogIdx:0x0, MacDA:000f.f7e8.e042, MacSA:
0000.00bb.87df IP_SA:10.101.1.100 IP_DA:10.99.1.100 IP_Proto:255
Switch# show plat forward Gi1/0/2 0000.00bb.87df 000f.f7e8.e042 ip 10.101.1.100 10.99.1.100 255
Redirected by Input ACL. New destIndex is 0x02C7.
==========================================
Egress: ASIC 0, switch 1
CPU queues: 6 14.
ASIC Issues: Mapping Interfaces to Port-ASIC
• show platform pm if-numbers shows this mapping
• Physical and ASIC port numbers may not match
• This command shows all members
interface gid gpn lpn port slot unit slun port-type lpn-idb gpn-idb
----------------------------------------------------------------------
Gi3/0/1 109 109 1 1/1 3 1 1 local Yes Yes
Gi3/0/2 110 110 2 1/0 3 2 2 local Yes Yes
Gi3/0/3 111 111 3 1/3 3 3 3 local Yes Yes
Gi3/0/4 112 112 4 1/2 3 4 4 local Yes Yes
Gi3/0/5 113 113 5 1/5 3 5 5 local Yes Yes
Gi3/0/6 114 114 6 1/4 3 6 6 local Yes Yes
Gi3/0/7 115 115 7 1/7 3 7 7 local Yes Yes
(The port column reads ASIC/Port)
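Because physical and ASIC port numbers may not match, it can help to turn the table into a lookup. The sketch below assumes the fifth column (port) holds ASIC/Port as the slide callout indicates; the function name is hypothetical.

```python
import re

INTERFACE = re.compile(r"^[A-Za-z]+\d+/\d+/\d+$")

def asic_port_map(text):
    """Return {interface: (asic, asic_port)} from the if-number table rows."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with an interface name such as Gi3/0/1.
        if len(fields) >= 5 and INTERFACE.match(fields[0]):
            asic, port = fields[4].split("/")
            mapping[fields[0]] = (int(asic), int(port))
    return mapping

# Rows copied from the table above.
TABLE = """
interface gid gpn lpn port slot unit slun port-type lpn-idb gpn-idb
Gi3/0/1 109 109 1 1/1 3 1 1 local Yes Yes
Gi3/0/2 110 110 2 1/0 3 2 2 local Yes Yes
Gi3/0/3 111 111 3 1/3 3 3 3 local Yes Yes
"""

if __name__ == "__main__":
    for interface, (asic, port) in asic_port_map(TABLE).items():
        print(f"{interface}: ASIC {asic}, port {port}")
```

In this sample Gi3/0/1 and Gi3/0/2 sit on ASIC 1 ports 1 and 0 respectively, which shows why the front-panel order cannot be assumed.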
ASIC Issues: Port-ASIC Statistics
Provides overview of possible drops/issues on the switch
Local and Member switches
Switch# remote command 2 show controller ethernet-controller port-asic statistics
===========================================================================
Switch 2, PortASIC 0 Statistics
---------------------------------------------------------------------------
0 RxQ-0, wt-0 enqueue frames 0 RxQ-0, wt-0 drop frames
8811506 RxQ-0, wt-1 enqueue frames 0 RxQ-0, wt-1 drop frames
0 RxQ-0, wt-2 enqueue frames 0 RxQ-0, wt-2 drop frames
ASIC Issues: Egress Queue Drops
• Queue and weight are 0-based
• Tuning of buffers is only possible when QoS is enabled
• Drops on egress indicate oversubscription
Switch Hardware Components: Port ASIC QoS
• Ingress QoS
• Egress QoS
Cisco Catalyst 3750 QoS Overview
[Diagram: Input traffic, Classify, four Policer/Marker pairs, ingress Queues 1-2 with SRR, StackWise, egress Queues 1-4 with SRR]
Classification
• Inspect incoming packets
• Assign QoS label to grouped packet
• Use ACL, or other configuration to determine QoS labels
Policing
• Compares incoming traffic rate with configured policer and determines if packet is in or out of profile
• Either aggregate or individual flow basis
• 256 policers/ASIC
Marking
• Act on policer decision
• Reclass or drop out-of-profile
Ingress Queue/Schedule Congestion Control
• Two queues/port ASIC shared servicing
• One queue is configurable for strict priority servicing
• WTD for congestion control (three thresholds per queue)
• SRR is performed
Egress Queue/Schedule Congestion Control
• Four SRR queues/port shared or shaped servicing
• One queue is configurable for strict priority servicing
• WTD for congestion control (three thresholds per queue)
• Egress queue shaping
• Egress port rate limiting
Cisco Catalyst 2960S/X QoS Model
[Diagram: Input traffic, Classify, four Policer/Marker pairs, egress Queues 1-4 with SRR; NO Ingress Queues]
Classification
• Inspect incoming packets
• Assign QoS label to grouped packet
• Use ACL, or other configuration to determine QoS labels
Policing
• Compares incoming traffic rate with configured policer and determines if packet is in or out of profile
• Either aggregate or individual flow basis
• 256 policers/ASIC
Marking
• Act on policer decision
• Reclass or drop out-of-profile
Egress Queue/Schedule Congestion Control
• Four SRR queues/port shared or shaped servicing
• One queue is configurable for strict priority servicing
• WTD for congestion control (three thresholds per queue)
• Egress queue shaping
• Egress port rate limiting
Why Ingress QoS ?
QoS Troubleshooting – Ingress
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on access port Gi1/0/2 and leave on a dot1q trunk; ingress policer with trust DSCP]
dscp: incoming
-------------------------------
0 - 4 : 0 0 0 0 0
30 - 34 : 0 0 0 0 10000
...
Policer: Inprofile: 1467 OutofProfile: 8533
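Each row of the dscp statistics covers five consecutive DSCP values, so in the "30 - 34" row the last column is DSCP 34. A sketch of expanding the rows into per-DSCP counters follows; the function name is hypothetical and the row format is assumed to match the output above.

```python
import re

ROW = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*:\s*(.*\S)\s*$")

def expand_dscp_rows(text):
    """Return {dscp: packet count} from 'first - last : c c c c c' rows."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        first, last = int(match.group(1)), int(match.group(2))
        values = [int(value) for value in match.group(3).split()]
        for dscp, value in zip(range(first, last + 1), values):
            counts[dscp] = value
    return counts

# Rows and policer counters copied from the output above.
STATS = """
dscp: incoming
-------------------------------
0 - 4 : 0 0 0 0 0
30 - 34 : 0 0 0 0 10000
"""
IN_PROFILE = 1467
OUT_OF_PROFILE = 8533

if __name__ == "__main__":
    counts = expand_dscp_rows(STATS)
    print({dscp: n for dscp, n in counts.items() if n})
    print("policed total:", IN_PROFILE + OUT_OF_PROFILE)
```

The in-profile and out-of-profile counters add up to the 10000 packets received with DSCP 34, so every packet was seen by the policer.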
Why Egress QoS? – Rate Transition
• Slower speed interfaces take longer to transmit packets
• Introduction of Gigabit servers pushes congestion to the edge
• QoS drops lowest priority packets
Egress Queuing
[Diagram: QoS pipeline with Input traffic, Classify, four Policer/Marker pairs, ingress Queues 1-2 with SRR, StackWise, egress Queues 1-4 with SRR]
What is an Egress Queue-set
Buffer Allocation and Drop Strategy
Mapping Classes to Egress Queues
• Maps available for DSCP and COS.
• 64 DSCP default values shown.
• Each DSCP value maps to an egress Queue, and threshold
• Queues range: 1-4, Threshold range:01-03
DSCP:0
Queue 4:
Threshold 3
QoS Troubleshooting - Ingress
Remember this from a few slides ago?
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on access port Gi1/0/2 and leave on dot1q trunk Gi1/0/1; ingress policer with trust DSCP]
dscp: incoming
-------------------------------
0 - 4 : 0 0 0 0 0
30 - 34 : 0 0 0 0 10000
...
Policer: Inprofile: 1467 OutofProfile: 8533
QoS Troubleshooting - Egress
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on access port Gi1/0/2 and leave on dot1q trunk Gi1/0/1; ingress policer with trust DSCP]
25 - 29 : 0 0 0 0 0
30 - 34 : 0 0 0 0 1467
<output removed>
QoS Troubleshooting – Egress (2)
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on access port Gi1/0/2 and leave on dot1q trunk Gi1/0/1]
• 1467 packets were in profile and made it to the egress port but with DSCP 0 instead
of 34
• Possible reasons
– Attached service policy does not mark or trust dscp value
– Traffic is being routed via the CPU
QoS Troubleshooting – Egress Q Maps
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on Gig 1/0/2 (100Mb/s) and egress on Gig 1/0/1 (10Mb/s)]
QoS Troubleshooting – Egress Queue Thresholds
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on Gig 1/0/2 (100Mb/s) and egress on Gig 1/0/1 (10Mb/s)]
Switch# show mls qos interface Gig 1/0/1 statistics
dscp: outgoing
-------------------------------
0 - 4 : 0 0 0 0 0
30 - 34 : 0 0 0 0 1080
...
output queues enqueued:
queue: threshold1 threshold2 threshold3
-----------------------------------------
queue 0: 2 0 0
queue 1: 0 6 260
queue 2: 0 0 0
queue 3: 1080 0 0
• 1080 packets will egress on Q4, threshold 1 (shown as queue 3 because the output is 0-based)
• Remaining 8920 pkts dropped because of congestion
QoS Troubleshooting - Port-ASIC
[Topology: 10000 IP packets with DSCP 34 enter the 3750 on Gig 1/0/2 (100Mb/s) and egress on Gig 1/0/1 (10Mb/s)]
QoS Troubleshooting - Buffer Tuning
Tuning Buffers and Thresholds to fix Congestion
• Queue-sets define the buffer allocation
• Default values can be modified
• 2 Queue-sets are available
• Reserved - how many buffers will be reserved for this port
– Default Queue-set values listed below
Switch# show mls qos int gi1/0/1 buffers
GigabitEthernet1/0/1
The port is mapped to qset : 1
(Identifies the Queue-set assigned to the interface)
The allocations between the queues are : 25 25 25 25
Egress QoS Summary
• Packet drops don’t always indicate a problem
– For example, Gigabit servers can easily oversubscribe 100M clients
– Most protocols react well to drops and will slow down so that maximum performance can be achieved
• Analyze traffic patterns
• Tune buffers as needed – increasing thresholds has minimal side effects
• Take advantage of both queue-sets
– E.g.: use Queue-set 1 on downlinks, Queue-set 2 on uplinks
Switch Hardware Components: TCAM
[Block diagram callout: TCAM Resources?]
The TCAM stores:
- Forwarding database: IPv4, IPv6 and MAC addresses
- ACLs: service policies and security
- Multicast addresses and groups
TCAM: Switch Database Manager (SDM)
• SDM defines how TCAM resources are allocated
• Changing SDM template requires reboot
• All stack members must use same SDM template
List of available SDM types:
Switch# show sdm prefer ?
access Access bias
default Default bias
dual-ipv4-and-ipv6 Support both IPv4 and IPv6
routing Unicast bias
vlan VLAN bias
Switch# show sdm prefer dual-ipv4-and-ipv6 ?
default Default bias
routing Unicast bias
vlan VLAN bias

Switch# show sdm prefer default
"desktop default" template:
The selected template optimizes the resources in
the switch to support this level of features for
8 routed interfaces and 1024 VLANs.
number of unicast mac addresses: 6K
number of IPv4 IGMP groups + multicast routes: 1K
number of IPv4 unicast routes: 8K
number of directly-connected IPv4 hosts: 6K
number of indirect IPv4 routes: 2K
number of IPv4 policy based routing aces: 0
number of IPv4/MAC qos aces: 0.5K
number of IPv4/MAC security aces: 1K
TCAM Utilization
• TCAM space is limited
Switch# show platform ip unicast failed route
(Shows route entries not forwarded in H/W)
TCAM Overload
• An error message will get generated
• Traffic forwarding will be done (partly) in Software
• CPU utilization will go up – packets punted to CPU for processing
%ACLMGR-4-UNLOADING: Unloading ACL input label 1 VLAN interfaces 101 IPv4/Mac feature
%ACLMGR-4-ACLTCAMFULL: ACL TCAM Full. Software Forwarding packets on Input label 1 on L3 L2
Switch# sh platform acl oacltcamfull
Vlan oacl_tcam_full_bitmap notify_apps
101 0x 0 NOT-FULL
Vlan ipv6_oacl_tcam_full_bitmap notify_apps
[Slide callout: Means ACL Not Fully Programmed in TCAM]
TCAM: Switch Database Manager (SDM)
• Strategies to choose SDM
TCAM Hardware Summary
• TCAM Partition based on SDM Template
• L2 and L3 overload of TCAM resource: punt to CPU
• Number of ACEs depends on
– Switch Model
– SDM Template – different Templates for Layer 3 capable switches
Troubleshooting Power Over Ethernet
Troubleshooting PoE: PD Drawing Too Much Power
An Imax or Tstart error is reported when a PD misbehaves and draws more power than allowed
- Imax error is an operating fault, reported after PD power-up
- Tstart is a start-up fault, reported before the PD reports Power Good
%ILPOWER-3-CONTROLLER_PORT_ERR: Controller port error, Interface gig1/0/1 Power Controller reports power Imax error detected
%ILPOWER-3-CONTROLLER_PORT_ERR: Controller port error, Interface: Power Controller reports power Tstart error detected
Troubleshooting PoE: Power given but Power Good not reported
PD detected. Power was granted but the PD was not up
PD cannot provide MPS (10mA)
%ILPOWER-3-CONTROLLER_PORT_ERR: Controller port error, Interface Gi8/0/1: Power given, but Power Controller does not report Power Good
Troubleshooting PoE: Multiple ports not providing power
Check the controller status and Error counters
Troubleshooting PoE: Useful Debugs
Specific debug conditions can be used where available
Switch#debug condition interface gigabitEthernet 1/0/15
Condition 1 set
Switch#debug ilpower powerman
ILPOWER powerman debugging is on
ilpower_power_assgn_handle_event: event 0,
pwr assign is done by proto CDP
Port Gi1/0/9: Selected Protocol CDP
Ilpower interface (Gi1/0/9) process tlv from cdp INPUT:
power_request_level[] = 12000 0 0 0 0
Switch Hardware Components: Stacking
Only 3750, 3750G, 3750-E, 3750-X support stacking with StackWise or StackWise Plus
‒ C2960-S, C2960-SF, 2960-X, 2960-XR support FlexStack or FlexStack Plus
Conditions that can prevent a switch from joining a stack:
- Incompatible IOS Versions between the stack members
- Stack cable issue
- SDM Template mismatch
Troubleshooting Stacks: Version Mismatch
• Software Version Mismatch
– IOS version of all stack switches (show version) should be either the same or
compatible
• Switches with different Major Version numbers
– Occurs on switch member addition, or RMA replacement
Troubleshooting Stacks, Stack Cables
• A switch can join a stack with only one StackWise interface connected to another active stack member.
• Important precautions for connecting StackWise cables
- Retainer screws on the connector should not be loose
- Retainer screws on the connector should not be too tight
- Retainer screws should be tightened “finger tight” and no more
Troubleshooting: Stack Commands
• Details on the stack members: members 1 and 3 are active, member 2 is provisioned only
3750# show switch
Switch/Stack Mac Address : 001b.545f.2800
Mac persistency wait time: 4 mins
                                           H/W   Current
Switch#  Role   Mac Address     Priority Version  State
----------------------------------------------------------
*1       Master 001b.545f.2800     12     1       Ready
 2       Member 0000.0000.0000     0      1       Provisioned
 3       Member 001d.46be.7500     8      1       Ready
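When this output is collected from many stacks, a small script can flag members that are not in the Ready state. The following is a minimal Python sketch, not a Cisco tool: the names `parse_show_switch` and `not_ready` are hypothetical, and the sample text assumes the column layout shown above.

```python
import re

# Sample text modeled on the "show switch" output above (layout is an assumption).
SAMPLE = """\
Switch/Stack Mac Address : 001b.545f.2800
Mac persistency wait time: 4 mins
                                           H/W   Current
Switch#  Role   Mac Address     Priority Version  State
----------------------------------------------------------
*1       Master 001b.545f.2800     12     1       Ready
 2       Member 0000.0000.0000     0      1       Provisioned
 3       Member 001d.46be.7500     8      1       Ready
"""

# One table row: optional "*", member number, role, MAC, priority, H/W version, state.
ROW = re.compile(
    r"^\s*(?P<active>\*)?\s*(?P<num>\d+)\s+(?P<role>\S+)\s+"
    r"(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(?P<priority>\d+)\s+(?P<hw>\d+)\s+(?P<state>.+?)\s*$",
    re.IGNORECASE,
)


def parse_show_switch(text):
    """Return one dict per stack member row found in the text."""
    members = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            members.append({
                "number": int(m.group("num")),
                "role": m.group("role"),
                "mac": m.group("mac").lower(),
                "priority": int(m.group("priority")),
                "state": m.group("state"),
            })
    return members


def not_ready(members):
    """Return the member numbers whose state is anything other than Ready."""
    return [m["number"] for m in members if m["state"] != "Ready"]


members = parse_show_switch(SAMPLE)
print(not_ready(members))
```

Because the state column is captured to the end of the line, multi-word states are reported as-is rather than truncated.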
Troubleshooting L2 Unicast Forwarding
• Symptom: Host cannot reach server
• Steps
– Layer 1 operational between host/phone and switch?
– Switch receiving traffic on that interface?
– Congestion between host and switch?
– MAC address learned?
– MAC address of next hop correct?
– Spanning tree state forwarding?
– Other features preventing traffic flow?
• Errored packets on the interface
– Check HW programming
• Consider possibilities
• Create and execute action plan
(Diagram: Server, Distribution and Core, C3750, Host)
L2 Forwarding: Troubleshooting - 1
L2 Forwarding: Troubleshooting – 2
MAC Address Learning
• Step 4a: Verify that the MAC address is correctly learned on the port
Switch# show mac address-table interface gigabitEthernet 1/0/3
Mac Address Table
-------------------------------------------
L2 Forwarding: Troubleshooting – 3
Spanning Tree
• Step 5: Spanning tree state forwarding in software?
L2: MAC Address Disappears From a Port
• Check for spanning tree topology changes
• Does the link remain up?
– Link down causes MAC addresses to be flushed
• Is it learned on another port?
...
Number of topology changes 5 last change occurred 18:45:22 ago
from GigabitEthernet1/0/3
...
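To correlate a disappearing MAC address with a topology change, the counter line above can be parsed and compared with the time the address was lost. This is a minimal Python sketch under the assumption that the line has exactly the format shown (hh:mm:ss); `last_topology_change` is a hypothetical helper name.

```python
import re

# Sample text copied from the output above; the indentation is an assumption.
SAMPLE = """\
Number of topology changes 5 last change occurred 18:45:22 ago
        from GigabitEthernet1/0/3
"""

# Matches the counter, the hh:mm:ss age, and the port named on the following line.
TC = re.compile(
    r"Number of topology changes (?P<count>\d+) "
    r"last change occurred (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) ago"
    r"\s+from (?P<port>\S+)"
)


def last_topology_change(text):
    """Return count, age in seconds, and source port, or None if not found."""
    m = TC.search(text)
    if m is None:
        return None
    seconds = int(m["h"]) * 3600 + int(m["m"]) * 60 + int(m["s"])
    return {"count": int(m["count"]), "seconds_ago": seconds, "port": m["port"]}


info = last_topology_change(SAMPLE)
print(info)
```

A small `seconds_ago` value on the port in question suggests the flush followed a topology change; ages printed in other formats are not handled by this sketch and return None.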
L2 Forwarding: Troubleshooting – 4
Interface
• Step 6a: Check interfaces for the error-disabled state
Switch# show interface status err-disabled
Nothing in list.
Switch#
No interfaces are error-disabled
Checklist: Interface Troubleshooting
• Are packets being received?
• Is the expected MAC address learned on another port?
• Is dot1x in use? If so, is the port authorized?
• Does port security allow more MAC addresses?
• Is the port in the spanning tree forwarding state?
• Other features preventing traffic flow?
– ACLs
– PVLAN
Layer 3 IP Unicast Routing
• Use the switch to debug end-to-end IP issues
– Verify source reachability from the switch
– Verify destination reachability from the switch
– Verify hardware forwarding from source to destination (and back)
(Diagram: Source and Destination connected through 3750 switches)
– Source: IP 100.1.1.2, MAC 0018.ba88.1fc1
– Destination: IP 172.16.100.100
– VLAN 101: IP 100.1.1.1, MAC 000f.f7e8.e042
– VLAN 100: IP 10.1.1.1, MAC 000f.f7e8.e041
– Ports: Gi1/0/1, Gi1/0/2
L3: Verify Source Reachability
Troubleshooting Steps
• Source IP = 100.1.1.2
• Ping the source
• Ping the source using a loopback interface as the ping source
• Verify the ARP table
• Verify the MAC table
L3: Verify Source Reachability
3750# ping 100.1.1.2
==========================================
<output removed>
Output Packets:
==========================================
Packet arriving on CPU queue 7
Egress: Asic 0, switch 2 (host) & 14 (dstats)
CPU queues: 7 14.
Source Vlan Id: Real 101, Mapped 9. L2EncapType 0, L3EncapType 0
portMap 0x0, non-SPAN portMap 0x0
L3: Verify Destination Reachability
Troubleshooting Steps
• Destination IP = 172.16.100.100
• Verify there is a route to the destination
• Verify there is a valid ARP for the next hop
• Ping the destination (repeat with the source VLAN interface as the source address)
L3: Verify Destination Reachability – 1
Resolved next hop
Show next hop to final destination
Switch# sh ip route 172.16.100.100
Routing entry for 172.16.100.0/24
Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 1
Last update from 10.1.1.2 on Vlan100, 00:08:54 ago
Routing Descriptor Blocks:
* 10.1.1.2, from 100.1.1.2, 00:08:54 ago, via Vlan100
Route metric is 20, traffic share count is 1
Verify next hop is known
Switch# sh ip arp 10.1.1.2
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.1.1.2                9   0018.ba88.1fc1  ARPA   Vlan100
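The first half of this check, finding the next hop to look up in ARP, can be scripted. The following is a minimal Python sketch that extracts next hops from the "Routing Descriptor Blocks" section of the route output above; `next_hops` is a hypothetical helper, and the indentation of the sample text is an assumption.

```python
import re

# Sample text modeled on the "sh ip route" output above.
ROUTE = """\
Routing entry for 172.16.100.0/24
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 1
  Last update from 10.1.1.2 on Vlan100, 00:08:54 ago
  Routing Descriptor Blocks:
  * 10.1.1.2, from 100.1.1.2, 00:08:54 ago, via Vlan100
      Route metric is 20, traffic share count is 1
"""

# A descriptor line: optional "*", next-hop IPv4 address, optional "via <interface>".
HOP = re.compile(r"^\s*\*?\s*(\d+\.\d+\.\d+\.\d+)(?:,.*?via (\S+))?\s*$")


def next_hops(text):
    """Return (next_hop_ip, interface_or_None) for each descriptor block."""
    hops = []
    in_blocks = False
    for line in text.splitlines():
        if "Routing Descriptor Blocks" in line:
            in_blocks = True
            continue
        if in_blocks:
            m = HOP.match(line)
            if m:
                hops.append((m.group(1), m.group(2)))
    return hops


print(next_hops(ROUTE))
```

Each returned address is the one to pass to `sh ip arp`; a static route with no outgoing interface yields `None` for the interface.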
L3: Verify Destination Reachability – 2
Unresolved next hop
Show next hop to final destination
Switch# sh ip route 172.16.100.0 255.255.255.0
Routing entry for 172.16.100.0/24
Known via "static", distance 1, metric 0
Routing Descriptor Blocks:
* 10.13.13.3
Route metric is 0, traffic share count is 1
Switch# sh run | include ip route
ip route 172.16.100.0 255.255.255.0 10.13.13.3
Verify next hop is known: FAIL
Switch# sh ip arp 10.13.13.3
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.13.13.3              0   Incomplete      ARPA
ARP Throttle Queue trying to resolve next hop
Switch# sh platform ip unicast failed arp
Total of 1 arp entries waiting on ARP-HRPC ThrottleQ
========================
ARP throttled IP Address
========================
10.13.13.3/32 Table:0
-------------------------
Switch# sh platform ip unicast route 172.16.100.0 255.255.255.0
Fib 172.16.100.0/24 Tbl:0 Bucket:0
IOS Path 0 Spl Adj glean
HL3UFlags:0x80
SFT Entry:hdl:0xA5 HwFL:0x4
Adjacency not programmed in ASIC
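The resolved and unresolved cases differ only in the Hardware Addr column of the ARP output, so the pass/fail decision is easy to automate. This is a minimal Python sketch: `arp_resolved` is a hypothetical helper, and the sample text assumes the address column is populated with the IP given on the command line.

```python
# Sample ARP outputs modeled on the two cases above (column spacing is an assumption).
ARP_OK = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.1.1.2                9   0018.ba88.1fc1  ARPA   Vlan100
"""

ARP_FAIL = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.13.13.3              0   Incomplete      ARPA
"""


def arp_resolved(text, ip):
    """Return True only if an entry for ip exists and is not Incomplete."""
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: Protocol, Address, Age, Hardware Addr, Type, [Interface]
        if len(fields) >= 4 and fields[0] == "Internet" and fields[1] == ip:
            return fields[3].lower() != "incomplete"
    return False  # no entry at all is treated as unresolved


print(arp_resolved(ARP_OK, "10.1.1.2"))
print(arp_resolved(ARP_FAIL, "10.13.13.3"))
```

A missing entry is treated the same as an Incomplete one, since in both cases the adjacency cannot be programmed in hardware.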
L3: Verify Hardware Forwarding
Use show platform forward to verify HW programming
Don’t use the vlan option if the port is not a trunk
Ingress:
Global Port Number: 5, lpn: 5 Asic Number: 1
Dropping the frame due to VLAN mode filtering,(allowDotOneQ) is set to FALSE
Tools and Tricks
• Enable NTP to troubleshoot across switches
• Include date and time for debug and log messages
– service timestamps log [datetime|uptime] localtime msec show-timezone
– service timestamps debug [datetime|uptime] localtime msec show-timezone
• Session to another switch member
– C3750#session <member #>
– C3750#remote command <1-9|all> “IOS command”
• Automate successful troubleshooting using EEM
• Review open caveats sections in release notes
• Search Bug Toolkit for known issues
• Reference Output Interpreter to decode command output
• Reference System Message Guide for mitigation recommendations
• Check the documentation and online guides
References
• Troubleshooting Catalyst 3750:
http://www.cisco.com/en/US/products/hw/switches/ps5023/prod_troubleshooting_guides_list.html
Want to learn more? Check out CCNP Practical Studies: Troubleshooting by Donna Harrington.
Cisco Beyond - Product Extension Community
EEM Scripting Community
• Open source scripts: share, upload, download, learn by example
• Categories include: Network Management, Diagnostics, Routing, QoS, High availability, User interface, Security
• Comments, ratings, community managed forum
http://cisco.com/go/ciscobeyond