Extreme Networks, Inc. confidential and proprietary information - BETA - FOR REVIEW ONLY.
Network Troubleshooting
Student Objectives
The Network Troubleshooting module presents a systematic methodology for troubleshooting, how to
use the troubleshooting tools provided in Extreme Networks switches, and examples of how to
identify common problems.
Upon completion of this module, the successful student will be able to:
Identify the network management and maintenance systems that prepare you to troubleshoot.
Describe the layered approach used for troubleshooting.
Identify useful commands for finding errors in OSI Layers 1, 2, and 3.
Interpret the output of show commands.
Troubleshoot problems at the Physical, Data Link, and Network Layers.
List the steps required to perform systematic troubleshooting.
Overview
Troubleshooting is an important part of managing any network. This module presents a systematic
methodology for troubleshooting, how to use the troubleshooting tools provided in Extreme Networks
switches, and examples of how to identify common problems. The troubleshooting topics covered are:
Gathering and maintaining the information and tools necessary to troubleshoot the network.
Organizing and applying a systematic method of troubleshooting.
Identifying which commands to use to systematically troubleshoot a network problem.
Describing how to test the operation of the switch.
Figure 2: Overview
Create an illustration of the physical connectivity of the network and keep it up to date.
Create an illustration of the logical connectivity of the network and keep it up to date.
Set up a Trivial File Transfer Protocol (TFTP) server and:
Store copies of the active and previous versions of configuration files and images, so that you can
access them quickly in case of a problem.
Make a backup of each switch configuration file on a regularly scheduled basis.
Update the stored configuration files after making any changes.
Maintain a contact list of vendors, service suppliers, and users to inform in case of a problem.
Store user names, passwords, and SNMP default community strings in a safe location and change (or
remove) the default values.
Record network problems, symptoms, and the resolution.
Provide a clean/conditioned environment (air and power) and provide proper ventilation depending
on hardware type. Some switches vent front to rear, others vent side to side.
Perform software maintenance, such as installing new software to fix bugs or add functionality.
If you experience problems in the communication between devices, check the communication layer by
layer, using the per layer description as presented on the following pages.
When, for example, an end-to-end ping test at Layer 3 succeeds, the problem is probably in the upper
four layers.
Your hardware may be faulty. Defective hardware can be cables, patch panels, connectors, ports, or
modules in your switch.
Verify that the correct cables are being used. Distinguish between:
Media type
Length of the cables
Crossover cable versus straight cable
Verify that the port settings are configured properly. Common problems in port settings are:
The port LEDs provide an indication of the port status. Always verify the status of the port link
integrity by checking port activity LEDs when troubleshooting Layer-1 problems.
The description for port LEDs in Table 1 applies to ports on the Summit X450 family, Summit 200,
Summit 300, and Summit 400 switches.
The description for port LEDs in Table 2 applies to ports on the BlackDiamond 8800 series,
BlackDiamond 10808 series, and non-POE ports on the BlackDiamond 12804 switches.
The description for port LEDs in Table 3 applies to POE ports with power enabled on the
BlackDiamond 12804.
The illustration lists the most common show commands used to troubleshoot Layer 1.
Compare the port settings with the port settings at the other end of the cable.
Primary Information
The show ports configuration command output shows:
Use this information to verify that the intended ports are enabled and active. Also verify that the speed,
duplex and frame size match the other end of the physical connection.
If the switch supports user-created virtual routers, verify that the VLAN router interface (RIF) is
configured on the correct VR. Commands such as ping and tftp require that the VR be specified.
The media type is useful if redundant fiber and copper ports are supported. In this case the port
number is the same for both the fiber and copper port.
Secondary Information
The command output also shows:
If Ethernet flow control is enabled symmetrically in both directions (SYM) or asymmetrically in one
direction (ASYM). 10Gbps ports can be asymmetrical.
The load sharing master port if the port is participating in link aggregation.
Use this information to verify that the port settings and media type match the other end of the physical
connection.
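The comparison between the two ends of a link can be sketched in a few lines of Python. This is only an illustration, not switch software: the dictionaries stand in for values read by hand from the show ports configuration output on each switch, and the key names (speed, duplex, frame_size, media) and sample values are hypothetical.

```python
def mismatched_settings(local, remote, keys=("speed", "duplex", "frame_size", "media")):
    """Return {setting: (local_value, remote_value)} for every setting that differs."""
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }


# Hypothetical values transcribed from both ends of one cable.
switch_1_port = {"speed": 100, "duplex": "full", "frame_size": 1522, "media": "UTP"}
switch_2_port = {"speed": 100, "duplex": "half", "frame_size": 1522, "media": "UTP"}

for setting, (near, far) in mismatched_settings(switch_1_port, switch_2_port).items():
    print(f"{setting}: local={near} remote={far}")
```

An empty result means that the compared settings agree. It does not prove that the cable or the port hardware is good.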
The display automatically refreshes unless you enter the no-refresh option.
Transmit Collisions (TX Coll). The total number of collisions seen by the port, regardless of whether
a device connected to the port participated in any of the collisions.
Transmit Late Collisions (TX Late Coll). The total number of collisions that have occurred after the
port's transmit window has expired. This could indicate a duplex setting mismatch.
Transmit Deferred Frames (TX Deferred). The total number of frames that were transmitted by the
port after the first transmission attempt was deferred by other network traffic.
Transmit Errored Frames (TX Errors). The total number of frames that were not completely
transmitted by the port because of network errors (such as late collisions or excessive collisions).
Transmit Lost Frames (TX Lost). The total number of transmit frames that did not get completely
transmitted because of buffer problems (FIFO underflow).
Transmit Parity Frames (TX Parity). The bit summation has a parity mismatch.
The display automatically refreshes unless you enter the no-refresh option.
Receive Bad CRC Frames (RX CRC). The total number of frames received by the port that were of
the correct length, but contained a bad FCS value. This could indicate a duplex setting mismatch.
Receive Oversize Frames (RX Over). The total number of good frames received by the port greater
than the supported maximum length of 1,522 bytes.
Receive Undersize Frames (RX Under). The total number of frames received by the port that were
less than 64 bytes long.
Receive Fragmented Frames (RX Frag). The total number of frames received by the port that were of
incorrect length and contained a bad FCS value.
Receive Jabber Frames (RX Jabber). The total number of frames received by the port that were
longer than the supported maximum length and had a Cyclic Redundancy Check (CRC) error.
Receive Alignment Errors (RX Align). The total number of frames received by the port that have a
CRC error and do not contain an integral number of octets. In a shared Ethernet
environment, alignment errors could be the result of collisions.
Receive Frames Lost (RX Lost). The total number of frames received by the port that were lost
because of buffer overflow in the switch.
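The receive counter definitions above can be read as a decision procedure. The following Python sketch is one possible reading and makes assumptions the text does not state: short frames are split into RX Under (good FCS) and RX Frag (bad FCS), jumbo frames are ignored, and the function name and arguments are hypothetical. RX Lost is not modeled because it depends on switch buffers, not on the frame itself.

```python
MIN_FRAME = 64     # bytes, minimum length named in the counter descriptions
MAX_FRAME = 1522   # bytes, supported maximum length named in the counter descriptions


def classify_rx_frame(length_bits, fcs_ok):
    """Map one received frame to the counter it would most plausibly increment."""
    whole_octets = length_bits % 8 == 0
    length = length_bits // 8
    if length > MAX_FRAME:
        # Too long: a good FCS means oversize, a bad FCS means jabber.
        return "RX Over" if fcs_ok else "RX Jabber"
    if length < MIN_FRAME:
        # Too short: assumption, good FCS is undersize and bad FCS is a fragment.
        return "RX Under" if fcs_ok else "RX Frag"
    if not fcs_ok:
        # Legal length with a bad FCS: alignment error if the bit count is not a whole number of octets.
        return "RX CRC" if whole_octets else "RX Align"
    return "good"


print(classify_rx_frame(100 * 8, fcs_ok=False))
```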
The display automatically refreshes unless you enter the no-refresh option.
Collisions are primarily a problem for shared Ethernet environments where devices are attached
through a hub. If you see collisions, it usually indicates that someone has connected a hub to the switch port.
This command gives a good indication of the port load. You can toggle with the space bar between
packets-per-second, bytes-per-second, and bandwidth utilization. You can use this information to decide
when to invest in extra link capacity to avoid congestion in the network.
NOTE
Monitoring port utilization is useful to identify broadcast storms caused by loops in the network. Because Extreme Networks
switches are non-blocking, links can be utilized up to 100% when there is a network loop.
Enter the show ports utilization command then toggle the screen using the space bar.
The first two screens in the output show transmitted and received port byte and packet counts.
Use this information to verify that the ports are passing traffic.
The third screen in the show ports utilization command output shows:
Use this information to verify that there are no Layer 2 forwarding loops.
If a Layer 2 forwarding loop exists it causes a broadcast storm. During a broadcast storm the port
utilization reaches close to 100%.
NOTE
If a forwarding loop exists, it may take a minute or two for the utilization to show the broadcast storm.
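The relationship between the byte counts and the bandwidth utilization screens is simple arithmetic, sketched below in Python. The function names, the sample rates, and the 90% alert threshold are assumptions chosen for illustration. The text only says that utilization reaches close to 100% during a broadcast storm.

```python
def utilization_percent(bytes_per_second, link_bps):
    """Convert a byte rate into a percentage of the link speed (bits per second)."""
    return bytes_per_second * 8 / link_bps * 100


def storm_suspected(bytes_per_second, link_bps, threshold=90.0):
    """Flag a port whose utilization is at or above the (assumed) threshold."""
    return utilization_percent(bytes_per_second, link_bps) >= threshold


# Hypothetical reading: 12,000,000 bytes per second on a 100 Mbps port.
print(utilization_percent(12_000_000, 100_000_000))
print(storm_suspected(12_000_000, 100_000_000))
```

High utilization alone does not prove a loop. It is a prompt to check the Layer 2 topology.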
The EDP interval timers and which ports have EDP enabled.
Use this information to verify that EDP is enabled on the intended ports and that the timers match the
other end of the connection.
Use this information to verify that the ports are physically connected to the intended EDP neighbor on
the intended port.
You can also use this command for troubleshooting at the Data Link Layer because you get information
about the VLANs on all neighboring switches.
Figure 13: Displaying Connectivity and Configuration Information for Neighboring EXTR Switches
This might be caused by misconfiguration, but often the mismatch occurs because of problems with the
autonegotiation of link speed and duplex mode. The easiest way to verify this error is the show
ports configuration command.
This command displays the configured and actual link speed and duplex mode of all ports.
There appears to be a duplex mismatch between the two switches. One is forwarding frames in full
duplex mode and the other is running in half duplex mode.
In the example, Switch 1 is configured with autonegotiation off, speed set to 100 Mbps, and duplex set to
full. Switch 2 is left in autonegotiation. Switch 2 is unable to negotiate its duplex setting because
autonegotiation is disabled on Switch 1.
Port Number
Link Status: The current status of the link. Options are:
Ready (the port is ready to accept a link)
Active (the link is present at this port)
Disabled (D): The link is disabled at this port.
Not Present (NP): The link is not present at this port.
Transmit Collisions: The total number of collisions seen by the port, regardless of whether a device
connected to the port participated in any of the collisions.
Transmit Late Collisions: The total number of collisions that have occurred after the port's transmit
window has expired.
Transmit Deferred Frames: The total number of frames that were transmitted by the port after the
first transmission attempt was deferred by other network traffic.
Transmit Errored Frames: The total number of frames that were not completely transmitted by the
port because of network errors (such as late collisions or excessive collisions).
Transmit Parity Frames: The bit summation has a parity mismatch.
Use show ports txerrors and show ports rxerrors to examine the port error statistics and identify whether
there are incorrect settings. A duplex mismatch causes transmission errors to occur on both sides.
Switch 1 Symptoms
On a switch running in half duplex mode there are many late collisions. The attached switch, running
in full duplex mode, sends frames whenever it has data. It does not check whether the medium is already in use.
Switch 2 Symptoms
On a switch running in full duplex mode there are many CRC errors. The attached switch, running in
half duplex mode, stops transmitting data each time it detects a collision.
The illustration shows Switch 1 running in half duplex mode and Switch 2 running in full duplex
mode. The show ports txerrors command on Switch 1 shows the late collisions. The show ports rxerrors
command on Switch 2 shows the CRC errors.
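The symptom pattern described here (late collisions on the half duplex side, CRC errors on the full duplex side) can be expressed as a small check. The Python sketch below is illustrative only: the counter values would be read by hand from show ports txerrors and show ports rxerrors, and the dictionary keys and function name are hypothetical.

```python
def likely_half_duplex_side(switch_1, switch_2):
    """Return which side looks like the half duplex end of a mismatch, or None.

    Each argument is a dict with 'tx_late_coll' and 'rx_crc' counters for the
    port that faces the other switch.
    """
    if switch_1["tx_late_coll"] > 0 and switch_2["rx_crc"] > 0:
        return "switch_1"
    if switch_2["tx_late_coll"] > 0 and switch_1["rx_crc"] > 0:
        return "switch_2"
    return None


# Hypothetical counters matching the illustration: Switch 1 half duplex, Switch 2 full duplex.
sw1 = {"tx_late_coll": 412, "rx_crc": 0}
sw2 = {"tx_late_coll": 0, "rx_crc": 388}
print(likely_half_duplex_side(sw1, sw2))
```

A match is a hint, not a diagnosis. Confirm the configured and actual duplex mode with show ports configuration.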
Port Number
Link Status: The current status of the link. Options are:
Ready (the port is ready to accept a link)
Active (the link is present at this port)
Disabled (D): The link is disabled at this port.
Not Present (NP): The link is not present at this port.
Receive Bad CRC Frames (RX CRC): The total number of frames received by the port that were of
the correct length, but contained a bad FCS value.
Receive Oversize Frames (RX Over): The total number of good frames received by the port greater
than the supported maximum length of 1,522 bytes. Ports with jumbo frames enabled do not
increment this counter.
Receive Undersize Frames (RX Under): The total number of frames received by the port that were
less than 64 bytes long.
Receive Fragmented Frames (RX Frag): The total number of frames received by the port that were of
incorrect length and contained a bad frame check sequence (FCS) value.
Receive Jabber Frames (RX Jabber): The total number of frames received by the port that were
longer than the supported maximum length and had a Cyclic Redundancy Check (CRC) error.
Receive Alignment Errors (RX Align): The total number of frames received by the port that have a
CRC error and do not contain an integral number of octets.
Receive Frames Lost (RX Lost): The total number of frames received by the port that were lost
because of buffer overflow in the switch.
Protocol mismatches
MAC addressing problems
VLAN problems
There might be protocol mismatches between different Layer 2 frame formats. A mismatch occurs when
devices use different Ethernet II, IEEE 802.3, or LLC/SNAP settings. A wrong driver for the Ethernet
card of a server might also cause a mismatch.
Addressing Problems
Start by answering the following questions to identify possible addressing problems.
Are there permanent MAC address entries in the Forwarding Database (FDB) which are incorrect?
Is there a faulty or misconfigured device that is responding to all ARP requests on a VLAN or network?
VLAN Problems
When troubleshooting VLAN problems verify the following:
The output from the show commands is later used to identify the problem in the Layer 2 example. Each
relevant field in the output of the command is explained.
Primary Information
The show fdb command output shows:
The Ethernet MAC address of each learned device in the forwarding database.
The VLAN where the device is a member.
The port where the address is learned.
For troubleshooting Layer 2 and Layer 3, verify that the expected device is physically connected to the
port and is being learned. Verify that the entry is not a black hole. Verify that the device is in the right
VLAN(s).
Secondary Information
The show fdb command output also shows:
This command displays the same information as for an individual VLAN, but shows every VLAN, one by one.
For Layer 1 problems, verify that the port is enabled and active, and check whether it is part of a
load sharing group.
enabled.
correctly tagged.
not in loopback mode.
For Layer 3 related problems, verify that the following are correct:
The IP address
The subnetwork mask
The virtual router
To find the problem, check the VLAN configuration on both switches and compare them using the
show vlan command.
The show vlan <vlan name> command helps to find this error. This command shows if the ports are
added tagged or untagged.
The tag is the same on both switches, so this is not the problem. However, on Switch 1 the port is
tagged, while on Switch 2 the port is untagged. To solve the problem, change the configuration on one
of the two switches.
Enter the following command on Switch 2 to solve the problem and provide connectivity between the
two switches:
NOTE
In general, add ports as tagged on inter-switch links so that each port can be used as an 802.1Q
link for several VLANs.
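The check performed in this example, comparing the tag value and the tagged or untagged state of the port on both switches, can be sketched as follows. The Python below is only an illustration: the values would be read by hand from show vlan <vlan name> on each switch, and the dictionary keys, the function name, and the tag value 10 are hypothetical.

```python
def vlan_link_problems(port_a, port_b):
    """Compare how one VLAN is configured on the two ports of an inter-switch link."""
    problems = []
    if port_a["tag"] != port_b["tag"]:
        problems.append("802.1Q tag values differ")
    if port_a["tagged"] != port_b["tagged"]:
        problems.append("port is tagged on one switch and untagged on the other")
    return problems


# Hypothetical values matching the example: same tag, different tagging mode.
switch_1 = {"tag": 10, "tagged": True}
switch_2 = {"tag": 10, "tagged": False}
print(vlan_link_problems(switch_1, switch_2))
```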
Routing Advertisements
Extreme Networks switches do not advertise routes if:
In a stable network, the Link State Database (LSDB) does not change much. One way to determine
whether the entries in the LSDB are changing is to monitor the checksum and SPF runs. These are
indicators of how often the LSDB is changing. When multiple areas are used, there are additional issues
concerning stub areas, NSSAs, default routes, ABRs, ASBRs, virtual links, route summarization, and more.
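Monitoring the checksum and SPF runs amounts to taking periodic samples and counting how often they change. A minimal Python sketch follows. It assumes the samples are noted by hand at regular intervals as (checksum, SPF run count) pairs. The function name and the sample values are hypothetical.

```python
def count_changes(samples):
    """Count how many consecutive samples differ from the one before."""
    return sum(1 for previous, current in zip(samples, samples[1:]) if previous != current)


# Hypothetical (LSDB checksum, SPF runs) readings taken at regular intervals.
readings = [(0x1A2B, 5), (0x1A2B, 5), (0x3C4D, 6), (0x3C4D, 6)]
print(count_changes(readings))
```

A count that keeps growing between observation windows suggests an unstable LSDB. What counts as too many changes depends on the network.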
show vlan
The interface is up
IP forwarding is enabled
The interface has the correct IP address and subnetwork mask
The show iproute command is one of the most important commands for Layer 3 troubleshooting.
Use this information to verify that the destination network is in the routing table. Verify where the
packets are being sent for a particular destination network. Also verify that the preferred route is the
intended route.
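Checking whether a destination is covered by the routing table, and which entry wins, can be illustrated with Python's standard ipaddress module. The sketch assumes plain longest-prefix matching. Relative route priority and cost are deliberately left out, and the gateway addresses and function name are hypothetical.

```python
import ipaddress


def best_route(routes, destination):
    """Return (network, gateway) of the most specific route that covers destination, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), gateway)
        for prefix, gateway in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    network, gateway = max(matches, key=lambda match: match[0].prefixlen)
    return str(network), gateway


# Hypothetical table: one specific route and a default route.
table = [("10.1.1.0/24", "192.168.1.2"), ("0.0.0.0/0", "192.168.1.1")]
print(best_route(table, "10.1.1.25"))
```

A result of None corresponds to the case where the destination network is missing from the routing table.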
Verify that directly connected routes are used when IP route sharing is enabled using Equal Cost Multi
Paths (ECMP). For IP route sharing, directly connected routes should be used instead of other routes,
even if the cost or relative route priority of the other routes has been manually set.
Use this information to determine the source of the routes. This is useful for the following:
Directly connected routes have a relative route priority of 10. Relative route priorities cannot be
modified for directly connected routes or black hole routes.
Redistribution of RIP routes into OSPF results in OSPF external routes with different relative priorities.
Use this information to verify that each IP address is mapped to a single MAC address. Duplicate IP
addresses are a very common misconfiguration and can cause confusing and unpredictable behavior.
Also verify that the IP address is in the correct VLAN and subnetwork.
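The one-IP-to-one-MAC check can be done by eye on a small table. For a larger one, a sketch such as the following may help. It assumes the ARP entries have been transcribed into (IP address, MAC address) pairs. The addresses and the function name are hypothetical.

```python
from collections import defaultdict


def duplicate_ips(arp_entries):
    """Return {ip: [macs]} for every IP address that maps to more than one MAC address."""
    macs_by_ip = defaultdict(set)
    for ip, mac in arp_entries:
        macs_by_ip[ip].add(mac.lower())  # normalise case so the same MAC is not counted twice
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical entries: 10.1.1.5 is claimed by two different devices.
entries = [
    ("10.1.1.5", "00:04:96:AA:00:01"),
    ("10.1.1.6", "00:04:96:aa:00:02"),
    ("10.1.1.5", "00:04:96:aa:00:03"),
]
print(duplicate_ips(entries))
```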
To remove dynamic entries in the IP ARP table, enter the following command:
Secondary Information
The show iparp command output also shows:
Use this information to verify that ARP is working correctly. Also use this to verify that the proxy ARP
server is responding.
show ospf
If OSPF is enabled.
The router ID and whether it is configured or automatically assigned.
If this router is an ASBR or ABR.
If the router ID is configured or automatically selected.
Use this information to verify that OSPF is enabled globally on the router. Verify that the router ID is
configured correctly. Virtual links are built using the router ID.
Verify that the router is acting in the right role: Interior Router, ABR, or ASBR.
show rip
Use the IP statistics to identify if errors are causing the protocol to fail or converge slowly.
Use the ICMP statistics to identify if errors are causing connectivity failures or ping response failures.
Use the IGMP statistics to identify if errors are causing problems with multicast routing.
The second part of the display shows the following for each VLAN:
Ping is a good tool to use to start troubleshooting. If ping responses are returned, the problem must be
at a higher layer. If the ping response is not received, the problem is at the IP or lower layer.
NOTE
If you ping a router interface that is configured on the switch and is active, the router interface responds with an
ICMP echo response even if IP forwarding is not enabled. The router interface may appear to be forwarding packets
at Layer 3 even though it is not.
Use the following command to trace the routed path between the switch and a destination end station:
traceroute {vr <vrid>} {ipv4 <host>} {ipv6 <host>} {ttl <number>} {from
<from>} {[port <port>] | icmp}
Traceroute works by increasing the time-to-live (TTL) value of each successive batch of packets sent. The
first three packets have a time-to-live (TTL) value of one (implying that they make a single hop). The
next three packets have a TTL value of 2, and so on.
The command output shows the IP address of the router interface at each hop along the path and the
number of milliseconds that elapsed before each packet's ICMP time exceeded message was received from
that interface.
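The TTL mechanism can be illustrated with a small simulation. The Python sketch below sends no packets: it models a path as a list of router interface addresses ending with the destination, and shows which device answers for each TTL value. The addresses, the function names, and the max_ttl limit of 30 are assumptions made for the example.

```python
def send_probe(path, ttl):
    """Simulate one probe: each router decrements the TTL and replies when it reaches zero."""
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return router, "time exceeded"
    return path[-1], "reached"


def trace(path, probes_per_ttl=3, max_ttl=30):
    """Send batches of probes with TTL 1, 2, 3, ... and record who answers each batch."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        replies = [send_probe(path, ttl) for _ in range(probes_per_ttl)]
        responder, kind = replies[0]  # all probes in a batch follow the same simulated path
        hops.append((ttl, responder, kind))
        if kind == "reached":
            break
    return hops


# Hypothetical path: two routers, then the destination end station.
for hop in trace(["10.1.1.1", "10.2.2.1", "10.3.3.10"]):
    print(hop)
```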
The command continues until the user enters Ctrl-c to abort the command.
The traceroute command is useful in determining which route a packet is taking when multiple routes
are available. It is also useful in identifying firewalls that may be blocking access to a site.
If traceroute cannot reach the destination, the command output identifies the reason. Failures
can be due to:
ICMP network unreachable - There is no path. Check the routing table. Make sure you are specifying
the correct virtual router. Make sure IP forwarding is enabled.
ICMP host unreachable - Make sure you entered the correct address.
ICMP fragmentation needed - The packet is too large for one of the routers in the path. Enable
fragmentation on the router.
ICMP source route failed - Source routing is seldom supported.
Transmit error - Check IP statistics for IP errors.
A closer look shows that network 10.1.1.0/24 is not in the routing table of Switch 2 although RIP is
running. But even if we configure a static route for network 10.1.1.0/24 on Switch 2 there is still no
connectivity. Of course configuring a static route would not be a proper solution, even if it helps,
because we are running a dynamic routing protocol.
A typical example for a Layer 3 error is that IP forwarding is not enabled for one VLAN.
The next step is to take a closer look at Switch 1's IP configuration. Perhaps IP forwarding is not
enabled for all VLANs. There are several commands that display the relevant information.
show ipconfig
The first screen displays the switch global settings. You see that IP routing and RIP are enabled
correctly. However, this does not mean that these functions are enabled for all VLANs.
NOTE
Part of the output is omitted from the illustration. What is displayed is the overview of the router interfaces because
that is what is important.
For the VLAN named noipforwarding, IP forwarding is not enabled. This is the cause of the problem.
show vlan
When you use this command without any parameters it displays an overview of the configured VLANs.
The display includes the most important IP information.
The flags indicate that IP forwarding is not enabled for the VLAN noipforwarding and the error is
detected.
A timestamp of when the event was logged. The timestamp is in the format yyyy-mm-dd,hh:mm:ss.
The sending device's IP address.
The IP port number used for receiving log messages (23 in the example).
The syslog facility number (7 is the default for Extreme Networks devices).
The protocol reporting the event. OSPF in this example.
The actual log message.
Start Intra area SPF Area 0.0.0.10. (The start of an SPF run for area 0.0.0.10)
Start IntraArea Route Table update SPF area 0.0.0.10. (The start of the routing table update)
End IntraArea Route Table update area 0.0.0.10. (The end of the routing table update)
End IntraArea Spf Area 0.0.0.10. (The end of an SPF run for area 0.0.0.10)
NOTE
The syslog facility can be set to different values and the receiver can be programmed to treat different facility levels
differently.
The show log command displays the messages stored in either the internal memory buffer or in
NVRAM depending on the switch type. The messages shown can be limited by specifying a severity
level, a time range, or a match expression.
Timestamp - records the month and day of the event, along with the time (hours, minutes, seconds,
and hundredths of a second).
Severity Level - indicates the urgency of a condition reported in the log. Severities include critical,
error, warning, notice, info, debug-summary, debug-verbose, and debug-data.
Component, Subcomponent, and Condition Name - describes the subsystem in the software that
generates the event. This provides a good indication of where a fault might be.
Message - A description of the event occurrence. If the event was caused by a user, the user name is
also provided.
Use this information in the same way you use the remote system logging messages.
Logging of CLI configuration commands must be enabled. The switch log overwrites existing log
messages in a wrap-around memory buffer.
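Limiting log output by severity can be illustrated offline. The Python sketch below assumes that the severities are listed above from most to least urgent, and filters a list of already-collected messages. It does not call the switch, and the message dictionaries and function name are hypothetical.

```python
# Severity names as listed in the text, assumed to run from most to least urgent.
SEVERITIES = [
    "critical", "error", "warning", "notice",
    "info", "debug-summary", "debug-verbose", "debug-data",
]


def at_least(messages, minimum):
    """Keep only messages whose severity is at least as urgent as minimum."""
    limit = SEVERITIES.index(minimum)
    return [m for m in messages if SEVERITIES.index(m["severity"]) <= limit]


# Hypothetical messages copied from a log display.
log = [
    {"severity": "info", "message": "port 1 link up"},
    {"severity": "error", "message": "fan tray failed"},
    {"severity": "warning", "message": "high temperature"},
]
for entry in at_least(log, "warning"):
    print(entry["severity"], entry["message"])
```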
Consider Escalation
Before you escalate a problem to technical support, ask the following:
Escalation is most effective when you provide a good problem description and all relevant network
information.
http://www.extremenetworks.com/services/tac-userguide.aspx
Prior to contacting Extreme Networks, upgrade to the newest software release and gather the following:
NOTE
The output of the show tech-support command can be very large.
Testing Theories
A layered approach gives structure to the troubleshooting process.
Use the OSI model to structure tests. Start testing at the lowest layer, then work your way up the OSI
layers.
For example:
1. Use the link Light Emitting Diodes (LEDs), show commands, and physical inspection of the cable to
verify the Physical Layer.
2. Use the ping command, verify packet statistics, verify port settings, and verify the Layer 2
forwarding tables to test Layer 2 connectivity.
3. Use an end-to-end ping command and the traceroute command, verify the Layer 3 configuration, and
verify the Layer 3 route tables to test Layer 3 connectivity.
4. Use telnet, FTP, TFTP, or SNMP to test protocols at the top three layers.
If the test results do not support the theory, go back to test a new theory.
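The bottom-up order of testing can be written down as a small routine. In this Python sketch the test outcomes are entered by hand as True or False per layer. The layer labels, the function name, and the rule that an untested layer counts as not yet verified are assumptions made for illustration.

```python
# Layers in the order they should be tested, lowest first.
LAYER_ORDER = ["physical", "data link", "network", "upper layers"]


def first_failing_layer(results):
    """Return the lowest layer that has not passed its tests, or None if all passed."""
    for layer in LAYER_ORDER:
        # A layer with no recorded result counts as not yet verified.
        if not results.get(layer, False):
            return layer
    return None


# Hypothetical outcome: link LEDs and Layer 2 tables look fine, end-to-end ping fails.
outcome = {"physical": True, "data link": True, "network": False}
print(first_failing_layer(outcome))
```

Starting the next theory at the layer this returns keeps the process systematic.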
Implementing a Solution
After you implement a solution, recheck the status of the problem. If there is still a problem, go back to
either gather more information or test different theories.
The symptoms of the problem might change as a result of the troubleshooting actions. If your actions
did not improve the situation, consider undoing them.
If you make configuration changes, make sure that you update the network documentation so it is
accurate.
The symptoms
The time and frequency of the symptom
The theories tested
The theory that proved to be correct
The solution
The root cause of the problem
Any ideas of how to recognize this same problem in the future
How long the network operation was affected
Summary
The Network Troubleshooting module presented basic concepts of network troubleshooting and the
show commands to use.
Identify the network management and maintenance systems that prepare you to troubleshoot.
Define the steps to perform systematic troubleshooting.
Describe the layered approach for troubleshooting.
Identify useful commands for finding errors at each layer.
Interpret the output of show commands.
Troubleshoot problems at the Physical, Data Link, and Network Layers.