Troubleshooting - Layer 2 Network (V600R003C00 - 02)

HUAWEI NetEngine80E/40E Router
V600R003C00
Troubleshooting - Layer 2 Network

Issue: 02
Date: 2011-09-10
HUAWEI TECHNOLOGIES CO., LTD.
Copyright Huawei Technologies Co., Ltd. 2011. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions

HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or representations
of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute the warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.

Address:
Huawei Industrial Base

Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website:
http://www.huawei.com
Email:
support@huawei.com
Issue 02 (2011-09-10)
Huawei Proprietary and Confidential

Copyright Huawei Technologies Co., Ltd.

About This Document

Purpose
This document describes how to troubleshoot the services of the HUAWEI NetEngine80E/40E,
covering common faults and causes, troubleshooting procedures and methods, troubleshooting
cases, and FAQs.
NOTE
- This document uses the interface numbers and link types of the NE40E-X8 as an example. In
  practice, the actual interface numbers and link types may differ from those used in this
  document.
- On NE80E/40E series routers other than the NE40E-X1 and NE40E-X2, line processing boards are
  called Line Processing Units (LPUs) and switching fabric boards are called Switching Fabric
  Units (SFUs). The NE40E-X1 and NE40E-X2 have no LPUs or SFUs; NPUs implement the functions of
  LPUs and SFUs to exchange and forward packets.
Related Versions
The following table lists the product versions related to this document.
Product Name                        Version
HUAWEI NetEngine80E/40E Router      V600R003C00
Intended Audience
This document is intended for:
- System maintenance engineers
- Commissioning engineers
- Network monitoring engineers
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol     Description
DANGER     Indicates a hazard with a high level of risk, which if not avoided, will result in
           death or serious injury.
WARNING    Indicates a hazard with a medium or low level of risk, which if not avoided, could
           result in minor or moderate injury.
CAUTION    Indicates a potentially hazardous situation, which if not avoided, could result in
           equipment damage, data loss, performance degradation, or unexpected results.
TIP        Indicates a tip that may help you solve a problem or save time.
NOTE       Provides additional information to emphasize or supplement important points of the
           main text.
Command Conventions
The command conventions that may be found in this document are defined as follows.
Convention         Description
Boldface           The keywords of a command line are in boldface.
Italic             Command arguments are in italics.
[ ]                Items (keywords or arguments) in brackets [ ] are optional.
{ x | y | ... }    Optional items are grouped in braces and separated by vertical bars. One item
                   is selected.
[ x | y | ... ]    Optional items are grouped in brackets and separated by vertical bars. One
                   item is selected or no item is selected.
{ x | y | ... }*   Optional items are grouped in braces and separated by vertical bars. A minimum
                   of one item or a maximum of all items can be selected.
[ x | y | ... ]*   Optional items are grouped in brackets and separated by vertical bars. Several
                   items or no item can be selected.
&<1-n>             The parameter before the & sign can be repeated 1 to n times.
#                  A line starting with the # sign is a comment.

Change History
Changes between document issues are cumulative. The latest document issue contains all the
changes made in earlier issues.
Changes in Issue 02 (2011-09-10)

The second commercial release. There is no update compared with the previous issue.
Changes in Issue 01 (2011-05-30)

Initial field trial release.

Contents
About This Document
1 QinQ Troubleshooting
  1.1 Related Troubleshooting Cases
    1.1.1 Non-Huawei Devices Cannot Ping the User Gateway Address Configured on the BRAS Attached to the router
    1.1.2 Some DHCP Clients Cannot Obtain IP Addresses from the DHCP Server When the DHCP Relay Agent Is Configured with a Sub-Interface for Dot1q VLAN Tag Termination
    1.1.3 Network Segments Cannot Successfully Ping Each Other In Dot1q Termination Access
2 HGMP Troubleshooting
  2.1 A Candidate Switch Directly Connected to the Administrator Switch Cannot Be Added to the Cluster
    2.1.1 Common Causes
    2.1.2 Troubleshooting Flowchart
    2.1.3 Troubleshooting Procedure
    2.1.4 Relevant Alarms and Logs
3 Ethernet OAM Troubleshooting
  3.1 Ethernet OAM 802.1ag Trace Fails
    3.1.1 Common Causes
4 MSTP Troubleshooting
  4.1 MSTP Topology Change Leads to Service Interruption
    4.1.1 Common Causes
5 RRPP Troubleshooting
  5.1 RRPP Loop Occurs Temporarily
    5.1.1 Common Causes
6 PPP Troubleshooting
  6.1 Protocol Status of a PPP Interface Is Down
    6.1.1 Common Causes
  6.2 Related Troubleshooting Cases
    6.2.1 Half of Packets Are Discarded the First Minute After an MP-Group Link Fails
    6.2.2 Incorrect Configurations of Member Interfaces Cause the MP-group Link to Work Unstably
  6.3 ATM Troubleshooting
  6.4 The Ping Fails on Both Ends of an ATM Network
    6.4.1 Common Causes
  6.5 Troubleshooting Cases
    6.5.1 OSPF Neighbor Relationship Cannot Be Set Up Between P2P ATM Sub-interfaces

1 QinQ Troubleshooting
About This Chapter

1.1 Related Troubleshooting Cases
1.1.1 Non-Huawei Devices Cannot Ping the User Gateway Address
Configured on the BRAS Attached to the router
Users are connected to non-Huawei devices through the BRAS and the router. You can configure
a sub-interface for dot1q VLAN tag termination on the router to run unicast routing protocols.
In this case, the vid specified in the control-vid command must be the same as the vid specified
in the dot1q termination vid command. Otherwise, non-Huawei devices cannot ping the user gateway
address configured on the BRAS attached to the router.
Fault Symptom
On the network shown in Figure 1-1, a CE accesses an external network through the BRAS,
PE, and NPEs. After ACL rules and a routing protocol are configured on a sub-interface for dot1q
VLAN tag termination on the PE, the following symptoms occur: the PE can ping the user gateway
address of the BRAS; the PE can also ping the user gateway address of the BRAS when it uses the
address of the interface connecting to an NPE as the source address; however, none of the NPEs
can ping the user gateway address of the BRAS.
NOTE
The PE is a Huawei device; the NPEs are non-Huawei devices.
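Figure 1-1 Networking diagram
[Figure not reproduced. Labels recoverable from the source: the upstream interfaces GE1/0/2
(4.4.4.4/24), GE1/0/3, GE1/0/4, and GE1/0/5 of the PE connect to NPE1 through NPE4 in the MPLS/IP
core; the downstream sub-interface GE1/0/1.1 (2.1.1.2/30) of the PE connects to GE1/0/2
(2.1.1.1/30) on the BRAS; the user gateway address on the BRAS is 1.1.1.1/30; the CE connects to
GE1/0/1 of the BRAS; the VLAN ID is 4000.]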
Fault Analysis
1. After using an NPE to ping the address of the upstream interface GE 1/0/2 on the BRAS
   and the user gateway address of the BRAS by using the external network address as the
   source address, do as follows:
   - Run the display interface command on each of the four upstream interfaces of the PE.
     The command output shows that each upstream interface has received ping packets and
     forwarded the packets through the downstream interface GE 1/0/1. Take the display on
     the upstream interface GE 1/0/2 of the PE as an example.
<PE> display interface gigabitethernet 1/0/2
GigabitEthernet1/0/2 current state : UP
Line protocol current state : UP
Last line protocol up time : 2010-05-31 11:57:21 UTC-08:00
Description:HUAWEI, GigabitEthernet1/0/2 Interface

Route Port,The Maximum Transmit Unit is 1500
Internet Address is 4.4.4.4/24
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is
00e0-5c5e-8600
QoS max-bandwidth : 1000000 Kbps
Output queue : (Urgent queue : Size/Length/Discards) 0/50/0
Output queue : (Protocol queue : Size/Length/Discards) 0/1000/0
Output queue : (FIFO queue : Size/Length/Discards) 0/256/0
Last physical up time   : 2010-05-31 11:57:20 UTC-08:00
Last physical down time : 2010-05-31 11:53:57 UTC-08:00
Current system time: 2010-05-31 12:05:46-08:00
Hardware address is 00e0-5c5e-8600
Last 300 seconds input rate 0 bytes/sec, 0 packets/sec
Last 300 seconds output rate 2 bytes/sec, 0 packets/sec
Input: 2768 bytes, 31 packets
Output: 4134 bytes, 46 packets
Input:
Unicast: 16 packets, Multicast: 15 packets
Broadcast: 0
Output:
Broadcast: 1 packets
Input bandwidth utilization : 0.00
Output bandwidth utilization : 0.00
Take the display on the downstream interface GE 1/0/1 of the PE as an example.

<PE> display interface gigabitethernet 1/0/1
GigabitEthernet1/0/1 current state : UP
Line protocol current state : DOWN
Description:HUAWEI, GigabitEthernet1/0/1 Interface
Route Port,The Maximum Transmit Unit is 1500
Internet protocol processing : disabled
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is
00e0-7b2c-8200
QoS max-bandwidth : 1000000 Kbps
Output queue : (Urgent queue : Size/Length/Discards) 0/50/0
Output queue : (Protocol queue : Size/Length/Discards) 0/1000/0
Output queue : (FIFO queue : Size/Length/Discards) 0/256/0
Last physical up time   : 2010-05-31 14:18:20 UTC-08:00
Last physical down time : 2010-05-31 13:47:43 UTC-08:00
Current system time: 2010-05-31 14:23:27-08:00
Hardware address is 00e0-7b2c-8200
Last 300 seconds input rate 2 bytes/sec, 0 packets/sec
Last 300 seconds output rate 0 bytes/sec, 0 packets/sec
Input: 2854 bytes, 32 packets
Output: 4008 bytes, 45 packets
Input:
Broadcast: 1
Output:
Broadcast: 0 packets
Input bandwidth utilization : 0.00
Output bandwidth utilization : 0.00
In the command output, Input indicates the total number of packets and bytes received
on the interface; Output indicates the total number of packets and bytes sent by the
interface.
   - Run the display acl command on the PE. The command output shows that ping packets
     can match ACL rules on upstream interfaces of the PE but are directly discarded on the
     downstream interface GE 1/0/1. Therefore, it can be concluded that the PE is the failure
     point.
<PE> display acl 3100

Advanced ACL 3100, 5 rules
rule 5 permit icmp source 4.4.4.4 0 destination 1.1.1.1 0 (2 times matched)
matched)
matched)
rule 20 permit icmp destination 2.1.1.1 0 (31 times matched)
rule 25 permit icmp destination 1.1.1.1 0 (20 times matched)
traffic classifier icmp operator or
if-match acl 3100
traffic behavior icmp
traffic policy icmp
statistics enable
classifier icmp behavior icmp
2. Run the display this command in the view of the downstream sub-interface GE 1/0/1.1 of the
   PE to view the configuration of the sub-interface.
<PE> display this
interface GigabitEthernet1/0/1.1
control-vid 1 dot1q-termination
dot1q termination vid 4000
ip address 2.1.1.2 255.255.255.252
traffic-policy icmp outbound
arp broadcast enable
trust upstream default
trust 8021p
The command output shows that vid specified in the control-vid command is different from
vid specified in the dot1q termination vid command. After the two VLAN IDs are set to
the same value, every NPE can ping the user gateway address of the BRAS successfully,
and the fault is cleared.
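The comparison made in step 2 can also be scripted. The following Python sketch is illustrative only and is not part of the product. The function name check_termination_vids and the SAMPLE_CONFIG variable are hypothetical, and the sketch assumes that the display this output is available as plain text and that the sub-interface terminates a single user VLAN ID.

```python
import re

# Sample text copied from the display this output above (assumed to be plain text).
SAMPLE_CONFIG = """\
interface GigabitEthernet1/0/1.1
 control-vid 1 dot1q-termination
 dot1q termination vid 4000
 ip address 2.1.1.2 255.255.255.252
"""


def check_termination_vids(config_text):
    """Compare the control VLAN ID with the terminated user VLAN ID.

    Hypothetical helper: returns None if either command is missing.
    """
    control = re.search(
        r"^\s*control-vid\s+(\d+)\s+dot1q-termination(\s+rt-protocol)?\s*$",
        config_text,
        re.MULTILINE,
    )
    user = re.search(
        r"^\s*dot1q termination vid\s+(\d+)", config_text, re.MULTILINE
    )
    if control is None or user is None:
        return None
    control_vid = int(control.group(1))
    user_vid = int(user.group(1))
    return {
        "control_vid": control_vid,
        "user_vid": user_vid,
        # True only when the rt-protocol keyword is present.
        "rt_protocol": control.group(2) is not None,
        "match": control_vid == user_vid,
    }


result = check_termination_vids(SAMPLE_CONFIG)
print(result)
```

For the sample configuration, the match field is False because the control VLAN ID is 1 and the user VLAN ID is 4000, which is the mismatch identified in step 2.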
Procedure
Re-configure the sub-interface for dot1q VLAN tag termination on the PE so that the
control VLAN ID is the same as the user VLAN ID terminated by the sub-interface.
Step 1 Run the system-view command to enter the system view.
Step 2 Run the interface interface-type interface-number.subinterface-number command to enter the
view of the Ethernet sub-interface on the AC side of the PE.
Step 3 Run the control-vid vid dot1q-termination rt-protocol command to configure the Ethernet
sub-interface as a sub-interface for dot1q VLAN tag termination and set the control VLAN ID
of the sub-interface.
NOTE
vid specified in the control-vid command must be the same as vid specified in the dot1q termination
vid command.
Step 4 Run the return command to return to the user view, and then run the save command to save the
modification.
After the preceding configurations, every NPE can ping the user gateway address of the BRAS
successfully. The fault is cleared.
----End
Summary
If a sub-interface for dot1q VLAN tag termination runs routing protocols, ensure that the
control VLAN ID configured for the sub-interface is the same as the user VLAN ID terminated by
the sub-interface. In addition, it is recommended that you run the control-vid command with the
rt-protocol parameter. In this case, the system displays a message when the control VLAN ID is
different from the user VLAN ID terminated by the sub-interface. If the rt-protocol parameter is
not configured, the system does not display any message when the two VLAN IDs are different.
1.1.2 Some DHCP Clients Cannot Obtain IP Addresses from the DHCP Server When the DHCP Relay
Agent Is Configured with a Sub-Interface for Dot1q VLAN Tag Termination
After the DHCP relay agent is configured to terminate packets from multiple VLANs through
the same sub-interface, some DHCP clients cannot obtain IP addresses from the DHCP server.
Fault Symptom
On the network shown in Figure 1-2, Router A functions as the DHCP server and Router B
functions as the DHCP relay agent. A PC, that is, a DHCP client, is connected to Router B
through Switch A. After the configuration, the DHCP client cannot obtain an IP address from
the DHCP server.
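Figure 1-2 Networking for a DHCP client to obtain an IP address from the DHCP server
[Figure not reproduced. Router A (DHCP server) connects to Router B (DHCP relay agent).
Sub-interface GE1/0/1.100 on Router B connects to Switch A, which serves the PC and the other
DHCP clients in VLAN 10, VLAN 20, and VLAN 30.]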
Fault Analysis
1. Assign an IP address from the address pool to the PC. Ping the gateway address of Router
   B from the PC. The ping is successful, indicating that the Layer 2 network runs properly.
   Capture packets on the PC. The PC succeeds in sending a DHCPDISCOVER message but
   fails to receive a DHCPOFFER message.
2. Ping the IP address of Router A from Router B. The ping is successful.
3. Capture packets transmitted between Router B and Switch A through port mirroring. A
   DHCPOFFER message has been sent from Router B to Switch A.
4. Analyze the DHCPOFFER message sent from Router B. When the sub-interface for Dot1q
   VLAN tag termination sends the DHCPOFFER message to the PC, Router B adds only the
   VLAN tag with the smallest VLAN ID terminated by the sub-interface to the DHCPOFFER
   message. In this example, Router B adds only the tag of VLAN 10, shown in Figure 1-2, to
   the DHCPOFFER message. The PC, however, belongs to VLAN 20. Therefore, the DHCPOFFER
   message is discarded by Switch A.
The Option field in a DHCP message is used to carry control information and parameters
that are not defined in common protocols. If you require that the Option field carry the
VLAN ID of a tagged packet sent from a client, you can enable the Option 82 function on
the DHCP relay agent.
Procedure
Step 1 Run the system-view command to enter the system view.
Step 2 Run the interface { ethernet | gigabitethernet } interface-number.subinterface-number
command to enter the Ethernet sub-interface view.
Step 3 Run the dhcp option82 insert enable command to enable Option 82 insertion for DHCP
messages on the sub-interface.
After the preceding operations are completed, the PC can obtain an IP address from the DHCP
server. The fault is thus rectified.
----End
Summary
The Option 82 field is a DHCP Relay Agent Information option recording the location
information of a DHCP client. It is a special field contained in a DHCP message.
The DHCP relay agent appends the Option 82 field to a DHCPDISCOVER message sent from
a client to the server. Upon receipt of the DHCPDISCOVER message that carries the Option 82
field, the DHCP server responds to the DHCP relay agent with a DHCPOFFER message that
contains the same Option 82 field. The DHCP relay agent then uses the Option 82 field carried
in the DHCPOFFER message to determine where to forward the message.
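The following Python sketch models this exchange in a simplified form. It is illustrative only. The option code 82 and the Agent Circuit ID sub-option code 1 follow RFC 3046, but encoding the VLAN ID as a 2-byte Circuit ID value is an assumption made for this example and is not the format used by the device. The function names are hypothetical.

```python
import struct

OPTION_82 = 82          # Relay Agent Information option code (RFC 3046)
SUBOPT_CIRCUIT_ID = 1   # Agent Circuit ID sub-option code (RFC 3046)


def build_option82(vlan_id):
    """Encode a VLAN ID as Option 82 with one Circuit ID sub-option.

    Assumption for this sketch: the Circuit ID value is the VLAN ID
    as a 2-byte big-endian integer.
    """
    circuit_id = struct.pack("!H", vlan_id)
    sub_option = bytes([SUBOPT_CIRCUIT_ID, len(circuit_id)]) + circuit_id
    return bytes([OPTION_82, len(sub_option)]) + sub_option


def vlan_from_option82(option):
    """Decode the VLAN ID from an option built by build_option82()."""
    if len(option) < 2 or option[0] != OPTION_82:
        raise ValueError("not an Option 82 field")
    body = option[2:2 + option[1]]
    index = 0
    # Walk the sub-options: each is code (1 byte), length (1 byte), value.
    while index + 2 <= len(body):
        code, length = body[index], body[index + 1]
        value = body[index + 2:index + 2 + length]
        if code == SUBOPT_CIRCUIT_ID and len(value) == 2:
            return struct.unpack("!H", value)[0]
        index += 2 + length
    raise ValueError("no usable Circuit ID sub-option")


def reply_vlan(terminated_vlans, echoed_option82=None):
    """Choose the VLAN tag for the DHCPOFFER sent toward the client.

    Without Option 82, this models the behavior described in the fault
    analysis (smallest terminated VLAN ID). With Option 82, the VLAN ID
    echoed back by the server is used.
    """
    if echoed_option82 is None:
        return min(terminated_vlans)
    return vlan_from_option82(echoed_option82)


print(reply_vlan([10, 20, 30]))
print(reply_vlan([10, 20, 30], build_option82(20)))
```

In this model, a client in VLAN 20 receives the reply only in the second case, where the relay agent recovers the VLAN ID from the echoed Option 82 field.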
1.1.3 Network Segments Cannot Successfully Ping Each Other in Dot1q Termination Access
Fault Symptom
On the network shown in Figure 1-3, users on network segment A access the router through
the BAS; users on network segment B access the router through a sub-interface for Dot1q
VLAN tag termination on the router. Ping tests between network segment A and network
segment B show that some addresses cannot ping each other. The router, however, can
successfully ping both network segment A and network segment B.
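Figure 1-3 Networking for network segments failing to ping each other in Dot1q termination
access
[Figure not reproduced. Labels recoverable from the source: the Router with sub-interfaces
GE1/0/0.1, GE1/0/0.2, GE2/0/0.1, and GE2/0/0.2; a BAS; two switches; User Network A; and User
Network B.]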
Fault Analysis
1. Run the display this command in the view of sub-interfaces of GE 1/0/0 on the router to
   view information about the sub-interfaces of GE 1/0/0. The information is displayed as
   follows:
#
control-vid 10 dot1q-termination
dot1q termination vid 3
ip address 10.1.1.1 255.255.255.0
arp broadcast enable
#
undo shutdown
vlan-type dot1q 10
ip address 10.1.1.2 255.255.255.0
#
2. Run the display this command in the view of sub-interfaces of GE 2/0/0 on the router to
   view information about the sub-interfaces of GE 2/0/0. The information is displayed as
   follows:
#
undo shutdown
vlan-type dot1q 10
ip address 10.2.1.1 255.255.255.0
#
undo shutdown
vlan-type dot1q 10
ip address 10.2.1.2 255.255.255.0
#
3. The output shows that GE 1/0/0.1 is configured as a sub-interface for Dot1q VLAN tag
   termination, whereas the other sub-interfaces terminate VLAN packets through the
   vlan-type dot1q command. If users access the router through a sub-interface for Dot1q
   VLAN tag termination and network segment routes must also be supported, rt-protocol must
   be configured and the configured user VLAN ID must be the same as the control VLAN ID.
   In the output above, the control VLAN ID is 10, the user VLAN ID is 3, and rt-protocol is
   not configured. Change the configuration of GE 1/0/0.1 to rectify the fault.
Procedure
Step 1 Run the system-view command to enter the system view.
Step 2 Run the interface interface-type interface-number.subinterface-number command to enter
the sub-interface view.
Step 3 Run the control-vid vid dot1q-termination rt-protocol command to configure the control
VLAN for the sub-interface for Dot1q VLAN tag termination.
NOTE
If rt-protocol is configured when configuring the control VLAN, ensure that the user VLAN ID is the
same as the control VLAN ID when running the dot1q termination vid command.
Step 4 Run the vlan-group group-id command to configure a user VLAN group.
Step 5 Run the statistic enable command to enable the QinQ statistics function for the user VLAN
group on the sub-interface.
Step 6 Run the quit command to return to the sub-interface view.
Step 7 Run the dot1q termination vid low-pe-vid vlan-group group-id command to configure the
sub-interface for Dot1q VLAN tag termination.
After the preceding operations, network segment A and network segment B can successfully
ping each other.
----End
Summary
Network segment A is connected to the router through the BAS. Therefore, the router can
successfully ping network segment A. The router is directly connected to network segment B at
Layer 2. Therefore, the router can successfully ping network segment B.
Some addresses on network segment A and network segment B cannot successfully ping each
other because the Dot1q termination configuration of a sub-interface on the router is incorrect.
If users access the router through a sub-interface for Dot1q VLAN tag termination and network
segment routes must also be supported, rt-protocol must be configured and the configured
user VLAN ID must be the same as the control VLAN ID.
2 HGMP Troubleshooting
About This Chapter

2.1 A Candidate Switch Directly Connected to the Administrator Switch Cannot Be Added to
the Cluster
2.1 A Candidate Switch Directly Connected to the Administrator Switch Cannot Be Added to
the Cluster
2.1.1 Common Causes
Two switches are directly connected. A cluster is created on one switch. The other switch, that
is, a candidate switch, cannot be added to the cluster, and there is no prompt on the administrator
switch.
This fault is commonly caused by one of the following:
- Packets cannot be exchanged between the administrator switch and candidate switch
  because either of the interfaces connecting them is Down.
- The basic Layer 2 forwarding configuration is incorrect.
- Layer 2 packet forwarding or transparent transmission of packets fails.
- Packets cannot be exchanged between the administrator switch and candidate switch
  because either of the interfaces that the packets pass through is blocked by a ring protocol.
- The cluster, NDP, or NTDP is incorrectly configured.
- The candidate switch has already been added to a cluster and still remains in it, and that
  cluster has a different name from the new cluster to which the candidate switch is to be
  added.
- Authentication of the candidate switch fails because the super passwords of the
  candidate switch and administrator switch are inconsistent.
2.1.2 Troubleshooting Flowchart

Figure 2-1 shows the troubleshooting flowchart.
Figure 2-1 Troubleshooting flowchart for the fault that a candidate switch directly connected
to the administrator switch cannot be added to the cluster
[Flowchart not reproduced. Its decision points are: Basic configurations correct? Layer 2 loop
protocols block the interface running HGMP? NDP configurations correct? NDP can discover
neighbors? NTDP configurations correct? NTDP can discover topologies? Cluster configurations
correct? Switch can be added to the new cluster after the cluster function is disabled and then
enabled on the switch? A failed check leads to a corrective action (change the corresponding
configurations, or collect NDP or NTDP debugging information), followed by the check "Fault
rectified?". If the fault persists, collect debugging information on the switches and seek
technical support.]
2.1.3 Troubleshooting Procedure

NOTE
Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to
correct the fault, you will have a record of your actions to provide to Huawei technical support
personnel.
Procedure
Step 1 Check that basic configurations of the administrator and candidate switches are correct.
HGMP packets can be exchanged only when Layer 2 forwarding is normal. You need to ensure
that the administrator and candidate switches are correctly configured so that they can exchange
Layer 2 packets.
Ensure that the two switches are configured as follows:
- The two directly connected interfaces are added to the same VLAN.
- The VLAN is the cluster management VLAN, which is specified by running the
  mngvlanid vlan-id command in the cluster view. In addition, vlan-id specifies the VLAN
  to which the interfaces belong.
- The two interfaces are added to the VLAN in the same manner. For example, the port
  trunk allow-pass vlan vlan-id command is run on both interfaces with vlan-id being the
  same.
If the preceding configurations are correct, run the display vlan vlan-id command on both the
administrator and candidate switches to check whether interfaces in the VLAN are Up. For
example,
[HUAWEI] display vlan 1000
VLAN ID Type         Status   MAC Learning
---------------------------------------------------------
1000    common       enable   enable
----------------
Tagged     Port: GigabitEthernet1/0/1
----------------
Interface                   Physical
GigabitEthernet1/0/1        UP
- If the interfaces are Down, the physical link may be faulty. In this case, rectify the
  physical link fault.
- If the interfaces are Up, the interfaces are working properly at Layer 2. If the fault
  persists, either the cluster configurations or packet processing at layers above Layer 2 may
  be incorrect. Go to Step 2.
Step 2 Check that the Layer 2 ring protocols on the interfaces of administrator and candidate switches
run normally.
- If STP is enabled on the administrator and candidate switches, check whether the interfaces
  running HGMP are blocked by STP. Run the display stp brief command to check
  the interface status. For example,
  [HUAWEI] display stp brief
   MSTID  Port                        Role  STP State     Protection
     0                                ROOT  FORWARDING    NONE
     0                                DESI  FORWARDING    NONE
     0                                DESI  FORWARDING    NONE
  If packets can be forwarded normally, the "STP State" field is displayed as
  FORWARDING on the interfaces running HGMP. If the "STP State" field is
  displayed as DISCARDING, the interface is blocked by STP and cannot forward HGMP
  packets. In this case, run the stp priority priority-level command in the system view to
  change the STP priority so that the switch is elected as the root bridge and the interface
  leaves the DISCARDING state. priority-level ranges from 0 to 61440. The smaller the
  value, the higher the priority. The device with the smallest priority value (that is, the
  highest priority) is elected as the root bridge of the ring.
  If the interfaces running HGMP are in the FORWARDING state, STP on the interfaces runs
  normally.
- If RRPP is configured on both the administrator and candidate switches, check whether the
  interfaces running HGMP are blocked by RRPP. Run the display rrpp verbose
  domain domain-index command to check the interface status. For example,
  [HUAWEI] display rrpp verbose domain 1
  Domain Index   : 1
  Control VLAN   : major 1000    sub 1001
  Protected VLAN : Reference Instance 1
  Hello Timer    : 1 sec(default is 1 sec)  Fail Timer : 6 sec(default is 6 sec)
  RRPP Ring      : 1
  Ring Level     : 0
  Node Mode      : Master
  Ring State     : Failed
  Is Enabled     : Enable                   Is Actived : Yes
  Primary port   :                          Port status: UP
  Secondary port :                          Port status: DOWN
  If the "Port status" field is displayed as BLOCK, cluster packets on the interfaces
  running HGMP are blocked by RRPP. RRPP blocks secondary ports only. Change the blocked
  interface to a non-secondary port so that the interface leaves the blocked state.
  If the interfaces running HGMP are in the Up state, RRPP on the interfaces runs normally.
  Go to Step 3.
NOTE
Only one ring protocol, in general, is configured on an interface. Check which ring protocol is
configured on the interface before checking the interface status.
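The effect of the STP priority on root bridge election in Step 2 can be sketched as follows. This Python example is illustrative only. It assumes the standard spanning tree rule that the bridge with the numerically smallest bridge ID (priority first, then MAC address) becomes the root, and it assumes that valid priority values are multiples of 4096 in the range 0 to 61440. The switch names, the function names, and the use of MAC addresses from sample outputs in this document are hypothetical choices for the example.

```python
def bridge_id(priority, mac):
    """Return a comparable bridge ID: (priority, MAC address as integer)."""
    if not 0 <= priority <= 61440 or priority % 4096 != 0:
        raise ValueError("priority must be a multiple of 4096 from 0 to 61440")
    return (priority, int(mac.replace("-", ""), 16))


def elect_root(bridges):
    """Return the name of the bridge with the smallest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical switches that both use a priority of 32768.
bridges = {
    "administrator": (32768, "0018-82af-fc38"),
    "candidate": (32768, "001c-2334-2312"),
}
root_before = elect_root(bridges)

# Lowering the priority value on one switch raises its priority.
bridges["candidate"] = (4096, "001c-2334-2312")
root_after = elect_root(bridges)

print(root_before, root_after)
```

With equal priority values, the switch with the smaller MAC address is elected. After the priority value of the other switch is lowered to 4096, that switch is elected instead.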
Step 3 Check that basic NDP functions are normal.

Run the display ndp command on both the administrator and candidate switches to check
whether NDP can successfully discover neighbors. If NDP can discover neighbors, information
about the directly connected neighbors can be displayed. For example,
<HUAWEI> display ndp
Neighbor discovery protocol is enabled.
Neighbor Discovery Protocol Ver: 1, Hello Timer: 60(s), Aging Timer: 180(s)
Interface: GigabitEthernet1/0/2
Status: Enabled, Packets Sent: 114, Packets Received: 108, Packets Error: 0
Neighbor 1: Aging Time: 174(s)
MAC Address : 0018-8203-39d8
Port Name   : GigabitEthernet1/0/1
Software Version: NE40E Version V600R003C00SPC300
Device Name : NE40E
Port Duplex : FULL
Product Ver : NE40E
If NDP cannot discover neighbors, check that NDP is configured as follows:

- NDP is globally enabled on both switches by running the ndp enable command in the system
  view.
- NDP is enabled on the two directly connected interfaces by running the ndp enable command
  in the interface view.
CAUTION
Debugging affects the performance of the system. So, after debugging, run the undo debugging
all command to disable it immediately.
If the NDP configurations are correct whereas NDP still cannot discover neighbors, collect the
debugging information displayed by running the following commands and then contact Huawei
technical support personnel.
l Run the terminal monitor and terminal debugging commands in the user view to enable
monitoring debugging.
l Run the debugging ndp packet interface interface-type interface-number command in the
user view to enable NDP debugging and collect the debugging information within three minutes.
If NDP can discover neighbors, go to Step 4.
Step 4 Check that basic NTDP functions are normal.
Check that NTDP is configured as follows:
l NTDP is globally enabled on both switches by running the ntdp enable command in the
system view.
l NTDP is enabled on the two directly connected interfaces by running the ntdp enable
command in the interface view.
l The cluster management VLAN is configured by running the mngvlanid vlan-id command in the
cluster view. vlan-id specifies the VLAN to which the interface belongs.
If the NTDP configurations are incorrect, correctly configure NTDP.
If the NTDP configurations are correct, run the ntdp explore command on the administrator
and candidate switches to discover topologies. After five seconds, run the display ntdp
device-list command on the two switches to check whether NTDP can discover topologies. If NTDP
can discover topologies, information about neighbors can be displayed. For example,
[HUAWEI] display ntdp device-list
The device-list of NTDP:
------------------------------------------------------------------------------
MAC             HOP  IP          PLATFORM
------------------------------------------------------------------------------
001c-2334-2312  1    1.1.1.2/24  NE80E/40E
0018-82af-fc38  0    1.1.1.1/24  NE80E/40E
If NTDP cannot discover topologies, collect the debugging information displayed by running
the following commands on the two switches and then contact Huawei technical support
personnel.
l Run the debugging ntdp all command in the user view to enable NTDP debugging.
l Run the ntdp explore command to discover topologies and the display ntdp device-list
command to display the topologies.
If NTDP discovers topologies, go to Step 5.
NOTE
l A switch can be added to the cluster only if it has been discovered by NTDP on the administrator switch.
l Switches do not forward received NDP packets and therefore ring protocols cannot block NDP packets.
NTDP packets are forwarded after being received and therefore NTDP packets may be blocked by ring
protocols.
Step 5 Check that the basic cluster function is normal.

Check whether the cluster function is configured as follows:
l The cluster function is globally enabled on both switches by running the cluster enable
command in the system view.
l VLANIF interfaces of the cluster management VLAN are configured on both switches by
running the interface vlanif vlan-id command in the system view. vlan-id must be the same
as that in the mngvlanid command configured in the cluster view.
l An available IP pool is configured on the administrator switch by running the ip-pool
administrator-ip-address mask command in the cluster view.
l The IP addresses manually assigned to the VLANIF interfaces of the management VLAN
do not reside in the IP pool configured by using the ip-pool command.
l No super password or the same super password is configured for the administrator and
candidate switches.
If the cluster configurations are incorrect, correctly configure the cluster function.
If the cluster configurations are correct, disable cluster from the switch by running the undo
cluster enable command, and then run the cluster enable command to ensure that the switch
does not belong to any cluster. Then, delete the cluster on the administrator switch, and then
create a new cluster. Check whether the candidate switch can be added to the new cluster.
l Run the undo build command in the cluster view to delete the existing cluster.
l Run the auto-build command to create a new cluster.
If the candidate switch still cannot be added to the cluster, collect the debugging information
displayed by running the following commands on the two switches and then contact Huawei
technical support personnel.
l Run the debugging cluster all command in the user view to enable cluster debugging.
l Manually add the candidate switch to the cluster by running the add-member mac-address
mac-address command in the cluster view and collect the command output displayed within
10 seconds.
Step 6 Collect the following information and contact Huawei technical support personnel.
l Results of the preceding troubleshooting procedure
l Configuration files, log files, and alarm files of the device
----End
2.1.4 Relevant Alarms and Logs

Relevant Alarms
HGMP/4/ClstMemStusChg:OID:[oid],DeviceID:[string], Role:[integer].
Relevant Logs
None.
3 Ethernet OAM Troubleshooting
About This Chapter

3.1 Ethernet OAM 802.1ag Trace Fails
3.1 Ethernet OAM 802.1ag Trace Fails

3.1.1 Common Causes
On the network shown in Figure 3-1, Router A fails to perform the 802.1ag MAC trace operation
to trace Router C.
[RouterA-md-one-ma-one] trace mac-8021ag mac 0018-823c-c449
Tracing the route to 0018-823c-c449 over a maximum of 64 hops:
Request timed out.
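Figure 3-1 Networking diagram for the fault that Ethernet OAM 802.1ag trace fails
(Diagram: RouterA, RouterB, and RouterC are connected in series, with RouterB as the
intermediate node. Interface labels shown: GE8/0/13, GE1/0/10, GE1/0/13, and GE2/0/10.)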

l Devices on the traced link run Ethernet OAM 802.1ag of different versions.
l A MEP configured on Router C (the traced node) is at a level different from that on Router
A (the trace-initiating node).
l A MEP on an intermediate node has the same level as or higher level than that on Router
A.
l An intermediate node has no MAC address entry of Router C.
Figure 3-2 Troubleshooting flowchart for the fault that Ethernet OAM 802.1ag trace fails
(Flowchart: check in turn whether 802.1ag of the same version is run on Router A, Router B,
and Router C; whether the MEP on Router C is at the same level as the MEP on Router A; whether
a MEP of the same or a higher level is configured on Router B; and whether Router B has a MAC
address entry of Router C. After each correction, check whether the fault is rectified. If the
fault persists after all checks, collect information and seek technical support.)

Procedure
Step 1 Run the display oam global configuration command to check that only 802.1ag Draft 7 or only
Standard 802.1ag-2007 is run on all devices along the traced link.
l If only 802.1ag Draft 7 or only Standard 802.1ag-2007 is run on all devices along the traced
link, go to Step 2.
l If 802.1ag Draft 7 and Standard 802.1ag-2007 are run on different devices along the traced
link, run the cfm version command to change all devices to run only 802.1ag Draft 7 or only
Standard 802.1ag-2007.
If Router A successfully performs the MAC trace operation to trace Router C, go to Step
6.
If Router A fails to perform the MAC trace operation to trace Router C, go to Step 2.
Step 2 Run the display this command to check that the MEP configured on Router C has the same level
as the MEP configured on Router A.
l If so, go to Step 3.
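l If not, run the cfm md command to set the MEP level on Router C the same as that on Router
A.
If Router A successfully performs the MAC trace operation to trace Router C, go to Step
6.
If Router A fails to perform the MAC trace operation to trace Router C, go to Step 3.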
Step 3 Run the display cfm mep command to check whether the level of the MEP on an intermediate
node is the same as or higher than that on Router A.
NOTE
After 802.1ag packets in a lower-level MD enter a higher-level MD, the 802.1ag packets will be discarded;
802.1ag packets in a higher-level MD can successfully travel through a lower-level MD. 802.1ag packets in an
MD of a specified level cannot travel through the MD with the same level.
l If the level of the MEP on an intermediate node is lower than that on Router A, go to Step
4.
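l If the level of the MEP on an intermediate node is the same as or higher than that on Router
A, run the cfm md command to set the level of the MEP on the intermediate node to be lower
than that on Router A.
If Router A successfully performs the MAC trace operation to trace Router C, go to Step
6.
If Router A fails to perform the MAC trace operation to trace Router C, go to Step 4.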
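The level rule in the NOTE of Step 3 can be expressed as a short check. The following Python sketch is illustrative only: the function names and the sample levels are assumptions, not part of the device software. It treats an intermediate node as passable only when its MEP level is lower than the level used by Router A.

```python
def can_cross(trace_level, intermediate_level):
    """True if 802.1ag packets sent at trace_level can pass a node whose
    MEP is at intermediate_level (same or higher level blocks them)."""
    return intermediate_level < trace_level


def first_blocking_node(trace_level, intermediate_nodes):
    """Return the name of the first intermediate node that blocks the
    trace, or None if every node lets the packets through.

    intermediate_nodes is a list of (name, mep_level) pairs in path order.
    """
    for name, level in intermediate_nodes:
        if not can_cross(trace_level, level):
            return name
    return None


# Hypothetical levels: Router A traces at level 5.
print(first_blocking_node(5, [("RouterB", 3)]))  # lower level on Router B
print(first_blocking_node(5, [("RouterB", 5)]))  # same level on Router B
```

In the second call the helper names RouterB, which corresponds to the case in which the MEP level on the intermediate node must be lowered.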
Step 4 Run the display mac-address dynamic unit unit-id command on each intermediate node to
check that the MAC address entry of Router C exists.
l If so, go to Step 5.
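l If not, run the ping mac-8021ag command to allow the intermediate node to learn the MAC
address of Router C.
If Router A successfully performs the MAC trace operation to trace Router C, go to Step
6.
If Router A fails to perform the MAC trace operation to trace Router C, go to Step 5.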
Step 5 If the fault persists, contact Huawei technical support personnel.
----End
Relevant Alarms
EOAM1AG/4/CCFAULTALARM:OID 1.3.6.1.4.1.2011.5.25.136.1.6.1 MEP is disconnected or
reconnected with one or more MEPs. (CfmMdIndex=[GAUGE], CfmMaIndex=[GAUGE], RmepId=
[GAUGE], CfmMdIndex=[GAUGE], CfmMdIndex=[GAUGE], CfmMaIndex=[GAUGE], Dmac=
[OPAQUE], HighestDefect=[INTEGER], MdName=[STRING], MdLevel=[INTEGER], MaName=
[STRING])
EOAM1AG/4/CCFAULTALARM:OID 1.3.6.1.4.1.2011.5.25.136.1.6.1 A MEP has lost contact.
(CfmMdIndex=[GAUGE], CfmMaIndex=[GAUGE], RmepId=[GAUGE], CfmMdIndex=[GAUGE],
CfmMdIndex=[GAUGE], CfmMaIndex=[GAUGE], Dmac=[OPAQUE], HighestDefect=[INTEGER],
MdName=[STRING], MdLevel=[INTEGER], MaName=[STRING])
Relevant Logs
None
4 MSTP Troubleshooting
About This Chapter

4.1 MSTP Topology Change Leads to Service Interruption
4.1 MSTP Topology Change Leads to Service Interruption

4.1.1 Common Causes
After MSTP is configured on a device and the MSTP topology changes, services are interrupted.
l MSTP is incorrectly configured.
l Physical links flap, causing a large number of TC messages to be sent.
l An MSTP-aware device receives MSTP TC messages from clients or transparently transmitted
MSTP TC messages.

Troubleshooting of a service interruption caused by an MSTP topology change is based on the
network shown in Figure 4-1.
Figure 4-1 Networking diagram of MSTP
(Diagram: S1, S2, S3, and S4 form a ring through interfaces GE1/0/1 and GE1/0/2. CIST (MSTI0):
root switch S1. MSTI1: root switch S1. MSTI2: root switch S2. The blocked port of each
instance is marked in the diagram.)
The troubleshooting roadmap is as follows:

l Check that the MSTP status is correct.
l Check whether the device has received TC messages.
l Check that no physical interface on the device alternates between Up and Down.
l Check that the MSTP convergence mode is Normal.
Figure 4-2 Troubleshooting flowchart for the fault that an MSTP topology change leads to
service interruption
(Flowchart: services are interrupted or the device is disconnected. Check in turn whether the
MSTP status is correct, and if not, check and modify the MSTP configuration; whether MSTP
recalculation is performed, and if not, seek technical support; whether a physical interface
on the device alternates between Up and Down, and if so, shut down the flapping interface;
and whether the MSTP convergence mode is Normal, and if not, set it to Normal. If the fault
persists, collect information and seek technical support.)

NOTE
Procedure
Step 1 Check the status of interfaces on MSTP devices.
Check the role of each MSTP-enabled port in each instance.
On the network shown in Figure 4-1, there is only one MSTP ring, which means that each
instance can have only one blocked interface. Run the display stp brief command on each device
to check whether the status of each port is normal.
Run the display stp brief command in any view to check the MSTP status on S1. As shown in
Figure 4-1, in instances 0 and 1, S1 functions as a root bridge and all ports on S1 are designated
ports. In instance 2, one port on S1 is a designated port and the other port is a root port. Both
ports are in the Forwarding state.
[S1] display stp brief
 MSTID  Port  Role  STP State   Protection
   0          DESI  FORWARDING  NONE
   0          DESI  FORWARDING  NONE
   1          DESI  FORWARDING  NONE
   1          DESI  FORWARDING  NONE
   2          ROOT  FORWARDING  NONE
   2          DESI  FORWARDING  NONE
Run the display stp brief command in any view to check the MSTP status on S2. As shown in
Figure 4-1, in instance 2, S2 functions as a root bridge and all ports on S2 are designated
ports. In other instances, one port on S2 is a designated port and the other port is a root
port. Both of them are in the Forwarding state.
[S2] display stp brief
 MSTID  Port  Role  STP State   Protection
   0          ROOT  FORWARDING  NONE
   0          DESI  FORWARDING  NONE
   1          ROOT  FORWARDING  NONE
   1          DESI  FORWARDING  NONE
   2          DESI  FORWARDING  NONE
   2          DESI  FORWARDING  NONE
Run the display stp brief command in any view to check the MSTP status on S3. As shown in
Figure 4-1, in instance 2, one port on S3 is an Alternate port and the other port is a root port.
The Alternate port is blocked and in the Discarding state. In other instances, one port on S3 is
a designated port and the other port is a root port. Both of them are in the Forwarding state.
[S3] display stp brief
 MSTID  Port  Role  STP State   Protection
   0          DESI  FORWARDING  NONE
   0          ROOT  FORWARDING  NONE
   1          DESI  FORWARDING  NONE
   1          ROOT  FORWARDING  NONE
   2          ALTE  DISCARDING  NONE
   2          ROOT  FORWARDING  NONE
Run the display stp brief command in any view to check the MSTP status on S4. As shown in
Figure 4-1, in instances 0 and 1, one port on S4 is an Alternate port and the other port is a
root port. The Alternate port is blocked and in the Discarding state. In instance 2, one port on
S4 is a designated port and the other port is a root port. Both of them are in the Forwarding
state.
[S4] display stp brief
 MSTID  Port  Role  STP State   Protection
   0          ALTE  DISCARDING  NONE
   0          ROOT  FORWARDING  NONE
   1          ALTE  DISCARDING  NONE
   1          ROOT  FORWARDING  NONE
   2          DESI  FORWARDING  NONE
   2          ROOT  FORWARDING  NONE
On the network shown in Figure 4-1, each instance has only one port in the Discarding
state and all other ports are in the Forwarding state. If several ports in an instance are in
the Discarding state, an MSTP calculation error has occurred. To solve this problem, go to
Step 6.
If the MSTP status is correct, go to Step 2.
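The per-instance check in Step 1 can be sketched in Python once the display stp brief output of every device has been collected. The rows below are transcribed from the sample outputs in this step. The tuple layout and the function names are assumptions made for illustration, and the port names are omitted.

```python
from collections import Counter

F, D = "FORWARDING", "DISCARDING"

# (device, MSTID, role, STP state), taken from the sample outputs above.
ROWS = [
    ("S1", 0, "DESI", F), ("S1", 0, "DESI", F),
    ("S1", 1, "DESI", F), ("S1", 1, "DESI", F),
    ("S1", 2, "ROOT", F), ("S1", 2, "DESI", F),
    ("S2", 0, "ROOT", F), ("S2", 0, "DESI", F),
    ("S2", 1, "ROOT", F), ("S2", 1, "DESI", F),
    ("S2", 2, "DESI", F), ("S2", 2, "DESI", F),
    ("S3", 0, "DESI", F), ("S3", 0, "ROOT", F),
    ("S3", 1, "DESI", F), ("S3", 1, "ROOT", F),
    ("S3", 2, "ALTE", D), ("S3", 2, "ROOT", F),
    ("S4", 0, "ALTE", D), ("S4", 0, "ROOT", F),
    ("S4", 1, "ALTE", D), ("S4", 1, "ROOT", F),
    ("S4", 2, "DESI", F), ("S4", 2, "ROOT", F),
]


def discarding_per_instance(rows):
    """Count the ports in the Discarding state for each MSTI on the ring."""
    counts = Counter()
    for _device, mstid, _role, state in rows:
        counts[mstid] += 0  # make sure every instance appears in the result
        if state == D:
            counts[mstid] += 1
    return dict(counts)


def suspicious_instances(rows, expected=1):
    """Instances whose number of Discarding ports differs from expected."""
    counts = discarding_per_instance(rows)
    return sorted(m for m, n in counts.items() if n != expected)


print(discarding_per_instance(ROWS))
print(suspicious_instances(ROWS))
```

With the sample data every instance has exactly one port in the Discarding state, so no instance is reported. The expected value of 1 holds only for a single ring such as the one in Figure 4-1.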
Step 2 Check that the MSTP configuration is correct.

Run the display stp region-configuration command to view mappings between VLANs and
instances.
[S1] display stp region-configuration
 Oper Configuration:
   Format selector :0
   Region name     :huawei
   Revision level  :0

   Instance   Vlans Mapped
      0       21 to 4094
      1       1 to 10
      2       11 to 20
Check whether mappings between VLANs and instances are correct. If the mapping
between a VLAN and an instance is incorrect, run the instance command to map the VLAN
to a specified spanning tree instance. Then run the active region-configuration command to
activate the mapping between the VLAN and instance configured by using the instance
command.
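One way to check the mappings is to compare the displayed VLAN-to-instance table with the planned one, VLAN by VLAN. The Python sketch below is a minimal illustration. It assumes that each instance lists a single VLAN or a single range written as "first to last", as in the sample output, and the helper names are hypothetical.

```python
def vlan_to_instance(mapping):
    """Expand {instance: "first to last"} into a {vlan: instance} table."""
    table = {}
    for instance, text in mapping.items():
        parts = text.split(" to ")
        first, last = int(parts[0]), int(parts[-1])
        for vlan in range(first, last + 1):
            table[vlan] = instance
    return table


def mismatched_vlans(displayed, planned):
    """VLANs that are mapped to different instances in the two mappings."""
    shown, wanted = vlan_to_instance(displayed), vlan_to_instance(planned)
    return sorted(v for v in set(shown) | set(wanted)
                  if shown.get(v) != wanted.get(v))


# Mapping from the sample output on S1.
displayed = {0: "21 to 4094", 1: "1 to 10", 2: "11 to 20"}
# Hypothetical plan in which VLANs 11 to 20 were meant for instance 1.
planned = {0: "21 to 4094", 1: "1 to 20"}

print(mismatched_vlans(displayed, displayed))
print(mismatched_vlans(displayed, planned))
```

The second call lists VLANs 11 to 20, which are the VLANs that would need to be remapped with the instance command in this hypothetical plan.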
Run the display current-configuration command to view the MSTP configuration in the
configuration file of the device.
l Check interface configurations to confirm that MSTP-enabled interfaces have been
configured with the command (for example, bpdu-tunnel enable) that enables protocol
packets to be sent to the CPU.
l Check whether MSTP is disabled on the interfaces connecting to user terminals or the
interfaces are configured as edge interfaces.
l If an MSTP-enabled device is configured with a BPDU tunnel, check whether the BPDU
tunnel configuration is correct. For BPDU tunnel configurations, see the chapter "BPDU
Tunnel Configuration" in the NE80E/40E Configuration Guide - LAN Access and MAN
Access.
l Check whether a port is added to a VLAN correctly. For VLAN configurations, see the
chapter "VLAN Configuration" in the NE80E/40E Configuration Guide - LAN Access and
MAN Access.
If the MSTP configuration is correct, go to Step 3.
Step 3 Check that no MSTP recalculation is performed.

Run the display stp command in any view to check whether the device has received TC
messages.
[S1] display stp
-------[CIST Global Info][Mode MSTP]-------
CIST Bridge         :57344.00e0-fc00-1597
Bridge Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC      :0    .0018-826f-fc7a / 20000
CIST RegRoot/IRPC   :57344.00e0-fc00-1597 / 0
CIST RootPortId     :128.2
BPDU-Protection     :disabled
TC or TCN received  :0
TC count per hello  :0
STP Converge Mode   :Normal
Time since last TC  :2 days 14h:16m:15s
-------[MSTI 1 Global Info]-------
MSTI Bridge ID      :4096.00e0-fc00-1597
MSTI RegRoot/IRPC   :4096.00e0-fc00-1597 / 0
MSTI RootPortId     :0.0
Master Bridge       :57344.00e0-fc00-1597
Cost to Master      :0
TC received         :0
If values of the TC or TCN received, TC count per hello, and TC received fields in the
command output increase, the device has received TC messages and the network topology
has changed. In this case, view the log messages MSTP/6/SET_PORT_DISCARDING and
MSTP/6/SET_PORT_FORWARDING to check whether the role of an MSTP-enabled port
has changed.
If the port role does not change, go to Step 4.
If the port role changes, go to Step 6.
NOTE
If a multi-process has been created on the device and TC notification has been configured in the
multi-process, when the topology of the multi-process changes, a TC message is sent to the process
0 for instructing devices in process 0 to refresh their MAC and ARP address tables. In this manner,
devices on the network can re-select links to forward traffic, ensuring non-stop traffic.
If the values in the TC or TCN received, TC count per hello, and TC received fields in the
command output are all 0, the device has not received any TC message. In this case, contact
Huawei technical support personnel.
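Whether the TC counters increase can be judged by running display stp twice and comparing the two outputs. The Python sketch below is only an illustration of that comparison. It assumes that each counter is printed on one line in the form "name :value", and the function names are hypothetical.

```python
TC_FIELDS = ("TC or TCN received", "TC count per hello", "TC received")


def tc_counters(output):
    """Sum each TC counter found in one display stp output."""
    counters = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if sep and name in TC_FIELDS and value.isdigit():
            # "TC received" can appear once per MSTI, so values are summed.
            counters[name] = counters.get(name, 0) + int(value)
    return counters


def increased_fields(before, after):
    """Names of the TC counters that grew between two outputs."""
    old, new = tc_counters(before), tc_counters(after)
    return [f for f in TC_FIELDS if new.get(f, 0) > old.get(f, 0)]


first = """TC or TCN received  :0
TC count per hello  :0
Time since last TC  :2 days 14h:16m:15s
TC received         :0"""

second = """TC or TCN received  :3
TC count per hello  :1
Time since last TC  :0 days 0h:0m:4s
TC received         :0"""

print(increased_fields(first, second))
```

An empty list means that no counter changed between the two samples. A non-empty list corresponds to the case in which the device has received TC messages.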
Step 4 Check that no interface on the device alternates between Up and Down.
View the log message IFNET/4/IF_STATE to check whether an MSTP-enabled port alternates
between Up and Down.
l If an MSTP-enabled interface alternates between Up and Down, the interface is flapping.
If a physical interface frequently alternates between Up and Down, the MSTP status of the
device on the network becomes unstable. As a result, a large number of TC messages are
generated, ARP entries and MAC entries are frequently deleted, and services are
interrupted. Run the shutdown command on the flapping interface. If services are not
restored after the flapping interface is shut down, go to Step 5.
l If no interface flaps, go to Step 5.
Step 5 Check that the MSTP convergence mode is Normal.

Run the display stp command in any view to check the MSTP convergence mode of the device.
[S1] display stp
-------[CIST Global Info][Mode MSTP]-------
CIST Bridge         :57344.00e0-fc00-1597
Bridge Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC      :0    .0018-826f-fc7a / 20000
CIST RegRoot/IRPC   :57344.00e0-fc00-1597 / 0
CIST RootPortId     :128.2
BPDU-Protection     :disabled
TC or TCN received  :0
STP Converge Mode   :Normal
Time since last TC  :2 days 14h:16m:15s
-------[MSTI 1 Global Info]-------
MSTI Bridge ID      :4096.00e0-fc00-1597
MSTI RegRoot/IRPC   :4096.00e0-fc00-1597 / 0
MSTI RootPortId     :0.0
Master Bridge       :57344.00e0-fc00-1597
Cost to Master      :0
TC received         :0
If the convergence mode is Normal, go to Step 6.
If the convergence mode is Fast, run the stp converge normal command to change the
convergence mode to Normal. If services are not restored after the convergence mode is
changed, go to Step 6.
----End

Relevant Alarms
MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.1 hwMstpiPortStateForwarding
MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.2 hwMstpiPortStateDiscarding
MSTP_1.3.6.1.2.1.17.0.2 TOPOC
Relevant Logs
MSTP/6/RECEIVE_MSTITC
VOSCPU/4/CPU_USAGE_HIGH
5 RRPP Troubleshooting
About This Chapter

5.1 RRPP Loop Occurs Temporarily
5.1 RRPP Loop Occurs Temporarily

5.1.1 Common Causes
After RRPP is configured on a device, a loop occurs temporarily.
l The configuration is incorrect.
l Values of the Failtime timers configured for nodes along the RRPP ring are different.

Temporary RRPP loop troubleshooting is based on the network shown in Figure 5-1.
Figure 5-1 Networking diagram of RRPP
(Diagram: SwitchA, SwitchB, and SwitchC form an RRPP ring. Interface labels shown:
Ethernet0/0/1, Ethernet0/0/2, Ethernet0/0/4, and Ethernet0/0/8.)

l Check that every node on the RRPP ring is correctly configured.
l Check that the Failtime timer of every node on the RRPP ring is set to the same value.
Figure 5-2 Troubleshooting flowchart for the fault that an RRPP loop occurs temporarily
(Flowchart: an RRPP loop occurs temporarily. Check whether every node on the RRPP ring is
correctly configured, and if not, modify the configurations. Then check whether the Failtime
timer of every node on the RRPP ring is set to the same value, and if not, correct the
configurations. If the fault persists, collect information and seek technical support.)

NOTE
Procedure
Step 1 Check that every node on the RRPP ring is correctly configured.
Run the display this command in the RRPP view of each node on the RRPP ring to view RRPP
configurations.
[RouterA-rrpp-domain-region1] display this
#
rrpp domain 1
control-vlan 100
protected-vlan reference-instance 0
timer hello-timer 1 fail-timer 3
ring 1 node-mode master primary-port Ethernet1/0/2 secondary-port Ethernet1/0/4 level 0
ring 1 enable
#
return
Check whether all nodes on the RRPP ring belong to the same domain, whether the nodes are
configured with the same control VLAN ID and instance number, and whether the RRPP ring
has only one master node.
l If all configurations are correct, go to Step 2.

l If any of the preceding configurations is incorrect, correct the RRPP configurations.
For correct configurations, see the chapter "RRPP Configuration" in the NE80E/40E
Configuration Guide - LAN Access and MAN Access.
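The checks in Step 1 can be summarized as a small consistency test over the settings read from each node. The Python sketch below is illustrative only. The dictionary keys, the node names, and the function name are assumptions, and the values have to be copied by hand from the display this output of each node.

```python
def rrpp_config_problems(nodes):
    """Return a list of inconsistencies found across the ring nodes.

    Each node is a dict with the keys name, domain, control_vlan,
    instance, and mode ("master" or "transit").
    """
    problems = []
    for key in ("domain", "control_vlan", "instance"):
        values = {node[key] for node in nodes}
        if len(values) > 1:
            problems.append(f"{key} differs across nodes: {sorted(values)}")
    masters = [node["name"] for node in nodes if node["mode"] == "master"]
    if len(masters) != 1:
        problems.append(f"expected exactly one master node, found {len(masters)}")
    return problems


# Hypothetical ring of three nodes that share domain 1 and control VLAN 100.
ring = [
    {"name": "A", "domain": 1, "control_vlan": 100, "instance": 0, "mode": "master"},
    {"name": "B", "domain": 1, "control_vlan": 100, "instance": 0, "mode": "transit"},
    {"name": "C", "domain": 1, "control_vlan": 100, "instance": 0, "mode": "transit"},
]

print(rrpp_config_problems(ring))
```

An empty list corresponds to the case in which all configurations are correct and the procedure continues with Step 2.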
Step 2 Check that the Failtime timer of every node on the RRPP ring is set to the same value.
Run the display rrpp verbose domain domain-id command in any view to check detailed RRPP
configurations.
[RouterA-rrpp-domain-region1] display rrpp verbose domain 1
Domain Index : 1
Control VLAN : major 20 sub 21
Hello Timer : 1 sec(default is 1 sec) Fail Timer : 3 sec(default is 3 sec)
RRPP Ring : 1
Ring Level : 0
Node Mode : Master
Ring State : Complete
Is Enabled : Enable Is Active : Yes
Primary port : Port status: UP
Secondary port: Port status: BLOCKED
If the Failtime timers of the nodes on the RRPP ring are set to different values, correct the
configurations according to the chapter "RRPP Configuration" in the NE80E/40E
Configuration Guide - LAN Access and MAN Access.
If the Failtime timer of every node on the RRPP ring is set to the same value, go to Step
3.
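Comparing the Fail Timer values of all nodes is easier when the nodes are grouped by the value they report. The Python sketch below assumes that the Fail Timer value of each node has been read from the display rrpp verbose domain output. The node names and values are hypothetical.

```python
from collections import defaultdict


def group_by_fail_timer(timers):
    """Group node names by their Fail Timer value in seconds."""
    groups = defaultdict(list)
    for node, seconds in timers.items():
        groups[seconds].append(node)
    return dict(groups)


def fail_timers_consistent(timers):
    """True if every node on the ring uses the same Fail Timer value."""
    return len(set(timers.values())) <= 1


# Hypothetical readings: SwitchC was left at a different value.
timers = {"SwitchA": 3, "SwitchB": 3, "SwitchC": 6}

print(fail_timers_consistent(timers))
print(group_by_fail_timer(timers))
```

Any group that contains fewer nodes than the rest is a natural starting point when correcting the configurations.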
----End

Relevant Alarms
RRPP_1.3.6.1.4.1.2011.5.25.113.4.2 hwRrppRingFail
Relevant Logs
RRPP/3/FAIL
RRPP/5/PBLK
RRPP/5/RESTORE
6 PPP Troubleshooting
About This Chapter

NOTE
PPP cannot be configured on the X1 and X2 models of the NE80E/40E.
6.1 Protocol Status of a PPP Interface Is Down

6.3 ATM Troubleshooting
This chapter describes common causes of Asynchronous Transfer Mode (ATM) faults, and
provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms,
logs, and commands.
6.4 The Ping Fails on Both Ends of an ATM Network
This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting
procedure for the fault that the ping fails on both ends of an ATM network.
6.5 Troubleshooting Cases
6.1 Protocol Status of a PPP Interface Is Down

6.1.1 Common Causes
After an interface is configured with PPP, LCP negotiation fails, which causes the protocol status
of the interface to be Down.
l PPP configurations on the two ends of the link are incorrect.
l The physical status of the interface is Down.
l PPP packets are discarded.
l A loop occurs on the link.
l The link delay is too long.

l Check that PPP configurations on the two ends of the link are correct.
l Check that the physical status of the interface is Up.
l Check that the interface can send and receive protocol packets.
l Check that the link is loop-free.
l Check that the link delay is tolerable.
Figure 6-1 Troubleshooting flowchart for an LCP negotiation failure
(Flowchart: LCP negotiation fails. Check in turn whether PPP configurations on the two ends of
the link are correct, and if not, modify the PPP configurations; whether the physical status
of the interface is Up, and if not, locate and rectify transmission or lower-layer faults;
whether the interface can send and receive protocol packets, and if not, locate and rectify
transmission or lower-layer faults; whether the link is loop-free, and if not, eliminate the
loop; and whether the link delay is tolerable, and if not, check the link delay. If the fault
persists, seek technical support.)


NOTE
Procedure
Step 1 Check that PPP configurations on the two ends of the link are correct.
Run the display this command in the view of the interface whose protocol status is Down to
check PPP configurations.
[HUAWEI-Pos1/0/1] display this
#
interface Pos1/0/1
link-protocol ppp
undo shutdown
ip address 10.10.1.1 255.255.255.0
#
return
Check whether the following configurations on the two ends match each other. If not,
modify the configuration by referring to the chapter "PPP and MP Configuration" in the
NE80E/40E Configuration Guide - WAN Access.
l Check whether the authenticator and authenticatee are configured with the same
authentication mode. The ppp authentication-mode authentication-mode command
indicates the authentication mode adopted by the local end that functions as the
authenticator. You need to check the authentication mode adopted by the peer. For
example, if the ppp pap local-user user-name password simple password command
has been configured, the peer adopts PAP authentication.
l Check whether both ends are bundled into an MP-group or neither of the two ends is
bundled into an MP-group. If one end has been bundled into an MP-group, the other
end must be bundled into the same MP-group. If the ppp mp mp-group interface-number
command is configured, the interface has been bundled into an MP-group.
l Check whether the authenticator and authenticatee are configured with the same
password for PPP authentication.
If PAP authentication is adopted, do as follows to check the configured user name
and password:
Check the user name and password of the authenticatee in the interface view.
#
interface Pos1/0/0
link-protocol ppp
ppp pap local-user huawei password simple huawei
undo shutdown
#
return
Check the user name and password of the authenticator in the AAA view.
[HUAWEI] aaa
[HUAWEI-aaa] display this
#
aaa
local-user huawei password simple huawei
#
return
If the authenticator adopts CHAP authentication and is configured with a user name,
do as follows to check the user name and password:
Check the user name of the authenticatee in the interface view, and then check the
password in the AAA view based on the user name.
#
interface Pos1/0/0
link-protocol ppp
ppp chap user huawei
undo shutdown
#
return
[HUAWEI-Pos1/0/0] aaa
#
aaa
#
return
[HUAWEI] aaa
#
aaa
#
return
If the authenticator adopts CHAP authentication but is not configured with a user
name, do as follows to check the user name and password:
Check the user name and password of the authenticatee in the interface view.
#
interface Pos1/0/0
link-protocol ppp
ppp chap user huawei
ppp chap password simple huawei
undo shutdown
#
return
[HUAWEI] aaa
#
aaa
#
return
If the preceding configurations are correct but the fault persists, go to Step 2.
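The three comparisons in Step 1 can be sketched as a simple record comparison. The Python example below is a simplified illustration and not a parser for device configurations. The PppEnd type, its field names, and the sample values are assumptions, and the settings must be read manually from the display this output on each end.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PppEnd:
    auth_mode: Optional[str]  # "pap", "chap", or None if no authentication
    mp_group: Optional[str]   # MP-group interface number, or None if unbundled
    username: Optional[str]
    password: Optional[str]


def ppp_mismatches(authenticator, authenticatee):
    """List the settings that differ between the two ends of the link."""
    issues = []
    if authenticator.auth_mode != authenticatee.auth_mode:
        issues.append("authentication mode")
    if authenticator.mp_group != authenticatee.mp_group:
        issues.append("MP-group binding")
    if (authenticator.username, authenticator.password) != (
            authenticatee.username, authenticatee.password):
        issues.append("user name or password")
    return issues


# Hypothetical values: the peer uses PAP while the local end expects CHAP.
local = PppEnd(auth_mode="chap", mp_group=None, username="huawei", password="huawei")
peer = PppEnd(auth_mode="pap", mp_group=None, username="huawei", password="huawei")

print(ppp_mismatches(local, peer))
```

The sketch compares credentials literally. For CHAP without a user name on the authenticator, only the password configured on the authenticatee is relevant, so the comparison would need to be adapted to that case.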
Step 2 Check that the physical status of the interface is Up.

Run the display interface interface-type interface-number command to check the physical status
of the interface.
l If the physical status of the interface is Down, you need to rectify the physical fault of the
interface. For detailed troubleshooting procedures, see "Physical Interconnection
Troubleshooting".
l If the physical status of the interface is Up but the fault persists, go to Step 3.
Step 3 Check that the interface can send and receive protocol packets.
Run the display interface interface-type interface-number command to check the number of
sent packets and received packets to determine whether the interface sends and receives protocol
packets.
[HUAWEI] display interface Pos 1/0/0
Pos1/0/0 current state : UP
Line protocol current state : UP
Last line protocol up time : 2010-02-05 06:35:43
Description:HUAWEI, Pos1/0/0 Interface
Route Port,The Maximum Transmit Unit is 4470, Hold timer is 10(sec)
Internet Address is 108.108.1.1/24
Link layer protocol is PPP
LCP opened, IPCP opened
The Vendor PN is HFBR-57E0P
The Vendor Name is AVAGO
Port BW: 155M, Transceiver max BW: 155M, Transceiver Mode: MultiMode
WaveLength: 1310nm, Transmission Distance: 2000m
Physical layer is Packet Over SDH
Scramble enabled, clock master, CRC-32, loopback: none
Flag J0 "NetEngine       "
Flag J1 "NetEngine       "
Flag C2 22(0x16)
SDH alarm:
section layer: none
line layer: none
path layer: none
SDH error:
section layer: B1 0
line layer: B2 0 REI 44
path layer: B3 0 REI 23
Statistics last cleared:never
Last 300 seconds input rate 24 bits/sec, 0 packets/sec
Last 300 seconds output rate 24 bits/sec, 0 packets/sec
Input: 70945 packets, 1135144 bytes
Input error: 0 shortpacket, 0 longpacket, 0 CRC, 0 lostpacket
Output: 70945 packets, 1135140 bytes
Output error: 0 lostpackets
Output error: 0 overrunpackets, 0 underrunpackets
If the number of received or sent packets is 0, or the number does not increase, it indicates
that packets are discarded during transmission. Check whether the physical connection is
correct. For detailed information, see "Physical Interface Cannot Go Up".
If the physical connection is correct, you need to locate the cause of packet loss. For detailed
troubleshooting procedures, see "Packet Loss Troubleshooting".
If packets can be received and sent but the fault persists, go to Step 4.
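Whether the packet counters increase can be judged by running display interface twice and comparing the Input and Output lines. The Python sketch below illustrates that comparison. It assumes the line format shown in the sample output, and the function names are hypothetical.

```python
import re

# Matches "Input: 70945 packets, ..." but not "Input error: ..." lines.
PACKETS = re.compile(r"^\s*(Input|Output):\s*(\d+) packets", re.MULTILINE)


def packet_counters(output):
    """Extract the input and output packet counts from one command output."""
    return {direction.lower(): int(count)
            for direction, count in PACKETS.findall(output)}


def stalled_directions(before, after):
    """Directions whose packet count is 0 or did not increase."""
    old, new = packet_counters(before), packet_counters(after)
    return [d for d in ("input", "output")
            if new.get(d, 0) == 0 or new.get(d, 0) <= old.get(d, 0)]


first = """Input: 70945 packets, 1135144 bytes
Input error: 0 shortpacket, 0 longpacket, 0 CRC, 0 lostpacket
Output: 70945 packets, 1135140 bytes
Output error: 0 lostpackets"""

# Hypothetical second sample: only the output counter has grown.
second = """Input: 70945 packets, 1135144 bytes
Input error: 0 shortpacket, 0 longpacket, 0 CRC, 0 lostpacket
Output: 70950 packets, 1135220 bytes
Output error: 0 lostpackets"""

print(stalled_directions(first, second))
```

A direction reported by the helper corresponds to the case in which the number of received or sent packets is 0 or does not increase.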
CAUTION
Debugging affects the system performance. Therefore, after debugging, run the undo debugging
all command to disable it immediately.
In addition, you can run the debugging ppp all interface interface-type interface-number
command to check the number of sent and received protocol packets and the status changes of
the PPP state machine.
Jun 2 2010 17:19:41.310.1 HUAWEI PPP/7/debug2:Slot=1;
PPP Event:
Pos1/0/0 LCP TO+(Timeout with counter > 0) Event
state acksent , Retransmit = 4
PPP Packet:
Pos1/0/0 Output LCP(c021) Pkt, Len 18
State acksent, code ConfReq(01), id 3, len 14
MRU(1), len 4, val 1176
MagicNumber(5), len 6, val 00abb891
PPP Packet:
Pos1/0/0 Input LCP(c021) Pkt, Len 18
State acksent, code ConfAck(02), id 3, len 14
MRU(1), len 4, val 1176
MagicNumber(5), len 6, val 00abb891
PPP Event:
Pos1/0/0 LCP RCA(Receive Config Ack) Event
state acksent
Step 4 Check that the link is loop-free.

Run the display interface interface-type interface-number command to check whether a loopback
is detected on the interface.
[HUAWEI] display interface Pos 1/0/2
Pos1/0/2 current state : UP
Line protocol current state : DOWN
Description:HUAWEI, Pos1/0/2 Interface
Route Port,The Maximum Transmit Unit is 4470, Hold timer is 10(sec)
Internet protocol processing : disabled
Link layer protocol is PPP, loopback is detected

LCP closed
The Vendor PN is HFBR-57E0P
The Vendor Name is AVAGO
Port BW: 155M, Transceiver max BW: 155M, Transceiver Mode: MultiMode
WaveLength: 1310nm, Transmission Distance: 2000m
Physical layer is Packet Over SDH
Scramble enabled, clock master, CRC-32, loopback: local
Flag J0 "NetEngine       "
Flag J1 "NetEngine       "
Flag C2 22(0x16)
SDH alarm:
section layer: none
line layer: none
path layer: none
SDH error:
section layer: B1 22
line layer: B2 94 REI 145
path layer: B3 44 REI 86
Statistics last cleared:never
Last 300 seconds input rate 56 bits/sec, 0 packets/sec
Last 300 seconds output rate 56 bits/sec, 0 packets/sec
Input: 40530 packets, 890400 bytes
Input error: 0 shortpacket, 0 longpacket, 2 CRC, 0 lostpacket
Output: 36512 packets, 946612 bytes
Output error: 0 lostpackets
Output error: 0 overrunpackets, 0 underrunpackets
l If "loopback is detected" is displayed, a loop exists on the link. Locate the cause of the loop
and eliminate it.
l If no loop exists but the fault persists, go to Step 5.
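The check in this step can be run against saved command output. The following Python sketch is an
illustration under stated assumptions: the output has been saved as text in the format shown above,
and the function names loop_detected and loopback_setting are hypothetical.

```python
import re


def loop_detected(display_output):
    """Return True if the saved output reports a loop on the PPP link."""
    return any("loopback is detected" in line
               for line in display_output.splitlines())


def loopback_setting(display_output):
    """Return the value after 'loopback:' (for example 'local'), or None."""
    match = re.search(r"loopback:\s*(\w+)", display_output)
    return match.group(1) if match else None


# Sample lines copied from the display interface output above.
sample = """\
Pos1/0/2 current state : UP
Line protocol current state : DOWN
Link layer protocol is PPP, loopback is detected
LCP closed
Scramble enabled, clock master, CRC-32, loopback: local
"""
print(loop_detected(sample))
print(loopback_setting(sample))
```

In the sample, the loopback field reads local, which suggests that a loopback configured on the local
interface is one possible cause to rule out first.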
Step 5 Check that the link delay is tolerable.

Use a tester to test the link delay. On a Huawei router, the transmission of a PPP packet times
out after 3 seconds by default, and the timeout period is configurable. The link delay must be
shorter than the timeout period.
l If the link delay is too long, replace or maintain the relevant device.
l If the link delay is tolerable but the fault persists, go to Step 6.
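The comparison in this step is simple arithmetic. The following Python sketch assumes that the tester
reports the delay in seconds and that the measured value is compared directly with the timeout
period; the function name delay_is_tolerable is hypothetical.

```python
DEFAULT_PPP_TIMEOUT_S = 3.0  # timeout stated in the text; configurable on the router


def delay_is_tolerable(measured_delay_s, timeout_s=DEFAULT_PPP_TIMEOUT_S):
    """Return True if the measured link delay is shorter than the PPP timeout."""
    if measured_delay_s < 0 or timeout_s <= 0:
        raise ValueError("delay must be >= 0 and timeout must be > 0")
    return measured_delay_s < timeout_s


print(delay_is_tolerable(0.25))               # well below the 3-second timeout
print(delay_is_tolerable(3.5))                # longer than the 3-second timeout
print(delay_is_tolerable(3.5, timeout_s=5))   # tolerable if the timeout is raised
```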
Step 6 Collect the following information and contact Huawei technical support personnel:
l Configuration files, log files, and alarm files of the devices
----End

Relevant Alarms
None.
Relevant Logs
None.

6.2.1 Half of Packets Are Discarded in the First Minute After an MP-Group Link Fails
Fault Symptom
Router A and Router B are connected by using an MP-group interface consisting of two CE1
interfaces. All timeslots of each CE1 interface are bundled into a synchronous serial interface
to transmit services.
Figure 6-2 Networking for half of packets discarded the first minute after an MP-Group link
fails
[Diagram: RouterA and RouterB are connected through an MP-Group link whose member interfaces are
CE1 1/0/0 and CE1 1/0/1 on each router.]
After the link of CE1 1/0/0 fails, 50% of the packets are dropped during the first minute. Within
about two minutes, traffic recovers and no more packets are dropped.
Fault Analysis
After an MP-group member link becomes unavailable, the traffic transmitted over this link
automatically switches to the other available link.
In addition, a PPP-enabled router sets the PPP status of a link to Down only after failing to
receive 10 consecutive Hold time messages.
A Hold time message is sent every 10 seconds by default, so it takes 100s for 10 Hold time
messages to be missed. This is why the peer router detects that the link is Down only about 100s
(nearly two minutes) after the link fails.
Router B continues to send packets along the failed link for about 100s after the failure occurs.
As a result, half of the packets are dropped during this period. After Router A sets the PPP status
to Down, Router B detects that the link is Down and no longer sends packets along the failed
link, so no more packets are dropped.
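The arithmetic in this analysis can be sketched in Python as follows. The sketch takes the values
from the text (10 missed Hold time messages, 10-second default interval) and assumes that traffic is
spread evenly over the member links and that the count of 10 stays the same when the interval is
changed; the function names are hypothetical.

```python
MISSED_HOLD_MESSAGES = 10  # consecutive misses before PPP is set to Down (from the text)


def detection_time_s(hold_interval_s, missed=MISSED_HOLD_MESSAGES):
    """Approximate time, in seconds, before the link is declared Down."""
    return hold_interval_s * missed


def expected_drop_fraction(failed_links, total_links):
    """Fraction of traffic lost while it is still sent over failed member links.

    Assumes traffic is spread evenly across all member links.
    """
    return failed_links / total_links


print(detection_time_s(10))          # default interval: 100
print(detection_time_s(1))           # interval of 1 second: 10
print(expected_drop_fraction(1, 2))  # one of two CE1 links failed: 0.5
```

With an interval of 1 second, the period during which half of the packets are dropped shrinks from
about 100s to about 10s under these assumptions.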
To rectify the fault, shorten the interval at which Hold time messages are sent.
Run the following commands on the routers at both ends of the PPP link:
Procedure
Step 1 Run the system-view command to enter the system view.
Step 2 Run the interface serial 1/0/0:0 command to enter the view of the synchronous serial interface
composed of CE1 interfaces.
Step 3 Run the timer hold 1 command to set the interval at which a Hold time message is sent to 1
second.
Step 4 Run the shutdown and undo shutdown commands to restart the CE1 interface.
After the configurations, run the ping command on Router A or Router B to ping the peer. No
packet is dropped. The fault is then rectified.
----End
Summary
On an unstable network, set a shorter interval for sending Hold time messages on both ends of a
PPP link so that link status changes are detected quickly.
6.2.2 Incorrect Configurations of Member Interfaces Cause the MP-Group Link to Work Unstably
Fault Symptom
RouterA functions as the egress of the network and accesses the Internet through RouterB.
RouterA and RouterB communicate through an MP-group link that passes through a
transmission device. The MP-group link is formed by bundling eight CE1 links.
Figure 6-3 Networking diagram of MP-Group
[Diagram: Users - RouterA - MP-Group - Transmission Device - MP-Group - RouterB - Internet]
When a user attached to RouterA accesses the Internet, pings to the DNS server sometimes fail,
long packet delays and packet loss occur frequently, and web pages sometimes cannot be
accessed.
Fault Analysis
1. On the transmission device, check the status of the physical interface and optical fiber. Both
are found to be normal.
2. On RouterA and RouterB, run the display interface command to view information about
the MP-group interface and its member interfaces.
Check the following information:
l Whether the interface becomes Up
l Whether the clock mode is correct
l Whether the line coding/decoding formats on both ends are consistent
l Whether loopback is configured
l Whether alarms are generated
[RouterA] display controller e1 1/0/0
E1 1/0/0 current state : UP
Description : E1 1/0/0 Interface
Physical layer is Packet Over NO-CRC4
clock slave, linecode hdb3 , loopback none
section layer:
alarm: LOS
line layer:
alarm: none
path layer:
alarm: none
[RouterB] display controller e1 1/0/0
E1 1/0/0 current state : UP
Description : E1 1/0/0 Interface
Physical layer is Packet Over NO-CRC4
clock master, linecode hdb3 , loopback none
section layer:
alarm: LOS
line layer:
alarm: none
path layer:
alarm: none
When a transmission device is on the link, the router needs to obtain clock signals from the
transmission device, which requires the router to work in slave clock mode.
The CE1 interface on RouterA is configured to work in slave clock mode, whereas the CE1
interface on RouterB is configured to work in master clock mode. Therefore, the link is
unstable due to incorrect clock configuration.
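With eight member links on each router, the clock mode comparison is easier to do on saved output
than by eye. The following Python sketch assumes that the output of each member interface has been
saved as text in the format shown above; the function names and the dictionary keys are hypothetical.

```python
import re

# Clock field, for example "clock slave, linecode hdb3 , loopback none".
CLOCK_RE = re.compile(r"\bclock (master|slave)\b")


def clock_mode(controller_output):
    """Return 'master' or 'slave' from saved controller output, or None."""
    match = CLOCK_RE.search(controller_output)
    return match.group(1) if match else None


def members_not_slave(outputs_by_interface):
    """List the interfaces whose clock mode is not 'slave'.

    With a transmission device on the link, every router-side CE1
    interface is expected to work in slave clock mode.
    """
    return [name for name, text in outputs_by_interface.items()
            if clock_mode(text) != "slave"]


# Clock lines copied from the two outputs above.
outputs = {
    "RouterA E1 1/0/0": "clock slave, linecode hdb3 , loopback none",
    "RouterB E1 1/0/0": "clock master, linecode hdb3 , loopback none",
}
print(members_not_slave(outputs))
```

For the two outputs above, only the interface on RouterB is reported, which matches the analysis.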
Procedure
Step 1 Run the system-view command on RouterB to enter the system view.
Step 2 Run the controller e1 1/0/0 command to enter the CE1 interface view.
Step 3 Run the clock slave command to change the clock mode of the CE1 interface to the slave clock
mode.
Step 4 Repeat Step 2 and Step 3 for each member interface of the MP-group interface.
After the configuration, check the status of the CE1 interfaces on the routers on both ends. The
link is stable and packets are transmitted and received normally. The fault is rectified.
----End
Summary
Functioning as the synchronous interface, the CE1 interface can work in data terminal equipment
(DTE) mode and data circuit-terminating equipment (DCE) mode. In addition, the interface
requires a clock mode.
On the CE1 interface, you can select one of the following clock modes:
l Master clock mode: uses the internal clock signal.
l Slave clock mode: uses the line clock signal.
When the CE1 interfaces of two devices are directly connected, one interface should be
configured to work in master clock mode and the other interface should be configured to work
in slave clock mode.
When the CE1 interface of a device is connected to a transmission device, the CE1 interface
must be configured to work in slave clock mode so that it obtains clock signals from the
transmission device.
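The two rules above can be expressed as a small check. The following Python sketch is a minimal
illustration that assumes both routers attach to the same transmission device when one is present;
the function name clock_config_ok is hypothetical.

```python
def clock_config_ok(local_mode, peer_mode, via_transmission_device):
    """Check a pair of CE1 clock modes against the rules in this summary."""
    modes = {local_mode, peer_mode}
    if not modes <= {"master", "slave"}:
        raise ValueError("clock mode must be 'master' or 'slave'")
    if via_transmission_device:
        # Both routers take the line clock from the transmission device.
        return modes == {"slave"}
    # Directly connected: exactly one master and one slave.
    return modes == {"master", "slave"}


print(clock_config_ok("slave", "master", via_transmission_device=True))
print(clock_config_ok("slave", "slave", via_transmission_device=True))
print(clock_config_ok("master", "slave", via_transmission_device=False))
```

The first call corresponds to the faulty configuration in this case, and the second call to the
configuration after RouterB is changed to slave clock mode.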
6.3 ATM Troubleshooting

This chapter describes common causes of Asynchronous Transfer Mode (ATM) faults, and
provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms,
logs, and commands.
6.4 The Ping Fails on Both Ends of an ATM Network

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting
procedure for the fault that the ping fails on both ends of an ATM network.
6.4.1 Common Causes

l The physical link goes Down.
l The protocol goes Down.

Figure 6-4 Troubleshooting flowchart for a ping failure on both ends of an ATM network
[Flowchart: Two ends of the ATM network cannot access each other. Check in sequence that the
optical fiber is correctly connected, that the VPI/VCI values on both ends are the same, that the
protocol types and encapsulation types are the same, and that the bandwidths and service types
are the same. After correcting each mismatch, check whether the fault is rectified. If the fault
persists after all checks, ask for technical support.]

Procedure
Step 1 Check that the optical fiber is properly installed.
l If the optical fiber is not properly installed, see "Physical Interconnection Troubleshooting."
l If the optical fiber is properly installed, go to Step 2.
Step 2 Check that PVCs on the interfaces on both ends are the same.
The PVCs on the interfaces on both ends must be the same because the ATM interfaces use the
PVCs for communication. If the PVCs are configured with IPoA, check whether the protocol
address mapping is correctly configured. If two devices are directly connected, the PVC mapped
from the local to the peer IP address on the local end must be the same as the PVC mapped from
the peer to the local IP address on the peer end.
l If the PVCs on both ends are different, change them to be the same. If the fault persists, go
to Step 3.
l If the PVC values on both ends are the same, go to Step 3.
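The comparison in this step can be sketched in Python as follows. The sketch assumes that the two
devices are directly connected and that the VPI/VCI pairs and IPoA mappings have already been read
from the configuration of each end; all function names, addresses, and VPI/VCI values are
hypothetical.

```python
def pvc_mismatches(local_pvcs, peer_pvcs):
    """Compare the (VPI, VCI) pairs configured on two directly connected ends."""
    local, peer = set(local_pvcs), set(peer_pvcs)
    return {
        "only_local": sorted(local - peer),
        "only_peer": sorted(peer - local),
    }


def ipoa_maps_consistent(local_ip, peer_ip, local_map, peer_map):
    """Check that both ends map each other's IP address to the same PVC.

    local_map: {remote IP: (VPI, VCI)} configured on the local end.
    peer_map:  {remote IP: (VPI, VCI)} configured on the peer end.
    """
    local_pvc = local_map.get(peer_ip)
    peer_pvc = peer_map.get(local_ip)
    return local_pvc is not None and local_pvc == peer_pvc


# Hypothetical values for illustration only.
print(pvc_mismatches([(0, 100)], [(0, 101)]))
print(ipoa_maps_consistent(
    "10.1.1.1", "10.1.1.2",
    local_map={"10.1.1.2": (0, 100)},
    peer_map={"10.1.1.1": (0, 100)},
))
```

An empty result from pvc_mismatches together with a True result from ipoa_maps_consistent
corresponds to the case in which you go to Step 3.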
Step 3 Check that the configurations and protocol types on the ATM interfaces on the two ends are the
same.
Check that the protocols configured for the PVCs on the ATM interfaces on the two ends are
the same. If the routing protocol over ATM is Open Shortest Path First (OSPF), the mapping
entry configured on the ATM interfaces must contain the broadcast keyword. If the ATM interfaces
on the two ends can ping each other with small packets but not with large packets, check that the
MTUs on the ATM interfaces are the same. If the MTUs are different, set them to the same value.
If the fault persists, go to Step 4.
Step 4 Collect the following information and contact Huawei technical support personnel:
l Configuration, log, and alarm files
----End

Relevant Alarms
None.
Relevant Logs
None.
6.5 Troubleshooting Cases

6.5.1 OSPF Neighbor Relationship Cannot Be Set Up Between P2P ATM Sub-interfaces
Fault Symptom
As shown in Figure 6-5, two P2P ATM sub-interfaces are directly connected. The P2P ATM
sub-interfaces can ping each other successfully, but the OSPF neighbor relationship cannot be
set up between them.
Figure 6-5 Networking diagram for a fault that the OSPF neighbor relationship cannot be set
up between two P2P ATM sub-interfaces
[Diagram: RouterA (ATM3/0/0.1) is directly connected to RouterB (ATM3/0/0.1) in Area 0.]
Fault Analysis
On a P2P network, OSPF packets are sent to a multicast address, so the PVC must allow
broadcast packets.
The PVC configurations on both ATM sub-interfaces show that the IPoA mapping is configured
without the broadcast keyword, which OSPF requires. As a result, the OSPF neighbor relationship
cannot be set up between the two ATM sub-interfaces.
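A saved configuration can be scanned for mappings that lack the keyword. The following Python
sketch assumes that the PVC configuration has been saved as text lines that start with map ip; the
function name maps_missing_broadcast and the IP addresses are hypothetical.

```python
def maps_missing_broadcast(pvc_config_lines):
    """Return the 'map ip' lines that lack the broadcast keyword."""
    missing = []
    for line in pvc_config_lines:
        words = line.split()
        if words[:2] == ["map", "ip"] and "broadcast" not in words[2:]:
            missing.append(line.strip())
    return missing


# Hypothetical configuration lines for illustration only.
config = [
    " map ip 10.1.1.2",
    " map ip 10.1.2.2 broadcast",
]
print(maps_missing_broadcast(config))
```

Any line that the function returns identifies a PVC over which OSPF packets cannot be sent, even
though unicast pings succeed.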
Procedure
Step 1 Run the system-view command to enter the system view.
Step 2 Run the interface atm interface-number.sub-number command to enter the ATM sub-interface
view.
Step 3 Run the pvc vpi/vci command to enter the PVC view.
Step 4 Run the map ip ip-address broadcast command to allow packets to be broadcast over the PVC.
After the preceding configuration is complete, the OSPF neighbor relationship is set up
successfully.
----End
Summary
Note the following points when configuring ATM:
l The encapsulation types and protocol types must be the same on both ends of the ATM
PVC.
l The PVC types are determined by the upper layer protocols.