
SDH Maintenance Guide for Huawei MSTP Product Family

Routine Maintenance Requirements

The following operations are recommended in routine maintenance to identify and eliminate potential network risks and to facilitate troubleshooting.

Essential Skills
Master basic SDH principles.
Understand the MSTP product system and hardware.
Know the methods for handling common alarms.
Master database backup and restoration methods.
Master basic operations on common instruments.

Networking Information
Be familiar with network topologies.
Master service configurations.
Know the equipment running status.
Be familiar with engineering documents.

Equipment Status
Handle alarms and exceptions in time.
Monitor link performance.
Check the protection group status.

Maintenance and Backup
Back up NE databases periodically.
Manage NE and NMS accounts.
Manage spare parts.

Troubleshooting SNCP Switching Faults

Flowchart steps:
Troubleshoot SNCP switching faults.
Check whether SNCP ring configurations are complete when the fault occurs.
Check whether interrupted services are switched over.
Disconnect the working path to change the SNCP ring to bidirectional links.
Check whether services are normal.
Restore the working path and disconnect the protection path to change the SNCP ring to bidirectional links.

Troubleshooting SDH Service Interruptions

Flowchart steps:
Troubleshoot service interruptions.
Check whether equipment alarms exist. If yes, reset, reseat, or replace the board.
Check whether temperature alarms exist. If yes, replace the fan or air filter.
Check whether protection switching fails. If yes, troubleshoot protection switching faults.

Loopback

Application scenario: This method can be used to check different network segments to locate a fault.
Basic method: Perform loopbacks at different points and test services to narrow down the fault scope.
Key point: Analyze the signal flow and perform loopback tests segment by segment.
Example: [Figure: Networking and loopback. The figure shows a test meter and NE (board)-1 through NE (board)-5 with loopback points A and B, and a test result (OK or ERR) for each loopback.]
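The segment-by-segment loopback method amounts to a search for the first loopback point at which the test fails. The Python sketch below is illustrative only: the function name `locate_faulty_section`, the point labels, and the assumption that loopback points are listed in order from the test meter outward are not taken from the guide. It also assumes a single, stable fault.

```python
from typing import Optional, Sequence, Tuple


def locate_faulty_section(
    results: Sequence[Tuple[str, bool]],
) -> Optional[Tuple[str, str]]:
    """Return (last good point, first failing point), or None if all tests pass.

    `results` lists loopback points in order from the test meter outward.
    Each entry is (point_name, test_ok), where test_ok is True for an OK
    test result and False for an ERR result.
    """
    previous = "test meter"  # hypothetical label for the start of the path
    for point, test_ok in results:
        if not test_ok:
            # The fault lies between the last good point and this one.
            return (previous, point)
        previous = point
    return None


# Illustrative data: loopback at A tests OK, loopback at B tests ERR.
print(locate_faulty_section([("A", True), ("B", False)]))
```

If every loopback tests OK, the function returns `None`, which suggests that the fault lies beyond the last loopback point or is intermittent.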

[Figure: Analysis row of the loopback example, showing points B and A and the faulty section.]

Troubleshooting SNCP Switching Faults (flowchart steps, continued)
Perform loopbacks site by site to locate the faulty site.
Check whether services are normal.
Check whether fibers are correctly connected on the faulty site. If not, re-connect the fibers.
Check whether SNCP services are properly configured on the faulty site.
Check whether trigger conditions are correct. If not, correct the trigger conditions.
Check whether services are properly configured on the source and sink sites. If not, properly configure SNCP services on the source and sink sites.

Troubleshooting SDH Service Interruptions (flowchart steps, continued)
Check whether line alarms exist.
Check whether the receive optical power is normal. If not, correct fiber connections or problems on the peer site.
Check whether line alarms are reported after self-loop on the local site. If yes, handle the fault on the local site.
Check whether higher order path alarms exist.
Check whether J1 and C2 bytes are the same as those on the local site. If not, modify and re-deliver the configurations.
Check whether the T_ALOS alarm exists.
Check whether a T_ALOS alarm is cleared after self-loop on the local site. If yes, handle cable, port, or interconnection problems.

Basic Principles for Locating Faults

Before handling a fault, collect and record live network data. Fault locating gradually narrows down the scope and finally identifies the faulty site.

Check external causes before internal faults
Rule out external causes, for example, power failure, water corrosion, cable damage, or peer equipment faults.

Check the network before NEs
Locate a fault to a site by analyzing the fault symptoms.

Component Replacement

Application scenario: This method can be used to locate faults on passive nodes or handle complex faults.
Basic method: Replace components suspected to be faulty to locate and rectify a fault. Replaceable components include fibers, network cables, boards, switches, and power supplies.
Key point: Analyze the signal flow before using this method.

Instruments and Tools

Application scenario: This method can be used to demarcate external or internal faults, locate interconnection problems, and check link performance.
Basic method: Use the BER tester, SDH analyzer, and optical power meter to locate a fault.
Key point: Understand how to properly use these instruments.
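When a BER tester is used to check link performance, the reading is the number of errored bits divided by the total number of bits sent during the test. The Python sketch below shows that arithmetic for the standard SDH line rates. The function name `bit_error_ratio` and the sample figures are illustrative assumptions. The guide does not state an acceptance threshold, so none is assumed here.

```python
# Standard SDH line rates in bit/s.
STM_RATES_BPS = {
    "STM-1": 155_520_000,
    "STM-4": 622_080_000,
    "STM-16": 2_488_320_000,
    "STM-64": 9_953_280_000,
}


def bit_error_ratio(errored_bits: int, rate_bps: int, seconds: float) -> float:
    """Return errored bits divided by total bits transmitted in the interval."""
    if errored_bits < 0:
        raise ValueError("errored_bits must not be negative")
    if rate_bps <= 0 or seconds <= 0:
        raise ValueError("rate_bps and seconds must be positive")
    return errored_bits / (rate_bps * seconds)


# Illustrative reading: 3 errored bits counted over 60 seconds on an STM-1 link.
ber = bit_error_ratio(3, STM_RATES_BPS["STM-1"], 60.0)
print(f"BER = {ber:.2e}")
```

A longer test interval gives a more meaningful result at low error rates, because very few errored bits are expected in a short interval.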

Troubleshooting MSP Switching Faults

Flowchart steps:
Troubleshoot MSP switching faults.
Check whether the MSP protocol status is normal.
Check whether protection switching occurs on the cross-connect boards.
Troubleshoot the active cross-connect board.
Troubleshoot the standby cross-connect board.
Check whether MSP parameters are correct. If not, modify the parameters.
Check whether the problem is caused by manual operations. If yes, eliminate human factors.

Troubleshooting SNCP Switching Faults (flowchart steps, continued)
Check whether switching trigger information is received.
Check whether services are properly configured on pass-through sites. If not, properly configure services on pass-through sites.

Troubleshooting SDH Service Interruptions (flowchart steps, continued)
If the T_ALOS alarm persists after the self-loop, handle the fault on the local site.
Check whether lower order path alarms exist. If yes, perform loopbacks site by site to locate the fault.
Check whether service configurations are correct. If not, modify service configurations.
Check whether a loopback or unloaded byte exists. If yes, cancel the configurations.
Perform loopbacks segment by segment on the interrupted path.

Remaining flowchart steps (SNCP switching faults and SDH service interruptions)
Check whether hardware is faulty.
Check whether the line board is faulty. If yes, reseat or replace the line board.
Check whether the cross-connect board is faulty. If yes, reseat or replace the cross-connect board.
Check whether the fault is rectified. If not, contact Huawei engineers.

Basic Principles for Locating Faults (continued)

Analyze high-severity alarms before low-severity ones
Analyze critical and major alarms before minor alarms and warnings.
[Figure: alarm severity levels: Critical alarm, Major alarm, Minor alarm, Warning.]

Check high-speed services before low-speed services
High-speed signal alarms generally cause low-speed signal alarms. Therefore, clear high-speed signal alarms first.
For example, most tributary alarms are cleared after line faults are rectified.
[Figure: a VC-4 and its VC-12s.]
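The two ordering principles above (high severity first, high-speed level first) can be combined into one sort key. The Python sketch below is a minimal illustration. The `Alarm` class, the `triage` function, the level names in `LEVEL_ORDER`, and the severities assigned to the sample alarms are assumptions made for the example, not values from the guide or from any NMS.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Severity levels named in the guide, highest first.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2, "warning": 3}
# Hypothetical signal levels, highest speed first.
LEVEL_ORDER = {"line": 0, "higher order path": 1, "lower order path": 2}


@dataclass(frozen=True)
class Alarm:
    name: str
    severity: str
    level: str


def triage(alarms: Iterable[Alarm]) -> List[Alarm]:
    """Sort alarms by severity first, then by signal level (high speed first)."""
    return sorted(
        alarms,
        key=lambda a: (SEVERITY_ORDER[a.severity], LEVEL_ORDER[a.level]),
    )


# Sample alarms; the severities shown here are illustrative only.
active = [
    Alarm("TU_AIS", "major", "lower order path"),
    Alarm("R_LOS", "critical", "line"),
    Alarm("AU_AIS", "major", "higher order path"),
    Alarm("LP_RDI", "minor", "lower order path"),
]
for alarm in triage(active):
    print(alarm.name, alarm.severity, alarm.level)
```

Sorting on a tuple keeps severity as the primary criterion and uses the signal level only to order alarms of equal severity.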
Troubleshooting MSP Switching Faults (flowchart steps, continued)
Restart the MSP protocol and check whether protection switching is normal.
Check whether hardware is faulty.
Check whether the line board is faulty. If yes, reseat or replace the line board.
Check whether the cross-connect board is faulty. If yes, reseat or replace the cross-connect board.
Check whether the fault is rectified. If not, contact Huawei engineers.

Common Methods for Locating Faults

Alarm Analysis

Application scenario: Alarms carry fault information. When a fault occurs, check the alarms reported by the equipment that may be faulty.
Basic method: Check current and historical alarms, fault symptoms, and fault time to narrow down the fault scope and finally locate the fault to a site or module.
Key point: Handle alarms based on their severities and types:
Hardware alarms (HARD_BAD, BUS_ERR, FAN_FAIL, POWER_FAIL, CHIP_FAIL)
Link alarms (R_LOS, R_LOC, R_LOF, AU_AIS, ETH_LOS)
Protection alarms (MS_AIS, APS_FAIL, APS_INDI)
Temperature alarms (TEMP_ALARM, TEMP_OVER)

Configuration Data Analysis

Application scenario: This method can be used to locate a fault that occurred during site deployment, capacity expansion, or link adjustment.
Basic method: Check whether NE configurations are correct.
Key point: Check whether NE configurations are correct, including the MSP node parameters, loopback settings of the line and tributary boards, protection for tributary paths, and path trace identifiers.

SDH Alarm Signal Flow

[Figure: alarm signal flow through the functional blocks SPI, RST, MST, MSA, HPT, HPA, and LPT. The figure marks several points with "1" and AIS. Alarms and the overhead bytes they are derived from, in the order shown:]
LOS
LOF (A1, A2)
J0_MM (J0)
RS_BIP Err. (B1)
MS_AIS (K2)
MS_BIP Err. (B2)
MS_REI (M1)
MS_RDI (K2)
AU_AIS (H1, H2, H3)
AU_LOP (H1, H2)
HP_SLM, HP_UNEQ (C2)
HP_TIM (J1)
HP_BIP Err. (B3)
HP_REI (G1)
HP_RDI (G1)
TU_AIS (V1, V2, V3)
TU_LOP (V1, V2)
HP_LOM (H4)
LP_UNEQ (V5)
LP_TIM (J2)
LP_BIP Err. (V5)
LP_REI (V5)
LP_RDI (V5)
LP_SLM (V5)

Functional blocks:
SPI: SDH physical interface
RST: regenerator section termination
MST: multiplex section termination
MSA: multiplex section adaptation
HPT: higher order path termination
HPA: higher order path adaptation
LPT: lower order path termination

Overhead bytes:
A1, A2: framing byte
J0: regenerator section trace byte
B1: regenerator section bit error monitor byte
K2: automatic protection switching byte
B2: multiplex section bit error monitor byte
M1: multiplex section remote block error indication
H1, H2, H3: pointer justification byte
C2: signal label byte
J1: higher order path trace byte
B3: higher order path bit error monitor byte
G1: path status byte
V1, V2, V3: lower order path pointer indication byte
H4: multiframe indication byte
V5: path status and signal label byte
J2: lower order path identifier

NOTE
In the figure, one symbol indicates that an alarm is generated and another indicates that an alarm is detected.
AIS: inserts all 1s into the lower level service to indicate to the remote end that the service is unavailable. Common AIS alarms include MS_AIS, AU_AIS, TU_AIS, and E1/T1 AIS.
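Because an upstream defect inserts AIS into the lower level service, the alarm that appears earliest in the signal flow (LOS first, lower order path alarms last) is usually the best starting point. The Python sketch below picks that alarm from a set of active alarms. It is a heuristic illustration only: the function name `first_alarm_to_check` and the use of list position as a rank are assumptions, the bit error entries drop the "Err." suffix, and remote indications such as REI and RDI, which originate at the far end, would need separate treatment in a real analysis.

```python
from typing import Iterable, Optional

# Alarm names in the order they appear along the SDH alarm signal flow.
FLOW_ORDER = [
    "LOS", "LOF", "J0_MM", "RS_BIP",
    "MS_AIS", "MS_BIP", "MS_REI", "MS_RDI",
    "AU_AIS", "AU_LOP",
    "HP_SLM", "HP_UNEQ", "HP_TIM", "HP_BIP", "HP_REI", "HP_RDI",
    "TU_AIS", "TU_LOP", "HP_LOM",
    "LP_UNEQ", "LP_TIM", "LP_BIP", "LP_REI", "LP_RDI", "LP_SLM",
]
_RANK = {name: index for index, name in enumerate(FLOW_ORDER)}


def first_alarm_to_check(active: Iterable[str]) -> Optional[str]:
    """Return the active alarm that is earliest in the signal flow.

    Alarm names not listed in FLOW_ORDER are ignored. Returns None when
    no listed alarm is active.
    """
    known = [name for name in active if name in _RANK]
    if not known:
        return None
    return min(known, key=_RANK.__getitem__)


# Illustrative set of active alarms on one NE.
print(first_alarm_to_check(["TU_AIS", "LP_RDI", "AU_AIS"]))
```

In this example the AU-level alarm is selected ahead of the TU-level and lower order path alarms, which matches the principle of clearing high-speed signal alarms first.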