RTN Microwave: Huawei Transport Network Maintenance Reference
RTN Microwave
Issue 03
Date 2016-06-30
Huawei and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: http://www.huawei.com
Email: support@huawei.com
Overview
To assist maintenance engineers in troubleshooting, this document describes how to
troubleshoot OptiX RTN products. It is organized as follows:
l Basic principles and common methods for locating faults
This chapter describes basic principles and common methods for locating faults. Each
method is illustrated using an example.
l Troubleshooting process and guide
This chapter describes the general troubleshooting process, fault categories, and how to
diagnose each category of faults.
l Equipment interworking guide
This chapter provides criteria for correct interworking between OptiX RTN products and
other products, and methods used for locating interworking faults.
l Typical cases
This chapter provides typical troubleshooting cases for helping maintenance personnel
improve their fault diagnosis capabilities.
l Appendix
This chapter provides references.
Intended Audience
This document is intended for:
l Technical support engineers
l Maintenance engineers
Symbol Conventions
The symbols that may be found in this document are defined as follows.
General Conventions
The general conventions that may be found in this document are defined as follows.
Update History
Updates between document issues are cumulative. Therefore, the latest document issue
contains all updates made in previous issues.
Contents
4 Typical Cases
4.1 List of Cases
4.2 Radio Link Faults
4.2.1 Radio Link Interruptions Due to Multipath Fading
4.2.2 Service Bit Errors Due to Interference to Radio Links
4.2.3 Intermittent Link Interruptions Caused by IF Interference
A Appendix
A.1 Distribution of Rain Zones
A.2 Refractivity Gradient
This chapter describes basic principles and common methods for locating faults. Each method
is illustrated using an example.
1.1 Basic Principles for Locating Faults
1.2 Common Methods for Locating Faults
1.3 Signal Flow Analysis
1.4 Alarm and Performance Analysis
1.5 Receive and Transmit Power Analysis
1.6 Loopback
1.7 Replacement
1.8 Configuration Data Analysis
1.9 Tests Using Instruments and Tools
1.10 RMON Performance Analysis
1.11 Network Planning Analysis
Description
Because a transmission equipment fault can affect services over a large area, fault locating
aims to narrow down the most likely fault areas.
Table 1-1 lists the basic principles for locating faults. These principles are summarized based
on characteristics of transmission equipment.
External first, transmission next: Rule out external faults first, for example, faults on power
supply equipment or interconnected equipment, or cable damage.
Network first, NE next: Locate a fault to a radio site or a radio hop based on fault symptoms.
High-severity alarms first, low-severity alarms next: First handle high-severity alarms, such
as critical alarms and major alarms. Then handle low-severity alarms, such as minor alarms
and warnings.
Signal flow analysis (all scenarios): This method helps locate a fault to a radio site or radio
hop. Familiarity with service signal flows, cable connections, and air-interface link
connections helps analyze fault symptoms and locate possibly faulty points.
Alarm analysis (all scenarios): Alarms well illustrate fault information. Handle alarms
reported by faulty points immediately after analyzing service signal flows.
Receive and transmit power analysis (locating radio link faults): By analyzing the current
and historical receive and transmit power on a radio link, determine whether any errors, for
example, interference and fading, exist on the radio link.
Loopback (locating a fault to a component or site, section by section): This method is fast
and independent of alarm and performance event analysis. It, however, affects embedded
control channels (ECCs) and normal service running.
Replacement (locating a fault to a component or board, or identifying external faults): This
method does not require sound theoretical knowledge or skills but requires spare parts. It
applies to nearly all sites.
Tests using instruments and tools (isolating external faults and addressing interworking
issues): This method provides accurate results. Before using this method, interrupt services.
Fault Symptoms
As shown in Figure 1-2, a microwave chain network was set up, and all 2G and 3G base
station services in an area were interrupted for approximately 10 minutes.
(Figure 1-2: a microwave chain network including NE1704, NE1709, NE1710, NE1711, and
NE1712.)
Procedure
Step 1 Checked the distribution of the NEs on which services were interrupted and the service flow
direction.
NE1704 converged the interrupted services, so the service interruption was related to
NE1704.
Step 2 Checked alarms and operation records on NE1704.
NE1704 reported an MW_CFG_MISMATCH alarm, and the Hybrid radio E1 capacity was
changed on NE1704 right before the services were interrupted. It was inferred that the
services were interrupted due to an E1 capacity mismatch between NE1704 and NE1705.
Step 3 Corrected the Hybrid radio E1 capacity on NE1704.
The fault was rectified.
----End
Checking current and historical alarms, fault symptoms, and fault time helps narrow down the
most likely areas for faults, and helps locate a fault to a hop, site, or module.
The alarm and performance analysis method requires proficiency in using the NMS and in
analyzing service signal flows.
Fault Symptoms
An OptiX RTN 620 NE on a network reported a HARD_BAD alarm and an XCP_INDI
alarm.
Procedure
Step 1 Checked alarms.
l The PXC board in slot 1 reported a HARD_BAD alarm, whose parameters indicated that
the 38M clock was lost and the analog phase-locked loop (PLL) was unlocked.
l The boards in slots 5, 6, and 7 reported the HARD_BAD alarm, whose parameters
indicated that the 38M clock was lost and the PXC board in slot 1 was faulty. The fault
caused loss of the first 38M clock.
The HARD_BAD alarm reported by the board in slot 1 triggered a switchover, causing the
SCC board to report an XCP_INDI alarm.
----End
Procedure
Step 1 Checked the ODU receive power that was recorded during the alarm period.
The difference between the maximum receive power and the minimum receive power was
more than 40 dB, and the minimum receive power was close to or less than the receiver
sensitivity. Therefore, it was inferred that the fault was caused by spatial fading.
----End
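The check in Step 1 can be automated when historical power readings are exported from the NMS. The sketch below is illustrative: the 40 dB spread criterion comes from this case, while the receiver sensitivity value and the sample readings are assumptions.

```python
# Sketch: flag spatial fading from historical ODU receive-power samples (dBm).
# A large power spread (>40 dB here, as in this case) combined with a minimum
# close to the receiver sensitivity points to fading rather than a hardware fault.

def fading_suspected(samples_dbm, rx_sensitivity_dbm=-88.0,
                     spread_threshold_db=40.0, margin_db=3.0):
    """Return True if the power history is consistent with spatial fading."""
    spread = max(samples_dbm) - min(samples_dbm)
    near_sensitivity = min(samples_dbm) <= rx_sensitivity_dbm + margin_db
    return spread > spread_threshold_db and near_sensitivity

# Assumed example readings recorded during the alarm period:
history = [-42.0, -45.5, -61.0, -87.5, -44.0]
print(fading_suspected(history))  # spread 45.5 dB, minimum near -88 dBm -> True
```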
1.6 Loopback
(Figures: loopback test principle and the test network. Successive loopback tests along a link
return results such as ERR, OK, ERR; the faulty section lies between adjacent loopback
points whose test results differ. The test network: an E1 BER tester connected at point A on
the third-party SDH equipment toward the RNC; NE2 and NE1, each with IFH2 boards in
slots 7 and 8 and ODUs, forming the radio hop; and OptiX OSN equipment at point B.)
Procedure
Step 1 Analyzed the service signal flow.
The alarmed E1 signal was received from NE2.
Step 2 Checked alarms reported by NE2.
NE2 did not report any hardware alarms or service alarms.
Step 3 Set an inloop at the tributary board (point 1) on NE2, and connected an E1 bit error rate
(BER) tester to point A (third-party SDH equipment).
The service had bit errors.
Step 4 Set an outloop at the SD1 board (point 2) on NE1.
The E1 BER tester at point A read no bit error. It was suspected that the radio link between
NE1 and NE2 was faulty.
Step 5 Tested the radio link performance by setting an inloop at the tributary board (point 1) on NE2
and connecting an E1 BER tester to point B (OptiX OSN equipment).
The E1 BER tester at point B read no bit error.
NOTE
The E1 BER tester was connected to the OptiX OSN equipment and the corresponding E1 cross-connections
were modified, because NE1 had no E1 tributary port.
Step 6 Checked the interconnection configuration data on the OptiX RTN equipment, OptiX OSN
equipment, and third-party SDH equipment.
The preceding equipment used their own clock sources, and the clocks were not synchronized.
NOTE
All equipment on an SDH network must trace the same reference clock. In the preceding example, the OptiX RTN
equipment, OptiX OSN equipment, and third-party SDH equipment are interconnected through SDH ports.
After E1 services are encapsulated and mapped several times, serious jitter may be generated, resulting in bit
errors. To resolve similar issues, plan and implement clock solutions when building SDH or microwave
transmission networks.
----End
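The loopback procedure above moves the loop point hop by hop: the faulty section lies between the last loopback point that tests OK and the first that tests ERR. A minimal sketch of that bisection logic (point names are hypothetical labels for this case):

```python
# Sketch: locate the faulty section from ordered loopback test results.
# Points are ordered from the BER tester outward; "OK" means the looped-back
# signal had no bit errors, "ERR" means it did.

def faulty_section(results):
    """results: list of (loopback_point, "OK" | "ERR") ordered from the tester outward.
    Returns (last_ok_point, first_err_point) bounding the faulty section, or None."""
    last_ok = "tester"
    for point, outcome in results:
        if outcome == "ERR":
            return (last_ok, point)
        last_ok = point
    return None  # no bit errors observed at any loop point

# Hypothetical run matching this case: the outloop at NE1 tested OK and the
# inloop at NE2 tested ERR, so the radio hop between them is suspect.
run = [("NE1 outloop", "OK"), ("NE2 inloop", "ERR")]
print(faulty_section(run))  # -> ('NE1 outloop', 'NE2 inloop')
```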
1.7 Replacement
Fault Symptoms
See the following figure. Two sites, site A and site B, were interconnected using 2+0 radio
links. At each site, ODUs of the same type (with the same sub-band but different working
frequencies) were used. NE B-2 at site B frequently reported service alarms such as R_LOC
and R_LOF.
(Figure: NE A-1/ODU A-1 and NE A-2/ODU A-2 at site A linked to ODU B-1/NE B-1 and
ODU B-2/NE B-2 at site B; NE B-2 reports R_LOC/R_LOF.)
Procedure
Step 1 Checked historical performance events and the receive power within the period of alarm
reporting.
The receive power was normal. Because the alarms did not persist, loopback tests were
inapplicable. The replacement method could be used for fault locating. The receive end was
suspected faulty. However, it was difficult to replace an ODU. Because the 2+0 links used the
same type of ODUs, the IF cables at site B could be interchanged for fault locating.
Step 2 Interchanged the IF cables at site B and checked for alarms for two days.
NE B-2 still reported service alarms. Therefore, site B was not faulty, and site A was possibly
faulty.
(Figure: IF cables interchanged at site B; NE B-2 still reports R_LOC/R_LOF.)
Step 3 Restored the IF cable connections at site B, interchanged the IF cables at site A, and checked
for alarms for two days.
NE B-1 reported service alarms. Therefore, the IF cable connecting NE A-2 and ODU A-2
was faulty.
(Figure: IF cables restored at site B and interchanged at site A; NE B-1 now reports
R_LOC/R_LOF.)
----End
Procedure
Step 1 Checked alarms.
The CONFIG_NOSUPPORT alarm indicating an incorrect frequency caused the
RADIO_MUTE alarm.
Step 3 Changed the Tx frequency to a correct value based on the network planning information.
----End
Fault Symptoms
In the network shown in following figure, the NMS set up data communication network
(DCN) communication with NE1 and NE2 through the multiprotocol label switching (MPLS)
network. NE1 was connected to the MPLS network using a hub and communicated with the
MPLS network through the Open Shortest Path First (OSPF) protocol. The NMS pinged NE1
successfully but failed to ping NE2. Therefore, the NMS could not reach NE2. The routing table
of NE1 indicated that NE1 did not learn routes to upstream NEs. The MPLS network had
multiple radio hops at its edge, but the fault occurred only between NE1 and NE2.
Procedure
Step 1 Connected the hub to a PC and used the data service packet sniffer to analyze the OSPF
packets received by NE1.
The designated router (DR) IP address advertised in the OSPF packets was xx.xx.xx.1, but
the IP address of the NE that sent these packets was xx.xx.xx.2. Therefore, NE1 did not
receive any database description (DD) packets sent by the DR elected on the OSPF subnet.
As a result, NE1 could not create an adjacency with the DR and could not learn OSPF routes.
Step 2 Sniffed and analyzed OSPF packets at another OptiX RTN NE that was connected to the
MPLS network and was operating normally.
The OptiX RTN NE received OSPF packets from the DR. Therefore, an OptiX RTN NE fault
was ruled out.
Step 3 Increased the priority of NE1's gateway (IP address: xx.xx.xx.2) so the gateway became the
DR on the subnet.
NE1 learned OSPF routes, and NE2 was reachable to the NMS.
----End
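The packet analysis in Step 1 reduces to one question: do any received OSPF packets actually originate from the address the hellos advertise as the DR? A simplified sketch of that check, assuming the relevant fields have already been decoded from a capture (no real sniffer API is used, and the addresses are hypothetical):

```python
# Sketch: detect the symptom in this case from decoded OSPF hello fields.
# Each observation is (source_ip, advertised_dr_ip). If no hello's source
# address ever matches the advertised DR address, packets from the elected DR
# are not reaching this NE, so no adjacency forms and no routes are learned.

def dr_hellos_missing(observations):
    sources = {src for src, _ in observations}
    advertised_drs = {dr for _, dr in observations}
    return not (advertised_drs & sources)

# Hypothetical capture resembling this case: hellos advertise DR 10.1.1.1,
# but only 10.1.1.2 and 10.1.1.3 are ever heard as sources.
obs = [("10.1.1.2", "10.1.1.1"), ("10.1.1.3", "10.1.1.1")]
print(dr_hellos_missing(obs))  # -> True: the DR's packets are absent
```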
l Storing all the statistics on the agent side and supporting offline manager operations
l Storing historical data to facilitate fault diagnosis
l Supporting error detection and reporting
l Supporting multiple manager sites
The OptiX RTN equipment implements RMON using the following management groups:
Fault Symptoms
Figure 1-8 shows a mobile network, where OptiX RTN 600 V100R003s provided backhaul
transmission. Packet loss occurred when BTS1 at site 1 and BTS2 at site 2 were pinged from
the RNC, but did not occur when BTS3 at site 3 was pinged.
Procedure
Step 1 Suspected that the radio bandwidth between NE 3-002 and NE 3-003 was insufficient,
causing the loss of ping packets.
Step 2 Analyzed the RMON data of NE 3-002 to check whether packet loss was caused by
insufficient radio bandwidth between site 2 and site 3.
The maximum traffic volume of NE 3-001 already reached its maximum air interface
bandwidth (25 Mbit/s). Therefore, packet loss was caused by congestion. For details, see the
following figure.
----End
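The RMON check in Step 2 amounts to converting byte counters into throughput and comparing it with the air-interface capacity. A sketch, assuming cumulative octet counters sampled at a fixed interval; the 25 Mbit/s capacity is taken from this case, and the counter values are illustrative:

```python
# Sketch: derive port throughput from RMON octet counters and flag congestion.

def peak_mbits(octet_samples, interval_s):
    """octet_samples: cumulative byte counts sampled every interval_s seconds."""
    deltas = [b - a for a, b in zip(octet_samples, octet_samples[1:])]
    return max(d * 8 / interval_s / 1e6 for d in deltas)

def congested(octet_samples, interval_s=30, capacity_mbits=25.0, headroom=0.95):
    """True if peak throughput reaches (within 5% of) the air-interface cap."""
    return peak_mbits(octet_samples, interval_s) >= capacity_mbits * headroom

counters = [0, 60_000_000, 155_000_000, 250_000_000]  # bytes, sampled every 30 s
print(round(peak_mbits(counters, 30), 1))  # -> 25.3 (Mbit/s)
print(congested(counters))                 # -> True: traffic has hit the cap
```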
Based on the terrain and rainfall of the areas that radio links cover, network planning
generally determines operating frequencies, T/R spacing, transmit power, antenna heights,
and protection/diversity modes. Based on this information, radio link indicators such as
normal receive power, fading margin, and system availability can be obtained.
l Availability: Check whether the actual link availability meets customers' requirements.
For heavy-rain zones (zones L, M, N, P, and Q specified by ITU-R), it is recommended
that you use low frequency bands and vertical (V) polarization. For a radio link subject
to severe multipath fading, it is recommended that you increase the height difference
between the antennas at both ends or use 1+1 SD protection, as long as LOS is
guaranteed.
l Multipath fading prediction methods. Generally, the following methods are available:
– ITU-R P.530-7/8 method: globally applicable.
– ITU-R P.530-9 method: applicable to areas with high refractivity gradients, for
example, the Middle East, the Mediterranean Sea, and West Africa. It is used
together with the ITU-R P.530-7/8 method; during prediction, the lower of the two
availability results is used.
– KQ factor method: applicable to China (seldom used).
– Vigants-Barnett method: applicable to North America.
l Rain fading prediction methods. Generally, the following methods are available:
– ITU: It is globally applicable.
– R.K. Crane: It is applicable to North America.
– For a link covering several rain zones, it is recommended that you select the zone
with the heaviest rainfall for calculation.
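To illustrate how such methods turn link parameters into an outage figure, the sketch below implements one common form of the Vigants-Barnett multipath outage formula listed above. The climate factor and link values are assumptions for illustration only; real planning follows the full ITU-R P.530 procedures.

```python
# Sketch: Vigants-Barnett multipath outage estimate (one common form):
#   P = 6.0e-7 * c * f * D**3 * 10**(-F/10)
# where c is a climate/terrain factor (assumed here), f is the frequency in GHz,
# D is the path length in miles, and F is the flat fade margin in dB.

SECONDS_PER_YEAR = 365 * 24 * 3600  # crude annualization, for illustration only

def vigants_barnett_outage(f_ghz, d_miles, fade_margin_db, c=1.0):
    """Fractional outage time due to multipath fading."""
    return 6.0e-7 * c * f_ghz * d_miles ** 3 * 10 ** (-fade_margin_db / 10)

# Assumed link: 7.5 GHz, 25 mi path, 38 dB fade margin, humid climate (c = 2).
p = vigants_barnett_outage(f_ghz=7.5, d_miles=25.0, fade_margin_db=38.0, c=2.0)
print(f"outage fraction {p:.2e}, about {p * SECONDS_PER_YEAR:.0f} s/year")
```

Note how strongly the result depends on the fade margin: every 10 dB of extra margin cuts the predicted outage by a factor of ten, which is why increasing the fading margin appears among the measures in this chapter.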
Fault Symptoms
A radio link frequently but intermittently reported MW_RDI, R_LOC, and RPS_INDI alarms,
and HSB switchovers were triggered.
Procedure
Step 1 Queried historical receive power values of the radio link.
The receive power decreased to a value close to the receiver sensitivity when an alarm was
reported. Most alarms were reported during the night or in the early morning. When the
weather was favorable at noon, the receive power was normal. Therefore, intermittent radio
link interruptions were caused by multipath fading.
Step 2 Checked annual interruption time predicted for the radio link.
The actual annual interruption time was longer than the predicted time of 1877 seconds.
Therefore, the fading margin was insufficient.
Step 3 Checked the multipath fading prediction method used in network planning.
The ITU-R P.530-7/8 method had been used. The area covered by the radio link was in the
Middle East, and therefore the ITU-R P.530-9 method should have been used.
Step 4 Used the ITU-R-P.530-9 method to predict annual interruption time without changing other
conditions.
The obtained value was about 175833 seconds, which was longer than the value obtained
using the ITU-R-P.530-7/8 method.
According to the preceding analysis, the actual annual interruption time was much longer than
the predicted time because an incorrect multipath algorithm was used in network planning.
Step 5 Planned this link using a correct algorithm and deployed 1+1 SD protection for the link. The
link availability met service requirements.
----End
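The predicted interruption times in this case translate directly into link availability, which is how planning results are usually compared with customer requirements. A quick conversion using the figures from this case:

```python
# Sketch: convert predicted annual interruption time to link availability.
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

def availability_pct(outage_seconds_per_year):
    return 100.0 * (1.0 - outage_seconds_per_year / SECONDS_PER_YEAR)

# Values from this case: P.530-7/8 predicted 1877 s/year, P.530-9 about 175833 s/year.
print(f"{availability_pct(1877):.4f}%")    # -> 99.9940%
print(f"{availability_pct(175833):.4f}%")  # -> 99.4424%
```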
This chapter describes the general troubleshooting process, fault categories, and how to
diagnose each category of faults.
2.1 Troubleshooting Process Overview
2.2 Fault Categories
2.3 Troubleshooting Radio Links
2.4 Troubleshooting TDM Services
2.5 Troubleshooting Data Services
2.6 Troubleshooting Microwave Protection
2.7 Troubleshooting Clocks
2.8 Troubleshooting DCN Communication
(Figure: general troubleshooting flowchart. Start → (1) Record fault symptoms → (2) Caused
by external factors? If yes, rectify the external faults. If no → (3) Diagnose the fault → Is the
fault rectified? If no, (4) report to Huawei. If yes → Write a troubleshooting report → End.)
Mark Explanation
3 Find causes of a fault with reference to section 1.2 Common Methods for
Locating Faults, determine the category of the fault with reference to
section 2.2 Fault Categories, and rectify the fault as instructed in the
corresponding section listed below:
l 2.3 Troubleshooting Radio Links
l 2.4 Troubleshooting TDM Services
l 2.5 Troubleshooting Data Services
l 2.6 Troubleshooting Microwave Protection
l 2.7 Troubleshooting Clocks
l 2.8 Troubleshooting DCN Communication
4 Contact Huawei local office or dial Huawei technical service hotline for
problem reporting and technical support.
NOTICE
When handling critical problems such as a service interruption, exercise the following
precautions:
l Restore services as soon as possible.
l Analyze fault symptoms, find causes, and then handle the faults. If the causes are
unknown, exercise caution when you perform operations to prevent the problems from
becoming more severe.
l If a fault persists, contact Huawei engineers and coordinate with them to handle the fault
promptly.
l Record the operations performed during fault handling and save the original data related to
the fault.
Radio link fault: Radio links report link-related alarms such as MW_LOF and
RADIO_RSL_LOW, or have bit errors.
Time division multiplexing (TDM) service fault: Radio links work normally but their carried
TDM services are interrupted or deteriorate.
Data service fault: Radio links work normally but their carried data services have packet loss
or are unavailable.
Protection fault: Protected radio links or their carried services are faulty, or protection
switching fails (no switchover is performed or services are unavailable after switching is
complete).
(Figure: common radio link fault causes include interference, fading, reflection, poor LOS,
improper antenna installation, damaged components, cable faults, and power faults.)
Troubleshooting Process
Figure 2-3 illustrates the process for diagnosing a radio link fault.
(Flowchart summary:
l Do hardware alarms exist? If yes, rectify equipment faults.
l Is the RSL greater than the receiver sensitivity? If yes, the link is blocked, co-channel or
adjacent-channel interference occurs, or large-delay multipath reflection occurs.
l Was it raining when the fault occurred? If yes, the cause is rain fading.
l Does the fault occur regularly? If yes, the cause is multipath fading or terrain reflection.
l Is the link interruption time greater than the designed value? If yes, check whether the
designed value is appropriate.)
No
Multipath fading
Symptoms:
l The receive power changes greatly and quickly (generally from 10 dB to dozens of dB
within seconds). The changes occur periodically, especially during the transition
between day and night.
l A typical symptom of duct-type fading is that the receive power undergoes substantial
up-fading and down-fading.
Measures:
l Increase the path inclination by adjusting the antenna mount heights at both ends,
thereby increasing the height difference between the antennas at both ends.
l Reduce surface reflection. For apparent strong reflection surfaces, for example, large
areas of water, flat lands, and bald mountain tops, adjust antennas to move reflection
points out of the strong reflection areas or mask the reflection by using landforms.
l Reduce the path clearance. With LOS conditions guaranteed, lower antenna mount
heights as much as possible.
l Use space diversity or increase the fading margin. In normal conditions, space diversity
is the most efficient method for mitigating multipath fading.
Interference
Symptoms:
l A link's receive power is greater than the receiver sensitivity, but the link is interrupted
or has bit errors.
l When no fading occurs, an IF board reports a radio link alarm, especially when
interference is strong.
l When interference occurs at the local end (interference signal power greater than –90
dBm), the local receive power is greater than –90 dBm after the peer ODU is muted.
l A frequency scanner can detect interference signal power when tuned to the operating
band of an ODU.
Measures:
l Plan frequencies or polarization directions properly. In theory, a large spacing between
the operating frequency of target signals and that of interference signals reduces
interference. In addition, note issues such as frequency resources and network-wide
planning.
l Plan Tx high and Tx low sites properly. If multiple ODUs provide multiple microwave
directions at a site, plan the site as a Tx high site or a Tx low site for all microwave
directions, if possible.
l Plan microwave routes properly. Generally, adopt Z-shaped radio link distribution to
prevent over-reach interference.
Rain fading
Symptoms: When it rains, a link may be interrupted or deteriorate.
Measures: Increase the link fading margin, use low frequency bands, or use vertical
polarization.
l Increase the link fading margin for rain zones L, M, N, P, and Q.
l Rain fading impairs radio links that operate at high frequency bands, especially
frequency bands higher than 18 GHz. Radio links operating at frequency bands lower
than 10 GHz are not affected. If rain fading is severe, change the radio links' operating
frequency bands, if necessary.
l Rain fading in horizontal polarization is more severe than that in vertical polarization.
Poor LOS
Symptoms: The receive power is always lower than the designed power.
Measures:
l If radio links or antennas are blocked, adjust antenna mount heights or positions to
bypass obstacles.
l Adjust deviated antennas.
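The rain fading measures above follow from how specific rain attenuation scales with frequency, polarization, and rain rate. ITU-R P.838 models it as a power law, gamma = k * R^alpha (dB/km). The coefficients below are illustrative placeholders chosen to show the trend, not the recommendation's tabulated values:

```python
# Sketch: ITU-R P.838-style specific rain attenuation, gamma = k * R**alpha (dB/km),
# multiplied by path length. The k/alpha pairs are illustrative placeholders only;
# use the coefficients tabulated in ITU-R P.838 for real planning. Real link budgets
# also apply an effective-path-length reduction, omitted here.

ILLUSTRATIVE_COEFFS = {
    (7, "V"): (0.002, 1.2),   # low band: rain fading is small
    (23, "V"): (0.10, 1.0),
    (23, "H"): (0.13, 1.0),   # horizontal polarization fades more
}

def rain_attenuation_db(freq_ghz, polarization, rain_rate_mm_h, path_km):
    k, alpha = ILLUSTRATIVE_COEFFS[(freq_ghz, polarization)]
    return k * rain_rate_mm_h ** alpha * path_km

for pol in ("V", "H"):
    print(pol, round(rain_attenuation_db(23, pol, rain_rate_mm_h=42, path_km=10), 1))
```

With the same rain rate and path, the 23 GHz horizontal-polarization figure exceeds the vertical one, which is why the measures above recommend vertical polarization and lower bands in heavy-rain zones.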
Fault Symptoms
TDM services are interrupted or have bit errors.
Cause 1: The hardware is faulty. Analyze alarms and perform loopbacks to check whether
board hardware is faulty. If a board is faulty, replace the board.
Cause 2: A radio link is faulty. On the NMS, find the occurrence period of the fault and
check whether any service alarm is generated on the radio link. If a radio link alarm is
generated, first rectify radio link faults.
Cause 5: The power supply voltage fluctuates, the grounding is improper, or external
interference exists. Check whether the voltage of the external input power supply fluctuates
or whether the equipment is grounded improperly.
Fault Symptoms
On a network, services at all base stations, which are converged at level 1 or level 2
convergence nodes and then transmitted to base station controllers (BSCs)/RNCs, are
interrupted. To be specific, all voice services, Internet access services, and video services are
interrupted.
Cause Analysis
If services at all base stations on an entire network or in an area are interrupted, faults
probably occur at the convergence nodes that are interconnected with BSCs/RNCs. Therefore,
check for the following faults at convergence nodes:
NOTICE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
1. Rule out hardware faults and radio link faults with reference to section 2.2 Fault
Categories and 2.3 Troubleshooting Radio Links.
2. Check whether upstream convergence ports at the convergence nodes report equipment
alarms.
If these ports report any of the following equipment alarms, clear the alarms as instructed in
"Alarms and Handling Procedures" in the Maintenance Guide:
l ETH_LOS
l LASER_MOD_ERR
l LASER_NOT_FITED
l ETH_NO_FLOW
3. Check RMON statistics about upstream convergence ports at the convergence nodes.
If the ports receive data but do not transmit data, the boards where the ports are located may
be faulty. In this case, go to the next step.
If the ports do not receive data, the interconnected equipment is faulty. In this case, rectify
the fault by following the instructions in chapter 3 Equipment Interworking Guide.
4. Check the Ethernet bandwidths provided by radio links at the convergence nodes.
If attributes of the service ports are incorrectly set, set the attributes for the service ports
again (including port enabled/disabled, tag attribute, and default VLAN) and check whether
the services recover. If not, go to the next step.
If VLAN settings are inconsistent with actual services, re-set VLANs for the services and
check whether the services recover. If not, go to the next step.
NOTICE
If the fault persists after all the preceding steps are performed, dial Huawei technical service
hotline or contact Huawei local office.
Fault Symptoms
Services at all base stations on an entire network or in an area experience packet loss. For
example, all Internet service users experience a low access rate, calls are delayed, ping
packets between BSCs/RNCs and base stations are lost, or artifacts appear in video services.
Cause Analysis
If services at all base stations on an entire network or in an area experience packet loss, faults
probably occur at convergence nodes (possibly OptiX PTN 1900 or OptiX RTN 950) that are
interconnected with BSCs/RNCs. Therefore, check for the following faults at the convergence
nodes (the possibility of service configuration errors is eliminated because the services are not
interrupted):
l Incorrect parameter setting (for example, mismatched working modes) for Ethernet ports
l Network cable or fiber fault
l Service traffic exceeding preset bandwidth
l Member link fault in link aggregation groups (LAGs)
l Oversized burst traffic
l Broadcast storm
l Inappropriate quality of service (QoS) parameter setting
NOTICE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
If the convergence nodes report alarms like ETH_LOS or experience alarm jitter, clear the
alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide. If the
alarms clear, check whether the fault is rectified. If the alarms persist, go to the next step.
2. At the convergence nodes, check whether the ports used for interconnection and their
peer ports at the interconnected equipment are consistently set.
If the ports' working modes are inconsistent with their peer ports' working modes, change the
working modes to the same value and check whether the fault is rectified. If not, check the
next item.
If the ports' physical states are different from the settings, verify the fiber or network cable
connections at the ports. Then, enable the ports again and check whether the fault is rectified.
If not, check the next item.
If the ports' maximum transmission unit (MTU) settings are different from actual packet
lengths, change the value of the MTU parameter to 9600 bytes and check whether the fault is
rectified. If not, check the next item.
3. Check the traffic volume at each convergence port and each convergence node.
If the total volume of traffic converged to a convergence node exceeds the maximum
bandwidth configured for the convergence node, split the traffic or increase the maximum
bandwidth configured for the convergence node. If only a few service packets are lost
(generally due to oversized burst traffic), check for historical threshold-crossing events.
Check whether the fault is rectified. If not, check the next item.
If the burst traffic volumes at the convergence nodes exceed the maximum bandwidths
configured for the convergence nodes, enable traffic shaping at the convergence ports that are
interconnected with BSCs/RNCs, and check whether the fault is rectified. If not, check the
next item.
4. Check whether QoS settings are appropriate if QoS policies are configured for the
convergence nodes or BSCs.
NOTICE
If the fault persists after all the preceding steps are performed, dial Huawei technical service
hotline or contact Huawei local office.
----End
Cause Analysis
If services at some base stations are interrupted, certain equipment on the transmission link is
faulty. To diagnose the fault, check service continuity on the link and RMON counts of
service ports, determine the fault scope, and check for the following faults at those possibly
faulty nodes:
l Board hardware fault
l Boards not installed
l Abnormal physical ports (used for interconnection)
l Service configuration error
NOTICE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
1. Check service continuity on each branch of the faulty link to determine the fault scope.
If the services from base stations or OptiX RTN NEs to an NE on the faulty link are
available, but the services from the faulty link to the NE are interrupted, the NE or its
next-hop NE on the faulty link is faulty. In this case, go to the next step.
If an NE on the faulty link receives data but does not transmit data, or transmits data but does
not receive data, check the traffic counts of its next-hop NE. Repeat this operation until you
locate the NE that does not transmit data. The located NE is considered a faulty NE. Then, go
to the next step.
3. At the faulty NE, check whether the port used for interconnection and its peer port at the
interconnected equipment are consistently set.
If the port's working mode is inconsistent with its peer port's working mode, change the
working modes to the same value and check whether the fault is rectified. If not, check the
next item.
If the port's MTU setting is different from the actual packet length, change the value of the
MTU parameter to 9216 bytes and check whether the fault is rectified. If not, go to the next
step.
If the services are not configured or are incorrectly configured, re-configure the services and
check whether the services recover. If not, go to the next step.
If attributes of the service ports are incorrectly set, set the attributes for the service ports
again (including port enabled/disabled, tag attribute, Layer 2/Layer 3 attribute, and default
VLAN) and check whether the services recover. If not, go to the next step.
7. Check the service VLAN. If the service VLAN is incorrectly set, re-set it.
NOTICE
If the fault persists after all the preceding steps are performed, dial Huawei technical service
hotline or contact Huawei local office.
----End
Cause Analysis
If services at some base stations experience packet loss, certain equipment on the transmission
link is faulty. To diagnose the fault, check service continuity on the link and RMON counts of
service ports, determine the fault scope, and check for the following faults at those possibly
faulty nodes:
l Abnormal physical ports (used for interconnection)
l Service traffic exceeding preset bandwidth
l Oversized burst traffic
l Broadcast storm
l Inappropriate QoS parameter setting
NOTICE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
1. Check RMON counts of ports on the faulty link, and determine the fault scope by
comparing traffic volumes at involved NEs.
If: The volume of traffic received by an NE is greater than the volume of traffic transmitted by the NE
Then: Consider the NE as a faulty NE and go to the next step.

If: The volume of traffic received by an NE is equal to the volume of traffic transmitted by the NE, but both volumes are too low
Then: Check the traffic volume at the next-hop NE. Repeat this operation until you locate the NE whose volume of received traffic is largely different from its volume of transmitted traffic. The located NE is considered a faulty NE. Then, go to the next step.

2. Check the alarms reported by the possibly faulty NEs.
If: An NE reports alarms like ETH_LOS or experiences alarm jitters
Then: Clear the alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide. If the alarms clear, check whether the fault is rectified. If not, go to the next step.
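The counter comparison in step 1 amounts to walking the chain of NEs and flagging the first one whose transmitted traffic is sharply lower than its received traffic. The NE names, byte counts, and the 10% "largely different" threshold below are illustrative assumptions, not values from the product.

```python
def find_faulty_ne(chain, threshold=0.10):
    """chain: list of (ne_name, rx_bytes, tx_bytes) along the faulty link.
    Return the first NE that drops more than `threshold` of its traffic."""
    for name, rx, tx in chain:
        if tx < rx * (1 - threshold):  # large RX/TX gap -> suspect NE
            return name
    return None

link = [
    ("NE-A", 1_000_000, 1_000_000),
    ("NE-B", 1_000_000,   998_500),   # small, tolerable difference
    ("NE-C",   998_500,   600_000),   # large RX/TX gap
]
print(find_faulty_ne(link))  # -> NE-C
```

In practice the counts come from the ports' RMON statistics on the NMS; the sketch only shows the comparison logic.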
3. At the faulty NE, check whether the port used for interconnection and its peer port at the
interconnected equipment are consistently set.
If: The port's working mode is inconsistent with its peer port's working mode
Then: Set both ports to the same working mode and check whether the fault is rectified. If not, check the next item.

If: The port's MTU setting is different from the actual packet length
Then: Change the value of the MTU parameter to 9600 bytes and check whether the fault is rectified. If not, check the next item.
4. Check the traffic volume at the upstream service ports of the faulty NE.
If: The total volume of traffic converged to an upstream service port exceeds the maximum bandwidth configured for the port
Then: Split the traffic or increase the maximum bandwidth configured for the port. Then check whether the fault is rectified. If not, check the next item.

If: The burst traffic volume at an upstream service port exceeds the maximum bandwidth configured for the port
Then: Enable traffic shaping for the port, and check whether the fault is rectified. If not, check the next item.
5. Check whether QoS settings are appropriate if QoS policies are configured for the faulty
NE.
NOTICE
If the fault persists after all the preceding steps are performed, call the Huawei technical service hotline or contact the local Huawei office.
----End
Fault Symptoms
A switchover in microwave 1+1 protection, triggered by a radio link fault or an equipment
fault, fails or is delayed.
Cause 1: The microwave 1+1 protection group is in the forced or lockout switching state, causing a switchover failure.
Handling: Check the current switching state and switching records of the microwave 1+1 protection group.

Cause 2: In the microwave 1+1 protection group, both the main and standby links are interrupted or both the main and standby units are faulty, resulting in a switchover failure.
Handling: Check the alarms reported by boards in the microwave 1+1 protection group, and the current switching state of the microwave 1+1 protection group.

Cause 3: The NE is being reset, or a switchover between the main and standby system control boards has just occurred, resulting in a switchover failure or a delayed switchover.
Handling: Check the alarms reported by the NE, the switchover records of the main and standby system control boards (OptiX RTN 950/980 NEs support main and standby system control boards), and the current switching state of the microwave 1+1 protection group.

Cause 4: An RDI-caused switchover is triggered immediately after a switchover is complete. Because the RDI-caused switchover must wait for the wait-to-restore (WTR) timer to expire (the preset WTR time in revertive mode; 300s in non-revertive mode), the switchover is delayed.
Handling: Check the alarms reported by the NE, and the parameter settings, current switching state, and switching records of the microwave 1+1 protection group.

Cause 5: In OptiX RTN 600 V100R005/OptiX RTN 900 V100R002C02 and later versions, anti-jitter is provided for switchovers triggered by RDIs and service alarms, to prevent repeated microwave 1+1 protection switchovers caused by deep and fast fading. As a result, some switchovers are delayed.
Handling: Check the alarms reported by the NE, and the current switching state and switching records of the microwave 1+1 protection group.
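The waiting behavior described for cause 4 can be expressed as a one-line rule; this is a simplified sketch of the stated timing, not the product's switching algorithm.

```python
def switchover_wait_s(revertive, wtr_time_s):
    """Waiting time before an RDI-caused switchover proceeds (cause 4):
    the preset WTR time in revertive mode, a fixed 300 s otherwise."""
    return wtr_time_s if revertive else 300

print(switchover_wait_s(True, 600))   # revertive mode, 10-minute WTR -> 600
print(switchover_wait_s(False, 600))  # non-revertive mode -> 300
```

The sketch makes clear why an RDI-caused switchover can appear delayed even when the protection group is otherwise healthy.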
Fault Symptoms
A microwave 1+1 protection group fails to switch services back to its main unit although its
main link or unit recovers.
Cause 1: The microwave 1+1 protection group works in non-revertive mode.
Handling: Check whether revertive mode is enabled for the microwave 1+1 protection group. If not, enable it.

Cause 2: The current switching state of the microwave 1+1 protection group is RDI, so an automatic revertive switchover cannot take place.
Handling: Check whether the current switching state of the microwave 1+1 protection group is RDI. If yes, manually clear the RDI state.

Cause 3: When the microwave 1+1 protection group is in the WTR state, the microwave 1+1 protocol detects that the main unit is faulty. As a result, the revertive switchover to the main unit fails.
Handling: Check whether boards in the microwave 1+1 protection group report hardware alarms. If yes, handle the alarms.
Fault Symptoms
After the working channel of a subnetwork connection protection (SNCP) group
becomes faulty, an SNCP switchover fails or is delayed.
Cause 1: The SNCP protection group is in the forced or lockout switching state, causing a switchover failure.
Handling: Check the current switching state and switching records of the SNCP protection group.

Cause 2: Both the working and protection channels in the SNCP protection group are unavailable, resulting in a switchover failure.
Handling: Check the alarms reported by boards in the SNCP protection group, and the current switching state of the SNCP protection group.

Cause 3: The NE is being reset, or a switchover between the main and standby system control boards has just occurred, resulting in a switchover failure or a delayed switchover.
Handling: Check the alarms reported by the NE, the records of switchovers between the main and standby system control boards, and the current switching state of the SNCP protection group.

Cause 4: On an SNCP ring formed by NEs using both SDH and Hybrid boards, some NEs run NE software earlier than OptiX RTN 600 V100R005 or OptiX RTN 900 V100R002C02, or E1_AIS insertion is disabled for some NEs.
Handling: Find the NEs whose NE software versions are earlier than OptiX RTN 600 V100R005 or OptiX RTN 900 V100R002C02, and the NEs for which E1_AIS insertion is disabled.
Fault Symptoms
Possible Causes
Possible causes of clock faults are described below. The OptiX RTN equipment provides various clock alarms to help locate clock faults; when a clock system becomes faulty, rectify the fault based on the reported alarms.
Cause 1: The clock input mode (2 Mbit/s or 2 MHz) configured for an external clock source is different from the actual clock input mode.
Handling: On the NMS, check whether the clock input mode configured for the external clock source is the same as the actual clock input mode. If not, change the clock input mode for the external clock source. Then, check whether the alarm clears.

Cause 2: A system control, switching, and timing board is faulty.
Handling: On the NMS, check whether the system control, switching, and timing board reports hardware alarms like HARD_BAD. If yes, clear the hardware alarms and then check whether the EXT_SYNC_LOS alarm clears.

Cause 3: A clock input cable is connected incorrectly.
Handling: Verify that the clock input cable is correctly connected. Verify that the port impedance of the equipment providing the clock source is the same as the impedance of the clock input port; if not (for example, a 75-ohm port is connected to a 120-ohm port), install an impedance coupler between the two ports. Verify that the clock input cable is not disconnected or damaged.

Cause 4: The equipment providing a clock source is faulty.
Handling: Check whether the equipment providing the clock source is working correctly. If not, use other equipment to provide the clock source and then check whether the alarm clears.
SYNC_C_LOS
Cause 2: Service signals tracing a clock source are lost.
Handling: On the NMS, check whether any signal loss alarms like ETH_LOS, MW_LOF, R_LOC, and T_ALOS are reported. If yes, clear these alarms and then check whether the SYNC_C_LOS alarm clears.
LTI
Cause 2: A line/tributary/link clock source is lost.
Handling: On the NMS, check whether any signal loss alarms like ETH_LOS, MW_LOF, R_LOC, and T_ALOS are reported. If yes, clear these alarms and then check whether the LTI alarm clears.

Cause 3: Clock sources are set to work in non-revertive or locked mode. As a result, after the currently traced clock source is lost, an automatic switchover to a normal clock source fails.
Handling: On the NMS, check whether clock sources are set to work in non-revertive mode. If yes, change the mode to revertive and then check whether the LTI alarm clears. Also check on the NMS whether a SYNC_LOCKOFF alarm is reported; if yes, clear the SYNC_LOCKOFF alarm and then check whether the LTI alarm clears.
SYN_BAD
Cause 1: The quality of the traced clock source deteriorates, or clock sources are interlocked.
Handling: Replace the currently traced clock source and then check whether the alarm clears. If the alarm persists, check whether the input clock is correctly configured. If the configuration is incorrect, correct the clock configuration and then check whether the alarm clears.

Cause 2: The alarmed board is faulty.
Handling: On the NMS, check whether the alarmed board also reports hardware alarms like HARD_BAD and TEMP_OVER. If yes, clear the hardware alarms and then check whether the SYN_BAD alarm clears.
CLK_NO_TRACE_MODE
Cause 1: No system clock source priority list is configured, and the NE uses its default system clock source priority list.
Handling: On the NMS, check whether a system clock source priority list is configured. If not, configure a system clock source priority list and add available clock sources to the list.
Fault symptoms: NEs connected through their service ports (such as air interfaces, Ethernet ports, and SDH ports) are unreachable to their NMS.
Handling measures:
l For cause 1, rectify service faults, including hardware faults and radio link faults.
l For cause 2, check whether the IDs, IP addresses, DCC channel attributes, and inband DCN attributes of the NEs are modified before they become unreachable to their NMS.
l For cause 3, replace the faulty system control boards.
Table 2-5 NEs connected through NMS/COM ports are unreachable to their NMS
Fault symptoms: NEs connected through NMS/COM ports are unreachable to their NMS.
Illustration: (figure showing the upstream NE)
Handling measures:
l For cause 1, check the network cable of the NMS. If it is faulty, replace it.
l For cause 2, check whether the IDs, IP addresses, DCC channel attributes, and inband DCN attributes of the NEs are modified before they become unreachable to their NMS.
l For cause 3, replace the faulty system control boards.
Illustration: (figure showing the NE that is unreachable to its NMS)
Handling measures:
l For cause 1, check whether the ID, IP address, DCC channel attributes, and inband DCN attributes of the NE are modified before it becomes unreachable to its NMS.
l For cause 2, verify that each NE on the DCN subnet has a unique ID and IP address.
l For cause 3, check the routing table of the gateway NE. If the gateway NE manages more than the recommended number of NEs, divide the ECC subnet into several smaller ones.
l For cause 4, replace the faulty system control board.
(Figure: troubleshooting flowchart. Start. 1: Locate a faulty NE. 2: If the icon of the faulty NE is gray and the NMS cannot reach the NE, check whether a hardware fault exists; if yes, check for hardware alarms and check cable connections. 3: If settings were incorrectly modified, check the settings or undo the modifications. 4-5: If a link fault exists, rectify the link fault. 7: If the fault is not rectified, continue troubleshooting.)
6. Some NEs may occasionally become unreachable to their NMS.
Handling: Verify that a minimum of 192 kbit/s of bandwidth is allocated to the inband DCN. If the allocated bandwidth is lower than 192 kbit/s, packets from the NMS may be lost.

7. Direct connection from the faulty NE to its NMS fails.
Handling: Search for the IP address of the faulty NE on the NMS. If the IP address is not found, or if the IP address is found but the NMS still cannot reach the faulty NE, press the reset button on the system control board of the faulty NE.
E1/E3 port cable grounding: The shield layer of the coaxial cable connecting two 75-ohm ports must be grounded in the same mode at the two ends. If the coaxial cable is grounded in different modes at the two ends, electric potential differences and bit errors may occur.
4 Typical Cases
Fault Symptoms
The received signal levels (RSLs) at both ends of a 1+1 SD cross-ocean radio link fluctuated
dramatically, leading to bit errors or even link interruptions.
Procedure
Step 1 Checked the alarms reported by NEs at both ends of the radio link.
The NEs did not report any hardware alarms but frequently reported radio link alarms and
service interruption alarms.
Step 2 Checked the RSLs of the main and standby ODUs at each end.
The RSLs of the main and standby ODUs at each end fluctuated dramatically, with a
fluctuation range over 30 dB. Therefore, the fault was possibly caused by multipath fading.
Step 3 Checked the network plans and the mounting height difference between the main and standby
antennas at each end.
The mounting height difference between the main and standby antennas at each end was only
4 meters, so space diversity performance was poor.
NOTE
To protect long-distance cross-ocean radio links against multipath fading, take the following measures during
network planning:
l Ensure that the fading margin is greater than or equal to 30 dB.
l Increase the mounting height difference between the main and standby antennas at both ends of a 1+1 SD
radio link.
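The first planning rule in the note can be verified with a one-line calculation; the RSL and sensitivity figures below are illustrative assumptions, not values from this case.

```python
def fading_margin_db(nominal_rsl_dbm, receiver_sensitivity_dbm):
    """Fading margin: how far the nominal receive level sits above
    the receiver sensitivity."""
    return nominal_rsl_dbm - receiver_sensitivity_dbm

margin = fading_margin_db(-45.0, -85.0)
print(margin, "dB; meets the >= 30 dB rule:", margin >= 30)  # 40.0 dB, True
```

A link planned with a nominal RSL of -45 dBm against a -85 dBm sensitivity leaves 40 dB of margin, satisfying the rule; a -60 dBm nominal RSL would not.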
Step 4 Adjusted the mounting heights of the main antennas to 24 meters and those of the standby
antennas to 10 meters.
The following figure shows the simulation result and illustrates satisfactory diversity
compensation.
NOTE
The value of K generally ranges from 0.67 to 1.33. Within this range, the RSLs of the main and standby antennas are not correlated with each other. When designing mounting heights for main and standby antennas, keep appropriate antenna spacing to minimize the impact of reflection on radio links: when reflection causes high attenuation on the main path, the attenuation on the standby path remains low.
----End
Procedure
Step 1 Checked the alarms and logs of the two NEs.
The NEs did not report any hardware alarm. NE A reported an MW_FEC_UNCOR alarm, but
NE B did not.
Step 2 Checked the RSLs at the two NEs.
The RSL at NE A was –62 dBm and that at NE B was –70 dBm. These two values were
greater than the receiver sensitivity (–85 dBm) in mode 5.
Step 3 Checked for interference signals by muting the ODU at NE B.
----End
Fault Symptoms
A hop of radio link near an airport was intermittently interrupted. No hardware alarm was
reported.
Procedure
Step 1 Checked the historical receive power at the two ends of the link.
Step 2 It was found that the receive power was stable when the link was interrupted. Therefore, the
interruption was not caused by rain fading or multipath fading. Generally, if a link is
interrupted when the receive power is higher than sensitivity, the interruption is caused by
interference.
Step 3 Queried the MSEs of related IF boards. The MSEs were greater than -30 dB, which indicated
that there was interference.
Step 4 Started the frequency scanning function provided with the equipment. No interference was
found around the operating frequency of the link.
Step 5 Configured inloops at IF ports. The MSEs were normal during the inloops. Therefore, the
interruption was not caused by an IF board fault.
Step 6 Replaced the IF cable. The MSEs were still poor. Therefore, the interruption was not caused
by an IF cable failure.
Step 7 Queried the MSEs of adjacent sites, and found that there was interference.
Step 8 Used the spectrum analyzer to scan the intermediate frequency, and found that there was
interference on multiple frequencies near the 140 MHz downstream intermediate frequency.
These frequencies were found to be frequencies used by civil aviation.
Step 9 Replaced the IF cable with a shielded enhanced cable because the intermediate frequency could not be changed. Then, the MSEs improved.
----End
Fault Symptoms
A hop of 1+1 SD radio links provided by RTN 900 V100R003 was interrupted. The receive
power of the main ODU of NE A was -90 dBm, and the link reported the RADIO_RSL_LOW
and MW_LOF alarms. The receive power of the standby ODU of NE A was -40 dBm, which
was a normal value. However, no protection switching occurred.
Procedure
Step 1 Analyzed why protection switching did not occur. Checked the 1+1 SD protection group configurations, and found that revertive switching was enabled and the WTR time of the 1+1 SD protection was set to 10 minutes.
If a member link in a 1+1 protection group fails but no switching occurs, the cause is usually that the revertive switching (WTR) timer has not expired.
Step 2 Queried the receive power of the standby ODU of NE A, and found that the receive power
was -40 dBm, which was a normal value. Forcibly switched services to the standby link on
NE A. The services recovered.
Step 3 A revertive switchover to a faulty main link usually occurs because a fault (such as a hardware fault) exists in the transmit part at the source end but the equipment cannot detect the fault. Queried the transmit power and receive power of the main ODU of NE A. The transmit power was 23 dBm, which was a normal value; the receive power was still -90 dBm.
Step 4 Replaced the main ODU of NE A and queried the receive power of the main ODU. The
receiver power was still -90 dBm.
Step 5 Replaced the flexible waveguide connected to the main ODU and queried the receive power
of the main ODU. The receive power was still -90 dBm.
Step 6 Checked the connections between the ODUs and the antenna, found signs of water at the antenna feed port, and then found water accumulated in the antenna feed. The accumulated water prevented RF signals from being transmitted.
Step 7 Emptied the water from the antenna feed, dried the feed, and installed it again. Queried the receive power of the main ODU. The receive power was -35 dBm, which was a normal value.
----End
Procedure
Step 1 Checked historical alarms. It was found that site F reported MW_LOF and MW_BER_SD
alarms, and site E reported an MW_RDI alarm when transient link interruption occurred. This
indicated that a unidirectional link fault occurred.
Step 2 Analyzed link performance curves and found that the waveforms of MSE performance curves
were almost the same for the four IF directions at site F. It was unlikely that the four IF boards
and ODUs were faulty at the same time. It was suspected that a fault occurred during space
propagation.
Step 3 Analyzed the link performance curves and link interruption time. It was found that even a slight decrease in receive power caused link interruption. This indicated that the demodulation threshold signal level of the receiver had degraded, which is often caused by interference.
According to the link performance curve, transient link interruption occurred at night. During
the day, the receive power was stable and no transient link interruption occurred.
Step 4 Applied for a maintenance window, performed frequency scanning when transient link
interruption occurred at night, and found that co-channel interference existed.
Step 5 Checked for interference sources. Because frequencies were strictly managed and the links spanned remote areas, it was unlikely that the links were interfered with by devices from other carriers. Therefore, it was necessary to check for interference within the microwave network.
Step 6 Analyzed the network plan. According to the network plan, the A-B and E-F links operated at
the same frequency. In addition, sites A, B, E, and F were almost on a line. No angle was
formed to avoid co-channel interference, resulting in over-reach interference.
Step 7 Changed the frequency at which the E-F link operated. The fault was cleared.
----End
Fault Symptoms
In an XPIC-enabled 4+0 long haul microwave link group, continuous bit errors occurred on a
channel whereas the performance of the other three channels was normal.
Procedure
Step 1 Checked the RSLs of the four channels. All of them met requirements in the network plan.
This indicated that antennas were properly aligned.
Step 2 Checked the link MSE curves. The MSE value for the faulty channel dramatically fluctuated
and the MSE values for the other three channels were stable, which indicated that interference
might exist.
Step 3 Muted the peer RFU on the faulty channel and checked the local RSL value. The local RSL
value was -90 dBm, which indicated that no interference existed and no signal leakage
occurred.
Step 4 Suspected that crossmodulation occurred on the channel. Checked whether elliptic
waveguides and flexible waveguides were properly routed and connected. It was found that an
elliptic waveguide was fixed using angle iron instead of required fixing clamps, resulting in
deformation of the elliptic waveguide.
Step 5 Replaced the deformed elliptic waveguide. The fault was cleared.
----End
l Route elliptic waveguides as designed, ensure that the waveguide bend radius meets requirements, and use matching fixing clamps to fix the waveguides, preventing the waveguides from being deformed.
l Ensure that no copper scale enters a waveguide when making connectors for the
waveguide.
l Properly connect and waterproof waveguides.
(Figure: networking diagram showing NE A and NE B, each an OptiX PTN 3900.)
Procedure
Step 1 Checked for equipment alarms and radio link alarms on the NEs.
No equipment alarm or radio link alarm was found. Therefore, it was suspected that NE data
was incorrectly configured.
Step 2 Checked operation logs of the OptiX RTN 620s on the U2000.
On NE5, a bridge service was configured between the EMS6 board in slot 4 and the EMS6
board in slot 8 when the fault occurred.
Step 3 Checked the cable connection between the four ports and the service configuration data of
NE5.
Port 1 and port 2 on the EMS6 board in slot 4 were respectively connected to port 1 and port 2
on the EMS6 board in slot 8 using network cables. Parameter Hub/Spoke, however, was
incorrectly set for the four ports. As a result, a loop formed among the four ports and packets
were forwarded among the four ports, leading to the broadcast storm. For the cable
connection between the four ports, see the following figure.
----End
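The loop among the four ports in this case can be illustrated with a simple walk over the port-to-port forwarding graph; the port names and the bridge behavior below are a simplified model of the misconfiguration, not the EMS6 board's actual forwarding logic.

```python
def has_loop(edges, start):
    """edges: dict mapping each port to the port it forwards frames to.
    Return True if the walk from `start` revisits a port (a loop)."""
    seen, port = set(), start
    while port in edges:
        if port in seen:
            return True
        seen.add(port)
        port = edges[port]
    return False

# Cables connect the slot-4 ports to the slot-8 ports; the bridge with the
# misconfigured Hub/Spoke parameter forwards frames back, closing the loop.
edges = {
    "slot4-port1": "slot8-port1",  # network cable
    "slot8-port1": "slot8-port2",  # bridge between the slot-8 ports
    "slot8-port2": "slot4-port2",  # network cable
    "slot4-port2": "slot4-port1",  # bridge back to the starting port
}
print(has_loop(edges, "slot4-port1"))  # -> True
```

Once the loop is closed, every broadcast frame circulates indefinitely, which is exactly the broadcast storm observed in the case.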
Procedure
Step 1 Checked for equipment alarms and radio link alarms on the NEs.
No equipment alarm or radio link alarm was found. Therefore, it was suspected that NE data
was incorrectly configured.
Step 2 Checked the working mode parameters for the IF boards at both ends of the radio link.
The E1 capacity was set to different values, resulting in different bandwidths for data services
and finally service interruptions. No alarm indicating E1 capacity inconsistency was provided.
Step 3 Changed E1 capacities to ensure that both NEs had the same E1 capacity.
----End
Fault Symptoms
A mobile backhaul network was reconstructed into a packet network. The original 2G BTS services were transmitted through CES E1 services. After the services of site A were cut over, the services became unavailable and the BTS failed to start.
Procedure
Step 1 Checked the service configuration. To be specific, performed an LSP ping test and PW ping
test to check packet service configuration. The test results showed that the packet service
configuration was correct.
Step 2 Checked the CES service configuration. The frame format of the port on the RTN equipment was set to CRC4 multiframe. Changed the frame format to unframed and the CES emulation mode from CESoPSN to SAToP. The BTS services recovered.
Step 3 Communicated with wireless engineers and found that the frame format was set to double
frame for E1 signals of the BTS. Different frame formats caused the service interruption.
----End
Fault Symptoms
The radio link in 1+1 protection between NE549 and NE606 became faulty, resulting in a
service interruption. The faulty radio link automatically recovered 5 minutes later.
(Figure: 1+1 protection configuration, showing the IF1A/IF1B boards, the SD1, SL1, and PD1 boards, and the ODUs at the two NEs.)
Procedure
Step 1 Checked historical alarms of the two NEs.
NE549 reported a RADIO_MUTE alarm when the radio link was interrupted.
Step 2 Found that a command muting an ODU had been executed before the radio link was interrupted. This misoperation triggered the RADIO_MUTE alarm.
Step 3 Checked the switching state of the 1+1 protection group because a RADIO_MUTE alarm
should have triggered a 1+1 protection switchover.
The 1+1 protection group on NE549 was in the forced switching state and was kept working
on the main channel, so the RADIO_MUTE alarm could not trigger a 1+1 protection
switchover.
NOTE
An NE automatically unmutes its ODU 5 minutes (the default time) after the ODU is muted. This explains
why the radio link between NE549 and NE606 automatically recovered 5 minutes after the link interruption.
Step 4 Cleared the forced switching state of the 1+1 protection group on NE549 so the protection
group entered the automatic switching state.
----End
Fault Symptoms
See the following networking diagram. An OptiX RTN 620 (NE A) was deployed at site A
and an OptiX RTN 605 1F (NE B) was deployed at site B. The board in slot 5 on NE A
interworked with the board in slot 8 on NE B through an air interface. Certain base stations
traced clock signals from NE A and NE B, and the clock signals became abnormal.
Procedure
Step 1 Checked the alarms of the NEs.
l NE A reported HARD_BAD alarms from February 28 to April 11 and SYN_BAD
alarms in May. The value of the first parameter of HARD_BAD alarms was 6, indicating
that the digital phase-locked loop (PLL) was abnormal. The SYN_BAD alarms indicated
that the traced clock source deteriorated.
l NE B reported an RP_LOC alarm, indicating that the clock signals received from the
PLL were lost.
Step 2 The preceding information showed that NE A traced the link clock from its IF board in slot 5 and NE B traced the link clock from its IF board in slot 8. The two IF boards interworked with each other through an air interface, so the two NEs traced each other's clock. In the case of clock interlocking, a small frequency deviation is gradually enlarged and finally falls out of the permitted range.
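The gradual enlargement of a small frequency deviation under clock interlocking can be illustrated with a toy accumulation model; the per-round error and the permitted range below are illustrative assumptions, not equipment specifications.

```python
def rounds_until_out_of_range(step_error_ppm=0.01, limit_ppm=4.6):
    """Toy model of clock interlocking: each round, the two NEs re-trace
    each other's clock and the shared deviation grows by a small error,
    until it finally leaves the permitted range."""
    deviation, rounds = 0.0, 0
    while abs(deviation) <= limit_ppm:
        deviation += step_error_ppm
        rounds += 1
    return rounds

print(rounds_until_out_of_range())
```

The point of the model is only that the deviation never settles: with no independent reference in the loop, even a tiny per-round error accumulates without bound.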
Step 3 Changed the clock configuration of NE A so that NE A traced the link clock from its IF board in slot 6.
The fault was rectified.
----End
(Figure: networking diagram showing NodeB 1, NE B and NE A (both OptiX RTN 910), an OptiX PTN 1900, a clock cable, and an FE connection.)
Procedure
Step 1 Checked cable connection at NodeB 1 because it reported an alarm indicating the loss of
clock signals from an external clock port.
The external clock port on NodeB 1 was a 75-ohm coaxial port and the external clock port on
NE B was a 120-ohm twisted-pair port. To connect the external clock port on NE B to the
external clock port on NodeB 1, an impedance converter box (Balun-box) was installed on the
external clock port of NE B.
The wire connection diagram of the converter box shows that the Tx wire from NE B was
connected to the Rx end of the converter box and the Rx wire from NE B was connected to
the Tx end of the converter box. Cable connection examination showed that the Tx wire from
the converter box was connected to the Rx end of NodeB 1 and the Rx wire from the
converter box was connected to the Tx end of NodeB 1. As a result, the Tx wire from NE B
was connected to the Tx end of NodeB 1 and the Rx wire from NE B was connected to the Rx
end of NodeB 1, so signals were unavailable.
(Figure: wire connections of the converter box, with pins 4/5 Tx and 1/2 Rx, comparing the correct and incorrect Tx/Rx connections.)
Step 2 Corrected the cable connection. NodeB 1 could trace clock signals normally.
----End
l The Tx port at the local end is connected to the Rx port at the opposite end, and the Rx
port at the local end is connected to the Tx port at the opposite end.
l If two ports are connected using a twisted pair, the positive end of the local port is
connected to the positive end of the opposite port, and the negative end of the local port
is connected to the negative end of the opposite port.
Procedure
Step 1 The server could be pinged, but services were unavailable. This is usually caused by packet loss. Suspected that insufficient radio link capacity caused congestion and consequently resulted in packet loss.
Step 2 Queried the air interface bandwidth utilization of the radio link and found that no congestion
occurred on the link.
Step 3 Queried the RMON performance statistics of the interconnected Ethernet ports of the RTN
and PTN equipment and found statistics about oversized packets and corrupted packets.
Step 4 Queried the port configurations. The maximum frame length configured for the Ethernet port
of the RTN equipment was 1522 bytes, and that configured for the Ethernet port of the PTN
equipment was 1620 bytes. The maximum frame lengths configured for the interconnected
ports were inconsistent.
Step 5 Changed the maximum frame length to 1620 bytes for the Ethernet port of the RTN
equipment. Users could access services on the server.
----End
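The root cause found in steps 4 and 5 can be expressed as a simple consistency check; the port names follow the case above, while the dictionary layout is an illustrative assumption.

```python
def max_frame_lengths_match(ports):
    """Interconnected Ethernet ports must be configured with the same
    maximum frame length; otherwise frames longer than the smaller value
    are counted as oversized and discarded."""
    return len(set(ports.values())) == 1

ports = {"RTN Ethernet port": 1522, "PTN Ethernet port": 1620}
print(max_frame_lengths_match(ports))  # -> False (the mismatch in step 4)

ports["RTN Ethernet port"] = 1620      # the correction made in step 5
print(max_frame_lengths_match(ports))  # -> True
```

With the mismatch, frames between 1523 and 1620 bytes pass the PTN port but are dropped as oversized by the RTN port, which matches the oversized-packet RMON statistics found in step 3.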
A Appendix