
SingleRAN

Fault Management Feature Parameter Description

Issue 01
Date 2014-04-26

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2014. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or representations
of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: http://www.huawei.com
Email: support@huawei.com

Issue 01 (2014-04-26) Huawei Proprietary and Confidential i



Contents

1 About This Document
1.1 Scope
1.2 Intended Audience
1.3 Change History
1.4 Differences Between Base Station Types
2 Fault Management Architecture
3 Basic Concepts
4 NE Fault Management
4.1 Overview
4.2 Maintenance Mode Alarm
4.2.1 Basic Concepts
4.2.2 Technology Description
4.3 Fast Fault Diagnosis and Troubleshooting
4.4 OML Identification
4.4.1 Function Description
4.4.2 Engineering Guidelines
5 Fault Management of the U2000
6 Troubleshooting
6.1 Procedures and Principles
6.2 Fault Location and Troubleshooting
7 Schemes for Deleting Alarms and Alarm Location Parameters
8 Parameters
9 Counters
10 Glossary
11 Reference Documents


1 About This Document

1.1 Scope
This document describes fault management, including its implementation principles and basic
architecture.

This document covers the following features:

• MRFD-210304 Fault Management
• LBFD-004006 Fault Management
• TDLBFD-004006 Fault Management

In this document, the following naming conventions apply for LTE terms.

Includes FDD and TDD    Includes FDD Only    Includes TDD Only
LTE                     LTE FDD              LTE TDD
eNodeB                  LTE FDD eNodeB       LTE TDD eNodeB
eRAN                    LTE FDD eRAN         LTE TDD eRAN

In addition, the "L" and "T" in RAT acronyms refer to LTE FDD and LTE TDD, respectively.

1.2 Intended Audience


This document is intended for personnel who:

• Need to understand the features described herein
• Work with Huawei products


1.3 Change History


This section provides information about the changes in different document versions. There are
two types of changes, which are defined as follows:

• Feature change
Changes in features of a specific product version
• Editorial change
Changes in wording or addition of information that was not described in the earlier version

SRAN9.0 01 (2014-04-26)
This is the first official release. This issue does not include any changes.

SRAN9.0 Draft B (2014-02-28)


This issue includes the following changes.

Change Type        Change Description                                    Parameter Change
Feature change     None                                                  None
Editorial change   Added the description of the feature and function     None
                   differences between site types. For details, see
                   section 1.4 Differences Between Base Station Types.

SRAN9.0 Draft A (2014-01-20)


Compared with Issue 01 (2013-09-30) of SRAN8.0, Draft A (2014-01-20) of SRAN9.0 includes
the following changes.

Change Type        Change Description                                    Parameter Change
Feature change     Huawei mobile network management system M2000 is      None
                   renamed U2000.
                   Added support for the Fault Management feature in
                   LTE TDD eRAN mode.
                   Added the dashboard function for Fast Fault
                   Diagnosis and Troubleshooting. For details, see
                   section 4.3 Fast Fault Diagnosis and Troubleshooting.
                   Modified the schemes for deleting alarms and alarm
                   location parameters. For details, see chapter 7
                   Schemes for Deleting Alarms and Alarm Location
                   Parameters.
Editorial change   Deleted the descriptions of micro base stations'      None
                   support for Fault Management.

1.4 Differences Between Base Station Types


Feature Support by Macro, Micro, and LampSite Base Stations
Feature ID       Feature Name       Supported by Macro   Supported by Micro   Supported by LampSite
                                    Base Stations        Base Stations        Base Stations
MRFD-210304      Fault Management   Yes                  Yes                  Yes
LBFD-004006      Fault Management   Yes                  Yes                  Yes
TDLBFD-004006    Fault Management   Yes                  No                   No

Function Implementation in Macro, Micro, and LampSite Base Stations


Micro base stations work in either UMTS-only or LTE FDD-only mode and do not support
GSM, LTE TDD, multimode, co-MPT, or separate-MPT scenarios. As integrated entities, micro
base stations do not involve such concepts as boards, cabinets, subracks, slots, or RRUs.


2 Fault Management Architecture

Fault management detects and records device faults and notifies users of the detected faults and
related troubleshooting methods. This helps maintenance personnel quickly locate and rectify
faults.
Fault management works in the following layers based on 3GPP specifications:
• Network element layer (NEL)
• Element management layer (EML)
• Network management layer (NML)
Figure 2-1 shows the fault management architecture.

Figure 2-1 Fault management architecture

The fault management module consists of the following components: external alarm, internal
alarm, alarm list, alarm log, alarm filter, Itf-S, and Itf-N. Table 2-1 describes the components.


Table 2-1 Components in the fault management module

Component        Description
External alarm   Handles alarms reported by peripheral devices, such as the environment monitoring device.
Internal alarm   Handles alarms reported by base station controllers and base stations.
Alarm list       Lists IDs, names, severities, and OSS categories of all reported alarms.
Alarm log        Records the detailed information about each alarm.
Alarm filter     Filters out alarms according to the preset filtering criteria.

NEL
The NEL is where most alarms are generated. Most of these alarms come from the main
devices of the NEs and from peripherals, such as the environment monitoring device. The NEs
mainly include the base station controllers and base stations.

After detecting exceptions, an NE device first filters and judges them based on preset rules.
Exceptions that cannot be resolved are defined as faults. NE devices can rectify some faults
directly. When faults need to be rectified manually or by other automation devices, alarms
are reported.

Figure 2-2 shows implementation of fault management on the NEL, using Huawei wireless
multi-mode BSC (MBSC) as an example.

Figure 2-2 Fault management on the NEL

As shown in Figure 2-2, an MBSC includes the following devices:


• Operation and maintenance unit (OMU) boards
• GE switching network and control unit (SCU) boards
• Service processing boards
• Monitoring devices
The OMU collects information about faults detected on the preceding devices, configures the
correlation for alarms and events, and post-processes the faults before reporting alarms to the
U2000.

EML
A device vendor generally provides the EML to manage its own NEs. Certain EMLs can
manage devices of multiple vendors.
On the EML, alarms are received, stored, and filtered. Alarms are dispatched through the
northbound interface.
The Huawei EML is the iManager U2000. Figure 2-3 shows the implementation of fault
management on the EML, using the Huawei iManager U2000 as an example. Fault management
of the U2000 involves alarm/event setting, alarm/event reporting, and alarm/event notification.

Figure 2-3 Fault management of the U2000


NML
In normal cases, telecom operators centrally manage their devices on the network management
layer (NML) by using an NMS. The devices are deployed on various networks, such as the radio
access network (RAN), core network, and transport network.

Fault management is an important function of the NMS. With this function, the NMS can receive,
filter, and store alarms generated on devices of multiple vendors and fields, and dispatch work
orders for these alarms.


3 Basic Concepts

Common Concepts
• Fault
A fault is a physical or logical factor that prevents a system from running properly. The
fault is displayed as an error. A fault can trigger an alarm or event.
– Alarm
An alarm is reported to the element management system (EMS) when a device incurs
a fault or an exception that needs to be rectified manually or using automation devices.
An alarm has two states: generated and cleared. If an alarm is generated, it must be
cleared.
Alarm parameters include Alarm Name, Alarm raised time, Location info, Cleared Time,
and Cleared Type. The first two parameters are available for generated alarms, and the
remaining parameters are available for cleared alarms.
– Event
An event refers to the information that is generated on an NE while the network is
running. The information does not indicate a fault, and therefore you do not need to
dispatch a work order for events. You can use the information as reference for
troubleshooting.
Event parameters include Event Name, Event raised time, and Location info.
• Current alarm
Current alarms are persistent or unacknowledged alarms on the OSS side.
Current alarms apply only to the EMS, because acknowledgment information is not saved
on the NE side.
• Active alarm
Active alarms are alarms that have not been cleared on the NE.
• Duplicate alarm
Duplicate alarms are new alarms whose alarm types, alarm sources, key location
parameters, and clearance types are the same as those of existing alarms.
• Common alarm
Common alarms are alarms reported by common devices, such as power supply
and temperature control devices, in multi-mode scenarios.


Common alarms are identified by the Common Alarm field. The value in this field also
indicates the RATs involved.
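
The definition of a duplicate alarm above can be illustrated with a minimal Python sketch. The Alarm class, its field names, and the helper functions are hypothetical and chosen only for illustration; they do not represent the NE or U2000 data model. The sketch assumes that two alarms are duplicates when the four attributes named in the definition are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record; field names are illustrative only."""
    alarm_type: str
    source: str
    key_location: tuple
    clearance_type: str
    name: str = ""


def duplicate_key(alarm):
    """Return the attributes that the duplicate-alarm definition compares."""
    return (alarm.alarm_type, alarm.source, alarm.key_location, alarm.clearance_type)


def is_duplicate(new_alarm, existing_alarms):
    """True if new_alarm matches an existing alarm on all four compared attributes."""
    existing_keys = {duplicate_key(a) for a in existing_alarms}
    return duplicate_key(new_alarm) in existing_keys
```

In this sketch, attributes outside the four compared ones (here, the name) do not affect the result.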

Concepts Related to Alarm Attributes


• Alarm ID
In a product, an alarm ID uniquely identifies an alarm and represents a specific fault or event.
You can determine the information about an alarm based on its alarm ID, such as the alarm
name, clearance suggestion, alarm cause, and location information. However, the specific
values of the location information cannot be determined from the alarm ID.
An alarm ID ranges from 1 to 65533. The alarm ID of a generated alarm is the same as that
of the corresponding cleared alarm.
• Alarm name
In a product, an alarm name uniquely identifies an alarm and clearly and accurately
represents the significance of an alarm. An alarm name maps to an alarm ID. The alarm name
of a generated alarm is the same as that of the corresponding cleared alarm. For user-defined
alarms, the relationships between alarm names and alarm IDs can be configured on the
EMS.
• Alarm severity
An alarm severity indicates how severely a fault affects services. Four alarm severities are
defined in descending order: critical, major, minor, and warning.
– Critical: Faults affect system services and must be rectified immediately. If a device
malfunctions or a certain type of resource becomes unavailable, troubleshoot the faults
immediately even in off-work hours.
– Major: Faults affect quality of service (QoS) and need to be rectified immediately. If
the QoS of a device or resource deteriorates, troubleshoot the faults in work hours.
– Minor: Faults may affect QoS. Troubleshoot the faults at an appropriate time or continue
to observe the alarms. For example, if a packet loss alarm is reported, you need to check
the settings of bit error rate (BER) thresholds, view the onsite BER, and determine the
impact.
– Warning: Faults may affect system services due to potential errors. Troubleshoot the
faults based on error information. For example, if an alarm indicating insufficient
redundant power supply is reported, services may be affected later. In this case, handle
the alarm before services are affected.
• Alarm type
Alarms and events can be further categorized as follows according to their sources:
– Power system: alarms for faults in the power system (providing -48 V DC power supply).
– Environmental system: alarms for faults related to the device environment, such as the
temperature, humidity, and access control.
– Signaling system: alarms for faults related to signaling, such as channel associated
signaling (No.1) and common channel signaling (No.7).
– Trunk system: alarms for faults on trunk circuits and trunk boards.
– Hardware system: alarms for faults on the hardware, such as boards, clocks, and CPUs.
– Software system: alarms for software-related faults.
– Operating system: alarms for faults that occur while a system is running.


– Communication system: alarms for faults in the communication system, such as network
cable disconnection and network equipment fault.
– QoS: alarms for QoS-related faults.
– Processing error: alarms for system processing errors.
– Integrity: one type of security alarm, which indicates that information may be modified,
inserted, or deleted illegally.
– Operations: one type of security alarm, which indicates that services are unavailable or
unreachable due to incorrect operations, faults, or other unknown reasons.
– Physical system: one type of security alarm, which indicates that physical resources are
damaged by suspicious hacker attacks.
– Security: one type of security alarm, which indicates that the RAN system has suffered
hacker attacks.
– Time domain: one type of security alarm, which indicates that events occur at the time
when the events need to be avoided or are forbidden.
• Alarm log
Alarm logs record the alarms generated in the RAN system, including cleared and uncleared
alarms and all events. However, the following suppressed alarms are excluded:
– Alarms whose Shielded Flag is set to Shield
– Alarms that are suppressed during alarm oscillation processing
– Alarms that are suppressed during alarm correlation processing
• Alarm generation time
Alarm generation time marks the point when an alarm or event is generated. Alarm generation
time is the current time of the module or device where an alarm is generated. For example,
if an alarm is generated during local operation and maintenance, the local time is used; if
an alarm is generated on the host, the host time is used.
• Alarm clearance time
Alarm clearance time marks the point when an alarm is cleared. For the alarms cleared
normally, the alarm clearance time is the current time of the module or device where an
alarm is located. For alarms cleared during local operation and maintenance, the alarm
clearance time is the local time.
• Alarm clearance type
This concept indicates how an alarm is cleared, including the following types:
– Normal clearance
If a cleared alarm is received, the corresponding alarm is cleared. Alternatively, the OMC
automatically clears the alarms that have been cleared on the NE but are not cleared on the
NMS based on the active alarm list synchronized from NEs.
– Reset clearance
If alarms are detected again on a device after the device restarts, old alarms are
automatically cleared.
– Manual clearance
If you manually clear an alarm, the alarm is displayed as a cleared alarm on the LMT
even though the fault persists. Therefore, you are advised to confirm that the fault has
been rectified before manually clearing an alarm.


– Configuration clearance
After an object is deleted, alarms for the object are automatically cleared.
– Correlation clearance
After receiving the root alarm of an uncleared correlative alarm, the fault management
system automatically clears the correlative alarm when reporting the root alarm. The
correlative alarm is deleted from GUIs. If alarm correlation is not configured, this alarm
clearance type is unnecessary.
– Overwrite clearance
The oldest alarms are overwritten due to limited hard disk space for NEs. If active alarms
are overwritten, the NEs automatically clear them.
– State-switching clearance
During a device status switchover, the active alarms in the previous state are
automatically cleared. The alarms in the current state are reported.
• Location information
Location information refers to the alarm information about products and services, such as
the CPU ID, board type, specific error code, and other information used for fault
troubleshooting, including the temperature. The location information in the alarm clearance
report is the same as that in the corresponding alarm report.
The location information can be empty.
• Alarm serial number
An alarm serial number records the sequence of alarms generated on an NE and uniquely
identifies an alarm in the alarm log. The same serial number is used in alarm generation,
alarm clearance, and alarm change including alarm severity change and location
information change.
• Additional information
Users have configured alarm location information on NEs when creating alarms. However,
telecom operators require special information in certain cases. The special information can
be reported as additional information, which is also configured on NEs. The NMS cannot
identify additional information, so the NMS needs to parse the additional information by
character string. Each item in the additional information must be in the format of <item
name=><value> and contains a maximum of 500 bytes.
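
Two of the alarm attributes above lend themselves to a short illustration: the alarm ID range (1 to 65533) and the descending order of the four alarm severities. The following Python sketch is a minimal example under stated assumptions. The numeric values assigned to the severities, the tuple layout of an alarm, and all function names are hypothetical and are not taken from any Huawei interface.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severities in descending order; the numeric values are an assumption."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4


# Alarm ID range as stated in the text above.
ALARM_ID_MIN = 1
ALARM_ID_MAX = 65533


def is_valid_alarm_id(alarm_id):
    """Check that an alarm ID lies in the documented range 1 to 65533."""
    return ALARM_ID_MIN <= alarm_id <= ALARM_ID_MAX


def sort_by_severity(alarms):
    """Sort (alarm_id, severity) tuples so that the most severe alarms come first."""
    return sorted(alarms, key=lambda alarm: alarm[1])
```

Sorting by severity in this way reflects the handling priority described for each severity level, with critical alarms listed before warnings.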

Concepts Related to Alarm Reporting and Processing Mechanism


• Alarm toggling
In a specified period, a managed object (MO) alternately generates the same alarm and
clears the alarm multiple times, or repeatedly generates the same event.
• Alarm change
Only the alarm severity of an uncleared alarm can be changed. After the alarm severity of
an alarm is changed, the alarm needs to be reported again, containing the time when the
alarm severity is changed. The alarm serial number of the reported alarm is the same as
that of the alarm when it is generated. However, the alarm synchronization number needs
to be allocated again.
• Alarm correlation


Alarms may be correlated with each other. For example, a fault for which an alarm is
generated may cause another fault that triggers a new alarm. Also, different alarms may be
generated for the same fault.
If the fault that causes alarm A occurs, alarms A, B, C, and D may be reported
simultaneously. If alarm B is generated because of the fault that causes alarm A, alarm C
is generated because of the fault that causes alarm B, and alarm D is generated because of
the fault that causes alarm C, alarm A is the root alarm for alarms B, C, and D, and alarms
B, C, and D are correlative alarms for alarm A.
Alarm correlation refers to the configuration of alarm identifiers that indicate the
relationship between alarms, such as root alarms and correlative alarms. The fault
management system can identify the relationship between alarms based on the identifiers
and discard certain alarms as required. If an alarm is a correlative alarm and its root alarm
can be identified by NEs, the source of the alarm is set to correlative alarm, and a root alarm
serial number can be added to the alarm message for you to efficiently locate the root alarm
and rectify the fault. If an alarm is a root alarm, the source of the alarm is set to root alarm.
If an alarm has no root or correlative alarm, for example, an environment alarm, the source
of the alarm is set to independent static alarm.
If multiple root alarms or correlative alarms are involved, NEs provide a correlative alarm
group ID to identify a group of correlated alarms. For example, the preceding alarms A, B,
C, and D have the same correlative alarm group ID. If you have purchased the Efficient
Trouble Ticket license, the U2000 provides the alarm correlation view, where alarms with
the same correlative alarm group ID are displayed centrally. This improves the efficiency
of routine troubleshooting and trouble ticket dispatch on the U2000. The U2000 can also
send correlative alarm group IDs to the NMS through the northbound interface, enabling
telecom operators to dispatch trouble tickets from the NMS. This improves the efficiency
of trouble ticket dispatch and troubleshooting on the NMS.
In addition to the alarms that can be identified as root alarms or correlative alarms, some
alarms may be generated for the same fault. For example, multiple optical port tributary
alarms are generated on the same optical interface board. If you have purchased the
Efficient Trouble Ticket license, the U2000 can combine multiple alarms generated for
the same fault within a certain period into a minimum of one alarm.
In the scenario where a fault (for example, the fault on transmission equipment) frequently
occurs, and the base station and base station controller report alarms indicating the same
fault separately, you can import an inter-NE alarm correlation rule into the U2000 so that
correlated alarms reported by the base station and base station controller can be added to
the same correlative alarm group. Alternatively, you can discard alarms on the base station
or the base station controller. The predefined default correlation rules are available only
after you purchase the Efficient Trouble Ticket license.
• Alarm synchronization number
An alarm synchronization number records the sequence in which alarm messages are reported
to the EMS and ensures that once an alarm is generated on an NE, it is reported to the EMS
immediately. The alarm synchronization numbers in an alarm generation message and the
corresponding alarm clearance message are different. The alarm synchronization number
ranges from 1 to 0x7ffffffe and is allocated in a cyclical order.
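
The cyclical allocation of alarm synchronization numbers can be sketched as follows. This is a minimal Python illustration that assumes the number returns to 1 after reaching 0x7ffffffe; the function name is hypothetical, and the actual allocation logic on an NE is not described in this document.

```python
# Range of the alarm synchronization number as stated in the text above.
SYNC_MIN = 1
SYNC_MAX = 0x7FFFFFFE


def next_sync_number(current):
    """Return the synchronization number that follows current.

    Assumption: after SYNC_MAX, allocation wraps around to SYNC_MIN.
    """
    if not SYNC_MIN <= current <= SYNC_MAX:
        raise ValueError("synchronization number out of range")
    return SYNC_MIN if current == SYNC_MAX else current + 1
```

Because an alarm generation message and its clearance message are reported separately, each of them would receive its own number from such a sequence, which is consistent with the statement that the two numbers differ.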


4 NE Fault Management

4.1 Overview
Fault Management provides the following basic functions:

Fault detection
After detecting faults, a fault detection unit reports the faults to the fault management module.
Then, the fault management system reports alarms for these faults to the U2000 or local
maintenance terminal (LMT) after processing the faults on each layer. Fault detection units can
detect faults of all MOs including software and hardware, such as TRXs, ports, channels, boards,
BTSs, cells, links, and signaling messages.

Duplicate fault/alarm filtering, the fault/alarm transient rule, and the fault/alarm
toggle rule
There are two filtering stages: primary filter and secondary filter. In the primary filter, fault
detection units filter duplicate faults and other faults using the transient rule and toggle rule. In
the secondary filter, alarms to be reported are filtered.

• Transient rule
Faults or alarms of short duration can be filtered based on the alarm or fault generation
delay. Only the faults or alarms whose duration exceeds the threshold of the generation
delay comply with the transient rule and are retained for the next filtering stage.
As shown in Figure 4-1, the duration of fault 1 or alarm 1 is shorter than the delay threshold
T, so fault 1 or alarm 1 is discarded. The duration of fault 2 or alarm 2 is longer than T, so
alarm 2 or an alarm for fault 2 can be reported.


Figure 4-1 Principles of the transient rule

On a base station, you can run the SET ALMFILTER command to set parameters related
to alarm filtering based on the alarm transient rule.
On a base station controller, you can run the SET ALMBLKSW and SET
ALMBLKPARA commands to set switches and parameters related to alarm filtering based
on the alarm transient rule:
– SET ALMBLKSW allows you to set Alarm switch of blinking filter
(BLKFILTERSW), Switch of statistics blinking alarm (BLKSTATSW), and
Observing time window of statistical alarm (BLKSTATPRD).
– SET ALMBLKPARA allows you to set Alarm ID (AID), Intermittent alarm
generating threshold (BLKPRD), and alarm statistics thresholds. The alarm statistics
thresholds include: Upper threshold for accumulated fault occurrences
(CNTRISTHRD), Lower threshold for accumulated fault occurrences
(CNTSTLTHRD), Upper threshold for accumulated fault duration
(TMRISTHRD), Lower threshold for accumulated fault duration
(TMSTLTHRD).
NOTE

To enable the alarm statistics function, both Switch of statistics blinking alarm (BLKSTATSW) and
Alarm switch of blinking filter (BLKFILTERSW) must be set to ENABLE. Intermittent alarm
generating threshold (BLKPRD) must be set based on statistics on the intervals between the generation
and clearance of an intermittent alarm on the live network. In normal cases, Observing time window of
statistical alarm (BLKSTATPRD), Upper threshold for accumulated fault occurrences
(CNTRISTHRD), and Lower threshold for accumulated fault occurrences (CNTSTLTHRD) are set
to 3600s, 15, and 2, respectively.
• Toggle rule

Figure 4-2 Principles of the toggle rule

If the number of duplicate faults exceeds a threshold in a period T1, the duplicate faults are
filtered using the toggle rule. After that, one fault and an alarm for the fault are reserved, and


alarms for other duplicate faults are filtered. The fault detection units determine oscillation
termination conditions once oscillation starts. If the number of duplicate faults is within the
threshold in T2, the oscillation ends. The fault does not occur.

You can run the MML commands SET ALMOSCISW and SET ALMOSCITHRD on a base
station controller to set the switch and parameters related to alarm filtering based on the toggle
rule.
• SET ALMOSCISW allows you to set Alarm Oscillation Filtering Switch (SW).
• SET ALMOSCITHRD allows you to set Alarm ID (AID), Oscillation Entry Period
(INOSCPRD), Oscillation Entry Threshold (INOSCTHRD), Oscillation Exit Period
(OUTOSCPRD), and Oscillation Exit Threshold (OUTOSCTHRD).
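
The toggle rule described above can be pictured with a small state machine. The following Python sketch is illustrative only: the class name, the use of seconds, and the exact comparison operators are assumptions, and the four constructor arguments merely stand in for INOSCPRD, INOSCTHRD, OUTOSCPRD, and OUTOSCTHRD. As a simplification, faults that arrive before oscillation starts are all reported.

```python
from collections import deque


class OscillationFilter:
    """Toy toggle-rule filter for the duplicate faults of one alarm ID (hypothetical)."""

    def __init__(self, entry_period, entry_threshold, exit_period, exit_threshold):
        self.entry_period = entry_period        # T1, stands in for INOSCPRD
        self.entry_threshold = entry_threshold  # stands in for INOSCTHRD
        self.exit_period = exit_period          # T2, stands in for OUTOSCPRD
        self.exit_threshold = exit_threshold    # stands in for OUTOSCTHRD
        self.fault_times = deque()
        self.oscillating = False

    def _count_within(self, now, period):
        # Number of recorded faults that are less than `period` old.
        return sum(1 for t in self.fault_times if now - t < period)

    def on_fault(self, now):
        """Record a duplicate fault at time `now`; return True if its alarm is reported."""
        # Assumed exit check: few enough faults during the last T2 ends the oscillation.
        if self.oscillating and self._count_within(now, self.exit_period) <= self.exit_threshold:
            self.oscillating = False
        self.fault_times.append(now)
        # Forget faults that can no longer affect either window.
        horizon = max(self.entry_period, self.exit_period)
        while self.fault_times and now - self.fault_times[0] >= horizon:
            self.fault_times.popleft()
        if self.oscillating:
            return False  # alarm filtered while oscillating
        if self._count_within(now, self.entry_period) > self.entry_threshold:
            self.oscillating = True  # threshold exceeded within T1
            return False
        return True


flt = OscillationFilter(entry_period=60, entry_threshold=3, exit_period=120, exit_threshold=0)
reported = [flt.on_fault(t) for t in (0, 10, 20, 30, 40, 300)]
print(reported)
```

In this run, the fourth fault within 60 seconds starts the oscillation, the fifth is filtered, and the fault at 300 seconds is reported again because no fault occurred during the preceding exit period.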

Fault troubleshooting
Fault troubleshooting involves processes that include a device status switchover, fault isolation,
and automatic fault rectification. BTSs and BSCs filter faults and automatically rectify them
based on preset policies. If required, the preset policies can be modified by adjusting parameters.
When faults fail to be automatically rectified and manual interventions are required, alarms are
reported.

Alarm mapping
Alarm mapping is one of the core processes in fault management. Faults in the system are
independent of reported alarms, and only alarms are presented to you. Alarm mapping maps
faults to the alarms to be reported. Faults and events occur in the system and involve system
details, whereas alarms provide fault analysis results and are displayed in a uniform and simple
format. You can rectify faults based on alarms. Rather than obtaining system details, you only
need to locate the units where faults occur and that can be replaced or modified.

Alarm box management


Alarm box management provides functions such as specifying the severity of alarms to be
reported to the alarm box, resetting the alarm box, and querying the alarm box version. After
you specify the alarms of concern to be reported to an alarm box, the alarm box provides audible
and visual notifications so that you can rectify faults in time.

Alarm correlation
Alarm correlation is one of the core processes in fault management. This function filters out
non-root faults and presents root faults to you. A root fault generally triggers multiple correlative
faults. If alarm correlation is not performed, multiple alarms are reported, which affects fault
location.

Certain critical alarms, such as service-related alarms, cannot be suppressed based on alarm
correlation even if the critical alarms are generated for correlative faults that include physical
device faults or data transmission faults. These alarms carry the serial numbers of their root
alarms. In this way, the U2000 can present alarm correlations to maintenance personnel for fast
fault location and troubleshooting.

Figure 4-3 shows the principles of alarm correlation.

Figure 4-3 Principles of alarm correlation

Faults A, B, and C occurred in sequence and were detected at T2, T1, and T3, respectively. That
is, fault A was detected later than fault B. Based on the alarm correlation rules, fault A is the
root fault, fault B is a correlative fault of fault A, and fault C is a correlative fault of fault B.
The correlation analysis delays of fault B and fault C are Δtb and Δtc, respectively.

1. When root fault A was detected, the corresponding alarm A was reported directly.
2. The correlation analysis of fault B was performed at T1+Δtb, by which time root fault A had
been detected. Therefore, alarm B for fault B was suppressed.
3. The correlation analysis of fault C was performed at T3+Δtc, by which time root alarm A had
been reported. Therefore, alarm C was reported, carrying the serial number of alarm A (the
root alarm serial number).
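
The A/B/C example can be replayed with a short Python sketch. This is a simplified model, not the NE implementation: the Fault class, the numeric timestamps, and the suppressible flag are hypothetical. Marking fault C as non-suppressible is an assumption based on the earlier statement that certain critical alarms cannot be suppressed and instead carry the serial number of their root alarm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fault:
    name: str
    detected_at: float
    analysis_delay: float = 0.0    # correlation analysis delay (e.g. Δtb, Δtc)
    parent: Optional[str] = None   # fault that this fault is correlative to
    suppressible: bool = True      # assumption: critical alarms are not suppressible


def find_root(by_name, name):
    # Follow the correlation chain up to the root fault.
    while by_name[name].parent is not None:
        name = by_name[name].parent
    return name


def correlate(faults):
    """Return reported alarms as (fault name, serial number, root alarm serial number)."""
    by_name = {f.name: f for f in faults}
    serials = {}
    reports = []
    # Each fault is analyzed at detection time plus its analysis delay.
    for f in sorted(faults, key=lambda x: x.detected_at + x.analysis_delay):
        analysis_time = f.detected_at + f.analysis_delay
        root = find_root(by_name, f.name)
        root_known = root != f.name and by_name[root].detected_at <= analysis_time
        if root_known and f.suppressible:
            continue  # correlative alarm suppressed
        serial = len(serials) + 1
        serials[f.name] = serial
        reports.append((f.name, serial, serials.get(root) if root_known else None))
    return reports


# T1 = 1, T2 = 2, T3 = 3, Δtb = 3, Δtc = 2 (illustrative values)
faults = [
    Fault("A", detected_at=2),
    Fault("B", detected_at=1, analysis_delay=3, parent="A"),
    Fault("C", detected_at=3, analysis_delay=2, parent="B", suppressible=False),
]
print(correlate(faults))
```

With these values, alarm A is reported with serial number 1, alarm B is suppressed, and alarm C is reported with serial number 2 and root alarm serial number 1.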

Supporting common alarms in the SingleRAN solution


In a GSM/UMTS/LTE multimode base station, if three common alarms with the same
information are detected and the alarms are for GSM, UMTS, and LTE, respectively, an alarm
for only one RAT can be displayed. This prevents redundant work order dispatches. The RAT
in the alarm varies according to the multi-RAT priority settings.
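
As a minimal sketch of this selection, the following Python function keeps one alarm per identical piece of alarm information, choosing the RAT that ranks highest in a priority list. The dictionary layout and the example priority order are assumptions made for illustration. The actual multi-RAT priority settings are configured on the base station.

```python
def select_common_alarms(alarms, rat_priority):
    """Keep one alarm per identical alarm information, preferring the highest-priority RAT."""
    rank = {rat: i for i, rat in enumerate(rat_priority)}
    chosen = {}
    for alarm in alarms:
        key = alarm["info"]
        best = chosen.get(key)
        if best is None or rank[alarm["rat"]] < rank[best["rat"]]:
            chosen[key] = alarm
    return list(chosen.values())


detected = [
    {"rat": "GSM", "info": "common alarm on shared unit"},
    {"rat": "UMTS", "info": "common alarm on shared unit"},
    {"rat": "LTE", "info": "common alarm on shared unit"},
]
# Hypothetical priority order: LTE first, then UMTS, then GSM.
print(select_common_alarms(detected, ["LTE", "UMTS", "GSM"]))
```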

Alarm synchronization between the MBSC, NodeB, eNodeB, and eGBTS and the U2000
After an NE is created on or reconnected to the U2000, the U2000 synchronizes alarms and
events from the NE. In the scenario where the U2000 fails to receive alarms reported by an NE
in real time due to network exceptions or other unknown reasons, alarm synchronization is also
required.

Alarm synchronization between the GBTS and the U2000


Alarms in a GBTS can be managed only by the MBSC. Therefore, alarm synchronization
between a GBTS and the U2000 consists of two stages:

• Alarm synchronization between the GBTS and the MBSC: The MBSC queries active
alarms from the GBTS, issues a command to the GBTS to check for unsynchronized alarms,
and updates alarm records on the MBSC based on the check result.
• Alarm synchronization between the MBSC and the U2000.
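
Conceptually, each synchronization stage reconciles the alarm records held by the managing side with the alarms that are active on the managed side. The following Python sketch shows that idea with sets of alarm serial numbers. The function name and the data representation are assumptions, and the actual messages exchanged between the GBTS, MBSC, and U2000 are not described here.

```python
def reconcile_alarms(manager_records, ne_active):
    """Return (updated records, alarms to add, alarms to clear) after one synchronization."""
    to_add = ne_active - manager_records      # active on the NE but unknown to the manager
    to_clear = manager_records - ne_active    # recorded by the manager but no longer active
    updated = (manager_records | to_add) - to_clear
    return updated, to_add, to_clear


mbsc_records = {101, 102, 103}   # hypothetical alarm serial numbers on the MBSC
gbts_active = {102, 103, 104}    # hypothetical active alarms on the GBTS
print(reconcile_alarms(mbsc_records, gbts_active))
```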

Alarm severity change


Based on 3GPP specifications, the alarm severity of an uncleared alarm can be changed. After
the alarm severity is changed, an alarm severity change message is reported.

User-defined alarms
Base stations and MBSCs can be connected to external environment monitoring devices to
monitor the environment and device status, such as the temperature, humidity, voltage, theft,
and smoke. You can define alarms on BTSs and MBSCs for faults related to the status of the
environment and devices. You can also set parameters for these alarms, such as the alarm name,
alarm severity, and network management type. In this way, you can dynamically monitor the
environment and devices.

Before configuring a user-defined alarm, you need to run the MML command ADD EMU to
add an environment monitoring unit (EMU). Table 4-1 and Table 4-2 list the parameters for an
EMU on a base station controller and a base station, respectively.

Table 4-1 Parameters for an EMU on a base station controller

Parameter Type    Parameter

Switch            • Enable Door Status Alarm Reporting (DOOR_ENGINE_MASK)
                  • Enable Humidity Alarm Reporting (HUM_MASK)
                  • Enable Infrared Sensor Alarm Reporting (INFRA_RED_MASK)
                  • Enable Smoke Alarm Reporting (SMOKE_MASK)
                  • Enable Temperature Alarm Reporting (TEMP_MASK)
                  • Enable Water Alarm Reporting (WATER_MASK)
                  • Switch for Relay 1 (POWER_RELAY1) to Switch for Relay 6
                    (POWER_RELAY6)
                  • Enable Alarm Reporting for 24V Power (VOL24_MASK)
                  • Enable Alarm Reporting for 48V Power (VOL48_MASK)

Threshold         • Upper Limit of Signal Output of External Analog 1
                    (EX_ANO1_SIG_MAX) to Upper Limit of Signal Output of External
                    Analog 4 (EX_ANO4_SIG_MAX) (four parameters)
                  • Lower Limit of Signal Output of External Analog 1
                    (EX_ANO1_SIG_MIN) to Lower Limit of Signal Output of External
                    Analog 4 (EX_ANO4_SIG_MIN) (four parameters)
                  • Upper Limit of Measurement Range of External Analog 1
                    (EX_ANO1_VAL_MAX) to Upper Limit of Measurement Range of
                    External Analog 4 (EX_ANO4_VAL_MAX) (four parameters)
                  • Lower Limit of Measurement Range of External Analog 1
                    (EX_ANO1_VAL_MIN) to Lower Limit of Measurement Range of
                    External Analog 4 (EX_ANO4_VAL_MIN) (four parameters)
                  • Upper Limit of Humidity Alarm (HUM_THD_HIGH) and Lower Limit
                    of Humidity Alarm (HUM_THD_LOW)
                  • Upper Limit of Temperature Alarm (TEMP_THD_HIGH) and Lower
                    Limit of Temperature Alarm (TEMP_THD_LOW)
                  • Upper Limit of Alarm for 24V Power (VOL24_THD_HIGH) and Lower
                    Limit of Alarm for 24V Power (VOL24_THD_LOW)
                  • Upper Limit of Alarm for 48V Power (VOL48_THD_HIGH) and Lower
                    Limit of Alarm for 48V Power (VOL48_THD_LOW)

Sensor            Sensor Type of External Analog 1 (EX_ANO1_TYPE) to Sensor Type of
                  External Analog 4 (EX_ANO4_TYPE) (four parameters)

Table 4-2 Parameters for an EMU on a base station

Parameter Type    Parameter

Switch            • Special Analog Alarm Flag (SAAF)
                  • Special Boolean Alarm Flag (SBAF)

Threshold         • Temperature Alarm Lower Threshold (TLTHD) and Temperature Alarm
                    Upper Threshold (TUTHD)
                  • Humidity Alarm Lower Threshold (HLTHD) and Humidity Alarm Upper
                    Threshold (HUTHD)

The methods of configuring user-defined alarms on base stations and base station controllers
are as follows:
• On a base station, you can run the SET ALMPORT command to bind a user-defined alarm
to a physical port, and then run the SET ENVALMCFG command to set the alarm name,
severity, and network management type.
• On a base station controller, you can run the SET ALMPORT command to configure the
environmental signal input port for environmental alarms. The related parameters are
Subrack No. (SRN), Port No. (PN), Switch (SW), Alarm ID (AID), Port Type (PT),
Alarm level (AVOL), Upper Limit (UL), and Lower Limit (LL). Then, run the SET
ENVALMPARA command to set the alarm name, severity, and network management
type.

You can also configure user-defined alarms on U2000 GUIs. For details about the user-defined
alarms and the configuration method, see section Fault Management > Fault Monitoring >
Setting Fault Monitoring Rules > Defining an NE Alarm in U2000 Help.

Alarm suppression
With this function, you can suppress unnecessary alarms by alarm ID or object.

• Suppressing alarms by alarm ID
If Shielded Flag of a specified alarm ID is set to Shielded, all the active alarms of the alarm
ID are cleared. During alarm suppression, no alarm will be reported even if the fault persists.
If the fault is not rectified after alarm suppression is disabled, alarms of the specified alarm
ID are reported again.
• Suppressing alarms by object
On the LTE side, you can suppress a certain alarm or all alarms for a certain board or port,
or suppress a certain alarm for all objects. On the UMTS or GSM side, you can suppress a
certain alarm or all alarms for a certain board, port, digital signal processor (DSP), cell, or
base station, or suppress a certain alarm for all objects.
You can run the following MML commands to enable alarm suppression:
– SET ALMSHLD: determines whether to suppress a specified alarm.
– ADD OBJALMSHLD: adds an alarm suppression rule for a specified object.
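
The two suppression modes can be modeled as a simple rule lookup. The following Python sketch is a hypothetical illustration: the class, field names, alarm IDs, and object strings are assumptions, and it does not reproduce the parameters of SET ALMSHLD or ADD OBJALMSHLD.

```python
from dataclasses import dataclass, field


@dataclass
class SuppressionRules:
    shielded_ids: set = field(default_factory=set)   # suppression by alarm ID (all objects)
    object_rules: set = field(default_factory=set)   # (object, alarm ID) or (object, None) for all alarms

    def is_suppressed(self, alarm_id, obj):
        if alarm_id in self.shielded_ids:
            return True
        return (obj, alarm_id) in self.object_rules or (obj, None) in self.object_rules


rules = SuppressionRules()
rules.shielded_ids.add(1001)                   # hypothetical alarm ID shielded for all objects
rules.object_rules.add(("board-0-3", 2002))    # one alarm on one board
rules.object_rules.add(("port-0-3-1", None))   # all alarms on one port

print(rules.is_suppressed(1001, "board-0-5"))    # suppressed by alarm ID
print(rules.is_suppressed(2002, "board-0-3"))    # suppressed by object rule
print(rules.is_suppressed(2002, "board-0-4"))    # not suppressed
```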

Fault log
Fault logs are classified into local fault logs and central fault logs. Local fault logs record faults
on faulty boards and are stored in a nonvolatile storage device. Central fault logs record the
information about all faults, based on which you can obtain all the fault information about an
NE.

4.2 Maintenance Mode Alarm


This section describes maintenance mode alarms, which apply to all the GSM, UMTS, and LTE
networks.

4.2.1 Basic Concepts

Maintenance Mode
The maintenance mode of an NE indicates whether the NE is under maintenance and which
maintenance operations are being performed on it. The RAN system enables the maintenance
mode handling mechanism for all NEs that are in maintenance mode. If an NE is in Normal mode,
no maintenance operation is being performed on the NE. Huawei provides pre-defined
maintenance modes, and telecom operators provide customized maintenance modes. The
pre-defined maintenance modes are Install, Testing, Upgrade, and Expand.

Maintenance Mode Alarm


When an NE is in maintenance mode, maintenance operations may cause exceptions on the NE
in a short period. If this occurs, a large number of alarms are reported. These alarms are identified
as maintenance mode alarms.

Maintenance mode alarms are automatically cleared after maintenance operations have been
performed.

In the past, maintenance mode alarms were not distinguished from the alarms generated during
NE operation and maintenance. Monitoring engineers had to manually filter all alarms by
referring to certain documents, such as maintenance plans, which increased their workloads.
To reduce their workloads, Huawei devices provide the maintenance mode
management feature, which helps monitoring engineers determine the alarm type based on
NE modes and configure policies for reporting and displaying alarms based on the alarm type.
Therefore, monitoring engineers do not need to handle maintenance mode alarms.

Maintenance personnel can view maintenance mode alarms directly on the LMT or on the U2000
by customizing the display policies.

4.2.2 Technology Description


After the maintenance mode management feature is enabled, NEs identify the data generated in
different modes and report it to the U2000 based on settings. The U2000 uses the appropriate
policies for displaying and reporting alarms based on identifiers carried with the alarms. For
example, maintenance mode alarms are not displayed on the U2000 by default or reported to
the northbound interface and the centralized monitoring system of telecom operators.
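
The default policy described above can be sketched as follows. The Python snippet is a hypothetical illustration: the "mode" field stands for the identifier carried with each alarm, and the monitor_maintenance flag stands for the user's choice to monitor maintenance mode alarms. Neither name comes from the U2000.

```python
def alarm_policy(alarm, monitor_maintenance=False):
    """Decide how a received alarm is handled, based on the NE mode it carries."""
    in_maintenance = alarm["mode"] != "Normal"
    handled = (not in_maintenance) or monitor_maintenance
    return {"display": handled, "forward_northbound": handled}


normal_alarm = {"name": "link fault", "mode": "Normal"}
upgrade_alarm = {"name": "link fault", "mode": "Upgrade"}

print(alarm_policy(normal_alarm))
print(alarm_policy(upgrade_alarm))
print(alarm_policy(upgrade_alarm, monitor_maintenance=True))
```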

The maintenance mode management feature is enabled only in topology management and alarm
management. For details, see Table 4-3.

Table 4-3 Application of the maintenance mode management feature on the U2000

Function                  Application

Topology management       You can filter NEs in a topology view by NE status. When an NE
                          that is in maintenance mode generates an alarm, the color of the
                          NE icon in the topology view does not change.

Performance management    When an NE is in maintenance mode, Reliability of the
                          performance result is displayed as Unreliable. Apart from the
                          preceding difference, the mechanisms for handling NEs in
                          maintenance mode and in normal mode are the same.

Alarm management          By default, the U2000 does not monitor maintenance mode alarms.
                          To monitor these alarms, set the filter criteria in the Advanced
                          dialog box. If the U2000 does not monitor maintenance mode
                          alarms, it does not display the received maintenance mode alarms
                          on the client, provide audible or visual alarm notifications, or
                          forward alarms to the NMS.
                          By using this feature, alarm monitoring engineers can concentrate
                          on normal mode alarms as well as maintenance mode alarms in
                          specific scenarios.

For details about maintenance mode alarms, see the U2000 Feature Description About the
Maintenance Mode Management.

4.3 Fast Fault Diagnosis and Troubleshooting


This section applies to the GSM and UMTS networks.

If a fault occurs on the RNC or network and you cannot quickly locate the NE or board where
the fault occurs, you cannot apply common measures, such as resetting, powering off,
or replacing the board, to restore services. In this case, you can use the fast fault diagnosis
function.

This function classifies faults on the RNC and network based on scenarios and provides different
fault diagnosis rules based on those scenarios. The fast fault diagnosis system analyzes fault
information based on counters, alarms, and logs of faults using an appropriate fault diagnosis
rule. This system then provides a fault analysis report, which helps maintenance personnel locate
the board or subsystem where the faults occur and allows them to take the necessary measures
to rectify the faults.

Based on fast fault diagnosis, the dashboard function provides the correlation analysis of global
KPIs, alarms, and operation logs in traffic statistics and displays the analysis results on the LMT,
as shown in Figure 4-4. You can view global KPIs in traffic statistics on the LMT and analyze
the impact of key global KPIs. By clicking a time when the key global KPIs deteriorated, you
can query the alarms and operation logs generated at that time. This helps you quickly locate
fault causes and restore services.

Figure 4-4 Viewing the correlation analysis results of global KPIs, alarms, and operation logs

Table 4-4 and Table 4-5 list fault scenarios for BSC6900 and BSC6910 where the fast fault
diagnosis function can be used.

Table 4-4 Fault scenarios for BSC6900

Mode    Scenario

UMTS    RRC connection setup
        CS service setup
        CS call drop
        PS service setup
        PS call drop
        CS Erlang
        PS throughput
        Paging
        A large number of unavailable cells
        Equipment health check
        RNC in Pool Load Sharing

GSM     GSM Ater Interface Interruption
        GSM A Interface Interruption
        GSM GB Interface Interruption
        GSM CS Traffic
        GSM BSC Board Repeat Fault
        GSM GB Interface Traffic

Table 4-5 Fault scenarios for BSC6910

Mode    Scenario

UMTS    RRC connection setup
        CS service setup
        CS call drop
        PS service setup
        PS call drop
        CS Erlang
        PS throughput
        Paging
        A large number of unavailable cells
        Equipment health check
        RNC in Pool Load Sharing

GSM     GSM A Interface Interruption
        GSM GB Interface Interruption
        GSM CS Traffic
        GSM BSC Board Repeat Fault
        GSM GB Interface Traffic

4.4 OML Identification


This section applies to the GSM network.

4.4.1 Function Description


The BSC identifies a BTS over the Abis interface based on the timeslot for the port where the
OML of the BTS is configured. If transmission connections to two BTSs with the same
configurations, including the site type, boards, cells, and TRXs, are reversed, the data
configurations that the BSC sends to the two BTSs are also reversed. However, no alarm is
reported because these two BTSs and their cells work properly. In this case, maintenance
personnel cannot quickly identify the incorrect transmission connections. In practice, the
reversed radio parameters for cells under these BTSs cause a decrease in the network KPIs and
problems such as co-channel interference, adjacent-channel interference, and handover failures.

The OML identification function allows the BSC to check the electronic label of a BTS after
the OML to the BTS is set up. If the BSC detects that the electronic label is inconsistent with
the configured one, the BSC does not allow the BTS to work and reports an alarm so that
maintenance personnel can quickly identify the incorrect transmission connections.
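
The check performed after OML setup amounts to comparing the electronic label reported by the BTS with the configured bar code. The following Python sketch illustrates the decision only. The function name, return values, and example bar codes are hypothetical, while the alarm name is the one used in the verification procedure of this section.

```python
MISMATCH_ALARM = "ALM-21821 Site Signaling Link Connection Mismatch"


def check_oml(configured_bar_code, reported_esn, detection_enabled=True):
    """Return (BTS allowed to work, alarm raised or None) after the OML is set up."""
    if not detection_enabled or reported_esn == configured_bar_code:
        return True, None
    return False, MISMATCH_ALARM


# Two identically configured BTSs whose transmission connections are reversed:
# the BSC expects ESN "20" on this OML but reaches the BTS whose ESN is "21".
print(check_oml(configured_bar_code="20", reported_esn="21"))
print(check_oml(configured_bar_code="20", reported_esn="20"))
```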

4.4.2 Engineering Guidelines

Application Scenarios
The OML identification function applies in the following scenarios:

• There are a large number of BTSs with identical configurations, including identical BTS
type, hardware configuration, cell configuration, and TRX configuration.
• Transmission connections on the Abis interface are incorrect and need to be detected.

Before enabling the OML identification function, collect the bar code information on the BBU
backplane of the BTS.

There are two methods for obtaining the bar code information on the BBU backplane of the BTS.

• For running BTSs, run the BSC MML command DSP BTSESNINFO to query the bar
code on the BBU backplane of the BTS. Before using this method, ensure that the BTS
OML is correctly connected. If it is connected incorrectly, the queried bar code will also
be incorrect, leading to a bar code verification failure.
• For BTSs being deployed, manually record the bar code on the BBU backplane at the site
during deployment. This method is more reliable than the preceding one.

The OML identification function applies to 3900 series base stations except BTS3900E and
BTS3900B. In addition, the BTSs must be upgraded to the required versions before the OML
identification function can be used.

Procedure
• Activation Procedure

Step 1 Run the BSC MML command DEA BTS to deactivate the BTS.

Step 2 Run the BSC MML command SET BTSOMLDETECT with Whether to Enable OML
Detection set to CONCHK(Signaling link check) and BTS Bar Code set to the ESN of the
target BTS.

Step 3 Run the BSC MML command ACT BTS to activate the BTS.

----End

• Verification Procedure

Step 1 Run the BSC MML command SET BTSOMLDETECT with Whether to Enable OML
Detection set to CONCHK(Signaling link check) and BTS Bar Code set to the ESN of the
target BTS.

Step 2 Log in to the LMT, choose Alarm/Event > Browse Alarm/Event, and check whether
ALM-21821 Site Signaling Link Connection Mismatch is generated on the alarm console.

----End

Expected result: The alarm is generated.

• Deactivation Procedure

Step 1 Run the BSC MML command DEA BTS to deactivate the BTS.

Step 2 Run the BSC MML command SET BTSOMLDETECT with Whether to Enable OML
Detection set to OFF(Off).

Step 3 Run the BSC MML command ACT BTS to activate the BTS.

----End

• Example
/*Activating OML detection*/
//Deactivating the BTS
DEA BTS: IDTYPE=BYID, BTSID=1;
//Enabling OML detection
SET BTSOMLDETECT: IDTYPE=BYID, BTSID=1, OMLDETECTSWITCH=CONCHK,
BTSBARCODE="20";
//Activating the BTS
ACT BTS: IDTYPE=BYID, BTSID=1;
/*Verifying OML detection*/
SET BTSOMLDETECT: IDTYPE=BYID, BTSID=1, OMLDETECTSWITCH=CONCHK,
BTSBARCODE="21";
/*Deactivating OML detection*/
//Deactivating the BTS
DEA BTS: IDTYPE=BYID, BTSID=1;
//Disabling OML detection
SET BTSOMLDETECT: IDTYPE=BYID, BTSID=1, OMLDETECTSWITCH=OFF;
//Activating the BTS
ACT BTS: IDTYPE=BYID, BTSID=1;

5 Fault Management of the U2000

This chapter provides an overview of the U2000 fault management functions, such as alarm
display and statistics, audible and visual alarm notification, alarm acknowledgment, and alarm
synchronization. For details, see chapter "Fault Management" in the U2000 Help.

Alarm Display and Statistics


The U2000 receives alarms generated on NEs in real time and displays and collects statistics on
the alarms in multiple ways.

• Alarm display
The U2000 displays alarms on alarm panels or in a bar chart and allows you to query alarms.
– Alarm panel
Alarm panels collect and display alarms of different severities and status for MOs based
on alarm list templates. Functioning as the monitoring panels, alarm panels provide fault
status on the entire network.
– Alarm bar chart
The U2000 client provides alarm bar charts to display alarms. An alarm bar chart
window contains one or more alarm bar charts. Alarms collected using an alarm template
are displayed in an alarm bar chart in graphics and numerals.
– Alarm query
You can view the alarm list and query alarm logs or event logs on the U2000. Alarms
can be displayed in a list on a U2000 GUI by alarm status or alarm severity.
The U2000 supports alarm query and display in two modes:
Single-NE: Query and display alarms generated on the target base station.
Cross-NE: Query and display alarms generated on neighboring cells of the target base
station.
Alarm logs record all the alarms received by the U2000. Each alarm is displayed as a
record.
Event logs record all the events received by the U2000. Each event is displayed as a
record.

The alarm list displays the alarms that you need to concentrate on and handle. One object
may generate multiple alarms with the same information. In the alarm list, however,
only the latest record is displayed.
• Alarm statistics
The U2000 can collect statistics on alarm and event logs based on preset statistical criteria.
For example, it can count the alarms of a given severity that an NE reports per hour.
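
The per-hour example can be sketched in Python as follows. The log record layout (NE name, severity, timestamp) and the sample values are assumptions for illustration and do not reflect the U2000 log format.

```python
from collections import Counter
from datetime import datetime


def alarms_per_hour(log):
    """Count alarms per (NE, severity, hour) from (ne, severity, timestamp) records."""
    counts = Counter()
    for ne, severity, timestamp in log:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(ne, severity, hour)] += 1
    return counts


alarm_log = [
    ("NE1", "Major", datetime(2014, 4, 26, 10, 5)),
    ("NE1", "Major", datetime(2014, 4, 26, 10, 40)),
    ("NE1", "Minor", datetime(2014, 4, 26, 10, 45)),
    ("NE1", "Major", datetime(2014, 4, 26, 11, 2)),
]
stats = alarms_per_hour(alarm_log)
print(stats[("NE1", "Major", datetime(2014, 4, 26, 10))])
```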

Audible and Visual Alarm Notification


The U2000 uses the alarm box, audio adapter and sound box to notify you of alarms.

• Alarm box
Alarm boxes provided by Huawei can be used on the U2000. When an alarm that complies
with filter criteria is generated, you can receive audible and visual alarm notifications from
the alarm box.
• Audio adapter and sound box
You can configure different audio files for the alarms of different severities on the U2000
client. When an alarm is reported, the U2000 client where the audio adapter and sound box
are installed plays corresponding sounds to notify you of this alarm.

Alarm Acknowledgment
An acknowledged alarm is cleared by users and does not need further attention. If this alarm
needs to be addressed again, you can unacknowledge this alarm and take corresponding measures
to clear it.

The U2000 supports manual acknowledgment and unacknowledgment, and automatic
acknowledgment by alarm severity and by user-defined rule.

Alarm Synchronization
The U2000 provides automatic and manual data synchronization. In most cases, the U2000
automatically synchronizes alarm data from NEs. However, due to certain reasons, such as
network disconnection, the alarm data on the U2000 may be inconsistent with that on NEs. To
ensure alarm data consistency, you can manually synchronize alarm data from NEs.

Alarm Clearance
On the U2000, you can manually clear the alarms that cannot be automatically cleared or that
have been acknowledged.

Alarm Suppression
You can set rules for suppressing alarms on the U2000 or NEs.

Suppressing alarms on the U2000: After alarms are reported to the U2000, the U2000 discards
the alarms that meet alarm suppression criteria. It does not store them in the alarm database.

Suppressing alarms on NEs: NEs do not report the alarms that meet alarm suppression criteria
to the U2000.

Conversion from Events to ADMC Alarms


The U2000 allows you to convert an event into an alarm in auto detected manual confirmed
(ADMC) mode. This mode displays the ADMC alarm in the alarm list and draws the user's
attention to the event.

Alarm Redefinition
You can set rules for redefining alarms on the U2000 or NEs. By redefining alarms, you can
change alarm names, types, and severities displayed on the U2000 client, highlight the alarms
that you need to concentrate on, and ignore the alarms that you do not need to concentrate on.

Redefining alarms on the U2000: After alarms are reported to the U2000, the U2000 displays
the alarms based on the redefined severities.

Redefining alarms on NEs: NEs report alarms to the U2000 based on the redefined severities.

Alarm Correlation Analysis


The U2000 supports alarm/event correlation analysis, alarm/event frequency analysis,
intermittent alarm analysis, duplicate event analysis, and analysis of the duration between the
time when an alarm is acknowledged and the time when the alarm is cleared. Based on preset
alarm correlation rules, the U2000 discards alarms, suppresses non-root alarms (also called
correlative alarms), or redefines alarm severities. Therefore, only root alarms and the alarms
that need attention are displayed on the U2000 client.

Alarm Combination
The U2000 provides two alarm/event combination functions:

• Function 1: combines multiple alarms into one alarm based on the values of key fields. This
function is not under license control.
• Function 2: combines multiple alarms into fewer alarms based on the values of key fields
and the alarm reporting time. The combined alarms contain key location information about
each alarm. For example, after multiple optical port tributary alarms are combined into one
alarm, tributary IDs for all the optical port tributary alarms are listed in the Tributary ID
field of the alarm message. This function is controlled by the Efficient Trouble Ticket
license.

Function 1 is different from function 2 in that function 1 is not restricted by time and does not
provide alarm key location information such as tributary IDs.
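
The difference between the two functions can be illustrated with a short Python sketch. The field names, the time window, and the grouping strategy are assumptions made for illustration. The U2000 combination rules themselves are not specified in this document.

```python
def combine_by_key(alarms, key_fields):
    """Function 1 (sketch): one combined alarm per distinct key, regardless of time."""
    combined = {}
    for alarm in alarms:
        key = tuple(alarm[f] for f in key_fields)
        combined.setdefault(key, alarm)
    return list(combined.values())


def combine_by_key_and_time(alarms, key_fields, window):
    """Function 2 (sketch): group by key and reporting time, keeping every tributary ID."""
    groups = []
    open_groups = {}
    for alarm in sorted(alarms, key=lambda a: a["time"]):
        key = tuple(alarm[f] for f in key_fields)
        group = open_groups.get(key)
        if group is None or alarm["time"] - group["start"] > window:
            group = {"key": key, "start": alarm["time"], "tributary_ids": []}
            open_groups[key] = group
            groups.append(group)
        group["tributary_ids"].append(alarm["tributary_id"])
    return groups


tributary_alarms = [
    {"ne": "NE1", "name": "optical port tributary alarm", "tributary_id": 1, "time": 0},
    {"ne": "NE1", "name": "optical port tributary alarm", "tributary_id": 2, "time": 5},
    {"ne": "NE1", "name": "optical port tributary alarm", "tributary_id": 3, "time": 100},
]
print(len(combine_by_key(tributary_alarms, ["ne", "name"])))
print([g["tributary_ids"] for g in combine_by_key_and_time(tributary_alarms, ["ne", "name"], window=10)])
```

In this run, function 1 yields a single alarm without tributary details, whereas function 2 yields two combined alarms that list tributary IDs 1 and 2, and 3, respectively.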

Alarm Maintenance Experience


Alarm maintenance experience is recorded in the alarm experience database. Alarm maintenance
experience can be imported into or exported from the alarm experience database. When a similar
fault occurs, you can refer to relevant information to rectify it.

Remote Alarm Notification


When a generated alarm meets filter criteria, the U2000 automatically notifies the specified
maintenance engineers of the alarm by email or SMS message, which helps them learn about
alarm information and take measures to rectify faults. The notification criteria, notification time,
notification method, and message format can be set on the U2000 client.

Alarm Auto-Triggering Script


After you specify a script and script triggering criteria, the U2000 server automatically runs the
specified script if generated alarms meet the preset script triggering criteria. In this way, you
can write an alarm auto-triggering script to perform the repetitive operations for routine
maintenance, which implements automation of partial routine maintenance.

Alarm Customization
The U2000 provides the environment monitoring function. You can define alarms to monitor
the physical conditions of NEs based on site requirements.

6 Troubleshooting

6.1 Procedures and Principles


Procedures
When a fault occurs, you should calmly analyze, locate, and troubleshoot the fault in the
following ways:

• Check
Check the following fault information:
– Fault symptom
– Time, place, and frequency of the fault
– Fault impacts
– Running status of devices before the fault occurred
– Whether alarms and correlative alarms were generated when the fault occurred
– Whether indicators on a board were abnormal when the fault occurred
• Inquire
Ask the relevant personnel about the following information:
– Operations performed on devices before the fault occurred, and the operation results.
For example, determine whether certain data was modified, files were
deleted, circuit boards were replaced, lightning strikes or power failures occurred, or
improper operations were performed.
– Measures taken after the fault occurred, and the corresponding effects.
– Fault symptoms and the time, place, and frequency of the fault, as described by the
users or staff who reported the fault.
• Find
Maintenance engineers analyze the obtained information based on their technical knowledge
and find the causes of the fault by referring to Alarm Reference and Troubleshooting Guide.
• Locate and troubleshoot
Locate the MO where the fault has occurred based on fault location principles and
troubleshoot the fault in different ways, for example, by modifying data or replacing boards.

Principles
A fault generally has multiple possible causes, and troubleshooting measures vary with the
cause. For example, you can troubleshoot faults remotely from the monitoring room or equipment
room, or onsite. To improve troubleshooting efficiency, you are advised to follow these
principles:

• Analyze faults remotely in the monitoring room first. Then troubleshoot the faults onsite.
• Check the fault causes with high probabilities first, and then those with lower probabilities.
• Take low-cost troubleshooting measures first, where cost covers both the impact on services
and the troubleshooting expense. Then take the higher-cost measures.
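
The three principles can be read as an ordering over candidate causes and measures. The
following Python sketch shows one possible way to apply them. It is an assumption-laden
illustration: the Candidate type, its fields, the sample probabilities and costs, and the
choice to combine the principles into a single sort key are all hypothetical and are not
defined by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A suspected fault cause and the measure that addresses it (hypothetical)."""
    cause: str
    probability: float  # estimated likelihood that this is the cause, 0..1
    measure: str
    remote: bool        # True if the measure can be taken from the monitoring room
    cost: int           # relative cost: service impact plus effort, higher is worse


def plan(candidates):
    """Order candidates: remote before onsite, likely before unlikely, cheap before costly."""
    return sorted(candidates, key=lambda c: (not c.remote, -c.probability, c.cost))


# Sample values are invented for the example.
candidates = [
    Candidate("board hardware failure", 0.2, "replace board", remote=False, cost=8),
    Candidate("incorrect data configuration", 0.5, "compare with previous data", remote=True, cost=1),
    Candidate("software exception", 0.3, "soft reset", remote=True, cost=3),
]

ordered = plan(candidates)
for step, candidate in enumerate(ordered, start=1):
    print(step, candidate.measure)
```

With these sample values, the remote data comparison comes first, the soft reset second, and
the onsite board replacement last.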

6.2 Fault Location and Troubleshooting


Some alarms provide fault causes and information about FRUs or FMUs where faults have
occurred, for example, alarms for faults on boards. In this case, you can follow troubleshooting
procedures in corresponding alarm help documents to rectify the faults.

Certain alarms, such as link- and transmission-related alarms, do not provide fault causes or
information about FRUs and FMUs because the faults triggering these alarms involve multiple
devices and fault causes. You have to locate faults using fault location principles. Then, you can
clear the alarms based on your experience and the documents provided by vendors.

Table 6-1 lists the faults triggering alarms and fault location methods. This list helps you clear
alarms.

Table 6-1 Classification of faults triggering alarms and corresponding fault location methods

No.  Fault                 Description                   Fault Location

1    Incorrect manual      Incorrect data configuration  1. Ask personnel on duty or query the operation
     operations            or manual maintenance            logs on the BSC.
                                                         2. Compare the current data with the previous
                                                            data.
                                                         Check whether the data has been modified or
                                                         maintenance operations have been performed
                                                         manually. If yes, check the modifications first.
                                                         In normal cases, if the data is configured
                                                         incorrectly, resetting hardware, such as
                                                         boards, does not rectify the fault.

2    Software-related      Exceptions that occur when    You can reset hardware to rectify such faults.
     faults                software is running           Hardware reset is classified into soft reset
                                                         (without powering off the hardware) and hard
                                                         reset (power-off reset). In normal cases,
                                                         perform a soft reset first. If exceptions
                                                         persist, perform a hard reset.

3    Hardware-related      Natural breakdown of hardware 1. Remove and reinstall the boards or
     faults                or a malfunction due to          disconnect and reconnect the cables.
                           incorrect manual operations   2. Replace faulty units, such as boards, with
                                                            normal ones.
                                                         3. Replace faulty units with new ones.
                                                         4. If the fault cannot be located, shut down
                                                            or block the units one by one.
                                                         In normal cases, resetting hardware does not
                                                         rectify such faults.

4    Faults on peripheral  Faults on peripheral devices, 1. Remove and reinstall the boards or
     devices               such as the environment          disconnect and reconnect the cables.
                           monitoring device,            2. Replace faulty units, such as boards, with
                           transmission device, and         normal ones.
                           antenna system                3. If the fault cannot be located, shut down
                                                            or block the units one by one.

5    Adverse ambient       Electromagnetic interference, 1. Inquire/query: List possible fault causes
     environment           temperature, and adverse         based on fault symptoms and inquire about
                           weather, such as wind, rain,     the ambient environment. Check whether
                           snow, hail, lightning, and       exceptions occurred when the fault
                           earthquakes                      occurred.
                                                         2. Certain faults caused by an adverse ambient
                                                            environment can be rectified after the
                                                            external factor, such as interference, is
                                                            removed. Certain faults, such as those
                                                            caused by earthquakes, cannot be rectified.
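
The reset guidance in Table 6-1 can be summarized as a small decision rule: resetting is
recommended only for software-related faults, and a soft reset is tried before a hard reset.
The following Python sketch illustrates this rule under stated assumptions. The FaultClass
enumeration, the RESET_HELPS mapping, and the fault_cleared_after callback are hypothetical
names introduced for the example and are not MML commands or product interfaces.

```python
from enum import Enum


class FaultClass(Enum):
    """Fault classes numbered as in Table 6-1."""
    MANUAL_OPERATION = 1
    SOFTWARE = 2
    HARDWARE = 3
    PERIPHERAL = 4
    ENVIRONMENT = 5


# Whether resetting hardware is expected to help, as stated in Table 6-1.
# None means the table makes no statement for that class.
RESET_HELPS = {
    FaultClass.MANUAL_OPERATION: False,
    FaultClass.SOFTWARE: True,
    FaultClass.HARDWARE: False,
    FaultClass.PERIPHERAL: None,
    FaultClass.ENVIRONMENT: None,
}


def reset_sequence(fault_class, fault_cleared_after):
    """Return the resets to attempt: soft reset first, hard reset only if needed.

    fault_cleared_after is a hypothetical callback that reports whether the
    exception is gone after the named reset ("soft" or "hard").
    """
    if not RESET_HELPS[fault_class]:
        return []
    attempted = []
    for kind in ("soft", "hard"):
        attempted.append(kind)
        if fault_cleared_after(kind):
            break
    return attempted


# A software fault that persists after the soft reset escalates to a hard reset.
print(reset_sequence(FaultClass.SOFTWARE, lambda kind: kind == "hard"))
# For a hardware fault, no reset is attempted.
print(reset_sequence(FaultClass.HARDWARE, lambda kind: True))
```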

7 Schemes for Deleting Alarms and Alarm Location Parameters

Adding or deleting alarms affects the northbound interface. Alarms are generally added
when new features are introduced or features in earlier versions are optimized. Alarms may be
deleted for the following reasons:

• Service changes in a new version: Some alarms no longer apply to scenarios in the new
version.
• Internal design optimization: Some alarms are no longer required for their corresponding
faults, for example, because NEs can now rectify the faults independently.
• Alarm optimization: Some alarms are replaced with other alarms.

In addition, alarm location parameters of an alarm may be deleted for the following reasons:

• Service optimization: Some alarm location parameters become invalid.
• Configuration model optimization: Some alarm location parameters become invalid or are
replaced with other parameters.

Alarms and alarm location parameters to be deleted are reserved in two NE versions: the
current version (N) and the next version (N+1). They are deleted in version N+2. Table 7-1
describes the schemes for deleting alarms and alarm location parameters. In the table, N
indicates the current version.

Table 7-1 Schemes for deleting alarms and alarm location parameters

Scenario                          Scheme (N to N+1)                         Scheme (N+2)

Alarms without application        Old alarms are not reported and are       Old alarms are deleted.
scenarios are deleted.            only reserved in related NE
                                  documents.

Old alarms are replaced with new  • Old alarms are not reported and are     Old alarms are deleted.
alarms.                             only reserved in related NE
                                    documents that provide the mapping
                                    between old alarms and new alarms.
                                  • New alarms are reported properly.

Old alarm location parameters     Old alarm location parameters are         Old alarm location parameters
are invalid due to service        reserved, and parameter values            are deleted.
optimization and are deleted.     reported through the northbound
                                  interface are null values.

Old alarm location parameters     • Old alarm location parameters are       Old alarm location parameters
are invalid due to configuration    reserved and reported through the       are deleted.
model optimization and are          northbound interface. The parameter
deleted or replaced with other      value validity depends on the
parameters.                         validity of the values for the
                                    corresponding parameters in the
                                    configuration model.
                                  • New alarm location parameters are
                                    reported properly.
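
The schemes in Table 7-1 amount to a lookup on the scenario and on how many versions have
passed since version N. The following Python sketch expresses that lookup. It is a minimal
illustration: the Scenario enumeration, the old_item_status function, and the returned status
strings are hypothetical, and treating versions later than N+2 the same as N+2 is an
assumption that the table does not state.

```python
from enum import Enum


class Scenario(Enum):
    """The four scenarios of Table 7-1 (names are hypothetical)."""
    ALARM_NO_SCENARIO = "alarm without application scenarios"
    ALARM_REPLACED = "alarm replaced with a new alarm"
    PARAM_SERVICE = "location parameter invalid due to service optimization"
    PARAM_MODEL = "location parameter invalid due to configuration model optimization"


def old_item_status(scenario, offset):
    """Status of the old alarm or parameter `offset` versions after N (0 means N)."""
    if offset < 0:
        raise ValueError("offset must be 0 or greater")
    if offset >= 2:
        # Assumption: versions after N+2 behave like N+2.
        return "deleted"
    if scenario in (Scenario.ALARM_NO_SCENARIO, Scenario.ALARM_REPLACED):
        return "reserved in documents, not reported"
    if scenario is Scenario.PARAM_SERVICE:
        return "reserved, reported with null value"
    return "reserved, reported with value from configuration model"


for offset in range(3):
    print(offset, old_item_status(Scenario.PARAM_SERVICE, offset))
```

A northbound system that consumes these alarms can use the same rule to decide when a field
may still arrive with a null value and when it disappears altogether.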

Table 7-2 lists the documents containing information about deleted alarms and alarm location
parameters.

Table 7-2 Documents that contain information about deleted alarms or alarm location parameters

Document              Description of Deleted Alarms and Alarm Location Parameters

Disuse Alarm List     This document is released with the NE software and describes all deleted
                      alarms in this version. It also provides disuse statements that describe
                      the reasons for deleting these alarms or alarm location parameters.

Disuse Event List     This document is released with the NE software and describes all deleted
                      events in this version. It also provides disuse statements that describe
                      the reasons for deleting these events or event location parameters.

Alarm Reference       Disuse statements that describe the schemes for deleting alarms or alarm
                      location parameters are provided in the Description field.
                      This document is integrated into the NE HedEx documentation package.

8 Parameters

Table 8-1 Parameter description

MO   Parameter ID  MML Command  Feature ID       Feature Name  Description

EMU  SAAF          ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates whether to allow the report of a
                   MOD EMU      TDLBFD-004012    Monitoring    dedicated analog alarm. If the shield flag for an analog
                   LST EMU                                     alarm is enabled, the analog alarm cannot be reported.
                                                               The 48V_DISABLE check box under this parameter must be
                                                               set to off. Otherwise, ALM-26271 Inter-System Monitoring
                                                               Device Parameter Settings Conflict will be reported
                                                               mistakenly in a separate-MPT multimode base station that
                                                               involves the GBTS.
                                                               GUI Value Range: 48V_DISABLE(-48 Voltage Disabled),
                                                               RES0(Reserved Sensor 0), RES1(Reserved Sensor 1),
                                                               RES2(Reserved Sensor 2), TS_DISABLE(Temperature Sensor
                                                               Disabled), HS_DISABLE(Humidity Sensor Disabled)
                                                               Unit: None
                                                               Actual Value Range: 48V_DISABLE, RES0, RES1, RES2,
                                                               TS_DISABLE, HS_DISABLE
                                                               Default Value: 48V_DISABLE:OFF, RES0:ON, RES1:ON,
                                                               RES2:ON, TS_DISABLE:OFF, HS_DISABLE:OFF

EMU  SBAF          ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates whether to allow the report of a
                   MOD EMU      TDLBFD-004012    Monitoring    dedicated Boolean alarm. If the shield flag for a Boolean
                   LST EMU                                     alarm is enabled, the Boolean alarm cannot be reported.
                                                               GUI Value Range: WS_DISABLE(Water-Immersed Sensor
                                                               Disabled), SS_DISABLE(Smog Sensor Disabled),
                                                               IS_DISABLE(Infrared Sensor Disabled), GS_DISABLE(Gating
                                                               Sensor Disabled)
                                                               Unit: None
                                                               Actual Value Range: WS_DISABLE, SS_DISABLE, IS_DISABLE,
                                                               GS_DISABLE
                                                               Default Value: WS_DISABLE:OFF, SS_DISABLE:OFF,
                                                               IS_DISABLE:ON, GS_DISABLE:OFF

EMU  TLTHD         ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates the lower limit of temperature. When
                   MOD EMU      TDLBFD-004012    Monitoring    the temperature is below the lower limit, an ALM-25650
                   LST EMU                                     Ambient Temperature Unacceptable alarm is reported.
                                                               GUI Value Range: -20~80 (metric system); -4~176
                                                               (imperial system)
                                                               Unit: degree Celsius (metric system); degree Fahrenheit
                                                               (imperial system)
                                                               Actual Value Range: -20~80 (metric system); -4~176
                                                               (imperial system)
                                                               Default Value: 0 (metric system); 32 (imperial system)

EMU  TUTHD         ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates the upper limit of temperature. When
                   MOD EMU      TDLBFD-004012    Monitoring    the temperature exceeds the upper limit, an ALM-25650
                   LST EMU                                     Ambient Temperature Unacceptable alarm is reported.
                                                               GUI Value Range: -20~80 (metric system); -4~176
                                                               (imperial system)
                                                               Unit: degree Celsius (metric system); degree Fahrenheit
                                                               (imperial system)
                                                               Actual Value Range: -20~80 (metric system); -4~176
                                                               (imperial system)
                                                               Default Value: 50 (metric system); 122 (imperial system)

EMU  HLTHD         ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates the lower limit of humidity. When the
                   MOD EMU      TDLBFD-004012    Monitoring    humidity is below the lower limit, an ALM-25651 Humidity
                   LST EMU                                     Abnormal alarm is reported.
                                                               GUI Value Range: 0~100
                                                               Unit: %
                                                               Actual Value Range: 0~100
                                                               Default Value: 10

EMU  HUTHD         ADD EMU      LBFD-004012 /    Environment   Meaning: Indicates the upper limit of humidity. When the
                   MOD EMU      TDLBFD-004012    Monitoring    humidity exceeds the upper limit, an ALM-25651 Humidity
                   LST EMU                                     Abnormal alarm is reported.
                                                               GUI Value Range: 0~100
                                                               Unit: %
                                                               Actual Value Range: 0~100
                                                               Default Value: 80
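
The threshold parameters in Table 8-1 can be illustrated with a short Python sketch that
checks a reading against the default limits and confirms that the metric and imperial values
in the table agree. The sketch is not the EMU implementation: the EmuThresholds class and the
evaluate function are hypothetical, and mapping the TS_DISABLE and HS_DISABLE shield flags to
the suppression of ALM-25650 and ALM-25651 is an assumption based on the flag names.

```python
from dataclasses import dataclass


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


@dataclass(frozen=True)
class EmuThresholds:
    """Default values from Table 8-1 (metric system)."""
    tlthd: int = 0   # lower temperature limit, degrees Celsius
    tuthd: int = 50  # upper temperature limit, degrees Celsius
    hlthd: int = 10  # lower humidity limit, %
    huthd: int = 80  # upper humidity limit, %


def evaluate(temperature_c, humidity_pct, thresholds=EmuThresholds(),
             ts_disabled=False, hs_disabled=False):
    """Return the alarms that the readings would trigger.

    A reading equal to a limit does not trigger an alarm, because the table
    describes values below the lower limit or exceeding the upper limit.
    Assumption: an enabled shield flag suppresses the corresponding alarm.
    """
    alarms = []
    if not ts_disabled and not (thresholds.tlthd <= temperature_c <= thresholds.tuthd):
        alarms.append("ALM-25650 Ambient Temperature Unacceptable")
    if not hs_disabled and not (thresholds.hlthd <= humidity_pct <= thresholds.huthd):
        alarms.append("ALM-25651 Humidity Abnormal")
    return alarms


# The imperial ranges and defaults in Table 8-1 match the metric ones.
print(c_to_f(-20), c_to_f(80), c_to_f(0), c_to_f(50))
print(evaluate(55, 50))
print(evaluate(25, 5))
```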

9 Counters

There are no specific counters associated with this feature.

10 Glossary

For the acronyms, abbreviations, terms, and definitions, see Glossary.

11 Reference Documents

1. BSC6900 GU Alarm Reference
2. BSC6910 GU Alarm Reference
3. BSC6900 GU Event Reference
4. BSC6910 GU Event Reference
5. BSC6900 GSM Alarm Reference
6. BSC6910 GSM Alarm Reference
7. BSC6900 GSM Event Reference
8. BSC6910 GSM Event Reference
9. BSC6900 UMTS Alarm Reference
10. BSC6910 UMTS Alarm Reference
11. BSC6900 UMTS Event Reference
12. BSC6910 UMTS Event Reference
13. 3900 Series Base Station Alarm Reference
14. GBSS Troubleshooting Guide
15. RAN Troubleshooting Guide
16. eRAN Troubleshooting Guide
17. Fault Management on the U2000 side
18. U2000 Alarm Reference on the U2000 side
19. U2000 Feature Description About Maintenance Mode Management on the U2000 side
