NetNumen™ U31 R18

Unified Element Management System


Alarm Handling Reference

Version: 12.10.040

ZTE CORPORATION
NO. 55, Hi-tech Road South, ShenZhen, P.R.China
Postcode: 518057
Tel: +86-755-26771900
Fax: +86-755-26770801
URL: http://ensupport.zte.com.cn
E-mail: support@zte.com.cn
LEGAL INFORMATION
Copyright © 2011 ZTE CORPORATION.
The contents of this document are protected by copyright laws and international treaties. Any reproduction or
distribution of this document or any portion of this document, in any form by any means, without the prior written
consent of ZTE CORPORATION is prohibited. Additionally, the contents of this document are protected by
contractual confidentiality obligations.
All company, brand and product names are trade or service marks, or registered trade or service marks, of ZTE
CORPORATION or of their respective owners.
This document is provided “as is”, and all express, implied, or statutory warranties, representations or conditions
are disclaimed, including without limitation any implied warranty of merchantability, fitness for a particular purpose,
title or non-infringement. ZTE CORPORATION and its licensors shall not be liable for damages resulting from the
use of or reliance on the information contained herein.
ZTE CORPORATION or its licensors may have current or pending intellectual property rights or applications
covering the subject matter of this document. Except as expressly provided in any written license between ZTE
CORPORATION and its licensee, the user of this document shall not acquire any license to the subject matter
herein.
ZTE CORPORATION reserves the right to upgrade or make technical change to this product without further notice.
Users may visit ZTE technical support website http://ensupport.zte.com.cn to inquire related information.
The ultimate right to interpret this product resides in ZTE CORPORATION.

Revision History

Revision No.    Revision Date    Revision Reason
R1.1            2011-12-05       The following sections are modified:
                                 • “4.1 15010001 Performance Data Delayed”
                                 • “5.3 User Account Locked”
R1.0            2011-10-11       First Edition

Serial Number: SJ-20110823134613-015

Publishing Date: 2011-12-05(R1.1)


Contents
About This Manual

Chapter 1 Overview
1.1 Fault Management
1.2 Fault Indication
1.3 Alarm
1.3.1 Alarm Code
1.3.2 Alarm Severity
1.3.3 Alarm Type
1.3.4 Probable Cause
1.3.5 Impact on System
1.3.6 Handling Suggestion

Chapter 2 Communication Alarms
2.1 198099803 Link Broken Between OMM and NE
2.2 198099804 Link Broken Between Server and Alarm Box
2.3 198099805 Broken Link Between EMS and NMS

Chapter 3 QoS Alarm
3.1 1513 Performance Index Threshold Crossing

Chapter 4 Equipment Alarm
4.1 15010001 Performance Data Delayed

Chapter 5 OMC Alarms
5.1 15010002 NAF Performance Data File Delayed
5.2 15010003 License Alarm
5.3 1000 User Account Locked
5.4 1001 Database Overload
5.5 1002 CPU Overload of Application Server
5.6 1003 RAM Overload of Application Server
5.7 1004 Hard Disk Overload of Application Server
5.8 1006 Directory Size Threshold Crossing
5.9 1008 Database Space Threshold Crossing
5.10 1009 Synchronization Failure of Server Time
5.11 1010 Broken Link Between Server and Alarm Box
5.12 1011 Running Failure of the Whole Database Backup Task
5.13 1012 License Has Expired
5.14 1013 License Will Expire
5.15 1014 Broken Link Between Server and NE
5.16 1015 Broken Link Between Server and NE Agent
5.17 1016 Alarm Frequency Threshold Crossing
5.18 1017 Alarm Duration Threshold Crossing
5.19 1018 Duration Threshold Crossing of Unacknowledged Alarm
5.20 1019 TRAP Messages Discarded
5.21 1021 Running Failure of the Basic Database Backup Task
5.22 1022 New Alarm Raised Based on the Alarm Merging Rule

Figures
Tables
Glossary

About This Manual
The NetNumen™ U31 R18 Unified Element Management System (NetNumen U31 or
EMS) is a dedicated network element management system that manages network elements
in radio access systems. By using NetNumen U31, users can configure and maintain
individual network elements, and manage radio access networks in a unified manner.
NetNumen U31 provides the following management functions:
• Configuration management
• Fault management
• Performance management
• Topology management
• Security management
As an object-oriented system designed on the Java 2 Platform, Enterprise Edition (J2EE),
NetNumen U31 provides unified standard interfaces to external devices.

Purpose
This manual provides a reference for operation and maintenance personnel who perform
fault management operations via NetNumen U31. It describes the alarms and notifications
related to the network element management system, analyzes their causes and provides
corresponding handling suggestions.
For alarm and notification information about a specific network element (device), refer
to the corresponding manual for that device.

Intended Audience
• Maintenance engineers
• Debugging engineers

What Is in This Manual


This manual contains the following chapters:

Chapter                            Summary

Chapter 1, Overview                Gives a brief introduction to the fault management functions
                                   in the NetNumen U31 system, describes the two fault indication
                                   modes in the system, and describes the explanation model of
                                   alarms and notifications in this manual.

Chapter 2, Communication Alarms    Provides the information of all communication alarms related
                                   to the NetNumen U31 system, analyzes their causes, and
                                   gives the handling suggestions.

Chapter 3, QoS Alarm               Provides the information of the Quality of Service alarm
                                   related to the NetNumen U31 system, analyzes its causes,
                                   and gives the handling suggestions.

Chapter 4, Equipment Alarm         Provides the information of the equipment alarm related to
                                   the NetNumen U31 system, analyzes its causes, and gives
                                   the handling suggestions.

Chapter 5, OMC Alarms              Provides the information of all Operation and Maintenance
                                   Center alarms related to the NetNumen U31 system,
                                   analyzes their causes, and gives the handling suggestions.

Chapter 1
Overview

1.1 Fault Management


The fault management module provided in NetNumen U31 can monitor failures in the
managed network in near-real time. It collects the information of faults and events that
occur during system operation or service processing. When a failure occurs, an indication
is made and stored in the database as an alarm or notification record, which can be
displayed in real time on a user-oriented alarm monitoring platform.
By using the fault management functions of NetNumen U31, you can learn the current and
historical running status and service processing condition of the system by monitoring current
alarms and querying history alarms on the Graphic User Interface (GUI) of the NetNumen
U31 client. Based on those fault indications, you can troubleshoot to restore the
system and services, or take preventive measures to remove potential risks in the system.

1.2 Fault Indication


An alarm or notification is reported when a fault occurs in the system.
• Alarm
  A fault is indicated in the form of an alarm when it persists and affects the reliability
  and services in the system. The alarm only disappears after the fault is removed.
  Immediate troubleshooting is required when such faults occur because of their impact
  on the proper running of the system.
• Notification
  A notification indicates a non-repeatable or instantaneous fault or event in the
  system, for example, board reset and signaling overload. Such a fault or event is
  generally caused by a sudden environment change or other accidental factors. No
  special handling is required because the fault or event that the notification indicates can
  be automatically removed by the system. However, a frequently reported notification
  needs troubleshooting.

SJ-20110823134613-015|2011-12-05(R1.1) ZTE Proprietary and Confidential


1.3 Alarm
In NetNumen U31, you can look up the general information, probable cause, impact on
system operations, and handling suggestion of an alarm by its alarm code.
This reference manual describes the information (alarm code, severity, and type), probable
cause, impact on system, and handling suggestion of all alarms related to the network Element
Management System (EMS).

1.3.1 Alarm Code


Each alarm has a code consisting of a code number and the fault information (code name).
• The code number is a unique 32-bit number.
• The code name gives a brief description of the fault causing the alarm, such as the fault
  cause or symptom.
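
The code structure described above can be modelled in a few lines. The following is only an illustrative sketch, not part of NetNumen U31: the class name AlarmCode is hypothetical, and the code number is assumed to be an unsigned 32-bit value (the manual states only that it is a 32-bit number).

```python
from dataclasses import dataclass

# Assumption: the code number is unsigned, so the largest valid value is 2**32 - 1.
MAX_CODE_NUMBER = 2**32 - 1


@dataclass(frozen=True)
class AlarmCode:
    """Hypothetical model of an alarm code: a code number plus a code name."""
    number: int
    name: str

    def __post_init__(self):
        # Reject numbers that cannot be stored in 32 bits.
        if not 0 <= self.number <= MAX_CODE_NUMBER:
            raise ValueError(f"code number {self.number} does not fit in 32 bits")
        # The code name carries the brief fault description, so it must not be empty.
        if not self.name.strip():
            raise ValueError("code name must not be empty")


link_broken = AlarmCode(198099803, "Link Broken Between OMM and NE")
print(link_broken)
```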

1.3.2 Alarm Severity


Alarms are divided into the following four classes depending on the severity of
the corresponding faults.
• Critical Alarm
  A critical alarm indicates a fault that causes the failure of system operation or service
  offering. Immediate troubleshooting is required when a critical alarm is reported.
• Major Alarm
  A major alarm indicates a fault that seriously impacts the proper running of the system
  or reduces the capability of service offering. The corresponding fault must be removed
  as soon as possible to restore the system when a major alarm occurs.
• Minor Alarm
  A minor alarm indicates a fault that slightly influences the proper running of the system
  or reduces the capability of service offering. You are required to take timely measures to
  remove the corresponding fault and prevent more severe alarms
  when a minor alarm occurs.
• Warning
  A warning indicates a fault that has a potential or gradual impact on the proper running
  of the system or the service offering capability. You are required to analyse the warning
  message and then take proper and timely measures to remove the corresponding fault and
  thus avoid more severe alarms.
In NetNumen U31, most alarms have been set with default severity levels. The severity of
a small number of alarms has not been set. You can define their severity when you want
to use them.

Note:
Although NetNumen U31 supports the modification of alarm severity level, use caution
when you want to modify the severity level of an alarm because the default severity level
is generally an appropriate one.

The definition of alarm severity depends on the influence range of the corresponding fault.
If a fault impacts an index, such as reliability or security, the severity of the alarm caused
by this fault can be determined according to the degree of influence on the index. An alarm
caused by a fault that impacts multiple indexes has a higher severity level than an alarm
caused by a fault that impacts only one index.
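
As an illustration of how the four severity classes order alarms for handling, the sketch below sorts a few alarms from this manual so that the most severe come first. The Severity enumeration and its numeric values are assumptions made for this example; NetNumen U31 may represent severity differently.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The four severity classes; the numeric values are assumed, larger is more severe."""
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


# (code number, code name, default severity) as listed in this manual.
alarms = [
    (1513, "PM threshold cross-border", Severity.MINOR),
    (198099803, "Link Broken Between OMM and NE", Severity.CRITICAL),
    (15010001, "Performance data delayed", Severity.WARNING),
    (1001, "Hard disk usage of database server overload", Severity.MAJOR),
]

# Most severe first, so that critical alarms are handled before the others.
by_urgency = sorted(alarms, key=lambda alarm: alarm[2], reverse=True)
for number, name, severity in by_urgency:
    print(f"{severity.name:<8} {number:>9}  {name}")
```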

1.3.3 Alarm Type


This manual introduces some of the five basic categories of alarms as specified in ITU-T
X.733 and the OMC alarm category.

• Communication Alarm
  An alarm of this type is principally associated with the procedures and/or processes
  required to convey information from one point to another.
• Processing Error Alarm
  An alarm of this type is principally associated with a software or processing fault.
• Quality of Service Alarm
  An alarm of this type is principally associated with a degradation in the quality of
  service.
• Equipment Alarm
  An alarm of this type is principally associated with an equipment fault.
• Environmental Alarm
  An alarm of this type is principally associated with a condition relating to an enclosure
  in which the equipment resides.
• Operation & Maintenance Center (OMC) Alarm
  Besides the five basic alarm types as specified in ITU-T X.733, NetNumen U31
  also groups the alarms principally associated with faults in the network Element
  Management System (EMS) into a specific OMC alarm type.

1.3.4 Probable Cause


NetNumen U31 lists all probable causes of each alarm for your reference. When an alarm
occurs, you can analyse the probable causes to find out why the alarm was raised and
choose a proper troubleshooting method to restore the system as soon as possible.

1.3.5 Impact on System


NetNumen U31 provides a brief description of the impacts on system operations or
services that may be caused by the corresponding fault indicated by each alarm.

1.3.6 Handling Suggestion


You can refer to the handling suggestions provided by NetNumen U31 to troubleshoot the
corresponding fault when an alarm occurs.

Tip:
Follow the instructions below when handling an alarm:
• Record the problem and fault symptom, and then handle the alarm step by step
  according to the corresponding handling method described in this manual. At any
  one step, end the alarm handling process if the fault is removed (that is, the alarm
  disappears); if the fault still exists, move to the next step.
• In the case of failure to remove faults and restore the system, contact your local ZTE
  office for support.

Chapter 2
Communication Alarms

2.1 198099803 Link Broken Between OMM and NE


Alarm Information
• Code Number: 198099803
• Code Name: Link Broken Between OMM and NE
• Severity: Critical
• Alarm Type: Communication alarm

Probable Cause
The link between a Network Element (NE) and the Operation & Maintenance Module
(OMM) server that manages this NE is broken.

Impact on System
The OMM fails to obtain performance and alarm data from the NE.

Handling Suggestion
Check the connection between the NE and the OMM server as follows:
1. In the Configuration Management window of the NetNumen U31 client GUI, get the
IP address of the NE.
2. On the OMM server, ping the IP address of the NE. If the ping fails, check whether the
communication failure is caused by a fault in the NE itself or the network between the
OMM and the NE.
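
If the ping check in step 2 has to be repeated for several NEs, it can be scripted. The following is a minimal sketch that calls the operating system's own ping command; the function names are hypothetical, and the packet count and timeout are arbitrary example values.

```python
import platform
import subprocess


def build_ping_command(host, count=3):
    """Return the ping argument list for the local OS (-n on Windows, -c elsewhere)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host, count=3, timeout=15):
    """Return True if the host answers ping, False otherwise."""
    try:
        result = subprocess.run(
            build_ping_command(host, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # ping took too long or is not installed: treat the host as unreachable.
        return False
    return result.returncode == 0
```

For example, calling is_reachable with the NE address obtained in step 1 returns False when the ping fails, which corresponds to the case in which the NE itself or the network between the OMM and the NE must be checked.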

2.2 198099804 Link Broken Between Server and Alarm Box


Alarm Information
• Code Number: 198099804
• Code Name: Link Broken Between Server and Alarmbox
• Severity: Critical
• Alarm Type: Communication alarm

Probable Cause
The probable causes of a broken link between the NetNumen U31 server and the alarm box
are:
• The network connection between the server and the alarm box is abnormal.
• The configured IP address of the alarm box on the EMS server is different from the
  actual IP address of the alarm box.
• The server IP address configured in the alarm box is different from the actual IP
  address of the server.
• The port number configured in the alarm box is different from that configured on the server.

Impact on System
The NetNumen U31 server fails to send alarm information to the alarm box.

Handling Suggestion
Do the following to check for network connection problems and inconsistent IP address
configuration:
1. In the Fault Management window of the NetNumen U31 client GUI, get the configured
   IP address of the alarm box. Next, open a terminal window on the NetNumen U31
   server and ping this IP address. If the ping fails, check the power connection and
   network connection of the alarm box as follows:
   a. Check that the power cable is firmly connected to the alarm box and that the input
      voltage meets the power supply requirements of the alarm box. Then switch on
      the alarm box and check whether the menu is normally displayed on the screen
      of the alarm box. If not, reset or replace the alarm box.
   b. Check the network cable connected to the alarm box and make sure that the
      network cable is intact and in good contact with the network port (Lan1) of the
      alarm box.
2. On the screen of the alarm box, select the appropriate menu to view the IP address of
   the alarm box, and then check whether the alarm box’s IP address configured on the
   NetNumen U31 server is the same as this IP address. If not, change the IP address of
   the alarm box on the server to the actual one.
3. On the alarm box, press the left arrow key to view the version of the alarm box, and
   select one of the following methods to check the server IP address configured in the
   alarm box depending on the version of the alarm box.
   • If the version of the alarm box is V3 or earlier, select the appropriate
     menu to view the IP address of the NetNumen U31 server on the screen of the
     alarm box, and then check whether it is the same as the actual IP address of the
     server. If not, change the server IP on the alarm box to the actual one.
   • If the version of the alarm box is V5, open a terminal window on the server, type
     the telnet [alarm box IP address] [601] command, and then enter
     the password alarmpro to get access to the alarm box. Next, execute the
     tcpCfgShow command to view the server IP address configured on the alarm
     box.
     If the configured IP address is different from the actual IP address of the server,
     change the server IP on the alarm box to the actual one by executing the following
     command:
     cfgTcpComm [Serial No.] [Server IP] [Port No.] [Group No.]
4. On the screen of the alarm box, select the appropriate menu to view the port
   configuration of the alarm box, and then check whether the configured port number is
   the same as the port number set in the EMS fault management module. If not, change
   the port number on the alarm box and make sure that it is the same as the port number
   set on the server.
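
Steps 2 to 4 all compare a value recorded on one side with the value on the other side. Once the values have been read from the GUI and the alarm box screen, the comparison can be sketched as follows. The function, the dictionary keys, and the sample addresses and port are hypothetical and serve only to illustrate the three consistency checks.

```python
def find_mismatches(server_side, alarm_box_side):
    """Compare the settings noted on each side and list every disagreement.

    Both arguments are plain dicts with the (hypothetical) keys
    alarm_box_ip, server_ip and port.
    """
    checks = [
        ("alarm box IP address", "alarm_box_ip"),  # step 2
        ("server IP address", "server_ip"),        # step 3
        ("port number", "port"),                   # step 4
    ]
    problems = []
    for label, key in checks:
        on_server = server_side[key]
        on_box = alarm_box_side[key]
        if on_server != on_box:
            problems.append(f"{label}: server has {on_server}, alarm box has {on_box}")
    return problems


# Illustrative values only (documentation address range, made-up port).
server_side = {"alarm_box_ip": "192.0.2.20", "server_ip": "192.0.2.1", "port": 5000}
alarm_box_side = {"alarm_box_ip": "192.0.2.21", "server_ip": "192.0.2.1", "port": 5000}

for problem in find_mismatches(server_side, alarm_box_side):
    print(problem)
```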

2.3 198099805 Broken Link Between EMS and NMS


Alarm Information
• Code Number: 198099805
• Code Name: Link to NMS Broken
• Severity: Critical
• Alarm Type: Communication alarm

Probable Cause
This alarm indicates a broken link between the NetNumen U31 system, that is, the Element
Management System (EMS), and a Network Management System (NMS) that is connected
to the EMS via northbound interfaces. The probable causes are:
• The NMS process exits, so that the NMS stops responding.
• The NMS is disconnected from the network due to network problems.
• NetNumen U31 fails to communicate with the NMS because of a firewall between
  them.

Impact on System
NetNumen U31 fails to communicate with the NMS.

Handling Suggestion
Check the link between NetNumen U31 and the NMS as follows:

1. In the Fault Management window of the NetNumen U31 client GUI, view the details
   of the alarm to find the port number and IP address of the NMS.
2. On the NetNumen U31 server, ping the IP address of the NMS. If the ping
   fails, check whether the network between the EMS and the NMS has any problem.
3. On the NetNumen U31 server, telnet to the port at the IP address of the NMS. If the
   telnet fails, check whether the NMS process encounters an exception.
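
Where no telnet client is available on the server, the port test in step 3 can be approximated with a plain TCP connection attempt. This is a sketch of a substitute technique, not the procedure prescribed by this manual; the function name and the timeout value are assumptions.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable: report the port as closed.
        return False
```

A False result for the NMS address and port taken from the alarm details corresponds to a failed telnet. The NMS process or a firewall between the two systems is then the likely cause if the ping in step 2 succeeds.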

Chapter 3
QoS Alarm

3.1 1513 Performance Index Threshold Crossing


NetNumen U31 supports the customization of threshold crossing alarms based on different
key performance indexes (KPIs) for Performance Management (PM). You can predefine
the severity of a threshold-crossing alarm for an index and modify the default handling
suggestion for the alarm.
The following describes the performance index threshold crossing alarm, using the default
code number as an example.

Alarm Information
• Code Number: 1513
• Code Name: PM threshold cross-border
• Severity: Minor
• Alarm Type: QoS alarm

Probable Cause
The value of a performance index falls outside the preset threshold range.

Impact on System
The impact of threshold-crossing alarms varies with different performance indexes.

Handling Suggestion
The handling suggestion differs from KPI to KPI. You can add the handling
suggestion of threshold-crossing alarms for a KPI when creating or setting a threshold
task in the Performance Management window of the NetNumen U31 client GUI.
When the value of the KPI is out of the preset threshold range, a threshold-crossing alarm
is reported. You can view the information and the preset handling suggestion of this alarm
by viewing the alarm details.
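
The threshold check described above can be pictured as follows. This is a simplified sketch and not the NetNumen U31 implementation: the function name, the KPI name, the handling suggestion text, and the assumption that the range boundaries themselves are still within the range are all illustrative.

```python
def check_kpi(name, value, lower, upper, suggestion=""):
    """Return an alarm record if value is outside [lower, upper], otherwise None."""
    if lower <= value <= upper:
        return None
    return {
        "code_number": 1513,                       # default code number from this section
        "code_name": "PM threshold cross-border",
        "kpi": name,
        "value": value,
        "handling_suggestion": suggestion,         # preset when the threshold task is created
    }


# Hypothetical KPI whose value should stay between 95.0 and 100.0.
alarm = check_kpi(
    "example_success_rate", 91.5, lower=95.0, upper=100.0,
    suggestion="Example suggestion entered when the threshold task was created.",
)
print(alarm)
```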

Note:
For the meaning and purpose of specific KPIs of a network element (NE), please refer to
the counter and performance index reference manuals of the corresponding equipment.

Chapter 4
Equipment Alarm

4.1 15010001 Performance Data Delayed


Alarm Information
• Code Number: 15010001
• Code Name: Performance data delayed
• Severity: Warning
• Alarm Type: Equipment alarm

Probable Cause
The probable causes of delayed performance data storage are:
• The link between NetNumen U31 and an Operation & Maintenance Module (OMM) is
  broken.
• The link between the NetNumen U31 server and the database is broken.
• The performance data tablespace of the database is full.

Impact on System
NetNumen U31 fails to store the collected performance data in the database. As a
result, the system reports a failure when you query performance data collected
during the delay period or request a performance report that involves that data.

Handling Suggestion
Do the following to handle this alarm:
1. In the Fault Management window of the NetNumen U31 client GUI, check whether any
   link broken alarms are reported, such as “198099803 Link Broken Between OMM and
   NE”.
   If such alarms exist, handle the link fault according to the handling suggestion of that
   alarm.
2. In the Fault Management window of the NetNumen U31 client GUI, query history
   alarms raised during the period when the “Performance Data Delayed” alarm persists.
   If the query result contains the “198099803 Link Broken Between OMM and NE” alarm,
   after the link between the OMM and the NE recovers, start a measurement task to
   re-collect the delayed performance data or wait for automatic data re-collection initiated
   by the system.
   Manual data re-collection method: In the Performance Management window, perform
   the data integrity query. Next, select a specific time period to initiate re-collection.
3. If the database server and the NetNumen U31 server are deployed on two
   different hosts, ping the IP address of the database server from the NetNumen U31
   server. If the ping fails, check the network connection between the database server
   and the NetNumen U31 server for poor contact of network cable connectors or network
   problems.
4. Check and make sure that all necessary database services have been started.
   For detailed instructions on how to check all necessary database services, see
   NetNumen U31 Mobile Network Element Management System Software Installation
   (for UNIX).
5. In the System Monitor window of the NetNumen U31 client GUI, select the database
   server, and then click the View button to open the dialogue box that shows the
   information of database resources. Then check the free space (percentage) of the
   tablespaces related to performance management, that is, the tablespaces whose
   names contain “PM” (these store performance management data). If the percentage of
   free space of a performance management tablespace is smaller than 5%, enlarge
   the tablespace.
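
The rule in step 5 (a tablespace name containing “PM” and less than 5% free space) can be expressed as a small filter once the percentages have been read from the System Monitor dialogue box. The function name and the sample tablespace names below are hypothetical.

```python
def tablespaces_to_enlarge(free_percent_by_name, min_free_percent=5.0):
    """Return the names of PM tablespaces whose free space is below the limit."""
    return sorted(
        name
        for name, free_percent in free_percent_by_name.items()
        # Only tablespaces whose names contain "PM" hold performance management data.
        if "PM" in name and free_percent < min_free_percent
    )


# Made-up tablespace names and free-space percentages for illustration.
free_space = {"PM_DATA": 3.2, "PM_INDEX": 41.0, "FM_DATA": 2.0}
print(tablespaces_to_enlarge(free_space))
```

In this example only PM_DATA is reported: FM_DATA is also nearly full, but its name does not contain “PM”, so it is not a performance management tablespace under the rule in step 5.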

Chapter 5
OMC Alarms

5.1 15010002 NAF Performance Data File Delayed


Alarm Information
• Code Number: 15010002
• Code Name: Naf Performance Data File Delayed
• Severity: Warning
• Alarm Type: OMC alarm

Probable Cause
NetNumen U31 delays storing the collected performance data in the database for a
period of time.

Impact on System
NetNumen U31 cannot generate the Northbound Adapter Function (NAF) performance
data file, and therefore fails to transfer the performance data to the Network Management
System (NMS) that is connected to NetNumen U31 via northbound interfaces.

Handling Suggestion
In the Fault Management window of the NetNumen U31 client GUI, check whether the
alarm “15010001 Performance data delayed” was reported during the failure period. If
yes, follow the instructions provided in section “4.1 15010001 Performance Data
Delayed” to handle the performance data delayed alarm.

5.2 15010003 License Alarm


Alarm Information
• Code Number: 15010003
• Code Name: License Alarm
• Severity: Warning
• Alarm Type: OMC alarm

Probable Cause
The value of some system settings exceeds the maximum number specified by the license.

Impact on System
You cannot configure more network elements, cells, and carrier frequencies than the
number limited by the license.

Handling Suggestion
Do the following to handle this alarm:
1. In the Fault Management window of the NetNumen U31 client GUI, view the details
   of the alarm to find the items whose numbers exceed the limits permitted by the license.
2. Check whether the functions and capacities provided by NetNumen U31 meet the
   requirements for managing the existing network. If not, apply for a change to the existing
   license file to raise the limit, or apply for a new license file.
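
The comparison behind this alarm, configured quantities against licensed maximums, can be sketched as below. The function name, the item keys, and the numbers are made up for illustration; the real limits are those in your license file and the alarm details.

```python
def items_over_license(configured, licensed):
    """Return {item: (configured count, licensed limit)} for items above the limit."""
    return {
        item: (count, licensed[item])
        for item, count in configured.items()
        if item in licensed and count > licensed[item]
    }


# Illustrative numbers only.
configured = {"network_elements": 120, "cells": 900, "carrier_frequencies": 40}
licensed = {"network_elements": 100, "cells": 1000, "carrier_frequencies": 40}
print(items_over_license(configured, licensed))
```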

5.3 1000 User Account Locked


Alarm Information
• Code Number: 1000
• Code Name: User locked
• Severity: Warning
• Alarm Type: OMC alarm

Probable Cause
The probable causes of this alarm are:
• The user enters a wrong password several times in a row while attempting to log in to
NetNumen U31.
• An unauthorized user tries to log in to NetNumen U31 by guessing passwords.

Impact on System
The user account is locked and cannot be used for login.

Handling Suggestion
Check and analyze the login log to determine whether the problem is caused by a password
guessing attack. If not, the system administrator can unlock the user account at the user’s
request.

5.4 1001 Database Overload


Alarm Information
• Code Number: 1001
• Code Name: Hard disk usage of database server overload
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
During the operation of NetNumen U31, data is continuously collected and stored in the
database. Although the database can enlarge its space automatically, the disk space
allocated to it eventually becomes insufficient if you do not clear the data in the
database periodically. This alarm is raised once the disk space occupied by the database
exceeds the preset threshold.

Impact on System
The database occupies too much hard disk space, which may influence the proper
operation of the NetNumen U31 server.

Handling Suggestion
Back up and clear history data in the database periodically.

5.5 1002 CPU Overload of Application Server


Alarm Information
• Code Number: 1002
• Code Name: CPU usage of application server overload
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The Central Processing Unit (CPU) usage of the application server exceeds the preset
threshold.

Impact on System
Long-term CPU overload reduces the response speed of the application server and
influences the proper operation of the NetNumen U31 system.

Handling Suggestion
It is recommended that the system administrator handle this alarm as follows:
• Check that the load of the NetNumen U31 system is within the allowable range.
• Check whether any unnecessary applications are running on the NetNumen U31
server. If yes, exit those unnecessary applications.

5.6 1003 RAM Overload of Application Server


Alarm Information
• Code Number: 1003
• Code Name: Ram usage of application server overload
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The RAM usage of the application server exceeds the preset threshold.

Impact on System
Long-term RAM overload reduces the response speed of the application server and
influences the proper running of the EMS.

Handling Suggestion
It is recommended that the system administrator handle this alarm as follows:
• Check that the load of the NetNumen U31 system is within the allowable range.
• Check whether any unnecessary applications are running on the NetNumen U31
server. If yes, exit those applications to release some RAM.
• Increase the RAM of the application server.

5.7 1004 Hard Disk Overload of Application Server


Alarm Information
• Code Number: 1004
• Code Name: Hard disk usage of application server overload
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
The hard disk usage of the application server exceeds the preset threshold.

Impact on System
The hard disk overload of the application server will influence the proper operation of the
NetNumen U31 system.

Handling Suggestion
It is recommended that the system administrator handle this alarm as follows:
• In the Task Management view of the NetNumen U31 client GUI, execute all the
directory monitoring tasks under the File Clear node on the Task Management tree
to clear the data from all backup directories.
For instructions on how to run a directory monitoring task, refer to NetNumen U31
Unified Element Management System Maintenance Management Operation Guide.
• Check that the space of the hard disk in the application server has been properly
allocated.
• Increase the hard disk capacity.
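
Before and after clearing data, the disk usage can be compared against a threshold with a few lines of Python. This is a minimal sketch using the standard `shutil.disk_usage` call; the path and the 80% threshold are assumptions chosen for illustration and are not the values configured in NetNumen U31.

```python
import shutil


def disk_usage_percent(path):
    """Return the used share (0-100) of the file system that contains path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total * 100.0


def is_overloaded(path, threshold_percent):
    """True when the usage of the file system exceeds the given threshold."""
    return disk_usage_percent(path) > threshold_percent


# "." and 80.0 are example values only.
print(round(disk_usage_percent("."), 1), is_overloaded(".", 80.0))
```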

5.8 1006 Directory Size Threshold Crossing


Alarm Information
• Code Number: 1006
• Code Name: Directory size exceed the threshold
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The size of a directory on the NetNumen U31 server exceeds the preset threshold.

Impact on System
The oversized directory will influence the proper operation of the NetNumen U31 system.

Handling Suggestion
Please ask the system administrator to clear the data under the directory.
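
To confirm that a directory has dropped back below its threshold after cleanup, its size can be measured directly. The following is a minimal sketch, assuming that the directory size is the sum of the sizes of the regular files beneath it; how NetNumen U31 itself measures a directory is not described here.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symbolic links are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.islink(full_path):
                continue
            try:
                total += os.path.getsize(full_path)
            except OSError:
                # The file disappeared or is unreadable; ignore it.
                pass
    return total


def exceeds_threshold(root, threshold_bytes):
    """True when the directory is larger than the threshold."""
    return directory_size_bytes(root) > threshold_bytes
```

For example, `exceeds_threshold(path, 10 * 1024**3)` checks a directory against a 10 GB limit; both the path and the limit are supplied by the caller.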

5.9 1008 Database Space Threshold Crossing


Alarm Information
• Code Number: 1008
• Code Name: Database space usage too large
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
The space occupied by a database instance exceeds the preset threshold.

Impact on System
The available space of the database is insufficient to store the collected data, which can
result in the loss of some data and the failure of data storage.

Handling Suggestion
Do the following to remove the probable causes of this alarm:
• Back up and delete history data periodically.
• Clean the database periodically.
• Allocate more space to the database instance.

5.10 1009 Synchronization Failure of Server Time


Alarm Information
• Code Number: 1009
• Code Name: Server timer synchronize fail
• Severity: Warning
• Alarm Type: OMC alarm

Probable Cause
The probable causes of the time synchronization failure are:
• The server acting as the clock source is not properly configured on the client.
• The clock synchronization port of the server is disabled.
• The network connection between the server and the client is faulty.

Impact on System
The time synchronization failure can result in inconsistent system time between the client
and the server, which influences the proper operation of the server.

Handling Suggestion
Do the following to troubleshoot the time synchronization failure:
1. On the client, check the ums-client\works\global\deploy\deploy-usf.properties
file to find whether the following two parameters are correctly set:
• usf.components.clocksync.source01.ip
• usf.components.clocksync.source01.port
Make sure that these two parameters are set to the actual IP address and port number
of the server.
2. On the server, check the setting of the parameter usf.components.clocksync.sync.port
in the corresponding profile. If the parameter value is zero, set the parameter
to a non-zero value to enable the SNTP service on the clock synchronization port.
The profile containing this parameter varies with the version of Unified Element
management system Platform (UEP) and the actual network scale.
• For UEP of version 12, check this parameter in the
ums-server\works\main\deploy\deploy-usf-firewall.properties file.
• For UEP of version 13 or later versions, check this parameter in the
ums-server\works\uep\deploy\deploy-uep-cluster-cluster.properties file in
the case of Scale 1, or in the
ums-server\works\cluster\deploy\deploy-cluster.properties file in the case
of other scales.
3. On the client, ping the IP address of the server to check whether the network
connection between the client and the server is normal.
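
The parameter checks in steps 1 and 2 can be sketched as follows. This is a minimal illustration: the parser handles only simple `key=value` lines (real Java properties files also allow other separators and escape sequences), the IP address and port values are hypothetical, and treating a missing port parameter as disabled is an assumption.

```python
def parse_properties(text):
    """Parse simple 'key=value' lines; blank lines and '#'/'!' comments are skipped."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_clock_source(props, server_ip, server_port):
    """Step 1: list the client-side clock source settings that do not match the server."""
    problems = []
    if props.get("usf.components.clocksync.source01.ip") != server_ip:
        problems.append("source01.ip does not match the server IP address")
    if props.get("usf.components.clocksync.source01.port") != str(server_port):
        problems.append("source01.port does not match the server port")
    return problems


def sync_port_enabled(props):
    """Step 2: a zero port disables the service (a missing value is assumed disabled)."""
    value = props.get("usf.components.clocksync.sync.port", "0")
    return value.isdigit() and int(value) != 0


# Hypothetical file contents, for illustration only.
client_text = """
# clock source (example values)
usf.components.clocksync.source01.ip=10.0.0.5
usf.components.clocksync.source01.port=123
"""
server_text = "usf.components.clocksync.sync.port=0\n"

print(check_clock_source(parse_properties(client_text), "10.0.0.5", 123))
print(sync_port_enabled(parse_properties(server_text)))
```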

5.11 1010 Broken Link Between Server and Alarm Box


Alarm Information
• Code Number: 1010
• Code Name: The link between the Server and the alarm box is broken
• Severity: Critical
• Alarm Type: OMC alarm

Probable Cause
The probable causes of a broken link between the NetNumen U31 server and the alarm box
are:
• The network connection between the server and the alarm box is abnormal.
• The configured IP address of the alarm box on the server is different from the actual
IP address of the alarm box.
• The server IP address configured in the alarm box is different from the actual IP
address of the server.
• The port number configured in the alarm box is different from that configured on the
server.

Impact on System
The NetNumen U31 server fails to send alarm information to the alarm box.

Handling Suggestion
Do the following to check for network connection problems and inconsistent IP address
configuration:
1. In the Fault Management window of the NetNumen U31 client GUI, get the configured
IP address of the alarm box. Then open a terminal window on the NetNumen U31
server and ping this IP address. If the ping fails, check the power connection and
network connection of the alarm box as follows:
a. Check that the power cable is firmly connected to the alarm box and the input
voltage meets the power supply requirements of the alarm box. Next, switch on
the alarm box and check whether the menu is normally displayed on the screen
of the alarm box. If not, reset or replace the alarm box.
b. Check the network cable connected to the alarm box and make sure that the
network cable is intact and in good contact with the network port (Lan1) of the
alarm box.

2. On the screen of the alarm box, choose the appropriate menu to view the IP address
of the alarm box, and then check whether the alarm box’s IP address configured on
the NetNumen U31 server is the same as this IP address. If not, change the IP address
of the alarm box on the server to the actual one.
3. On the alarm box, press the left arrow key to view the version of the alarm box, and
select one of the following methods to check the server IP address configured in the
alarm box depending on the version of the alarm box.
• If the version of the alarm box is V3 or an earlier version, choose the appropriate
menu to view the IP address of the NetNumen U31 server on the screen of the
alarm box, and then check whether it is the same as the actual IP address of the
server. If not, change the server IP address on the alarm box to the actual one.
• If the version of the alarm box is V5, open a terminal window on the server, type
the telnet [alarm box IP address] [601] command, and then enter the
password alarmpro to get access to the alarm box. Then run the tcpCfgShow
command to view the server IP address configured on the alarm box.
If the configured IP address is different from the actual IP address of the server,
change the server IP address on the alarm box to the actual one with the following
command:
cfgTcpComm [Serial No.] [Server IP] [Port No.] [Group No.]
4. On the screen of the alarm box, choose the appropriate menu to view the port
configuration of the alarm box, and then check whether the configured port number
is the same as the port number set on the server. If not, change the port number on the
alarm box so that it is the same as the port number set on the server.
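
Steps 2 to 4 amount to three consistency checks between the values noted on the server and on the alarm box. The sketch below illustrates them in Python; the dictionary keys, IP addresses, and port number are hypothetical values that an operator would copy by hand, not data read from either device.

```python
def find_mismatches(server_view, alarm_box_view):
    """Compare the settings noted on the server with those noted on the alarm box."""
    problems = []
    if server_view["alarm_box_ip"] != alarm_box_view["own_ip"]:
        problems.append("alarm box IP configured on the server differs from the actual one")
    if alarm_box_view["server_ip"] != server_view["own_ip"]:
        problems.append("server IP configured on the alarm box differs from the actual one")
    if alarm_box_view["port"] != server_view["port"]:
        problems.append("port numbers on the alarm box and the server differ")
    return problems


# Hypothetical values, for illustration only.
server_view = {"own_ip": "10.0.0.5", "alarm_box_ip": "10.0.0.60", "port": 5000}
alarm_box_view = {"own_ip": "10.0.0.61", "server_ip": "10.0.0.5", "port": 5000}

for problem in find_mismatches(server_view, alarm_box_view):
    print(problem)
```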

5.12 1011 Running Failure of the Whole Database Backup Task


Alarm Information
• Code Number: 1011
• Code Name: Failed to execute the whole DB structure backup task
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The system fails to back up the whole database by executing the corresponding database
backup task.

Impact on System
The restoration of the whole database will fail because the backup data is unavailable.

Handling Suggestion
Find the cause of the backup failure and handle the fault as follows:
1. Check whether the network service name of the Oracle database is correctly set in the
format of SID_IP.
2. Check whether all necessary database services have been started and are properly
running.
3. View the details of the alarm and do appropriate checks according to the error
information as follows:
• If the alarm details report a failure to obtain the database password, check the
JCA data source of each database and make sure that each data source has been
correctly set and has been configured with a password.
• If the alarm details report insufficient disk space, check the disk space on the
server and make sure that enough space is available for storing the backup
file.
• If the alarm details report multiple failures, such as failure to query data
from the database, read the basic table definition, obtain tablespace information,
obtain data file information, obtain the version information of SQL Server, or
obtain the installation path of SQL Server, check whether the connection to the
database is normal.
• If the alarm details report that a database to be backed up is unavailable in the
instance, check whether the database exists in the current database instance.
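
The format check in step 1 can be sketched as follows. This is only an illustration: it assumes that SID_IP means a non-empty SID, an underscore, and then an IP address, and the sample name is hypothetical. Confirm the exact naming convention against the installation documentation before relying on it.

```python
import ipaddress


def split_service_name(name):
    """Return (sid, ip) if name looks like SID_IP, otherwise None."""
    sid, sep, ip_text = name.rpartition("_")
    if not sep or not sid:
        return None
    try:
        ipaddress.ip_address(ip_text)
    except ValueError:
        return None
    return sid, ip_text


# Hypothetical service names, for illustration only.
print(split_service_name("ORCL_10.0.0.8"))
print(split_service_name("ORCL"))
```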

5.13 1012 License Has Expired


Alarm Information
• Code Number: 1012
• Code Name: License is expired
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The license of NetNumen U31 has expired.

Impact on System
You cannot use the NetNumen U31 system any longer.

Handling Suggestion
Contact the system administrator for a new license.

5.14 1013 License Will Expire


Alarm Information
• Code Number: 1013
• Code Name: License is about to expire
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The license of the NetNumen U31 system will expire soon.

Impact on System
You cannot use the NetNumen U31 system after the license expires.

Handling Suggestion
Contact the system administrator for a new license.

5.15 1014 Broken Link Between Server and NE


Alarm Information
• Code Number: 1014
• Code Name: The link between the Server and the NE is broken
• Severity: Critical
• Alarm Type: OMC alarm

Probable Cause
The probable causes of a broken link between the NetNumen U31 server and an NE are:
• The link between the NE and its Operation & Maintenance Module (OMM) is broken.
• The link between the NetNumen U31 server and the OMM of the NE is broken.

Impact on System
• If the link between the NE and its OMM is broken, the OMM fails to obtain performance
and alarm data from the NE.
• If the link between NetNumen U31 and the OMM is broken, NetNumen U31 fails to
get performance and alarm data of the NE via the OMM.

Handling Suggestion
Do the following to troubleshoot the broken link between the server and the NE:
1. Check the connection between the OMM and the NE as follows:
a. In the Configuration Management window of the NetNumen U31 client GUI, get
the IP address of the NE.
b. On the OMM server, ping the IP address of the NE. If the ping fails, check whether
the network between the OMM and the NE is faulty and whether the NE itself
is faulty.
2. Check the connection between NetNumen U31 and the OMM as follows:
a. In the Fault Management window of the NetNumen U31 client GUI, view the
details of the alarm to find the name of the OMM to which the link is broken.
b. In the Topology Management window of the NetNumen U31 client GUI, find the
OMM, and then view the properties of the OMM to get the Environment Monitor
Board (EMB) port number and the File Transfer Protocol (FTP) port number.
c. On the NetNumen U31 server, try to telnet to the EMB port and FTP port of the OMM.
If the telnet to the EMB port or the FTP port fails, check whether the EMB port or
the FTP port is enabled on the OMM.
d. If these two ports are enabled, execute the netstat -ano command on the
OMM server to check whether these two ports are occupied by other processes.
e. If these two ports are not occupied by other processes, check the operation log
on the OMM server for exceptions during the startup process. If any exception
is found in the log, restart the OMM and make sure that the OMM is successfully
started without exception.
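
Where a telnet client is not available, the reachability test in step 2c can be approximated with a plain TCP connection attempt. The following is a minimal sketch using the standard `socket` module; it only shows whether the port accepts a TCP connection and does not speak the EMB or FTP protocol. The host and port are supplied by the caller.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_reachable(omm_ip, ftp_port)` returns False when the FTP port is closed or filtered, where `omm_ip` and `ftp_port` are the values obtained in step 2b.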

5.16 1015 Broken Link Between Server and NE Agent


Alarm Information
• Code Number: 1015
• Code Name: The link between the Server and the NE Agent is broken
• Severity: Critical
• Alarm Type: OMC alarm

Probable Cause
The link between NetNumen U31 and an Operation & Maintenance Module (OMM) is
broken, which results in communication interruption.

Impact on System
NetNumen U31 fails to manage the OMM and related NEs. The OMM cannot receive
any synchronization information or Man-Machine Language (MML) commands from
NetNumen U31.

Handling Suggestion
Check the connection between NetNumen U31 and the OMM as follows:

1. In the Fault Management window of the NetNumen U31 client GUI, view the details
of the alarm to find the name of the OMM to which the link is broken.
2. In the Topology Management window of the NetNumen U31 client GUI, find the OMM,
and then view its properties to get the EMB port number and the FTP port number of
the OMM.
3. On the NetNumen U31 server, try to telnet to the EMB port and FTP port of the OMM.
If the telnet to the EMB port or the FTP port fails, check whether the EMB port or the
FTP port is enabled on the OMM.
4. If these two ports are enabled, execute the netstat -ano command on the OMM
server to check whether these two ports are occupied by other processes.
5. If these two ports are not occupied by other processes, check the operation log on
the OMM server for exceptions during the startup process. If any exception is found in
the log, restart the OMM and make sure that the OMM is successfully started without
exception.
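
The netstat check in step 4 can be automated by scanning the command output for the port in question. The sketch below assumes the Windows-style column layout of `netstat -ano` (protocol, local address, foreign address, optional state, PID); the sample output, addresses, and PIDs are invented for illustration.

```python
def find_port_owners(netstat_output, port):
    """Return the sorted PIDs whose local address uses the given port."""
    owners = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol; header and blank lines are skipped.
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        local_port = fields[1].rsplit(":", 1)[-1]
        pid = fields[-1]
        if local_port == str(port) and pid.isdigit():
            owners.add(int(pid))
    return sorted(owners)


# Invented sample in the assumed layout, for illustration only.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:21             0.0.0.0:0              LISTENING       1500
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       884
  TCP    10.0.0.8:21            10.0.0.9:50412         ESTABLISHED     1500
  UDP    0.0.0.0:123            *:*                                    1020
"""

print(find_port_owners(SAMPLE, 21))
```

If the returned PID does not belong to the OMM process, another process is occupying the port.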

5.17 1016 Alarm Frequency Threshold Crossing


Alarm Information
• Code Number: 1016
• Code Name: Frequency of warning overload
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
The occurrence frequency of an alarm during a specified period exceeds the preset
threshold. According to the alarm counting rule, the system raises a new alarm, indicating
the frequent occurrence of the same alarm.

Note:
This alarm only occurs when you have properly set the alarm counting rule and specified
the alarm conditions.

Impact on System
This alarm indicates that a fault is occurring repeatedly in the system, which signals a
potential risk. Timely troubleshooting is required when this alarm is raised.

Handling Suggestion
Do the following to handle this alarm:
1. In the Fault Management window of the NetNumen U31 client GUI, view the details of
the alarm to find the original alarm that occurred repeatedly, which causes this alarm.
2. Find the handling suggestion of the original alarm by its alarm code, and then handle
the original alarm according to the suggestion.
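
The idea behind an alarm counting rule can be illustrated with a sliding time window. This is a minimal sketch under assumed semantics (a new alarm is signalled when the number of occurrences inside the window exceeds the threshold); the actual evaluation logic of NetNumen U31 is not described in this document, and the class name and values are hypothetical.

```python
from collections import deque


class AlarmCounter:
    """Count occurrences of one alarm within a sliding time window."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()

    def record(self, timestamp):
        """Record one occurrence; return True if the count in the window exceeds the threshold."""
        self._times.append(timestamp)
        # Drop occurrences that fell out of the window.
        while self._times and timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) > self.threshold


# Example: more than 3 occurrences within 60 seconds triggers the rule.
counter = AlarmCounter(threshold=3, window_seconds=60)
results = [counter.record(t) for t in (0, 10, 20, 30, 200)]
print(results)
```

The fourth occurrence crosses the threshold; the fifth arrives after the earlier ones have left the window, so the count starts again from one.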

5.18 1017 Alarm Duration Threshold Crossing


Alarm Information
• Code Number: 1017
• Code Name: The time in which the designated alarm remains active has expired
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
When an alarm persists in the NetNumen U31 system for so long that its duration
exceeds the preset threshold specified in the alarm duration rule, the system raises a new
alarm, indicating the long persistence of this alarm.

Note:
This alarm only occurs when you have properly set the alarm duration rule and specified
the alarm-raising conditions.

Impact on System
This alarm indicates a persistent alarm that has not been cleared during the specified
period. Timely troubleshooting is required when this alarm is raised.

Handling Suggestion
Do the following to handle this alarm:
1. In the Fault Management window of the NetNumen U31 client GUI, view the details
of the alarm to find the original alarm that persists for a long time, which causes this
alarm.
2. Find the handling suggestion of the original alarm by its alarm code, and then handle
the original alarm according to the suggestion.
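
An alarm duration rule boils down to comparing the age of each active alarm with a limit. The sketch below shows this comparison in Python; the alarm identifiers, timestamps, and two-hour limit are hypothetical, and the way NetNumen U31 evaluates its own rules may differ.

```python
from datetime import datetime, timedelta


def alarms_over_duration(active_alarms, now, max_duration):
    """Return the IDs of active alarms that have lasted longer than max_duration."""
    return sorted(
        alarm_id
        for alarm_id, raised_at in active_alarms.items()
        if now - raised_at > max_duration
    )


# Hypothetical active alarms: ID -> time at which the alarm was raised.
now = datetime(2011, 12, 5, 12, 0, 0)
active_alarms = {
    "alarm-A": datetime(2011, 12, 5, 9, 0, 0),    # active for 3 hours
    "alarm-B": datetime(2011, 12, 5, 11, 30, 0),  # active for 30 minutes
}

print(alarms_over_duration(active_alarms, now, timedelta(hours=2)))
```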

5.19 1018 Duration Threshold Crossing of Unacknowledged Alarm


Alarm Information
• Code Number: 1018
• Code Name: The time in which the designated alarm remains unacknowledged has
expired
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
When an alarm has not been acknowledged within the period specified in the alarm
duration rule, the system raises a new alarm, indicating that the alarm remains
unacknowledged.

Note:
This alarm only occurs when you have properly set the alarm duration rule and specified
the alarm-raising conditions.

Impact on System
This alarm indicates that an alarm has not been acknowledged during the specified period.
Timely troubleshooting is required when this alarm is raised.

Handling Suggestion
Do the following to handle this alarm:
1. In the Fault Management window of the NetNumen U31 client GUI, view the details
of the alarm to find the original alarm that causes this alarm.
2. Find the handling suggestion of the original alarm by its alarm code, and then handle
the original alarm according to the suggestion.

5.20 1019 TRAP Messages Discarded


Alarm Information
• Code Number: 1019
• Code Name: Speed of trap receive too fast, discard some trap
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The NetNumen U31 server receives too many TRAP messages from managed NEs during
a short time, which may be caused by an alarm storm.

Impact on System
NetNumen U31 discards some TRAP messages received from OMMs that report too many
TRAP messages.

Handling Suggestion
Do the following to handle this alarm:

1. Check the NEs that report too many TRAP messages and find whether an alarm
storm is occurring on them.
2. Reset the alarm threshold by modifying the usf.usf.trap.queue parameter, whose
default value is 2000, in the corresponding profile.
The profile containing this parameter varies with the UEP version:
• For UEP of version 12, modify this parameter in the
\ums-server\works\ftp\data\config\deploy-usf.properties file.
• For UEP of version 13 or 20, modify this parameter in the
\ums-server\works\cluster\ftpdata\config\deploy-usf.properties file.
• For UEP of version 30 or later versions, modify this parameter in the
\ums-server\works\global\deploy\deploy-usf.properties file.
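
The effect of the threshold can be pictured as a bounded queue that drops messages once it is full. The sketch below is an illustration under that assumption, suggested by the parameter name; this document does not describe how NetNumen U31 buffers TRAP messages internally, and the class is hypothetical.

```python
from collections import deque


class TrapQueue:
    """Bounded FIFO queue that discards new messages when it is full."""

    def __init__(self, capacity=2000):
        self.capacity = capacity
        self.discarded = 0
        self._items = deque()

    def offer(self, trap):
        """Queue a message; return False and count it as discarded if the queue is full."""
        if len(self._items) >= self.capacity:
            self.discarded += 1
            return False
        self._items.append(trap)
        return True

    def poll(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._items.popleft() if self._items else None


# A small capacity makes the discard behavior visible.
queue = TrapQueue(capacity=3)
accepted = [queue.offer(f"trap-{i}") for i in range(5)]
print(accepted, queue.discarded)
```

Under this model, a larger capacity absorbs longer bursts, but it does not help if an alarm storm keeps producing messages faster than they are processed, which is why step 1 comes first.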

5.21 1021 Running Failure of the Basic Database Backup Task


Alarm Information
• Code Number: 1021
• Code Name: Fail to execute the basic database backup task
• Severity: Major
• Alarm Type: OMC alarm

Probable Cause
The system fails to complete the task for backing up the basic data of the database.

Impact on System
The restoration of the database will fail because the backup data is unavailable.

Handling Suggestion
Find the cause of the backup failure and handle the fault as follows:
1. Check whether the network service name of the Oracle database is correctly set in the
format of SID_IP.
2. Check whether all necessary database services have been started and are properly
running.
3. View the details of the alarm and do appropriate checks according to the error
information:
• If the alarm details report a failure to obtain the database password, check the
JCA data source of each database and make sure that each data source has been
correctly set and has been configured with a password.
• If the alarm details report insufficient disk space, check the disk space on the
server and make sure that enough space is available for storing the backup
file.
• If the alarm details report multiple failures, such as failure to query data from the
database, read the basic table definition, obtain tablespace information, obtain
data file information, obtain the version information of SQL Server, or obtain the
installation path of SQL Server, check whether the connection to the database
is normal.
• If the alarm details report that a database to be backed up is unavailable in the
instance, check whether the database exists in the current instance.

5.22 1022 New Alarm Raised Based on the Alarm Merging Rule


Alarm Information
• Code Number: 1022
• Code Name: Relative alarms arise a new alarm
• Severity: Minor
• Alarm Type: OMC alarm

Probable Cause
When a fault causes multiple alarms, the system can merge these alarms according to a
preset alarm merging rule and raise a new alarm that indicates the existence of alarms
caused by the same source.

Impact on System
By merging the alarms caused by the same source into one alarm, the system reduces
the count of alarms displayed on the GUI so that you can find specific alarms more easily.

Handling Suggestion
Do the following to handle this alarm:

1. Click the “+” sign before this alarm in the current alarm table to show all merged alarms.
2. Find the handling suggestion of each alarm by its alarm code, and then handle each
according to the corresponding suggestion.
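
The principle of merging alarms by source can be illustrated with a simple grouping step. This is a sketch only: it assumes that each alarm record carries a source identifier, uses alarm codes from this chapter with invented source names, and does not reproduce the merging rules that can be configured in NetNumen U31.

```python
from collections import defaultdict


def merge_by_source(alarms):
    """Group alarm codes by source; keep only sources that raised several alarms."""
    groups = defaultdict(list)
    for code, source in alarms:
        groups[source].append(code)
    return {source: codes for source, codes in groups.items() if len(codes) > 1}


# Hypothetical alarm records: (alarm code, source), for illustration only.
alarms = [
    (1014, "OMM-1"),
    (1015, "OMM-1"),
    (1019, "OMM-1"),
    (1004, "app-server"),
]

print(merge_by_source(alarms))
```

Each group corresponds to one merged alarm on the GUI, and expanding the group shows the original alarms to be handled individually.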


Glossary
CPU
- Central Processing Unit
EMB
- Environment Monitor Board
EMS
- Network Element Management System
FTP
- File Transfer Protocol
GUI
- Graphical User Interface
J2EE
- JAVA 2 platform Enterprise Edition
KPI
- Key Performance Index
MML
- Man Machine Language

NAF
- Northbound Adapter Function

NE
- Network Element
NMS
- Network Management System
OMC
- Operation & Maintenance Center
OMM
- Operation & Maintenance Module
PM
- Performance Management
QoS
- Quality of Service

RAM
- Random Access Memory
SNTP
- Simple Network Time Protocol

SQL
- Structured Query Language
TRAP
- Trap
UEP
- Unified Element management system Platform
