Professional Documents
Culture Documents
1
Capacity Monitoring Guide
(BSC6910-based)
Issue Draft A
Date 2015-01-15
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer.
All or part of the products, services and features described in this document may not be within the purchase scope or
the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this
document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or
implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation
of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this
document do not constitute a warranty of any kind, express or implied.
Email: support@huawei.com
1.1 Overview
Growing traffic in mobile networks, especially in newly deployed networks, requires more
and more network resources, such as radio and transmission resources. Lack of network
resources will affect user experience. Therefore, monitoring network resources, locating
bottlenecks, and performing capacity expansion in real time are critical to the provision of
high quality services.
This document describes how to monitor the usage of various network resources and locate
network resource bottlenecks.
This document applies to BSC6910s and 3900 series base stations.
Radio Network Controllers (RNCs) mentioned in this document refer to Huawei BSC6910s.
For details about the MML commands, parameters, alarms, and performance counters, see section
"Operation and Maintenance" in the BSC6910 UMTS Product Documentation or 3900 Series
WCDMA NodeB Product Documentation.
For details about flow control, see Flow Control Feature Parameter Description in the RAN
Feature Documentation.
1.2 Organization
This document is organized as follows:
Chapter Description
1 Overview Introduces network resources and monitoring methods.
2 Network Resource Provides capacity monitoring principles, monitoring
Monitoring methods, and optimization suggestions.
3 Network Resource Provides methods for analyzing and locating network
Troubleshooting congestion and overload problems.
4 Metrics Definitions Provides a table that includes all metrics currently in use.
5 Reference Documents Lists the documents referenced within the text and provides
the document name, document package, and document
package download path at http://support.huawei.com.
Draft A (2015-01-15)
This is a Draft A release of RAN17.1.
Compared with Issue 01 (2015-01-15) of RAN17.0, Draft A of RAN17.1 includes the
following changes.
Contents
1 Overview .......................................................................................................................................... 5
1.1 Network Resources .......................................................................................................................................... 5
1.2 Monitoring Methods ......................................................................................................................................... 7
1 Overview
SCU provides the function of inter-subrack information exchange in the RNC. When the
traffic volume of inter-subrack communication approaches the overload threshold, voice
service quality, data service quality, and network KPIs deteriorate, causing the system to
become unstable.
RNC license
The RNC license (BSC6910) can be controlled based on the following license resource
items:
− PS user-plane traffic volume
− Voice traffic volume (Erlangs)
− HSDPA traffic volume
− HSUPA traffic volume
− RNC hardware capacity
− Cell-level HSDPA/HSUPA user number
For details about other resources, counters, and monitoring thresholds, see the corresponding
sections.
This procedure applies to most resource monitoring scenarios, but sometimes the system may
be overloaded because of other abnormalities instead of traffic increases. Some of these
abnormalities can be discovered by resource cross-checking. If more extensive resource
cross-checking is unsuccessful, more advanced troubleshooting is required. Chapter 3
"Network Resource Troubleshooting" addresses the more advanced troubleshooting
procedures.
For the sake of simplicity, it is assumed that there are no other abnormalities on the network
during the proactive monitoring process described in chapter 2 "Network Resource
Monitoring."
load exceeds 60% during peak hours for n days (n is specified according to the actual
situation; 3 days by default) in a week, contact Huawei for technical support.
NOTE
In effective mode, run the MML command LST BRD to query information about a specific board,
for example, whether the board is an interface board.
In ineffective mode, find the MML command ADD BRD in the RNC configuration file. If the value
of the BRDCLASS parameter is INT, the board is an interface board. You can obtain information
about this interface board using this command.
unstable. Therefore, the SCU CPU load and inter-subrack bandwidth need to be monitored for
SCU boards.
The licenses related to the traffic volume are listed in the following table.
The licenses related to the BSC6910 hardware capacity are listed in the following table.
The BSC6910 Hardware Capacity License Utilization Rate is calculated using the following
formula:
RNC Hardware Capacity License Utilization Rate = VS.HWLicense.Thruput.RNC/(RNC
Throughput Hardware Capacity (per 50Mbps)License allocated x 50 x 1000) x 100%
The license related to the BSC6910 ENIUa traffic is described in the following table.
The BSC6910 ENIUa License Utilization Rate is calculated using the following formula:
RNC ENIUa License Utilization Rate = VS.NIUPSLoad.Thruput.RNC/(Evolved Network
Intelligence Processing Throughput License allocated x 50 x 1000) x 100%
T
h
e
License Item Abbreviation Sales Unit
B
S Active User Hardware Capacity (per
RNC LQW1HWCL04 1000 Active Users
C Active Users)
1000
6
9
10 active user number capacity license is described in the following table.
The Active User License Utilization Rate is calculated using the following formula:
Active User License Utilization Rate = (VS.CellDCHUEs.RNC +
VS.CellFACHUEs.RNC)/(RNC Active User Hardware Capacity (per 1000 Active
Users)License allocated x 1000) x 100%
The licenses related to the cell-level HSDPA/HSUPA user number capacity are listed in
the following table.
The Cell HSDPA Users License Utilization Rate is calculated using the following
formula:
Cell HSDPA Users License Utilization Rate = VS.HSDPA.UE.Mean.Cell/Max HSDPA
users of the HSDPA users license per cell x 100%
NOTE
In the preceding formula:
The value of VS.HSDPA.UE.Mean.Cell is equal to the value of VS.HSDPA.UE.Mean.Cell for the
cell with the largest number of HSDPA users in the RNC.
The value of Max HSDPA users of the HSDPA users license per cell is equal to the maximum
licensed value of the number of HSDPA users in a cell configured.
The Cell HSUPA Users License Utilization Rate is calculated using the following
formula:
Cell HSUPA Users License Utilization Rate = VS.HSUPA.UE.Mean.Cell/Max HSUPA
users of the HSUPA users license per cell x 100%
NOTE
In the preceding formula:
The value of VS.HSUPA.UE.Mean.Cell is equal to the value of VS.HSUPA.UE.Mean.Cell for the
cell with the largest number of HSUPA users in the RNC.
The value of Max HSUPA users of the HSUPA users license per cell is equal to the maximum
licensed value of the number of HSUPA users in a cell configured.
When the traffic volume license utilization rate during peak hours exceeds 80%, perform
capacity expansion immediately.
When the hardware capacity license utilization rate or cell-level HSDPA/HSUPA user number
license utilization rate during peak hours reaches 60% for n days (n is specified according to
the actual situation; 3 days by default), a prewarning is required. When the hardware capacity
license utilization rate or cell-level HSDPA/HSUPA user number license utilization rate
during peak hours exceeds 70%, perform capacity expansion immediately.
Each cell has only one RACH. When signaling and user data coexist on the RACH, the
RACH usage is calculated using the following formula:
RACH Usage = ((VS.CRNCIubBytesRACH.Rx - VS.TRBNum.RACH x 360/8) x
8/168)/(<SP> x 60 x 4/0.02) + VS.TRBNum.RACH/(<SP> x 60 x 4/0.02)
The downlink cell load is indicated by the mean utility ratio of transmitted carrier power in a
cell.
The mean utility ratio of the transmitted carrier power for non-HSPA users in a cell
(including non-HSPA users on CCHs) is calculated using the following formula:
MeanTCP (NonHS) Usage = MeanNonHSTCP/MAXTXPOWER x 100%
The mean utility ratio of the transmitted carrier power for all users in a cell is calculated
using the following formula:
MeanTCP Usage = MeanTCP/MAXTXPOWER x 100%
NOTE
To obtain MAXTXPOWER, run the LST UCELL command to query the value of the Max
Transmit Power of Cell parameter, and convert the parameter value from the unit "0.1 dBm" to
"watt."
The NodeB measures the RTWP on each receive channel in each cell. The cell RTWP
obtained by the RNC is the linear average of the RTWPs measured on all receive channels in
a cell under the NodeB. The RTWP indicates the interference to a NodeB and the signal
strength on the RX port on the RF module.
The uplink cell capacity is restricted by the rise over thermal (RoT), which equals the RTWP
minus the cell's background noise. The formula is as follows:
The relationship between the RoT and the uplink load factor
UL is as follows:
1
RoT 10 log( )
1 UL
For example, a 3 dB noise increase corresponds to 50% of the uplink load and a 6 dB noise
increase corresponds to 75% of the uplink load.
Figure 2-3 Relationship between RTWP, noise increase, and uplink load
A large RTWP value in a cell is caused by traffic overflow, hardware faults (for example, poor
quality of antennas or feeder connectors), or external interference. If the RTWP value is too
large, the cell coverage shrinks, the quality of admitted services declines, or new service
requests are rejected.
If the value of the VS.MeanRTWP counter is greater than –90 dBm during peak hours
for n days (n is specified according to the actual situation; 3 days by default) in a week,
there may be hardware faults or external interference. Locate and rectify the faults and
attempt to eliminate the interference. If the value of the VS.MeanRTWP counter is still
greater than –90 dBm after hardware faults are rectified and external interference is
eliminated, enable the following features as required:
− WRFD-140215 Dynamic Configuration of HSDPA CQI Feedback Period
The system reserves code resources for HSDPA services, and these code resources can be
shared among HSDPA services. Therefore, HSDPA services do not require admission control
based on cell code resources.
Figure 2-7 shows NodeB-controlled dynamic code allocation.
Enable the WRFD-010631 Dynamic Code Allocation Based on NodeB feature if this
feature has not been enabled. Preferentially allocate idle codes to HSDPA UEs to
improve the HSDPA UE throughput.
Add a carrier or split the sector.
Enable the WRFD-010653 96 HSDPA Users per Cell feature if it is not enabled.
NOTE
For details about how to enable the WRFD-010631 Dynamic Code Allocation Based on NodeB feature
and the WRFD-010653 96 HSDPA Users per Cell feature, see HSDPA Feature Parameter Description in
the RAN Feature Documentation.
2.11 CE Usage
2.11.1 Monitoring Principles
CE resources are baseband processing resources in a NodeB. The more CEs a NodeB supports,
the stronger the NodeB's service processing capability. If a new call arrives but there are not
enough CEs (not enough baseband processing resources), the call will be blocked.
Uplink CE resources can be shared in an uplink resource group, but not between uplink
resource groups. Downlink CE resources are associated with the baseband processing
boards where a cell is set up. CE resources allocated by licenses are shared among services on
the NodeB. (CE resources are shared on a per operator basis in MOCN scenarios.)
The NodeB sends the RNC a response message that carries its CE capability. The NodeB's CE
capability is limited by both the installed hardware and the configured software licenses.
The methods of calculating the credit resource usage of admitted UEs are different before and
after the CE Overbooking feature is enabled. Table 2-3 describes the details.
Table 2-3 Credit resources consumed by admitted UEs before and after CE Overbooking is
enabled
Before or After CE Credit Resource Consumed by Admitted UEs
Overbooking is Enabled
Before CE Overbooking The RNC calculates the usage of CEs for admitted UEs by
is enabled adding up credit resources reserved for each UE.
R99 UE: The RNC calculates the usage of credit resources
for an R99 UE based on the mobility binding record
(MBR).
HSUPA UE: The RNC calculates the usage of credit
resources for an HSUPA UE based on MAX (GBR,
Rateone RLC PDU).
NOTE
For details about CE Overbooking, see CE Overbooking Feature Parameter Description.
CCHs do not require extra CE resources because the RNC reserves CE resources for services
on these channels. Signaling carried on an associated channel of the DCH does not consume
extra CE resources, either. One CE can be consumed by a 12.2 kbit/s voice call.
For details about CE number consumed by different services, see CE Resource Management
Feature Parameter Description.
In IP transmission, the BSC6910 only support transmission resource pool mode and does not
support resource groups.
Iub bandwidth congestion will cause RRC connection setup and RAB setup failures. Table 2-4
describes Iub bandwidth congestion scenarios.
Scenario Description
The physical Admission control is performed based on the sufficiency of
transmission transmission resources. If transmission resources are
resources are sufficient, the admission succeeds. Otherwise, the
sufficient, but the transmission fails. Admission control prevents excessive
admission fails. service admission and ensures the quality of admitted
services.
The physical The physical transmission bandwidth between the RNC
transmission and NodeB is insufficient.
bandwidth is
insufficient.
IP Transmission
On an IP transmission network, the RNC and NodeB can monitor the average uplink and
downlink loads on their interface boards. The Iub bandwidth usage is represented by the ratio
of the average uplink or downlink load to the configured Iub bandwidth. In addition, the RNC
and NodeB can dynamically adjust the bandwidth of a service based on the QoS requirements
of the service and user priority and improve the Iub bandwidth usage using the reserve
pressure algorithm on their interface boards.
1. Bandwidth-based admission success rate
The counters for monitoring the bandwidth-based admission success rate are as follows:
VS.ANI.IP.Conn.Estab.Succ: Number of Successful IP Connections for IP Transport
Adjacent Node
VS.ANI.IP.Conn.Estab.Att: Number of IP Connection Setup Requests Received by IP
Transport Adjacent Node
VS.ANI.IP.FailResAllocForBwLimit: Number of Failed Resource Allocations Due to
Insufficient Bandwidth on the IP Transport Adjacent Node
The IP connection setup success rate is calculated using the following formula:
IP connection setup success rate = VS.ANI.IP.Conn.Estab.Succ/VS.ANI.IP.Conn.Estab.Att
If the IP connection setup success rate is less than 99%, bandwidth congestion has probably
occurred on the user plane.
If the value of the VS.ANI.IP.FailResAllocForBwLimit counter is not zero, bandwidth
congestion has occurred on the user plane.
2. Physical bandwidth usage
(1) Control plane
The counters for monitoring the number of bytes transmitted or received on an SCTP link
(carrying NCP, CCP, and ALCAP) are as follows:
VS.SCTP.TX.BYTES: Number of IP bytes transmitted on an SCTP link
VS.SCTP.RX.BYTES: Number of IP bytes received on an SCTP link
The uplink and downlink traffic volumes of the SCTP links on the control plane over the Iub
interface are calculated using the following formulas:
SCTP UL KBPS = VS.SCTP.RX.BYTES x 8/<SP>/1000
SCTP DL KBPS = VS.SCTP.TX.BYTES x 8/<SP>/1000
If ALM-21542 SCTP Link Congestion is reported on the Iub interface, bandwidth congestion
has occurred on the control plane.
(2) User plane
In transmission resource pool networking, the number of IP bytes transmitted or received on
the user plane by each adjacent node is measured by the following counters:
VS.IPPOOL.ANI.IPLAYER.TXBYTES: Number of IP bytes transmitted on the user
plane of an adjacent node
VS.IPPOOL.ANI.IPLAYER.RXBYTES: Number of IP bytes received on the user plane
of an adjacent node
The Iub bandwidth usage of a transmission resource pool is calculated using the following
formulas:
UL Iub Usage on User Plane = VS.IPPOOL.ANI.IPLAYER.RXBYTES x
8/<SP>/1000/RX BW_CFG
DL Iub Usage on User Plane = VS.IPPOOL.ANI.IPLAYER.TXBYTES x
8/<SP>/1000/TX BW_CFG
If the uplink or downlink bandwidth usage of the user plane on the Iub interface reaches 70%,
bandwidth congestion has occurred on the user plane. Then, capacity expansion is required.
Scenario 4: Admission Bandwidth Congestion on the ATM and IP User Planes, Not
Accompanied by Physical Bandwidth Congestion
Step 2 Decrease the activity factor for PS services to enable the system to admit more UEs. Query
the activity factor used by the adjacent node by checking the configuration data of the
following command:
ADD ADJMAP: ANI=XXX, ITFT=IUB, TRANST=XXX, CNMNGMODE=SHARE,
FTI=OldIndex;
Step 3 Run the ADD TRMFACTOR command to add an activity factor table. The recommended
value for PSINTERDL, PSINTERUL, PSBKGDL, PSBKGUL, HDINTERDL, and
HDBKGDL is 10. The following is an example:
ADD TRMFACTOR:FTI=NewIndex, REMARK="IUB_USER", PSINTERDL=10,
PSINTERUL=10, PSBKGDL=10, PSBKGUL=10, HDINTERDL=10, HDBKGDL=10;
Step 4 Run the MOD ADJMAP command to use the new activity factor on the adjacent node. The
following is an example:
MOD ADJMAP: ANI=XXX, ITFT=IUB, FTI=NewIndex;
NOTE
The activity factor equals the ratio of actual bandwidth occupied by a UE to the bandwidth allocated to
the UE during its initial access. It is used to predict the bandwidth required by admission. Each NodeB
can be configured with its own activity factor. The default activity factor for voice services is 70%, and
the default activity factor for PS BE services is 40%.
----End
resource pool. In this case, this counter measures the average ratio of HSDPA users
activated in the resource pool to HSDPA users supported by the resource pool.
VS.BOARD.UsedHsupaUserRatio.Mean: measures the average ratio of HSUPA users
activated on the board scheduler to HSUPA users supported by a board. If the HSUPA
Scheduler Pool feature is activated, all boards that support this feature form a shared
resource pool. In this case, this counter measures the average ratio of HSUPA users
activated in the resource pool to HSUPA users supported by the resource pool.
3.1 Possible Block and Failure Points in the Basic Call Flow
When network resources are insufficient, accessibility-related KPIs are likely to be affected
first.
Figure 3-1 uses a mobile terminated call (MTC) as an example to illustrate the basic call
flow, with possible block/failure points indicated. For details about the basic call process, see
3GPP TS 25.931.
Figure 3-1 Basic call flow chart and possible block/failure points
Step 5 If the RNC is congested when receiving the RRC connection setup request, the RNC may
drop the request message. See block point #3.
Step 6 If the RNC receives the RRC connection setup request and does not discard it, the RNC
determines whether to accept or reject the request. The request may be rejected due to
insufficient resources, as shown in block point #4.
Step 7 If the RNC accepts the request, the RNC instructs the UE to set up an RRC connection. The
UE may not receive RNC's response message or may find that the configuration does not
support RRC connection setup. See failure points #5 and #6.
Step 8 After the RRC connection is set up, the UE sends NAS messages to negotiate with the CN
about service setup. If the CN determines to set up a service, the CN sends an RAB
assignment request to the RNC.
Step 9 The RNC accepts or rejects the RAB assignment request based on the resource usage of RAN.
See block point #7.
Step 10 If the RNC accepts the RAB assignment request, it initiates an RB setup process. During the
process, the RNC sets up radio links (RLs) to the NodeB over the Iub interface and sends an
RB setup message to the UE over the Uu interface. The RL or RB setup process may incur
failures. See failure points #8 and #9.
----End
In most cases, the troubleshooting procedure includes detecting abnormal KPIs, selecting top
N cells, and analyzing abnormal KPIs.
In the case of system bottlenecks, accessibility-related KPIs are usually checked first because
most of the access congestion issues are caused by insufficient system resources.
Figure 3-3 shows key points for bottleneck analysis. The following sections then describe
these key points.
As shown in Figure 3-4, if the CP CPU load exceeds 50%, analyze the cause of the problem
and prevent the CPU load from increasing further. In addition, perform an in-depth analysis of
the high CPU load, for example, check whether the parameter configurations are appropriate.
If the high CPU load is caused by an increase in traffic volume, it is recommended that you
expand hardware capacity.
CP/UP flexible deployment can adjust the hardware resources in the CP and UP pools so that
these pools have basically the same average load. Because the non-pooled load in the CP
cannot be shared, dynamic cell allocation is required to balance this load in the CP pool.
If the traffic model changes, the load can be balanced between CP and UP pools to a certain
degree but cannot be completely balanced. For example, if the CP CPU load is lower than the
UP CPU load but the remaining CP resources are insufficient to bear new NodeBs and cells, it
is recommended that you reduce the number of cells in the RNC or add new boards.
If the UP CPU load exceeds 60%, analyze the cause of the problem and prevent the CPU load
from increasing further. If the high CPU load is caused by an increase in traffic volume, it is
recommended that you expand hardware capacity.
CE resources can be shared within a resource group. Therefore, CE usage on the NodeB must
be calculated to determine whether CE congestion occurs in a resource group or the NodeB. If
CE congestion occurs in a resource group, reallocate CE resources among resource groups. If
CE congestion occurs in the NodeB, perform capacity expansion.
Generally, adding a carrier is the most effective means of relieving uplink power congestion.
If no additional carrier is available, add a NodeB or reduce the downtilt of the antenna.
4 Metrics Definitions
5 Reference Documents
This chapter lists the documents referenced within the text and provides the document name,
document package, and document package download path at http://support.huawei.com.