5.capacity Monitoring Guide
Issue Date
03 2012-11-07
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied. The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Issue 03 (2012-11-07)
Audience
This document is intended for network maintenance personnel.
Organization
This document consists of the following chapters.
1 Network Resource Monitoring Methods: Describes basic concepts associated with network resources, including definitions and monitoring activities.
2 Network Resource Counters: Describes various network resources.
3 HSPA Related Resources: Describes how to monitor network resources when HSPA is enabled.
4 Diagnosis of Problems Related to Network Resources: Provides fault analysis and locating methods that experienced WCDMA network maintenance personnel can use to handle network congestion or overload events efficiently.
Description Lists all performance counters mentioned in the other chapters. These counters help in monitoring network resources and designing resource analyzing instruments.
Change History
Changes between document issues are cumulative. Therefore, the latest document issue contains all changes made in previous issues.
03 (2012-11-07)
This is the third commercial release of RAN 14.0. Compared with issue 02 (2012-06-30), this issue incorporates the following changes:
Updated section 2.12 CNBAP Load of the NodeB Main Processing and Transmission Unit (WMPT/UMPT).
02 (2012-06-30)
This is the second commercial release of RAN 14.0. Compared with issue 01 (2012-04-30), this issue incorporates the following changes:
Updated the formula for calculating CE usage, replacing the NodeB counter with the RNC counter.
Added the MPU part.
Adjusted the SPU, DPU, and interface board thresholds.
Adjusted the document structure.
01 (2012-04-30)
This is the first commercial release of RAN 14.0. Compared with issue Draft A (2012-02-15), this issue optimizes the description.
Draft A (2012-02-15)
This is the draft for RAN14.0.
Contents
About This Document
1 Network Resource Monitoring Methods
1.1 Network Resource Introduction
1.2 Resource Monitoring Procedure
4.4 Resource Analysis
4.4.1 CE Resource Consumption Analysis
4.4.2 Code Resource Usage Analysis
4.4.3 Iub Resource Analysis
4.4.4 Power Resource Analysis
4.4.5 SPU CPU Usage Analysis
4.4.6 DPU DSP and Interface Board CPU Usage Analysis
4.4.7 PCH Usage Analysis
4.4.8 FACH Usage Analysis
For details on network resources, see chapter 2 "Network Resource Counters." For details on HSPA-associated resources, see chapter 3 "HSPA Related Resources."
Problem-driven analysis: When a network performance counter deteriorates (for example, calls are dropped), a thorough analysis is performed. This method is applicable to analysis upon network congestion. This method requires more analysis instruments and skills than the prediction-based monitoring method, but can use the current system and eliminates the need for an immediate network expansion. For details on this method, see chapter 4 "Diagnosis of Problems Related to Network Resources."
NOTE
In addition to the preceding two methods, other methods may also be used by network maintenance engineers for system problem analysis.
SPU: indicates the signaling processing unit on an RNC. An RNC supports various types of SPUs. SPUs process air interface signaling and manage transport resources. They are the most likely network resource bottleneck.
MPU: indicates the main control processing unit on an RNC. It manages control-plane resources, user-plane resources, and transport resources. If provided on an SPUb board, the MPU subsystem may be overloaded.
DPU: indicates the user-plane processing unit on an RNC. It distributes user-plane service data. With the rapid development of mobile broadband (MBB), more and more DPU resources are consumed, and the preset DPU capacity is likely to fall short of demand.
Received total wideband power (RTWP): indicates the total wideband power received by a base station within a bandwidth (namely, the uplink load generated by the receiver noise, external radio interference, and uplink traffic). This is a counter for measuring uplink load, similar to the received signal strength indicator (RSSI) in the CDMA system. RSSI is a downlink load measurement, indicating the total channel power received by a UE.
Transmitting carrier power (TCP): indicates the full-carrier power transmitted by a cell and is a counter for monitoring downlink load. This counter value is limited by the maximum transmission capability of the power amplifier at a NodeB.
Channel element (CE): indicates the baseband processing resource. CEs are managed and shared at the NodeB level. For a new network, this counter has a small start value to lower capital expenditure (CAPEX). Generally, CEs are the most likely resource bottleneck that results in network congestion.
Orthogonal variable spreading factor (OVSF): indicates the downlink OVSF code resource. For a cell, only one OVSF code tree is available in the downlink direction.
Iub interface resource: On an IP transport network, uplink and downlink Iub interface bandwidth can be dynamically adjusted for both NodeBs and RNCs. Generally, transport resource bottlenecks do not result from insufficient capacities of interface boards but from low bandwidth available on the IP transport network.
Paging channel (PCH): The PCH usage is directly related to the LAC area plan and PCH state transition. PCH overload will cause a decrease in the paging success ratio.
Random access channel (RACH) and forward access channel (FACH): The RACH and FACH carry signaling and some user-plane data. RACH/FACH overload will cause a decrease in the access success ratio and affect user experience.
Main processing and transmission unit (WMPT/UMPT): The main processing and transmission unit performs site transmission, signaling, and system management. CPU overload of the WMPT/UMPT will cause a decrease in system processing capabilities, therefore affecting NodeB-related KPIs.
For a newly constructed network, you can monitor only one resource. Once detecting that this resource exceeds its upper threshold, check whether other resources exceed their upper thresholds.
If yes, the cell or NodeB is overloaded. Perform network expansion.
If no, the cell or NodeB is not necessarily overloaded. In this case, network expansion is not mandatory and the problem can be solved by other adjustments or optimizations.
For example, the CE usage is more than 70% but the usages of other resources such as RTWP, TCP, and OVSF codes are within their allowed ranges. In this case, CE resources are insufficient but the cell is not overloaded. To solve the problem in this example, configure licenses allowing more CEs or add baseband processing boards, instead of performing network expansion immediately.
Figure 1-2 Resource monitoring flowchart
As shown in Figure 1-2, an SPU is overloaded if its CPU usage is 50% to 60%, regardless of other resource usages. This flowchart is applicable to most resource monitoring scenarios, except when the system overload is caused by an unexpected event rather than a service increase. Unexpected events are not considered in this flowchart. Causes for unexpected events can be located based on their association with various resource bottlenecks. For details on how to locate a resource-related problem, see chapter 4 "Diagnosis of Problems Related to Network Resources."
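The decision described above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys and verdict strings are hypothetical names, and the threshold values are the ones quoted elsewhere in this guide (70% CE usage, 75% uplink load, 85% TCP, and 70% OVSF usage as an example threshold).

```python
# Illustrative thresholds in percent; the key names are hypothetical.
THRESHOLDS = {"ce_usage": 70.0, "ul_load": 75.0, "tcp_ratio": 85.0, "ovsf_usage": 70.0}

def assess(primary, usages, thresholds=THRESHOLDS):
    """Return (verdict, resources over threshold) for one cell or NodeB.

    primary is the single resource that is routinely monitored.
    """
    over = [name for name, value in usages.items() if value > thresholds[name]]
    if primary not in over:
        return "normal", over
    if len(over) > 1:
        # Other resources are also over threshold: overloaded, expand.
        return "overloaded", over
    # Only the monitored resource is short: adjust or optimize instead.
    return "bottleneck", over

verdict, over = assess(
    "ce_usage",
    {"ce_usage": 72.0, "ul_load": 50.0, "tcp_ratio": 60.0, "ovsf_usage": 40.0},
)
print(verdict, over)
```

With the sample values, only CE usage is over its threshold, which corresponds to the example in the text where more CE licenses or baseband boards are preferred over a full expansion.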
Huawei Proprietary and Confidential Copyright Huawei Technologies Co., Ltd.
2 Network Resource Counters
Various counters are defined to represent the resource usage or load of a UTRAN system. In addition, upper thresholds for these counters are predefined. Identifying the busy hour is a key to accurate counter analysis. There are various methods of identifying the busy hour. The simplest one is to take the hour when the most resources are consumed as the busy hour.
Table 2-1 RNC resources and thresholds
Resource Type                     Counter                        Monitoring Threshold
SPU CPU                           VS.XPU.CPULOAD.MEAN            50%
MPU CPU                           VS.XPU.CPULOAD.MEAN            50%
DPU DSP Load                      VS.DSP.UsageAvg                60%
Interface Board CPU Load          VS.INT.CPULOAD.MEAN            50%
Interface Board Forwarding Load   VS.INT.TRANSLOAD.RATIO.MEAN    70%
The mean SPU resource usage (SPU CPU load) is indicated by the counter VS.XPU.CPULOAD.MEAN, expressed as a percentage. The following is recommended:
If the SPU CPU usage is over 50% in the busy hour for three consecutive days in one week, add SPUs as required.
If the SPU CPU usage is over 60% in the busy hour for three consecutive days in one week, take emergency expansion measures.
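The "three consecutive days in one week" rule recurs throughout this guide, so a small helper may be useful. The following is a minimal sketch, assuming one busy-hour VS.XPU.CPULOAD.MEAN sample per day; the function names and return strings are illustrative.

```python
def longest_run_above(daily_values, threshold):
    """Length of the longest run of consecutive days above the threshold."""
    longest = current = 0
    for value in daily_values:
        current = current + 1 if value > threshold else 0
        longest = max(longest, current)
    return longest

def spu_action(week_busy_hour_load):
    """Map one week of busy-hour SPU CPU loads (percent) to a recommendation."""
    if longest_run_above(week_busy_hour_load, 60.0) >= 3:
        return "emergency expansion"
    if longest_run_above(week_busy_hour_load, 50.0) >= 3:
        return "add SPUs"
    return "no action"

print(spu_action([45, 52, 55, 51, 48, 40, 42]))
```

In the sample week, days 2 to 4 exceed 50% but never 60%, so the sketch recommends adding SPUs rather than an emergency expansion.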
VS.INT.CPULOAD.MEAN: mean CPU usage of an interface board, expressed as a percentage.
VS.INT.TRANSLOAD.RATIO.MEAN: mean forwarding load of an interface board, expressed as a percentage.
Session load = VS.INT.CFG.INTERWORKING.NUM/(Number of session setup or release times x 60 x SP)
where
VS.INT.CFG.INTERWORKING.NUM: indicates the number of call setup attempts on an interface board.
SP: indicates the measurement period, expressed in minutes.
Number of session setup or release times (per second): 500 for a single-core interface board (1000 for the GOUa and FG2a) and 5000 for a multi-core interface board.
It is recommended that you expand the interface board capacity if the mean CPU usage or the session load is higher than 50% or the forwarding load is higher than 70% for three consecutive days in one week.
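The session load can be sketched as follows. The sketch assumes that the whole product (sessions per second x 60 x SP) is the denominator, that is, the board's session capacity over the measurement period; the dictionary keys are illustrative labels, not board type identifiers used by any tool.

```python
# Session setup/release capacity per second, as listed above.
SESSIONS_PER_SECOND = {
    "single_core": 500,
    "GOUa": 1000,
    "FG2a": 1000,
    "multi_core": 5000,
}

def session_load(interworking_num, board_type, sp_minutes):
    """Ratio of counted sessions to the board capacity over the period."""
    capacity = SESSIONS_PER_SECOND[board_type] * 60 * sp_minutes
    return interworking_num / capacity

# Hypothetical sample: 900,000 sessions in a 60-minute period.
load = session_load(900_000, "single_core", 60)
print(f"session load: {load:.0%}")
```

A result above 0.5 on three consecutive days in one week would meet the expansion criterion stated above.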
Figure 2-3 Relationship between RTWP, noise increase, and uplink load
Generally, the uplink load threshold is 75% and the corresponding RTWP is smaller than -100 dBm. The corresponding equivalent number of users (ENU) ratio should be smaller than 75% if the power-based admission decision is based on algorithm 2 (the algorithm for the ENU). If the RTWP value is larger than -100 dBm, the cell is overloaded in the uplink direction. Generally, if a cell is overloaded or the RTWP value is too large, the cell coverage decreases, live service quality declines, or new service requests are rejected. Huawei RNCs support the following RTWP and ENU counters:
VS.MeanRTWP: mean RTWP in a cell (unit: dBm)
VS.MinRTWP: minimum RTWP in a cell (unit: dBm)
VS.RAC.UL.EqvUserNum: uplink mean ENU on all dedicated channels in a cell
UlTotalEqUserNum: maximum ENU that is configured by the ADD UCELLCAC command
UL ENU Ratio = VS.RAC.UL.EqvUserNum/UlTotalEqUserNum
In some areas, the background noise increases to more than -106 dBm due to other interference or hardware faults (for example, poor quality of antennas or feeder connectors). In this case, the VS.MinRTWP counter value (RTWP when the cell carries no traffic) is considered the background noise. If the VS.MinRTWP value is larger than -100 dBm or smaller than -110 dBm in the idle hour for three consecutive days in one week, there are hardware faults or external interference. Locate and rectify the faults.
Normally, VS.MeanRTWP is used as the cell capacity indicator. If the VS.MeanRTWP value is higher than -100 dBm (corresponding to a 6 dB noise increase or 75% load) or the uplink ENU ratio is higher than 75% in the busy hour for two or three days in one week, the cell is regarded as heavily loaded. When the cell is heavily loaded, perform capacity expansion operations such as adding a carrier or increasing the UlTotalEqUserNum values.
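The correspondence between a 6 dB noise increase and 75% load follows from the standard WCDMA relationship noise rise = -10 x log10(1 - load). The sketch below applies that relationship; the default noise floor of -106 dBm is an assumption taken from the background noise level mentioned above, and the function name is illustrative.

```python
def uplink_load_from_rtwp(rtwp_dbm, noise_floor_dbm=-106.0):
    """Estimate uplink load (0..1) from RTWP using load = 1 - 10^(-NR/10)."""
    noise_rise_db = rtwp_dbm - noise_floor_dbm
    if noise_rise_db <= 0:
        return 0.0  # at or below the assumed background noise
    return 1.0 - 10 ** (-noise_rise_db / 10.0)

for rtwp in (-106.0, -103.0, -100.0):
    print(rtwp, round(uplink_load_from_rtwp(rtwp), 3))
```

If the measured VS.MinRTWP of a cell differs from -106 dBm, passing it as noise_floor_dbm gives a more representative estimate.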
The cell coverage decreases.
The data throughput decreases.
The service quality declines.
New call requests are rejected.
The amount of consumed downlink power in a cell is not only related to cell traffic (or load), but also related to the user's location and the cell coverage. The larger the cell coverage and the farther the user is located from the cell, the more power is consumed. The heavier the traffic in a cell, the more power is consumed. In a WCDMA system, TCP is defined to measure the downlink total transmit power. For Huawei RNCs, four TCP-associated counters are defined:
VS.MeanTCP: mean carrier transmit power in a cell
VS.MaxTCP: maximum carrier transmit power in a cell
VS.MinTCP: minimum carrier transmit power in a cell
VS.MeanTCP.NonHS: mean downlink carrier transmit power for non-HSDPA in a cell
VS.MeanTCP is used as the downlink load indicator. If VS.MeanTCP is constantly higher than 85% of VS.MaxTCP, the cell is overloaded in the downlink direction. Some live UTRAN networks use hierarchical cell structures with multiple frequency layers. The downlink power settings and the corresponding downlink TCP thresholds vary by carrier. For example:
If the maximum TCP value is 20 W (43 dBm), the downlink TCP threshold is 17 W (42.3 dBm).
If the maximum TCP value is 40 W (46 dBm), the downlink TCP threshold is 34 W (45.3 dBm).
If VS.MeanTCP or VS.MaxTCP exceeds 85% of its threshold in the busy hour for three consecutive days in one week, the cell is regarded as heavily loaded in the downlink direction. Perform capacity expansion operations such as adding a carrier.
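The example thresholds above are 85% of the maximum transmit power, converted from watts to dBm. A minimal sketch of that arithmetic follows; the function names are illustrative.

```python
import math

def watts_to_dbm(power_w):
    """Convert power in watts to dBm: 10 * log10(power in milliwatts)."""
    return 10.0 * math.log10(power_w * 1000.0)

def tcp_threshold_w(max_tcp_w, ratio=0.85):
    """Downlink TCP threshold as a fraction of the maximum transmit power."""
    return max_tcp_w * ratio

for max_w in (20.0, 40.0):
    threshold = tcp_threshold_w(max_w)
    print(max_w, threshold, round(watts_to_dbm(threshold), 1))
```

For 20 W and 40 W carriers this reproduces the 17 W (42.3 dBm) and 34 W (45.3 dBm) thresholds given in the examples.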
2.7 CE Usage
CE resources are baseband resources in a NodeB. One CE is the resources consumed by a 12.2 kbit/s voice call. If a new call arrives but there are not enough CEs (not enough baseband processing resources), the call will be blocked. CE resources are managed and shared at the NodeB level (note that 850 MHz and 1900 MHz cells cannot share CEs with each other, because the cells belong to different license groups). The total available CE resources are limited by both the installed hardware and the configured software licenses. If the hardware resources in the current installation are sufficient and the CEs are only limited by licenses, then the corrective action is to modify the license file to expand the cell capacity.
The usage metric can also be used to monitor CE resources. Once the CE usage is consistently higher than the 70% threshold, the NodeB is overloaded with respect to CE usage and CE expansion is required. Since separate baseband processing units are used in the uplink and downlink, CE management is also separate for the uplink and downlink. CE usage for the uplink and downlink is defined as follows:
NodeB_UL_CE_MEAN_RATIO = UL Mean CE Used Number/UL NodeB CE Cfg Number
where
If VS.NodeB.ULCreditUsed.Mean > 0 (indicating that the CE Overbooking feature is available), UL Mean CE Used Number = VS.NodeB.ULCreditUsed.Mean/2.
Otherwise, UL Mean CE Used Number = Sum_AllCells_of_NodeB(VS.LC.ULCreditUsed.Mean/2).
VS.LC.ULCreditUsed.Mean counts the usage of UL credits for a cell. The value is divided by 2 because the uplink credit number is twice the number of uplink CEs, whereas the downlink credit number is equal to the number of downlink CEs.
UL NodeB CE Cfg Number = MIN(NodeB License UL CE Number, NodeB Physical UL CE Capacity)
NodeB_DL_CE_MEAN_RATIO = DL Mean CE Used Number/DL CE Cfg Number
where
DL Mean CE Used Number = Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean). VS.LC.DLCreditUsed.Mean counts the usage of DL credits for a cell.
DL CE Cfg Number = MIN(NodeB License DL CE Number, NodeB Physical DL CE Capacity)
The counters are RNC counters. The License CE Number is distributed by the M2000. The NodeB Physical CE Capacity is calculated from the NodeB board configuration and board specifications (queried by MML commands).
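The CE usage formulas can be sketched as follows. The counter values would come from RNC measurement results; the function and parameter names are illustrative, and the sample numbers are made up.

```python
def ul_mean_ce_used(nodeb_ul_credit_mean, cell_ul_credit_means):
    """UL CEs in use; uplink credits are twice the number of uplink CEs."""
    if nodeb_ul_credit_mean > 0:
        # NodeB-level counter is reported (CE Overbooking available).
        return nodeb_ul_credit_mean / 2
    # Otherwise sum the per-cell counters (VS.LC.ULCreditUsed.Mean).
    return sum(credit / 2 for credit in cell_ul_credit_means)

def dl_mean_ce_used(cell_dl_credit_means):
    """DL CEs in use; downlink credits equal downlink CEs."""
    return sum(cell_dl_credit_means)

def ce_mean_ratio(ce_used, license_ce, physical_ce):
    """CE usage against the smaller of licensed and physical capacity."""
    return ce_used / min(license_ce, physical_ce)

ul_used = ul_mean_ce_used(0, [100, 60, 40])  # hypothetical per-cell credits
print(ce_mean_ratio(ul_used, license_ce=128, physical_ce=192))
```

In the sample, the license (128 CEs) rather than the hardware (192 CEs) is the limit, so a ratio above 0.70 would point to a license change rather than new boards.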
A maximum spreading factor (SF) of 256 is supported. For a cell, only one OVSF code tree is available. Sibling codes are orthogonal to each other but not to their parent or child codes. As a result, once a code is allocated to a user, neither its parent nor child code can be allocated to any other user. The total OVSF resources are limited. If available OVSF codes are insufficient to implement the desired QoS, a new call request may be rejected. An OVSF code tree can be divided into 4 codes (SF = 4), 8 codes (SF = 8), 16 codes (SF = 16), and so on up to 256 codes (SF = 256). This means that code resources with various SFs can be expressed as N equivalent SF = 256 codes. For example, one SF = 8 code is equivalent to thirty-two SF = 256 codes. Based on this equivalence mapping, the OVSF code usage for a user or a cell can be calculated. A Huawei RNC monitors the average code usage of an OVSF code tree based on the number of occupied equivalent SF = 256 codes. The average code usage of an OVSF code tree is indicated by the VS.RAB.SFOccupy counter. OVSF code usages are defined as follows:
where
DCH_OVSF_CODE = (<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64
+ (<VS.SingleRAB.SF8> + <VS.MultRAB.SF8>) x 32
+ (<VS.SingleRAB.SF16> + <VS.MultRAB.SF16>) x 16
+ (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8
+ (<VS.SingleRAB.SF64> + <VS.MultRAB.SF64>) x 4
+ (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2
+ (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>)
A threshold (such as 70%) can be defined for DCH_OVSF_Utilization to judge whether a cell runs out of OVSF codes. If OVSF code resources are insufficient in the busy hour for three consecutive days in one week, perform capacity expansion operations such as adding a carrier or splitting the cell.
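The weights in DCH_OVSF_CODE are simply 256/SF, the number of SF = 256 codes that one code of a given SF blocks. The following sketch computes the equivalent code count from per-SF RAB counts; the input layout is an assumption for illustration, and the sample values are made up.

```python
def dch_ovsf_code(rab_counts):
    """Equivalent SF = 256 codes occupied by DCH RABs.

    rab_counts maps a spreading factor to a tuple of
    (VS.SingleRAB.SF<n>, VS.MultRAB.SF<n>) counter values.
    """
    return sum(
        (256 // sf) * (single + multi)
        for sf, (single, multi) in rab_counts.items()
    )

sample = {8: (1, 0), 128: (10, 2), 256: (3, 0)}
print(dch_ovsf_code(sample))
```

One SF = 8 code contributes 32 equivalent codes, twelve SF = 128 codes contribute 24, and three SF = 256 codes contribute 3.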
PCH usage = VS.UTRAN.AttPaging1/(<SP> *60 *5/0.01)
Usage of an FACH carried on a non-standalone SCCPCH = VS.CRNCIubBytesFACH.Tx *8/[(60 *<SP> *168 *1/0.01) *VS.PCH.Bandwidth.UsageRate *6/7 + (60 *<SP> *360 *1/0.01) *(1 - VS.PCH.Bandwidth.UsageRate *6/7)]
where
VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> *<SP> *60.0)
Usage of an FACH carried on a stand-alone SCCPCH:
FACH Utility Ratio = ((VS.SRBNum.FACH - VS.OneSRBTTINum.FACH)/2 + VS.OneSRBTTINum.FACH + VS.IndepTRBNum.FACH)/(3600/0.01)
RACH usage
There is one RACH in a cell. When both signaling and traffic are present, the RACH utility ratio can be calculated as follows:
RACH Utility Ratio = ((VS.CRNCIubBytesRACH.Rx - VS.TRBNum.RACH *360/8) *8/168)/(<SP> *60 *4/0.02) + VS.TRBNum.RACH/(<SP> *60 *4/0.02)
The basic principles for evaluating PCHs are as follows:
If paging messages are not retransmitted, 5% of them will be lost when the PCH usage reaches 60%. It is recommended that you troubleshoot this message loss or replan the LAC.
If paging messages are retransmitted once or twice, 1% of them will be lost when the PCH usage reaches 70%. It is recommended that you troubleshoot this message loss or replan the LAC.
The basic principle for evaluating FACHs is as follows: If the FACH usage reaches 70%, it is recommended that you optimize specific policies or parameters, or add FACHs as required.
The basic principle for evaluating RACHs is as follows: If the RACH usage reaches 70%, adding carriers is recommended.
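As an illustration, the PCH usage formula and its evaluation thresholds can be sketched as follows. The function names are illustrative, and the sample counter value is made up.

```python
def pch_usage(att_paging1, sp_minutes):
    """PCH usage = VS.UTRAN.AttPaging1 / (SP * 60 * 5 / 0.01)."""
    capacity = sp_minutes * 60 * 5 / 0.01
    return att_paging1 / capacity

def pch_needs_attention(usage, paging_retransmitted):
    """Apply the 60% (no retransmission) or 70% (retransmission) threshold."""
    threshold = 0.70 if paging_retransmitted else 0.60
    return usage >= threshold

usage = pch_usage(1_080_000, sp_minutes=60)  # hypothetical counter value
print(round(usage, 2), pch_needs_attention(usage, paging_retransmitted=True))
```

The same pattern (counter value divided by the capacity of the measurement period, then compared with 70%) applies to the FACH and RACH formulas.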
2.12 CNBAP Load of the NodeB Main Processing and Transmission Unit (WMPT/UMPT)
The NodeB main processing and transmission unit (WMPT/UMPT) processes signaling messages and manages resources for other boards. If the main processing and transmission unit is overloaded, a radio link will fail to be set up or a radio link setup request will not receive a response. This significantly decreases KPIs, such as the RRC connection setup success rate and RAB setup success rate. To address this issue, Huawei uses the control NodeB application part (CNBAP) load ratio to assess the processing capacity of the main processing and transmission unit.
If soft handover factor < 0.4
where
VS.RadioLink.Recv.Mean: indicates the average number of wireless connection receptions per second. It is a NodeB counter.
VS.DedicMeaRpt.MEAN: indicates the average number of dedicated measurement reports per second. It is also a NodeB counter.
SP: indicates the measurement period. It is expressed in minutes.
CNBAP Capacity of NodeB: depends on the configurations of main processing and transmission units, baseband processing boards, and extended transmission boards.
Note: Generally, VS.DedicMeaRpt.MEAN can be ignored. In the second formula, VS.DedicMeaRpt.MEAN/12 is used for equivalent conversion.
The soft handover factor is a cell-level counter.
Soft handover factor = ((<VS.SHO.AS.1RL> + <VS.SHO.AS.2RL> + <VS.SHO.AS.3RL> + <VS.SHO.AS.4RL> + <VS.SHO.AS.5RL> + <VS.SHO.AS.6RL>)/(<VS.SHO.AS.1RL> + <VS.SHO.AS.2RL>/2 + <VS.SHO.AS.3RL>/3 + <VS.SHO.AS.4RL>/4 + <VS.SHO.AS.5RL>/5 + <VS.SHO.AS.6RL>/6)) - 1
where the following are RNC counters:
VS.SHO.AS.1RL: mean number of UEs with one radio link in a cell
VS.SHO.AS.2RL: mean number of UEs with two radio links in a cell
VS.SHO.AS.3RL: mean number of UEs with three radio links in a cell
VS.SHO.AS.4RL: mean number of UEs with four radio links in a cell
VS.SHO.AS.5RL: mean number of UEs with five radio links in a cell
VS.SHO.AS.6RL: mean number of UEs with six radio links in a cell
The NodeB soft handover factor equals the average soft handover factor of all cells under the NodeB. If the CNBAP Load Ratio is higher than 60% in the busy hour for three consecutive days in one week, the main processing and transmission unit is becoming overloaded. If this happens, add a baseband processing board or an extended transmission board, or split the NodeB.
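The soft handover factor can be sketched as follows, with the final term of the formula taken as a subtraction of 1 (radio links per UE minus one). The list layout and function names are assumptions for illustration, and the sample values are made up.

```python
def soft_handover_factor(as_counts):
    """Cell-level soft handover factor.

    as_counts[i] is the VS.SHO.AS.<i+1>RL value, for one to six radio links.
    """
    links = sum(as_counts)
    ues = sum(count / (i + 1) for i, count in enumerate(as_counts))
    if ues == 0:
        return 0.0  # no UEs counted in the period
    return links / ues - 1.0

def nodeb_soft_handover_factor(cells):
    """NodeB factor: average of the factors of all cells under the NodeB."""
    factors = [soft_handover_factor(counts) for counts in cells]
    return sum(factors) / len(factors)

print(round(soft_handover_factor([10, 4, 0, 0, 0, 0]), 3))
```

The result decides which of the two CNBAP load ratio formulas applies (soft handover factor below 0.4 or not).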
3 HSPA Related Resources
High Speed Packet Access (HSPA) includes High Speed Downlink Packet Access (HSDPA) and High Speed Uplink Packet Access (HSUPA). HSDPA and HSUPA functionalities are part of the WCDMA standard. HSPA uses technologies such as fast scheduling, adaptive modulation, and hybrid automatic repeat request (HARQ) to transport data at high speed. HSPA carries PS data. As conversational services are prioritized over PS data, HSPA uses system resources only after conversational services are served. This chapter looks into how to efficiently use the system resources by means of HSPA without changing the existing pattern for resource allocation.
3.1 HSDPA
3.1.1 Power Resources
Figure 3-1 illustrates how the downlink transmit power of a cell is allocated. The dashed line indicates the total downlink transmit power of a cell.
Figure 3-1 Dynamic power resource allocation
Power for CCH: This portion of power is allocated to common transport channels (CCHs) of the cell such as the broadcast channel, pilot channel, and paging channel.
Power margin: This portion of power is not allocated. The power margin is reserved to ensure that the system can remain stable even if the UE position or environment changes.
Power for DPCH: This portion of power is allocated to real-time services (voice and video calls) and PS R99 services, and varies with the number and locations of users. RNCs and UEs can adjust power for DPCH based on the power control algorithm.
Power for HSPA: This portion of power is allocated to HSDPA and is calculated as follows:
HSDPA user power = Maximum cell transmit power - (Power for CCH + Power margin + Power for DPCH)
HSPA power schedulers are designed primarily to make the most of available power. In an HSDPA-enabled cell, TCP is still monitored to see if the system is overloaded in the downlink. TCP thresholds for this cell are the same as those for a cell without HSDPA. With HSDPA, downlink power overload affects HSDPA performance before it affects conversational services.
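The power split can be illustrated with a short sketch. The wattages are made-up sample values, the function name is illustrative, and clamping the result at zero is an added assumption to keep the output meaningful when the other portions already use the full carrier power.

```python
def hsdpa_available_power_w(max_cell_power_w, cch_w, margin_w, dpch_w):
    """Power left for HSDPA after CCH, power margin, and DPCH are served."""
    remaining = max_cell_power_w - (cch_w + margin_w + dpch_w)
    return max(remaining, 0.0)  # assumption: never report negative power

# Hypothetical 20 W cell: 4 W for CCH, 2 W margin, 6 W for DPCH.
print(hsdpa_available_power_w(20.0, cch_w=4.0, margin_w=2.0, dpch_w=6.0))
```

As DPCH power grows with the number and location of R99 users, the HSDPA share shrinks first, which is why downlink overload shows up in HSDPA performance before it affects conversational services.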
3.2 HSUPA
3.2.1 CE Resources
HSUPA channels are dedicated channels, and the resource consumption of HSUPA services is measured in CEs. UL CEs are shared between R99 services and HSUPA services. HSUPA improves user experience and uplink throughput, but also consumes more uplink CEs as overhead for hybrid automatic repeat requests (HARQ) and soft handovers. This means that uplink CE resources may become a system bottleneck. Therefore, uplink CE usage needs to be monitored when HSUPA is enabled. Huawei NodeBs support dynamic HSUPA CE management.
3.2.2 RTWP
Similar to HSDPA, which is designed to make the most of the downlink power, HSUPA is designed to make the most of the uplink capacity margin. HSUPA is always authorized to send data until the RTWP has risen by 6 dB. Provisioning HSUPA increases uplink data throughput but also consumes a large share of the uplink RTWP budget, which is monitored in the same way regardless of whether HSUPA is provisioned. If RTWP overload occurs, rates of HSUPA services must be lowered to ensure QoS of conversational services.
The preceding chapters describe the basic methods of monitoring network resources. These methods can be used to resolve most problems caused by high resource usage. In certain scenarios, further analysis is required to determine whether high resource usage is caused by a traffic increase or other exceptions. This chapter describes how to diagnose problems related to network resources. This chapter is intended for experts who have a deep understanding of WCDMA networks.
Figure 4-1 Call flowchart where possible block and failure points are marked
The call flow, which uses a mobile-terminated call as an example, is described as follows:
Step 1 The CN sends a paging message to the RNC.
Step 2 Upon receipt of the paging message, the RNC broadcasts the message on a PCH. If the PCH is congested, the RNC may drop the message. See block point #1.
Step 3 The UE may fail to receive the paging message or fail to connect to the network. See failure point #2.
Step 4 After receiving the paging message, the UE sends an RRC connection request to the RNC.
Step 5 If the RNC is congested when receiving the RRC connection request, the RNC may drop the request. See block point #3.
Step 6 If the RNC receives the RRC connection request and does not drop it, the RNC determines whether to accept or reject the request. The request may be rejected due to insufficient resources. See block point #4.
Step 7 If the RNC accepts the request, the RNC instructs the UE to set up an RRC connection. The RRC connection setup may fail if the UE does not receive the instruction, or if the UE receives the message but finds the configuration information to be incorrect. See failure points #5 and #6.
Step 8 After the RRC connection is set up, the UE sends NAS messages to negotiate with the CN about service setup. If the CN determines to set up a service, the CN sends an RAB assignment request to the RNC.
Step 9 The RNC accepts or rejects the RAB assignment request based on the resource usage on the RAN side. See block point #7.
Step 10 If the RNC accepts the RAB assignment request, the RNC initiates an RB setup process. During the process, the RNC sets up transmission resources over the Iub interface by setting up a radio link (RL) to the NodeB, and sets up channel resources over the Uu interface by sending an RB setup message to the UE. A failure may occur in the RL or RB setup process. See failure points #8 and #9.
Paging loss (RNC)
Counters indicating RNC-level paging loss caused by Iu-interface flow control, CPU overload, or RNC-level PCH congestion: VS.RANAP.CsPaging.Loss and VS.RANAP.PsPaging.Loss
Iu-interface paging loss ratio (RNC) = [(VS.RANAP.CsPaging.Loss + VS.RANAP.PsPaging.Loss)/(VS.RANAP.CsPaging.Att + VS.RANAP.PsPaging.Att)] x 100%
Paging loss (Cell)
Counter indicating that paging requests are discarded due to cell-level PCH congestion: VS.RRC.Paging1.Loss.PCHCong.Cell
Iu-interface paging loss ratio (cell) = (VS.RRC.Paging1.Loss.PCHCong.Cell/VS.UTRAN.AttPaging1) x 100%
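The two paging loss ratios can be computed as in the following sketch. The parameter names mirror the counters but are otherwise illustrative, and the sample values are made up.

```python
def rnc_paging_loss_ratio(cs_loss, ps_loss, cs_att, ps_att):
    """RNC-level Iu-interface paging loss ratio in percent."""
    attempts = cs_att + ps_att
    return 100.0 * (cs_loss + ps_loss) / attempts if attempts else 0.0

def cell_paging_loss_ratio(pch_cong_loss, att_paging1):
    """Cell-level paging loss ratio due to PCH congestion, in percent."""
    return 100.0 * pch_cong_loss / att_paging1 if att_paging1 else 0.0

print(rnc_paging_loss_ratio(cs_loss=30, ps_loss=20, cs_att=6000, ps_att=4000))
print(cell_paging_loss_ratio(pch_cong_loss=12, att_paging1=2400))
```

Guarding against zero attempts avoids a division error for idle measurement periods.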
Insufficient uplink power resources: VS.RRC.Rej.ULPower.Cong
Insufficient downlink power resources: VS.RRC.Rej.DLPower.Cong
Insufficient uplink CE resources: VS.RRC.Rej.UL.CE.Cong
Insufficient downlink CE resources: VS.RRC.Rej.DL.CE.Cong
Insufficient uplink Iub bandwidth resources: VS.RRC.Rej.ULIUBBand.Cong
Insufficient downlink Iub bandwidth resources: VS.RRC.Rej.DLIUBBand.Cong
Insufficient downlink code resources: VS.RRC.Rej.Code.Cong
Number of RRC requests: VS.RRC.AttConnEstab.Sum
The following is the formula for calculating the RRC congestion ratio:
VS.RRC.Block.Rate = Total RRC Rej/VS.RRC.AttConnEstab.Sum x 100%
where
Total RRC Rej = <VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.UL.CE.Cong> + <VS.RRC.Rej.DL.CE.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.Code.Cong>
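A block rate on its own does not show which resource is short, so it can help to report the dominant rejection cause alongside it. The following is a minimal sketch; the function name and the sample counter values are illustrative.

```python
def rrc_block_rate(rejections, attempts):
    """Return (block rate in percent, counter with the most rejections).

    rejections maps each VS.RRC.Rej.*.Cong counter name to its value;
    attempts is VS.RRC.AttConnEstab.Sum.
    """
    total = sum(rejections.values())
    rate = 100.0 * total / attempts if attempts else 0.0
    top_cause = max(rejections, key=rejections.get) if total else None
    return rate, top_cause

sample = {
    "VS.RRC.Rej.ULPower.Cong": 5,
    "VS.RRC.Rej.UL.CE.Cong": 40,
    "VS.RRC.Rej.Code.Cong": 5,
}
print(rrc_block_rate(sample, attempts=10_000))
```

In the sample, uplink CE congestion accounts for most rejections, which would direct the analysis to CE resources first.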
VS.RAB.FailEstabCS.ULPower.Cong
VS.RAB.FailEstabCS.DLPower.Cong
VS.RAB.FailEstabPS.ULPower.Cong
VS.RAB.FailEstabPS.DLPower.Cong
VS.RAB.FailEstabCS.ULCE.Cong
VS.RAB.FailEstabPS.ULCE.Cong
VS.RAB.FailEstabCs.DLCE.Cong
VS.RAB.FailEstabPs.DLCE.Cong
VS.RAB.FailEstabCs.Code.Cong
VS.RAB.FailEstabPs.Code.Cong
VS.RAB.FailEstabCS.DLIUBBand.Cong
VS.RAB.FailEstabCS.ULIUBBand.Cong
VS.RAB.FailEstabPS.DLIUBBand.Cong
VS.RAB.FailEstabPS.ULIUBBand.Cong
The following is the formula for calculating the call congestion ratio: VS.RAB.Block.Rate = Total number of congestions due to the preceding causes/VS.RAB.AttEstab.Cell
Table 4-1 provides solutions to signaling storms. These solutions attempt to reduce signaling loads so that the network capacity does not need to be expanded immediately.
Table 4-1 Signaling storm causes and solutions
UE Behavior: No signaling connection release indication (SCRI)
UE Type: Nokia, Samsung, or Motorola feature phones
Solution: Enable the Cell_PCH function to decrease signaling services for these terminals.
UE Behavior: SCRI without values indicating causes
UE Type: iPhone (R6)
Solution: Enable the enhanced fast dormancy (EFD) function for RNCs and add international mobile equipment identities (IMEIs) of terminals to the whitelist.
UE Behavior: R8 terminals with SCRI carrying values indicating causes
Solution: Enable the R8 FD function for RNCs and add terminal IMEIs to the whitelist.
Generally, an abnormal KPI initiates a troubleshooting process. Determining the top N cells that may have problems facilitates follow-up troubleshooting. It is recommended to analyze accessibility KPIs to identify the system bottleneck that causes access congestion.
NOTE
CE usage in Table 4-2 assumes that the signaling radio bearer (SRB) over HSUPA feature is enabled. If the SRB is carried on an R99 DCH independently, an extra CE is consumed by the SRB. Therefore, add one CE to the number listed in Table 4-2.
HSDPA services do not consume downlink R99 CEs. HSUPA services and R99 services share uplink CEs. CE congestion or routine CE usage monitoring may trigger CE resource analysis. If the CE resource usage is higher than a preset threshold for a period of time or CE congestion occurs, the CE resources are insufficient and must be increased to ensure system stability.
Cells belonging to the same NodeB share CEs, and the CE resources consumed by a NodeB must be calculated manually. Check whether CE resource congestion occurs in a resource group or an entire site. If CE resource congestion occurs in a resource group, reallocate CEs between resource groups. If CE resource congestion occurs in an entire site, perform site capacity expansion and reconfigure CEs as required.
Decrease the maximum number of PS RABs. Enable code-based load reshuffling (LDR). Decrease the minimum number of codes reserved for HSDPA services. Activate the license for dynamic code allocation on the NodeB.
Thresholds for the preceding code congestion-related operations must be set based on operators' requirements for service quality.
After IP RAN is introduced, Iub resources no longer need to be monitored. This section is retained to provide a complete solution so that operators can compare solutions provided by different vendors.
If insufficient Iub bandwidth causes congestion, check the Iub bandwidth usage. If the Iub bandwidth usage remains higher than 80% for a certain period, it can be determined that the Iub bandwidth is insufficient. If no more Iub resources are available or the issue is not urgent, decrease PS activity factors so the system admits more users. The activity factor, which is the ratio of actual bandwidth occupied by a user to the allocated bandwidth, is used to estimate the real bandwidth needed in admission. The activity factor can be set on a per-NodeB basis. The default activity factor is 70% for voice services and 40% for PS BE services.
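The 80% rule and the effect of the activity factor can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the number of consecutive samples that counts as "a certain period" (four here) is not specified in this guide, and the 384 kbit/s bearer rate in the usage note is an example value.

```python
def admission_bandwidth_kbps(allocated_kbps, activity_factor):
    """Bandwidth counted at admission: allocated bandwidth x activity factor."""
    if not 0.0 < activity_factor <= 1.0:
        raise ValueError("activity factor must be in (0, 1]")
    return allocated_kbps * activity_factor


def iub_insufficient(usage_samples, threshold=0.80, min_consecutive=4):
    """Return True if usage stays above the threshold for min_consecutive samples.

    min_consecutive is an assumed stand-in for "a certain period"; the guide
    does not give a value.
    """
    run = 0
    for usage in usage_samples:
        run = run + 1 if usage > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

For example, with the default PS BE activity factor of 40%, a bearer allocated 384 kbit/s is counted as about 153.6 kbit/s at admission. Lowering the factor reduces the bandwidth counted per user, which is why more users can be admitted on the same Iub link.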
Workaround: Enable the LDR and OLC functions. Solution: Add carriers or split cells.
Adding carriers is the most efficient solution to insufficient uplink power. If no more carriers are available, add more sites or tilt down antennas to split cells.
If the SPU CPU usage is higher than 50%, advise customers to add SPU boards. If SPU CPU usage is higher than 60%, add SPU boards immediately. Check whether SPU subsystem loads are balanced. If they are unbalanced, adjust load sharing thresholds so that subsystems share loads evenly. In addition, identify root causes for the high CPU usage. If signaling storms occur, check whether system configurations are correct or the transmission link is interrupted. If high traffic causes the high CPU usage, add SPU boards to expand capacity.
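The two SPU thresholds above map naturally onto a small classification function. The Python sketch below is a minimal illustration; the function name and the returned strings are hypothetical, and only the 50% and 60% thresholds come from this guide.

```python
def spu_expansion_advice(cpu_usage_percent):
    """Map mean SPU CPU usage (in percent) to the action described above."""
    if cpu_usage_percent > 60:
        return "add SPU boards immediately"
    if cpu_usage_percent > 50:
        return "advise customer to add SPU boards"
    return "no expansion needed"
```

Both comparisons are strict because the text says "higher than", so a usage of exactly 50% or 60% falls into the lower category.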
If the DPU DSP or interface board CPU usage is higher than 60%, add DPU boards or interface boards. Add hardware for capacity expansion if traffic increase or unbalanced transmission causes the high loads.
Decrease the values of PS inactivity timers to transfer PS services to the CELL_PCH or IDLE state, and set up RRC connections on DCHs instead of FACHs if DCH resources are sufficient. Add an SCCPCH to carry FACHs.
5 Counter Definitions

Congestion Counter

| Counter Name | Counter | Definition |
|---|---|---|
| Call drop ratio | Vs.Call.Block.Rate (custom) | Vs.RRC.Block.Rate + (<RRC.SuccConnEstab.sum>/(<VS.RRC.AttConnEstab.CellDCH> + <VS.RRC.AttConnEstab.CellFACH>)) x Vs.Rab.Block.Rate |
| RRC congestion ratio | Vs.RRC.Block.Rate (custom) | (<VS.RRC.Rej.ULPower.Cong> + <VS.RRC.Rej.DLPower.Cong> + <VS.RRC.Rej.ULIUBBand.Cong> + <VS.RRC.Rej.DLIUBBand.Cong> + <VS.RRC.Rej.ULCE.Cong> + <VS.RRC.Rej.DLCE.Cong> + <VS.RRC.Rej.Code.Cong>)/<VS.RRC.AttConnEstab.Sum> |
| Call congestion ratio | Vs.RAB.Block.Rate (custom) | (<VS.RAB.FailEstabCS.ULPower.Cong> + <VS.RAB.FailEstabCS.DLPower.Cong> + <VS.RAB.FailEstabPS.ULPower.Cong> + <VS.RAB.FailEstabPS.DLPower.Cong> + <VS.RAB.FailEstabCS.ULCE.Cong> + <VS.RAB.FailEstabPS.ULCE.Cong> + <VS.RAB.FailEstabCs.DLCE.Cong> + <VS.RAB.FailEstabPs.DLCE.Cong> + <VS.RAB.FailEstabCs.Code.Cong> + <VS.RAB.FailEstabPs.Code.Cong> + <VS.RAB.FailEstabCS.DLIUBBand.Cong> + <VS.RAB.FailEstabCS.ULIUBBand.Cong> + <VS.RAB.FailEstabPS.DLIUBBand.Cong> + <VS.RAB.FailEstabPS.ULIUBBand.Cong>)/VS.RAB.AttEstab.Cell |
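The Vs.Call.Block.Rate definition combines the two stages of call setup: a call is blocked either at RRC setup, or at RAB setup after the RRC connection has succeeded. A minimal Python sketch of that definition follows. The function and parameter names are hypothetical, and the block rates are assumed to be passed in as fractions rather than percentages.

```python
def call_block_rate(rrc_block_rate, rab_block_rate,
                    rrc_succ_conn_estab, rrc_att_cell_dch, rrc_att_cell_fach):
    """Vs.Call.Block.Rate = RRC block rate + RRC success ratio x RAB block rate.

    rrc_succ_conn_estab: RRC.SuccConnEstab.sum
    rrc_att_cell_dch:    VS.RRC.AttConnEstab.CellDCH
    rrc_att_cell_fach:   VS.RRC.AttConnEstab.CellFACH
    """
    attempts = rrc_att_cell_dch + rrc_att_cell_fach
    # Avoid dividing by zero when no RRC setup attempt was recorded.
    rrc_success_ratio = rrc_succ_conn_estab / attempts if attempts > 0 else 0.0
    return rrc_block_rate + rrc_success_ratio * rab_block_rate
```

With an RRC block rate of 2%, a RAB block rate of 5%, and 900 successful RRC setups out of 1000 attempts, the function evaluates 0.02 + 0.9 x 0.05.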
Usage Counter

Power Usage Counter

| Counter Name | Counter | Definition |
|---|---|---|
| R99_TCP_Utilization_Ratio | VS.MeanTCP.NonHS | VS.MeanTCP.NonHS/Configured_Total_Cell_TCP (43 dBm or 46 dBm) |
| Total_TCP_Utilization_Ratio | VS.MeanTCP | VS.MeanTCP/Configured_Total_Cell_TCP |
| Max UL RTWP | VS.MaxRTWP | VS.MaxRTWP |
| Mean UL RTWP | VS.MeanRTWP | VS.MeanRTWP |
| Min UL RTWP | VS.MinRTWP | VS.MinRTWP |
| UL ENU ratio | VS.RAC.UL.EqvUserNum | VS.RAC.UL.EqvUserNum/UlTotalEqUserNum |

IUB Usage Counters

| Counter Name | Counter | Definition |
|---|---|---|
| IUB BW usage | NODEB_Throughput (custom), NODEB_Trans_Cap (custom) | NODEB_Throughput/NODEB_Trans_Cap |
| NODEB_Trans_Cap | VS.IPDLTotal.1, VS.IPDLTotal.2, VS.IPDLTotal.3, VS.IPDLTotal.4 | (VS.IPDLTotal.1 + VS.IPDLTotal.2 + VS.IPDLTotal.3 + VS.IPDLTotal.4) |
| NODEB_Throughput | NODEB_Throughput_DL (custom), NODEB_Throughput_UL (custom) | MAX(NODEB_Throughput_DL, NODEB_Throughput_UL) |
| NODEB_Throughput_DL | VS.IPDLAvgUsed.1, VS.IPDLAvgUsed.2, VS.IPDLAvgUsed.3, VS.IPDLAvgUsed.4 | (VS.IPDLAvgUsed.1 + VS.IPDLAvgUsed.2 + VS.IPDLAvgUsed.3 + VS.IPDLAvgUsed.4) |
| NODEB_Throughput_UL | VS.IPULAvgUsed.1, VS.IPULAvgUsed.2, VS.IPULAvgUsed.3, VS.IPULAvgUsed.4 | (VS.IPULAvgUsed.1 + VS.IPULAvgUsed.2 + VS.IPULAvgUsed.3 + VS.IPULAvgUsed.4) |
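The custom IUB BW usage counter is built from three intermediate sums. The Python sketch below chains them together. It is illustrative only: the function name and the list-based inputs (one value per IP path, in the same unit) are assumptions, while the arithmetic follows the NODEB_Throughput and NODEB_Trans_Cap definitions.

```python
def iub_bw_usage(ip_dl_avg_used, ip_ul_avg_used, ip_dl_total):
    """Return NODEB_Throughput / NODEB_Trans_Cap as a fraction.

    ip_dl_avg_used: values of VS.IPDLAvgUsed.1 to VS.IPDLAvgUsed.4
    ip_ul_avg_used: values of VS.IPULAvgUsed.1 to VS.IPULAvgUsed.4
    ip_dl_total:    values of VS.IPDLTotal.1 to VS.IPDLTotal.4
    """
    throughput_dl = sum(ip_dl_avg_used)           # NODEB_Throughput_DL
    throughput_ul = sum(ip_ul_avg_used)           # NODEB_Throughput_UL
    throughput = max(throughput_dl, throughput_ul)  # NODEB_Throughput
    trans_cap = sum(ip_dl_total)                  # NODEB_Trans_Cap
    if trans_cap <= 0:
        raise ValueError("NODEB_Trans_Cap must be positive")
    return throughput / trans_cap
```

Taking the maximum of the downlink and uplink throughput means the busier direction determines the reported usage.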
Counter
Definition
VS.UTRAN.AttPaging1/(60 x 60 x 5/0.01)

(1) Utilization of FACH carried on non-standalone SCCPCH
FACH Utility Ratio = VS.CRNCIubBytesFACH.Tx * 8/((60 * <SP> * 168 * 1/0.01) * VS.PCH.Bandwidth.UsageRate * 6/7 + (60 * <SP> * 360 * 1/0.01) * (1 - VS.PCH.Bandwidth.UsageRate * 6/7))
where VS.PCH.Bandwidth.UsageRate = <VS.CRNCIubBytesPCH.Tx>/(<VS.CRNC.IUB.PCH.Bandwidth> * <SP> * 60.0)

(2) Utilization of FACH carried on standalone SCCPCH
FACH Utility Ratio = ((VS.SRBNum.FACH - VS.OneSRBTTINum.FACH)/2 + VS.OneSRBTTINum.FACH + VS.IndepTRBNum.FACH)/(3600/0.01)
OVSF Usage Counter

| Counter Name | Counter | Definition |
|---|---|---|
| OVSF usage | VS.RAB.SFOccupy | VS.RAB.SFOccupy |
| OVSF usability ratio | VS.RAB.SFOccupy.Ratio | VS.RAB.SFOccupy/256 |
| DCH OVSF ratio | DCH_OVSF_Utilization | [(<VS.SingleRAB.SF4> + <VS.MultRAB.SF4>) x 64 + (<VS.MultRAB.SF8> + <VS.SingleRAB.SF8>) x 32 + (<VS.MultRAB.SF16> + <VS.SingleRAB.SF16>) x 16 + (<VS.SingleRAB.SF32> + <VS.MultRAB.SF32>) x 8 + (<VS.MultRAB.SF64> + <VS.SingleRAB.SF64>) x 4 + (<VS.SingleRAB.SF128> + <VS.MultRAB.SF128>) x 2 + (<VS.SingleRAB.SF256> + <VS.MultRAB.SF256>)]/256 |
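In the DCH_OVSF_Utilization definition, each code of spreading factor SF is weighted by 256/SF, which converts it to the number of SF256 codes it occupies in the code tree. The Python sketch below applies that weighting. The function name and the dictionary inputs (spreading factor mapped to the value of the corresponding VS.SingleRAB.SFn or VS.MultRAB.SFn counter) are assumptions made for illustration.

```python
SPREADING_FACTORS = (4, 8, 16, 32, 64, 128, 256)


def dch_ovsf_utilization(single_rab, mult_rab):
    """Return DCH_OVSF_Utilization as a fraction of the 256 SF256 codes.

    single_rab: {SF: value of VS.SingleRAB.SF<SF>}
    mult_rab:   {SF: value of VS.MultRAB.SF<SF>}
    Spreading factors missing from a dict are treated as zero.
    """
    occupied_sf256_equivalents = 0
    for sf in SPREADING_FACTORS:
        weight = 256 // sf  # SF4 -> 64, SF8 -> 32, ..., SF256 -> 1
        codes = single_rab.get(sf, 0) + mult_rab.get(sf, 0)
        occupied_sf256_equivalents += codes * weight
    return occupied_sf256_equivalents / 256
```

For example, 20 single-RAB SF128 codes, 2 single-RAB SF32 codes, and 1 multi-RAB SF8 code occupy 40 + 16 + 32 = 88 SF256 equivalents, that is, 88/256 of the code tree.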
CPU Usage Counter

| Counter Name | Counter | Definition |
|---|---|---|
| SPU usage | VS.XPU.CPULOAD.MEAN | VS.XPU.CPULOAD.MEAN |
| MPU usage | VS.XPU.CPULOAD.MEAN | VS.XPU.CPULOAD.MEAN |
VS.BRD.CPULOAD.MEAN
if VS.NodeB.ULCreditUsed.Mean > 0
    Sum_AllCells_of_NodeB(VS.NodeB.ULCreditUsed.Mean/2) / MIN(NodeB License UL CE Number, NodeB Physical UL CE Capacity)
else
    Sum_AllCells_of_NodeB(VS.LC.ULCreditUsed.Mean/2) / MIN(NodeB License UL CE Number, NodeB Physical UL CE Capacity)

Sum_AllCells_of_NodeB(VS.LC.DLCreditUsed.Mean) / MIN(NodeB License DL CE Number, NodeB Physical DL CE Capacity)
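The CE usage definitions above can be sketched in Python as follows. This is a hedged illustration, not a reference implementation: the function names and list-based inputs (one value per cell of the NodeB) are hypothetical, and the sketch assumes that the condition "VS.NodeB.ULCreditUsed.Mean > 0" is evaluated on the sum over all cells.

```python
def ul_ce_usage(nodeb_ul_credit_used, lc_ul_credit_used,
                license_ul_ce, physical_ul_ce):
    """Uplink CE usage of a NodeB as a fraction.

    nodeb_ul_credit_used: per-cell values of VS.NodeB.ULCreditUsed.Mean
    lc_ul_credit_used:    per-cell values of VS.LC.ULCreditUsed.Mean
    """
    capacity = min(license_ul_ce, physical_ul_ce)
    if capacity <= 0:
        raise ValueError("UL CE capacity must be positive")
    nodeb_sum = sum(nodeb_ul_credit_used)
    # Assumption: fall back to the VS.LC counter when the NodeB-level sum is 0.
    credits = nodeb_sum if nodeb_sum > 0 else sum(lc_ul_credit_used)
    # The definition divides uplink credits by 2 before comparing with CEs.
    return (credits / 2) / capacity


def dl_ce_usage(lc_dl_credit_used, license_dl_ce, physical_dl_ce):
    """Downlink CE usage of a NodeB as a fraction.

    lc_dl_credit_used: per-cell values of VS.LC.DLCreditUsed.Mean
    """
    capacity = min(license_dl_ce, physical_dl_ce)
    if capacity <= 0:
        raise ValueError("DL CE capacity must be positive")
    return sum(lc_dl_credit_used) / capacity
```

Using the smaller of the licensed and physical CE numbers as the denominator reflects that whichever limit is lower caps the usable capacity.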