
INITIAL Iub TROUBLESHOOTING OVERVIEW

Agenda
• Objective
• CPP O&M Concepts
• Protocols
• O&M Client Services
• Counters Overview
• Performance Management
• Iub over ATM
• Initial Counters
• Iub Analysis
• Fail After Admission
• IP Iub Throughput
• Questions?
OBJECTIVE
The main idea is to introduce the transport engineer to the basic concepts
of troubleshooting on the Iub interface, by presenting initial counters and
KPIs that can help to identify which area needs further investigation.

Based on these conclusions, network optimization services can be performed.
CPP O&M CONCEPTS

Moshell is a suite of tools for O&M of CPP-based nodes.

CPP is the Connectivity Packet Platform, on which the following nodes are
based: RNC, RBS, MGW, RXI.

Information collected by CPP counters every 15 minutes is stored in XML
files (ROP files).

This information is read and stored in an SQL database on a daily basis.
PROTOCOLS
Protocols used for accessing these services:
› http
› unsecure protocols (unencrypted): telnet, ftp, iiop
› secure protocols (encrypted): ssh, sftp, ssliop
[Figure: access paths to the NODE. Hyper Terminal connects over RS232;
MoShell connects over Ethernet TCP/IP or IP over ATM.]
› OSE shell (COLI): TELNET (23) / SSH (22)
› HTTP (80)
› File system: FTP (21) / SFTP (22)
› MIB, CM (Configuration Mgmt) / FM (Fault Mgmt): IIOP (56834) / SSL IOP (56836)
› PM (Performance Mgmt): Scanners

Figure 1 - Protocols
The O&M client services

› Configuration Service (CS): Read and change configuration data;
configuration data is stored in the MO attributes
› Alarm Service (AS): Retrieve the list of alarms currently active on
each MO
› Notification Service (NS): Subscribe to and receive notifications from
the node, informing about parameter/alarm changes in the MOs
› Inventory Service (IS): Get a list of all HW and SW defined in the
node
› Log Service (LS): Save a log of certain events such as changes in
the configuration, alarms raising and ceasing, node and board restarts
› Performance Measurement (PM): Set up measurements whose results are
stored in MO pm-attributes and output to an XML file every 15 minutes.
COUNTERS OVERVIEW
Counter Types:
• Peg: a counter that is increased by 1 at each occurrence of a specific
activity.

• Gauge: a counter that can be increased or decreased depending on the
activity in the system.

• Accumulator: a counter that is increased by the value of a sample. It
indicates the total sum of all sample values taken during a certain time.
The name of an accumulator counter begins with either pmSum or
pmSumOfSamp.

• Scan: a counter that is increased by 1 each time the corresponding
accumulator counter is increased. It indicates how many samples have
been read.

• Probability Density Function (PDF): a list of range counters. If a sampled
value falls within a certain range, the counter for that range is increased.
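The five counter types above can be illustrated with a small toy model. This is only a sketch: the class name, method names and range edges are hypothetical and do not correspond to any CPP or MoShell API.

```python
from bisect import bisect_right


class CounterSet:
    """Toy model of the counter types (hypothetical, not a CPP API)."""

    def __init__(self, pdf_edges):
        self.peg = 0          # Peg: +1 per occurrence of an activity
        self.gauge = 0        # Gauge: may go up or down
        self.pm_sum = 0       # Accumulator: sum of all sample values
        self.pm_samples = 0   # Scan: number of samples read
        self.pdf_edges = list(pdf_edges)             # assumed range boundaries
        self.pdf = [0] * (len(self.pdf_edges) + 1)   # one counter per range

    def event(self):
        self.peg += 1

    def connection_opened(self):
        self.gauge += 1

    def connection_closed(self):
        self.gauge -= 1

    def sample(self, value):
        # Accumulator and scan counters always move together.
        self.pm_sum += value
        self.pm_samples += 1
        # PDF: increase the counter of the range the value falls into.
        self.pdf[bisect_right(self.pdf_edges, value)] += 1

    def mean(self):
        # Average sample value = accumulator / scan.
        return self.pm_sum / self.pm_samples if self.pm_samples else None


c = CounterSet(pdf_edges=[10, 20])
c.event()
c.event()
c.connection_opened()
c.connection_opened()
c.connection_closed()
for v in (5, 15, 25, 15):
    c.sample(v)
```

Dividing the accumulator by the scan counter gives the average sample value over the period, which is how the pmSum/pmSamples pairs later in this presentation are used.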
COUNTERS OVERVIEW
Counter Reset Behavior
Counter values can either be reset at the end of each ROP period or be
accumulated up to the counter limit.
For a counter that is not reset after the ROP period, the increment
during a ROP period is the difference between the values of two
consecutive ROPs.
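As a minimal sketch of the rule above, the per-ROP increments can be derived as follows. The function name is hypothetical, and behavior at the counter limit is deliberately not handled because the source does not describe it.

```python
def rop_increments(values, reset_each_rop):
    """Return per-ROP increments from a series of consecutive ROP readings.

    values: counter readings, one per consecutive ROP.
    reset_each_rop: True if the counter is reset at the end of every ROP.
    Assumption: no counter-limit handling (not described in the source).
    """
    if reset_each_rop:
        # Each reading already is the increment for its ROP.
        return list(values)
    # Non-reset counter: increment = difference of consecutive readings.
    return [later - earlier for earlier, later in zip(values, values[1:])]
```

For a non-reset counter, N readings yield N-1 increments, since the first reading has no predecessor to subtract.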
Counter Classification
Counters can be grouped by NE Type:
RNC
RXI
RBS
Or by area of interest:
Radio Network – RNC specific counters
Radio Network – RBS specific counters
Transport Network counters
Iub over ATM

Figure 3 - Iub configuration example

Iub over ATM
AAL2 CAC and resource usage:

• AAL2 connection admission control (CAC) is executed before a new
AAL2 connection is set up in the system.

• AAL2 connections in UTRAN are always initiated by the RNC.

• The RNC reserves a CID and the relevant bandwidth, and forwards the
establish request message through the AP. The message contains the
allocated CID, the traffic descriptors and the QoS.
Iub over ATM
CID:
Because of standardization constraints, no more than 248 AAL2
connections can be simultaneously established on a single AAL2
path. More than 248 connections can be established between two
adjacent nodes only if more than one AAL2 path is configured.
When an AAL2 connection is allocated on an AAL2 path, a Channel
Identifier (CID) is reserved and assigned by the node that is
originating or forwarding the AAL2 connection request.
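The 248-connection limit translates directly into a minimum number of AAL2 paths for a given number of simultaneous connections. The helper below is a hypothetical sketch of that arithmetic only; it ignores bandwidth and the split between QoS classes.

```python
MAX_CONNS_PER_AAL2_PATH = 248  # limit per AAL2 path stated above


def min_aal2_paths(simultaneous_connections):
    """Smallest number of AAL2 paths able to carry the given connections."""
    if simultaneous_connections < 0:
        raise ValueError("number of connections cannot be negative")
    # Integer ceiling division.
    return -(-simultaneous_connections // MAX_CONNS_PER_AAL2_PATH)
```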

Figure 4 – AAL2 Connections table


Iub over ATM
In particular, the AAL2 path capacity assumed by CAC is equal to:
• the configured PCR, for CBR AAL2 paths

• the configured MCR, for UBR+ AAL2 paths

• zero, for UBR AAL2 paths

Flow Control:
The Flow Control function is designed to dynamically adapt the
transmission rate of Best Effort services to the available Iub bandwidth,
by reducing the transmission rate during Iub congestion.
Initial counter check
The following counters are recommended for an initial investigation, as they
give clues on whether the source of the problem is in the transport network.

Checking whether the number of unsuccessful local or remote AAL2
connections is increasing indicates where potential problems exist:
at the NodeB, RXI or RNC. The ‘OutConns’ counters, viewed at the AAL2
Access Points in the RNC looking towards the RXI/NodeB and at the AAL2
Access Points in the RXI looking towards the NodeBs, are the best counters
to observe.

• Aal2Ap pmUnSuccOutConnsLocalQosClassA/B/C/D
• Aal2Ap pmUnSuccInConnsLocalQosClassA/B/C/D
• Aal2Ap pmUnSuccOutConnsRemoteQosClassA/B/C/D
• Aal2Ap pmUnSuccInConnsRemoteQosClassA/B/C/D
Initial counter check
The following counters show the BW utilization:
› VclTp, VplTp, Atmport: pmBwUtilizationRx, pmBwUtilizationTx

To check ATM link utilization:
› VclTp, VplTp, Atmport: pmTransmittedAtmCells, pmReceivedAtmCells

To show the number of RRC/RAB establishment failures after admission:
› Utrancell: pmNoFailedAfterAdm
Initial counter check
To check for congestion in the control plane

Iub interface
• UniSaalTp pmNoOfLocalCongestions
• NbapCommon pmNoOfDiscardedNbapMessages
• Iublink pmTotalTimeIublinkCongestedDl

Iu/Iur interface
• NniSaalTp pmNoOfLocalCongestions

To check for interface availability

Iub interface
• UniSaalTp pmLinkInServiceTime

Iu/Iur interface
• NniSaalTp pmLinkInServiceTime
Initial counter check
The following counter shows whether Iub bandwidth is limiting HS services,
measured in %.
Note: if the value is above 75%, the cause could be Iub capacity or radio
limitations.

• IubDataStreams pmCapAllocIubHsLimitingRatioSpi<xx>

To see HS frame loss

• IubDataStreams pmHsDataFramesLostSpi<XX>
• IubDataStreams pmHsDataFramesReceivedSpi<XX>

To check ATM link quality

• Aal2PathVccTp: pmBwLostCells
• Aal5TpVccTp, VpcTp: pmFwLostCells
Initial counter check
Check the physical layer quality of the transmission link:

› ImaLink: pmSesIma, pmSesImaFe, pmUasIma, pmUasImaFe

› ImaGroup: pmGrUasIma

› E1PhyspathTerm, E1Ttp, E3PhysPathterm: pmEs, pmSes, pmUas

› Os155SpiTtp: pmMsEs, pmMsSes, pmMsUas, pmMsBbe

› Vc12Ttp, Vc4Ttp: pmVcEs, pmVcSes, pmVcUas
Iub analysis
The following flowchart summarises an Iub link analysis procedure based on
AAL2 Setup failure rate examination.

[Flowchart: Iub analysis by traffic type and AAL2 Setup Failure result]

Strict Admission Traffic
• No AAL2 Setup Failure: OK
• Local AAL2 Setup Failure: lack of CID or lack of Bw; create more Class A VCs
• Remote AAL2 Setup Failure: bad TN quality; check physical layer quality

Best Effort Traffic
• No AAL2 Setup Failure: check Flow Control counters
• Local AAL2 Setup Failure: lack of CID; create more Class B&C VCs
• Remote AAL2 Setup Failure: bad TN quality; check physical layer quality
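One reading of the Iub analysis flowchart can be written as a small decision function. This is a sketch under assumptions: the function name, the argument values ('strict', 'best_effort', 'local', 'remote') and the returned wording are hypothetical, and the branch layout follows the flowchart labels as far as they can be recovered.

```python
def iub_next_step(traffic, setup_failure):
    """Suggest the next troubleshooting step for an Iub link.

    traffic: 'strict' (Strict Admission) or 'best_effort'.
    setup_failure: None (no AAL2 setup failure), 'local' or 'remote'.
    """
    if traffic not in ("strict", "best_effort"):
        raise ValueError("unknown traffic type: %r" % (traffic,))

    if setup_failure is None:
        # No AAL2 setup failure: strict traffic is fine; for best effort
        # traffic the flow control counters are the next thing to check.
        return "OK" if traffic == "strict" else "Check Flow Control counters"

    if setup_failure == "local":
        if traffic == "strict":
            return "Lack of CID or Bw: create more Class A VCs"
        return "Lack of CID: create more Class B&C VCs"

    if setup_failure == "remote":
        return "Bad TN quality: check physical layer quality"

    raise ValueError("unknown setup failure type: %r" % (setup_failure,))
```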
AAL2 Setup Failure Rate
The following KPIs and Aal2Ap counters are suggested to monitor the AAL2
Setup Failure rate on an Iub link.

Counters
• Aal2Ap::pmUnSuccOutConnsLocalQosClass<x> (A/B/C/D)
Number of unsuccessful attempts to allocate AAL2 resources during
establishment of outgoing connections on this Access Point (AP). Caused by
rejects in Connection Admission Control (CAC).

• Aal2Ap::pmUnSuccOutConnsRemoteQosClass<x> (A/B/C/D)
Number of unsuccessful establishments of outgoing connections on this AAL2
Access Point (AP).

• Aal2Ap::pmSuccOutConnsRemoteQosClass<x> (A/B/C/D)
Number of successful establishments of outgoing connections on this AAL2
Access Point (AP).
AAL2 Setup Failure Rate
KPIs

\[
\text{AAL2\_Fail\_Rate\_Local\_ClassA}\,[\%] =
\frac{\text{pmUnSuccOutConnsLocalQosClassA}}
     {\text{pmSuccOutConnsRemoteQosClassA} + \text{pmUnSuccOutConnsLocalQosClassA} + \text{pmUnSuccOutConnsRemoteQosClassA}}
\times 100\%
\]

\[
\text{AAL2\_Fail\_Rate\_Remote\_ClassA}\,[\%] =
\frac{\text{pmUnSuccOutConnsRemoteQosClassA}}
     {\text{pmSuccOutConnsRemoteQosClassA} + \text{pmUnSuccOutConnsRemoteQosClassA}}
\times 100\%
\]

Similar formulae can be used for Class B & Class C.

The AAL2_Fail_Rate_Local_ClassA KPI signals possible problems in the Iub
section between the RNC and the next connected node (NodeB or RXI).

The AAL2_Fail_Rate_Remote_ClassA KPI signals possible problems in the Iub
section beyond the next connected node, i.e. between any intermediate RXI
and the NodeB.
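The two AAL2 setup failure KPIs can be computed from the three Aal2Ap counters of one QoS class as sketched below. The function and parameter names are hypothetical, and returning None when no attempts were recorded is an assumption made here to avoid a division by zero.

```python
def aal2_fail_rates(succ_remote, unsucc_local, unsucc_remote):
    """Return (local_fail_rate_pct, remote_fail_rate_pct) for one QoS class.

    succ_remote:   pmSuccOutConnsRemoteQosClass<x>
    unsucc_local:  pmUnSuccOutConnsLocalQosClass<x>
    unsucc_remote: pmUnSuccOutConnsRemoteQosClass<x>
    """
    # Local KPI: CAC rejects over all outgoing attempts.
    attempts = succ_remote + unsucc_local + unsucc_remote
    local = 100.0 * unsucc_local / attempts if attempts else None

    # Remote KPI: remote failures over attempts that passed local CAC.
    forwarded = succ_remote + unsucc_remote
    remote = 100.0 * unsucc_remote / forwarded if forwarded else None

    return local, remote


local_pct, remote_pct = aal2_fail_rates(900, 50, 50)
```

With 900 successful, 50 locally rejected and 50 remotely rejected connections, the local rate is 5% of 1000 attempts, while the remote rate is taken over the 950 attempts that left the node.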
CID Utilization Estimate
This is a crude method of calculating the number of CIDs, as it does not
distinguish between traffic types.

There is a second method using Erlang counters, which is not demonstrated
in this presentation.

Counters
• Aal2Ap::pmExisTransConns
Number of existing connections for the Access Point (AP) transiting this node.
Gauge Counter.

• Aal2Ap::pmExisOrigConns
Number of existing connections for the Access Point (AP) originating in this node.
Gauge Counter.

• Aal2Ap::pmExisTermConns
Number of existing connections for the Access Point (AP) terminating in this node.
Gauge Counter.
CID Utilization Estimate
KPI

\[
\text{Average\_No\_Connections} =
\frac{\text{pmExisOrigConns} + \text{pmExisTermConns} + \text{pmExisTransConns}}{n}
\]

where n is the number of paths per AAL2 Access Point.

Note: if the RXI is a pure AAL2 switching node, then the pmExisOrigConns and
pmExisTermConns counters can be discounted, as there can be no originated
or terminated connections in the node, only transiting connections.

• This method of CID calculation gives a basic estimate of CID utilization.

In a typical Iub link with one VC (normally vc39) defined for Strict Admission
traffic and one VC (normally vc50) defined for Best Effort traffic, n = 2 and
the division by 2 in the formula averages the total number of used CIDs over
both traffic types. For example, if the counters sum to 360, it is not known
whether this is 180 CIDs in both ClassA and ClassB&C, or perhaps 240 in ClassA
and 120 in ClassB&C. In the latter case VC expansion is needed, as the
maximum number of CIDs allowed per path (248) is being reached.
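The CID estimate and its main limitation can be reproduced with a few lines. This is a sketch only: the function names are hypothetical, and expressing the result as a percentage of the 248-CID limit is an illustrative addition, not a KPI defined in this presentation.

```python
MAX_CIDS_PER_PATH = 248  # maximum CIDs per AAL2 path, as stated above


def average_connections_per_path(orig, term, trans, n_paths):
    """Average_No_Connections = (orig + term + trans) / n."""
    if n_paths <= 0:
        raise ValueError("n_paths must be positive")
    return (orig + term + trans) / n_paths


def cid_utilization_pct(connections_on_path):
    """Share of the per-path CID limit in use (illustrative helper)."""
    return 100.0 * connections_on_path / MAX_CIDS_PER_PATH


# Example from the text: 360 connections over two paths (vc39 and vc50).
avg = average_connections_per_path(orig=360, term=0, trans=0, n_paths=2)
avg_pct = cid_utilization_pct(avg)        # what the estimate reports
worst_case_pct = cid_utilization_pct(240)  # if the split is really 240/120
```

The average hides an uneven split: the estimate reports about 73% of the limit, while a 240/120 split would put one path at about 97%.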
BW Utilization Estimate
Bandwidth utilization can be measured per VP and also per VC using
counters.
To monitor Best Effort VC utilization, it is better to use the ‘Flow Control’
methodology.

Counters

• VplTp::pmTransmittedAtmCells = Number of transmitted ATM cells. This
counter is incremented for each transmitted ATM cell. Peg counter.
• VplTp::pmReceivedAtmCells = Number of received ATM cells. This counter
is incremented for each received ATM cell. Peg counter.

KPIs

\[
\text{AAL2\_VP\_Utilisation\_Tx} =
\frac{\text{VplTp::pmTransmittedAtmCells}}
     {\text{Meas\_Length(s)} \times \text{egressAtmPCR}} \times 100\%
\]

\[
\text{AAL2\_VP\_Utilisation\_Rx} =
\frac{\text{VplTp::pmReceivedAtmCells}}
     {\text{Meas\_Length(s)} \times \text{ingressAtmPCR}} \times 100\%
\]
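As a worked example of the VP utilisation KPIs, the sketch below applies the same formula to either direction. The function name is hypothetical, and the PCR is assumed to be expressed in cells per second so that it is comparable with a cell count over the measurement length.

```python
def vp_utilisation_pct(atm_cells, meas_length_s, atm_pcr):
    """Utilisation in % = cells / (Meas_Length(s) * PCR) * 100.

    atm_cells:     pmTransmittedAtmCells (Tx) or pmReceivedAtmCells (Rx)
    meas_length_s: measurement length in seconds (900 for a 15-minute ROP)
    atm_pcr:       egressAtmPCR (Tx) or ingressAtmPCR (Rx), assumed cells/s
    """
    if meas_length_s <= 0 or atm_pcr <= 0:
        raise ValueError("measurement length and PCR must be positive")
    return 100.0 * atm_cells / (meas_length_s * atm_pcr)


# 1.8 million cells in one 15-minute ROP on a VP with a PCR of 4000 cells/s.
tx_util = vp_utilisation_pct(1_800_000, 900, 4000)
```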
TN quality: Physical Layer Quality
Several counters are available to monitor the availability and the quality of
physical and IMA terminations in CPP nodes.

• Errored Seconds (ES): seconds with block errors during the PM interval.
These counters are incremented for each second where one or more blocks
with one or more errors are received.

• Severely Errored Seconds (SES): seconds during available time having a
severe bit error rate.

• Unavailable Seconds (UAS): the accumulated unavailable time in seconds
during the interval. Unavailable time starts when 10 consecutive SES are
detected, and ends when 10 consecutive non-SES are detected. These counters
are incremented for each second of unavailable time.
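The unavailable-time rule above can be sketched as a small state machine over per-second SES flags. The function name is hypothetical. One assumption goes beyond the text: following the ITU-T G.826 convention, the 10 SES that trigger unavailability are counted as unavailable time, and the 10 non-SES that end it are counted as available time.

```python
def count_unavailable_seconds(ses_flags, threshold=10):
    """Count unavailable seconds in a per-second sequence of SES flags.

    ses_flags: sequence of booleans, True where the second is an SES.
    threshold: consecutive seconds needed to enter or leave unavailability.
    """
    unavailable = False
    total = 0
    for i in range(len(ses_flags)):
        window = ses_flags[i:i + threshold]
        # State can only change when a full window of seconds is visible;
        # near the end of the sequence the current state is simply kept.
        if len(window) == threshold:
            if not unavailable and all(window):
                unavailable = True   # onset of 10 consecutive SES
            elif unavailable and not any(window):
                unavailable = False  # onset of 10 consecutive non-SES
        if unavailable:
            total += 1
    return total
```

Note that a short error-free gap inside an outage does not end the unavailable period; only a full run of 10 non-SES does.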
Flow Control
HSDPA Congestion KPIs:

\[
\text{HSFrameLossRatio} =
\frac{\text{pmHsDataFramesLostSpi}\langle xx\rangle}
     {\text{pmHsDataFramesReceivedSpi}\langle xx\rangle + \text{pmHsDataFramesLostSpi}\langle xx\rangle}
\times 100\%
\]

High frame loss indicates potential congestion problems.

<xx> = the supported SPI (Scheduling Priority Indicator)

\[
\text{HSFrameDelayDistribution} = \text{pmHsDataFrameDelayIubSpi}\langle xx\rangle
\]

This counter indicates the percentage of time during which Iub congestion has
occurred, per SPI (Scheduling Priority Indicator).

Experience has shown that in highly loaded Iub cases this counter can reach
values of about 65–75%.
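The HSFrameLossRatio KPI above can be evaluated per SPI as in the sketch below. The function names and the dictionary layout (SPI number mapped to a lost/received pair) are hypothetical; returning None for an SPI without any frames is an assumption made to avoid dividing by zero.

```python
def hs_frame_loss_ratio_pct(frames_lost, frames_received):
    """HSFrameLossRatio = lost / (received + lost) * 100, for one SPI."""
    total = frames_received + frames_lost
    return 100.0 * frames_lost / total if total else None


def hs_frame_loss_by_spi(counters_by_spi):
    """counters_by_spi: {spi: (pmHsDataFramesLostSpi, pmHsDataFramesReceivedSpi)}."""
    return {
        spi: hs_frame_loss_ratio_pct(lost, received)
        for spi, (lost, received) in counters_by_spi.items()
    }


ratios = hs_frame_loss_by_spi({0: (20, 980), 1: (0, 500), 2: (0, 0)})
```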
Flow Control
Case Study: Low HS Throughput Site Analysis
Counters were extracted and graphs plotted to
illustrate the HS Frame Loss Ratio and HSLimitIub
KPIs over time.
Flow Control
Examining the resulting KPI graphs, it was evident that the channel
normally reserved for ClassA traffic (vc39) was experiencing abnormally high
bandwidth utilization.
The ClassB&C traffic channels (vc50 & vc51) were experiencing abnormally low
utilization (next slide).
Flow Control
Enhanced Uplink Congestion KPIs

\[
\text{Eul\_Frame\_Loss\_Ratio} =
\frac{\text{pmEdchDataFramesLost}}
     {\text{pmEdchDataFramesReceived} + \text{pmEdchDataFramesLost}}
\times 100\%
\]

High frame loss indicates potential congestion problems.

\[
\text{Eul\_Frame\_Delay\_Distribution} = \text{pmEdchDataFrameDelayIub}
\]

This counter is difficult to post-process, so it is recommended only for
troubleshooting rather than for performance monitoring.
Failure After Admission
What is ‘Failure After Admission’?

• ‘Failure After Admission’ refers to an RRC/RAB setup failure that occurs
after the user has been admitted to the network.

• Admission to the network occurs when the user successfully completes an
initial RRC Connection Setup request.

• An RRC failure after the initial admission can occur, for example, if the
user wants to upswitch to a higher rate during an existing call and the
upswitch cannot be achieved due to lack of resources (Radio or Transport).
The user would perceive this as a slow connection.

• A RAB setup failure, on the other hand, would be perceived by the user as a
failure to set up a call.
Failure After Admission
In general, high ‘Failure After Admission’ occurrences are mainly due to:

• Transport Network: lack of BW/CIDs, or

• Radio Network: lack of Channel Element availability.

‘Failure After Admission’ Case Study

This case study follows the procedure below:

• Identification of a problem site, by extraction of the
pmNoFailedAfterAdm counter.
• AAL2 Setup Failure Rate: counter retrieval and KPI calculation.
• Graphical analysis to establish the correlation between both.
Graphical Analysis
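The presentation establishes the correlation graphically. As a numerical complement, and not as the method used in the case study, a Pearson correlation coefficient between the per-ROP pmNoFailedAfterAdm values and the AAL2 Setup Failure Rate can be computed as sketched below. The function name and the sample series are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series.

    Returns None if either series is constant (correlation undefined).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-ROP series, for illustration only.
failed_after_adm = [2, 5, 9, 14, 4]
aal2_fail_rate_pct = [0.5, 1.2, 2.4, 3.9, 1.0]
r = pearson(failed_after_adm, aal2_fail_rate_pct)
```

A coefficient close to 1 would suggest that failures after admission rise together with AAL2 setup failures, pointing towards the transport network rather than the radio network.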
IP Iub Throughput
The client should define a user throughput threshold in
order to identify the bandwidth target to be delivered (on
average) per user.
This threshold should then be compared with the actual
average user throughput, as defined below:
THROUGHPUTPER pmDlTraffi
USER: cVolumePsI ntHs
1
AvUserThrH s  kbit / s   This formula calculates
 Cells the average Bit-rate per user
Meas _ Length( s )
on Iub interface. Cells
 AvNrHsUser sPerCell

pmSumBestPsHsAdchRabEstablish
AvNrHsUser sPerCell  
pmSamplesBestPsHsAdchRabEstablish

pmSumBestPsEulRabEstablish pmSumBestPsStreamHsR abEst


 
pmSamplesBestPsEulRabEstablish pmSamplesBestPsStrea mHsRabEst
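The throughput-per-user formula can be sketched as follows, with each cell represented as a dictionary of counter values. The function names and the dictionary layout are hypothetical. Two assumptions are made: the three sum/samples ratios are added together, and the traffic volume counter is in kbit so that the result is in kbit/s.

```python
RAB_COUNTER_PAIRS = [
    ("pmSumBestPsHsAdchRabEstablish", "pmSamplesBestPsHsAdchRabEstablish"),
    ("pmSumBestPsEulRabEstablish", "pmSamplesBestPsEulRabEstablish"),
    ("pmSumBestPsStreamHsRabEst", "pmSamplesBestPsStreamHsRabEst"),
]


def av_nr_hs_users_per_cell(cell):
    """Sum of the accumulator/scan ratios; pairs with no samples are skipped."""
    users = 0.0
    for sum_name, samples_name in RAB_COUNTER_PAIRS:
        samples = cell.get(samples_name, 0)
        if samples:
            users += cell.get(sum_name, 0) / samples
    return users


def av_user_thr_hs(cells, meas_length_s):
    """Average HS throughput per user over all cells of the Iub link."""
    volume = sum(cell["pmDlTrafficVolumePsIntHs"] for cell in cells)
    users = sum(av_nr_hs_users_per_cell(cell) for cell in cells)
    if users == 0 or meas_length_s <= 0:
        return None
    return volume / meas_length_s / users


cell = {
    "pmDlTrafficVolumePsIntHs": 900_000,
    "pmSumBestPsHsAdchRabEstablish": 1800,
    "pmSamplesBestPsHsAdchRabEstablish": 900,
    "pmSumBestPsEulRabEstablish": 900,
    "pmSamplesBestPsEulRabEstablish": 900,
    "pmSamplesBestPsStreamHsRabEst": 0,
}
thr = av_user_thr_hs([cell], meas_length_s=900)
```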
IP Iub Throughput
If the throughput per user is below the defined threshold, it should be
determined whether it has been limited by ‘Flow Control’. This can be done
using the Iub congestion counter:

\[
\text{HSLimitIub} = \text{pmCapAllocIubHsLimitingRatioSpi}\langle xx\rangle
\]

Another indication that the transport network is overloaded is given by the
frame loss KPI, which should present values below 2%:

\[
\text{HSFrameLossRatio} =
\frac{\text{pmHsDataFramesLostSpi}\langle xx\rangle}
     {\text{pmHsDataFramesReceivedSpi}\langle xx\rangle + \text{pmHsDataFramesLostSpi}\langle xx\rangle}
\times 100\%
\]

If frame loss counter returns low values, and Iub presents no limitation
IP Iub Throughput
RNC Iub throughput monitoring KPIs:

• Average Iub throughput:

\[
\text{IUB\_THR}\,[\text{kbit/s}] = \frac{\text{pmSumCapacity}}{\text{pmSamplesCapacity}}
\]

• Average Iub throughput regulated within ROP:

\[
\text{IUB\_THR\_REG}\,[\text{kbit/s}] =
\frac{\text{pmSumCapacityRegulation}}{\text{pmSamplesCapacityRegulation}}
\]

• Periods of Iub throughput limitation:

\[
\text{IUB\_THR\_REG\_DURATION}\,[\text{sec}] = \text{pmTotalTimeCapacityRegulated}
\]

Observing these KPIs helps to determine when low performance is due to an
internal RNC limitation rather than to the transport network.
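The three RNC Iub throughput KPIs can be derived from one ROP of counters as in this sketch. The function name and the dictionary input are hypothetical, and returning None when a samples counter is zero or missing is an assumption.

```python
def iub_throughput_kpis(counters):
    """Compute IUB_THR, IUB_THR_REG and IUB_THR_REG_DURATION for one ROP.

    counters: dict mapping counter names to values for a single ROP.
    """
    def ratio(sum_name, samples_name):
        samples = counters.get(samples_name)
        if not samples:
            return None  # no samples, average undefined
        return counters.get(sum_name, 0) / samples

    return {
        "IUB_THR": ratio("pmSumCapacity", "pmSamplesCapacity"),
        "IUB_THR_REG": ratio("pmSumCapacityRegulation",
                             "pmSamplesCapacityRegulation"),
        "IUB_THR_REG_DURATION": counters.get("pmTotalTimeCapacityRegulated"),
    }


kpis = iub_throughput_kpis({
    "pmSumCapacity": 450_000,
    "pmSamplesCapacity": 900,
    "pmSumCapacityRegulation": 30_000,
    "pmSamplesCapacityRegulation": 100,
    "pmTotalTimeCapacityRegulated": 100,
})
```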
IP IUB EVALUATION FLOWCHART
