Issue Date: September 2004
Abstract—This paper provides an insight into network performance management and quality of service (QoS) of matured second generation (2G) cellular systems (after the pre-/post-launch testing and optimization phase). It identifies the components of QoS and the available mechanisms to analyze and evaluate them. The paper also identifies important key performance indicators (KPIs) that need to be monitored and optimized and provides a way to collect and classify data for analysis. Finally, the most common QoS shortfalls and possible solutions are discussed.
By 1992, many European countries had operational second generation (2G) global system for mobile communication (GSM) systems, and GSM started to attract interest worldwide. GSM proved to be a major commercial success for system manufacturers and network operators, many of which enjoyed exponential growth until the end of the decade. The most valuable and limited resource of GSM is the available frequency spectrum, which limits the system capacity. The successful take-up of GSM services led to continuous development of sophisticated algorithms to maximize the system capacity. This caused a substantial technological evolution of GSM, with annual (and often biannual) releases of new functionality, which have increased the complexity of the system. The evolution of the Ericsson™ “locating” algorithm, which is very capacity-efficient but quite complex, with several operators unable to fully exploit the benefits of this functionality, is an example. Underutilization of available functionality, coupled with an exponential increase in subscriber numbers, resulted in many operators overdimensioning their base station subsystems (BSSs) with continuous aggressive deployment of new base stations. Thus, constant change and evolution of GSM networks have necessitated the continuous optimization of the offered quality of service (QoS).
Many publications on GSM describe the system, its architecture, and its evolution. However, limited sources document QoS, network performance management, and optimization. Many European operators currently enjoy very good network performance, and the industry has developed GSM optimization expertise (mainly through trial and error), but this expertise is not fully documented. There is usually more than one solution to a problem, which (unlike for design or site acquisition) makes it difficult to proceduralize optimization techniques and problem solutions. Engineers need to be open-minded, with good analytical skills and good understanding of the overall system and its individual components. Performance management and QoS optimization are subjects that cannot be fully taught. Expertise must be gained through trial and error, in an attempt to maintain optimum and constant QoS offered by dynamic and ever-changing GSM networks. This paper focuses on 2G QoS, as well as the advantages and disadvantages of each mechanism available to monitor, analyze, and improve it. The paper also describes the most common QoS shortfalls and provides improvement recommendations, which serve as a useful reference in performance analysis and optimization for specific projects.
WHAT IS QUALITY OF SERVICE?
Overall QoS for 2G, 2.5G, and 3G systems comprises three important components, all of which need to be constantly monitored and optimized as networks change in response to increasing coverage and capacity demands:
• Accessibility – getting on the system
• Retainability – staying on the system
• Connection quality – having a good service experience while using the system
© 2004 Bechtel Corporation. All rights reserved.
ABBREVIATIONS, ACRONYMS, AND TERMS
2G       second-generation
3G       third-generation
BCCH     broadcast control channel
BH       busy hour
BL       both links
BLR      block error rate
BSC      base station controller
BSIC     base station identity code
BSS      base station subsystem
BTS      base transceiver station
CCSR     call completion success rate
CFR      congestion failure rate
CI       cell identification
CRH      cell reselect hysteresis
CRO      cell reselect offset
CS       circuit switched
CTR      cell traffic recording
DCR      dropped call rate
DL       downlink
DTCHR    dropped traffic channel rate
DXC      digital cross connect
EIR      equipment identity register
GPRS     general packet radio service
GSM      global system for mobile communication
HSN      hopping sequence number
HSR      handover success rate
KPI      key performance indicator
LAC      location area code
LAPD     link access protocol on the D-channel
MHT      mean holding time
MSC      mobile switching center
OFTEL    Office of Telecommunications
PS       packet switched
QoS      quality of service
RF       radio frequency
RTT      roundtrip time
SDCCH    standalone dedicated control channel
SDCCHSR  SDCCH success rate
SMS      short message service
TA       timing advance
TBF      temporary block flow
TCH      traffic channel
TMA      tower-mounted amplifier
QUALITY OF SERVICE EVALUATION
The three mechanisms available to monitor, analyze, and evaluate QoS and take corrective actions are customer complaints, drive tests, and network statistics, all three of which are described below. Each mechanism has certain advantages and disadvantages, usually with conflicting priorities for limited optimization resources.
Customer Complaints
Advantages
• Real problems experienced by customers using the service
• Decision-forming/influential
Disadvantages
• Subjective
• Often vague with little supporting data
• Often received too late to react to the situation
• Require filtering by customer service before being handled by the engineering department
Drive Tests
Advantages
• Real calls
• Cause of failure can be identified
• Good for benchmarking
• Good for network pre-launch tuning (startups and new deployment projects)
Disadvantages
• Low volumes/statistically insignificant
• One terminal type
• Only ground level and in-car service
• Predetermined routes, calling patterns only
• Labor-intensive analysis
Network Statistics
Advantages
• All calls can be monitored
• Trends can be measured, by specific geographical areas of interest or for the entire network
• Trends are stable
Disadvantages
• Indicate problems but not their causes or solutions
• Do not differentiate customer value
Bechtel Telecommunications Technical Journal
September 2004 • Volume 2, Number 2
Established GSM operators use clearly defined network QoS key performance indicators (KPIs) with target thresholds to be achieved. The KPI thresholds are usually revised once a year, and new goals are set as the business priorities change. Network performance management and optimization activities ensure that QoS targets are met. For underperforming areas (sections of the network failing the KPI thresholds), optimization projects are initiated. Using all available methods, these projects fully analyze the performance of the area to understand the problems and take corrective actions. In such optimization projects, a combination of customer complaints, drive tests, and network statistics is used. Usually, statistical analysis and customer complaints are used to identify problems, while drive tests are used to verify them and/or the solution(s).
In the United Kingdom, a governing body, the Office of Telecommunications (OFTEL), uses CCSR from drive tests to declare the best network for QoS. Every 6 months, all network operators make approximately 22,000 calls while driving 305 pre-defined routes with clearly defined call patterns. At the end of the cycle, the operators submit a summary of the results and all drive-test files to OFTEL [1]. However, drive tests alone cannot be relied on to provide insight into the offered service. Drive tests can only provide an indicator of QoS for traffic that is highly mobile and at ground level. A large proportion of traffic offered via mature networks is static and often originates at higher-than-ground levels. In several mature European networks there is, on average, only one handover per call, which indicates the static nature of traffic. This makes statistics the most useful mechanism for identifying QoS shortfalls. However, experience is required in recognizing problem trends, identifying the causes, and taking corrective actions. Nevertheless, using statistical analysis properly and to the fullest extent possible can significantly improve QoS. This, in turn, requires good knowledge of the system, analytical skills, and experience in network performance management and optimization.
WHAT NEEDS TO BE MONITORED AND OPTIMIZED?
The trends of several KPIs must be closely monitored. A summary of the most important KPIs that can have an impact on the offered QoS follows.
Circuit Switched (CS) – Voice
• DCR: The dropped call rate (DCR) provides the customer-perceived dropout performance. It is calculated over an area of the entire network or a geographical area and not on a per-cell basis, because a call cannot be statistically related to just one cell, due to handovers.
• CCSR: The call completion success rate (CCSR) can be derived either from network statistics or from drive test statistics. It is a good method to use to evaluate the network accessibility and retainability as perceived by the customers. It takes into account the fact that all failures are either drops or unsuccessful call set-ups. The total number of failures is divided by the total number of call attempts.
• Minute-Erlang/Drop: This KPI indicates the average time between dropped calls. It is a division of traffic expressed in minute-Erlangs divided by the total drops and is inversely proportional to DCR. It is a good way to evaluate the effectiveness of optimization activities because it takes into account the carried traffic and is more sensitive to changes than DCR.
• DTCHR: The dropped traffic channel rate (DTCHR) indicates the drops at the cell level. It is used for engineering purposes only (and not for reporting), to identify cells with high drops. Optimizing these cells improves DCR and CCSR.
• CFR: The congestion failure rate (CFR) indicates the failure rate of assignments due to congestion and can be used on a cell basis for engineering, planning, and troubleshooting purposes and on an area basis to provide a measure of the customer-perceived traffic congestion. GSM operators have developed sophisticated CFR formulas to account for the effects of features such as directed retry and cell load-sharing when measuring customer-experienced congestion.
• SDCCHSR: The standalone dedicated control channel success rate (SDCCHSR) indicates the rate of successful air interface signaling channel assignments and is used for engineering purposes only, to optimize cells with high failure rate. Optimizing such cells improves CCSR.
• HSR: Handover success rate (HSR) indicates the success of handovers. Minimizing handover failures improves DCR.
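The circuit-switched KPI definitions above can be turned into a few lines of arithmetic. The following is a minimal sketch, not an operator's or vendor's formula: the counter names in `AreaCounters` are hypothetical, CCSR is reported here as one minus the failure ratio described in the text, and the DCR denominator (successfully established calls) is an assumption, since the paper does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AreaCounters:
    """Hypothetical counter totals for one area and one observation period."""
    call_attempts: int            # call set-up attempts
    setup_failures: int           # unsuccessful call set-ups
    dropped_calls: int            # established calls that dropped
    traffic_erlang_hours: float   # carried TCH traffic
    handover_attempts: int
    handover_successes: int


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def ccsr(c: AreaCounters) -> Optional[float]:
    # Failures are drops plus unsuccessful set-ups, divided by call attempts.
    # Reporting the complement as the success rate is an assumption.
    failure_ratio = ratio(c.setup_failures + c.dropped_calls, c.call_attempts)
    return None if failure_ratio is None else 1.0 - failure_ratio


def dcr(c: AreaCounters) -> Optional[float]:
    # Assumed definition: drops relative to successfully established calls.
    return ratio(c.dropped_calls, c.call_attempts - c.setup_failures)


def minute_erlang_per_drop(c: AreaCounters) -> Optional[float]:
    # Traffic expressed in minute-Erlangs divided by the total drops.
    return ratio(c.traffic_erlang_hours * 60.0, c.dropped_calls)


def hsr(c: AreaCounters) -> Optional[float]:
    # Share of handover attempts that completed successfully.
    return ratio(c.handover_successes, c.handover_attempts)


# Illustrative numbers only; they are not taken from any real network.
area = AreaCounters(call_attempts=10_000, setup_failures=100, dropped_calls=90,
                    traffic_erlang_hours=300.0, handover_attempts=8_000,
                    handover_successes=7_800)
print(ccsr(area), dcr(area), minute_erlang_per_drop(area), hsr(area))
```

Returning `None` for an empty denominator keeps idle periods from showing up as a perfect or a failed KPI. Because Minute-Erlang/Drop divides carried traffic by drops, halving the drops at constant traffic doubles its value, which is why the text describes it as inversely proportional to DCR.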
Packet Switched (PS) – Data (GPRS)
• Cell Throughput: Cell throughput is an end-to-end KPI used at the cell and network levels to indicate data throughput.
• TBF Multiplexing: Temporary block flow (TBF) multiplexing indicates the number of users per time slot usage of general packet radio service (GPRS) resources. A high number of users per time slot decreases the data throughput.
• RTT: Decreasing roundtrip time (RTT) delay increases throughput.
ORGANIZING STATISTICAL DATA PRIOR TO ANALYSIS
As shown in Figure 1, performance and configuration data are collected in the switching nodes and usually aggregated into a statistics database and a configuration database, respectively. The statistics database is divided into “object types,” which correspond to different equipment or system function blocks. Each object type contains several event counters. Proposed methods for organizing and classifying the available data follow.
Figure 1. Collection of Performance and Configuration Data [diagram labels: Performance Evaluation; Performance Database; Configuration Database; Voicemail; DXC Crossconnect; X.25 Network; SMS; MSC; BSC; EIR]
Classification by Network Level
As shown in Table 1, the monitoring process and statistical analysis take place at different levels:
• Network-wide: The entire network (to provide a “global” overview)
• Geographical Area or Region: All cells belonging to specified geographical regions (to obtain and compare results for performance in different areas)
• City: All cells belonging to specified major cities (to obtain and compare results for performance in different cities)
• BSC: All cells belonging to certain switching nodes (to obtain switching node-related statistics and compare performance of different nodes)
• Cell: Individual cells as well as neighboring cell relationships
Classification by Resource Type or Event
Statistics can be classified by resource type or the events they refer to. Both user-defined formulas and “raw” counters are grouped into one of the following categories:
• Random access channel measurements
• Standalone dedicated control channel (SDCCH) measurements
• TCH measurements
• Idle channel measurements
• Handover measurements
• Subscriber disconnection measurements
• Link access protocol on the D-channel (LAPD) signaling measurements
• BSC measurements
Observation Time Intervals
When manipulating statistical data, it is important to define appropriate time frames within which the data will be gathered and processed. The basic time unit for data collection is 15 minutes, i.e., the base station controller (BSC) uploads the entire object counter data to the statistics database every 15 minutes. The following observation time intervals are suggested for statistical evaluation:
• Online: Online statistics provide almost real-time monitoring of the network. Statistics can be obtained directly from the switching node, where outputs are available every 15 minutes, if this is necessary.
• Hour: Hourly statistics give a detailed picture of network performance. They are useful to help spot temporary problems and identify trends.
• Peak Hour: Peak hour statistics are of great significance, because they correspond to the time of heavy utilization of network resources. In a way, they provide the “worst-case” scenario.
• Day: Daily statistics are introduced to provide a way of averaging temporary fluctuations of hourly data. Problems can be identified and corrective actions triggered with more confidence. Trends with daily values are also used for reporting and benchmarking.
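The observation intervals above can be derived from the 15-minute counter uploads by simple aggregation. The sketch below is illustrative only: the sample layout `(timestamp, attempts, failures)` is a hypothetical stand-in for a real counter export, the peak hour is taken as the hour with the most attempts (an assumed proxy for heaviest utilization), and the daily figure is computed from summed counters rather than by averaging hourly percentages.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_hourly(samples):
    """Sum 15-minute (timestamp, attempts, failures) samples into hourly totals."""
    hourly = defaultdict(lambda: [0, 0])
    for timestamp, attempts, failures in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        hourly[hour][0] += attempts
        hourly[hour][1] += failures
    return {hour: tuple(totals) for hour, totals in sorted(hourly.items())}


def failure_rate(attempts, failures):
    """Failures divided by attempts, or None when there were no attempts."""
    return failures / attempts if attempts else None


def peak_hour(hourly):
    """Hour with the most attempts (assumed proxy for heaviest utilization)."""
    return max(hourly, key=lambda hour: hourly[hour][0])


def daily_totals(hourly):
    """Sum all hourly counters passed in; the caller supplies one day of data."""
    attempts = sum(a for a, _ in hourly.values())
    failures = sum(f for _, f in hourly.values())
    return attempts, failures


# Illustrative 15-minute samples for a single cell; the values are made up.
samples = [
    (datetime(2004, 9, 1, 9, 0), 100, 2),
    (datetime(2004, 9, 1, 9, 15), 120, 3),
    (datetime(2004, 9, 1, 9, 30), 110, 1),
    (datetime(2004, 9, 1, 9, 45), 90, 4),
    (datetime(2004, 9, 1, 10, 0), 50, 0),
    (datetime(2004, 9, 1, 10, 15), 60, 1),
]
hourly = aggregate_hourly(samples)
busy = peak_hour(hourly)
print(busy, failure_rate(*hourly[busy]))
print(failure_rate(*daily_totals(hourly)))
```

Summing raw counters before dividing weights each hour by its traffic, so a quiet hour with one failed attempt does not distort the daily value. The same grouping idea extends to the network levels listed earlier by keying on BSC, city, or region instead of the hour.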
Table 1. Examples of QoS KPIs and Target Thresholds
Level             KPI                               Performance Target Range   QoS Attributes
Entire Network    CCSR from Drive Tests             …                          Accessibility/Retainability
                  CCSR Calculated                   …                          Accessibility/Retainability
                  Minute-Erlang/Drop                100–250 min                Retainability
                  DCR                               …                          Retainability
                  Half-Rate Traffic                 0–30%                      Speech Quality
                  Silence/One-Way Transmission      0–1%                       Speech Quality
                  SDCCHSR                           …                          Accessibility
Area or Region    CCSR from Drive Tests             …                          Accessibility/Retainability
                  DCR                               …                          Retainability
                  Minute-Erlang/Drop                100–250 min                Retainability
                  BH CFR                            1–4%                       Accessibility
                  SDCCHSR                           98–99%                     Accessibility
Major City        CCSR from Drive Tests             97–99.5%                   Accessibility/Retainability
                  Half-Rate Traffic                 0–20%                      Speech Quality
                  Silence/One-Way Transmission      0–1%                       Speech Quality
                  Minute-Erlang/Drop                100–250 min                Retainability
                  SDCCHSR                           …                          Accessibility
                  DCR                               …                          Retainability
                  BH CFR                            …                          Accessibility
All BSCs          DCR                               …                          Retainability
                  BH CFR                            …                          Accessibility/Retainability
                  Silence/One-Way Transmission      0–1%                       Speech Quality
                  SDCCHSR                           98.5–99.9%                 Accessibility
All Cells         % of Cells with Dropped TCH >2%   5–10%                      Retainability
                  % of Cells with BH CFR >10%       5–15%                      Accessibility
                  % of Cells with HSR <95%          2–10%                      Accessibility/Retainability
                  % of Cells with SDCCHSR <95%      1–5%                       Accessibility
CAUSES OF CERTAIN QoS SHORTFALLS AND POSSIBLE SOLUTIONS
Although the most common QoS shortfalls and suggested possible higher level solutions are discussed, a detailed description of the functionality to be fine-tuned and parameter settings is beyond the scope of this paper. Because coverage, spectrum utilization, and traffic load differ from one area to another and from one network to another, engineers must determine optimized parameter values for a specific area of a network.
Accessibility Optimization
SDCCH Congestion
• Causes: SDCCH availability, high number of location updates, high level of short message service (SMS) traffic, high number of call set-up bids
• Action:
– Check historical statistics of SDCCH availability. Historical data can show if certain time slots are constantly idle. In some systems, time slots may go into sleep mode. If this occurs over a long period of time and especially during the busy hour (BH), a base transceiver station (BTS) restart and retest validation may be required.
– Check for high number of location updates, call set-ups, and SMS traffic. Increasing the cell reselect hysteresis (CRH) will delay GPRS reselection. It might be wise to expand SDCCH resources, if possible. This can be done at the expense of one TCH, which can be converted to eight SDCCHs. It is advisable to aim for no SDCCH congestion at all times.
TCH Congestion
• Causes: TCH availability, missing neighbors, missing assignments in neighbor list, traffic distribution
• Action:
– Check TCH availability. Real-time data can show if certain time slots are constantly idle. TCH time slots may go into sleep mode. If this occurs over a long period of time and especially during the BH, a BTS restart and retest validation may be required.
– Check for cell mean holding time (MHT) and compare it with that of the surrounding cells in the area. Greater MHT may be due to missing or incorrect neighbor cell definitions. If any neighbor cells are not on the air, the serving cell may suffer TCH congestion and show increased MHT. There will be an increase in immediate PS assignment rejections and TBF multiplexing, and a reduction in GPRS throughput. Check the radio plan for missing neighbor cell assignments.
– Redistribute traffic among cells within the same layer, using early handover from a congested cell to another cell. This can be achieved with optimum use of capacity-efficient features such as directed retry, cell load-sharing (traffic reason handover or changing the handover hysteresis parameters), and handover offset between two neighbor cells.
– Use traffic management (load shedding) techniques that force traffic originating near the cell border to the surrounding cells. This can be accomplished by adjusting handover hysteresis and handover offset.
– In a hierarchical cell structure, distribute traffic to lower or higher cell levels as required, using layer threshold and layer threshold hysteresis.
Note: The traffic distribution actions mentioned above will improve GPRS performance. They will reduce TBF multiplexing and the number of PS immediate assignment rejections and will also increase GPRS throughput. If the GPRS user is in a high interference area, there will be a high value for block error rate (BLR) and poor throughput.
Retainability and Quality Optimization
Deterioration of Performance with Sudden Increase in the Number of TCH Drops
• Causes: Hardware problem, handover problem
• Action:
– Check historical statistics of TCH availability. Check if there are any alarms on the cell or the transceiver or any of the TCH time slots.
– Check historical handover performance for the cell. If some external neighbor cells (belonging to a different BSC or mobile switching center [MSC]) show no successful handovers, but only attempts, missing or incorrect handover definitions on the parent BSC or MSC could be the reason.
– Check whether any neighbor cells have been deleted or whether any are not on the air.
TCH Drops due to Downlink Signal Quality
• Causes: Downlink interference, coverage
• Action:
– Identify cell pairs that have a high number of handover attempts caused by downlink (DL) quality. This will help to identify the approximate area where mobiles experience DL interference. Check the frequency plan to see what frequencies are used in these areas and schedule a frequency retune.
– When statistics show that drops are due to downlink quality, the cause may be poor coverage. This is more common in hierarchical cell structures where traffic is forced down to lower layers using aggressive layer thresholds of –90 dBm or lower. Change the layer threshold to initiate earlier handovers to higher layers. Also modify the imperative (urgent) handover parameters to initiate earlier urgent handovers to higher layers due to bad quality. For cells on the same layer, use hysteresis and hysteresis offset to initiate early handover and modify the imperative handover parameters to also initiate earlier handover due to bad quality.
– Check how and where the serving cell frequencies are reused to identify the interfering frequencies and plan a frequency change. This is valid for baseband frequency hopping systems. For synthesizer hopping systems, change the hopping sequence number (HSN).
TCH Drops due to Uplink Signal Quality
• Causes: Uplink interference, antenna feeder system, coverage
• Action:
– Use cell traffic recording (CTR) and check the uplink quality for certain timing advance (TA) values.
– If the cell serves with a high TA value, make the cell less attractive in idle mode, using cell reselect offset (CRO).
– There could be a problem in the antenna or feeder systems. Investigate for any alarms on the site. Check the antenna feeder system.
TCH Drops due to Uplink Signal Strength
• Causes: Coverage, hardware faults
• Action:
– Check for any missing neighbor cell relations or to see if any defined neighbors are out of service. Investigate for any alarms on the site.
– Run CTR for the affected cell and check TA values. If TA values are high, restrict the coverage by making the cell less attractive in idle mode with CRO and in dedicated mode by initiating early handover with hysteresis and hysteresis offset.
– Consider increasing antenna downtilt to reduce the service area of the cell. This can be done if there is coverage overlap so that a coverage hole is not created.
– Consider installing a tower-mounted amplifier (TMA) to boost the uplink and see if there is room for a TMA installation in the tower.
– There could be a problem in the antenna or feeder systems. Investigate for any alarms on the site. Check the feeder system.
TCH Drops due to Both Links (BL) Signal Strength and due to Sudden Loss
• Causes: Coverage, hardware faults
• Action:
– This type of problem occurs in areas where a cell serves a tube or tunnel. Mobiles traveling in certain directions will run out of coverage and drop out. If any cell is a better server than this cell, then initiate early handover using hysteresis and hysteresis offset. Changing layer threshold will help when the cells are on different hierarchical layers.
– Run CTR for the affected cell and check TA values. If TA values are high, restrict the coverage by making the cell less attractive in idle mode with CRO and in dedicated mode by initiating early handover with hysteresis and hysteresis offset.
– Check downtilt and calculate if the existing downtilt is correct for the intended coverage area. Increase downtilt if necessary.
– In a duplexed transmit/receive situation, a problem could exist in the antenna or feeder systems. To confirm this, run CTR for this cell. Check the CTR file for both uplink and downlink signal strength. Investigate for any alarms on the site. Initiate damage assessment on coaxial and antenna systems.
Handover Performance Optimization
Handover due to Degraded Signal Quality
• Causes: Downlink interference, uplink interference, coverage, antenna feeder system
• Action:
– Identify cell pairs that have a high number of handover attempts due to degraded signal quality. Check to see how and where the serving cell frequencies are reused to identify the interfering frequencies and plan a frequency change. This is valid for baseband frequency hopping systems. For synthesizer hopping systems, change the HSN.
– When statistics show that drops are due to downlink quality, the cause may be poor coverage.
– In hierarchical cell structures, check the layer and layer threshold for the cell. In such cases, if the affected cell is in a lower layer and if a cell from a higher layer is stronger in CTR, make early handover to the higher layer using layer threshold. If the cells are on the same layer, change the value of hysteresis and hysteresis offset to initiate earlier handover.
– There could be a problem in the antenna or feeder systems. Investigate for any alarms on the site. Check the feeder and antenna systems for proper operation.
Handover Attempts but no Successful Handover Assignments
• Causes: Co-base station identity code/broadcast control channel (co-BSIC/BCCH) planning error, missing neighbor definition on the BSC and/or MSC
• Action:
– Co-BSIC/BCCH planning errors occur when a cell has two neighbors with the same BSIC and the same BCCH. Mobiles report measurements of the surrounding cells with their BSICs and BCCHs, and the BSC uses this combination to identify the cell
identification (CI) of these cells and might direct the handover to the wrong cell. This can result in many dropped calls in the area. This can be identified from many handover attempts with no successful assignments. Change the BSIC of one of the neighbor cells.
– Check handover performance if there are attempts but no successful assignments for some external neighbor definitions (neighbors on a different BSC and/or MSC). This is due to incorrectly defined external cells, i.e., the external neighbor cell has been incorrectly defined as a neighbor to the serving cell’s BSC with either wrong location area code (LAC) or BSIC or BCCH.
CONCLUSIONS
Operator competency in managing performance and optimizing QoS is not easily taught; rather, it is developed, mainly through trial and error. There are three main mechanisms for evaluating and optimizing QoS: customer complaints, drive test analysis, and statistical analysis. These mechanisms have advantages and disadvantages and can be utilized in parallel in large optimization projects. Customer complaints can be objective but are also misleading, and this mechanism is reactive. Drive tests are good for benchmarking and ideal for verifying applied optimization solutions. Statistical analysis can identify trends but does not provide solutions. However, it can be a powerful tool for an experienced engineer with good analytical skills to use to identify problems and apply optimization solutions.
The plethora of statistical data generated in the network switches must be organized before analysis. For effective network performance and evaluation, the monitoring process and statistical analysis must take place at different levels: network-wide, by geographical area or region, by city, at the BSC level, and at the cell level. Optimization solutions vary in different areas and networks but, as discussed in this paper, a generic approach can be developed to monitor and optimize the QoS as networks continuously change in response to changes in offered traffic and business priorities.
TRADEMARK
Ericsson is a trademark or registered trademark of Telefonaktiebolaget LM Ericsson.
REFERENCES
[1] Office of Telecommunications (OFTEL), “Mobile Network Operators’ Call Success Rate Surveys – May 2003” (http://www.ofcom.org.uk/static/archive/oftel/publications/research/2003/call_survey/).
ADDITIONAL READING
• “Radio Network Parameters and Cell Design Data” – Ericsson CME20 Documentation.
• “Counters in the Measurement Database for Traffic and Event Measurements in Radio Network” – Ericsson Function Specification.
• “BSC STS User Formulas” – Ericsson CME201 R9.
• Nokia BSS S9.
BIOGRAPHY
Michael Pipikakis is a network planning and wireless technology manager for Bechtel’s Europe, Africa, Middle East, and Southwest Asia Region. He supports ongoing and new projects and new business development, writes guidelines and procedures for mobile network design, planning, and optimization, and participates in technology forums. Michael is a mobile networks specialist with 17 years of experience in the telecommunications industry, including more than 11 years in RF planning, design, optimization, and management of the end-to-end performance of cellular networks. Before joining Bechtel, Michael held various management positions in the Vodafone Group’s radio system design and optimization department and development department over a 10-year period, worked for Cellnet UK and GEC Marconi UK, and was a telecommunications operator in the Greek Navy. From 1999 to 2003, he was a member of the Vodafone Global Forum for UMTS design harmonization. Michael has a BEng Honors in Electronics Engineering with Computing and Business from Kingston University in Surrey, England, and an HND in Radio Communications Systems Design from the Polytechnic School of Athens, Greece. He is a member of the Institution of Electrical Engineers.