
STC CU/ECU alarms handling Procedure V1.0 NSN CM Team NSN MS Use Khalil Al Ngashy

STC CU/ECU alarms handling procedure


Table of contents:
1) Introduction
2) Issue Description
3) Technical Description
3.1) Main CU/ECU alarms description
4) Clearance Actions
5) Recommendations

1) Introduction
The current document is intended to provide information about CU/ECU alarm handling for the STC network in order to improve the current behavior of alarms and HW issues in the CU/ECU module. It is based on the standard NSN operation documentation and should not be used as a replacement for such documentation but as a quick reference and support for the field engineers.

2) Issue description
It has been found that the number of alarms received from the CU/ECU modules is higher than for equivalent modules of other vendors within the STC network. It is necessary to review the procedure being followed to clear these alarms and to study the current behavior and alarm clearance process.

3) Technical description
The high system functionality of the base station system is achieved by means of system-integrated routine tests. These routine tests continually check the correct functioning of the base station subsystems including the BTSEs. In most cases, the results of these routine tests are sufficient to localize the fault and clear it immediately at the BTSE. The modular design of the BTSE allows STC to clear a large percentage of faults in the system by replacing a defective module. This maintenance concept guarantees a simple and fast fault clearance and leads to high operational efficiency. Sometimes, however, it may happen that faults do not result from defective modules, but from interface problems in general (for example interrupted cables). In this case, special troubleshooting procedures for interfaces are provided. For ex-Siemens BR equipment, document number A50016-G5100-A001-0776K5 is available and all personnel working with ex-Siemens GSM radio equipment should be familiar with it; please refer to this documentation for further technical details.

3.1) Main CU/ECU alarms description
The more common service-affecting alarms in the STC network are listed. The following items provide the error description which has to be taken into account when troubleshooting any CU/ECU alarm.

3.1.1) The RF part of the CU/ECU has problems (34835)
Error description: The power stage of the CU/ECU detected one of the following errors:
• Over temperature of the power stage
• Loop1 of Power stage does not close
• Loop2 of Power stage does not close
• FlexCU only: Over current alarm
Other words: The other words information provides additional information about every single alarm generated by ex-Siemens BSS. It is mandatory to analyze this information to define the appropriate action to be taken (please refer to annex C for details). For the case of alarm number 34835 "RF part of the CU/ECU has problems" the other words information is explained as follows:
1718(D):
• 17=Bit field showing the errors of the power stage. If bit <> 0 -> error
  bit0: over temperature
  bit1: unused
  bit2: loop1 alarm (reduced output power)
  bit3: loop2 alarm (excessive output power)
• 18=if <> 0: over current alarm (on FlexCU only)
As per the above information, for this alarm octet 17 can take the hexadecimal values 1 (over temperature), 4 (reduced output power) or 8 (excessive output power).

3.1.2) Loss of Board (22356)
Error description: Board lost. Communication supervision detects a loss of board. In case of the board loss alarm caused by ACT all associated alarms will remain in the same status, e.g. if BCOM is connected and fails no alarm would be sent.
Other words: In this case the other words information does not provide any hint to the field engineer.

3.1.3) RF power reflected warning (34910)
Error description: The power amplifier detects reflected RF power. The error can be caused by:
1. Cable problems in the TX/RX path, especially a defective TX cable from the CU/ECU to the combiner or a defective antenna cable.
2. A defective CU/ECU or FlexCU
As a consequence, the nominal output power is reduced for safety reasons:
• If TX power reduction = 0 dB, the output power is reduced by 6 dB.
• If TX power reduction = 2 dB, the output power is reduced by 4 dB.
• If TX power reduction = 4 dB, the output power is reduced by 2 dB.
• If TX power reduction > 4 dB, the power is not reduced.
Further consequence: If after 20 seconds the error condition is removed, the alarm is cleared and the output power is set back to its original value. If the alarm persists after 20 sec, the error condition persists and the alarm 34892 is raised.
Other words: Octet 17 = A value unequal zero indicates a VSWR problem

3.1.4) Device driver initialization failed (94)
Error description: The initialization function of a device driver returned a non-zero error code to the OS, i.e. the initialization failed. The device driver can be identified by the address of its initialization function; its name can be found in the linker map file.
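The other words of alarm 34835 (section 3.1.1) and the safety reduction of alarm 34910 (section 3.1.3) can be checked quickly with a small helper. The following is only a sketch under stated assumptions: the function names are hypothetical, the octet values are assumed to be already read from the LMT or RC as integers, and the pairing of TX power reduction settings with reduction steps follows the list in 3.1.3.

```python
from typing import List

# Bit positions of octet 17 for alarm 34835, as listed in section 3.1.1.
# Bit 1 is documented as unused and is therefore not decoded.
POWER_STAGE_BITS = {
    0: "over temperature",
    2: "loop1 alarm (reduced output power)",
    3: "loop2 alarm (excessive output power)",
}


def decode_rf_part_alarm(octet17: int, octet18: int = 0) -> List[str]:
    """Return the power stage errors flagged in octets 17 and 18 (hypothetical helper)."""
    errors = [name for bit, name in POWER_STAGE_BITS.items() if octet17 & (1 << bit)]
    if octet18 != 0:
        errors.append("over current alarm (FlexCU only)")
    return errors


def safety_reduction_db(tx_power_reduction_db: int) -> int:
    """Extra output power reduction applied while alarm 34910 is active."""
    if tx_power_reduction_db > 4:
        return 0  # the power is not reduced
    table = {0: 6, 2: 4, 4: 2}
    if tx_power_reduction_db not in table:
        # Only the settings 0, 2, 4 and > 4 dB are described in section 3.1.3.
        raise ValueError("TX power reduction value not covered by the procedure")
    return table[tx_power_reduction_db]


if __name__ == "__main__":
    print(decode_rf_part_alarm(0x01))
    print(decode_rf_part_alarm(0x0C, 0x01))
    print(safety_reduction_db(2))
```

An octet 17 value with more than one bit set is decoded into several entries, which is useful when the alarm report does not match one of the single values 1, 4 or 8.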

Other words: In this case the other words information does not provide any hint to the field engineer.

3.1.5) RF power reflected into power stage (34892)
Error description: The power amplifier detected reflected RF power. So either the connection between the power amplifier and the combining equipment is not correct or the power amplifier itself does not work correctly.
Other words: In this case the other words information does not provide any hint to the field engineer.

3.1.6) No startup-request after reset (24603)
Error description: Reset-supervision failed: The core did not receive a startup-request message from the peripheral board after it was brought into boot-phase by reset. This can occur when Booter falsely starts test SW instead of system SW because of a wrong transmission of test mode bits via CC Link.
Other words: Not applicable

3.1.7) Hardware error (4110)
Error description: A hardware error was detected on one of the following BTSE boards: CCTRL, LI, ALCO, CCLK, CORE, BBSIG/BBSIG44, TRXA, TRXD/TRXD2(TPU), CU/ECU
Possible causes:
1) A hardware test procedure has detected a chip error
2) BBSIG/BBSIG44: Timeout at loading BB1(/BB2) with Boot-SW or Load-SW
   TRXD: Timeout at loading DR1/DR2 with Boot-SW or Load-SW
   TRXD2: Timeout at loading LEA with Load-SW
Because of HW tolerances this error can occur during startup of TPU (TPU1 HW version) with the following additional infos: HW test number = 0b, HW test result = 01. If this FER from object CU/ECU occurs sporadically, it can be ignored. The TPU is afterwards recovering successfully without any problem!
NOTE: This error replaces the former error CBM_AEID_HW_ERROR beginning with BR3.0. It provides more detailed error information.
NOTE2: For EdgeCU/FlexCU in BR8.0 this alarm was enhanced with more additional information about failures on HIT DSPs.
Other words: Not applicable

3.1.8) Hardware Error on CU/ECU detected by BIC (20603)
Error description: Power Supply Unit (PSU) does not work correctly
Other words: Not applicable

3.1.9) Any of the PLLs shows lock problems (34836)
Error description: Any of the PLLs on the SIPRO has lock problems.
Other words: Not applicable

3.1.10) Test Result failed (28675)
Error description: The initiated PerformTest or AutomaticTest does not pass. It fails, see corresponding Test Report.
Other words: Not applicable

3.1.11) No phys. Recovery after BTSE restart (30784)
Error description: No physical recovery is performed for a faulty processor board after a BTSE restart (the board remains in state DISABLED, NOT_INSTALLED). With this alarm the board is set to DISABLED, FAILED in order to indicate on LMT and RC that there is a problem with this board (note that the original alarm is no longer available after a BTSE restart).
Other words: Not applicable

3.1.12) Error in Flash or mismatch with database (34832)
Error description: There are two possibilities for this error: Either the cell allocation number in the database does not match the actual hardware, or the ramping FLASH could not be read completely.

Other words: Not applicable

3.1.13) Inter board communication timeout (24581)
Error description: A task located on the core has not received the expected message from a peripheral board.
Other words: Not applicable

3.1.14) Local AE-queue full (20543)
Error description: The local AE-Queueing buffer is full.
Other words: Not applicable

3.1.15) Error in downlink message from COP/HIT (34841)
Error description: U1-BIC (baseband information controller) received a message from COP (coding/decoding processor on CU/ECU/GCU) / HIT (highly integrated transceiver, DSP on ECU/FlexCU) in which either:
• The checksum is not correct.
• The packet size is wrong.
• The number of received data packet(s) is wrong.
• The time slot number is wrong.
• The training sequence code is illegal.
• The downlink burst type is illegal.
• There is a collision of downlink data on FlexCU.
The error can be caused by:
• Cable problems on CC-Link
• A defective COBA
• A defective CU (GCU/ECU/FlexCU)
Other words: Not applicable

3.1.16) Critical SELIC problem (8260)
Error description: A SELIC-ASIC indicates a critical problem. The problem does not allow any further processing of the SELIC-ASIC. The following critical problems can occur:
* Illegal CC Link configuration
* RAM BIST failure
Other words: Not applicable

3.1.17) Diversity Receive Branch Failed (34837)
Error description: A diversity receive branch failed because of a bad signal receive level or because of a bad signal to noise ratio. The alarm changes the availability status of the CU/GCU/ECU/FCU to "degraded". The error can be located in the whole receive path, i.e.:
- at the antenna
- at the RF cabling
- in the combining equipment
- inside the CU/GCU/ECU/FCU
A CU/GCU/ECU/FCU test can give a hint whether the error is situated in the CU/GCU/ECU or outside of it. It is recommended to perform a CU/GCU/ECU/FCU test in order to locate the error. If the CU/GCU/ECU/FCU test fails the error is located inside the CU/GCU/ECU/FCU, otherwise the error is located outside. Cause for this alarm may also be a strong interfering signal, which disturbs the receiver. In this case the CU/GCU/ECU/FCU test should pass.
The CU/GCU/ECU/FCU leaves the availability status "degraded" if:
- both receivers have a good receive quality
- the supervision is switched off
- reception with diversity is switched off
- Call Processing is blocked, i.e. the CU/GCU/ECU/FCU is locked or a fault occurred on an object which is necessary for this TRX.
Other words:
1718(D):
• 17=(G)CU (Octet 1 = 0x6a): Shows the number of timeslots of the main receiver which do not receive properly.
  ECU/FCU (Octet 1 = 0x62): Alarm type discriminator: operating mode
  0: Invalid
  1: 2Rx
  2: 4Rx
  3: Switch Beam (not supported in BR8)
  4: 4RxTxDiv
  5: 0Rx
• 18=(G)CU (Octet 1 = 0x6a): The upper four bits show the number of timeslots of the main receiver which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the main receiver which do not receive properly due to bad signal strength.
  ECU/FCU (Octet 1 = 0x62): Processor ID of alarm originator
1920(D):
• 19=(G)CU (Octet 1 = 0x6a): This octet shows the number of timeslots of the diversity receiver which do not receive properly.
  ECU/FCU (Octet 1 = 0x62): Equalizer diversity configuration
• 20=(G)CU (Octet 1 = 0x6a): The upper four bits show the number of timeslots of the diversity receiver which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the diversity receiver which do not receive properly due to bad signal strength.
  ECU/FCU (Octet 1 = 0x62): Shows the number of timeslots of the 1st receive branch which do not receive properly.
2122(D):
• 21=ECU/FCU (Octet 1 = 0x62): The upper four bits show the number of timeslots of the 1st receive branch which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the 1st receive branch which do not receive properly due to bad signal strength.
• 22=ECU/FCU (Octet 1 = 0x62): Shows the number of timeslots of the 2nd receive branch which do not receive properly.
2324(D):
• 23=ECU/FCU (Octet 1 = 0x62): The upper four bits show the number of timeslots of the 2nd receive branch which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the 2nd receive branch which do not receive properly due to bad signal strength.
• 24=ECU/FCU (Octet 1 = 0x62): Shows the number of timeslots of the 3rd receive branch which do not receive properly.
2526(D):
• 25=ECU/FCU (Octet 1 = 0x62): The upper four bits show the number of timeslots of the 3rd receive branch which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the 3rd receive branch which do not receive properly due to bad signal strength.
• 26=ECU/FCU (Octet 1 = 0x62): Shows the number of timeslots of the 4th receive branch which do not receive properly.
2728(D):
• 27=ECU/FCU (Octet 1 = 0x62): The upper four bits show the number of timeslots of the 4th receive branch which do not receive properly due to a bad signal to noise ratio. The lower four bits show the number of timeslots of the 4th receive branch which do not receive properly due to bad signal strength.

3.1.18) Increased path loss difference (38956)
Error description: The absolute mean value of the path loss difference of the corresponding TRX (see additional info) is above the specified alarm threshold (RFL Alarm Threshold). This indicates possible hardware degradation of the BTS RF path. The path loss difference is represented as a signed integer number with 4 bytes length.
Positive values indicate higher UL path loss than DL path loss -> Degradation at the receiver equipment.
Negative values indicate higher DL path loss than UL path loss -> Degradation at the transmitter equipment.
The alarm contains some information about the responsible HW object. With this information all affected RF cabling, combiners, boosters, power amplifiers, etc. can be located by using the wiring data of the relevant cell. The reported TRX ID keeps its validity up to the next configuration change. Locking a BTS RFLoopBack scanner with active alarm conditions will cease these alarms, even if the error condition still exists.
Other words:
1718(D):
• 17=TRX ID, octet 1 (LSB)
• 18=TRX ID, octet 2
1920(D):
• 19=TRX ID, octet 3
• 20=TRX ID, octet 4 (MSB)
2122(D):
• 21=Measurement count, octet 1 (LSB)
• 22=Measurement count, octet 2
2324(D):
• 23=Measurement count, octet 3 (MSB)
• 24=call count, octet 1 (LSB)
2526(D):
• 25=call count, octet 2
• 26=call count, octet 3 (MSB)
2728(D):
• 27=mean value of path loss difference, octet 1 (LSB)
• 28=mean value of path loss difference, octet 2
2930(D):
• 29=mean value of path loss difference, octet 3
• 30=mean value of path loss difference, octet 4 (MSB)

4) Clearance actions
Most of the alarms can appear in the CU/ECU only when the traffic over that particular CU/ECU increases, e.g. the RF power reflected in the CU/ECU will depend on the traffic carried by this CU/ECU in case that particular CU/ECU is not carrying the BCCH. Due to this behavior it is mandatory to define within the O&M team the responsibility of detecting and tracking the actual faulty modules, since the initial test of a module not working properly can pass successfully during low traffic conditions and this can lead to the reuse of a faulty module in another site. It is also recommended to switch the BCCH to the CU/ECU that was generating the alarm after this is cleared.
Some of the alarms could also be caused by environmental issues in the site itself, e.g. over temperature, so it is also important to keep record and to track each site's behavior when it comes to faulty modules and alarms cleared in that particular site, to easily recognize any pattern that may show up.
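For alarm 34837 on ECU/FCU (section 3.1.17), octets 21, 23, 25 and 27 each pack two counters into one byte: the upper four bits count timeslots with a bad signal to noise ratio and the lower four bits count timeslots with bad signal strength. The sketch below shows one possible way to unpack them. The function names and the dictionary input (octet number to value) are assumptions made for illustration, not part of any NSN tool.

```python
from typing import Dict, List, Tuple

# Octet carrying the packed counters of each receive branch on ECU/FCU
# (Octet 1 = 0x62), taken from the other words list of alarm 34837.
ECU_BRANCH_OCTETS = {1: 21, 2: 23, 3: 25, 4: 27}


def split_nibbles(octet: int) -> Tuple[int, int]:
    """Return (bad S/N timeslots, bad signal strength timeslots) for one octet."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0..255")
    return octet >> 4, octet & 0x0F


def decode_ecu_branches(other_words: Dict[int, int]) -> Dict[int, Tuple[int, int]]:
    """Map branch number (1..4) to its (bad S/N, bad strength) timeslot counts."""
    return {
        branch: split_nibbles(other_words[octet])
        for branch, octet in ECU_BRANCH_OCTETS.items()
    }


def bad_snr_branches(other_words: Dict[int, int]) -> List[int]:
    """List the branches reporting at least one timeslot with bad S/N."""
    return [b for b, (snr, _) in decode_ecu_branches(other_words).items() if snr > 0]


if __name__ == "__main__":
    # Hypothetical other words: branch 1 and 2 with bad S/N, branch 4 with weak signal.
    words = {21: 0x30, 23: 0x21, 25: 0x00, 27: 0x05}
    print(decode_ecu_branches(words))
    print(bad_snr_branches(words))
```

A bad signal to noise ratio on more than one branch is the pattern that the clearance procedure for this alarm treats as a possible interference problem, so listing those branches separately is the main point of the last helper.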

4.1.1) The RF part of the CU/ECU has problems
The initial step to clear this alarm is to perform a test on the module; this can be done remotely from the RC or the BSC. If the test fails, the module can be deleted and recreated to force a restart. If the module stays disabled, the O&M engineer should visit the site and swap the CU/ECU with another CU/ECU of the same BTSE; this will rule out a possible failure in the CU/ECU position of the rack and provide a HW reset of the module. If the same CU/ECU is disabled after swapping, it has to be replaced and sent to repair (please refer to annex A and B for further details).
If the test passes or the module is recovered after re-creation or swap, it is necessary to keep track of the behavior of this CU/ECU to make sure it does not fail again. In case the alarm is generated again and the other words information indicates an over temperature problem, it is recommended to visit the site to make sure there is no physical issue with the rack, e.g. missing cover plate, rack door open etc. If the alarm reappears the module has to be sent for repair (please refer to annex A and B for further details).

4.1.2) Loss of board alarm, Hardware error alarm or Test result failed alarm
The initial step to clear this alarm is to perform a test on the module; this can be done remotely from the RC or the BSC. If the test fails, the module can be deleted and recreated to force a restart. If the module stays disabled, the O&M engineer should visit the site and swap the CU/ECU with another CU/ECU of the same BTSE; this will rule out a possible failure in the CU/ECU position of the rack and provide a HW reset of the module. If the same CU/ECU is disabled after swapping, it has to be replaced and sent to repair (please refer to annex A and B for further details). If another CU/ECU does not work in the faulty CU/ECU position in the rack, the CU backplane needs to be checked.
If the test passes or the module is recovered after re-creation or swap, it is necessary to keep track of the behavior of this CU/ECU to make sure it does not fail again. In case of a new failure the module should be replaced and sent for repair (please refer to annex A and B for further details).

4.1.3) RF power reflected warning and RF power reflected into power stage
These two alarms are closely related and normally generated due to a problem outside the module. Initially the module has to be tested remotely, but the test is very likely to be successful (the CU/ECU has to be sent to repair if the test fails). In case the alarm appears again the O&M engineer has to visit the site to pinpoint the failure. The failure can be located in:
1) The physical TX output port of the CU/ECU.
2) The TX FlexiCable connecting the CU/ECU with the combiner.
3) The TX input port of the antenna combiner.
4) A high VSWR in the feeder (indicated in the other words of the RF power reflected warning)
If the test is successful from the RC or BSC and the alarm appears again the O&M engineer should visit the site and sequentially perform the following steps:

A) Perform a visual inspection of all flexicables and input/output ports of the CU/ECU and corresponding combiner. Replace the related module/flexicable in case any physical damage is found.
B) Perform a VSWR test of the corresponding feeder. If necessary measure the insertion loss of the feeder to make sure it is within the range defined by the feeder manufacturer.
Items A and B should pinpoint most of the possible reasons for this alarm. In case it is still not clear where the problem is located the O&M engineer should continue as follows:
C) 1. Swap the CU/ECU with the alarm with another CU/ECU of the same BTSE, for example CU/ECU:X.
   2. Swap the TX flexicable with another one from the same BTSE, for example CU/ECU:Y.
D) 1. If the alarm moved to position X the CU/ECU has to be changed.
   2. In case the alarm moves to the position Y where the flexicable was moved to, this flexicable has to be replaced.
E) If the alarm stayed in the same CU/ECU position, swap the combiner where the CU/ECU is connected to make sure the fault is located in the combiner. Once this is determined the combiner should be replaced and sent for repair (please refer to annex A and B for further details).

4.1.4) Device driver initialization failed
The initial step to clear this alarm is to perform a test on the module; this can be done remotely from the RC or the BSC. If the test fails, the module can be deleted and recreated to force a restart. If the module stays disabled, the O&M engineer should visit the site and swap the CU/ECU with another CU/ECU of the same BTSE; this will rule out a possible failure in the CU/ECU position of the rack and provide a HW reset of the module. If the same CU/ECU is disabled after swapping, it has to be replaced and sent to repair (please refer to annex A and B for further details).
If the test passes or the module is recovered after re-creation or swap, it is necessary to keep track of the behavior of this CU/ECU to make sure it does not fail again. In case of a new failure the module should be replaced and sent for repair (please refer to annex A and B for further details).
Alarms in items 3.1.6 to 3.1.14 should be troubleshot as per 4.1.4.

4.1.5) Error in downlink message from COP/HIT (34841)
Initially perform a visual inspection including the CC-link cables related to the CU/ECU with the alarm (replace if necessary). Then proceed to troubleshoot as per 4.1.4. If the alarm persists replace the COBA of the site.

4.1.6) Critical SELIC problem (8260)
Proceed as per 4.1.5.
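The swap test in steps C to E of 4.1.3 amounts to a three-way decision on where the alarm shows up after the CU/ECU and the TX flexicable have each been moved. As a purely illustrative sketch (the enum and function names are hypothetical, and the wording of the actions paraphrases the steps above), the decision can be written down as a lookup:

```python
from enum import Enum


class AlarmLocation(Enum):
    """Where the reflected power alarm appears after the swaps of step C."""
    MOVED_WITH_CU = "position X"          # alarm followed the swapped CU/ECU
    MOVED_WITH_FLEXICABLE = "position Y"  # alarm followed the swapped TX flexicable
    SAME_POSITION = "original position"   # alarm stayed where it was


# Next action per outcome, following steps D and E of section 4.1.3.
NEXT_ACTION = {
    AlarmLocation.MOVED_WITH_CU: "replace the CU/ECU",
    AlarmLocation.MOVED_WITH_FLEXICABLE: "replace the TX flexicable",
    AlarmLocation.SAME_POSITION: "swap the combiner to confirm the fault, then replace it",
}


def next_action(location: AlarmLocation) -> str:
    """Return the follow-up action for the observed alarm location."""
    return NEXT_ACTION[location]


if __name__ == "__main__":
    for location in AlarmLocation:
        print(f"alarm at {location.value}: {next_action(location)}")
```

Recording the observed outcome in such a fixed form also makes it easier to log the result per site, which supports the tracking of faulty modules requested in this procedure.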

4.1.7) Diversity Receive Branch Failed (34837) for CU or ECU
As mentioned in the error description, this alarm indicates a bad signal receive level or a bad signal to noise ratio. This alarm does not leave the related TRX in disabled state, so the CU keeps carrying traffic; this requires the field engineer to take all the necessary precautions to avoid any call drops (please refer to Annex B). Normally the problem is located outside the CU in the RX path (Antenna, Flexicable or Combiner).
1. To find out whether the CU is healthy or not, perform a test either locally or remotely:
   1. If the test fails, the CU is faulty and needs to be changed.
   2. If the test passes, the CU is healthy and the problem is outside the CU, in one of:
      1) The physical RX output port of the CU.
      2) The RX FlexiCable connecting the CU with the combiner.
      3) The RX input port of the antenna combiner.
      4) High insertion loss in the antenna feeder.
   FLM has to do a swap for each of the mentioned items:
   A) Swap the CU with another one in the site, for example CU:X.
   B) Swap the RX Flexi cables with other ones in the site, for example with the two RX cables of CU:Y&Z.
   C) Swap the related Combiner with other combiners of the mentioned CUs (X/Y/Z), for example with the combiner of CUs (A, B, C).
   D) Swap the related Feeder with another one in the same sector from the BTSE side, for example with the feeder of CU:M&N.
   After that keep monitoring the site and check where the alarm appears:
   - in CU:X: the CU is faulty;
   - in CU:Y or Z: change the related RX cable;
   - in one of CU X, Y or Z: check the Combiner;
   - in the CUs M&N: the feeder needs to be checked.
2. If the alarm is present in more than one CU/ECU in more than one sector in the site, or if the other words information indicates a bad signal to noise ratio in more than one RX path, this may indicate an interference problem in the site; contact OPT for support.

4.1.8) Increased path loss difference (38956)
Based on the other words information it is possible to determine if a problem is present in the RX or the TX path. If the problem is present in the RX path the O&M engineer should proceed as per 4.1.7, otherwise perform the procedure described in 4.1.3.
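The RX/TX decision for alarm 38956 depends on the sign of the mean path loss difference, which section 3.1.18 describes as a signed 4-byte integer in octets 27 to 30 with octet 27 as LSB. A minimal sketch of that decoding follows; the function names and the dictionary input (octet number to value) are assumptions for illustration, and the unit of the decoded value is not stated in this document.

```python
from typing import Dict, Iterable


def _to_int(other_words: Dict[int, int], octets: Iterable[int], signed: bool) -> int:
    """Assemble an integer from the given octets, first octet being the LSB."""
    raw = bytes(other_words[n] for n in octets)
    return int.from_bytes(raw, byteorder="little", signed=signed)


def decode_path_loss_alarm(other_words: Dict[int, int]) -> Dict[str, int]:
    """Decode the other words of alarm 38956 as laid out in section 3.1.18."""
    return {
        "trx_id": _to_int(other_words, range(17, 21), signed=False),
        "measurement_count": _to_int(other_words, range(21, 24), signed=False),
        "call_count": _to_int(other_words, range(24, 27), signed=False),
        "mean_path_loss_difference": _to_int(other_words, range(27, 31), signed=True),
    }


def affected_path(mean_path_loss_difference: int) -> str:
    """Positive: higher UL path loss (receiver side). Negative: transmitter side."""
    if mean_path_loss_difference > 0:
        return "RX path, proceed as per 4.1.7"
    if mean_path_loss_difference < 0:
        return "TX path, proceed as per 4.1.3"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical other words: octets 27..30 = F6 FF FF FF encode -10.
    words = {17: 0x02, 18: 0, 19: 0, 20: 0, 21: 0x10, 22: 0x27, 23: 0,
             24: 5, 25: 0, 26: 0, 27: 0xF6, 28: 0xFF, 29: 0xFF, 30: 0xFF}
    decoded = decode_path_loss_alarm(words)
    print(decoded)
    print(affected_path(decoded["mean_path_loss_difference"]))
```

Reading the four octets as a two's complement value is what makes the negative case visible; treating them as unsigned would report a very large positive number and point to the wrong path.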

5) Recommendations
- All the replacement procedures have to be performed as indicated in the official NSN operating documentation; this includes the usage of all the ESD precautions among all the other recommendations as indicated in the "Maintain Hardware" document (A50016-G5100-B326-04-7620).
- The uninstalled modules (spare parts or replaced faulty modules) should be packed and transported at all times as received from NSN, to avoid any further damage in a faulty module or a fault in a spare due to mishandling.
- It is recommended to keep track of the tested modules and to define if the replacement has to be done even if the test of the module is successful.
- The O&M engineer in the field should keep a record of the replaced modules with their serial numbers to avoid the usage of a replaced module in a different site. This could happen since a faulty module can be enabled right after installation if there is no traffic on it. Since the rotation of the field engineers is very frequent in STC, proper tracking of the replacements is mandatory to avoid this kind of situation.
- STC could design a checklist database where the parties involved can register the steps performed on one particular module and can also check and advise based on such information.

ANNEX A. Module replacement and ESD precautions


Annex B. Avoiding the loss of calls

Annex C. Other words

When connected to the LMT the information of the alarm will be displayed as below:
The additional words are always presented in octets as in the above example; for this particular example octet 4 is H61, octet 15 H07 and octet 15 HFF.