
EE CSR: 2040328

T-Mobile: Philadelphia-PHRNC009
Node: PHRNC009.
System Impact: 50% signal degradation.
Cause: Unknown. RCA CSR is under investigation.
Event Date: August 3rd, 2012

Areas of Improvement
RCA CSR: 2040387.


RCA CSR status: Investigation is ongoing (evedgar).
Recovery Team:
RM: Fabio Testte.
ERT CNS: Ping Wong, Daniel Niedzielski, Gloria Montoya Solarte, Ruslan Uvashev.
ERT RAN: Kevin Wang, Alexis Sestini.
SDM: Rich Corso.
Recovery Actions:

According to Customer and PAS alerts, the KPIs related to Iu Signaling
showed a success rate of around 45%. ERT quickly connected to the node and
confirmed the KPIs to be CSIuSigSuc=45.4 and PSIuSigSuc=46.7. The Customer
commented that this had already happened in the recent past and could be
related to the ETMFG in MS-07 (000700). ERT and the Customer proposed to
complete the DCG collection and lock ETMFG 002600. Since there was no
change, 002600 was unlocked and 000700 was blocked, again with no
improvement. ERT and the Customer verified that the issue started right
after a planned maxconn change. ERT decided to lock Ranap board 001700 and
push the active Ranap to 001800, but there was no improvement either. ERT
then proposed to lock 000800 and push SCCP to 000900, which lowered the
level of exception code 142 and improved the KPIs. Two ROPs were monitored,
confirming both CSIuSigSuc and PSIuSigSuc at 100%. The Customer and ERT
agreed to close the Emergency. Actions were to be planned for the boards
that were still locked (001700 and 000800).

Timeline breakdown:
E2E Total: 78 min
ERT Total: 68 min
CDBE = ~10 min
CDAE = 0 min
EDT = 2 min

1. To reduce recovery time below 40 min.

Process/Procedure Improvements:
ERT followed the normal investigation path. ERT first worked
on the ETMFG because of past issues, then tried swapping the
Ranap process, and finally swapped the SCCP process, which
solved the issue. Proposed Action Item: create an
Emergency Learning post to share knowledge within ERT.

Product Improvements:

Competence Improvements:

To achieve early detection

Process/Procedure Improvements:

Product Improvements:

Competence Improvements:

To avoid the event entirely

Process/Procedure Improvements:

Product Improvements:
Technical RCA is still ongoing. The question to be answered
is why the maxconn change triggered this issue.
Proposed Action Item: include an SCCP restart in the MOP
(already done). Follow up on the RCA and verify whether a GOLS
Bulletin is needed.

Competence Improvements:

ETTE = 2 min
PTTE = n/a
PRT = 5 min
ETBP = n/a

SMS EE 2040328

PLM Escalation Required: No.
PLM Escalation Reason: n/a.

RNAM Soft RCA template | Ericsson Internal | EMC-10:005221 Uen, Rev B | 2010-08-18 | Page 1

Emergency Event Time Definitions

E2E: End to End


Total time of outage End-to-End
Time begins at start of outage and ends when outage is resolved

ERT: Ericsson Recovery Time


Time begins when Ericsson is advised of the outage and ends when outage is resolved
Excludes Customer Delay Before Ericsson (CDBE) and Customer Delays After Ericsson (CDAE)
ERT=E2E-CDBE-CDAE
Time includes Ericsson Delay Time (EDT) and Product Recovery Time (PRT)
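
As a quick cross-check, the relation ERT=E2E-CDBE-CDAE can be applied to the figures reported in the timeline breakdown of this event. The sketch below is illustrative only: the class and field names are hypothetical and not part of the template, and CDBE is reported as approximately 10 min, so the result is approximate as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTimes:
    """Durations in minutes; names are hypothetical, chosen for this sketch."""
    e2e_min: int   # E2E: total outage time, End to End
    cdbe_min: int  # CDBE: Customer Delay Before Ericsson
    cdae_min: int  # CDAE: Customer Delay After Ericsson

    def ert_min(self) -> int:
        # ERT = E2E - CDBE - CDAE, as defined above
        return self.e2e_min - self.cdbe_min - self.cdae_min


# Values taken from the timeline breakdown of this event (CDBE is ~10 min).
event = EventTimes(e2e_min=78, cdbe_min=10, cdae_min=0)
print(event.ert_min())  # 68, matching the reported ERT Total of 68 min
```

The computed value agrees with the reported ERT Total, which suggests the timeline figures are internally consistent.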

CDBE: Customer Delay Before Ericsson


Time from when the customer detects an outage condition by alarm or customer complaints until Ericsson is contacted for assistance.
Delay time between when outage occurred and when customer contacted Ericsson for support.

CDAE: Customer Delay After Ericsson


Time during the recovery process when a customer's actions or inactions directly contributed to a delay in recovering the system.

EDT: Ericsson Delay Time


Time during the recovery process when Ericsson's actions or inactions directly contributed to a delay in recovering the system (includes delays to get logged in from all levels of support)
Includes:
ETTE: ER Time To Engage - Time delay for ER to get engaged
PTTE: PLM Time To Engage - Time delay for PLM to get engaged

PRT: Product Recovery Time


Time from when effective recovery strategy is invoked until when the product is recovered as expected.
Time required to verify a successful recovery action due to the product's lack of notification.

ETBP: ER Time Before PLM


Time from when the Emergency Recovery team is contacted until they escalate to the PLM for recovery assistance

PSP: Primus Search Performed

Indicates whether a Primus search was performed by Global Support

EDBP: Ericsson Delay Before Primus


Time from when the Emergency Team connects to the node until it performs a Primus search

PSU: Primus Search Useful


Indicates whether the Primus search resulted in a useful solution that was implemented in the action plan.

PEN: PLM Escalation Needed

Yes/No

PER: PLM Escalation Reason

Reason for which the Emergency Event was escalated to PLM



Emergency Event Time Graph

[Figure: End to End timeline of an emergency event. Milestones shown: Customer calls E///, ER Engaged, ER escalates to PLM, PLM engaged, E/// proposes solution, Awaiting customer to implement, Solution implemented, Product Recovered. Intervals shown: CDBE, EDT, CDAE, PRT, ERT.]