APG40?
Now what!!
I. Introduction
II. Scope
I. Introduction
We have the lazy moose, the lazy dog, and lazy guides for almost everything, so
why not a Lazy APG?
Especially because this is a new product to all of us, and some of us have not yet
had the chance to work with it enough, or at all, although we will all eventually
have to start handling emergency situations.
II. Scope
The purpose of this document is to provide one document as a base for
APG40 handling. It is to be used as a help to our local engineers
when handling emergency situations. You must, however, be careful when
following some of the suggested recovery actions. If you are not sure whether an
action is applicable, or how to proceed, please contact the next level of support (I always
love this sentence, especially in OPIs).
Note that you are supposed to know a little bit about the APG40 when following this
document.
Keep this document for internal use only. Do not send it to the customer under any
circumstances, since it is an unofficial document.
ERICSSON LAZY GUIDE FOR EMERGENCY HANDLING OF APG40 3 / 75
Prepared Subject Resp. No
Daniel Figueiredo (SEPDAFI) SEP\SD\CSS
Doc Resp/Approved Checked Date Rev File/Referenced
2004-12-14 A Lazy_APG40.doc
This document is based on the Alex Library as well as some other useful APG docs and
Primus cases. Please let me know if you find something that is not clear
enough, or that might need an update.
The very first thing to do when receiving an emergency APG40 call is to
check the AP status and find out whether any work was in progress.
Take note of the APG status. This may be done by phone; later on, a cmd file
or a list of commands may be requested from the customer together with the log files.
If SW is R9.1
fchstart -V
If SW R10
fchstate
At this stage, together with the customer's explanation, you should be able to
identify the severity of the case and get an idea of what might be needed to recover
from the problem.
Before you start to do anything, you should always ask the customer to
make an mktr log on both nodes.
Just enter the command: mktr YYMMDD-HHMM
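As a small illustration of the YYMMDD-HHMM label, here is a minimal Python sketch that builds it from a timestamp. The helper name mktr_label is hypothetical and Python is used only for illustration. It is not a tool on the APG40, and it assumes the argument is nothing more than a date-time label.

```python
from datetime import datetime


def mktr_label(now: datetime) -> str:
    """Return a YYMMDD-HHMM label as used in 'mktr YYMMDD-HHMM'."""
    return now.strftime("%y%m%d-%H%M")


# Example with a fixed timestamp: 14 December 2004, 09:05
label = mktr_label(datetime(2004, 12, 14, 9, 5))
print("mktr " + label)  # prints "mktr 041214-0905"
```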
IV Recovery Actions
Collect data
If you are going to escalate the problem, it is best to collect data first.
Send the following CMD file to the customer and ask them to open a new log file
and run it on both nodes.
collectdata.cmd
!APG_collectdata_CMD
hostname
date /t
time /t
set
prcstate
swrprint
swrsid -h
pstat
alist
ipconfig /all
type d:\burinfo.txt
netdom query
netdom bdc
net start
cluster node
cluster group
cluster res
!if R9.1
fchstart -V
!if R10
fchstate
! only in the active node
mml allip:alcat=apz
mml apamp
dsdls -a
mml octdp
cpdlist -l
cpfls
afpls
cdhls
cdhdsls
"C:\Program Files\Dptmgr\Raidutil" -L all
dir /s k:\images
net share
!Close the logfile
FCH handling
As a first action in these situations, you must figure out which kind of fallback you
need: a one-node restore, a restore of both nodes, or just a fallback with the command
fchfb.
Note: The command fchstart -V (or fchstate if R10) prints the current status
of the FCH session and information on what steps must be performed.
The printout guides the user on how to proceed. It has several
sections. The first line is the status of the node and the FCH session.
After the first line, one or two lines may follow describing how long this
state has been in effect and how long it may take before the operation is
completed.
After these lines follow several lines describing what actions are going on.
Finally, one or several lines describe the status of the cluster server, the
ACS_FCH_Server and the connection to the other node.
1)
Note 1: If this is an MEC which has two FCHs, perhaps a first one to remove a
module, and no backup was performed in the meantime, go to the next option as well,
since even if you fall back you will end up with different SW on the two
nodes.
burrestore
Data about the image file from the backup partition will be shown (image name,
date, node and status). Confirm that this is a valid backup by answering y.
As the system restore forces a swap of the disk partitions C and D, the
system running on C will be stored as an image on D.
Enter an image name at the dialog:
The node will reboot. Wait 5 minutes, then log on to the restored node.
After executing the burrestore command, the cluster configuration may still be
corrupt, but that will be corrected when executing fchend. If the cluster
configuration is not properly restored by fchend, see FCH Appendix Information
Section 3.2.
Does the restored node come back online and join the cluster? Use the command
cluster node /stat to verify.
Yes: Go to step 6.
No, the node did not come back after the reboot. No contact.
Note: In this case, try to reboot the restored node.
Send the command fcc_reset other from the active node.
No, the node did come back but did not join the cluster. Undefined state.
Go to step 5, then to step 6.
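The three outcomes after the restore reboot can be written down as a small decision helper. This is only a sketch in Python for illustration: the function name next_action and the wording of the returned strings are hypothetical, and the step numbers are the ones used in this section.

```python
def next_action(came_back: bool, joined_cluster: bool) -> str:
    """Map the result of 'cluster node /stat' after burrestore to the next step."""
    if came_back and joined_cluster:
        # Node is back online and in the cluster
        return "go to step 6"
    if not came_back:
        # No contact with the restored node
        return "reboot the restored node: send 'fcc_reset other' from the active node"
    # Node is up but did not join the cluster (undefined state)
    return "go to step 5, then step 6"


print(next_action(came_back=True, joined_cluster=False))
```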
7) fchfb
No Lbb SW was loaded, initiate a Function Change (FCH) Fallback.
It can be executed in all FCH states up to and including Supervision state (that is,
after a successful execution of the fchstart command and the following reboot,
but before executing the fchcommit command).
fchfb
Note: This command will return the system to the configuration that existed
before the function change was initiated.
Note: At the end of this command the system will be rebooted. The operator
must wait for the node to come back on-line and connect to it again before
continuing.
Note: Use the command fchstart -V. The possible formats of printouts from fchstart -V
are described in the FCH appendix. The state End indicates a successful fallback.
Note: It may take some minutes for the fchfb to complete after
the system has come back on-line after the reboot.
Yes
Go to Step 6). End FCH
No
Note:
The AP will reboot and start up again. During the reboot time, the APG will
be unavailable and unable to offer any service.
Note:
If the cluster service has not started automatically, start it.
18 Go to chapter "Backup of the AP system" to back up the running system
to the D partition.
Note:
The currently running system should be on CM124_R1A level, so
do not use the exact system names from chapter
"Backup of the AP system".
The soft function change has failed due to an error and sfcexec has
initiated a fallback. Due to a second error, sfcexec has failed to restore the old
parameters (failed fallback).
As a result, the active node has the new CXC parameters (sfcexec
has failed to restore the old values), while the passive node has the old
parameters.
The fault is that the OPIs indicate that 'The job is completed' while the system is
left in an inconsistent state (different CXC parameters on the two nodes).
As a first measure, these OPIs will be updated.
What you should do in this situation is restore the node which got the new
CXC parameters; for this, go to the one-node restore, step 4) of FCH handling.
Note: If this node is the active one, you should reboot it and force a failover.
2.- Open a channel with WIOL 6.0 towards the cluster IP-addr
of the APG, using administrator rights.
C:\>L:
L:\>cd FMS
L:\FMS>cd data
L:\FMS\data>cd tmp
L:\FMS\data\tmp>dir
5.- Send the CP backup subfiles from the LCT to the APG40 system partition
L:\FMS\data\tmp\CPbackup, using ftp.
For the ftp transfer, you can use WS_ftp or Microsoft Internet
Explorer. Use the binary transfer option. Close the ftp session after the transfer.
2.- Check that you have enough free space in the L:\ partition
of your APG for temporarily exporting the subfiles R0..R5:
C:\>L:
L:\FMS\data\tmp>dir
Connect to CP EX-side:
C:\>mml
Separate IPN-1:
<OCISI:IPN=1;
Separate CP SB-side:
<FCSEI;
Connect to CP SB-side:
<exit;
C:\>mml -s
Connect to CP EX-side:
<exit;
C:\>mml
Connect to CP EX-side:
C:\>mml
Separate IPN-1:
<OCISI:IPN=1;
Separate CP SB-side:
<DPSES;
Connect to CPT:
<PTCOI;
End CPT:
cpt< PTCOE;
Connect to CP SB-side:
<exit;
C:\>mml -s
Connect to CPT:
<PTCOI;
End CPT:
cpt< PTCOE;
Connect to CP EX-side:
<exit;
C:\>mml
Connect to CP SB-side:
C:\>mml -s
Connect to CPT:
<PTCOI;
Separate CP SB-side:
cpt< PTSES;
End CPT:
cpt< PTCOE;
<exit;
Separate IPN-1:
C:\>ipnaadm -state sep -ipnano 1
Connect to CP SB-side:
C:\>mml -s
Connect to CPT:
<PTCOI;
End CPT:
cpt< PTCOE;
Connect to CP SB-side:
<exit;
C:\>mml -s
Connect to CPT:
<PTCOI;
End CPT:
cpt< PTCOE;
Connect to CP EX-side:
<exit;
C:\>mml
!charging scheme!
!---------------!
CP(FOAM) APG OS
/----!----\ /---\ /--\
CDR's-->CHOP-->MTAP ----> RDT ------>BGW
1) A call is finished and some CDR's are generated according to the charging
case data (e.g. CHASP:CC=7).
3) The CDR's are not sent to the APG until the buffer is full or the time limit for
releasing the buffer has expired.
In the example below, if you make a single call that generates 2 CDR's, these
CDR's certainly do not fill the 4 KB buffer, so they have to wait for OUTP=00100 (1
minute) before they are sent to the APG (RDT).
<CHOPP;
COMMON CHARGING OUTPUT ADJUNCT PROCESSOR INTERFACE DATA
If the buffer becomes full before 1 minute has passed, the system sends the contents of the
buffer to the APG40 immediately.
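The "full or timed out" rule of step 3) can be sketched as follows. This is only an illustration in Python, not how CHOP is implemented. The function name is hypothetical, the 200-byte CDR size in the example is made up, and it assumes that BSIZE=4 means 4096 bytes and OUTP=00100 means 60 seconds.

```python
def buffer_released(bytes_in_buffer: int, seconds_waited: float,
                    bsize_kb: int = 4, outp_seconds: int = 60) -> bool:
    """Return True when the buffer contents would be sent to the APG."""
    full = bytes_in_buffer >= bsize_kb * 1024   # size limit reached
    timed_out = seconds_waited >= outp_seconds  # release time expired
    return full or timed_out


# Two hypothetical 200-byte CDR's from a single call
print(buffer_released(400, 10))   # still waiting
print(buffer_released(400, 60))   # released by the time limit
print(buffer_released(4096, 5))   # released because the buffer is full
```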
4) When the CDR's arrive at the APG, they are stored in a Call Record Block
(CRB, which is like a group of CDR's). This CRB has a size and a time limit, and the
same behaviour as the buffer in CHOP.
The size and time limit can be found by printing the charging parameters in the APG:
C:\>rdtview -f
In the example below, the CRB length is 16384 bytes (see BlockingLength) and the
time limit is 300 s (see BlockingHoldTime). If the CRB is full before 300 s, the
CRB will be released; if the CRB is incomplete, the CDR's will have to wait in the
CRB for 300 s (5 minutes) before the block is released.
========================================
RDT Output Stream Table
========================================
RDT_OutputStream BlockingLength: 16384 (range 512--32768 bytes)
RDT_OutputStream BlockingMaxCallRecordsPerBlock: 0 (max. num of CDR's per CRB)
RDT_OutputStream BlockingLengthType: VARIABLE
RDT_OutputStream BlockingHoldTime: 300 (range 1--7200 seconds)
RDT_OutputStream BlockingPaddingCharacter: 32
RDT_OutputStream FilingFileSize: 16384 (range 10-16384 Kilobytes)
RDT_OutputStream FilingFileHoldTime: 300 (range 1--86400 seconds)
RDT_OutputStream FilingFileGroupID: OHSLOCALUSRG
5) The released CRB's are put together in a file. Once more, this file has a
size and a time limit: in the above parameters we can see a file size of 16384
KB (see FilingFileSize) and a time limit of 300 s (see FilingFileHoldTime). This
time it is unlikely that a file will reach the size limit, so after 5 minutes
(the time limit) a file will appear with the CDR's received until that moment.
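To get a feeling for how long a single test call can take to show up in a charging file, the three time limits from steps 3) to 5) can be added up. This is a rough Python sketch under a loud assumption: each stage is taken to start its timer only when the record reaches it, and no size limit is hit. The real timer behaviour is not confirmed here, so treat the result as an upper-bound estimate only.

```python
# Timer values taken from the example printouts; the summing model is an assumption.
OUTP_SECONDS = 60            # CHOP buffer release time (OUTP=00100 = 1 minute)
BLOCKING_HOLD_SECONDS = 300  # BlockingHoldTime of the CRB
FILING_HOLD_SECONDS = 300    # FilingFileHoldTime of the file


def worst_case_delay(outp: int = OUTP_SECONDS,
                     blocking: int = BLOCKING_HOLD_SECONDS,
                     filing: int = FILING_HOLD_SECONDS) -> int:
    """Sum of the three hold times, in seconds."""
    return outp + blocking + filing


delay = worst_case_delay()
print(delay, "seconds =", delay / 60, "minutes")  # 660 seconds = 11.0 minutes
```

So, under these assumptions, waiting about 11 minutes after a test call before looking in the Ready directory is a reasonable rule of thumb.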
6) When the charging files are released, they are stored in the disk partition
Y:\RDT\data\FileBasedDest1\Ready, for File Based initiating method, and in
Y:\RDT\data\RespFile1\Ready, for File Based responding method.
7) In case you are in a test plant, and there's no BGW, if you need to get the
charging files for decoding, you can do the following:
a) Connect to active node in order to copy the charging files to the ftp area
Open a telnet session with the active node, with administrator rights.
C:\>prcstate
active
C:\>dir y:\rdt\data\filebaseddest1\ready\initfile.20021016*
Volume in drive Y is Disk Y:
Volume Serial Number is 9884-7F2B
Directory of y:\rdt\data\filebaseddest1\ready
C:\>dir y:\rdt\data\respfile1\ready\respfile.20021016*
Volume in drive Y is Disk Y:
Volume Serial Number is 9884-7F2B
Directory of y:\rdt\data\respfile1\ready
y:\rdt\data\respfile1\ready\Respfile.200210161641330
y:\rdt\data\respfile1\ready\Respfile.200210161806300
6 file(s) copied.
b) Now open an ftp session, as we saw in chapter 2, from your computer in order
to get the files.
8) How the CDR's travel from the AP to the BGW is another story, outside the scope of
this document.
COMMANDS:
LIMIT
BANS
END
CHOIP;
COMMON CHARGING OUTPUT INTERFACE DATA
INTERF
IOG
END
CHODP;
COMMON CHARGING OUTPUT CHARGEABLE DURATION DATA
FN CDN RSN
AP 0000000000 00000000
TT 0000000322 00000057
ICI 0000000000 00000000
END
CHOPP;
COMMON CHARGING OUTPUT ADJUNCT PROCESSOR INTERFACE DATA
SYPAC:ACCESS=ENABLED,PSW=PSW2PAR;
DBTRI;
DBTSC:TAB=AXEPARS,SETNAME=AMCFOAMC,NAME=DMHSUPPORTED,VALUE=0;
DBTRE:COM;
SYPAC:ACCESS=DISABLED;
<DBTSP:TAB=AXEPARBLOCKRELS,NAME=DMHSUPPORTED;
DATABASE TABLE
END
<LASLP:BN=28;
STORAGE LAYOUT
SAAEP:SAE=500,BLOCK=CHOP;
SAAII:SAE=500,BLOCK=CHOP,NI=2000;
CHOEC:OUTP=00100;
EXECUTED
CHOPP;
COMMON CHARGING OUTPUT ADJUNCT PROCESSOR INTERFACE DATA
CHOPI:BSIZE=4;
EXECUTED
CHOPP;
COMMON CHARGING OUTPUT ADJUNCT PROCESSOR INTERFACE DATA
CHOIP;
COMMON CHARGING OUTPUT INTERFACE DATA
INTERF
AP
END
<CHOUP:fn=tt;
COMMON CHARGING OUTPUT USER FUNCTION DATA
Y:\RDT\data\FileBasedDest1\Ready>rdtview -f
========================================
OHS Initiating Rpc Output Handler Table
========================================
OHS_InitiatingRpcOutputHandler BlockBasedDest1 DestinationID: BlockBasedDest1
OHS_InitiatingRpcOutputHandler BlockBasedDest1 DestinationHostName: 159.107.9.120
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCProgramNumber: 300625
OHS_InitiatingRpcOutputHandler BlockBasedDest1 DataVersion: 2
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RetryInterval: 1
OHS_InitiatingRpcOutputHandler BlockBasedDest1 LinkDownTimeOut: 300
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupEnabled: NO
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupDestinationHostName: -
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupRPCProgramNumber: 591751040
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupDataVersion: 2
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupRetryInterval: 10
OHS_InitiatingRpcOutputHandler BlockBasedDest1 RPCBackupLinkDownTimeOut: 300
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupEnabled: NO
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupDirectoryName: Y:\RDT\BlockBasedDest1
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupFillingSize: 1024
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupFillingFileHoldTime: 300
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupFillingFileGroupID: OHSLOCALUSRG
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupAllocationLimit: 2097152
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupWarningLimit: 80
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupNameTemplate: BackupRespFile.<yyyy><mm><dd><HH><MM><SS><n>
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupKeepTime: 24
OHS_InitiatingRpcOutputHandler BlockBasedDest1 FileBackupStartupSequenceNumber: 0
========================================
OHS Initiating Ftp Output Handler Table
========================================
OHS_InitiatingFtpOutputHandler FileBasedDest1 DestinationID: FileBasedDest1
OHS_InitiatingFtpOutputHandler FileBasedDest1 DestinationHostName: 159.107.9.120
OHS_InitiatingFtpOutputHandler FileBasedDest1 UserName: reeguest
OHS_InitiatingFtpOutputHandler FileBasedDest1 Password: reeguest
OHS_InitiatingFtpOutputHandler FileBasedDest1 RetryInterval: 5
OHS_InitiatingFtpOutputHandler FileBasedDest1 DestinationHostPort: 21
========================================
RDT File Based Destination Table
========================================
RDT_FileBasedDestination RespFile1 DestinationID: RespFile1
RDT_FileBasedDestination RespFile1 AllocationLimit: 2097152
RDT_FileBasedDestination RespFile1 WarningLimit: 80
RDT_FileBasedDestination RespFile1 NameTemplate: Respfile.<yyyy><mm><dd><HH><MM><SS><n>
RDT_FileBasedDestination RespFile1 KeepTime: 24
RDT_FileBasedDestination RespFile1 StartupSequenceNumber: 0
========================================
OHS File Based Output Handler Table
========================================
OHS_FileBasedOutputHandler RespFile1 DestinationID: RespFile1
OHS_FileBasedOutputHandler RespFile1 DirectoryName:
========================================
RDT Block Based Destination Table
========================================
RDT_BlockBasedDestination BlockBasedDest1 DestinationID: BlockBasedDest1
RDT_BlockBasedDestination BlockBasedDest1 AllocationLimit: 2097152
RDT_BlockBasedDestination BlockBasedDest1 WarningLimit: 80
========================================
RDT General Table
========================================
RDT_General DataLoggerWarningLimit: 80
RDT_General DataLoggerAllocationLimit: 2048
RDT_General VolumeManagerWarningLimit: 80
========================================
RDT Output Stream Table
========================================
RDT_OutputStream BlockingLength: 16384
RDT_OutputStream BlockingMaxCallRecordsPerBlock: 0
RDT_OutputStream BlockingLengthType: VARIABLE
RDT_OutputStream BlockingHoldTime: 300
RDT_OutputStream BlockingPaddingCharacter: 32
RDT_OutputStream FilingFileSize: 16384
RDT_OutputStream FilingFileHoldTime: 300
RDT_OutputStream FilingFileGroupID: OHSLOCALUSRG
========================================
RDT Statistics Table
========================================
RDT_Statistics GenerationTime: 00:00
RDT_Statistics ResetFlag: NO
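When a customer sends a log file with the rdtview -f printout, it can be handy to turn it into a lookup table. The following is a minimal Python sketch, not an Ericsson tool. The function name parse_rdtview is hypothetical, and it assumes that every parameter sits on one line in the form "Table [Destination] Parameter: value", as in the printout above.

```python
def parse_rdtview(text: str) -> dict:
    """Map (table, [destination,] parameter) tuples to their value strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip separators, table titles and anything without a "name: value" shape
        if line.startswith("=") or ":" not in line:
            continue
        left, value = line.split(":", 1)  # split at the first colon only
        parts = left.split()
        if len(parts) < 2:                # e.g. a wrapped path line, ignore it
            continue
        params[tuple(parts)] = value.strip()
    return params


sample = """\
========================================
RDT Output Stream Table
========================================
RDT_OutputStream BlockingLength: 16384
RDT_OutputStream BlockingHoldTime: 300
RDT_Statistics GenerationTime: 00:00
"""
parsed = parse_rdtview(sample)
print(parsed[("RDT_OutputStream", "BlockingHoldTime")])  # prints "300"
```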
Useful info
-----------
The RDT application, when started, automatically creates the following
directories:
Ready: This directory contains CRB files that are ready to be sent to the OHS.
When such a file is to be sent, it is moved to the Send directory.
When such a file is to be archived, it is moved to the Archive directory.
Send: This directory contains the CRB file that is being sent to the destination.
When this file has been successfully transferred, it is moved to the Delete directory.
When this file is going to be archived, it is moved to the Archive directory.
Delete: This directory contains CRB files that were sent or archived and can
therefore be removed from the AP (as far as this destination is concerned).
Each file will be removed after a certain number of minutes.
This value is configurable using the parameter KeepTime in the
RDT_FileBasedDestinationTable (check the Application Information of RDT).
Archive: This directory contains CRB files that are being archived.
When such a file has been archived, it is moved to the Delete directory,
unless it was stated that the file must be archived twice.
Keep: This directory contains CRB files coming from the Delete directory.
Each file will be physically removed after a configurable number of hours, set with
the parameter KeepTime in the RDT_FileBasedDestinationTable (check the
Application Information of RDT).
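The directory descriptions above amount to a small set of allowed file moves. Here they are written as a Python table, purely as a memory aid. The names ALLOWED_MOVES and can_move are hypothetical, and the table only reflects what is described in this section, not the RDT implementation.

```python
# Directory -> directories a CRB file may be moved to next
ALLOWED_MOVES = {
    "Ready": {"Send", "Archive"},
    "Send": {"Delete", "Archive"},
    "Archive": {"Delete"},   # unless the file must be archived twice
    "Delete": {"Keep"},
    "Keep": set(),           # physically removed after KeepTime
}


def can_move(src: str, dst: str) -> bool:
    """Return True if the text above describes a move from src to dst."""
    return dst in ALLOWED_MOVES.get(src, set())


print(can_move("Ready", "Send"))    # a described move
print(can_move("Keep", "Ready"))    # not a described move
```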
C:\>stmotls
RECORDING AREA
ON OPER
!Retaining time
stmdbrt -p
One known fault: if the stmotls list is full and the state is cancelled,
the STS process needs to be restarted. This can happen in some cases when the STSCBIN is at a
faulty revision.
stmotls
stmfols
stmmp -L -l
stmrp -L -l
cdhls -l dest………
afpls -a transq… destset…. ! -1 on retries means infinite!
cdhdsls -l
afpls -ls dest………
vdls -n "Default FTP Site"
Directory structure of STS collection and transfer to the virtual directory.
The STS is collected in the S drive and transferred using AES to the virtual
directory in the K drive.
cd /d S:\STS\data\DeliveryDir && DIR
cd /d S:\STS\data\DeliveryBuffDir && DIR
C:\>stmotls -c
Error: No correction area
C:\>stmotls -l AAL2PATH
RECORDING AREA
ON OPER
END
C:\>stmotd -i
C:\>stmotls -c
RECORDING AREA
ON CORR
C:\>stmotls -l AAL2PATH
RECORDING AREA
ON OPER
END
9.- Define a report for the Object Type AAL2PATH; the result printout
of this command is the Report Identity assigned:
11.- Define a measurement program for the previous report ID, 2002;
the result printout of this command is the MP Identity assigned.
12.- Print all the MP's defined in the system and check that our MP
identities 1002 and 1003 are present:
S:\Sts\data\Deliverydir>stmmp -L
MPID STATUS MP-NAME
1000 running DTIMP
1001 running MULTIMEDIAMP
1002 running AAL2MP
1003 running OTROAAL2MP
C:\>stmmp -L -l 1003
MPID UTC BEGINTIME ENDTIME REPEATS INTERVAL REPORTID
1003 no 06/11/2002 09:30:00 06/11/2002 09:45:00 15 2002
OUTPUTFORMAT STATUS MP-NAME
LF running OTROAAL2MP
C:\>stmmp -L -l 1002
MPID UTC BEGINTIME ENDTIME REPEATS INTERVAL REPORTID
1002 no 06/11/2002 08:45:00 infinite 60 2002
OUTPUTFORMAT STATUS MP-NAME
ASN.1 running AAL2MP
S:\Sts\data\Deliverydir>dir
Volume in drive S is Disk S
Volume Serial Number is 444A-8936
Directory of S:\Sts\data\Deliverydir
06/11/02 11:46 <DIR> .
06/11/02 11:46 <DIR> ..
06/11/02 09:46 <DIR> AAL2MP_200206110645_1
06/11/02 10:46 <DIR> AAL2MP_200206110745_2
06/11/02 11:46 <DIR> AAL2MP_200206110845_3
06/11/02 09:46 <DIR> OTROAAL2MP_200206110745_1
76 File(s) 0 bytes
4,605,517,824 bytes free
Note. The time in the report names always refers to GMT (UTC, Coordinated
Universal Time).
In Spain the local time is GMT + 1 hour, and due to daylight saving
changes:
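As an illustration of the GMT note, the following Python sketch splits a report directory name from the listing above and converts its time stamp to local time. The function name is hypothetical. The field layout (MP name, YYYYMMDDHHMM in UTC, sequence number) is inferred from the listing, and the fixed offset of 2 hours (GMT + 1 plus one hour of daylight saving) is an assumption for this example date.

```python
from datetime import datetime, timedelta, timezone


def parse_report_name(name: str):
    """Split '<MP-NAME>_<YYYYMMDDHHMM>_<n>' into name, UTC start time and n."""
    mp_name, stamp, seq = name.rsplit("_", 2)
    start_utc = datetime.strptime(stamp, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
    return mp_name, start_utc, int(seq)


# Assumed local offset for the example: GMT + 1 hour, plus 1 hour daylight saving
LOCAL_OFFSET = timezone(timedelta(hours=2))

mp, start, seq = parse_report_name("AAL2MP_200206110645_1")
local = start.astimezone(LOCAL_OFFSET)
print(mp, local.strftime("%Y-%m-%d %H:%M"), seq)  # AAL2MP 2002-06-11 08:45 1
```

With that offset, the 06:45 in the name corresponds to the local BEGINTIME of 08:45 shown for MP 1002.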
16.- Check how long statistics reports are kept in the system;
this is the retaining time, given in days:
C:\>stmdbrt -p
Database retainment time: 1 (one day)
!check status!
S:\Sts\data\Deliverydir>stmmp -L -l 1003
MPID UTC BEGINTIME ENDTIME REPEATS INTERVAL REPORTID
1003 no 06/11/2002 09:30:00 06/11/2002 09:45:00 15 2002
OUTPUTFORMAT STATUS MP-NAME
LF expired OTROAAL2MP
S:\Sts\data\Deliverydir>stmmp -L -l 1002
MPID UTC BEGINTIME ENDTIME REPEATS INTERVAL REPORTID
1002 no 06/11/2002 08:45:00 infinite 60 2002
OUTPUTFORMAT STATUS MP-NAME
ASN.1 running AAL2MP
!delete MP's
S:\Sts\data\Deliverydir>stmmp -D 1002
Program deleted
S:\Sts\data\Deliverydir>stmmp -D 1003
Program deleted
S:\Sts\data\Deliverydir>stmmp -L
MPID STATUS MP-NAME
1000 running DTIMP
1001 running MULTIMEDIAMP
!delete reports
C:\>stmrp -L
REPORTID IN-USE NAME
2000 yes DTIREP
2001 yes MULTIMEDIAREP
2002 no AAL2REP
C:\>stmrp -D 2002
Report deleted
C:\>stmrp -L
REPORTID IN-USE NAME
2000 yes DTIREP
2001 yes MULTIMEDIAREP
Backup of APG40
1.- Log in to the active node (in our case node B) and check that this node is the "active" one!
C:\>prcstate
active
C:\>hostname
APG40-1B -->this is the B-node, and is active.
C:\>cd bur
3.-Store partition info in C:\bur, this action must be done from active node!
C:\bur>dir
Volume in drive C is ntserv
Volume Serial Number is 18D2-B76A
Directory of C:\bur
C:\bur>exit
C:\>prcstate
passive
C:\>hostname
APG40-1A -->this is the A node.
C:\>dir \\APG40-1A\c$\bur
Volume in drive \\APG40-1A\c$ is ntserv
Volume Serial Number is 9893-1A66
Directory of \\APG40-1A\c$\bur
C:\>dir \\APG40-1B\c$\bur
Volume in drive \\APG40-1B\c$ is ntserv
Volume Serial Number is 18D2-B76A
Directory of \\APG40-1B\c$\bur
8.- Copy the .ddi info from active node (node-B) to passive node A,
in the following way:
C:\>cd bur
C:\bur>dir
Volume in drive C is ntserv
Volume Serial Number is 9893-1A66
Directory of C:\bur
C:\bur>burbackup
Image name:apawqofg079r_A_ingo2_msc10
Execute burBackup with these parameters:
-src C:\
-dest D:\
-ImageName "apawqofg079r_A_ingo2_msc10"
[y=yes, n=no]?y
burBackup execution completed
687 directories and 11304 files copied or equal and 0 files locked
C:\bur>d:
D:\>cd bur
D:\bur>dir
Volume in drive D is ntbackup
Volume Serial Number is D4A5-52E3
Directory of D:\bur
01/16/02 11:37a <DIR> .
01/16/02 11:37a <DIR> ..
01/16/02 11:16a 22,090 020116_backup.ddi
11/13/01 02:17p 21,582 clone_test_all_appl_cm13.ddi
C:\>prcstate
passive
!APG40-1C is the name of the cluster (see with cluster /VER command).
C:\>dir \\APG40-1C\k$\images
Volume in drive \\APG40-1C\k$ is Disk K
Volume Serial Number is 906C-7BC2
Directory of \\APG40-1C\k$\images
C:\>prcstate
active
C:\>hostname
APG40-1B
C:\>prcboot
REBOOT INITIATED!
C:\>exit
C:\>prcstate
Error: failed to find pipe. Error: 2
C:\>prcstate
passive
C:\>hostname
APG40-1B -->now, node-B is passive
C:\>burbackup
Image name:apawqofg079_B_ingo2_msc10
Execute burBackup with these parameters:
-src C:\
-dest D:\
-ImageName "apawqofg079_B_ingo2_msc10"
[y=yes, n=no]?y
burBackup execution completed
680 directories and 11283 files copied or equal and 0 files locked
C:\>d:
D:\>dir
Volume in drive D is ntbackup
Volume Serial Number is 5084-EB2B
Directory of D:\
C:\>exit
Note. In this document we only consider the case of restoring one node
at a time. Recovering two nodes simultaneously is outside the scope of
this document.
One node restore (see step 22 "Select appropriate action" in the OPI)
----------------
In this example, we want to restore node A, which is working badly:
Note. APG40-1C is our cluster name; check yours with the command cluster /ver.
3.- Restore system from backup partition D:, sending the command:
C:\>burrestore
4.- The system asks something like “Restore from this backup [y=yes, n=no]”;
confirm the question with yes (y):
Imagename: AP02090222A_CM110_LBB_updated
Date: 9/23/02 4:26:11 PM
Node: APG40-1A
Status: OK (7150 files and 719 directories copied, 0 locked files)
5.- The system will ask you for a name for the faulty SW running on node A;
in our case we chose "corrupt_system_node_A"
7.- After typing the name, press Enter and the system will answer:
8.- Check that all cluster resources are on-line after reboot:
C:\>cluster res
Note: it could take several minutes for the resources to be on-line, repeat the
command and take it easy.
CONDITIONS:
PROCEDURE:
3. Verify the status of the RAID disks. All entries must be reported as Optimal.
If the command is issued on the PASSIVE node, Disk Drive entries will be
reported as "Optimal" and RAID 1 entries as "Drive Failed".
C:\>"C:\Program Files\Dptmgr\dptutil" -L raid
FOS:
C:\>fosview -s
FOS:
C:\>fosview -b
Make sure CP file system can communicate with FMS in the APG.
C:\>mml SYBFP:FILE
12. Verify there have been no recent errors or warning messages in the event
logs.
C:\>dumplog system
C:\>type c:\temp\log.txt
C:\>dumplog application
C:\>type c:\temp\log.txt
C:\>dumplog security
C:\>type c:\temp\log.txt
C:\>del c:\temp\log.txt
This instruction may be followed when the alarm CP AP time difference exceeds
600 ms.
Active Node
1. Check time
time /t
2. Check CP time
mml CACLP;
3. Set time
time
4. Check time
time /t
Passive Node
5. Check time
time /t
6. Set time
time
7. Check time
time /t
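The 600 ms condition behind this procedure is a simple threshold on the absolute difference between the two clocks. A minimal Python sketch follows. The function name and the example timestamps are hypothetical, and it says nothing about how the APG itself measures the difference.

```python
from datetime import datetime, timedelta


def exceeds_limit(cp_time: datetime, ap_time: datetime, limit_ms: int = 600) -> bool:
    """True when the CP and AP clocks differ by more than limit_ms."""
    return abs(cp_time - ap_time) > timedelta(milliseconds=limit_ms)


cp = datetime(2004, 12, 14, 10, 0, 0)
ap_late = datetime(2004, 12, 14, 10, 0, 0, 700000)  # AP is 700 ms ahead
ap_ok = datetime(2004, 12, 14, 10, 0, 0, 500000)    # AP is 500 ms ahead

print(exceeds_limit(cp, ap_late))  # difference above the limit
print(exceeds_limit(cp, ap_ok))    # difference within the limit
```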
This is a partial list of the parameters for the mml command. Check the man
page to find all the parameters and their functions.
-a Used to indicate that this terminal will be an alarm receiver during this mml
session. This means that when the operator releases the terminal using “ctrl
d” or the equivalent, all spontaneous printouts, including alarms, will be sent
to the terminal.
-d Must include the device ID. This parameter specifies the device ID to be
seized for this session. If the device is already in use, a fault message
“Device is occupied” will be returned.
-i Must include the device ID. Sends all spontaneous printouts to the specified
device ID. When used in conjunction with the –a parameter, alarms are
received at both devices.
-I Must include the device ID. Redirects spontaneous printouts to the specified
device ID.
-Q Invokes a dialog that asks which CP side you want to connect to: EX or SB.
-r Time period for continuous retries when Function Busy is returned by the
system.
-w Specifies the waiting time for result printouts before sending the next
command. A value of zero means no wait time.
The most convenient method for system access is to log on to the AP, then
immediately enter the mml command with appropriate parameters (-a being the
most common for maintenance personnel). The AP system is then reachable via
the APLOC command without releasing the AD device that was originally seized
at CP logon.
**********************************CAUTION*******************************
If you use APLOC; to access the AP, then you should always return to the CP
using exit. If you return to the CP with mml, you now have two CP sessions
running and you run the risk of "losing" a session.
Cluster Appendix
cluster group
If SW R10
fchstate
This command is used to report the Function Change status. It may be used at
any time during a Function Change session to report the current status of the
session. It may also be used after a Function Change session to report the status
of the latest sessions.
Node status reported by fchstate -s
The command fchstate -s prints one line:
<status> <state>
The <status> indicates the node status:
active
The node is active.
passive
The node is passive.
ClusterDown
The Cluster is down.
Limbo
The Cluster Group is not online.
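If the one-line output of fchstate -s is collected in a log or script, it could be split into its two fields roughly as below. This is a sketch only: the function and type names are hypothetical, and it assumes the line has exactly the "<status> <state>" shape described above, with a state name that may contain a space (for example 'Failed Supervision').

```python
from typing import NamedTuple

# Node status values listed in this guide.
NODE_STATUSES = {"active", "passive", "ClusterDown", "Limbo"}


class FchStatus(NamedTuple):
    status: str  # node status, e.g. "active"
    state: str   # FCH state, e.g. "Supervision"


def parse_fchstate_s(line: str) -> FchStatus:
    """Split one 'fchstate -s' line into node status and FCH state."""
    # Split on the first run of whitespace only, so that state names
    # containing a space stay in one piece.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"expected '<status> <state>', got {line!r}")
    status, state = parts
    if status not in NODE_STATUSES:
        raise ValueError(f"unknown node status {status!r}")
    return FchStatus(status, state.strip())
```

Rejecting an unknown status, instead of guessing, keeps a garbled or truncated line from being mistaken for a valid report.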
<state> indicates the FCH state.
CommitDone
THE SYSTEM IS COMMITTED.
Run the command 'fchend' to end the session.
If you cannot run the command 'fchend',
then run the command 'fchrst' to restore the non-upgraded node.
Committing
The command 'fchcommit' commits the system after supervision.
Wait for the 'CommitDone' state.
FCH is in the middle of preparing the non-upgraded node for upgrade.
Wait for the 'CommitDone' state.
Config1
ACS_FCH_Server is executing deletion of resources
in the cluster database.
If the 'Config1' state persists for several minutes,
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is updating the cluster database.
Wait for the 'Supervision' state before continuing. If the state 'Config1' persists for
several minutes, go to fallback the FCH session.
Config1B
ACS_FCH_Server is executing update of resources
in the cluster database.
If the 'Config1B' state persists for several minutes,
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is updating the cluster database.
Wait for the 'Supervision' state before continuing. If 'Config1B' state persists for
several minutes, go to fallback the FCH session.
Config2
The command 'fchend' is executing deletion of old resources
in the cluster database non-upgraded node.
Config2B
The command 'fchend' is executing update to new resources
in the cluster database, non-upgraded node.
If the 'Config2B' state persists for several minutes,
then try to reboot the non-upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the non-upgraded node.
The command fchend is updating the cluster database. The node will reboot
when the update is finished.
Wait for the rebooting node to come back online before continuing. If the
'Config2B' state persists for several minutes, try rebooting the node you are
executing fchend(1m) on using the command prcboot(1m). If this does not work,
or if you are certain the command has been interrupted, go to fallback two node
restore.
Config3
ACS_FCH_Server or the command 'fchfb' is executing deletion
of resources in the cluster database.
If the 'Config3' state persists for several minutes,
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is falling back to the system existing prior to the start of the FCH session.
Wait for the upgraded node to reboot and come back online before continuing. If
the fallback has failed, (that is if you receive an error message from the
command fchfb(1m) or the 'Config3' state persists for several minutes), reboot
the upgraded node to force a fallback. If this too fails, or you are sure that the
fallback has been interrupted, please go to restore the upgraded node.
Config4
ACS_FCH_Server or the command 'fchfb' is executing addition
of old resources in the cluster database.
If the 'Config4' state persists for several minutes,
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is falling back to the system existing prior to the start of the FCH session.
Wait for the upgraded node to reboot and come back online before continuing. If
the fallback has failed, (that is if you receive an error message from the
command fchfb(1m) or the 'Config4' state persists for several minutes), reboot
the upgraded node to force a fallback. If this too should fail, or you are sure that
the fallback has been interrupted, please go to restore the upgraded node.
Config5
ACS_FCH_Server or the command 'fchfb' is executing update
to old resources in the cluster database, upgraded node.
If the 'Config5' state persists for several minutes,
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is in the middle of falling back to the configuration existing prior to the FCH
session.
Wait for the 'End' state. If the options -L, -c or -x were used in the command
fchstart, fchfb cannot complete the fallback since LBB software upgrade was
used. To complete the fallback you must use the command fchrst.
If the fchfb fails or if the 'Config5' state persists for several minutes, try rebooting
the upgraded node using the command prcboot. If this also fails to fallback the
node, go to restore the upgraded node.
Config6
ACS_FCH_Server is executing deletion of old resources in
Config6B
ACS_FCH_Server is executing update to old resources
in the cluster database, non-upgraded node.
Due to unavailability of the upgraded node,
the non-upgraded node is being made active.
Wait for the upgraded node to come back online.
If the 'Config6B' state persists for several minutes after the
upgrade node is back online and the ACS_FCH_Server is started,
then try to reboot the non-upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the non-upgraded node.
During execution of fchend, the active node failed or rebooted. The
non-upgraded node has temporarily taken over as active node.
When the upgraded node comes back online, go to complete fchend. If fchend
cannot be executed, go to fallback, restore needed.
End
A fallback has occurred. Please run the command 'fchend'
on the passive node to end the session.
FCH has finished a fallback of the session.
Go to end the FCH session.
EndInstallDone
The command 'fchend' has finished installation in the non-upgraded node.
The command fchend has finished installation of the non-upgraded node. The
node will reboot to activate the new software.
Wait for the rebooting node to come back online.
EndInstalling
The command 'fchend' is installing the non-upgraded node after commit.
The command fchend is installing software on the second node. If the command
fchend detects any problems during the installation process, error messages are
printed and the command initiates a local fallback.
If you are sure the command has failed, go to restore the non-upgraded node.
EndOrigNode
FCH is about to, or executing end of a failed FCH session.
Run the command 'fchend' if not already done so.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
During fchend the other (active) node has failed or rebooted. The fchend
command has been aborted. The non-upgraded node has returned to the state
before fchend was started, and is now operating as the active node until the
upgraded node comes back online.
Wait for the other node to come back online and become active. Then go to
execute fchend once more. If the active node does not come back online you
may need to restore it.
EndReboot
The FCH session is in a boot state (prior to end of the FCH session).
Wait for the rebooting node to come back online.
FCH is about to finish the session after final reboot during command fchend.
Execute the command fchstate with regular intervals until it reports that "No
function change session is ongoing". If this has not happened within 20 minutes,
the fchend may have failed. Go to restore the non-upgraded node and re-commit
the system.
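The "check at regular intervals, give up after 20 minutes" instruction for the 'EndReboot' state can be sketched as a small polling loop. This is an illustration under assumptions: read_state is a hypothetical callable that returns the text printed by fchstate (how that text is obtained is left open), and the 30 second interval is an arbitrary choice, not a value from the documentation.

```python
import time
from typing import Callable

# Text that fchstate prints when no session is ongoing. The comparison is
# case-insensitive because the capitalisation varies within this guide.
DONE_TEXT = "no function change session is ongoing"


def wait_for_session_end(read_state: Callable[[], str],
                         timeout_s: float = 20 * 60,
                         interval_s: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the session has ended; return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if DONE_TEXT in read_state().lower():
            return True
        if clock() >= deadline:
            # Timed out: per the text above, fchend may have failed.
            return False
        sleep(interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting. A False result means the 20 minute limit passed and the restore path described above should be considered.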
EndRebootDone
ACS_FCH_Server or the command 'fchend' has initiated a reboot successfully
and the FCH-session is about to end.
FCH is in the process of finishing the FCH session.
If you are sure that the session has failed, (that is, the 'EndRebootDone' state
persists for several minutes) go to restore the non-upgraded node and re-
commit the system.
EndWrongNode
Due to unavailability of the upgraded node, the non-upgraded node is active.
Wait for the upgraded node to come back online,
then run the command 'fchend' again.
If the upgraded node is still down after 20 minutes, try to start it
using the command 'fcc_reset other'.
During execution of fchend, the active node failed or rebooted. The
non-upgraded node has temporarily taken over as active node.
When the upgraded node comes back online, go to complete fchend. If fchend
cannot be executed, go to fallback.
Failed Supervision
Failure during Supervision detected. FCH is about to fallback.
[Waiting for {A|B}-node removal of resources from the Cluster DB.]
[Waiting for {A|B}-node addition of resources to the Cluster DB.]
[If the 'Failed Supervision' state persists for more than 20 minutes after reboot,
then try to reboot the {A|B}-node using the command 'prcboot'.]
A failure is detected during the Supervision period and FCH has started a
fallback. If the other node is down, FCH will wait until the other node is back
online and then continue the fallback.
Wait for the FCH session to complete the fallback.
Failover1
ACS_FCH_Server is about to failover.
If the failover has failed (the 'Failover1' state persists for several minutes),
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
The switchover to the new configuration is in progress.
Wait for the 'Supervision' state before continuing. This may take a few minutes. If
the 'Supervision' state has not appeared within 10 minutes, or you are certain
that the FCH session has failed, go to initiate a fallback. If this should fail, you
need to restore the upgraded node.
Failover2
ACS_FCH_Server is about to start-up the nodes.
First, the new active node, then the other.
If the failover has failed (the 'Failover2' state persists for several minutes),
try executing the command 'fchfb' to initiate a fallback.
FbEndReboot
A fallback has occurred and the FCH session is in a boot state
(prior to reactivation of old software on the passive node).
Wait for the 'CommitDone' state after the rebooting node is back online.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the non-upgraded node.
FCH is in the process of falling back the passive node after a failed or interrupted
fchend command.
If the 'FbEndReboot' state persists for more than 20 minutes, or you are sure that
FCH has failed, go to restore the non-upgraded node and re-commit the system.
FbEndRebootDone
ACS_FCH_Server or the command 'fchend' has initiated a reboot successfully
after a local fallback and the node is about to become committed again.
FCH is finishing up a fallback of the passive node after a failed or interrupted
fchend command.
If the 'FbEndRebootDone' state persists for several minutes, or if you are sure
that the FCH has failed, go to restore the non-upgraded node and re-commit the
system.
FbFailover1
ACS_FCH_Server or the command 'fchfb' is about to initiate a fallback of
the upgraded node and bringing all resources offline.
If the fallback has failed (the 'FbFailover1' state persists for several minutes),
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is falling back to the system existing prior to the start of the FCH session.
Wait for the upgraded node to reboot and come back online before continuing. If
the fallback has failed, (that is if you receive an error message from the
command fchfb or the 'FbFailover1' state persists for several minutes), reboot the
upgraded node to force a fallback. If this too should fail, or you are sure that the
fallback has been interrupted, please go to restore the upgraded node.
FbFailover2
ACS_FCH_Server or the command 'fchfb' is about to, or has executed start-up of
the old non-upgraded node.
If the printout from the command 'fchstate' indicates that an LBB upgrade
is going on, the command 'fchfb' cannot continue the fallback.
If the fallback fails (the 'FbFailover2' state persists for several minutes),
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is in the middle of falling back to the configuration existing prior to the FCH
session.
Wait for the 'End' state. If the options -L, -c or -x were used in the command
fchstart, fchfb cannot complete the fallback since LBB software upgrade was
used. To complete the fallback you must use the fchrst command.
If the fchfb should fail or if the 'FbFailover2' state persists for several minutes, try
rebooting the upgraded node using the prcboot command. If this also fails to fall
back the node, go to restore the upgraded node.
FbFailover3
A fallback has occurred and the FCH session is in a boot state
(before boot to activate old software).
If the fallback fails (the 'FbFailover3' state persists for several minutes),
then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is in the middle of falling back to the configuration existing prior to the FCH
session.
Wait for the 'End' state. If the options -L,-c or -x were used in the fchstart
command, fchfb cannot complete the fallback since LBB software upgrade was
used. To complete the fallback you must use the fchrst command.
If the fchfb should fail or if the 'FbFailover3' state persists for several minutes, try
rebooting the upgraded node using the prcboot command. If this also fails to
fallback the node, go to restore the upgraded node.
FbReboot
A fallback has occurred and the FCH session is in a boot state
FbReboot2
A fallback has occurred and the FCH session is in a boot state
(prior to activation of old software).
Wait for the rebooting node to come back online.
If the fallback fails, then try to reboot the upgraded node
using the command 'prcboot'.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the upgraded node.
FCH is in the middle of falling back to the configuration existing prior to the FCH
session.
Wait for the 'End' state.
If the options -L, -c or -x were used in the fchstart command, fchfb cannot
complete the fallback since LBB software upgrade was used. To complete the
fallback you must use the fchrst command.
If the fchfb should fail or if the 'FbReboot2' state persists for several minutes, try
rebooting the upgraded node using the prcboot command. If this also fails to
fallback the node, go to restore the upgraded node.
InitWrongNode
FCH is initiating activation of the non-upgraded node.
Due to unavailability of the upgraded node,
the non-upgraded node is being made active.
When the upgraded node is back online,
run the command 'fchend' again.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the non-upgraded node.
During execution of fchend, the active node failed or rebooted. The
non-upgraded node has temporarily taken over as active node.
When the upgraded node comes back online, go to complete fchend.
If fchend cannot be executed, go to fallback, restore is needed.
InitWrongNodeDone
FCH is about to start-up the non-upgraded node.
Due to unavailability of the upgraded node,
the non-upgraded node is being made active.
When the upgraded node is back online,
run the command 'fchend' again.
Else, if you are sure that restore is needed,
run the command 'fchrst' to restore the non-upgraded node.
During execution of fchend, the active node failed or rebooted. The
non-upgraded node has temporarily taken over as active node.
When the upgraded node comes back online, go to complete fchend.
If fchend cannot be executed, go to fallback, restore is needed.
Installing
The command 'fchstart' is installing software.
The command fchstart is installing software on the passive node.
If the command fchstart detects any problems during the installation process,
error messages are printed and the command initiates a fallback.
If this was an LBB software upgrade session (that is, the fchstart options -L, -c
or -x were used), go to node restore.
LbbReboot1
The FCH session is in a boot state.
When the node is back online, you should run the command
'fchstart -L [other args]...' again.
The FCH session is in a boot state during LBB upgrade.
Wait for the node to come back on line, then see how to continue. If you are sure
that the session has failed, and a restore is needed, go to fallback.
LbbReboot2
The FCH session is in a boot state.
When the node is back online, you should run the command 'fchend' again.
The FCH session is in a boot state during LBB upgrade.
Wait for the node to come back on line, and then see how to continue. If you are
sure that the session has failed, and a restore is needed, go to fallback.
Move1
The FCH session is in a state of failover.
If the failover has failed (the 'Move1' state persists for several minutes), try
executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node using the command 'prcboot'.
Else, if you are sure that restore is needed, run the command 'fchrst' to restore
the upgraded node.
FCH is in the middle of switching to the new configuration.
Wait for the 'Supervision' state before continuing. This may take a few minutes. If
this event has not appeared within 10 minutes, or you are certain that the FCH
session has failed, initiate a fallback. If this should fail, you need to restore the
upgraded node.
Move2
The FCH session is in a state of fallback.
If the fallback has failed (the 'Move2' state persists for several minutes),
try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node using the command 'prcboot'.
Else, if you are sure that restore is needed, run the command 'fchrst' to restore
the upgraded node.
FCH is falling back to the system existing prior to the start of the FCH session.
Wait for the upgraded node to reboot and come back online before continuing. If
the fallback has failed, (that is if you receive an error message from fchfb
command or the 'Move2' state persists for several minutes), reboot the upgraded
node to force a fallback. If this too should fail, or you are sure that the fallback
has been interrupted, please go to restore the upgraded node.
noFCH
No Function Change session is ongoing.
FCH-session started at <date-time> <result>
Last FCH-session started at <date-time> <result>
No function change is currently in progress.
The second and third lines are optional. <date-time> is the date and time when
the FCH-session started. <result> is "was successful", "failed" or nothing.
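When these optional lines are saved with the log files, the <result> part could be classified as in the sketch below. The function name is hypothetical, and because the exact <date-time> format is not given here, the code looks only at the end of the line and makes no attempt to parse the date.

```python
def session_result(line: str) -> str:
    """Classify the <result> part of an 'FCH-session started at ...' line."""
    # Tolerate surrounding whitespace and a trailing full stop.
    text = line.strip().rstrip(".").rstrip()
    if text.endswith("was successful"):
        return "successful"
    if text.endswith("failed"):
        return "failed"
    # The guide states that <result> may also be empty.
    return "not reported"
```

A line with no result is reported as "not reported" so that a missing result is not counted as a success.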
Reboot
The FCH session is in a boot state (prior to failover).
Wait for the rebooting node to come back online.
If the rebooting node is back online and the 'Reboot' state persists for more than
30 minutes, try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node using the command 'prcboot'.
Else, if you are sure that restore is needed, run the command 'fchrst' to restore
the upgraded node.
FCH is about to start switching to the new configuration.
Wait for the 'Supervision' state before continuing. This may take several minutes.
If this event has not appeared within 30 minutes, or you are certain that the FCH
session has failed, go to initiate a fallback. If this should fail, you need to restore
the upgraded node.
Reboot2
A fallback has occurred and the FCH session is in a boot state
(prior to or after activation of old software).
Wait for the rebooting node to come back online.
If the fallback fails, try executing the command 'fchfb' to initiate a fallback.
If 'fchfb' fails, then try to reboot the upgraded node using the command 'prcboot'.
Else, if you are sure that restore is needed, run the command 'fchrst' to restore
the upgraded node.
FCH is in the middle of falling back to the configuration existing prior to the FCH
session.
Wait for 'End' state and then go to fallback. If the options -L, -c or -x were used in
the fchstart command, fchfb cannot complete the fallback since LBB software
upgrade was used. To complete the fallback you must use the fchrst command.
If the fchfb should fail or if the 'Reboot2' state persists for more than 20 minutes,
try rebooting the upgraded node using the prcboot command. If this also fails to
fallback the node, go to restore the upgraded node.
Restore
Please follow the OPI: AP System Restore, Initiate.
When finished, run the command 'fchend' to end the session.
An FCH restore operation is in progress.
Wait for the command burrestore to complete before continuing.
Restore2
Please follow the OPI: AP System Restore, Initiate.
When finished, run the command 'fchcommit' again.
An FCH restore operation to restore the non-upgraded node after a failed fchend
is in progress.
Wait for the command burrestore to complete before continuing.
StartOrigNode
The FCH session is falling back the original node and making it active.
Supervision
SUPERVISION HAS BEGUN.
Be sure that all resources are online, then observe the system for a while.
Then run the command 'fchcommit' or 'fchfb' to continue the session.
FCH has successfully switched to the new system configuration.
Unknown state
Unknown state <state>. If this state persists for a long time you might have to
restore both nodes.
A serious error has occurred and FCH is in an unknown state! The system must
be restored.
Make a Trouble Report, see Operational Instruction AP, Trouble Report, Initiate.
Then restore the system, please see Operational Instruction AP, System
Restore, Initiate.
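Many of the states above share the same recovery ladder when they persist: try fchfb, then prcboot, then fchrst. The sketch below collects that ladder for a subset of states as a quick lookup. It is a reading aid built from the printouts in this appendix, not an Ericsson tool. The names are hypothetical, the per-state waiting times and LBB caveats are deliberately left out, and the full state description should always be read before acting.

```python
from typing import Dict, Tuple

# Recovery ladders, in the order the printouts suggest trying them.
UPGRADED_WITH_FALLBACK = ("fchfb",
                          "prcboot (upgraded node)",
                          "fchrst (upgraded node)")
UPGRADED_NO_FALLBACK = ("prcboot (upgraded node)",
                        "fchrst (upgraded node)")
NON_UPGRADED = ("prcboot (non-upgraded node)",
                "fchrst (non-upgraded node)")

ESCALATION: Dict[str, Tuple[str, ...]] = {}
for _state in ("Config1", "Config1B", "Config3", "Config4", "Config5",
               "Failover1", "FbFailover1", "FbFailover2",
               "Move1", "Move2", "Reboot", "Reboot2"):
    ESCALATION[_state] = UPGRADED_WITH_FALLBACK
for _state in ("FbFailover3", "FbReboot2"):
    ESCALATION[_state] = UPGRADED_NO_FALLBACK
for _state in ("Config2B", "Config6B"):
    ESCALATION[_state] = NON_UPGRADED


def escalation_for(state: str) -> Tuple[str, ...]:
    """Return the recovery ladder for a persisting state, or () if not listed."""
    return ESCALATION.get(state, ())
```

An empty result means the state is not covered by this subset (for example 'Supervision' or 'CommitDone'), not that no action is needed.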
This table is intended to give a brief overview of the relation between IOG
commands and APG40 (with IPN) commands. For some commands the nearest
alternative has been chosen. The syntax for all AP commands can be found in the
ALEX documentation.
AP command            IOG command
Common CP commands
OCINP:IPN=ALL;        EXSLP:SPG=0;
APAMP;
OCISI:IPN=1;          FCSLI:SPG=0;
OCIBE:IPN=1;          BLSLE:SPG=0,LINK=1;
alddef                ALALI
alddeblk              ALBLE
aldls                 ALALP
aldrm                 ALALE
aldpdef               ALDIC
aldprm                Alarm Display Property, Delete (No mapping)
aldpls                ALDIP
aldquiet              Bell Switch
aldtest               ALLTI
cpdinsert             ILSLI, MCDVI
cpdremove             ILSLR, MCDVR
cpdchange             MCDSC, MCDVC, ILSLC
cpdlist               ILNPP, MCDVP
cpdtest               ILLTI
exaldef               ALRDL, ALEXL
exalrm                ALEXE
exaldeblk             BLEAE
exalblk               BLEAI
exalch                External Alarm Receiver, Change (No mapping)
exalclear             ALEXR
exalls                ALEXP
Central Processor, CPS
bupprint              SYGPP
bupset                SYGPS
Maintenance, MAS
PTCOI, PTCOE, PTCPL etc. Processor test commands remain the same.
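For engineers who know the IOG command and need the AP one, the table can be read in the other direction. The sketch below shows one way to build such a reverse lookup. It contains only a few rows copied from the table above, the function names are hypothetical, and AP command names are written in lowercase.

```python
from collections import defaultdict
from typing import Dict, List

# A few rows from the table above: AP command -> IOG command(s).
AP_TO_IOG: Dict[str, List[str]] = {
    "alddef": ["ALALI"],
    "aldls": ["ALALP"],
    "aldrm": ["ALALE"],
    "cpdinsert": ["ILSLI", "MCDVI"],
    "cpdremove": ["ILSLR", "MCDVR"],
    "cpdchange": ["MCDSC", "MCDVC", "ILSLC"],
    "cpdlist": ["ILNPP", "MCDVP"],
    "bupprint": ["SYGPP"],
    "bupset": ["SYGPS"],
}


def build_reverse(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the table so that each IOG command lists its AP equivalents."""
    reverse: Dict[str, List[str]] = defaultdict(list)
    for ap_cmd, iog_cmds in mapping.items():
        for iog_cmd in iog_cmds:
            reverse[iog_cmd].append(ap_cmd)
    return dict(reverse)


IOG_TO_AP = build_reverse(AP_TO_IOG)


def ap_equivalent(iog_command: str) -> List[str]:
    """Look up the AP command(s) for an IOG command name such as 'ILSLI;'."""
    name = iog_command.strip().rstrip(";").upper()
    return IOG_TO_AP.get(name, [])
```

An empty list means the IOG command is not in this excerpt; the full table and the ALEX documentation remain the reference.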
ACS
acease (Alarms Cease) Manually ceases an alarm. Caution: The alarm can
be manually ceased without being repaired.
alist (Alarms List) Displays information about alarms in an AP.
alogact (Audit Log Activate) Activates the Audit Log function of an Adjunct
Processor (AP).
alogdeact (Audit Log Deactivate) Deactivates the Audit Log function of an
Adjunct Processor (AP).
alogexcl (Audit Log Exclude) Inserts or ends the exclusion status of Man-
Machine Language (MML) commands, MML command parameters,
MML printouts or UNIX commands in the Audit Log function of the
Adjunct Processor (AP).
alogexls (Audit Log Exclude List) Lists the excluded items that are contained
in the Audit Log of an Adjunct Processor (AP).
alogfind (Audit Log Find) Searches the logged data for items that
correspond to the specified search parameters.
aloglist (Audit Log Attributes List) Lists the attributes of the Audit Log
function of an Adjunct Processor (AP).
alogset (Audit Log Set) Specifies the attributes of the Audit Log function of
an Adjunct Processor (AP).
burbup (Backup) Initiates a backup to DAT tape.
burres (Backup Restore) Reads a backup tape and restores its contents to
the system.
burver (Backup Verify) Reads the archives on a backup tape and checks
that the tape is still readable.
dsdls (Directory Service List) Lists and displays the contents of the
registration file that is a part of the Directory Service function.
dslist (Directory Service List) Lists and displays the contents of the
registration file that is part of the Directory Service function;
information common for the directory service is printed.
fchend (Function Change End) Terminates a Function Change
session.
fchfb (Function Change Fallback) Reverts the system to its state before a
Function Change.
fchstart (Function Change Start) Initiates a Function Change session
fchsudo (Function Change Set User) Enables execution of commands as
another user in the restricted Function Change environment.
ispprint (In Service Performance Print) Prints the process control In Service
Performance (ISP) log to standard output or makes some simple
statistics and prints the result to standard output.
mktr (Make Trouble Report) Reports and collects additional information
to be used when trouble reports are received.
msdls (Message Store List) Lists and displays the contents of a
message store.
phacreate (Parameter Create) Creates the parameter tables for a CXC
specified in a parameter file (used at installation).
phaprint (Parameter Print) Reads and prints one parameter, a group of
parameters, or all parameters to standard output.
CPS
bupprint (Backup Parameter Print) Initiates a printout of parameters that are
used for backup generation handling and command log handling
(effective in connection with reload).
bupset (Backup Parameter Set) Sets up parameters for system backup
generation handling and command log handling.
FMS
afpchd (AP File Processing Change Definition) Changes the definition of a
file; inserts or removes a destination, and/or changes the remove
delay of a file in the Adjunct Processor File Processing (AFP).
afpdef (AP File Processing Define) Defines composite main files, simple
files and UNIX files to AFP.
afpls (AP File Processing List) Lists all files that are defined or reported
to AFP.
afprep (AP File Processing Report) File reports a subfile to Adjunct
Processor File Processing (AFP).
afprm (AP File Processing Remove) Removes UNIX files, composite main
files, subfiles, or simple files from Adjunct Processor File
Processing (AFP).
cpfchange (Central Processor File Change) Changes attributes of an infinite
file.
cpfcp (Central Processor File Copy) Copies a CP file to another CP file. A
simple file or a subfile may be copied to a simple file or a subfile.
cpfdf (Central Processor File Display File) Lists all volumes used by the
Central Processor (CP) file system.
cpfife (Central Processor File Infinite File End) Changes the active subfile
of an infinite file.
cpfls (Central Processor File List) Prints the names and attributes of one
or all files.
cpfmkfile (Central Processor File Make File) Creates a main file or subfile in
the CP file system.
cpfmv (Central Processor File Move File) Moves a CP file to another
volume. The file can be simple or composite.
cpfport (Central Processor File Port) Copies a CP file to a physical file path
(export), or copies a physical file or directory to a CP file (import).
FOS
foscopy (Format Output Subsystem Copy) Copies messages currently
being read by the Flexible Formatter from the Adjunct Computer
Subsystem Message Store to a file.
foslview (Format Output Subsystem Log View) Retrieves and decodes the
contents of an unprocessed data log-file.
fosview (Format Output Subsystem View) Shows information about the
Format and Output Subsystem (FOS) application on a user's
terminal.
MCS
cpdchange (Central Processor Device Change) Changes the attributes for the
specified parameters of a defined I/O device.
cpdinsert (Central Processor Device Insert) Initiates a PDS program
associated with the I/O (used when an operator inserts a new I/O
device into the CP).
cpdlist (Central Processor Device List) Lists the attributes of defined or
active I/O devices.
cpdremove (Central Processor Device Remove) Removes an I/O device from
the Central Processor (CP) and ends the Printout Destination
Services (PDS) program.
mml (Man-Machine Language) Executes MML commands.
pds (Printout Destination Services) Initiates the Printout Destination
Services (PDS) when used with command cpdinsert(1m).
PKG
pkgadd (Package Add) Adds operating system packages to a
specified directory.
pkgchk (Package Check) Checks the accuracy of installed files or, by use
of the -l option, displays information about package files.
pkginfo (Package Information) Determines the version of the specified
package in the host operating system.
pkgrm (Package Remove) Removes operating system packages from a
specified directory.
STS