SP Emergency Shutdown and Startup Procedures

1. Introduction
This document describes the steps needed to shut down and
start up the SP in an emergency.

2. Shutdown and Power Off

2.1 Shut down all nodes and Control Workstations


Please refer to the Rshutdown documentation located at
G:\TCS\TCS3\TCS32\PROJECTS\RSHUT\RSHUT.DOC for a description of this step.

2.2 Power Off


The Control Workstations will switch themselves off when they have completed their
shutdowns. The SP nodes will also switch off, but only logically. If power to the frames is
cut and then restored, the nodes will power on immediately, with no control over the
sequence.

Therefore the nodes need to be powered off manually. Open the fronts of the frames and put
the node power switches in the down position. Thin and wide nodes have one switch. High
nodes have two, and both need to be off. Then turn off power to the frames. The frame power
switch is located on the front, low on the left side (the small red box with the clear
plastic sliding cover).

Then cut the power to the computers by pulling the power cords of the Control Workstations
and tripping the main circuit breakers for the frames. The frame circuit breakers are
located under the floor to the left of the 3494 Tape Library. Each frame has its own
breaker, and the breakers are numbered one to three.

Finally, turn off and cut the power to all peripheral equipment, such as the 3494 Tape
Library, 9015 Racks, etc.

3. Power On and Startup

3.1 Power on frames and peripheral equipment


The equipment is powered on in the reverse order of powering off. First power on all
peripheral equipment. Then restore power to the frames by switching on the main circuit
breakers. Switch on the frames but leave the nodes powered off for now.

3.2 Control Workstations


Power on and start up the Control Workstations. When the AIX startup is complete on the
primary Control Workstation, log in as root and start HACMP with smitty clstart. Make sure
that the panel shows the following:

                          Start Cluster Services

    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.

                                                      [Entry Fields]
    * Start now, on system restart or both            now              +
      BROADCAST message at startup?                   false            +
      Startup Cluster Lock Services?                  false            +
      Startup Cluster Information Daemon?             true             +

    F1=Help        F2=Refresh     F3=Cancel      F4=List
    F5=Reset       F6=Command     F7=Edit        F8=Image
    F9=Shell       F10=Exit       Enter=Do

Monitor the HACMP startup progress by tailing the cluster log file /var/adm/cluster.log.
When the HACMP cluster is up, HACWS will be brought up automatically. This takes two or
three minutes. Monitor the progress with the following command:

# lssrc -a | grep spcw201
    sdr.spcw201       sdr     22762   active
    hb.spcw201        hb      21740   active
    hags.spcw201      hags    40710   active
    hagsglsm.spcw201  hags    8574    active
    haem.spcw201      haem    40700   active
    hr.spcw201        hr      21450   active
    pman.spcw201      pman    19546   active
    pmanrm.spcw201    pman    17174   active
    hats.spcw201      hats    37316   active
    Emonitor.spcw201  emon            inoperative

All daemons except Emonitor should be active when HACWS is ready.
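
The readiness check above can be scripted. The following is a minimal sketch, assuming the lssrc output keeps the layout shown in the listing (subsystem name first, status last). The function name hacws_ready is hypothetical and is not part of PSSP or HACWS.

```shell
#!/bin/sh
# Hypothetical helper (not a PSSP command): reads `lssrc -a` output on stdin
# and succeeds only if every subsystem whose name contains ".<suffix>" is
# active. Emonitor is skipped because it is expected to be inoperative.
# Assumption: the status is the last field on each line.
hacws_ready() {
    suffix="$1"
    awk -v suffix="$suffix" '
        # Ignore subsystems that do not belong to this Control Workstation
        index($1, "." suffix) == 0 { next }
        { seen++ }
        # Emonitor does not need to be active
        $1 ~ /^Emonitor\./ { next }
        $NF != "active" { print "not active: " $1; bad++ }
        # Fail if nothing matched or if any subsystem is not active
        END { exit ((seen == 0 || bad > 0) ? 1 : 0) }
    '
}
```

On the primary Control Workstation this could be run as `lssrc -a | hacws_ready spcw201`; a zero exit status means all listed daemons except Emonitor are active.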

Then start HACMP on the secondary Control Workstation. Since this is a backup for the
primary, no applications will be started, so the startup is much faster. Monitor the
progress in the cluster log file.

3.2 SP Nodes

3.2.1 Power On the Nodes


Verify that the frames and the switches are all powered on and functioning before proceeding
here. Do this with the command spmon -G -diag:

    .
    .
    .
    4. Checking frames

            Controller  Slot 17  Switch  Switch    Power supplies
    Frame   Responds    Switch   Power   Clocking  A    B    C    D
    ----------------------------------------------------------------
      1     yes         yes      on      0         on   on   on   N/A
      2     yes         yes      on      0         on   on   on   N/A
      3     yes         yes      on      0         on   on   on   N/A

    .
    .
    .
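
The frame check can also be done by a script rather than by eye. This is a rough sketch, assuming the frame section of the spmon output looks like the excerpt above: a "Checking frames" heading followed by one row per frame with the frame number first. The function name frames_ok is hypothetical, and requiring "yes" in the Slot 17 Switch column is an assumption that fits this installation, where every frame has a switch.

```shell
#!/bin/sh
# Hypothetical helper (not a PSSP command): reads `spmon -G -diag` output on
# stdin and succeeds only if every frame row shows Controller Responds = yes,
# Slot 17 Switch = yes and Switch Power = on.
frames_ok() {
    awk '
        # Start of the frame section
        /Checking frames/ { in_frames = 1; next }
        # Any later "N. Checking ..." heading ends the frame section
        in_frames && /^ *[0-9]+\. Checking/ { in_frames = 0 }
        # Frame rows start with a bare frame number
        in_frames && $1 ~ /^[0-9]+$/ {
            rows++
            if ($2 != "yes" || $3 != "yes" || $4 != "on") {
                print "frame " $1 " not ready"
                bad++
            }
        }
        # Fail if no frame rows were found or any frame is not ready
        END { exit ((rows == 0 || bad > 0) ? 1 : 0) }
    '
}
```

A possible use is `spmon -G -diag | frames_ok && echo "frames ready"`, run before any node is powered on.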

Power on nodes one and three in frame one (the left-most frame). Monitor the progress with
spmon -diag. Wait for Host Responds to come on before proceeding.

3.2.2 Clock Topology


Now it is time to set up the switch clock topology. This needs to be done if the power has
been completely cut from the frames. Use the following command:

# Eclock -f /etc/SP/Eclock.top.3nsb.0isb.0

Make sure to give the right filename in the above command. Verify the correct setup with
spmon -G -diag:

# spmon -G -diag
    .
    .
    .
    4. Checking frames

            Controller  Slot 17  Switch  Switch    Power supplies
    Frame   Responds    Switch   Power   Clocking  A    B    C    D
    ----------------------------------------------------------------
      1     yes         yes      on      0         on   on   on   N/A
      2     yes         yes      on      3         on   on   on   N/A
      3     yes         yes      on      3         on   on   on   N/A

    5. Checking nodes
    .
    .
    .
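
To compare the clocking values without reading the whole report, the Switch Clocking column can be pulled out of the frame section. The sketch below assumes the report layout shown above, with the clocking value in the fifth column of each frame row. The function name frame_clocking is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not a PSSP command): reads `spmon -G -diag` output on
# stdin and prints the Switch Clocking value of each frame, in the order the
# frames are listed, separated by spaces on a single line.
frame_clocking() {
    awk '
        # Start of the frame section
        /Checking frames/ { in_frames = 1; next }
        # Any later "N. Checking ..." heading ends the frame section
        in_frames && /^ *[0-9]+\. Checking/ { in_frames = 0 }
        # Frame rows start with a bare frame number; column 5 is the clocking
        in_frames && $1 ~ /^[0-9]+$/ { printf "%s%s", sep, $5; sep = " " }
        END { print "" }
    '
}
```

For the topology file used above, the report shows clocking 0 for frame one and 3 for frames two and three, so a check could be written as `[ "$(spmon -G -diag | frame_clocking)" = "0 3 3" ]`. The expected values depend on the topology file in use.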

3.2.3 Switch Startup


Now the switch can be started with the command Estart. Verify with spmon -diag that nodes
one and three show Switch Responds.

3.2.4 Fileserver Cluster

Log in to nodes one and three (both can be done at the same time) and start HACMP with
smitty clstart:

                          Start Cluster Services

    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.

                                                      [Entry Fields]
    * Start now, on system restart or both            now              +
      BROADCAST message at startup?                   false            +
      Startup Cluster Lock Services?                  false            +
      Startup Cluster Information Daemon?             true             +

    F1=Help        F2=Refresh     F3=Cancel      F4=List
    F5=Reset       F6=Command     F7=Edit        F8=Image
    F9=Shell       F10=Exit       Enter=Do

Monitor the progress in the cluster log file.

3.2.5 ADSM Cluster


Switch on node one in frame two (ADSM server) and node one in frame three (EDMS
server). Monitor the AIX startup with spmon -diag. When Host Responds comes on, the
nodes are ready.

Make sure that the switch activates on both nodes. If the nodes were brought down cleanly,
they should come onto the switch automatically; otherwise the nodes are fenced. In that
case, unfence them with the Eunfence command from the Control Workstation.

Start HACMP on the ADSM node with smitty clstart, and make sure that the settings are
the same as for the Fileserver Cluster. Monitor the cluster log to see when the startup is
complete. Then do the same on the EDMS node.

3.2.6 DataWarehouse

The subsurface nodes use the DataWarehouse node as a fileserver for the Oracle software,
so it needs to be up first. This is node five in frame three. Switch on the node and monitor
the AIX startup. When Host Responds comes on, the node is ready. Make sure that it comes
onto the switch, unfencing it if necessary.

3.2.7 Subsurface and the Rest


The subsurface nodes have no interdependencies, so they can all be started at the same time.
Switch on the nodes and monitor the AIX startup (as with the DataWarehouse node), and
make sure that they come onto the switch.

Any remaining nodes, such as the Web Server, can also be brought up at this point.
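
Several steps in this procedure wait for a condition (Host Responds, Switch Responds, daemons becoming active) before the next step may begin. A generic polling loop with a timeout is one way to script such waits. The following is only a sketch: the function name wait_for is hypothetical, and the command that tests the condition is site-specific and must be supplied by the operator.

```shell
#!/bin/sh
# Hypothetical helper: wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Runs COMMAND every INTERVAL seconds until it succeeds. Gives up and
# returns 1 once at least TIMEOUT seconds of waiting have passed.
wait_for() {
    timeout="$1"
    interval="$2"
    shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

For example, `wait_for 600 15 check_node_up` would poll every 15 seconds for up to ten minutes, where check_node_up stands for an operator-written script (not an existing command) that exits with status zero once the node is ready.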