Setting Up a Failover Solution for CentraSite
This description applies to CentraSite on UNIX. For a basic introduction to CentraSite HA failover,
please refer to "CentraSite Failover Basics".
Audience
The intended audience is experienced CentraSite administrators with an in-depth knowledge of
CentraSite internals who want to integrate CentraSite into a cluster environment.
Conventions
Symbol  Usage
$       Bourne/Korn/Bash shell prompt
#       Superuser prompt
Note
File paths in this description are the file paths that are used in a default installation. A non-standard
installation directory may have been specified in your installation, so the file paths may be different.
General Description
Architectural Considerations
The CentraSite Registry/Repository (CRR) database consists of files in the directory
/opt/softwareag/CentraSite/data (in a default installation). These files have extensions such as
.1D0, .1I0 and .1J0 and are called (database) containers.
Most of the CRR configuration data is contained in the file /opt/softwareag/CentraSite/cfg/regfile
(called the "Internal Configuration Store", ICS). For the CRR, the ICS contains information about the
structure of the database, such as the sizes and locations of the containers, as well as the settings for
server parameters. The database is controlled by a daemon process named inosrv (located in
/opt/softwareag/CentraSite/bin). Processes like inosrv locate the ICS via the environment variable
REGFILE, which contains the path of the ICS file.
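Because every CentraSite process depends on REGFILE, a script can confirm that the variable is set and refers to a readable file before it starts anything. The following is a minimal sketch only; the function name check_regfile is hypothetical and not part of CentraSite.

```shell
# check_regfile: hypothetical helper, not part of CentraSite.
# Succeeds only if REGFILE is set and names a readable regular file
# (a symbolic link to a readable file also passes).
check_regfile() {
    if [ -z "${REGFILE:-}" ]; then
        echo "REGFILE is not set" >&2
        return 1
    fi
    if [ ! -f "$REGFILE" ] || [ ! -r "$REGFILE" ]; then
        echo "REGFILE does not name a readable file: $REGFILE" >&2
        return 1
    fi
    return 0
}
```

A cluster script could call check_regfile after sourcing the CentraSite environment and abort if it fails.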
Client requests for CRR are directed to the CentraSite registry/repository by the CentraSite Application
Server Tier (CAST). Multiple CAST installations are possible. CAST is not required to be under cluster
control since a CAST can be installed on each cluster node.
A prerequisite for accessing a shared Registry/Repository is that the CentraSite data directory on
each node is a symbolic link with the same name as the data directory used for a local
Registry/Repository. This can be achieved either by using the same root directory name when
installing CentraSite on each node, or by using the same directory name for the database locations
and moving the database spaces to this directory on both nodes.
[Figure 1 diagram: nodes cen1 and cen2, each with its own REGFILE and its own CentraSite RR.]
Figure 1: The environment variable REGFILE points to a local file containing the internal configuration
where a local directory is used for database spaces.
[Figure 2 diagram: nodes cen1 and cen2 use one REGFILE and one CentraSite RR located on the shared disk.]
Figure 2: The environment variable REGFILE points to a symbolic link to a shared file containing the
internal configuration where a symbolic link to a shared directory is used for database spaces.
CentraSite Registry/Repository Failover Basics
All the objects in the CRR are organized on a lower database layer in XML structures (basically
collections containing XML documents). In a failover configuration, the correctness of the
CRR is verified by monitoring the CRR database regularly. The monitoring check is based on an XML
document that is updated regularly. (In this context, “regularly” normally means about every 30 seconds,
but this depends on the failover requirements. With a 30-second interval, an error in the CRR is
detected after 30 seconds at the latest.) This XML document is part of a collection. It is
recommended to keep this XML document in the "ino:etc" collection.
Cluster Topology
A failover scenario for CRR is based on a cluster with two or more nodes and a disk shared by all the
participating nodes. CRR is active on only one node at a time; this is called "active-passive
clustering".
Basically each of the cluster nodes has at least one IP address which identifies the node. Clustered
applications have a different IP address, which is dynamically assigned to the cluster node on which
the application is currently active. This allows a client always to use a fixed IP address when
connecting to a clustered application. The actual node on which the clustered application is running at
a given point in time is transparent to the client. The CRR database server as a clustered application
therefore has an IP address associated with it. This IP address is assigned by the cluster software to
the node on which the CRR database server is running. This IP address is called the "CRR virtual IP
address" throughout the rest of this description.
The CRR can also be clustered in virtualization environments (for example Oracle/Solaris zones,
IBM/AIX logical partitions or VMware). In this case the CRR runs within a virtual host (e.g. a zone) and
the complete virtual host is switched from one physical host to another. Such setups do not require all
the steps listed below. For example, if the disk is kept within the machine, it might not be necessary to
perform all the steps to bring the CRR database and ICS to the shared disk. A combination of virtual
hosts and shared disks might also be used (e.g. the CentraSite installation is in a virtual host and the
shared files are on a separate shared disk).
The ICS and the CRR database containers have to be shared by all the nodes in a cluster. The shared
disk contains:
CRR must be installed on each of the nodes in a location with an identical path. In this description a
default installation on "/opt/softwareag/CentraSite" is assumed.
This step is only required if more than two nodes are participating in the cluster.
Step 1.3: Ensure that an ActiveSOA license is installed on each participating node.
This step inserts the XML document that is updated to monitor the CRR database server
status into the collection "ino:etc":
$ . centrasite_setenv.sh
$ inmham -createtimestamp CentraSite ino:etc AdminUser AdminUserPassword
AdminUser and AdminUserPassword are placeholders for the credentials of a real CentraSite administration user.
At the point of installation the default administration user is "Administrator". The command should
return with the exit value "0".
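In a setup script, the exit value can be checked explicitly so that a failed insert does not go unnoticed. The sketch below wraps the call shown above; the function name create_monitor_document is hypothetical, and the user name and password are supplied by the caller.

```shell
# create_monitor_document: hypothetical wrapper around the documented
# "inmham -createtimestamp" call. $1 is the administration user, $2 the
# password. The database name and collection follow this description.
create_monitor_document() {
    user=$1
    password=$2
    inmham -createtimestamp CentraSite ino:etc "$user" "$password"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "inmham -createtimestamp failed with exit value $rc" >&2
    fi
    return "$rc"
}
```

The CentraSite environment must be set first (by sourcing centrasite_setenv.sh), after which the function can be called with the real administration credentials.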
In a default installation, the CRR database server process (inosrv) is started automatically when the
machine on which it is installed is booted. In this case a start/stop script can be found in /etc/init.d.
For CRR under cluster control, an automatic start (i.e. a start independent of the cluster software)
must be prevented. The CRR server should be started by the cluster software, and the automatic start
of the registry/repository daemon should be removed.
Assuming the start/stop script in /etc/init.d is "sag2inm97", this can be achieved by the following
command (in the /opt/softwareag/CentraSite/bin directory):
Step 3.1: Ensure that the CRR database server on cen1 is stopped
The shared disk must be imported and the shared file system must be mounted on cen1 for this
action.
On node cen1 copy the CRR database contents to the filesystem on the shared disk.
Throughout this document we use "/FS/fsha" as the file system on the shared disk.
$ mkdir -p /FS/fsha/opt/softwareag/CentraSite/data
$ cp -pR /opt/softwareag/CentraSite/data/* /FS/fsha/opt/softwareag/CentraSite/data
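Before the local data directory is renamed in the following steps, it may be worth confirming that the copy on the shared disk is complete. The following sketch compares file contents in both trees with diff -r; the function name verify_copy is hypothetical, and the check does not compare ownership or permissions.

```shell
# verify_copy: hypothetical helper. Compares two directory trees and
# succeeds only if diff finds no differing or missing files.
verify_copy() {
    src=$1
    dst=$2
    if diff -r "$src" "$dst" >/dev/null 2>&1; then
        echo "copy verified: $dst matches $src"
    else
        echo "copy differs: compare $src and $dst" >&2
        return 1
    fi
}
```

With the paths used in this description, the call would be verify_copy /opt/softwareag/CentraSite/data /FS/fsha/opt/softwareag/CentraSite/data, run while the CRR database server is stopped.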
Step 4.1: On each of the participating nodes: Stop the CRR database server
Step 4.2: Recommended: On each of the participating nodes: Back up the CRR database
It is recommended to back up the "data" directory on each node before creating the link to the shared
location in the next steps.
The backup can be created by executing the following command on each of the nodes:
$ mv /opt/softwareag/CentraSite/data /opt/softwareag/CentraSite/data.orig
Step 4.3: On each of the participating nodes: Create a symbolic link from the original
"/opt/softwareag/CentraSite/data" directory to the mount point on the shared disk.
$ ln -s /FS/fsha/opt/softwareag/CentraSite/data /opt/softwareag/CentraSite/data
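Since the link has to be correct on every node, a small check can be run on each of them. This is a sketch under the assumption that a readlink utility is available on the platform; the function name check_data_link is hypothetical.

```shell
# check_data_link: hypothetical helper. Succeeds only if $1 is a symbolic
# link whose target is exactly $2. Assumes a readlink utility is available.
check_data_link() {
    link=$1
    expected=$2
    if [ ! -L "$link" ]; then
        echo "$link is not a symbolic link" >&2
        return 1
    fi
    actual=$(readlink "$link")
    if [ "$actual" != "$expected" ]; then
        echo "$link points to $actual, expected $expected" >&2
        return 1
    fi
    return 0
}
```

For the default paths, the call would be check_data_link /opt/softwareag/CentraSite/data /FS/fsha/opt/softwareag/CentraSite/data.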
Step 5: Share the Internal Configuration Store (ICS)
The internal configuration store is contained in the file "/opt/softwareag/CentraSite/cfg/regfile". Change
the CentraSite installations to refer to a common, shared ICS (which is active on at most one node at
any given point in time).
$ mkdir -p /FS/fsha/opt/softwareag/CentraSite/cfg
$ cp /opt/softwareag/CentraSite/cfg/regfile /FS/fsha/opt/softwareag/CentraSite/cfg/regfile
$ cp /opt/softwareag/CentraSite/cfg/regfilebck /FS/fsha/opt/softwareag/CentraSite/cfg/regfilebck
$ mv /opt/softwareag/CentraSite/cfg/regfile /opt/softwareag/CentraSite/cfg/regfile.orig
$ ln -s /FS/fsha/opt/softwareag/CentraSite/cfg/regfile /opt/softwareag/CentraSite/cfg/regfile
$ mv /opt/softwareag/CentraSite/cfg/regfilebck /opt/softwareag/CentraSite/cfg/regfilebck.orig
$ ln -s /FS/fsha/opt/softwareag/CentraSite/cfg/regfilebck /opt/softwareag/CentraSite/cfg/regfilebck
If the internal user repository is used, its file should be shared too. This can be done
by applying steps 5.1 to 5.3 to the internal user repository file /opt/softwareag/common/conf/users.txt.
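The same copy, rename and link sequence applies to each shared file (regfile, regfilebck and, if used, users.txt), so it can be expressed once as a function. The following is a sketch only: the name share_file is hypothetical, and skipping the copy when the shared file already exists is an assumption made here so that a second node does not overwrite the shared copy.

```shell
# share_file: hypothetical helper that applies the copy / rename / link
# sequence to one file. $1 is the local path, $2 the path on the shared disk.
share_file() {
    local_path=$1
    shared_path=$2
    # Create the parent directory on the shared disk, not the file name itself.
    mkdir -p "$(dirname "$shared_path")" || return 1
    # Assumption: copy only if the shared file does not exist yet, so that
    # running this on a further node keeps the existing shared copy.
    if [ ! -e "$shared_path" ]; then
        cp "$local_path" "$shared_path" || return 1
    fi
    # Keep the local original as a backup, then link to the shared file.
    mv "$local_path" "$local_path.orig" || return 1
    ln -s "$shared_path" "$local_path"
}
```

With the shared file system mounted and the CRR database server stopped, the call for the ICS would be share_file /opt/softwareag/CentraSite/cfg/regfile /FS/fsha/opt/softwareag/CentraSite/cfg/regfile.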
For CentraSite, the commands shown below provide the required features. These commands are part
of a CRR installation and are located in the directory "/opt/softwareag/CentraSite/bin". They can be
used in scripts for onlining, monitoring, offlining, and cleaning. An example script shows how
to use these commands for the different entry points that most cluster servers provide. Since the
scripts differ between cluster servers, please review the sample script and adapt it to
the specific requirements of your cluster server. For details of the script, please refer to the document
"CentraSite HA Sample Script".
Starting/Onlining:
Monitoring:
Stopping/Offlining:
The "shutdownrollback" version shuts down the CRR database server by rolling back all open
transactions.
The "shutdownemergency" version enforces a hard shutdown of the CRR database server and writes a
CentraSite dump; it is comparable to a hard kill.
Cleaning:
This command writes the process ID of the CRR database server to stdout. The PID can be passed as a
parameter to the kill command to make sure that the CRR database server process is really stopped.
inoosr -k
This command cleans up possible leftovers (like shared memory segments) after a kill or an
unexpected termination of the CRR database server.
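The two cleaning actions can be combined in one function for a cluster "clean" entry point. This is a minimal sketch: the function name clean_crr is hypothetical, the PID is expected to come from the process ID command described above, and the choice of the KILL signal is an assumption, since this description only mentions the kill command.

```shell
# clean_crr: hypothetical helper for a cluster "clean" entry point.
# $1 is the PID reported for the CRR database server ("0" or empty if unknown).
clean_crr() {
    pid=${1:-0}
    # Skip the kill for PID 0 (unknown) or if no such process exists.
    if [ "$pid" != "0" ] && kill -0 "$pid" 2>/dev/null; then
        # Assumption: KILL is used here; the text only says "kill".
        kill -s KILL "$pid" 2>/dev/null
    fi
    # Remove leftovers such as shared memory segments.
    inoosr -k
}
```

The function returns the exit status of inoosr -k, so the calling script can report a failed cleanup to the cluster software.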
The CRR database server must be started after the file system on the shared disk is mounted and
after the CRR virtual IP address is assigned.
The CRR database server must be stopped before the CRR virtual IP address is unassigned.
CentraSite High-Availability Monitor (inmham)
The CentraSite installation provides a tool that supports integration of the CRR in a failover cluster. It
is "inmham", and its location in a default installation is "/opt/softwareag/CentraSite/bin". This section
describes the features of the tool.
Just calling
$ . centrasite_setenv.sh
$ inmham
shows the help output including all the options and parameters. inmham requires the CentraSite
environment to be set, therefore before executing inmham the file "centrasite_setenv.sh" should
always be sourced. This sets environment variables like REGFILE which are needed when accessing
the ICS.
This command is used to check that the CRR database server is up and working correctly. It
updates the XML document "inmham" in the collection <collection>. A successful response must be
returned before the timeout specified by <timeout in sec> expires. If the result is wrong or the
timeout expires before a response is received, a non-zero exit value is returned.
parameter           explanation
dbname              only "CentraSite" allowed
collection          the name of a collection for XML documents, "ino:etc" is recommended
dbuser              the user name of a CentraSite administration user
dbuserpassword      the password of the CentraSite administration user (see above)
timeout in seconds  the time frame in which a successful monitor request must be completed
Stops the CRR database server and rolls back all open transactions.
A non-zero exit value indicates that the shutdown failed.
parameter           explanation
dbname              only "CentraSite" allowed
Stops the CRR server immediately. This is a "hard" shutdown. If the CRR database server is not
successfully shut down within the timeframe <timeout in sec>, a non-zero exit value is returned.
parameter           explanation
dbname              only "CentraSite" allowed
timeout in seconds  the time frame in which the CRR database server is expected to be stopped
Writes the process ID of the CRR database server (inosrv) to stdout. If the PID cannot be determined,
"0" is written to stdout and the exit value is non-zero.
parameter           explanation
dbname              only "CentraSite" allowed
inmham -createtimestamp <dbname> <collection> <dbuser> <dbuserpassword>
parameter           explanation
dbname              only "CentraSite" allowed
collection          the name of a collection for XML documents, "ino:etc" is recommended
dbuser              the user name of a CentraSite administration user
dbuserpassword      the password of the CentraSite administration user (see above)
exit value  explanation
0           Success
4           The inmham XML document could not be inserted.
5           The inmham XML document cannot be updated. The CRR database server is down, hung up, or not available.
6           Parameter error: invalid syntax of the inmham command (e.g. too few parameters specified).
7           The shutdown of the CRR database server failed. One possible reason is that the CRR database server is hung up.
8           The process ID of the CRR database server cannot be determined.
9           The shutdown was not completed in time (i.e. before the timeout occurred).
10          License error, the license is not ActiveSOA.
11          Access to the CRR database server failed.
12          Unauthorized access, invalid user/password combination.
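In cluster scripts it can help to log a readable message for each exit value. The following sketch maps the values from the table above to short messages; the function name inmham_exit_message is hypothetical, and the message wording is abbreviated from the table.

```shell
# inmham_exit_message: hypothetical helper that turns an inmham exit value
# into a short log message, following the exit value table.
inmham_exit_message() {
    case "$1" in
        0)  echo "success" ;;
        4)  echo "inmham XML document could not be inserted" ;;
        5)  echo "inmham XML document cannot be updated" ;;
        6)  echo "parameter error" ;;
        7)  echo "shutdown failed" ;;
        8)  echo "process ID cannot be determined" ;;
        9)  echo "shutdown not completed before timeout" ;;
        10) echo "license error, license is not ActiveSOA" ;;
        11) echo "access to the CRR database server failed" ;;
        12) echo "unauthorized access" ;;
        *)  echo "unknown exit value: $1" ;;
    esac
}
```

A script would typically capture $? directly after an inmham call and pass it to this function when writing to its log.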