
Creating a Red Hat Cluster: Part 1

This is the first in a series of articles demonstrating how to create a Linux cluster using the Red Hat/CentOS 5.5 distribution. When we created our first cluster at the office, we searched the internet for Red Hat cluster setup information. To my surprise, we could not find many articles or user experiences on the topic. So I hope this series of articles will benefit the community of users trying, or wanting, to create their own cluster.

The cluster hardware
Our cluster will have three HP servers. Each has 4GB of memory, 36GB of mirrored internal disks, one QLogic fibre card connected to our SAN, two network cards and, for our fencing device, the on-board HP iLO (Integrated Lights-Out). This is my setup; yours does not need to be the same. You do not need HP servers to build a cluster, you do not need mirrored disks (although they are recommended) and you do not need a SAN infrastructure either (an NFS share can also be used). One thing I do recommend is a fencing device. On HP servers there is a network port on the back of each server called "iLO", which allows the cluster software to power on, power off or restart a server remotely. The Red Hat cluster package lets you use many similar fencing devices. This part of the cluster is important, because fencing prevents a failed node from writing to a non-shareable filesystem and corrupting data. If you do not have a fencing device, you can always use the manual fencing method; it works, but it is not supported.
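As a quick sketch of what the fencing device gives you: the RHEL 5 fence agents include fence_ilo, which can query or cycle a server's power through its iLO port. The ilo_* names and the credentials below are placeholders, not real values, and the commands are only printed (dry-run) so the sketch is safe to try anywhere:

```shell
# Dry-run: print the iLO status checks we could run for each node.
# Remove the leading `echo` on a real cluster node to query the iLO
# boards for real; -o also accepts `on`, `off` and `reboot`.
for ilo in ilo_gandalf ilo_gollum ilo_bilbo; do
    echo fence_ilo -a "$ilo" -l Administrator -p 'secret' -o status
done
```

This is the same agent the cluster software calls automatically when it decides a node must be fenced.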

The Setup
Our cluster will contain three servers: two active and one passive. For our example, one server will run an HTTP service and the second will be an FTP server. The third server will be used as a fail-over server: if the first or second server has a network, SAN or hardware problem, the service it is running will move to the third server.

Although not required, having a passive server offers some advantages. First, it ensures that if one server has a problem, the passive server will be able to handle the load of either of your servers. If we did not have that passive server, we would need to make sure that either server could handle the combined load of both services. A clustered environment also helps when the time comes to do hardware upgrades on a server. Say we need to add memory to the first server: we can move the HTTP service to the passive node, add the memory, and then move the service back to the original node when ready. Another advantage of having a passive server is that you can update the OS of your nodes one by one without affecting the services (even if a reboot is necessary).

Our nodes will be named "bilbo", which will host the HTTP service, "gollum", which will host the FTP service, and "gandalf", which will be our passive node. As you can see in the image below, each server uses three network connections. The first NIC (eth0) offers the HTTP and FTP services to the users and is the host's main network card. The second network card (eth1) is used by the cluster software for its heartbeat; in a corporate environment, this network should be isolated from the rest of your network, since there will be a lot of broadcast traffic on it and response time is important. The iLO network will be used by the cluster software to remotely power servers on and off when a problem arises.
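The maintenance scenario above (moving a service off a node before working on it) is done with rgmanager's clusvcadm once the cluster is up. The service name http_service is a hypothetical label for whatever you end up calling the HTTP service; the commands are printed as a dry run:

```shell
# Dry-run: relocate (-r) a service to a given member (-m).
# `http_service` is a placeholder service name for this article.
echo clusvcadm -r http_service -m gandalf
# ...do the hardware work on bilbo, then move the service back:
echo clusvcadm -r http_service -m bilbo
```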

We should define all the IP addresses that our cluster needs to function properly in the /etc/hosts file of every node. We will use the domain name "maison.ca". Of course, these names should also be defined in your DNS, but in case your DNS goes down, the /etc/hosts file will ensure that the cluster continues to work properly. So our /etc/hosts file looks like this:

# Host Real IP
192.168.1.101   gandalf.maison.ca    gandalf
192.168.1.102   gollum.maison.ca     gollum
192.168.1.103   bilbo.maison.ca      bilbo
# HeartBeat IP Address
10.1.1.101      hbgandalf.maison.ca  hbgandalf
10.1.1.102      hbgollum.maison.ca   hbgollum
10.1.1.103      hbbilbo.maison.ca    hbbilbo
# HP ILO IP Address
192.168.1.201   ilo_gandalf.maison.ca ilo_gandalf
192.168.1.202   ilo_gollum.maison.ca  ilo_gollum
192.168.1.203   ilo_bilbo.maison.ca   ilo_bilbo
# Service Virtual IP
192.168.1.204   www.maison.ca        www
192.168.1.211   ftp.maison.ca        ftp
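A quick way to make sure no cluster name was forgotten is to grep each one out of the hosts file. The sketch below works on a sample copy under /tmp so it can be tried anywhere; point HOSTS at /etc/hosts on a real node:

```shell
# Sanity check: every cluster host name should appear in the hosts file.
HOSTS=/tmp/hosts.sample
cat > "$HOSTS" <<'EOF'
192.168.1.103  bilbo.maison.ca   bilbo
10.1.1.103     hbbilbo.maison.ca hbbilbo
EOF
for h in bilbo hbbilbo; do
    grep -qw "$h" "$HOSTS" && echo "$h OK" || echo "$h MISSING"
done
```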

Creating a Red Hat Cluster: Part 2

In this article, we continue our journey on how to build our cluster.

Installing the cluster software
If you are using Red Hat, you need to register your server in order to download new software or updates. You also need to subscribe to the "Clustering" and "Cluster Storage" channels in order to install these groups of software. With CentOS, this is not needed, since we can download these groups of software without registration.

First, let's make sure that the servers have the latest OS updates, by typing this command:

# yum -y update

Let's install the clustering software:

# yum groupinstall Clustering

Since we will be using the GFS filesystem, we will need the "Cluster Storage" software group:

# yum groupinstall "Cluster Storage"

I also found out that this package is needed by the cluster software, so let's install it:

# yum install perl-Crypt-SSLeay

Finally, if you are using a 32-bit kernel and you have more than 4GB of memory, you need to install the PAE kernel and GFS modules. This will ensure that you are using all the memory available on the server:

# yum install kmod-gnbd-PAE kmod-gfs-PAE kernel-PAE

Setting the locking type for the GFS filesystem
To use GFS (Global File System) with the cluster, you need to activate GFS locking in the /etc/lvm/lvm.conf file, to inform LVM that we will be dealing with GFS volume groups and filesystems. We need to change the "locking_type" variable from 0 to 3. This command needs to be run on all the servers.

# grep -i locking_type /etc/lvm/lvm.conf
locking_type = 0
# lvmconf --enable-cluster
# grep -i locking_type /etc/lvm/lvm.conf
locking_type = 3

Making sure SELinux and the firewall are disabled
We do not want to deal with SELinux and the firewall in our cluster, so we will disable them. From the GNOME desktop, run the following command:
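lvmconf --enable-cluster edits lvm.conf for you; conceptually, it just flips locking_type to 3. Here is a minimal sketch of the same edit, run against a throwaway sample file instead of the live /etc/lvm/lvm.conf, so it is safe to try anywhere:

```shell
# Demonstrate the locking_type change on a sample stanza rather than
# the live /etc/lvm/lvm.conf.
cat > /tmp/lvm.conf.sample <<'EOF'
global {
    locking_type = 0
}
EOF
# Switch locking type 0 to 3 (clustered locking via the DLM).
sed -i 's/locking_type = 0/locking_type = 3/' /tmp/lvm.conf.sample
grep locking_type /tmp/lvm.conf.sample
```

On a real node, just run lvmconf --enable-cluster and verify with the grep shown in the article.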

# system-config-securitylevel

Disable SELinux (this requires a reboot) and disable the firewall.

Activating the cluster services
Let's make sure that the cluster services are started each time the server boots. Now we can start the cluster services manually, or simply reboot the server.

Starting the cluster configuration
There are two tools you can use to configure and maintain your cluster. The first one is a web interface called "Conga" that requires the installation of an agent on each node (ricci) and a centralized configuration server named "luci". If you want to use the web interface (it seems to be working a lot better now), it is advisable to install the configuration center on a Linux server outside of the cluster. Conga was fairly new when I created my first cluster, so we decided to use the second tool, named "Cluster Configuration". This tool can be started from any node within the cluster and does not require installing any additional software. To start the cluster configuration GUI, type the following command using the "root" account:

# system-config-cluster
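The services to enable at boot are cman, clvmd, gfs and rgmanager (the RHEL/CentOS 5 init-script names). A sketch, printed as a dry run so it can be reviewed before being applied for real on each node:

```shell
# Dry-run: print the commands that enable the cluster services at
# boot and start them, in start order.  Remove the leading `echo`
# to apply this on a cluster node.
for svc in cman clvmd gfs rgmanager; do
    echo chkconfig "$svc" on
    echo service "$svc" start
done
```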

New cluster config warning
The first time you run the cluster configuration GUI, a warning message may be displayed. It is just informing us that the cluster configuration file "/etc/cluster/cluster.conf" was not found. Click on the "Create New Configuration" button and let's move on.

General cluster settings
 Next we need to enter the name of our cluster; we have chosen to name it "our_cluster". The name of a cluster cannot be changed; the only way to change it is to create a new cluster, so choose it wisely.
 We use the recommended lock manager, DLM (Distributed Lock Manager); GULM is deprecated.
 The multicast address we will use for the heartbeat is "239.1.1.1".
 The usage of the "Quorum Disk" is outside the scope of this article. But basically, you define a small disk on the SAN that is shared among the nodes in the cluster, and node status is regularly written to that disk. If a node hasn't updated its status for a period of time, it will be considered down and the cluster will then fence that node. If you are interested in using the "Quorum Disk", there is an article here that describes how to set it up.

Cluster Properties
Now let's check some of the default settings given to our cluster. Click on "Cluster" on the left-hand side of the screen and then click on "Edit Cluster Properties".

The Configuration Version value is 1 by default and is automatically incremented each time you modify your cluster configuration.

The Post-Join Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain. The Post-Join Delay default value is 3. A typical setting for Post-Join Delay is between 20 and 30 seconds, but it can vary according to cluster and network performance.

The Post-Fail Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node (a member of the fence domain) after the node has failed. The Post-Fail Delay default value is 0. Its value may be varied to suit cluster and network performance. To avoid problems, CHANGE THE "POST FAIL DELAY" to 30 seconds.

Quorum is a voting algorithm used by the cluster manager. We say a cluster has "quorum" if a majority of nodes are alive, communicating, and agree on the active cluster members. So in a thirteen-node cluster, quorum is only reached if seven or more nodes are communicating. If the seventh node dies, the cluster loses quorum and can no longer function. If we had a two-node cluster, we would need to make a special exception to the quorum rules. There is a special setting "two_node" in the /etc/cluster/cluster.conf file that looks like this:

<cman expected_votes="1" two_node="1"/>

Adding our nodes to the cluster
To add a node to the cluster, select "Cluster Nodes" on the upper left side of the screen and click on "Add a Cluster Node". You will then be presented with the "Node Properties" screen: the node name, the "Quorum Votes" and the name of the interface used for the "Multicast" (heartbeat).

Enter the node name, the host name used for the heartbeat; for our first node it will be "hbbilbo.maison.ca". Remember that this name MUST be defined in our hosts file and in your DNS (if you have one). For our cluster we will leave the Quorum Votes at the default value of 1. I always use the first interface for the cluster heartbeat network; in our case it is eth0.

Repeat the operation for every node you want to include in the cluster: insert the "gandalf" and "gollum" nodes into our cluster.
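For reference, the file that the GUI writes from these screens is /etc/cluster/cluster.conf. Below is a trimmed sketch of roughly what our settings would produce; the fence-device options and credentials are illustrative placeholders, not values taken from a real cluster:

```xml
<?xml version="1.0"?>
<cluster name="our_cluster" config_version="1">
  <!-- Post-Fail Delay raised to 30 seconds, as recommended above -->
  <fence_daemon post_join_delay="3" post_fail_delay="30"/>
  <clusternodes>
    <clusternode name="hbbilbo.maison.ca" votes="1">
      <fence>
        <method name="1">
          <device name="ilo_bilbo"/>
        </method>
      </fence>
    </clusternode>
    <!-- hbgandalf.maison.ca and hbgollum.maison.ca are declared the same way -->
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ilo" name="ilo_bilbo"
                 hostname="ilo_bilbo" login="Administrator" passwd="secret"/>
  </fencedevices>
</cluster>
```

You normally never edit this file by hand while using the GUI, but knowing what it contains makes the configuration screens much easier to follow.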
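The majority rule above is plain integer arithmetic: with N one-vote nodes, quorum needs floor(N/2)+1 of them. A quick check of the thirteen-node example, and of why two-node clusters need the exception:

```shell
# Simple-majority quorum for a cluster of one-vote nodes.
nodes=13
quorum=$(( nodes / 2 + 1 ))
echo "A $nodes-node cluster keeps quorum with $quorum or more nodes."
# For two nodes the formula gives 2, so a lone surviving node could
# never have quorum -- which is exactly what two_node="1" works around.
nodes=2
echo "A $nodes-node cluster would need $(( nodes / 2 + 1 )) nodes."
```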