Red Hat Cluster Suite

In today's world, every organisation strives to provide nonstop 24x7 service to its clients, and so deploys highly available solutions for its products. This is where the cluster concept comes in. Alongside commercial high-availability solutions such as IBM's HACMP, Veritas Cluster Server and HP's Serviceguard, Red Hat provides a high-availability solution on Red Hat Enterprise Linux, called the Red Hat Cluster Suite (RHCS).

Red Hat Cluster Suite has two major features: the Cluster Manager (CMAN), which provides high availability, and IP load balancing (also called Piranha). The Cluster Manager and Piranha are complementary high-availability technologies and can be used in combination or separately, as your requirements dictate.

Cluster Overview

First, let us look at how RHCS manages shared storage and data integrity. Lock management is a service that provides a mechanism for other cluster components to synchronize their access to shared resources. In RHCS, either the Distributed Lock Manager (DLM) or the Grand Unified Lock Manager (GULM) is used as the lock manager. GULM is a server-based lock manager for GFS, GNBD (Global Network Block Device) and CLVM (Clustered LVM). A single GULM server can run standalone, but it introduces a single point of failure for GFS. Three or five GULM servers can also be used together, in which case the failure of one or two servers, respectively, can be tolerated. DLM runs on each cluster node. DLM is a good choice because it does not impose the quorum requirements of the GULM mechanism.
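The tolerance figures for GULM follow from simple majority arithmetic. The sketch below is illustrative shell only, assuming that a strict majority of the GULM servers must remain available; the function names are hypothetical and are not part of RHCS.

```shell
#!/bin/bash
# Hypothetical helpers, not part of RHCS: majority arithmetic for a set of
# lock servers, assuming a strict majority must remain reachable.

# Smallest number of servers that still forms a majority of $1 servers.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Number of server failures that can be tolerated out of $1 servers.
tolerated_failures() {
    echo $(( $1 - $(majority "$1") ))
}

for n in 1 3 5; do
    echo "$n server(s): majority=$(majority "$n"), tolerated failures=$(tolerated_failures "$n")"
done
```

Under that assumption, one server tolerates no failures, three tolerate one, and five tolerate two, which matches the figures quoted above.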

Whichever lock manager you choose, DLM or GULM, the RHEL cluster can use two basic techniques to ensure data integrity. Traditionally CLVM is used, which works well with LVM-based logical volumes. Alternatively, GFS can be used. It is a cluster filesystem that allows the nodes to simultaneously access a block device that is shared in the cluster. It employs distributed metadata and multiple journals for optimal operation in the cluster. GFS uses a lock manager to coordinate I/O and maintain filesystem integrity. Changes made by one node to a GFS filesystem are immediately visible to the other cluster nodes using that filesystem. In most Red Hat cluster implementations, GFS is used with direct access to the SAN from all cluster nodes.

The next important component of the cluster infrastructure is fencing, whose purpose is to ensure data integrity. Only one node may run a cluster service and access that service's data at a time. To prevent two nodes from simultaneously accessing the same data and eventually corrupting it, power switches are used, which enable one node to power-cycle another node before restarting that node's cluster services during failover. RHEL cluster implementations generally use hardware fence devices, such as APC power switches and IBM BladeCenter. Fencing is mandatory for RHEL cluster solutions that use shared storage.

Creating a Red Hat Cluster

To design a cluster properly, you should avoid any single point of failure. For example, you can place your servers in two physically separate racks with redundant power supplies, and use at least two network adapters and two network switches for each cluster node.

Install the latest stable version of RHEL and its updates, and then install the latest available version of RHCS. The following RPMs are required to build a RHEL cluster with DLM:

  • magma and magma-plugins
  • perl-Net-Telnet
  • rgmanager
  • system-config-cluster
  • dlm and dlm-kernel
  • dlm-kernel-hugemem and SMP support for DLM
  • iddev and ipvsadm
  • cman, cman-smp, cman-hugemem and cman-kernelheaders
  • ccs

Restart the nodes once the packages are installed.

For network configuration, Ethernet channel bonding is highly recommended, as it provides a fault-tolerant network connection by combining two Ethernet devices into one virtual device. The bonded interface ensures that if one Ethernet device fails, the other device becomes active automatically. To bond two network interfaces, do the following on the first node:

1) Create the bonding device in the /etc/modules.conf file:

alias bond0 bonding
options bond0 mode=balance-alb miimon=100

2) Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file for eth0 and the /etc/sysconfig/network-scripts/ifcfg-eth1 file for eth1, so that the two files are identical apart from the DEVICE line, as shown below:

DEVICE=ethX
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

**This enslaves ethX (replace X with the assigned number of the Ethernet device) to the bond0 master device.

3) Create the file /etc/sysconfig/network-scripts/ifcfg-bond0 for the bonding device, with the following contents:

DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
GATEWAY=192.168.1.1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

4) Reboot the system for the changes to take effect.

Repeat the same steps on node 2; the only difference is that its /etc/sysconfig/network-scripts/ifcfg-bond0 file should contain an IPADDR entry with the value 192.168.1.2. You now have two RHEL cluster nodes with the IP addresses 192.168.1.1 and 192.168.1.2. You can then set the other network configuration details, such as the hostname and the primary/secondary DNS servers.
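Because the two slave files differ only in the DEVICE line, they can be generated rather than typed by hand. The following is a minimal sketch, assuming the eth0/eth1/bond0 names used above; the write_slave_cfg function and the scratch directory are illustrative assumptions, not RHEL tooling. It writes to a temporary directory so that the files can be reviewed before being copied to /etc/sysconfig/network-scripts/.

```shell
#!/bin/bash
# Illustrative sketch: generate the slave ifcfg files into a scratch
# directory. write_slave_cfg is a hypothetical helper, not a RHEL tool.

# Write ifcfg-<dev> for slave device $1, enslaved to master $2, into dir $3.
write_slave_cfg() {
    local dev="$1" master="$2" outdir="$3"
    cat > "$outdir/ifcfg-$dev" <<EOF
DEVICE=$dev
USERCTL=no
ONBOOT=yes
MASTER=$master
SLAVE=yes
BOOTPROTO=none
EOF
}

# Scratch directory; nothing under /etc is touched by this sketch.
outdir="$(mktemp -d)"
for dev in eth0 eth1; do
    write_slave_cfg "$dev" bond0 "$outdir"
done
echo "Generated files in $outdir:"
ls "$outdir"
```

After the files are in place and the node has rebooted, the state of the bond and of each slave can usually be inspected with cat /proc/net/bonding/bond0.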

**The cluster node name needs to match the output of uname -n or the value of HOSTNAME in /etc/sysconfig/network.

You can also use an additional Ethernet interface on each cluster node, on a separate IP network, as a dedicated network for heartbeats between the cluster nodes. To do this, simply assign IP addresses (e.g., 192.168.10.1 and 192.168.10.2) to eth2 on the two nodes, and make them resolvable through the /etc/hosts file.

Now you need to set up the fencing device.

 

Setup of the Shared Storage Drive and Quorum Partitions

 

Cluster Configuration

The easiest way to configure a RHEL cluster is to use the RHEL GUI (system-config-cluster) for creating, editing, saving, and propagating the cluster configuration file, /etc/cluster/cluster.conf. Once you are done editing the cluster configuration file, the last step is to synchronize it across the cluster nodes. RHCS provides a "save configuration to cluster" option, but it appears only once the cluster services are running. So, the first time, it is better to send the cluster configuration file to all cluster nodes manually. Just copy the /etc/cluster/cluster.conf file to every cluster node using the scp command.
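The manual copy can be sketched as a small loop. This is only an illustration: node1 and node2 are placeholder hostnames, the root login is an assumption, and the DRY_RUN switch and push_conf function are hypothetical conveniences so that the commands can be reviewed before anything is copied.

```shell
#!/bin/bash
# Sketch only: node1 and node2 are placeholder hostnames; replace them with
# your own cluster node names. With DRY_RUN=1 (the default here) the scp
# commands are printed instead of executed.

CONF=/etc/cluster/cluster.conf
NODES="node1 node2"
DRY_RUN="${DRY_RUN:-1}"

# Copy (or, in dry-run mode, print the command to copy) the configuration
# file to the same path on every node in NODES.
push_conf() {
    local node
    for node in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp $CONF root@$node:$CONF"
        else
            scp "$CONF" "root@$node:$CONF"
        fi
    done
}

push_conf
```

Running the script with DRY_RUN=0 performs the actual copy; the node on which the file was edited can be left out of NODES.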

Once that is done, start and stop the RHEL cluster services in the following sequence.

To start:

service ccsd start
service cman start
service fenced start
service rgmanager start

To stop:

service rgmanager stop
service fenced stop
service cman stop
service ccsd stop

If you use GFS, the startup and shutdown of the gfs and clvmd services has to be included in this sequence.
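The point of the two lists is that the services are stopped in the reverse of the order in which they are started. A minimal sketch of a wrapper that enforces this is shown below; the function names and the DRY_RUN switch are hypothetical, and with DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/bash
# Sketch: start the cluster services in the documented order and stop them
# in the reverse order. The function names are hypothetical.

SERVICES="ccsd cman fenced rgmanager"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start each service in list order; give up at the first failure.
cluster_start() {
    local svc
    for svc in $SERVICES; do
        run service "$svc" start || return 1
    done
}

# Build the reversed list, then stop each service; give up at the first failure.
cluster_stop() {
    local svc reversed=""
    for svc in $SERVICES; do
        reversed="$svc $reversed"
    done
    for svc in $reversed; do
        run service "$svc" stop || return 1
    done
}

cluster_start
cluster_stop
```

With GFS in use, SERVICES would become "ccsd cman fenced clvmd gfs rgmanager", which matches the start sequence shown later in this article.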

**By default, all cluster messages go into the RHEL system log file (/var/log/messages), which can make cluster troubleshooting difficult in some scenarios. To avoid this, you can edit the /etc/syslog.conf file so that the cluster logs events to a file separate from the default log file:

daemon.* /var/log/cluster

 

 

A Quick Note about Cluster Components

Cman → Manages cluster quorum and cluster membership. It is a distributed cluster manager and runs on all nodes.

  • Keeps track of cluster quorum by monitoring the count of cluster nodes.
  • Prevents the split-brain condition.
  • Quorum is determined by the exchange of messages among cluster nodes via Ethernet "and" through a quorum disk.

 

CCS → The Cluster Configuration System manages the cluster configuration and provides configuration information to the other cluster components. It runs on each node and makes sure the cluster configuration file is up to date on all nodes.

Split brain occurs when all of the private links break simultaneously while the cluster nodes are still running. If that happens, each node in the cluster may mistakenly decide that every other node has gone down and attempt to start services that other nodes are still running. Duplicate instances of services may then cause data corruption on the shared storage.

A few commands worth remembering for administering the cluster


# clustat                                         → Checking the status of the cluster
# clusvcadm -r <servicename> -m <membername>      → Moving a service/package over to another node
# clusvcadm -e <servicename> -m <membername>      → Starting a service/package
# clusvcadm -d <servicename>                      → Stopping/disabling a service/package

Removing a node from the cluster

 # chkconfig ccsd off
 # chkconfig clvmd off
 # chkconfig cman off
 # chkconfig cmirror off
 # chkconfig fenced off
 # chkconfig gfs off
 # chkconfig rgmanager off
 # shutdown -r now

Gracefully halting the cluster

 # clusvcadm -d <servicename>
 **Do the following on each node:**
 # umount <GFS filesystems>
 # service rgmanager stop
 # service gfs stop
 # service clvmd stop
 # service fenced stop
 # cman_tool status
 # cman_tool leave
 # service ccsd stop

Gracefully starting the cluster (Done on each node)

 # service ccsd start
 # service cman start
 # service fenced start
 # service clvmd start
 # service gfs start
 # service rgmanager start
 # cman_tool nodes (shows status of nodes)

 

 
