High availability (HA) is a critical aspect of modern IT infrastructure, ensuring that services remain online and accessible even in the event of hardware failures or maintenance events. One of the most popular solutions for implementing HA on Linux is Pacemaker, a cluster resource manager that enables you to manage services and resources across multiple nodes in a way that minimizes downtime. In this article, we will walk through the steps for configuring a High Availability cluster using Pacemaker on Linux.

Prerequisites

Before we begin, please ensure that you meet the following prerequisites:

  1. Linux Distributions: We recommend using CentOS, RHEL, or Ubuntu, but the steps can be adapted for other distributions.
  2. Multiple Nodes: At least two nodes (servers) that will participate in the HA cluster.
  3. Root Access: Administrative privileges on all nodes.
  4. Network Connectivity: Ensure that all nodes can communicate with each other over a private network.
  5. Package Installation: Pacemaker, Corosync, and the pcs command-line tool must be installed on all nodes.

You can install the necessary packages using the package manager of your distribution. For instance:

# For CentOS/RHEL
sudo yum install pacemaker pcs corosync

# For Ubuntu
sudo apt-get install pacemaker corosync pcs
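
Because the later steps rely on these tools being present on every node, it can help to confirm they are on the PATH before continuing. The following is a minimal sketch; missing_commands is an illustrative helper name, not part of Pacemaker or pcs.

```shell
# Print each named command that cannot be found on the PATH.
# Returns 0 if every command was found, 1 if any is missing.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Example (run on each node; crm_mon ships with Pacemaker):
# missing_commands pcs crm_mon || echo "install the cluster packages first"
```

Some distributions keep cluster tools in sbin directories, so run the check as root if a tool you know is installed is reported missing.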

Step 1: Enable and Start the Required Services

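On each node, enable and start pcsd, the daemon that pcs uses to talk to the other nodes. The commands are the same on CentOS/RHEL and Ubuntu:

sudo systemctl enable pcsd
sudo systemctl start pcsd

You do not need to start corosync or pacemaker by hand at this point. Corosync needs the configuration that pcs generates in Step 3, and pcs cluster start brings up both services. To have them start at boot once the cluster exists, run sudo pcs cluster enable --all.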

Step 2: Authenticate the Nodes

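To form a cluster, the nodes must be able to authenticate to each other. pcs does this as the hacluster user, which is created during package installation. Set a password for this user on every node, using the same password each time:

sudo passwd hacluster

Then authenticate the nodes to each other. This only needs to be run on one node:

# pcs 0.10 and later
sudo pcs host auth <node1> <node2> -u hacluster -p <password>

# pcs 0.9
sudo pcs cluster auth <node1> <node2> -u hacluster -p <password>

Replace <node1> and <node2> with the actual hostnames of your servers, and <password> with the password you set earlier. If you omit -p, pcs prompts for the password, which keeps it out of your shell history.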
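
The authentication subcommand depends on the pcs version: 0.9.x uses pcs cluster auth, while 0.10 and later use pcs host auth. If you script this step across mixed environments, a small helper can pick the right one. This is a sketch only; pcs_auth_subcommand is a hypothetical name, and it assumes pcs --version prints a plain dotted version such as 0.10.8.

```shell
# Print the pcs subcommand that authenticates nodes for a given pcs version.
# pcs 0.10 and later use "host auth"; 0.9.x uses "cluster auth".
pcs_auth_subcommand() {
    version=$1
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # second component
    if [ "$major" -gt 0 ] || [ "$minor" -ge 10 ]; then
        echo "host auth"
    else
        echo "cluster auth"
    fi
}

# Example (node1 and node2 are placeholder hostnames):
# sudo pcs $(pcs_auth_subcommand "$(pcs --version)") node1 node2 -u hacluster
```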

Step 3: Create the Cluster

Now that the nodes are authenticated, you can create the cluster. Run this command on one node only:

# pcs 0.10 and later
sudo pcs cluster setup my_cluster <node1> <node2>

# pcs 0.9
sudo pcs cluster setup --name my_cluster <node1> <node2>

Replace my_cluster with a name of your choosing. To include more nodes, append their hostnames to the command.

After setting up the cluster, start it:

sudo pcs cluster start --all

Check the status:

sudo pcs cluster status
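
Cluster services can take a few seconds to come up after pcs cluster start, so a status check run immediately afterwards may fail. A small retry loop is one way to wait in a script. The retry function below is a hypothetical helper, and the usage line assumes pcs cluster status exits non-zero until the cluster is running on the local node.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait up to about 30 seconds for the cluster to come up.
# retry 10 3 sudo pcs cluster status && echo "cluster is up"
```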

Step 4: Configure Cluster Resources

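With your Pacemaker cluster up and running, it’s time to configure resources that will fail over between nodes. Note that Pacemaker will not start any resource until fencing (STONITH) is configured; on a throwaway test cluster with no fence devices you can disable the check with sudo pcs property set stonith-enabled=false, but production clusters should configure real fence devices. For example, to manage an Apache web server, first create a resource for it using the ocf:heartbeat:apache agent. The configfile path shown is the CentOS/RHEL default; on Ubuntu it is /etc/apache2/apache2.conf.

sudo pcs resource create WebServer ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf op start timeout=60s interval=0s op stop timeout=60s interval=0s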

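A primitive resource like this one already runs on only one node at a time, so no extra configuration is needed for that. To express a preference for a particular node, add a location constraint with a score:

sudo pcs constraint location WebServer prefers <node1>=100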

Step 5: Define Failover Policies and Constraints

In a high-availability setup, failover behavior is controlled by constraints. Pacemaker supports three kinds:

  1. Location: Define which nodes a resource prefers or avoids.
  2. Colocation: Define which resources must run on the same node, or must never share one.
  3. Ordering: Define the order in which resources must start or stop.

For example, to ensure a database resource starts before your web server (Database here stands for a resource you have created separately):

sudo pcs constraint order start Database then start WebServer
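
Ordering alone does not keep two resources on the same node, so an ordering constraint is often paired with a colocation constraint. The sketch below combines the two. The function name colocate_and_order, the RUN variable, and the Database resource are assumptions for illustration; by default the script only prints the commands it would run.

```shell
# Command prefix. Defaults to "echo" so the function only prints the commands;
# set RUN=sudo to apply them to a live cluster.
RUN=${RUN:-echo}

# Keep resource $1 on the same node as resource $2, and start $2 first.
colocate_and_order() {
    dependent=$1
    base=$2
    $RUN pcs constraint colocation add "$dependent" with "$base" INFINITY
    $RUN pcs constraint order start "$base" then start "$dependent"
}

colocate_and_order WebServer Database
```

A score of INFINITY makes the colocation mandatory: WebServer will not run anywhere Database cannot.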

Step 6: Managing Cluster Resources

Once resources are configured, you can manage them using the pcs commands:

To start a resource (pcs calls this enabling it):

sudo pcs resource enable WebServer

To stop a resource (disabling it):

sudo pcs resource disable WebServer

To check the status of resources:

sudo pcs status resources

Step 7: Testing Your Configuration

To confirm that your HA setup works, test failover. Stopping Pacemaker on a node simulates a clean node shutdown and lets you observe whether the resources migrate to the remaining node:

  1. Stop the Pacemaker service on one of the nodes:

    sudo systemctl stop pacemaker

  2. Check the status on the remaining node to see if the resources have migrated.

  3. Restart the Pacemaker service on the node you stopped:

    sudo systemctl start pacemaker

  4. Verify that the node rejoins the cluster. Whether resources move back to it depends on your constraints and resource stickiness.
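
To make the test above less manual, you can record which node hosts a resource before and after the failover and compare the two. The sketch below parses the output of crm_resource --locate; running_node is a hypothetical helper, and it assumes the output line has the form "resource WebServer is running on: node1", which may differ between Pacemaker versions.

```shell
# Read crm_resource --locate output on stdin and print the node name.
# Prints nothing if the resource is not reported as running.
running_node() {
    sed -n 's/^resource .* is running on: \([^ ]*\).*/\1/p'
}

# Usage sketch (run the second check on the surviving node):
# before=$(sudo crm_resource --resource WebServer --locate | running_node)
# ... stop Pacemaker on the node named in $before and wait for failover ...
# after=$(sudo crm_resource --resource WebServer --locate | running_node)
# [ "$before" != "$after" ] && echo "WebServer moved from $before to $after"
```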

Conclusion

Configuring a high-availability cluster with Pacemaker on Linux is a straightforward process that significantly improves the reliability of your services. By following the steps outlined in this article, you’re well on your way to ensuring that your applications remain available even in the face of unforeseen failures.

For further reading, explore the official Pacemaker documentation or consider additional resources on clustering technologies to enhance your understanding and capabilities.

Happy clustering!