High availability (HA) is a critical aspect of modern IT infrastructure, ensuring that services remain online and accessible even in the event of hardware failures or maintenance events. One of the most popular solutions for implementing HA on Linux is Pacemaker, a cluster resource manager that enables you to manage services and resources across multiple nodes in a way that minimizes downtime. In this article, we will walk through the steps for configuring a High Availability cluster using Pacemaker on Linux.
Prerequisites
Before we begin, please ensure that you meet the following prerequisites:
- Linux Distributions: We recommend using CentOS, RHEL, or Ubuntu, but the steps can be adapted for other distributions.
- Multiple Nodes: At least two nodes (servers) that will participate in the HA cluster.
- Root Access: Administrative privileges on all nodes.
- Network Connectivity: Ensure that all nodes can communicate with each other over a private network.
- Package Installation: The necessary packages should be installed on all nodes. Pacemaker, Corosync, and related tools are required.
You can install the necessary packages using the package manager of your distribution. For instance:
# For CentOS/RHEL
sudo yum install pacemaker pcs corosync
# For Ubuntu (pcs is included because the later steps rely on it)
sudo apt-get install pacemaker corosync pcs
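If you prepare several nodes with one script, the choice between the two commands can be made from the distribution ID. The following is a minimal sketch, not a definitive installer: the function name install_command is made up for illustration, it only prints the command rather than running it, and it includes pcs on Ubuntu because the later steps use it.

```shell
#!/usr/bin/env bash
# Map a distribution ID (the ID field of /etc/os-release) to an install command.
# install_command is a hypothetical helper; it prints the command, it does not run it.
install_command() {
    case "$1" in
        centos|rhel)
            echo "sudo yum install pacemaker pcs corosync" ;;
        ubuntu)
            echo "sudo apt-get install pacemaker corosync pcs" ;;
        *)
            echo "unsupported distribution: $1" >&2
            return 1 ;;
    esac
}

# Read the ID of the current system, falling back to "unknown".
distro_id="unknown"
if [ -r /etc/os-release ]; then
    distro_id="$(. /etc/os-release && echo "${ID:-unknown}")"
fi

# Print the matching command; an unsupported ID only produces a message.
install_command "$distro_id" || true
```

Review the printed command before running it, since repository setup differs between releases.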
Step 1: Enable and Start the Required Services
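On each node, enable and start pcsd, the daemon that the pcs tool uses to communicate between nodes. This applies to CentOS/RHEL and Ubuntu alike:
sudo systemctl enable pcsd
sudo systemctl start pcsd
Also enable corosync and pacemaker on each node so that the cluster stack comes back after a reboot. They are started in Step 3, once the cluster has been created:
sudo systemctl enable corosync
sudo systemctl enable pacemaker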
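To confirm the result on a node, you can ask systemd for the state of each service. This is a small read-only sketch; the helper name service_state is an assumption for illustration, and it relies only on the systemctl is-enabled and is-active subcommands.

```shell
#!/usr/bin/env bash
# Report whether a service is enabled at boot and whether it is running now.
# Read-only: nothing on the node is changed.
service_state() {
    local svc="$1" enabled="disabled" active="inactive"
    if systemctl is-enabled "$svc" >/dev/null 2>&1; then
        enabled="enabled"
    fi
    if systemctl is-active "$svc" >/dev/null 2>&1; then
        active="active"
    fi
    printf '%s: %s, %s\n' "$svc" "$enabled" "$active"
}

# Check the three services this guide relies on.
for svc in pcsd corosync pacemaker; do
    service_state "$svc"
done
```

Before the cluster has been created, it is normal for corosync and pacemaker to show up as inactive.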
Step 2: Authenticate the Nodes
To form a cluster, the nodes must be able to authenticate to each other. The cluster packages create a hacluster user; set a password for it on every node:
sudo passwd hacluster
Use the same password on all participating nodes. Then authenticate the nodes to each other. This only needs to be run on one node:
# On one node (pcs 0.9 syntax; pcs 0.10 and later use "pcs host auth" in place of "pcs cluster auth")
sudo pcs cluster auth <node1> <node2> -u hacluster -p <password>
Replace <node1> and <node2> with the actual hostnames of your servers, and <password> with the password you set earlier.
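The authentication subcommand depends on the pcs release: 0.9.x uses pcs cluster auth, while 0.10 and later use pcs host auth. The sketch below picks the form from a version string. The function name auth_command, the hostnames node1 and node2, and the default version are placeholders, and the function only prints the command.

```shell
#!/usr/bin/env bash
# Print the node-authentication command that matches a pcs version string.
# Assumes a numeric version such as 0.9.169 or 0.10.8 (as printed by "pcs --version").
auth_command() {
    local version="$1"; shift
    local major minor
    major="${version%%.*}"
    minor="${version#*.}"
    minor="${minor%%.*}"
    if [ "$major" -gt 0 ] || [ "$minor" -ge 10 ]; then
        echo "sudo pcs host auth $* -u hacluster"
    else
        echo "sudo pcs cluster auth $* -u hacluster"
    fi
}

# Placeholder hostnames; pass the real version as the first script argument.
auth_command "${1:-0.10.8}" node1 node2
```

The -p option is left out on purpose: pcs then prompts for the password, which keeps it out of the shell history.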
Step 3: Create the Cluster
Now that the nodes are authenticated, you can create the cluster. Run this command on one node only:
sudo pcs cluster setup --name my_cluster <node1> <node2>
Replace my_cluster with a name of your choosing. To include more nodes, append their hostnames to the command. On pcs 0.10 and later, the name is given without the --name flag (pcs cluster setup my_cluster <node1> <node2>).
After setting up the cluster, start it:
sudo pcs cluster start --all
Check the status:
sudo pcs cluster status
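The commands in this step can be collected into one small script. The sketch below is a dry run by default: it prints each command and only executes them when DRY_RUN=0 is set. The run and bring_up_cluster helpers and the node names are illustrative assumptions, and the setup line uses the pcs 0.9 syntax shown in this article. It also calls pcs cluster enable --all, which makes the cluster services start at boot on every node.

```shell
#!/usr/bin/env bash
# Print each command unless DRY_RUN=0 is set, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create, start, enable at boot, and check a cluster.
# Usage: bring_up_cluster <cluster-name> <node>...
# Uses pcs 0.9 syntax; pcs 0.10 and later drop the --name flag.
bring_up_cluster() {
    local name="$1"; shift
    run sudo pcs cluster setup --name "$name" "$@" &&
    run sudo pcs cluster start --all &&
    run sudo pcs cluster enable --all &&
    run sudo pcs cluster status
}

# Placeholder cluster and node names.
bring_up_cluster my_cluster node1 node2
```

Chaining the commands with && stops the sequence at the first failure, so a failed setup is not followed by a start attempt.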
Step 4: Configure Cluster Resources
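With your Pacemaker cluster up and running, it’s time to configure resources that will fail over between nodes. Keep in mind that Pacemaker does not start resources until fencing (STONITH) has been configured for the cluster. For example, to manage an Apache web server, first create a resource for the Apache service (the path shown is the CentOS/RHEL default; on Ubuntu the main configuration file is /etc/apache2/apache2.conf):
sudo pcs resource create WebServer ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf op start timeout=60s interval=0s op stop timeout=60s interval=0s
A resource created this way already runs on only one node at a time, so no extra constraint is needed for that. (pcs resource master is meant for multi-state resources whose agents support promotion, which does not apply to a plain Apache resource.) To state a preference for the node that should host the service, add a location constraint with a score:
sudo pcs constraint location WebServer prefers <node1>=100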
Step 5: Define Failover Policies and Constraints
In a high-availability setup, defining failover policies is crucial. Pacemaker expresses them as three kinds of constraints:
- Location: Define which nodes a resource prefers or avoids.
- Colocation: Define which resources must run together on the same node.
- Ordering: Define the order in which resources must start or stop.
For example, to ensure your database starts before your web server (assuming a resource named Database has also been created):
sudo pcs constraint order start Database then start WebServer
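Ordering on its own does not keep two resources on the same node, so ordering and colocation constraints are often created in pairs. The sketch below prints both for each adjacent pair in a chain of resources. The function name chain_constraints is made up for illustration, Database is a hypothetical resource, and WebServer is the resource from Step 4. The commands are printed so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Print ordering and colocation commands for a chain of resources that must
# start in the given order and run on the same node.
chain_constraints() {
    local prev="" res
    for res in "$@"; do
        if [ -n "$prev" ]; then
            # Start the earlier resource first.
            echo "sudo pcs constraint order start $prev then start $res"
            # Keep the later resource on the node that hosts the earlier one.
            echo "sudo pcs constraint colocation add $res with $prev INFINITY"
        fi
        prev="$res"
    done
}

# Database is hypothetical; WebServer is the resource created in Step 4.
chain_constraints Database WebServer
```

A score of INFINITY makes the colocation mandatory; a finite score would make it a preference instead.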
Step 6: Managing Cluster Resources
Once resources are configured, you can manage them using the pcs commands. Resources are started and stopped by enabling and disabling them:
To start (enable) a resource:
sudo pcs resource enable WebServer
To stop (disable) a resource:
sudo pcs resource disable WebServer
To check the status of resources:
sudo pcs status resources
Step 7: Testing Your Configuration
To ensure that your HA setup is functioning correctly, test failover scenarios. You can simulate a node failure and observe if the resources migrate to the remaining node:
- Stop the Pacemaker service on one of the nodes:
sudo systemctl stop pacemaker
- Check the status on the remaining node to see if the resources have migrated.
- Restart the Pacemaker service on the failed node:
sudo systemctl start pacemaker
- Verify that the node rejoins the cluster and resources are redistributed appropriately.
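When you repeat this test, it helps to extract the hosting node from the status output instead of reading it by eye. The sketch below is one way to do that. The helper name resource_node is an assumption, and so is the output format: it expects each resource line to contain the resource name followed later by the word Started and a node name, which may differ between pcs versions.

```shell
#!/usr/bin/env bash
# Print the node on which a resource is reported as Started.
# Reads "pcs status resources"-style text on stdin.
# Assumed line format (may vary by pcs version):
#   WebServer (ocf::heartbeat:apache): Started node2
resource_node() {
    awk -v res="$1" '
        {
            found = 0
            for (i = 1; i <= NF; i++) {
                if ($i == res) found = 1
                if (found && $i == "Started" && i < NF) {
                    print $(i + 1)
                    exit
                }
            }
        }'
}

# Live use, only when pcs is available on this machine.
if command -v pcs >/dev/null 2>&1; then
    sudo pcs status resources | resource_node WebServer
fi
```

Run it on the surviving node before and after stopping Pacemaker on the other one; a changed node name indicates that the resource has moved, and empty output means it is not started anywhere.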
Conclusion
Configuring a high-availability cluster with Pacemaker on Linux is a straightforward process that significantly improves the reliability of your services. By following the steps outlined in this article, you’re well on your way to ensuring that your applications remain available even in the face of unforeseen failures.
For further reading, explore the official Pacemaker documentation or consider additional resources on clustering technologies to enhance your understanding and capabilities.
Additional Resources
- Pacemaker Documentation
- Corosync Documentation
- Guide to Linux High Availability
Happy clustering!