Manual configuration of a Heartbeat cluster is often the most effective way of creating a reliable cluster that meets specific needs. Because of the extensive configurability of Heartbeat and the range of needs it can meet, it is not possible to document every possible scenario. To introduce several concepts of the Heartbeat configuration and demonstrate basic procedures, consider a real world example of an NFS file server. The goal is to create an NFS server that can be built with very low-cost parts and is as redundant as possible. For this, set up the following cluster:
Two machines that have redundant hardware
Data is mirrored on the disks of those machines with drbd
Only one machine at a time accesses and exports the data
Assign a special IP address to the computer for exporting the file system
Before starting with the cluster configuration, set up two nodes as described in Chapter 2, Installation and Setup. In addition to the system installation, both nodes should have a data partition of the same size for setting up drbd.
The configuration splits into two main parts. First, all the
resources must be configured. After this, create a set of
constraints that define the starting rules for
those resources.
All the configuration data is written in XML. For convenience, the example relies on snippets that may be loaded into the cluster configuration individually.
The cluster is divided into two main sections, configuration and status. The status section contains the history of each resource on each node and based on this data, the cluster can construct the complete current state of the cluster. The authoritative source for the status section is the local resource manager (lrmd) process on each cluster node. The cluster will occasionally repopulate the entire section. For this reason it is never written to disk and administrators are advised against modifying it in any way.
The configuration section contains the more traditional information like cluster options, lists of resources and indications of where they should be placed. It is the primary focus of this document and is divided into four parts:
Configuration options (called crm_config)
Nodes
Resources
Resource relationships (called constraints)
Example 5.1. Structure of an Empty Configuration
<cib generated="true" admin_epoch="0" epoch="0" num_updates="0" have_quorum="false">
  <configuration>
    <crm_config/>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
Before you start to configure a cluster, it is worth explaining how to view the finished product. For this purpose use the crm_mon utility that will display the current state of an active cluster. It can show the cluster status by node or by resource and can be used in either single-shot or dynamically-updating mode. Using this tool, you can examine the state of the cluster for irregularities and see how it responds when you cause or simulate failures.
Details on all the available options can be obtained using the
crm_mon --help command.
There is a basic warning for updating the cluster configuration:
Rules For Updating the Configuration
Never edit the
To modify your cluster configuration, use the cibadmin command which talks to a running cluster. With cibadmin, you can query, add, remove, update or replace any part of the configuration. All changes take effect immediately and there is no need to perform a reload-like operation.
The simplest way of using cibadmin is a three-step procedure:
Save the current configuration to a temporary file:
cibadmin --cib_query > /tmp/tmp.xml
Edit the temporary file with your favorite text or XML editor.
Some of the better XML editors are able to use the
DTD (document type definition) to make sure that any changes
you make are valid. The DTD describing the configuration can
be found in /usr/lib/heartbeat/crm.dtd
on your systems.
Upload the revised configuration:
cibadmin --cib_replace --xml-file /tmp/tmp.xml
If you only want to modify the
resources section, do the following to avoid
modifying any other part of the configuration:
cibadmin --cib_query --obj_type resources > /tmp/tmp.xml
vi /tmp/tmp.xml
cibadmin --cib_replace --obj_type resources --xml-file /tmp/tmp.xml
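The query, edit, and replace cycle above lends itself to scripting. The following Python sketch only assembles the argument lists for the two cibadmin calls, using no options beyond those shown in this section. The helper name cibadmin_roundtrip is hypothetical, and nothing here contacts a cluster.

```python
from typing import List, Optional, Tuple


def cibadmin_roundtrip(xml_path: str,
                       section: Optional[str] = None) -> Tuple[List[str], List[str]]:
    """Return the (query, replace) argument lists for one edit cycle.

    Hypothetical helper: it only combines the options shown in this
    section and does not run anything.
    """
    # Restrict both calls to one section (for example "resources") if given.
    scope = ["--obj_type", section] if section else []
    query = ["cibadmin", "--cib_query", *scope]
    replace = ["cibadmin", "--cib_replace", *scope, "--xml-file", xml_path]
    return query, replace


query_cmd, replace_cmd = cibadmin_roundtrip("/tmp/tmp.xml", section="resources")
print(" ".join(query_cmd))    # redirect the output of this command to /tmp/tmp.xml
print(" ".join(replace_cmd))  # run this after editing the file
```

Keeping the section name in one place makes it harder to query one section and accidentally replace another.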
Sometimes it is necessary to delete an object quickly. This can be done in three easy steps:
Identify the object you wish to delete, for example:
cibadmin -Q | grep stonith
<nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/>
<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="1"/>
<primitive id="child_DoFencing" class="stonith" type="external/vmware">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:1" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:2" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:3" type="external/vmware" class="stonith">
Identify the resource's tag name and id (in this case
primitive and
child_DoFencing).
Execute cibadmin:
cibadmin --cib_delete --crm_xml '<primitive id="child_DoFencing"/>'
Some common tasks can also be performed with one of the
higher level tools that avoid the need to read or edit
XML. Run the following command to enable STONITH, for example:
crm_attribute --attr-name stonith-enabled --attr-value true
Or to see if somenode is allowed to run
resources, there is:
crm_standby --get-value --node-uname somenode
Or to find the current location of my-test-rsc one can use:
crm_resource --locate --resource my-test-rsc
It is not necessary to modify a real cluster in order to test the effect of the configuration changes. Do the following to test your modifications:
Save the current configuration to a temporary file:
cibadmin --cib_query > /tmp/tmp.xml
Edit the temporary file with your favorite text or XML editor.
Simulate the effect of the changes with
ptest:
ptest -VVVVV --xml-file /tmp/tmp.xml --save-graph tmp.graph --save-dotfile tmp.dot
The tool uses the same library as the live cluster to show
what the cluster would do in response to your changes. Its output,
in addition to a significant amount of logging, is stored in two
files, tmp.graph and
tmp.dot. Both files are representations of the
same thing—the cluster’s response to your changes. In the
graph file the complete transition is stored, containing a list
of all actions, their parameters and their prerequisites.
The transition graph is not very easy to read. Therefore,
the tool also generates a Graphviz dot-file representing the same
information.
There are three types of RAs (Resource Agents) available with Heartbeat. First, there are legacy Heartbeat 1 scripts. Second, Heartbeat can make use of LSB initialization scripts. Finally, Heartbeat has its own set of OCF (Open Cluster Framework) agents. This documentation concentrates on LSB scripts and OCF agents.
All LSB scripts are commonly found in the directory
/etc/init.d. They must have several
actions implemented, which are at least
start, stop,
restart, reload,
force-reload, and status
as explained in http://www.linux-foundation.org/spec/refspecs/LSB_1.3.0/gLSB/gLSB/iniscrptact.html.
The configuration of those services is not standardized. If
you intend to use an LSB script with Heartbeat, make sure that
you understand how the respective script is configured. Often
you can find notes on this in the documentation of
the respective package in
/usr/share/doc/packages/<package_name>.
When used by Heartbeat, the service should not be touched by other means. This means that it should not be started or stopped on boot, reboot, or manually. However, if you want to check if the service is configured properly, start it manually, but make sure that it is stopped again before Heartbeat takes over.
Before using an LSB resource, make sure that the configuration of this resource is present and identical on all cluster nodes. The configuration is not managed by Heartbeat. You must take care of that yourself.
All OCF agents are located in
/usr/lib/ocf/resource.d/heartbeat/. These
are small programs that have a functionality similar to that of
LSB scripts. However, the configuration is always done with
environment variables. All OCF Resource Agents are required to
have at least the actions start,
stop, status,
monitor, and meta-data.
The meta-data action retrieves information
about how to configure the agent. For example, if you want to
know more about the IPaddr agent, use the
command:
/usr/lib/ocf/resource.d/heartbeat/IPaddr meta-data
The output is lengthy information in a simple XML format.
You can validate the output with
the ra-api-1.dtd DTD. Basically this XML
format has three sections—first several common
descriptions, second all the available parameters, and last the
available actions for this agent.
A typical parameter of an OCF RA as shown with the
meta-data command looks like this:
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="IPaddr">
  <!-- Some elements omitted -->
  <parameter name="ip" unique="1" required="1">
    <longdesc lang="en">
      The IPv4 address to be configured in dotted quad notation, for example "192.168.1.1".
    </longdesc>
    <shortdesc lang="en">IPv4 address</shortdesc>
    <content type="string" default="" />
  </parameter>
</resource-agent>
This is part of the IPaddr RA. The
information about how to configure the parameter of this RA can
be read as follows:
Root element for each output.
The name of the
The description of the parameter is available in a long and a short description tag.
The content of the value of this parameter is a string. There is no default value available for this resource.
Find a configuration example for this RA at Chapter 3, Setting Up a Simple Resource.
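To get a quick overview of an agent's parameters without reading the full XML, the meta-data output can be parsed with a few lines of Python. The sketch below is only an illustration: the sample string is a trimmed copy of the ip parameter shown above, and the function name list_parameters is hypothetical. A real agent prints more elements than this sample contains.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modeled on the excerpt above; not the full agent output.
META_DATA = """\
<resource-agent name="IPaddr">
  <parameter name="ip" unique="1" required="1">
    <longdesc lang="en">The IPv4 address to be configured in dotted quad notation.</longdesc>
    <shortdesc lang="en">IPv4 address</shortdesc>
    <content type="string" default="" />
  </parameter>
</resource-agent>
"""


def list_parameters(meta_data_xml):
    """Return one dict per <parameter> element found in the meta-data."""
    root = ET.fromstring(meta_data_xml)
    params = []
    # iter() finds parameters at any depth, whatever wraps them.
    for param in root.iter("parameter"):
        content = param.find("content")
        params.append({
            "name": param.get("name"),
            "required": param.get("required") == "1",
            "unique": param.get("unique") == "1",
            "type": content.get("type") if content is not None else None,
            "description": param.findtext("shortdesc", default=""),
        })
    return params


for entry in list_parameters(META_DATA):
    print(entry)
```

In practice the input would be the text printed by the agent's meta-data action rather than an embedded string.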
To set up the NFS server, three resources are needed: a
file system resource, a drbd resource, and a group of an NFS
server and an IP address. You can write each of the resource
configurations to a separate file then load them to the cluster
with cibadmin -C -o resources -x
resource_configuration_file.
The filesystem resource is configured
as an OCF primitive resource. Its task is to mount a device on a
directory on start requests and to unmount it on stop requests. In
this case, the device is /dev/drbd0 and
the directory to use as mount point is
/srv/failover. The file system used is
reiserfs.
The configuration for this resource looks like the following:
<primitive id="filesystem_resource" class="ocf" provider="heartbeat" type="Filesystem">
  <instance_attributes id="ia-filesystem_1">
    <attributes>
      <nvpair id="filesystem-nv-1" name="device" value="/dev/drbd0"/>
      <nvpair id="filesystem-nv-2" name="directory" value="/srv/failover"/>
      <nvpair id="filesystem-nv-3" name="fstype" value="reiserfs"/>
    </attributes>
  </instance_attributes>
</primitive>
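Snippets with this shape can also be generated instead of typed by hand. The following Python sketch builds an equivalent primitive element from a dictionary of parameters. The function name build_primitive and the id naming scheme (ia-<id>, <id>-nv-<n>) are assumptions made for this illustration; they are not prescribed by Heartbeat and differ from the ids used in the example above.

```python
import xml.etree.ElementTree as ET


def build_primitive(rsc_id, rsc_type, params, provider="heartbeat"):
    """Build an OCF <primitive> element with one nvpair per parameter."""
    primitive = ET.Element("primitive", {
        "id": rsc_id, "class": "ocf", "provider": provider, "type": rsc_type,
    })
    # Assumed id scheme: derive all nested ids from the resource id.
    ia = ET.SubElement(primitive, "instance_attributes", {"id": f"ia-{rsc_id}"})
    attributes = ET.SubElement(ia, "attributes")
    for index, (name, value) in enumerate(params.items(), start=1):
        ET.SubElement(attributes, "nvpair", {
            "id": f"{rsc_id}-nv-{index}", "name": name, "value": value,
        })
    return primitive


fs = build_primitive("filesystem_resource", "Filesystem", {
    "device": "/dev/drbd0",
    "directory": "/srv/failover",
    "fstype": "reiserfs",
})
print(ET.tostring(fs, encoding="unicode"))
```

The printed XML could be written to a file and loaded with cibadmin -C -o resources -x as described earlier.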
Before starting with the drbd Heartbeat configuration, set
up a drbd device manually. Basically this
is configuring drbd in /etc/drbd.conf and
letting it synchronize. The exact procedure for configuring
drbd is described in the Storage Administration Guide. For now, assume
that you configured a resource r0 that may
be accessed at the device /dev/drbd0 on
both of your cluster nodes.
The drbd resource is an OCF master
slave resource. This can be found in the description of the
metadata of the drbd RA. However, more important is that there
are the actions promote and
demote in the actions
section of the metadata. These are mandatory for master slave
resources and commonly not available to other resources.
For Heartbeat, master slave resources may have multiple
masters on different nodes. It is even possible to have a
master and slave on the same node. Therefore, configure this
resource in a way that there is exactly one master and one
slave, each running on different nodes. Do this with the meta
attributes of the master_slave resource.
Master slave resources are a special kind of clone resources
in Heartbeat. Every master and every slave counts as a clone.
<master_slave id="drbd_resource" ordered="false">
  <meta_attributes>
    <attributes>
      <nvpair id="drbd-nv-1" name="clone_max" value="2"/>
      <nvpair id="drbd-nv-2" name="clone_node_max" value="1"/>
      <nvpair id="drbd-nv-3" name="master_max" value="1"/>
      <nvpair id="drbd-nv-4" name="master_node_max" value="1"/>
      <nvpair id="drbd-nv-5" name="notify" value="yes"/>
    </attributes>
  </meta_attributes>
  <primitive id="drbd_r0" class="ocf" provider="heartbeat" type="drbd">
    <instance_attributes id="ia-drbd_1">
      <attributes>
        <nvpair id="drbd-nv-6" name="drbd_resource" value="r0"/>
      </attributes>
    </instance_attributes>
  </primitive>
</master_slave>
The master element of this resource is
The actually working RA inside this
The most important parameter this resource needs to know about is the name of the drbd resource to handle.
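Nested snippets like this one carry many id attributes, and it is easy to reuse one by accident when copying nvpair lines. Assuming that ids are expected to be unique across the configuration, the following Python sketch reports every id that occurs more than once. The function name duplicate_ids is hypothetical, and the sample deliberately repeats one id to show the check at work.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(xml_text):
    """Return the sorted list of id values that occur more than once."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, count in counts.items() if count > 1)


# Shortened sample; "drbd-nv-2" is reused on purpose.
SNIPPET = """\
<master_slave id="drbd_resource">
  <meta_attributes>
    <attributes>
      <nvpair id="drbd-nv-1" name="clone_max" value="2"/>
      <nvpair id="drbd-nv-2" name="clone_node_max" value="1"/>
    </attributes>
  </meta_attributes>
  <primitive id="drbd_r0" class="ocf" provider="heartbeat" type="drbd">
    <instance_attributes id="ia-drbd_1">
      <attributes>
        <nvpair id="drbd-nv-2" name="drbd_resource" value="r0"/>
      </attributes>
    </instance_attributes>
  </primitive>
</master_slave>
"""

print(duplicate_ids(SNIPPET))  # the reused id "drbd-nv-2" is reported
```

Running such a check on a snippet before loading it is cheaper than finding the problem after the upload.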
To make the NFS server always available at the same IP address, use an additional IP address as well as the ones the machines use for their normal operation. This IP address is then assigned to the active NFS server in addition to the system's IP address.
The NFS server and the IP address of the NFS server should always be active on the same machine. In this case, the start sequence is not very important. They may even be started at the same time. These are the typical requirements for a group resource.
Before starting the Heartbeat RA configuration, configure
the NFS server with YaST. Do not let the system start
the NFS server. Just set up the configuration file. If you
want to do that manually, see the manual page exports(5)
(man 5 exports). The configuration file is
/etc/exports. The NFS server is
configured as an LSB resource.
Configure the IP address completely with the Heartbeat RA configuration. No additional modification is necessary in the system. The IP address RA is an OCF RA.
<group id="nfs_group">
  <primitive id="nfs_resource" class="lsb" type="nfsserver"/>
  <primitive id="ip_resource" class="ocf" provider="heartbeat" type="IPaddr">
    <instance_attributes id="ia-ipaddr_1">
      <attributes>
        <nvpair id="ipaddr-nv-1" name="ip" value="10.10.0.1"/>
      </attributes>
    </instance_attributes>
  </primitive>
</group>
In a group resource, there may be several other resources. It must have an ID set.
There is only one mandatory instance attribute in the
Having all the resources configured is only part of the job. Even if the cluster knows all needed resources, it might still not be able to handle them correctly. For example, it would be quite useless to try to mount the file system on the slave node of drbd (in fact, this would fail with drbd). To inform the cluster about these things, define constraints.
In Heartbeat, there are three different kinds of constraints available:
Locational constraints that define on which nodes a
resource may be run (rsc_location).
Colocational constraints that tell the cluster which
resources may or may not run together on a node
(rsc_colocation).
Ordering constraints to define the sequence of actions
(rsc_order).
Locational constraints may be added multiple times for
each resource. All rsc_location constraints
are evaluated for a given resource. A simple example that
gives the resource with the ID
filesystem_1 a preference score of 100 for running on the node with the name
earth would be the following:
<rsc_location id="filesystem_1_location" rsc="filesystem_1">
  <rule id="pref_filesystem_1" score="100">
    <expression attribute="#uname" operation="eq" value="earth"/>
  </rule>
</rsc_location>
To take effect, the
Whether a
It is also possible to use another rule or a
The rsc_colocation constraint is used to
define what resources should run on the same or on different
hosts. It is not possible to give a score other than
INFINITY or -INFINITY,
defining resources to run together always or never to run
together. For example, to run the two resources with the IDs
filesystem_resource and
nfs_group always on the same host, use the
following constraint:
<rsc_colocation id="nfs_on_filesystem" to="filesystem_resource" from="nfs_group" score="INFINITY"/>
For a master slave configuration, it is necessary to know
if the current node is a master in addition to running the
resource locally. This can be checked with an additional
to_role or from_role
attribute.
Sometimes it is necessary to provide an order in which services must start. For example, you cannot mount a file system before the device is available to a system. Ordering constraints can be used to start or stop a service right before or after a different resource meets a special condition, such as being started, stopped, or promoted to master. An ordering constraint looks like the following:
<rsc_order id="nfs_after_filesystem" from="nfs_group" action="start"
  to="filesystem_resource" to_action="start" type="after"/>
With type="after", the action of the from resource is done after the action of the to resource.
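To see what a set of ordering constraints implies as a whole, the from/to pairs can be treated as edges of a dependency graph and sorted topologically. The Python sketch below does this for two constraints written in the same form as the example above, using the resource IDs of this chapter. It is a simplification and not how the cluster itself computes transitions: it looks only at type="after" and ignores the action and to_action attributes. The function name start_sequence is hypothetical, and graphlib requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

CONSTRAINTS = """\
<constraints>
  <rsc_order id="drbd_first" from="filesystem_resource" action="start"
             to="drbd_resource" to_action="promote" type="after"/>
  <rsc_order id="nfs_second" from="nfs_group" action="start"
             to="filesystem_resource" to_action="start" type="after"/>
</constraints>
"""


def start_sequence(constraints_xml):
    """Order resources so each 'from' resource comes after its 'to' resource."""
    sorter = TopologicalSorter()
    for order in ET.fromstring(constraints_xml).iter("rsc_order"):
        if order.get("type") == "after":
            # 'from' acts after 'to', so 'to' is a predecessor of 'from'.
            sorter.add(order.get("from"), order.get("to"))
    return list(sorter.static_order())


print(start_sequence(CONSTRAINTS))
# ['drbd_resource', 'filesystem_resource', 'nfs_group']
```

A contradictory set of constraints would make static_order() raise graphlib.CycleError, which is a convenient way to spot ordering loops before loading them.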
The example used for this chapter is quite useless without additional constraints. It is essential that all resources run on the same machine as the master of the drbd resource. Another thing that is critical is that the drbd resource must be master before any other resource starts. Trying to mount the drbd device when drbd is not master simply fails. The constraints that must be fulfilled look like the following:
The file system must always be on the same node as the master of the drbd resource.
<rsc_colocation id="filesystem_on_master" to="drbd_resource"
  to_role="master" from="filesystem_resource" score="INFINITY"/>
The file system must be mounted on a node after the drbd resource is promoted to master on this node.
<rsc_order id="drbd_first" from="filesystem_resource" action="start"
  to="drbd_resource" to_action="promote" type="after"/>
The NFS server as well as the IP address start after the file system is mounted.
<rsc_order id="nfs_second" from="nfs_group" action="start"
  to="filesystem_resource" to_action="start" type="after"/>
The NFS server as well as the IP address must be on the same node as the file system.
<rsc_colocation id="nfs_on_drbd" to="filesystem_resource"
  from="nfs_group" score="INFINITY"/>
In addition to that, add a constraint that prevents the NFS server from running on a node where drbd is running in slave mode.
<rsc_colocation id="nfs_on_slave" to="drbd_resource"
  to_role="slave" from="nfs_group" score="-INFINITY"/>
The CRM options define the global behavior of a cluster. In
principle, the default values should be acceptable for many
environments, but if you want to use special services, like
STONITH devices, you must inform the cluster about this. All
options of crm_config are set with
nvpair elements and are added to
cib.xml. For example, to change the
cluster-delay from its default value of
60s to 120s, use the
following configuration:
<cluster_property_set>
  <attributes>
    <nvpair id="1" name="cluster-delay" value="120s"/>
  </attributes>
</cluster_property_set>
Write this information to a file and load it to the cluster
with the command cibadmin -C -o crm_config -x
filename. The following
is an overview of all available configuration options:
cluster-delay (interval, default=60s)
This option used to be known as
transition_idle_timeout. If no activity
is recorded in this time, the transition is deemed failed as
are all sent actions that have not yet been confirmed
complete. If any operation initiated has an explicit higher
time-out, the higher value applies.
symmetric_cluster (boolean, default=TRUE)
If true, resources are permitted to run anywhere by default. Otherwise, explicit constraints must be created to specify where they can run.
stonith_enabled (boolean, default=FALSE)
If true, failed nodes are fenced.
no_quorum_policy (enum, default=stop)
ignore
Pretend to have quorum.
freeze
Do not start any resources not currently in the partition. Resources in the partition may be moved to another node within the partition. Fencing is disabled.
stop
Stop all running resources in the partition. Fencing is disabled.
default_resource_stickiness (integer, default=0)
Determines whether resources prefer to stay on their current node or to be moved to a “better” one.
0
Resources are placed optimally in the system.
This may mean they are moved when a
“better” or less-loaded node becomes
available. This option is almost equivalent to
auto_failback on except that the
resource may be moved to nodes other than the one on
which it was previously active.
value > 0
Resources prefer to remain in their current location but may be moved if a more suitable node is available. Higher values indicate a stronger preference for resources to stay where they are.
value < 0
Resources prefer to move away from their current location. Higher absolute values indicate a stronger preference for resources to be moved.
INFINITY
Resources always remain in their current
locations until forced off because the node is no
longer eligible to run the resource (node shutdown,
node standby, or configuration change). This option is
almost equivalent to auto_failback
off except that the resource may be moved to other
nodes than the one on which it was previously active.
-INFINITY
Resources always move away from their current location.
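The five cases listed for default_resource_stickiness can be summarized as a small lookup. The following Python sketch is purely illustrative: the function name describe_stickiness and the short descriptions are made up for this example, and it reflects only the cases documented above, not the cluster's placement logic.

```python
def describe_stickiness(value):
    """Map a default_resource_stickiness value to the behaviour listed above."""
    text = str(value).strip()
    if text == "INFINITY":
        return "always stay until the node is no longer eligible"
    if text == "-INFINITY":
        return "always move away"
    number = int(text)  # raises ValueError for anything else that is not an integer
    if number == 0:
        return "placed optimally, may move to a better node"
    if number > 0:
        return "prefer to stay, stronger with higher values"
    return "prefer to move away, stronger with higher absolute values"


for sample in ("0", "100", "-50", "INFINITY", "-INFINITY"):
    print(sample, "->", describe_stickiness(sample))
```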
is_managed_default (boolean, default=TRUE)
Unless the resource's definition says otherwise:
TRUE
Resources are started, stopped, monitored, and moved as necessary.
FALSE
Resources are not started if stopped, not stopped if started, and do not have any recurring actions scheduled.
stop_orphan_resources (boolean, default=TRUE)
If a resource is found for which there is no definition:
TRUE
Stop the resource.
FALSE
Ignore the resource.
This mostly affects the CRM's behavior when a resource is deleted by an administrator without it first being stopped.
stop_orphan_actions (boolean, default=TRUE)
If a recurring action is found for which there is no definition:
TRUE
Stop the action.
FALSE
Ignore the action.
All available options to the crm_config
are summarized in Policy Engine(7).