Cluster LVM

Contents

14.1. Conceptual Overview
14.2. Configuration of cLVM
14.3. Configuring Eligible LVM2 Devices Explicitly
14.4. For More Information

Abstract

When managing shared storage on a cluster, every node must be informed about changes made to the storage subsystem. The Logical Volume Manager 2 (LVM2), which is widely used to manage local storage, has been extended to support transparent management of volume groups across the whole cluster. Clustered volume groups can be managed using the same commands as local storage.

Conceptual Overview

Clustered LVM relies on the following tools:

Distributed Lock Manager (DLM)

Coordinates disk access for cLVM.

Logical Volume Manager 2 (LVM2)

Enables flexible distribution of one file system over several disks. LVM provides a virtual pool of disk space.

Clustered Logical Volume Manager (cLVM)

Coordinates access to the LVM2 metadata so every node knows about changes. cLVM does not coordinate access to the shared data itself; to coordinate data access, you must configure OCFS2 or another cluster-aware application on top of the cLVM-managed storage.

Configuration of cLVM

Depending on your scenario it is possible to create a RAID 1 device with cLVM with the following layers:

  • LVM.  This is a very flexible solution if you want to increase or decrease your file system size, add more physical storage, or create snapshots of your file systems. This method is described in Section 14.2.1, “Scenario: cLVM With iSCSI on SANs”.

  • DRBD.  This solution only provides RAID 0 (striping) and RAID 1 (mirroring). This method is described in Section 14.2.2, “Scenario: cLVM With DRBD”.

  • MD Devices (Linux Software RAID or mdadm).  Although this solution provides all RAID levels, it does not support clusters yet.

Make sure you have fulfilled the following prerequisites:

  • A shared storage device is available, such as that provided by a Fibre Channel, FCoE, SCSI, or iSCSI SAN, or by DRBD.

  • In case of DRBD, both nodes must be primary (as described in the following procedure).

  • Check that the locking type of LVM2 is cluster-aware. The keyword locking_type in /etc/lvm/lvm.conf must contain the value 3 (this should be the default). Copy the configuration to all nodes, if necessary.
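The locking type prerequisite can be verified on each node before you continue. The following is a minimal sketch, not part of LVM2: the function name lvm_locking_type is hypothetical, and it assumes the setting appears as a plain "locking_type = N" line in the configuration file.

```shell
# Hypothetical helper: print the active locking_type value from an
# lvm.conf-style file. Lines that are commented out are ignored.
lvm_locking_type() {
    conf=${1:-/etc/lvm/lvm.conf}
    # Match only lines that start with the keyword (after optional
    # whitespace). If several lines match, this sketch reports the last one.
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$conf" | tail -n 1
}

# Usage (run on every node):
#   [ "$(lvm_locking_type)" = "3" ] || echo "locking_type is not 3"
```

The function takes an optional file name so that a copy of the configuration can be checked before it is distributed to the other nodes.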

[Note] Create Cluster Resources First

First create your cluster resources, and then your LVM volumes. Otherwise it is impossible to remove the volumes later.

Scenario: cLVM With iSCSI on SANs

The following scenario uses two SAN boxes which export their iSCSI targets to several clients. The general idea is displayed in Figure 14.1, “Setup of iSCSI with cLVM”.

Figure 14.1. Setup of iSCSI with cLVM

[Warning] Data Loss

The following procedures will destroy any data on your disks!

Configure only one SAN box first. Each SAN box has to export its own iSCSI target. Proceed as follows:

Procedure 14.1. Configuring iSCSI Targets (SAN)

  1. Run YaST and click Network Services > iSCSI Target to start the iSCSI Server module.

  2. If you want to start the iSCSI target whenever your computer is booted, choose When Booting, otherwise choose Manually.

  3. If you have a firewall running, enable Open Port in Firewall.

  4. Switch to the Global tab. If you need authentication, enable incoming authentication, outgoing authentication, or both. In this example, we select No Authentication.

  5. Add a new iSCSI target:

    1. Switch to the Targets tab.

    2. Click Add.

    3. Enter a target name. The name has to be formatted as follows, where DATE has the form YYYY-MM and DOMAIN is the reversed domain name:

      iqn.DATE.DOMAIN
    4. If you want a more descriptive name, you can change it as long as the identifier is unique among your targets.

    5. Click Add.

    6. Enter the device name in Path and use a Scsiid.

    7. Click Next two times.

  6. Confirm the warning box with Yes.

  7. Open the configuration file /etc/iscsi/iscsid.conf and change the parameter node.startup to automatic.
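The target name format from step 5 can be checked before you enter it in YaST. The following is a rough sketch only: the function name is_iqn is hypothetical, and the regular expression is a simplified approximation of the iqn.DATE.DOMAIN format with an optional ":identifier" suffix, as used in the examples in this chapter (for instance iqn.2010-03.de.jupiter:san1).

```shell
# Hypothetical helper: succeed if the argument looks like an iSCSI
# qualified name of the form iqn.YYYY-MM.reversed.domain[:identifier].
# This is a simplified check, not a full validation of the naming rules.
is_iqn() {
    printf '%s\n' "$1" |
        grep -Eq '^iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:.+)?$'
}

# Usage:
#   is_iqn "iqn.2010-03.de.jupiter:san1" && echo "name looks valid"
```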

Now set up your iSCSI initiators as follows:

Procedure 14.2. Configuring iSCSI Initiators

  1. Run YaST and click Network Services > iSCSI Initiator.

  2. If you want to start the iSCSI initiator whenever your computer is booted, choose When Booting, otherwise set Manually.

  3. Change to the Discovery tab and click the Discovery button.

  4. Enter the IP address and port of your iSCSI target (see Procedure 14.1, “Configuring iSCSI Targets (SAN)”). Normally, you can leave the port as it is and use the default value.

  5. If you use authentication, insert the incoming and outgoing username and password, otherwise activate No Authentication.

  6. Select Next. The found connections are displayed in the list.

  7. Proceed with Finish.

  8. Open a shell and log in as root.

  9. Test if the iSCSI initiator has been started successfully:

    iscsiadm -m discovery -t st -p 192.168.3.100
    192.168.3.100:3260,1 iqn.2010-03.de.jupiter:san1
  10. Establish a session:

    iscsiadm -m node -l
    Logging in to [iface: default, target: iqn.2010-03.de.jupiter:san2, portal: 192.168.3.100,3260]
    Logging in to [iface: default, target: iqn.2010-03.de.venus:san1, portal: 192.168.3.101,3260]
    Login to [iface: default, target: iqn.2010-03.de.jupiter:san2, portal: 192.168.3.100,3260]: successful
    Login to [iface: default, target: iqn.2010-03.de.venus:san1, portal: 192.168.3.101,3260]: successful

    List the device names with lsscsi:

    ...
    [4:0:0:2]    disk    IET      ...     0     /dev/sdd
    [5:0:0:1]    disk    IET      ...     0     /dev/sde

    Look for entries with IET in their third column. In this case, the devices are /dev/sdd and /dev/sde.
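The lookup in the last step can be scripted. The following is a minimal sketch, assuming the lsscsi output has the column layout shown above, with the vendor in the third column and the device node in the last one. The function name iet_devices is hypothetical.

```shell
# Hypothetical helper: read lsscsi output on standard input and print the
# device node (last column) of every entry whose vendor column is IET.
iet_devices() {
    awk '$3 == "IET" { print $NF }'
}

# Usage:
#   lsscsi | iet_devices
```

With the example output above, this prints /dev/sdd and /dev/sde, one per line.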

Procedure 14.3. Creating a DLM Resource

  1. Start a shell and log in as root.

  2. Run crm configure.

  3. Enter the following commands:

    primitive dlm ocf:pacemaker:controld
    primitive clvm ocf:lvm2:clvmd \
            params daemon_timeout="30"
    group dlm-clvm dlm clvm
    clone dlm-clvm-clone dlm-clvm \
            meta interleave="true" ordered="true"
  4. Review your changes with show.

  5. If everything is correct, enter commit and leave crm with exit.

Procedure 14.4. Creating the LVM Volume Groups

  1. Open a root shell on one of the nodes where you ran the iSCSI initiator in Procedure 14.2, “Configuring iSCSI Initiators”.

  2. Prepare the physical volumes for LVM by running pvcreate on the disks /dev/sdd and /dev/sde:

    pvcreate /dev/sdd
    pvcreate /dev/sde
  3. Check if everything is correct with pvdisplay:

      --- Physical volume ---
      PV Name               /dev/sdd
      VG Name               clustervg
      PV Size               509,88 MB / not usable 1,88 MB
      Allocatable           yes
      PE Size (KByte)       4096
      Total PE              127
      Free PE               127
      Allocated PE          0
      PV UUID               52okH4-nv3z-2AUL-GhAN-8DAZ-GMtU-Xrn9Kh
    
      --- Physical volume ---
      PV Name               /dev/sde
      VG Name               clustervg
      PV Size               509,84 MB / not usable 1,84 MB
      Allocatable           yes
      PE Size (KByte)       4096
      Total PE              127
      Free PE               127
      Allocated PE          0
      PV UUID               Ouj3Xm-AI58-lxB1-mWm2-xn51-agM2-0UuHFC
  4. Create the cluster-aware volume group on both disks:

    vgcreate --clustered y clustervg /dev/sdd /dev/sde
  5. Check if everything is correct with vgdisplay:

      --- Volume group ---
      VG Name               clustervg
      System ID
      Format                lvm2
      Metadata Areas        2
      Metadata Sequence No  1
      VG Access             read/write
      VG Status             resizable
      Clustered             yes
      Shared                no
      MAX LV                0
      Cur LV                0
      Open LV               0
      Max PV                0
      Cur PV                2
      Act PV                2
      VG Size               1016,00 MB
      PE Size               4,00 MB
      Total PE              254
      Alloc PE / Size       0 / 0
      Free  PE / Size       254 / 1016,00 MB
      VG UUID               UCyWw8-2jqV-enuT-KH4d-NXQI-JhH3-J24anD
  6. Create logical volumes as needed:

    lvcreate --name clusterlv --size 500M clustervg

After you have created the volumes and started your resources, you should have a new device named /dev/dm-0. It is recommended to use a clustered file system on top of your LVM resource, for example OCFS2. For more information, see Chapter 12, Oracle Cluster File System 2.
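The extent counts in the pvdisplay and vgdisplay output of Procedure 14.4 follow from simple integer arithmetic, which helps when deciding how large a logical volume can be. The sketch below uses hypothetical function names and takes the usable sizes in whole megabytes as an assumption for illustration.

```shell
# Hypothetical helpers for extent arithmetic. All sizes are whole MB.

# Number of whole physical extents that fit into a usable size.
pe_total() { echo $(( $1 / $2 )); }

# Size in MB covered by a number of extents of the given extent size.
pe_size_mb() { echo $(( $1 * $2 )); }

# Figures from the example output:
#   pe_total 508 4     -> 127   (per PV: 509,88 MB minus 1,88 MB not usable)
#   pe_size_mb 254 4   -> 1016  (VG: 127 + 127 extents of 4 MB each)
```

The 500 MB logical volume created in the last step therefore occupies 125 of the 254 free extents.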

Scenario: cLVM With DRBD

The following scenario can be used if you have data centers located in different parts of your city, country, or continent.

Procedure 14.5. Creating a Cluster-Aware Volume Group With DRBD

  1. Create a primary/primary DRBD resource:

    1. First, set up a DRBD device as primary/secondary as described in Procedure 13.1, “Manually Configure DRBD”. Make sure the disk state is up-to-date on both nodes. Check this with cat /proc/drbd or with rcdrbd status.

    2. Add the following options to your configuration file (usually something like /etc/drbd.d/r0.res):

      resource r0 {
        startup {
          become-primary-on both;
        }
      
        net {
           allow-two-primaries;
        }
        ...
      }
    3. Copy the changed configuration file to the other node, for example:

      scp /etc/drbd.d/r0.res venus:/etc/drbd.d/
    4. Run the following commands on both nodes:

      drbdadm disconnect r0
      drbdadm connect r0
      drbdadm primary r0
    5. Check the status of your nodes:

      cat /proc/drbd
      ...
       0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
  2. Include the clvmd resource as a clone in the pacemaker configuration, and make it depend on the DLM clone resource. See Procedure 14.3, “Creating a DLM Resource” for detailed instructions. Before proceeding, confirm that these resources have started successfully on your cluster. You may use crm_mon or the GUI to check the running services.

  3. Prepare the physical volume for LVM with the command pvcreate. For example, on the device /dev/drbd_r0 the command would look like this:

    pvcreate /dev/drbd_r0
  4. Create a cluster-aware volume group:

    vgcreate --clustered y myclusterfs /dev/drbd_r0
  5. Create logical volumes as needed. You will probably want to adjust the size of the logical volume. For example, create a 4 GB logical volume with the following command:

    lvcreate --name testlv -L 4G myclusterfs
  6. To ensure that the volume group is activated cluster-wide, configure an LVM resource as follows:

    primitive vg1 ocf:heartbeat:LVM \
            params volgrpname="myclusterfs"
    clone vg1-clone vg1 \
            meta interleave="true" ordered="true"
    colocation colo-vg1 inf: vg1-clone dlm-clvm-clone
    order order-vg1 inf: dlm-clvm-clone vg1-clone
  7. If you want the volume group to be activated exclusively on one node, use the following example instead. In this case, cLVM protects all logical volumes within the VG from being activated on multiple nodes, as an additional measure of protection for non-clustered applications:

    primitive vg1 ocf:heartbeat:LVM \
            params volgrpname="myclusterfs" exclusive="yes"
    colocation colo-vg1 inf: vg1 dlm-clvm-clone
    order order-vg1 inf: dlm-clvm-clone vg1
  8. The logical volumes within the VG are now available for file system mounts or raw usage. Ensure that services using them have proper dependencies that colocate them with the VG and order them after its activation.

After finishing these configuration steps, the LVM2 configuration can be done just like on any standalone workstation.
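Before creating the physical volume in Procedure 14.5, it can be useful to confirm the DRBD state from a script instead of reading /proc/drbd by eye. The following is a minimal sketch, assuming the status line format shown in that procedure. The function name drbd_dual_primary_ok is hypothetical.

```shell
# Hypothetical helper: read DRBD status text on standard input and succeed
# only if a resource is connected, primary on both nodes, and up to date
# on both nodes.
drbd_dual_primary_ok() {
    grep -Eq 'cs:Connected[[:space:]]+ro:Primary/Primary[[:space:]]+ds:UpToDate/UpToDate'
}

# Usage:
#   drbd_dual_primary_ok < /proc/drbd || echo "DRBD is not ready for cLVM"
```

The check succeeds if any line matches, so with several DRBD resources it does not tell you which one is ready.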

Configuring Eligible LVM2 Devices Explicitly

When several devices seemingly share the same physical volume signature (as can be the case for multipath devices or DRBD), it is recommended to explicitly configure the devices which LVM2 scans for PVs.

For example, if the command vgcreate uses the physical device instead of the mirrored block device, DRBD will be confused, which may result in a split-brain condition for DRBD.

To deactivate a single device for LVM2, do the following:

  1. Edit the file /etc/lvm/lvm.conf and search for the line starting with filter.

  2. The patterns there are handled as regular expressions. A leading a accepts devices that match the pattern for the scan; a leading r rejects devices that match the pattern.

  3. To remove a device named /dev/sdb1, add the following expression to the filter rule:

    "r|^/dev/sdb1$|"

    The complete filter line will look like the following:

    filter = [ "r|^/dev/sdb1$|", "r|/dev/.*/by-path/.*|", "r|/dev/.*/by-id/.*|", "a/.*/" ]

    A filter line that accepts DRBD and MPIO devices but rejects all other devices would look like this:

    filter = [ "a|/dev/drbd.*|", "a|/dev/.*/by-id/dm-uuid-mpath-.*|", "r/.*/" ]
  4. Save the configuration file and copy it to all cluster nodes.
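To preview how a filter line treats a given device path, the matching order can be simulated outside of LVM2. The following is a simplified sketch with a hypothetical function name, lvm_filter_decide. It assumes that the first matching pattern decides and that a path matching no pattern is accepted. It checks a single path name only and uses grep extended regular expressions, which may differ from the LVM2 matcher in details.

```shell
# Hypothetical helper: print "accept" or "reject" for one device path,
# given filter patterns in the same order as in lvm.conf.
# Usage: lvm_filter_decide DEVICE PATTERN...
lvm_filter_decide() {
    dev=$1
    shift
    for pat in "$@"; do
        action=${pat%"${pat#?}"}     # first character: a or r
        body=${pat#?}                # delimiter, regex, delimiter
        delim=${body%"${body#?}"}    # delimiter character, e.g. | or /
        regex=${body#?}
        regex=${regex%"$delim"}      # strip the trailing delimiter
        if printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            case $action in
                a) echo accept ;;
                r) echo reject ;;
                *) echo "invalid pattern: $pat" >&2; return 2 ;;
            esac
            return 0
        fi
    done
    echo accept                      # no pattern matched
}

# Usage with the first filter line from above:
#   lvm_filter_decide /dev/sdb1 'r|^/dev/sdb1$|' 'r|/dev/.*/by-path/.*|' \
#       'r|/dev/.*/by-id/.*|' 'a/.*/'
```

Because the first match decides, a catch-all pattern such as "r/.*/" only has an effect when it is the last entry in the list.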

For More Information

Thorough information is available from the Pacemaker mailing list at http://www.clusterlabs.org/wiki/Help:Contents.

The official cLVM FAQ can be found at http://sources.redhat.com/cluster/wiki/FAQ/CLVM.