Chapter 18. Samba Clustering

Contents

18.1. Conceptual Overview
18.2. Basic Configuration
18.3. Debugging and Testing Clustered Samba
18.4. Joining Active Directory Domains
18.5. For More Information

Abstract

A clustered Samba server provides a High Availability solution in your heterogeneous networks. This chapter explains some background information and how to set up a clustered Samba server.

18.1. Conceptual Overview

Trivial Database (TDB) has been used by Samba for many years. It allows multiple applications to write simultaneously. To make sure all write operations are successfully performed and do not collide with each other, TDB uses an internal locking mechanism.

Cluster Trivial Database (CTDB) is a small extension of the existing TDB. CTDB is described by the project as a cluster implementation of the TDB database used by Samba and other projects to store temporary data.

Each cluster node runs a local CTDB daemon. Samba communicates with its local CTDB daemon instead of writing directly to its TDB. The daemons exchange metadata over the network, but actual write and read operations are done on a local copy with fast storage. The concept of CTDB is displayed in Figure 18.1, “Structure of a CTDB Cluster”.

[Note] CTDB for Samba Only

The current implementation of the CTDB Resource Agent configures CTDB to only manage Samba. Everything else, including IP failover, should be configured with Pacemaker.

CTDB is only supported for completely homogeneous clusters. For example, all nodes in the cluster need to have the same architecture. You cannot mix i586 with x86_64.

Figure 18.1. Structure of a CTDB Cluster

A clustered Samba server must share certain data:

  • Mapping table that associates Unix user and group IDs to Windows users and groups.

  • The user database must be synchronized between all nodes.

  • Join information for a member server in a Windows domain must be available on all nodes.

  • Metadata has to be available on all nodes, like active SMB sessions, share connections, and various locks.

The goal is that a clustered Samba server with N+1 nodes is faster than one with only N nodes, and that a single-node cluster is no slower than an unclustered Samba server.

18.2. Basic Configuration

[Note] Changed Configuration Files

The CTDB Resource Agent automatically changes /etc/sysconfig/ctdb and /etc/samba/smb.conf. Use crm ra info CTDB to list all parameters that can be specified for the CTDB resource.

To set up a clustered Samba server, proceed as follows:

  1. Prepare your cluster:

    1. Configure your cluster (OpenAIS, Pacemaker, OCFS2) as described in this guide in Part II, “Configuration and Administration”.

    2. Configure a shared file system, like OCFS2, and mount it, for example, on /shared.

    3. If you want to use POSIX ACLs, enable them:

      • For a new OCFS2 file system use:

        mkfs.ocfs2 --fs-features=xattr ...
      • For an existing OCFS2 file system use:

        tunefs.ocfs2 --fs-features=xattr DEVICE

        Make sure the acl option is specified in the file system resource. Use the crm shell as follows:

        crm(live)configure# primitive ocfs2-3 ocf:heartbeat:Filesystem options="acl" ...
    4. Make sure the services ctdb, smb, nmb, and winbind are disabled:

      chkconfig ctdb off
      chkconfig smb off
      chkconfig nmb off
      chkconfig winbind off
  2. Create a directory for the CTDB lock on the shared file system:

    mkdir -p /shared/samba/
  3. In /etc/ctdb/nodes, list the private IP addresses of all nodes in the cluster, one per line:

    192.168.1.10
    192.168.1.11
  4. Add a CTDB resource to the cluster:

    crm configure
    crm(live)configure# primitive ctdb ocf:heartbeat:CTDB params \
        ctdb_recovery_lock="/shared/samba/ctdb.lock" \
        op monitor timeout=20 interval=10
    crm(live)configure# clone ctdb-clone ctdb \
        meta globally-unique="false" interleave="true"
    crm(live)configure# colocation ctdb-with-fs inf: ctdb-clone fs-clone
    crm(live)configure# order start-ctdb-after-fs inf: fs-clone ctdb-clone
    crm(live)configure# commit
  5. Add a clustered IP address:

    crm(live)configure# primitive ip ocf:heartbeat:IPaddr2 params ip=192.168.2.222 \
      clusterip_hash="sourceip-sourceport" op monitor interval=60s
    crm(live)configure# clone ip-clone ip meta globally-unique="true"
    crm(live)configure# colocation ip-with-ctdb inf: ip-clone ctdb-clone
    crm(live)configure# order start-ip-after-ctdb inf: ctdb-clone ip-clone
    crm(live)configure# commit
  6. Check the result:

    crm status
     Clone Set: dlm-clone
         Started: [ hex-14 hex-13 ]
     Clone Set: o2cb-clone
         Started: [ hex-14 hex-13 ]
     Clone Set: c-ocfs2-3
         Started: [ hex-14 hex-13 ]
     Clone Set: ctdb-clone
         Started: [ hex-14 hex-13 ]
     Clone Set: ip-clone (unique)
         ip:0       (ocf::heartbeat:IPaddr2):       Started hex-13
         ip:1       (ocf::heartbeat:IPaddr2):       Started hex-14
  7. Test from a client machine. On a Linux client, run the following command to see if you can copy files from and to the system:

    smbclient //192.168.2.222/myshare
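
The nodes file from Step 3 can be checked before the CTDB resource is started. The following is a minimal sketch and not part of CTDB: validate_nodes_file is a hypothetical helper that accepts a file only if every line is a dotted-quad IPv4 address, as in the example above. It is deliberately strict, so it also rejects blank lines, host names, and anything else that does not match that example.

```shell
# Sketch: sanity-check a CTDB nodes file (assumption: one IPv4 address per line).
# validate_nodes_file is a hypothetical helper name, not a CTDB command.
validate_nodes_file() {
    nodes_file=$1
    [ -r "$nodes_file" ] || { echo "cannot read $nodes_file" >&2; return 2; }
    awk '
        {
            # A valid line consists of exactly four numeric fields from 0 to 255.
            n = split($0, octet, ".")
            ok = (n == 4)
            for (i = 1; i <= n && ok; i++)
                if (octet[i] !~ /^[0-9]+$/ || octet[i] + 0 > 255) ok = 0
            if (!ok) { printf "line %d: not an IPv4 address: %s\n", NR, $0; bad = 1 }
        }
        # Fail on any bad line and also on an empty file.
        END { exit ((bad || NR == 0) ? 1 : 0) }
    ' "$nodes_file"
}
```

For example, run validate_nodes_file /etc/ctdb/nodes on each node. A non-zero exit status means that at least one line needs attention.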

18.3. Debugging and Testing Clustered Samba

To debug your clustered Samba server, the following tools which operate on different levels are available:

ctdb_diagnostics

Run this tool to diagnose your clustered Samba server. Detailed debug messages should help you track down any problems you might have.

The ctdb_diagnostics command searches for the following files which must be available on all nodes:

/etc/krb5.conf
/etc/hosts
/etc/ctdb/nodes
/etc/sysconfig/ctdb
/etc/resolv.conf
/etc/nsswitch.conf
/etc/sysctl.conf
/etc/samba/smb.conf
/etc/fstab
/etc/multipath.conf
/etc/pam.d/system-auth
/etc/sysconfig/nfs
/etc/exports
/etc/vsftpd/vsftpd.conf

If the files /etc/ctdb/public_addresses and /etc/ctdb/static-routes exist, they will be checked as well.
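
Before running ctdb_diagnostics, it can be useful to see quickly which of the files listed above are absent on a node. The following is only a sketch under stated assumptions: report_missing_files is a hypothetical helper, and its optional root prefix argument exists only so that the check can be tried against a test directory. It does not compare file contents between nodes.

```shell
# Sketch: report which of the files that ctdb_diagnostics looks for are missing.
# report_missing_files is a hypothetical helper; the optional first argument is
# a root prefix (for example a test directory), empty for the real system.
report_missing_files() {
    root=${1:-}
    missing=0
    for f in /etc/krb5.conf /etc/hosts /etc/ctdb/nodes /etc/sysconfig/ctdb \
             /etc/resolv.conf /etc/nsswitch.conf /etc/sysctl.conf \
             /etc/samba/smb.conf /etc/fstab /etc/multipath.conf \
             /etc/pam.d/system-auth /etc/sysconfig/nfs /etc/exports \
             /etc/vsftpd/vsftpd.conf; do
        if [ ! -e "$root$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing was reported.
    [ "$missing" -eq 0 ]
}
```

Calling report_missing_files without an argument checks the local node. Run it on every node, because the files must be available on all of them.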

ping_pong

Check whether your file system is suitable for CTDB with ping_pong. It tests the coherence and performance of your cluster file system (see http://wiki.samba.org/index.php/Ping_pong) and gives some indication of how your cluster may behave under high load.

send_arp Tool and SendArp Resource Agent

The SendArp resource agent is located in /usr/lib/heartbeat/send_arp (or /usr/lib64/heartbeat/send_arp). The send_arp tool sends out a gratuitous ARP (Address Resolution Protocol) packet and can be used for updating other machines' ARP tables. It can help to identify communication problems after a failover process. If you cannot connect to a node or ping it although it shows the clustered IP address for Samba, use the send_arp command to test whether the nodes only need an ARP table update.

For more information, refer to http://wiki.wireshark.org/Gratuitous_ARP.

To test certain aspects of your cluster file system proceed as follows:

Procedure 18.1. Test Coherence and Performance of Your Cluster File System

  1. Start the command ping_pong on one node and replace the placeholder N with the number of nodes plus one. The file data.txt is available in your shared storage and is therefore accessible on all nodes:

    ping_pong data.txt N

    Expect a very high locking rate as you are running only one node. If the program does not print a locking rate, replace your cluster file system.

  2. Start a second copy of ping_pong on another node with the same parameters.

    Expect to see a dramatic drop in the locking rate. If any of the following applies to your cluster file system, replace it:

    • ping_pong does not print a locking rate per second,

    • the locking rates in the two instances are not almost equal,

    • the locking rate did not drop after you started the second instance.

  3. Start a third copy of ping_pong. Add another node and note how the locking rates change.

  4. Kill the ping_pong commands one after the other. You should observe an increase of the locking rate until you get back to the single node case. If you did not get the expected behavior, find more information in Chapter 14, OCFS2.
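
The criteria in Step 2 can be written down as simple integer comparisons once you have noted the locking rates by hand. The following is a sketch only: rates_almost_equal and rate_dropped are hypothetical helpers, and the default tolerance of 10 percent is an assumption, because the procedure does not define how close "almost equal" has to be.

```shell
# rates_almost_equal RATE_A RATE_B [TOLERANCE_PERCENT]
# Succeeds if two integer locking rates differ by at most TOLERANCE_PERCENT
# of the larger one. The default of 10 percent is an assumption.
rates_almost_equal() {
    a=$1
    b=$2
    tol=${3:-10}
    if [ "$a" -ge "$b" ]; then hi=$a; lo=$b; else hi=$b; lo=$a; fi
    [ "$hi" -gt 0 ] || return 1    # no locking rate at all counts as a failure
    [ $(( (hi - lo) * 100 )) -le $(( hi * tol )) ]
}

# rate_dropped SINGLE_NODE_RATE TWO_NODE_RATE
# Succeeds if the rate with two instances is lower than with one instance.
rate_dropped() {
    [ "$2" -lt "$1" ]
}
```

For example, rates_almost_equal 2100 2000 succeeds with the default tolerance, while rates_almost_equal 2100 900 fails and would point to a problem with the cluster file system.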

18.4. Joining Active Directory Domains

Active Directory (AD) is a directory service for Windows server systems.

To join your cluster to an Active Directory domain, the general procedure is as follows:

  1. Consult your Windows Server documentation for instructions on how to set up an Active Directory domain. In this example, we use the following parameters:

    AD and DNS server

    win2k3.2k3test.example.com

    AD domain

    2k3test.example.com

    Cluster AD member NETBIOS name

    CTDB-SERVER
  2. Configure CTDB as described in Procedure 18.2, “Configuring CTDB”.

  3. Join the domain as described in Procedure 18.3, “Joining Active Directory”.

The next step is to configure the CTDB:

Procedure 18.2. Configuring CTDB

  1. Make sure you have configured your cluster as shown in Section 18.2, “Basic Configuration”.

  2. Create a primitive with colocation and ordering constraints:

    # crm configure
    crm(live)configure# primitive ctdb ocf:heartbeat:CTDB \
          params ctdb_recovery_lock="/clusterfs/samba/ctdb.lock" \
          op monitor interval="10" timeout="20" \
          op start interval="0" timeout="60" \
          op stop interval="0" timeout="60"
    crm(live)configure# clone ctdb-clone ctdb \
          meta globally-unique="false" interleave="true"
    crm(live)configure# colocation ctdb-with-fs inf: ctdb-clone c-clusterfs
    crm(live)configure# order start-ctdb-after-fs inf: c-clusterfs ctdb-clone
    crm(live)configure# commit
  3. Stop the CTDB resource on one node:

    # crm resource stop ctdb-clone
  4. Change the /etc/samba/smb.conf configuration file:

    [global]
        workgroup = 2K3TEST
        realm = 2k3test.example.com
        security = ADS
        netbios name = CTDB-SERVER
        idmap config * : range = 1000000-2000000
  5. Update the file /etc/samba/smb.conf on all nodes:

    csync2 -xv
  6. Restart the CTDB resource:

    # crm resource start ctdb-clone
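
After Step 5, you may want to confirm on each node that the synchronized Samba configuration contains the settings from Step 4. The following is a minimal sketch, not a replacement for Samba's own testparm utility: smb_conf_global is a hypothetical helper that reads a parameter from the [global] section with a simplified parser. It matches parameter names case-insensitively but does not handle include files or other smb.conf subtleties.

```shell
# smb_conf_global FILE KEY
# Print the value of KEY from the [global] section of FILE (last match wins).
# Hypothetical helper with a simplified parser; it does not follow includes.
smb_conf_global() {
    awk -v key="$2" '
        # A section header switches the [global] flag on or off.
        /^[[:space:]]*\[/ {
            in_global = (tolower($0) ~ /^[[:space:]]*\[global\]/)
            next
        }
        /^[[:space:]]*[#;]/ { next }    # skip comment lines
        in_global {
            pos = index($0, "=")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
            if (tolower(k) == tolower(key)) { val = v; found = 1 }
        }
        END { if (found) print val; exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, smb_conf_global /etc/samba/smb.conf security prints the configured security mode, which is ADS with the settings from Step 4.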

Finally, join your cluster to the Active Directory server:

Procedure 18.3. Joining Active Directory

  1. Edit the file /etc/resolv.conf and set the nameserver to your AD server. Make this change on all nodes.

    You can use Csync2 for this file if—and only if—the following conditions are true:

    • The content of /etc/resolv.conf is the same on all nodes.

    • The file /etc/resolv.conf is manually edited.

    • The file is not generated by the YaST network module.

    In that case, add the file /etc/resolv.conf into the Csync2 configuration file /etc/csync2/csync2.cfg.

  2. Synchronize the cluster node clocks with the AD server. This can be ensured either with Kerberos or the Network Time Protocol (NTP).

  3. Run crm configure edit and search for the ctdb resource. Add the following line:

    ctdb_manages_winbind="false"
  4. Set the following lines in /etc/nsswitch.conf:

    passwd: files winbind
    group:  files winbind
  5. Restart the NSC daemon:

    rcnscd restart
  6. Create the Kerberos configuration file /etc/krb5.conf:

    [libdefaults]
        default_realm = 2k3test.example.com
    
    [realms]
        2k3test.example.com = {
            kdc = win2k3.2k3test.example.com
            admin_server = win2k3.2k3test.example.com
            default_domain = 2k3test.example.com
        }
    
    [domain_realm]
        .2k3test.example.com = 2k3test.example.com
        2k3test.example.com = 2k3test.example.com
  7. Run CTDB on all nodes:

    # crm resource cleanup ctdb:0
    # crm resource cleanup ctdb:1
  8. Wait until the unhealthy status disappears. The status should look like this:

    # ctdb status
    Number of nodes:2
    pnn:0 192.168.1.10  OK (THIS NODE)
    pnn:1 192.168.1.20  OK
    Generation:1046869196
    Size:2
    hash:0 lmaster:0
    hash:1 lmaster:1
    Recovery mode:NORMAL (0)
    Recovery master:0
  9. Join the realm 2k3test.example.com:

    # net ads join -U Administrator
    Enter Administrator's password: ******
    Using short domain name -- 2K3TEST
    Joined 'CTDB-SERVER' to realm '2k3test.example.com'
    Not doing automatic DNS update in a clustered setup.
  10. Change the ctdb_manages_winbind option:

    1. Stop the ctdb resource:

      crm resource stop ctdb-clone
    2. Run crm configure edit and search for the ctdb resource as you did in Step 3. Change the value from false to true:

      ctdb_manages_winbind="true"
    3. Restart the ctdb resource:

      # crm resource start ctdb-clone
  11. Run the following command on all nodes to list the Active Directory users:

    # wbinfo -u
    2K3TEST\administrator
    2K3TEST\guest
    2K3TEST\support_388945a0
    2K3TEST\krbtgt
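
The wait in Step 8 can be scripted instead of checking ctdb status repeatedly by hand. The following is a sketch that assumes the output layout shown in Step 8, with one pnn: line per node and the node state in the third field. ctdb_all_nodes_ok is a hypothetical helper, not a ctdb subcommand.

```shell
# ctdb_all_nodes_ok: read `ctdb status` output on standard input.
# Succeeds only if at least one pnn: line is present and every pnn: line
# reports OK in its third field (assumption: layout as shown in Step 8).
ctdb_all_nodes_ok() {
    awk '
        /^[[:space:]]*pnn:/ { seen++; if ($3 != "OK") bad++ }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}

# Possible use on a cluster node (the 5-second poll interval is arbitrary):
#   until ctdb status | ctdb_all_nodes_ok; do sleep 5; done
```

The loop in the comment ends when no node reports a state other than OK. Consider adding a timeout so that it cannot wait forever if a node never recovers.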

18.5. For More Information