Abstract
A clustered Samba server provides a High Availability solution in your heterogeneous networks. This chapter explains some background information and how to set up your clustered Samba server.
Trivial Database (TDB) has been used by Samba for many years. It allows multiple applications to write simultaneously. To make sure all write operations are successfully performed and do not collide with each other, TDB uses an internal locking mechanism.
Cluster Trivial Database (CTDB) is a small extension of the existing TDB. CTDB is described by the project itself as a “cluster implementation of the TDB database used by Samba and other projects to store temporary data”.
Each cluster node runs a local CTDB daemon. Samba communicates with its local CTDB daemon, instead of writing directly to its TDB. The daemons exchange metadata over the network, but actual write and read operations are done on a local copy with fast storage. The concept of CTDB is displayed in Figure 16.1, “Structure of a CTDB Cluster”.
![]() | CTDB For Samba Only |
|---|---|
The current implementation of the CTDB Resource Agent configures CTDB to only manage Samba. Everything else, including IP failover, should be configured with Pacemaker. Furthermore, CTDB is only supported for completely homogeneous clusters. For example, all nodes in the cluster need to have the same architecture; you cannot mix i586 with x86_64. | |
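The homogeneity requirement can be checked before you start configuring. The following is a minimal sketch, not part of CTDB or the resource agent: `same_arch` is a hypothetical helper that compares architecture strings you have collected yourself, for example by running `uname -m` on each node.

```shell
# Hypothetical helper: succeeds only if every argument is the same
# architecture string (for example the output of `uname -m` per node).
same_arch() {
    first=$1
    for arch in "$@"; do
        [ "$arch" = "$first" ] || return 1
    done
    return 0
}

# Example with values collected by hand from two nodes:
if same_arch x86_64 x86_64; then
    echo "homogeneous"
else
    echo "mixed architectures"
fi
```

How you gather the values (SSH, a management tool, or manually) is left open here; the sketch only covers the comparison.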
A clustered Samba server must share certain data:
Mapping table that associates Unix user and group IDs to Windows users and groups.
User database must be synchronized between all nodes.
Join information for a member server in a Windows domain must be available on all nodes.
Metadata, such as active SMB sessions, share connections, and various locks, has to be available on all nodes.
The goal is that a clustered Samba server with N+1 nodes is faster than one with only N nodes, and that a cluster with a single node is not slower than an unclustered Samba server.
![]() | Changed Configuration Files |
|---|---|
The CTDB Resource Agent automatically changes
| |
To set up a clustered Samba server, proceed as follows:
Prepare your cluster:
Configure your cluster (OpenAIS, Pacemaker, OCFS2) as described in this guide in Part II, “Configuration and Administration”.
Configure a shared file system, like OCFS2, and mount it, for example, on /shared.
If you want to turn on POSIX ACLs, enable them:
For a new OCFS2 file system use:
mkfs.ocfs2 --fs-features=xattr ...
For an existing OCFS2 file system use:
tunefs.ocfs2 --fs-features=xattr DEVICE
Make sure the acl option is specified in the file system resource. Use the crm shell as follows:
crm(live)configure# primitive ocfs2-3 ocf:heartbeat:Filesystem params options="acl" ...
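After the file system resource has started, you may want to confirm that the acl option is actually in effect on the mount. The sketch below assumes the option appears literally in the option list of a mount table in the /proc/mounts layout; `has_mount_option` is a hypothetical helper, not a tool shipped with OCFS2 or Pacemaker.

```shell
# Hypothetical helper: has_mount_option TABLE MOUNTPOINT OPTION
# TABLE uses the /proc/mounts layout:
#   device mountpoint fstype options dump pass
has_mount_option() {
    awk -v mp="$2" -v opt="$3" '
        $2 == mp {
            # Split the comma-separated option list and look for an exact match.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example: check the /shared mount used in this chapter on the local node.
if [ -r /proc/mounts ] && has_mount_option /proc/mounts /shared acl; then
    echo "acl is active on /shared"
else
    echo "acl not found for /shared"
fi
```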
Make sure the services ctdb, smb, nmb, and winbind are disabled:
chkconfig ctdb off
chkconfig smb off
chkconfig nmb off
chkconfig winbind off
Create directories for CTDB lock and Samba state on the shared file system:
mkdir -p /shared/samba/private
In /etc/ctdb/nodes, insert the private IP addresses of all nodes in the cluster, one address per line:
192.168.1.10
192.168.1.11
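A malformed nodes file is easy to overlook, so a quick sanity check can help. The following sketch assumes IPv4 addresses only and skips blank lines, which is a simplification of this example rather than a statement about what CTDB accepts; `check_nodes_file` is a hypothetical helper.

```shell
# Hypothetical helper: check_nodes_file FILE
# Succeeds if every non-empty line holds exactly one dotted-quad IPv4
# address with each octet in the range 0-255.
check_nodes_file() {
    awk '
        NF == 0 { next }
        NF != 1 || $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { bad = 1; next }
        {
            split($1, octet, ".")
            for (i = 1; i <= 4; i++)
                if (octet[i] > 255) bad = 1
        }
        END { exit bad ? 1 : 0 }
    ' "$1"
}

# Example usage on a cluster node:
# check_nodes_file /etc/ctdb/nodes || echo "please review /etc/ctdb/nodes"
```

Note that the check rejects two addresses on the same line, which is a common copy-and-paste mistake.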
Add a CTDB resource to the cluster:
crm configure
crm(live)configure# primitive ctdb ocf:heartbeat:CTDB params \
    ctdb_recovery_lock="/shared/samba/ctdb.lock" \
    smb_private_dir="/shared/samba/private" \
    op monitor timeout=20 interval=10
crm(live)configure# clone ctdb-clone ctdb \
    meta globally-unique="false" interleave="true"
crm(live)configure# colocation ctdb-with-fs inf: ctdb-clone fs-clone
crm(live)configure# order start-ctdb-after-fs inf: fs-clone ctdb-clone
crm(live)configure# commit
Add a clustered IP address:
crm(live)configure# primitive ip ocf:heartbeat:IPaddr2 params ip=192.168.2.222 \
    clusterip_hash="sourceip-sourceport" op monitor interval=60s
crm(live)configure# clone ip-clone ip meta globally-unique="true"
crm(live)configure# colocation ip-with-ctdb inf: ip-clone ctdb-clone
crm(live)configure# order start-ip-after-ctdb inf: ctdb-clone ip-clone
crm(live)configure# commit
Check the result:
crm status
Clone Set: dlm-clone
Started: [ hex-14 hex-13 ]
Clone Set: o2cb-clone
Started: [ hex-14 hex-13 ]
Clone Set: c-ocfs2-3
Started: [ hex-14 hex-13 ]
Clone Set: ctdb-clone
Started: [ hex-14 hex-13 ]
Clone Set: ip-clone (unique)
ip:0 (ocf::heartbeat:IPaddr2): Started hex-13
ip:1 (ocf::heartbeat:IPaddr2): Started hex-14
Test from a client machine. On a Linux client, run the following command to see if you can copy files from and to the system:
smbclient //192.168.2.222/myshare
To debug your clustered Samba server, the following tools, which operate on different levels, are available:
Run ctdb_diagnostics to diagnose your clustered Samba server. It prints many debug messages that should help you track down any problems you might have.
The ctdb_diagnostics command searches for the following files which must be available on all nodes:
/etc/krb5.conf
/etc/hosts
/etc/ctdb/nodes
/etc/sysconfig/ctdb
/etc/resolv.conf
/etc/nsswitch.conf
/etc/sysctl.conf
/etc/samba/smb.conf
/etc/fstab
/etc/multipath.conf
/etc/pam.d/system-auth
/etc/sysconfig/nfs
/etc/exports
/etc/vsftpd/vsftpd.conf
If the files /etc/ctdb/public_addresses and /etc/ctdb/static-routes exist, they will be checked as well.
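If you want a quick overview of which of these files are absent on a node before running the diagnostics, a simple existence check is enough. The sketch below is not what ctdb_diagnostics does internally; `list_missing` is a hypothetical helper, and the ROOT argument exists only so the check can be pointed at a copied directory tree instead of the live system.

```shell
# Hypothetical helper: list_missing ROOT FILE...
# Prints every FILE that does not exist below ROOT.
# Pass "" as ROOT to check the live system.
list_missing() {
    root=$1
    shift
    for f in "$@"; do
        [ -e "$root$f" ] || echo "$f"
    done
}

# Example: check a few of the files named above on the local node.
list_missing "" /etc/hosts /etc/ctdb/nodes /etc/samba/smb.conf
```

An empty result means all listed files are present on that node. Run the same check on every node, because the files must be available cluster-wide.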
Check whether your file system is suitable for CTDB with ping_pong. It performs certain tests of your cluster file system, such as coherence and performance (see http://wiki.samba.org/index.php/Ping_pong), and as such gives some indication of how your cluster may behave under high load.
To test certain aspects of your cluster file system, proceed as follows:
Procedure 16.1. Test Coherence and Performance of Your Cluster File System
Start the command ping_pong on one node and replace the placeholder N with the number of nodes plus one. The file data.txt is located on your shared storage and is therefore accessible on all nodes:
ping_pong data.txt N
Expect a very high locking rate as you are running only one node. If the program does not print a locking rate, replace your cluster file system.
Start a second copy of ping_pong on another node with the same parameters.
Expect to see a dramatic drop in the locking rate. If any of the following applies to your cluster file system, replace it:
ping_pong does not print a locking rate per second
the locking rates in the two instances are not almost equal
the locking rate did not drop after you have started the second instance
Start a third copy of ping_pong. Add another node and note how the locking rates change.
Kill the ping_pong commands one after the other. You should observe an increase in the locking rate until you get back to the single-node case. If you did not get the expected behaviour, replace your cluster file system.
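The criteria in this procedure can be written down as a small check once you have noted the locking rates by hand. The following is a rough sketch: `rates_look_sane` is a hypothetical helper, and the default tolerance of 10 percent for "almost equal" is an assumption of this example, not a value defined by ping_pong.

```shell
# Hypothetical helper:
#   rates_look_sane SINGLE RATE_A RATE_B [TOLERANCE_PERCENT]
# SINGLE is the locking rate measured with one instance; RATE_A and RATE_B
# are the rates of two concurrent instances. All values are whole numbers.
rates_look_sane() {
    single=$1 a=$2 b=$3 tol=${4:-10}
    # Both concurrent rates must have dropped below the single-instance rate.
    [ "$a" -lt "$single" ] && [ "$b" -lt "$single" ] || return 1
    # The two concurrent rates must differ by at most tol percent of the larger.
    if [ "$a" -ge "$b" ]; then hi=$a lo=$b; else hi=$b lo=$a; fi
    [ $(( (hi - lo) * 100 )) -le $(( hi * tol )) ]
}

# Example with made-up numbers (locks per second):
if rates_look_sane 500000 9800 10000; then
    echo "locking rates behave as expected"
else
    echo "unexpected locking rates"
fi
```

The check only covers the two-instance case; it does not judge how large the drop should be, which depends on your hardware and interconnect.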