Oracle Cluster File System 2 (OCFS2) is a general-purpose journaling file system that is fully integrated in the Linux 2.6 kernel and later. OCFS2 allows you to store application binary files, data files, and databases on devices in a SAN. All nodes in a cluster have concurrent read and write access to the file system. A distributed lock manager helps prevent file access conflicts. OCFS2 supports up to 32,000 subdirectories and millions of files in each directory. The O2CB cluster service (a driver) runs on each node to manage the cluster.
OCFS2 was added to SUSE Linux Enterprise Server 9 to support Oracle Real Application Cluster (RAC) databases and their application files, Oracle Home. In SUSE Linux Enterprise Server 10 and later, OCFS2 can be used for any of the following storage solutions:
Oracle RAC and other databases
General applications and workloads
XEN image store in a cluster
XEN virtual machines and virtual servers can be stored on OCFS2 volumes that are mounted by cluster servers to provide quick and easy portability of XEN virtual machines between servers.
LAMP (Linux, Apache, MySQL, and PHP | PERL | Python) stacks
In addition, it is fully integrated with Heartbeat 2.
As a high-performance, symmetric, parallel cluster file system, OCFS2 supports the following functions:
An application’s files are available to all nodes in the cluster. Users simply install it once on an OCFS2 volume in the cluster.
All nodes can concurrently read and write directly to storage via the standard file system interface, enabling easy management of applications that run across a cluster.
File access is coordinated through the Distributed Lock Manager (DLM).
DLM control is good for most cases, but an application’s design might limit scalability if it contends with the DLM to coordinate file access.
Storage backup functionality is available on all back-end storage. An image of the shared application files can be easily created, which can help provide effective disaster recovery.
OCFS2 also provides the following capabilities:
Metadata caching
Metadata journaling
Cross-node file data consistency
A GTK GUI-based administration via the ocfs2console utility
Operation as a shared-root file system
Support for multiple-block sizes (each volume can have a different block size) up to 4 KB, for a maximum volume size of 16 TB
Support for up to 255 cluster nodes
Context-dependent symbolic link (CDSL) support for node-specific local files
Asynchronous and direct I/O support for database files for improved database performance
The O2CB cluster service is a set of modules and in-memory file systems that are required to manage OCFS2 services and volumes. You can enable these modules to be loaded and mounted during system boot. For instructions, see Section 14.6.2, “Configuring OCFS2 Services”.
Table 14.1. O2CB Cluster Service Stack

| Service | Description |
|---|---|
| Node Manager (NM) | Keeps track of all the nodes in the |
| Heartbeat (HB) | Issues up/down notifications when nodes join or leave the cluster |
| TCP | Handles communications between the nodes with the TCP protocol |
| Distributed Lock Manager (DLM) | Keeps track of all locks and their owners and status |
| CONFIGFS | User space configuration file system. For details, see Section 14.3, “In-Memory File Systems” |
| DLMFS | User space interface to the kernel space DLM. For details, see Section 14.3, “In-Memory File Systems” |

OCFS2 requires the nodes to be alive on the network. The O2CB cluster service sends regular keepalive packets to ensure that they are alive. It uses a private connection between nodes instead of the LAN to avoid network delays that might be interpreted as a node disappearing and thus lead to a node’s self-fencing.
The O2CB cluster service communicates the node status via a disk heartbeat. The heartbeat system file resides on the Storage Area Network (SAN), where it is available to all nodes in the cluster. The block assignments in the file correspond sequentially to each node’s slot assignment.
Each node reads the file and writes to its assigned block in the file at two-second intervals. Changes to a node’s time stamp indicate that the node is alive. A node is considered dead if it does not write to the heartbeat file for a specified number of sequential intervals, called the heartbeat threshold. Even if only a single node is alive, the O2CB cluster service must perform this check, because another node could be added dynamically at any time.
You can modify the disk heartbeat threshold in the /etc/sysconfig/o2cb file, using the O2CB_HEARTBEAT_THRESHOLD parameter. The wait time is calculated as follows:
(O2CB_HEARTBEAT_THRESHOLD value - 1) * 2 = threshold in seconds
For example, if the O2CB_HEARTBEAT_THRESHOLD value is set at the default value of 7, the wait time is 12 seconds ((7 - 1) * 2 = 12).
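The wait-time formula can be checked with a few lines of shell arithmetic. This is only a sketch: the function name o2cb_wait_seconds is made up for illustration, and the value 31 is an arbitrary example rather than a recommended setting.

```shell
# Convert an O2CB_HEARTBEAT_THRESHOLD value into the wait time in seconds,
# using the formula from the text: (threshold - 1) * 2.
# The function name is hypothetical; it is not part of the O2CB tools.
o2cb_wait_seconds() {
    threshold=$1
    echo $(( (threshold - 1) * 2 ))
}

o2cb_wait_seconds 7     # default threshold, prints 12
o2cb_wait_seconds 31    # arbitrary larger threshold, prints 60
```

Working backwards, a desired wait time of N seconds corresponds to a threshold of N / 2 + 1.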
OCFS2 uses two in-memory file systems for communications:
Table 14.2. In-Memory File Systems Used by OCFS2
OCFS2 stores node-specific parameter files on the node. The cluster configuration file (/etc/ocfs2/cluster.conf) resides on each node assigned to the cluster.
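To give a rough idea of what such a file contains, the following sketch prints a cluster.conf-style file for a two-node cluster to standard output. The stanza layout and key names are an assumption based on the format commonly shown in OCFS2 documentation, and the node names, the second IP address, and the helper function names are placeholders. Compare the output against a file generated by ocfs2console before relying on it.

```shell
# Print one node stanza. Layout and key names are assumed, not taken from
# this chapter; stanza bodies are indented with a tab character.
# $1 = node name, $2 = IP address, $3 = slot number, $4 = cluster name
print_node() {
    printf 'node:\n'
    printf '\tip_port = 7777\n'
    printf '\tip_address = %s\n' "$2"
    printf '\tnumber = %s\n' "$3"
    printf '\tname = %s\n' "$1"
    printf '\tcluster = %s\n\n' "$4"
}

# Print a complete two-node configuration for the cluster named in $1.
# node1/node2 and the IP addresses are placeholders.
print_cluster_conf() {
    print_node node1 192.168.1.1 0 "$1"
    print_node node2 192.168.1.2 1 "$1"
    printf 'cluster:\n'
    printf '\tnode_count = 2\n'
    printf '\tname = %s\n' "$1"
}

print_cluster_conf ocfs2
```

The script only writes to standard output, so it never touches /etc/ocfs2/cluster.conf itself.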
The ocfs2console utility is a GTK GUI-based interface for managing the configuration of the OCFS2 services in the cluster. Use this utility to set up and save the /etc/ocfs2/cluster.conf file to all member nodes of the cluster. In addition, you can use it to format, tune, mount, and umount OCFS2 volumes.
The file browser column in the ocfs2console utility is prohibitively slow and inconsistent across the cluster. We recommend that you use the ls(1) command to list files instead.
Additional OCFS2 utilities are described in the following table. For information about syntax for these commands, see their man pages.
Table 14.3. OCFS2 Utilities
Use the following commands to manage O2CB services. For more information about the o2cb command syntax, see its man page.
Table 14.4. O2CB Commands
The OCFS2 kernel module (ocfs2) is installed automatically in SUSE Linux Enterprise Server 10 and later. To use OCFS2, use YaST (or the command line if you prefer) to install the ocfs2-tools and ocfs2console packages on each node in the cluster.
Log in as the root user, then open the YaST Control Center.
Select +.
In the field, enter ocfs2.
The software packages ocfs2-tools and ocfs2console should be listed in the right panel. If they are selected, the packages are already installed.
If you need to install the packages, select them, then click and follow the on-screen instructions.
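From the command line, a quick way to see whether the two packages are present is to query the RPM database. The sketch below assumes the standard rpm -q query; the function name and the PKG_QUERY override are made up for illustration, so that the check can be exercised on a machine without rpm.

```shell
# Command used to query a package. Defaults to "rpm -q"; the PKG_QUERY
# variable is a hypothetical hook for testing, not an OCFS2 setting.
PKG_QUERY=${PKG_QUERY:-rpm -q}

# Print the name of each required OCFS2 package that is not installed.
missing_ocfs2_packages() {
    for pkg in ocfs2-tools ocfs2console; do
        # $PKG_QUERY is intentionally unquoted so "rpm -q" splits into words.
        if ! $PKG_QUERY "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

missing_ocfs2_packages
```

No output means both packages are installed. Any names that are printed still need to be installed on that node.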
Follow the procedures in this section to configure your system to use OCFS2 and to create OCFS2 volumes.
Before you begin, do the following:
Initialize, carve, or configure RAIDs (Redundant Array of Independent Disks) on the SAN disks, as needed, to prepare the devices you plan to use for your OCFS2 volumes. Leave the devices as free space.
We recommend that you store application files and data files on different OCFS2 volumes, but it is only mandatory to do so if your application volumes and data volumes have different requirements for mounting. For example, the Oracle RAC database volume requires the datavolume and nointr mounting options, but the Oracle Home volume should never use these options.
Make sure that the ocfs2console and ocfs2-tools packages are installed. Use YaST or command line methods to install them if they are not. For YaST instructions, see Section 14.5, “OCFS2 Packages”.
Before you can create OCFS2 volumes, you must configure OCFS2 services.
In the following procedure, you generate the /etc/ocfs2/cluster.conf file, save the cluster.conf file on all nodes, and create and start the O2CB cluster service (o2cb).
Follow the procedure in this section for one node in the cluster.
Open a terminal window and log in as the root user.
If the o2cb cluster service is not already enabled, enter chkconfig --add o2cb.
When you add a new service, chkconfig ensures that the service has either a start or a kill entry in every run level.
If the ocfs2 service is not already enabled, enter chkconfig --add ocfs2.
Configure the o2cb cluster service driver to load on boot.
Enter /etc/init.d/o2cb configure
At the Load O2CB driver on boot (y/n) [n] prompt, enter y (yes) to enable load on boot.
At the Cluster to start on boot (Enter “none” to clear) [ocfs2] prompt, enter none.
This choice presumes that you are setting up OCFS2 for the first time or resetting the service. You specify a cluster name in the next step when you set up the /etc/ocfs2/cluster.conf file.
Use the ocfs2console utility to set up and save the /etc/ocfs2/cluster.conf file to all member nodes of the cluster.
This file should be the same on all the nodes in the cluster. Use the following steps to set up the first node. Later, you can use the ocfs2console to add new nodes to the cluster dynamically and to propagate the modified cluster.conf file to all nodes.
However, if you change other settings, such as the cluster name and IP address, you must restart the cluster for the changes to take effect, as described in Step 6.
Open the ocfs2console GUI by entering ocfs2console.
In the ocfs2console, select +.
If cluster.conf is not present, the console will create one with a default cluster name of ocfs2. Modify the cluster name as desired.
In the Node Configuration dialog box, click to open the Add Node dialog box.
In the Add Node dialog box, specify the unique name of your primary node, a unique IP address (such as 192.168.1.1), and the port number (optional, default is 7777), then click .
The ocfs2console utility assigns node slot numbers sequentially from 0 to 254.
In the Node Configuration dialog box, click , then click to dismiss the Add Node dialog box.
Click + to save the cluster.conf file to all nodes.
If you need to restart the OCFS2 cluster for the changes to take effect, enter the following lines, waiting in between for the process to return a status of .
/etc/init.d/o2cb stop
/etc/init.d/o2cb start
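The command-line parts of this procedure can be collected into a small script. The commands themselves are the ones given in the steps above. The run helper, the DRY_RUN variable, and the function names are assumptions added for illustration. By default the sketch only prints each command, so you can review the sequence before running it as root.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute it.
# DRY_RUN and run() are hypothetical helpers, not part of the O2CB tools.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register both services, then start the interactive configuration.
enable_o2cb_services() {
    run chkconfig --add o2cb
    run chkconfig --add ocfs2
    run /etc/init.d/o2cb configure   # interactive: answers the boot prompts
}

# Restart the cluster service; start is attempted only if stop succeeds.
restart_o2cb() {
    run /etc/init.d/o2cb stop && run /etc/init.d/o2cb start
}

enable_o2cb_services
restart_o2cb
```

Chaining stop and start with && mirrors the instruction to wait for the first command to return before entering the second.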
Creating an OCFS2 file system and adding new nodes to the cluster should be performed on only one of the nodes in the cluster.
Open a terminal window and log in as the root user.
If the O2CB cluster service is offline, start it by entering the following command, then wait for the process to return a status of .
/etc/init.d/o2cb online ocfs2
Replace ocfs2 with the actual cluster name of your OCFS2 cluster.
The OCFS2 cluster must be online, because the format operation must first ensure that the volume is not mounted on any node in the cluster.
Create and format the volume using one of the following methods:
In EVMSGUI, go to the Volumes page, select +, then specify the configuration settings.
Use the mkfs.ocfs2 utility. For information about the syntax for this command, refer to the mkfs.ocfs2 man page.
In the ocfs2console, click +, select a device in the Available Devices list that you want to use for your OCFS2 volume, specify the configuration settings for the volume, then click to format the volume.
See the following table for recommended settings.
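For the mkfs.ocfs2 method, a command line might look like the one built below. The sketch only prints the command rather than running it, because formatting destroys existing data. The device /dev/sdX1, the label appvol, the cluster size of 32K, and the function name are placeholders chosen for illustration. The options used are -b (block size), -C (cluster size), -N (number of node slots), and -L (volume label); check the mkfs.ocfs2 man page for the values that suit your workload.

```shell
# Build (but do not run) a mkfs.ocfs2 command line.
# $1 = device, $2 = volume label, $3 = number of node slots (default 4)
# The function name and the default values are illustrative assumptions.
mkfs_ocfs2_command() {
    device=$1
    label=$2
    slots=${3:-4}
    echo "mkfs.ocfs2 -b 4K -C 32K -N $slots -L $label $device"
}

mkfs_ocfs2_command /dev/sdX1 appvol
```

Run the printed command on only one node, and only after confirming that the device name is correct.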
Open a terminal window and log in as the root user.
If the O2CB cluster service is offline, start it by entering the following command, then wait for the process to return a status of .
/etc/init.d/o2cb online ocfs2
Replace ocfs2 with the actual cluster name of your OCFS2 cluster.
The OCFS2 cluster must be online, because the format operation must ensure that the volume is not mounted on any node in the cluster.
Use one of the following methods to mount the volume.
In the ocfs2console, select a device in the Available Devices list and click . Optionally, specify the directory mount point and mount options and click .
Mount the volume from the command line, using the mount command.
Mount the volume from the /etc/fstab file on system boot.
Mounting an OCFS2 volume takes about 5 seconds, depending on how long it takes for the heartbeat thread to stabilize. On a successful mount, the device list in the ocfs2console shows the mount point along with the device.
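For the command-line method, the mount invocation could be assembled as in this sketch. The device /dev/sdX1, the mount point /mnt/ocfs2, and the function name are placeholders; the sketch prints the command so that you can inspect it before running it as root.

```shell
# Build (but do not run) a mount command for an OCFS2 volume.
# $1 = device, $2 = mount point, $3 = optional comma-separated mount options
# The function name is hypothetical.
mount_ocfs2_command() {
    device=$1
    mount_point=$2
    options=$3
    if [ -n "$options" ]; then
        echo "mount -t ocfs2 -o $options $device $mount_point"
    else
        echo "mount -t ocfs2 $device $mount_point"
    fi
}

mount_ocfs2_command /dev/sdX1 /mnt/ocfs2
```

The mount point directory must already exist on the node where the command is run.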
Adding New Nodes
When new nodes try to connect to the cluster, they are not allowed to join because the existing nodes have not added them to their connection list. To solve this issue, go to each node and issue the following command to update its connection list:
o2cb_ctl -H -n ocfs2 -t cluster -a online=yes
For information about mounting an OCFS2 volume using any of these methods, see the OCFS2 User Guide on the OCFS2 project at Oracle.
When running Oracle RAC, make sure to use the datavolume and nointr mounting options for OCFS2 volumes that contain the Voting diskfile (CRS), Cluster registry (OCR), Data files, Redo logs, Archive logs, and Control files. Do not use these options when mounting the Oracle Home volume.
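When the volumes are mounted from /etc/fstab, the difference between the two kinds of volume shows up in the options column. The following sketch prints candidate fstab lines. The devices, mount points, and function name are placeholders, and the _netdev option (which defers the mount until networking is available) is an assumption that is not stated in this section.

```shell
# Print one /etc/fstab line for an OCFS2 volume.
# $1 = device, $2 = mount point, $3 = optional extra mount options
# _netdev is an assumed baseline option; the function name is hypothetical.
ocfs2_fstab_line() {
    opts=_netdev
    if [ -n "$3" ]; then
        opts="$opts,$3"
    fi
    printf '%s\t%s\tocfs2\t%s\t0 0\n' "$1" "$2" "$opts"
}

# Oracle RAC database volume: needs datavolume and nointr.
ocfs2_fstab_line /dev/sdX1 /u02/oradata datavolume,nointr
# Oracle Home volume: must not use those options.
ocfs2_fstab_line /dev/sdX2 /u01/app
```

Review the printed lines before appending them to /etc/fstab on each node.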
For information about using OCFS2, see the OCFS2 User Guide on the OCFS2 project at Oracle.