Oracle Linux 6: Create an OCFS2 Cluster and Filesystem

Today we are going to go through the process of creating a clustered file system on a pair of Oracle Linux 6.3 nodes.  This exercise is not very resource intensive.  I am using two VMs each with 1GB of RAM a single CPU and a shared virtual disk file in addition to the OS drivers.

The Basic Concepts

Now why is a clustered file system important?  So basically if you have the need to have a shared volume between two hosts, you can provision the disk to both machines, and everything could appear to work, however in the event that writes ever happened to the same areas of the disk at the same time you will end up with data corruption.  Now the key here is that you need a way to track locks from multiple nodes.  This is called a Distributed Locking Manager or DLM.  Now to get this DLM functionality working then it will create a cluster.  Valid cluster nodes can then mount the disk and interact with it as a normal disk.  So as part of OCFS2 we have two file systems which are created /sys/kernel/config and /dlm the prior is used for the cluster configurations, and the latter is for the distributed lock manager


OCFS2 has been in the mainline Linux kernel for years, so it is widely available, though if you compile your own kernels then you will need to include support in your kernel.  Other than that all you need is the userland configuration tools to interact with it.

Install OCFS2 Tools

# yum install ocfs2-tools

Load and Online the O2CB Service

# service o2cb load
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading stack plugin "o2cb": OK
Loading filesystem "ocfs2_dlmfs": OK
Creating directory '/dlm': OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
# service o2cb online
Setting cluster stack "o2cb": OK
Checking O2CB cluster configuration : Failed

Notice that when we online o2cb, that it fails at checking the O2CB cluster configuration.  This is expected.  It is due to not having a cluster configuration to check at this point.

Create the OCFS2 Cluster Configuration

Now we need to create the /etc/ocfs2/cluster.conf.  This can be done with o2cb_ctl or manually.  Though it is considerably easier with o2cb_ctl.

# o2cb_ctl -C -n prdcluster -t cluster -a name=prdcluster

Here we are naming our cluster prdcluster.  The cluster itself doesn’t know anything about nodes until we add them in the next step.

Add Nodes to the OCFS2 Cluster Configuration

Create an entry for each node, using the below command.  We will need the IP of the nodes, the port, the cluster name we defined before and the host name of each node.

# o2cb_ctl -C -n ocfs01 -t node -a number=0 -a ip_address= -a ip_port=11111 -a cluster=prdcluster
# o2cb_ctl -C -n ocfs02 -t node -a number=1 -a ip_address= -a ip_port=11111 -a cluster=prdcluster

The IP Address and Port are used for the Cluster heartbeat.  The node name is used to verify a cluster member when attempting to join the cluster.  The node name needs to match the systems host name.

Review the OCFS2 Cluster Configuration

Now we can take a peek at the cluster.conf which our o2cb_ctl command created.

# cat /etc/ocfs2/cluster.conf
name = ocfs01
cluster = prdcluster
number = 0
ip_address =
ip_port = 11111

name = ocfs02
cluster = prdcluster
number = 1
ip_address =
ip_port = 11111

name = prdcluster
heartbeat_mode = local
node_count = 2

Configure the O2CB Service

In order to have the cluster start with the correct information we need to update the o2cb service and include the name of our cluster.

# service o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [n]: y
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]: prdcluster
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Setting cluster stack "o2cb": OK
Registering O2CB cluster "prdcluster": OK
Setting O2CB cluster timeouts : OK

Offline and Online the O2CB Service

To ensure that everything is working as we expect, I like to offline and online the service.

# service o2cb offline
Clean userdlm domains: OK
Stopping O2CB cluster prdcluster: Unregistering O2CB cluster "prdcluster": OK

We just want to watch that it is unregistering and registering the correct cluster, in this case the prdcluster.

# service o2cb online
Setting cluster stack "o2cb": OK
Registering O2CB cluster "prdcluster": OK
Setting O2CB cluster timeouts : OK

Repeat for All Nodes

All of the above actions need to be done on all nodes in the cluster, with no variations.  Once all nodes are Registering O2CB cluster “prdcluster”: OK then you can move on.

Format Our Shared Disk

This part is no different from any other format, keep in mind that once you have formatted the disk on one cluster node, it does not need to be done on the other node.

# mkfs.ocfs2 /dev/xvdb
mkfs.ocfs2 1.8.0
Cluster stack: classic o2cb
Features: sparse extended-slotmap backup-super unwritten inline-data strict-journal-super xattr indexed-dirs refcount discontig-bg
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 53687091200 (13107200 clusters) (13107200 blocks)
Cluster groups: 407 (tail covers 11264 clusters, rest cover 32256 clusters)
Extent allocator size: 8388608 (2 groups)
Journal size: 268435456
Node slots: 8
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 3 block(s)
Formatting Journals: done
Growing extent allocator: done
Formatting slot map: done
Formatting quota files: done
Writing lost+found: done
mkfs.ocfs2 successful

Mount Our OCFS2 Volume

You can either use a manual issuance of the mount command, or you can create an entry in the /etc/fstab

# mount -t ocfs2 /dev/xvdb /d01/share
# cat /etc/fstab

 # /etc/fstab
 # Created by anaconda on Wed Feb 27 13:44:01 2013
 # Accessible filesystems, by reference, are maintained under '/dev/disk'
 # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
 /dev/mapper/vg_system-lv_root /                       ext4    defaults        1 1
 UUID=4b397e61-7954-40e9-943f-8385e46d263d /boot                   ext4    defaults        1 2
 /dev/mapper/vg_system-lv_swap swap                    swap    defaults        0 0
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 /dev/xvdb        /d01/share        ocfs2    defaults    1 1

Then mount our entry from the /etc/fstab.

# mount /d01/share

Mounts will need to be configured on all cluster nodes.

Check Our Mounts

Once we have mounted our devices we need to ensure that they are showing up correctly.

# mount
 /dev/mapper/vg_system-lv_root on / type ext4 (rw)
 proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw)
 devpts on /dev/pts type devpts (rw,gid=5,mode=620)
 tmpfs on /dev/shm type tmpfs (rw)
 /dev/xvda1 on /boot type ext4 (rw)
 none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
 sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
 configfs on /sys/kernel/config type configfs (rw)
 ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
 /dev/xvdb on /d01/share type ocfs2 (rw,_netdev,heartbeat=local)

Notice that /d01/share is mounted as ocfs2, and that it is mounted with rw, _netdev, heartbeat=local.  These are the expected options (these are gathered from the previous configuration).

Check Service Status

Finally we can check the status on the o2cb service and we can see information about our cluster, heartbeat and the various other mounts that are needed to maintain the cluster (configfs, and ocfs2_dlmfs).

# service o2cb status
 Driver for "configfs": Loaded
 Filesystem "configfs": Mounted
 Stack glue driver: Loaded
 Stack plugin "o2cb": Loaded
 Driver for "ocfs2_dlmfs": Loaded
 Filesystem "ocfs2_dlmfs": Mounted
 Checking O2CB cluster "prdcluster": Online
 Heartbeat dead threshold: 31
 Network idle timeout: 30000
 Network keepalive delay: 2000
 Network reconnect delay: 2000
 Heartbeat mode: Local
 Checking O2CB heartbeat: Active
  1. Ambu
    March 12th, 2013 at 08:00
    Quote | #1

    Thanks a lot for sharing this.Excellent work.

    Ambu Ambooken

  2. Ron Levy
    June 10th, 2013 at 18:35
    Quote | #2

    This post will help people for years to come. :)

Comments are closed.