Shared SCSI Storage with VMWare Server 1.X and OCFS


Sometimes, when testing software or running a pilot with virtual machines, you need to emulate a SAN or set up some kind of shared storage between all the nodes. I remember that some (old?) versions of VMWare products allowed a SCSI controller to be shared between nodes, so that all the devices attached to that controller were accessible to every node: simple and neat. Perhaps that was an option only with VMWare Workstation, or ESX …


Free VMWare products (at least VMWare Server 1.0.X) can do the same, although it's a bit tricky. In this post I will explain how to set up this kind of shared storage between a group of VMs. I'm using a CentOS 5.2 VM because it's Linux and it's 99.9% compatible with its RHEL counterpart, the one you will find in your favourite production environment. So, for this test I used the CentOS appliance from bagside that I found at the VMWare Appliances MarketPlace.

  1. Preparing the CentOS 5.2 Virtual Machine

    Once you've downloaded and installed VMWare Server 1.0.X and the CentOS 5.2 VMWare image, import the image into the VMWare inventory. Then start the VM and make the following changes to prepare the test.

    I assume that you’re running the whole test using the following paths:

    d:\virtual\node1    -> node1 virtual machine
    d:\virtual\shared   -> only the shared disk
    d:\virtual\node2    -> empty for now; will contain the cluster's second node

    Change the default runlevel from 5 to 3 so that X Window doesn't start, unless you want it. Edit the /etc/inittab file and change the line

    id:5:initdefault:

    to

    id:3:initdefault:
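
    If you'd rather not open an editor, a one-line sed command of my own (not part of the original appliance setup) makes the same change and keeps a backup in /etc/inittab.bak:

    # switch the default runlevel from 5 to 3, keeping a backup copy
    sed -i.bak 's/^id:5:initdefault:/id:3:initdefault:/' /etc/inittab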

    Take a look at the VMWare network configuration. I'd rather use bridged networking instead of NAT, but either way note down the network you're going to use. In my case: 192.168.2.0/24.

    Assign a hostname: node1 is clear enough. Now edit the /etc/hosts file and add two entries like:

    127.0.0.1	node1	localhost ...
    192.168.2.201	node1
    192.168.2.202	node2

    And set it’s hostname by editing the /etc/sysconfig/network

    Set a static IP address of 192.168.2.201 on the virtual machine (don't use DHCP!), for instance with the "setup" utility.
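
    If you prefer editing files over the "setup" utility, the equivalent static configuration lives in /etc/sysconfig/network-scripts/ifcfg-eth0 (a sketch, assuming the interface is eth0 and a /24 netmask):

    DEVICE=eth0
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=192.168.2.201
    NETMASK=255.255.255.0

    Restart networking afterwards with "service network restart".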

    Reboot the VM and check if all the changes are OK.

  2. Setting up the Shared SCSI storage

    Now node1 is almost done. We are going to add the shared SCSI controller with one SCSI disk attached to it, so power off node1 and add a SCSI controller and a disk. Create the new SCSI disk with these options: SCSI, size 1.0 GB, allocate space now, attached to the SCSI 1:X bus, with the "independent" and "persistent" flags set. Don't forget to store the VMWare disk files (*.vmdk and so on) in "d:\virtual\shared".
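
    If you prefer the command line to the GUI wizard, the vmware-vdiskmanager tool that ships with VMWare Server can create the same preallocated disk; the flags below are my best guess at an equivalent invocation, so double-check them against your version:

    rem -t 2 = preallocated single-file disk ("allocate space now"); -a lsilogic matches the controller in the vmx below
    vmware-vdiskmanager -c -s 1GB -a lsilogic -t 2 "d:\virtual\shared\shared_disk.vmdk"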

    Once the "shared disk" is created and while node1 is still powered off, edit the node1 configuration file (*.vmx) and add the following lines:

    diskLib.dataCacheMaxSize = "0"
    diskLib.dataCacheMaxReadAheadSize = "0"
    diskLib.dataCacheMinReadAheadSize = "0"
    diskLib.dataCachePageSize = "4096"
    diskLib.maxUnsyncedWrites = "0"
    disk.locking = "false"
    scsi1.present = "TRUE"
    scsi1.sharedBus = "virtual"
    scsi1.virtualDev = "lsilogic"
    scsi1:0.present = "TRUE"
    scsi1:0.fileName = "d:\virtual\shared\shared_disk.vmdk"
    scsi1:0.redo = ""
    scsi1:0.mode = "independent-persistent"
    scsi1:0.deviceType = "disk"


    Note the independent and persistent flags, and note that you're sharing the whole SCSI 1:X bus, not only the 1:0 device.

  3. Installing and configuring OCFS2

    Now node1 has a shared SCSI bus configured, and you can see it by issuing "fdisk -l". Let's configure the OCFS2 part, so download the RPMs:

    • ocfs2-tools
    • ocfs2
    • ocfs2console (optional)

    In my setup, since I'm using CentOS 5, the exact versions are:

    ocfs2-tools-1.4.1-1.el5
    ocfs2-2.6.18-92.1.1.el5-1.4.1-1.el5
    ocfs2console-1.4.1-1.el5
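
    Installing them is a plain rpm transaction (the exact filenames, including the architecture suffix, may differ on your system):

    rpm -Uvh ocfs2-tools-1.4.1-1.el5.*.rpm \
            ocfs2-2.6.18-92.1.1.el5-1.4.1-1.el5.*.rpm \
            ocfs2console-1.4.1-1.el5.*.rpm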

    After installing the RPMs, you will have two new services under /etc/init.d: o2cb and ocfs2.

    If you execute /etc/init.d/o2cb configure you’ll get:

    Load O2CB driver on boot (y/n) [y]: y
    Cluster stack backing O2CB [o2cb]:
    Cluster to start on boot (Enter "none" to clear) [ocfs2]:
    Specify heartbeat dead threshold (>=7) [31]:
    Specify network idle timeout in ms (>=5000) [30000]:
    Specify network keepalive delay in ms (>=1000) [2000]:
    Specify network reconnect delay in ms (>=2000) [2000]:
    Writing O2CB configuration: OK
    Starting O2CB cluster ocfs2: Failed
    o2cb_ctl: Unable to load cluster configuration file "/etc/ocfs2/cluster.conf"
    Stopping O2CB cluster ocfs2: Failed
    o2cb_ctl: Unable to load cluster configuration file "/etc/ocfs2/cluster.conf"

    o2cb is complaining about the file "cluster.conf". Does it exist? Not yet.

    Create the file /etc/ocfs2/cluster.conf:

    cluster:
            node_count = 2
            name = ocfs2
    
    node:
            ip_port = 7777
            ip_address = 192.168.2.201
            number = 1
            name = node1
            cluster = ocfs2
    
    node:
            ip_port = 7777
            ip_address = 192.168.2.202
            number = 2
            name = node2
            cluster = ocfs2

    Note: this file has a very strict syntax! The whitespace before each attribute name must be a TAB.
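
    If you want to double-check the whitespace, "cat -A" prints TABs as ^I, so stray spaces are easy to spot:

    cat -A /etc/ocfs2/cluster.conf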

    Now, the same command /etc/init.d/o2cb configure outputs:

    [root@node1 init.d]# ./o2cb configure
    Load O2CB driver on boot (y/n) [y]:
    Cluster stack backing O2CB [o2cb]:
    Cluster to start on boot (Enter "none" to clear) [ocfs2]:
    Specify heartbeat dead threshold (>=7) [31]:
    Specify network idle timeout in ms (>=5000) [30000]:
    Specify network keepalive delay in ms (>=1000) [2000]:
    Specify network reconnect delay in ms (>=2000) [2000]:
    Writing O2CB configuration: OK
    Starting O2CB cluster ocfs2: OK
    
    [root@node1 init.d]# ./o2cb status
    Driver for "configfs": Loaded
    Filesystem "configfs": Mounted
    Driver for "ocfs2_dlmfs": Loaded
    Filesystem "ocfs2_dlmfs": Mounted
    Checking O2CB cluster ocfs2: Online
    Heartbeat dead threshold = 31
      Network idle timeout: 30000
      Network keepalive delay: 2000
      Network reconnect delay: 2000
    Checking O2CB heartbeat: Not active

    Nice. The cluster must be Online in order to mkfs the partition. If the new shared disk has no partition yet, create one spanning the whole disk with fdisk /dev/sda first, then format it:

    [root@node1 /]# mkfs.ocfs2 /dev/sda1
    mkfs.ocfs2 1.4.1
    Cluster stack: classic o2cb
    Overwriting existing ocfs2 partition.
    Proceed (y/N): y
    Filesystem label=
    Block size=2048 (bits=11)
    Cluster size=4096 (bits=12)
    Volume size=1069252608 (261048 clusters) (522096 blocks)
    17 cluster groups (tail covers 7096 clusters, rest cover 15872 clusters)
    Journal size=33554432
    Initial number of node slots: 2
    Creating bitmaps: done
    Initializing superblock: done
    Writing system files: done
    Writing superblock: done
    Writing backup superblock: 0 block(s)
    Formatting Journals: done
    Formatting slot map: done
    Writing lost+found: done
    mkfs.ocfs2 successful

    And:

    [root@node1 /]# mount -t ocfs2 -o datavolume /dev/sda1 /mnt/shared
    [root@node1 /]# df
    Filesystem           1K-blocks      Used Available Use% Mounted on
    /dev/mapper/VolGroup00-LogVol00
                          12156848   2987316   8542028  26% /
    /dev/hda1               101086     17694     78173  19% /boot
    tmpfs                   257744         0    257744   0% /dev/shm
    /dev/sda1              1044192     70652    973540   7% /mnt/shared
    [root@node1 /]#

    Mounted. Now, to get the partition mounted automatically after a reboot, add this line to /etc/fstab:

    /dev/sda1     /mnt/shared    ocfs2   _netdev,datavolume     0 0
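
    The _netdev option makes the mount wait for the network, but the o2cb and ocfs2 init scripts also have to run at boot. The RPMs normally register them already; these chkconfig commands just make sure of it:

    chkconfig o2cb on
    chkconfig ocfs2 on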

    Reboot node1 and check that all the changes persisted.

  4. Cloning the first node

    At this point we have a node (node1) with a shared SCSI controller. Now let's clone this node as "node2" into the folder d:\virtual\node2. As long as you copy the files contained in d:\virtual\node1, node2 will be an exact copy, including the configuration of the shared SCSI controller and the disk stored at d:\virtual\shared.

    Now make sure node1 is powered off and power on node2. Then fix its IP address (192.168.2.202), hostname, /etc/hosts file … and so on. Reboot node2, and power on node1. Both nodes should see a device at /dev/sda; check it with the "fdisk -l" command.
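
    As a final sanity check (assuming /mnt/shared is mounted on both nodes; the filename is just an example), a file created on one node should show up immediately on the other:

    [root@node1 ~]# touch /mnt/shared/hello_from_node1
    [root@node2 ~]# ls /mnt/shared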

Some references:
http://www.databasejournal.com/features/oracle/article.php/3686461/Oracle-RAC-How-shared-storage-works-on-VMware–Part-1.htm
http://www.bagvapp.com
http://xenamo.sourceforge.net/ar01s03.html
http://oss.oracle.com/projects/ocfs2-tools/
http://oss.oracle.com/pipermail/ocfs2-users/2005-October/000245.html
http://www.rampant-books.com/art_hunter_rac_oracle_o2cb_cluster_service.htm
