[Beowulf] Which do you prefer local disk installed OS or NFS rooted?
Tony Travis
ajt at rri.sari.ac.uk
Fri Jul 16 03:10:47 PDT 2004
Brent M. Clements wrote:
> Good Afternoon All:
>
> Let me start by giving a little background.
>
> Currently all of our clusters on campus have local disks, onto which an
> OS image is installed using the SystemImager suite of tools.
> There are one or two clusters that are NFS-rooted.
>
> I'd like to know from all of you which way everyone is leaning when it
> comes to clusters and OS distribution.
Hello, Brent.
We've got a 32-node Athlon XP 2400+ cluster running "ClusterNFS" under
openMosix 2.4.22-openmosix-2:
http://clusternfs.sourceforge.net/
http://openmosix.sourceforge.net/
The root partition of the 'head' node is exported read-only to the
'diskless' compute nodes, which have symbolic links to per-node volatile
files under:
/export/root/<IP ADDRESS>
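For anyone setting up something similar, the layout is roughly as below.
This is only a sketch: the exports lines, options and file names are
illustrative, not copied from our configuration:

    # /etc/exports on the 'head' node (options are assumptions)
    /             192.168.0.0/255.255.255.0(ro,no_root_squash)
    /export/root  192.168.0.0/255.255.255.0(rw,no_root_squash)

    # per-node writable area, e.g. for node2 (192.168.0.2):
    /export/root/192.168.0.2/etc/mtab
    /export/root/192.168.0.2/var/run/
    # volatile files on the shared root are symbolic links into this area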
All the 'diskless' nodes have a 40 GB local disk for:
/dev/hda1 /var/tmp # and /tmp -> /var/tmp
/dev/hda2 swap
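In /etc/fstab terms on each compute node that looks roughly like this
(device names and the filesystem type are assumed, not copied from ours):

    # local disk on each 'diskless' (strictly 'dataless') node
    /dev/hda1   /var/tmp   ext3   defaults   0 2
    /dev/hda2   swap       swap   defaults   0 0

    # /tmp is a symbolic link onto the local disk:
    #   ln -s /var/tmp /tmp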
The system works well except for one problem: the 2 GB limit on file size.
This is an inevitable consequence of using ClusterNFS, and I've not been
able to do anything about it yet. Our system is based on 'BOBCAT':
http://www.epcc.ed.ac.uk/bobcat/
Our 'BOBCAT' cluster architecture consists of a 'head' node with three
NICs running a ROOTNFS fileserver, and 'diskless' nodes (strictly
speaking 'dataless' nodes) with two NICs on different ethernets, one
for PXE/DHCP/NFS and the other for openMosix IPC (a sketch of the DHCP
side follows the table):

    PXE/DHCP/NFS           IPC                    LAN
    192.168.0.0            192.168.1.0            143.234.32.0

    192.168.0.1   node1    192.168.1.1   mpe1     143.234.32.11  bobcat
    192.168.0.2   node2    192.168.1.2   mpe2     143.234.32.12  topcat
    192.168.0.3   node3    192.168.1.3   mpe3
    ...
    192.168.0.32  node32   192.168.1.32  mpe32
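For illustration, the DHCP/PXE side on the head node would look something
like the fragment below. This is a sketch of a typical ISC dhcpd setup,
not our actual file; the MAC address, boot filename and root path are
assumptions:

    # /etc/dhcpd.conf fragment on the PXE/DHCP/NFS server (node1)
    subnet 192.168.0.0 netmask 255.255.255.0 {
        next-server 192.168.0.1;           # TFTP server for PXE boot
        filename "pxelinux.0";             # PXE boot loader (assumed)
        option root-path "192.168.0.1:/";  # NFS root exported by the head node

        host node2 {
            hardware ethernet 00:11:22:33:44:55;  # MAC is illustrative
            fixed-address 192.168.0.2;
        }
    }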
On our system, node1 (bobcat) is the PXE/DHCP/NFS server and node2
(topcat) is used for interactive logins. I've done quite a lot of
network monitoring using tools like "iptraf", "ibmonitor" and "iftop".
The 'high' NFS traffic often attributed to clusters using 'diskless'
compute nodes is something of a myth: it depends on what the nodes are
doing. If you're running computationally intensive jobs, the NFS traffic
is minimal once the programs are in the filesystem cache on the compute
node. There is, of course, high NFS traffic when booting the 'diskless'
nodes, but we manage to boot 30 nodes in about four minutes without
problems, using an unmanaged 3Com 100Base-T 'private' ethernet switch
(i.e. not on the LAN).
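If you want to check this on your own cluster, something along these lines
will show the NFS traffic (the interface name is an assumption; use
whichever NIC carries NFS on your nodes):

    # watch live NFS traffic on the PXE/DHCP/NFS interface
    iftop -i eth0 -f "port 2049"

    # per-operation NFS statistics on the server
    nfsstat -s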
One advantage of the 'BOBCAT' architecture is the segregation of network
traffic: you can still control the compute nodes no matter how much IPC
traffic is going on between them, because the IPC traffic is on a
separate ethernet. Our system uses 192.168.0.0 for openMosix IPC, and
192.168.1.0 for ssh initiation of MPI processes and sockets.
I've written scripts to add and remove 'diskless' nodes from the cluster:
mknode # add a node to the cluster
rmnode # remove a node from the cluster
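The attached scripts are specific to our setup, but to give a feel for
what 'adding a node' involves here, the general shape of mknode is
roughly as follows. This is only a sketch, not the attached script, and
every path and command in it is illustrative:

    #!/bin/ksh
    # mknode <n> -- sketch of the sort of steps involved (illustrative)
    n=$1
    ip=192.168.0.$n

    # create the per-node volatile area under the exported root
    mkdir -p /export/root/$ip/etc /export/root/$ip/var

    # make the new node resolvable (file names are assumptions)
    echo "$ip node$n" >> /etc/hosts

    # add a DHCP host declaration for the node's MAC address, then:
    /etc/init.d/dhcpd restart

rmnode does the reverse: it removes the per-node area and the host and
DHCP entries.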
Best wishes,
Tony.
--
Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk
Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mknode
URL: <http://www.beowulf.org/pipermail/beowulf/attachments/20040716/51f55408/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rmnode
URL: <http://www.beowulf.org/pipermail/beowulf/attachments/20040716/51f55408/attachment-0001.ksh>