[Beowulf] stateless compute nodes

Wed May 27 18:56:12 PDT 2015

On 05/27/2015 09:22 PM, Trevor Gale wrote:
> Hello all,
>
> I was wondering how stateless nodes fare with very memory-intensive applications. Does it simply require you to have a large amount of RAM to house your file system and program data? Or are there other limitations?

Warewulf has been out the longest of the stateless distributions. We had 
rolled our own a while before using it, and kept adding capability to ours.

It's generally not hard to pare down a stateless node to a few hundred MB 
(or less!).  Handle applications via NFS, and strip your stateless 
system down to the bare minimum you need.  In fairly short order, you 
should be able to PXE boot a kernel with a bare minimal initramfs, and 
have it launch Docker and Docker-like containers. This is the concept 
behind CoreOS, and many distributions are looking to move to this model.

We use a makefile to drive creation of our stateless systems (everything 
including the kitchen sink, and our entire stack), which hover around 
4GB total.  Our original stateless systems were around 400MB or so, but 
I wanted a full development, IB, PFS, and MPI environment (not to 
mention other things).  I could easily make some of this stateful, but 
our application requires resiliency that can't exist in a stateful model 
(what if the OS drives or the entire controller suddenly went away, or 
the boot/management network was partitioned with an OS on NFS?).
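
To put numbers on the original question, here is a rough back-of-the-envelope sketch in Python. It assumes the worst case where the whole root image stays resident in RAM and applications arrive over NFS or a parallel file system. The 64 GB node size, the 1 GB kernel reserve, and the function name are illustrative assumptions, not measurements from the systems described here.

```python
def ram_left_for_apps(total_ram_gb, root_image_gb, kernel_reserve_gb=1.0):
    """Return RAM (in GB) left for applications on a RAM-rooted node.

    Assumes the entire root image is held in RAM and that application
    binaries and data are served from NFS or a parallel file system.
    kernel_reserve_gb is an assumed allowance for kernel and daemons.
    """
    left = total_ram_gb - root_image_gb - kernel_reserve_gb
    if left < 0:
        raise ValueError("root image and reserve exceed installed RAM")
    return left


# Hypothetical 64 GB node, comparing the two image sizes mentioned above.
for image_gb in (0.4, 4.0):
    free_gb = ram_left_for_apps(64, image_gb)
    print(f"{image_gb} GB image: {free_gb:.1f} GB left for applications")
```

Under these assumptions the RAM root costs a fixed few GB at most, so the limit for memory-intensive jobs is still the installed RAM rather than the stateless model itself.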

This is one of our Unison units right now,

root@usn-01:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs          8.0G  3.9G  4.2G  49% /
udev             10M     0   10M   0% /dev
...
tmpfs           1.0M     0  1.0M   0% /data
/dev/sda        8.8T  113G  8.7T   2% /data/1
/dev/sdb        8.8T  201G  8.6T   3% /data/2
/dev/sdc        8.8T   63G  8.7T   1% /data/3
/dev/sdd        8.8T  138G  8.6T   2% /data/4
fhgfs_nodev      70T  1.1T   69T   2% /mnt/unison2

with the "local" mounts being controlled by a distributed database. 
Think of it as a distributed, cluster-wide /etc/fstab. More relevant for 
a storage cluster/cloud than a compute cluster, but easily usable in 
this regard.
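
The "cluster-wide /etc/fstab" idea can be illustrated with a minimal Python sketch. The dictionary below is a hypothetical stand-in for the distributed database, and the file system type and mount options are assumptions made for the example (the df output above does not show them). It is not the actual schema or tooling.

```python
# Hypothetical stand-in for the distributed database: hostname -> mounts.
# The fstype and options values are assumed for illustration only.
CLUSTER_MOUNTS = {
    "usn-01": [
        {"device": "/dev/sda", "mountpoint": "/data/1",
         "fstype": "xfs", "options": "defaults,noatime"},
        {"device": "/dev/sdb", "mountpoint": "/data/2",
         "fstype": "xfs", "options": "defaults,noatime"},
    ],
}


def fstab_lines(hostname, table=CLUSTER_MOUNTS):
    """Render fstab-style lines for one node from the shared table.

    A node with no entry gets an empty list, so an unknown host simply
    mounts nothing rather than failing.
    """
    return [
        f"{m['device']} {m['mountpoint']} {m['fstype']} {m['options']} 0 0"
        for m in table.get(hostname, [])
    ]


for line in fstab_lines("usn-01"):
    print(line)
```

Each node would look up its own hostname after boot and mount what it finds, so the image itself stays identical across the cluster.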

We handle all the rest of the configuration post-boot.   A little 
infrastructure work (bringing up interfaces), and then configuration 
work (driven by scripts and data pulled from a central repository, which 
is also distributable).

There are some oddities, not the least of which is that most 
distributions are decidedly not built for this.  But if you get them to 
a point where they think they have a /dev/root and they mount it, life 
generally gets much easier rather quickly.

One of the other cool aspects of our mechanism is that we can pivot to a 
hybrid or NFS root after fully booting.  And if the NFS pivot fails, we 
can fall back to our ramboot without a reboot.  It's a thing of beauty ... 
truly ...
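
The fallback logic can be sketched in a few lines of Python. This only shows the decision flow under stated assumptions: try_nfs_pivot is a hypothetical callable standing in for whatever actually mounts and switches to the NFS root, and a failure is assumed to surface as an OSError or a false return value. It is not the real mechanism.

```python
def choose_root(try_nfs_pivot, current_root="ram"):
    """Attempt an NFS pivot and keep the RAM root if it fails.

    try_nfs_pivot is a hypothetical callable that returns True once the
    NFS root is in place, and returns False or raises OSError otherwise.
    """
    try:
        if try_nfs_pivot():
            return "nfs"
    except OSError:
        # The node is already fully booted from RAM, so nothing is lost.
        pass
    return current_root


def unreachable_server():
    # Simulated failure, e.g. a partitioned boot/management network.
    raise OSError("NFS server unreachable")


print(choose_root(lambda: True))        # pivot succeeds
print(choose_root(unreachable_server))  # pivot fails, stay on the RAM root
```

The point of the design is that the RAM root is never given up until the alternative is known to work, which is why no reboot is needed on failure.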

FWIW: we use a Debian base (and Ubuntu on occasion) these days, though 
we've used CentOS and RHEL in the past before they became harder to 
distribute.  Generally speaking we can boot anything (and I really mean 
*anything*: Any Linux, *BSD, Solaris, DOS, Windows, ... ) and control 
them in a similar manner (well, not DOS and Windows ... they are ... 
different ... but it is doable).

Warewulf has similar capabilities and is designed to be a 
cluster-specific tool.  I think there are a few others (oneSIS, etc.) 
that come to mind that can do roughly similar things.  Maybe even 
xCAT 2 ... not sure, haven't looked at it in years.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615