[Beowulf] stateless compute nodes

Joe Landman landman at scalableinformatics.com
Wed May 27 19:32:45 PDT 2015


Not more complicated at all.  Read Doug Eadline's response.


On May 27, 2015 10:30:17 PM Trevor Gale <trevor at snowhaven.com> wrote:

> I need to configure IB, Slurm, MPI, and NFS, and am most likely running 
> CentOS. Would you say that using Warewulf makes configuration of these apps 
> significantly more complicated?
>
> Thanks,
> Trevor
>
> > On May 27, 2015, at 9:56 PM, Joe Landman 
> <landman at scalableinformatics.com> wrote:
> >
> >
> >
> > On 05/27/2015 09:22 PM, Trevor Gale wrote:
> >> Hello all,
> >>
> >> I was wondering how stateless nodes fare with very memory-intensive 
> applications. Does it simply require you to have a large amount of RAM to 
> house your file system and program data? Or are there other limitations?
> >
> > Warewulf has been out the longest of the stateless distributions. We had 
> rolled our own a while before using it, and kept adding capability to ours.
> >
> > It's generally not hard to pare down a stateless node to a few hundred MB 
> (or less!).  Applications are handled via NFS, and you strip your stateless 
> system down to the bare minimum you need.  In fairly short order, you should 
> be able to PXE boot a kernel with a bare minimal initramfs, and have it 
> launch Docker and Docker-like containers. This is the concept behind CoreOS, 
> and many distributions are looking to move to this model.
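> >
> > As a rough sketch of the PXE side (the file names, TFTP paths, and image 
> > name here are placeholders, not our actual layout), the boot server only 
> > needs a kernel, a small initramfs, and a pxelinux entry pointing at them:
> >
> > # on the provisioning server: stage the kernel/initramfs and write a PXE entry
> > cp vmlinuz initramfs-minimal.img /srv/tftp/
> > cat > /srv/tftp/pxelinux.cfg/default <<'EOF'
> > DEFAULT stateless
> > LABEL stateless
> >   KERNEL vmlinuz
> >   APPEND initrd=initramfs-minimal.img ip=dhcp
> > EOF
> > # the initramfs then fetches/unpacks the node image into RAM and boots it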
> >
> > We use a makefile to drive creation of our stateless systems (everything 
> including the kitchen sink, and our entire stack), which hovers around 4GB 
> total.  Our original stateless systems were around 400MB or so, but I 
> wanted a full development, IB, PFS, and MPI environment (not to mention 
> other things).  I could easily make some of this stateful, but our 
> application requires resiliency that can't exist in a stateful model (what 
> if the OS drives or the entire controller suddenly went away, or the 
> boot/management network was partitioned while the OS lived on NFS?).
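> >
> > The makefile itself is specific to our stack, but the shape of what a 
> > build target runs is roughly this (a hedged sketch; the package list, 
> > suite, and output path are hypothetical):
> >
> > # build a root tree, trim it, then pack it as a compressed initramfs image
> > debootstrap --variant=minbase jessie /build/rootfs
> > chroot /build/rootfs apt-get install -y openssh-server nfs-common \
> >     infiniband-diags openmpi-bin
> > rm -rf /build/rootfs/usr/share/doc /build/rootfs/usr/share/man
> > ( cd /build/rootfs && find . | cpio -o -H newc ) | gzip -9 \
> >     > /srv/tftp/node-image.img.gz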
> >
> > This is one of our Unison units right now
> >
> > root at usn-01:~# df -h
> > Filesystem      Size  Used Avail Use% Mounted on
> > rootfs          8.0G  3.9G  4.2G  49% /
> > udev             10M     0   10M   0% /dev
> > ...
> > tmpfs           1.0M     0  1.0M   0% /data
> > /dev/sda        8.8T  113G  8.7T   2% /data/1
> > /dev/sdb        8.8T  201G  8.6T   3% /data/2
> > /dev/sdc        8.8T   63G  8.7T   1% /data/3
> > /dev/sdd        8.8T  138G  8.6T   2% /data/4
> > fhgfs_nodev      70T  1.1T   69T   2% /mnt/unison2
> >
> > with the "local" mounts being controlled by a distributed database.  
> Think of it as a distributed, cluster-wide /etc/fstab. It's more relevant 
> for a storage cluster/cloud than a compute cluster, but easily usable in 
> this regard.
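> >
> > Our database for this is our own, but the idea is easy to sketch: at boot 
> > each node asks the (replicated) database for its mount records and applies 
> > them.  Hypothetical helper and record format below:
> >
> > # get_mounts_for is a stand-in for whatever query tool the database exposes;
> > # it emits "device mountpoint fstype options" lines for this node
> > get_mounts_for "$(hostname -s)" | while read dev mnt fstype opts; do
> >     mkdir -p "$mnt"
> >     mount -t "$fstype" -o "$opts" "$dev" "$mnt"
> > done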
> >
> > We handle all the rest of the configuration post-boot: a little 
> infrastructure work (bringing up interfaces), and then configuration work 
> (driven by scripts and data pulled from a central repository, which is also 
> distributable).
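> >
> > In shell terms that post-boot pass looks something like this (the URL and 
> > script name are placeholders for our actual repository layout):
> >
> > # bring up the management interface, then pull and run this node's config
> > ip link set eth0 up && dhclient eth0
> > curl -s -o /tmp/nodeconf.sh http://config-master/nodes/$(hostname -s).sh
> > sh /tmp/nodeconf.sh    # configures IB, mounts, services, etc.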
> >
> > There are some oddities, not the least of which is that most distributions 
> are decidedly not built for this.  But if you get them to a point where they 
> think they have a /dev/root and they mount it, life generally gets much 
> easier rather quickly.
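> >
> > One way to fake that out from the initramfs (a sketch only; device names, 
> > sizes, and the image name are arbitrary) is to back the root with a real 
> > block device and point /dev/root at it before handing off:
> >
> > # inside the initramfs init: put the root tree on a ram block device
> > mke2fs -q /dev/ram0
> > mkdir -p /newroot
> > mount /dev/ram0 /newroot
> > zcat /node-image.img.gz | ( cd /newroot && cpio -id )
> > ln -sf /dev/ram0 /newroot/dev/root   # distro scripts now find a "real" root device
> > exec switch_root /newroot /sbin/init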
> >
> > One of the other cool aspects of our mechanism is that we can pivot to a 
> hybrid or NFS root after fully booting.  And if the NFS pivot fails, we can 
> fall back to our ramboot without a reboot.  It's a thing of beauty ... truly ...
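> >
> > Our pivot logic is part of our own tooling, but a much simpler way to get 
> > a similar hybrid-with-fallback behavior (hypothetical server and export 
> > names) is to attempt the NFS mount and only overlay it if it shows up:
> >
> > # try to bring the NFS copy in over the RAM copy; stay on RAM if it fails
> > mkdir -p /mnt/nfsroot
> > if mount -t nfs -o ro,nolock nfs-master:/export/node-root /mnt/nfsroot; then
> >     mount --bind /mnt/nfsroot/usr /usr    # swap selected trees to the NFS copy
> > else
> >     echo "NFS unavailable, staying on ramboot" | logger -t ramboot
> > fi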
> >
> > FWIW: we use a Debian base (and Ubuntu on occasion) these days, though 
> we've used CentOS and RHEL in the past, before it became harder to 
> distribute.  Generally speaking we can boot anything (and I really mean 
> *anything*: any Linux, *BSD, Solaris, DOS, Windows, ...) and control them 
> in a similar manner (well, not DOS and Windows ... they are ... different 
> ... but it is doable).
> >
> > Warewulf has similar capabilities and is designed to be a cluster-specific 
> tool.  A few others (OneSIS, etc.) come to mind that can do roughly similar 
> things.  Maybe even xCAT 2 ... not sure, I haven't looked at it in years.
> >
> >
> > --
> > Joseph Landman, Ph.D
> > Founder and CEO
> > Scalable Informatics, Inc.
> > e: landman at scalableinformatics.com
> > w: http://scalableinformatics.com
> > t: @scalableinfo
> > p: +1 734 786 8423 x121
> > c: +1 734 612 4615
> >
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf



