[Beowulf] Compute Node OS on Local Disk vs. Ram Disk
Bogdan Costescu
Bogdan.Costescu at iwr.uni-heidelberg.de
Wed Oct 1 05:52:29 PDT 2008
On Tue, 30 Sep 2008, Donald Becker wrote:
> Ahhh, your first flawed assumption.
>
> You believe that the OS needs to be statically provisioned to the nodes.
> That is incorrect.
Well, you also make the flawed assumption that the best technical
solutions are always preferred. From my position I have seen many
cases where political or administrative reasons have very much
restricted the choice of technical solutions that could be used. Other
reasons are related to the lack of flexibility from ISVs which provide
applications in binary form only and make certain assumptions about
the way the target cluster works. Yet another reason is the fact that
a solution like Scyld's limits the whole cluster to running one
distribution (please correct me if I'm wrong), while a solution with
node "images" allows mixing Linux distributions at will.
> The only times that it is asked to do something new (boot, accept a
> new process) it's communicating with a fully installed, up-to-date
> master node. It has, at least temporarily, complete access to a
> reference install.
I think that this is another assumption that holds true for the Scyld
system, but there are situations where it does not. Some years ago I
developed a rudimentary batch system in which the master node only
contacted the first node allocated/desired for the job; this node was
then responsible for contacting the other nodes allocated/desired and
starting the rest of the job. This was very much modelled after the
way the naive rsh/ssh based launchers for MPI jobs work: once mpirun
is running, there is no connection to the master node, only between
the node where mpirun is running and the rest of the nodes specified
in the hosts file. I think that Torque also has a similar design
(Mother Superior being in control of the job), but I haven't looked
closely at the details so I might be wrong.
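To make the launch pattern concrete, here is a minimal sketch of the
two-stage fan-out described above. It only builds the command lines
and does not execute anything. The helper name "node_launcher" and its
"--peers" option are hypothetical, invented for illustration; they are
not the interface of my old batch system, of mpirun, or of Torque.

```python
import shlex


def master_launch_command(hosts, job_cmd, launcher="node_launcher"):
    """Build the single command the master runs.

    The master contacts only the first allocated node and hands it the
    list of remaining nodes. `launcher` is a hypothetical helper program
    assumed to exist on the compute nodes.
    """
    head, rest = hosts[0], hosts[1:]
    remote = [launcher, "--peers", ",".join(rest), "--"] + list(job_cmd)
    return ["ssh", head, shlex.join(remote)]


def head_node_commands(peers, job_cmd):
    """Build the commands the first node runs to start the other ranks.

    From this point on the master is not involved; all connections are
    between the first node and its peers.
    """
    return [["ssh", peer, shlex.join(job_cmd)] for peer in peers]


if __name__ == "__main__":
    hosts = ["n01", "n02", "n03"]
    print(master_launch_command(hosts, ["./a.out", "-x"]))
    for cmd in head_node_commands(hosts[1:], ["./a.out", "-x"]):
        print(cmd)
```

The point of the sketch is that the master issues exactly one remote
command; whatever the compute nodes need in order to start the rest of
the job has to be available on the nodes themselves.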
> If you design a cluster system that installs on a local disk, it's
> very difficult to adapt it to diskless blades. If you design a
> system that is as efficient without disks, it's trivial to
> optionally mount disks for caching, temporary files or application
> I/O.
If you design a system that is flexible enough to allow you to use
either diskless or diskful installs, what do you have to lose?
The same node "image" can be used in several ways:
- copied to the local disk and booted from there (where the copying
could be done as a separate operation followed by a reboot or it can
be done from initrd)
- used over NFS-root
- used as a ramdisk, provided that the node "image" is small enough
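As a rough illustration of the three options above, the following
sketch picks the ways in which a given node could use one image. The
function name, the mode labels and especially the 25% RAM threshold
for "small enough" are assumptions made up for this example, not
values taken from any existing provisioning tool.

```python
def viable_boot_modes(image_mb, ram_mb, has_local_disk, disk_mb=0,
                      max_ram_fraction=0.25):
    """Return the ways a node could use one and the same image.

    max_ram_fraction is an arbitrary illustrative threshold for what
    counts as "small enough" to live in a ramdisk.
    """
    # NFS-root needs neither a local disk nor spare RAM for the image.
    modes = ["nfsroot"]
    # Copying to the local disk requires a disk that can hold the image.
    if has_local_disk and disk_mb >= image_mb:
        modes.append("local-disk")
    # A ramdisk is only sensible if the image leaves RAM for the jobs.
    if image_mb <= ram_mb * max_ram_fraction:
        modes.append("ramdisk")
    return modes


if __name__ == "__main__":
    # Diskless blade with 2 GB RAM and a 200 MB image.
    print(viable_boot_modes(200, 2048, has_local_disk=False))
    # Node with an 80 GB disk and a 4 GB image.
    print(viable_boot_modes(4000, 2048, has_local_disk=True, disk_mb=80000))
```

The same image is the input in every case; only the per-node decision
changes, which is what keeps the approach flexible.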
Note: I have used "image" in this and previous e-mails to signify the
collection of files that the node needs for booting; most likely this
is not a FS image (like an ISO one), but it could also be one. Various
documents call this a "virtual node FS", "chroot-ed FS", etc.
--
Bogdan Costescu
IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
E-mail: bogdan.costescu at iwr.uni-heidelberg.de