[Beowulf] "Part-time" beowulf clusters

Robert G. Brown rgb at phy.duke.edu
Thu Sep 22 11:02:09 PDT 2005

Paulo Ferreira de Sousa writes:

> Hello all,
> I would like to know information about "part-time" clusters (I don't
> know if there's a specific name for it).
> The concept I'm thinking, and that I'm sure is implemented already
> somewhere, is to use PC's that only are used during daytime, to run
> code during night time. A good example is to use the PC's from a
> computer lab at any school, connect them to a switch, and use them as
> a cluster when they're idle.
> Thank you very much in advance.

There isn't really a lot of "information" about it.  It's an idea that
has been kicked around on the list a bunch of times, and actually
implemented a few times.  The Win-by-day, Lin-by-night idea has a
certain appeal for the cycle-starved, for sure.

Let's see if I can recall the issues.  Rebooting from lin to win is easy
(change grub, reboot); rebooting automagically from win to lin used to
be hard.  Now I suspect that if you use PXE (only) to boot the nodes,
going either way would be pretty easy, if you can figure out how to
persuade the Win boxes to reboot themselves after (say) 6 pm, once the
PXE server has altered their default boot to lin.  Getting the lin
nodes to reboot at 6 am is trivial (a cron job will do it).
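
To make the PXE side concrete, here is a rough sketch of a script that
rewrites the default boot entry by time of day.  It assumes PXELINUX on
the tftp server; the label names, the kernel and initrd file names, and
the 6 pm/6 am window are placeholders for illustration, not anything
from a real installation.

```python
from pathlib import Path

# Hypothetical schedule: lin from 18:00 until 06:00, win the rest of the day.
LINUX_START_HOUR = 18
LINUX_END_HOUR = 6


def choose_label(hour: int) -> str:
    """Return the PXELINUX label that should be the default at this hour (0-23)."""
    if hour >= LINUX_START_HOUR or hour < LINUX_END_HOUR:
        return "lin"
    return "win"


def render_config(label: str) -> str:
    """Build the text of a pxelinux.cfg/default file with `label` as the default."""
    lines = [
        f"DEFAULT {label}",
        "PROMPT 0",
        "",
        "LABEL win",
        "    LOCALBOOT 0",  # hand off to the local disk, i.e. the installed Windows
        "",
        "LABEL lin",
        "    KERNEL vmlinuz",  # placeholder kernel name
        "    APPEND initrd=initrd.img",  # placeholder initrd name
        "",
    ]
    return "\n".join(lines)


def write_config(path: Path, hour: int) -> str:
    """Write the config for the given hour to `path` and return the chosen label."""
    label = choose_label(hour)
    path.write_text(render_config(label))
    return label
```

Run from cron on the tftp server a little before each switchover, this
would be called as something like
write_config(Path("/tftpboot/pxelinux.cfg/default"), datetime.now().hour),
where the path is again an assumption about your layout.  The nodes
only see the change the next time they PXE boot, so you still need
something to reboot them.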

These days a really interesting possibility is to use e.g. warewulf,
which lets you boot the nodes into lin diskless.  So you can ALWAYS put
the nodes back to win with a reboot.  The WinXX folks tend to have to
work to install Windows systems (except where they've worked very hard
to master installing from a standard image with network shares) and get
irritated if you breathe on their systems once they're semistable.  PXE
booting a diskless OS leaves their hard-won configurations alone.

It really helps for the nodes to have WOL (to power them up remotely)
or some other way of remotely triggering a reboot, BTW.  That way you
don't have to figure out how to make WinXX reboot itself on a cron-like
schedule, and a cluster/LAN administrator can take the cluster either
way via the PXE/tftpboot server and the WOL control.
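
For the WOL half, a minimal sketch of sending the wake-up "magic
packet" (six 0xFF bytes followed by the target MAC address repeated 16
times) as a UDP broadcast.  The broadcast address and port 9 are common
defaults that I am assuming here, and the function names are made up
for the example.  Note that WOL only powers up a node that is off or
asleep; a node that is already running still needs to be told to
reboot some other way.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC like 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Synchronization stream of six 0xFF bytes, then the MAC sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet for `mac` on the local network (assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Looping wake() over the lab's MAC addresses at 6 pm, after the PXE
default has been flipped to lin, would bring up any nodes that were
shut down at the end of the day straight into the cluster.  WOL
generally has to be enabled in each node's BIOS and NIC settings first.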

Did I forget anything?


> Paulo
