[Beowulf] diskless cluster questions

Wed Jul 7 09:09:59 PDT 2010

With diskless clusters you should also be aware that there are several ways to do it:

- RAM root - where all of the OS is loaded in memory
- NFS root - which is what a lot of people seem to call diskless
- RAM root/NFS root hybrid - where some directories, like /root, live in RAM
and others, like /usr, live on NFS.

We really like RAM root for HPC because you can make a small image
(150-300MB with InfiniBand) that is portable, performs well, and is
easy to reproduce and update.  The disadvantage shows up when you run many
different applications: a library that one of them needs may not be in the
image.  NFS root works better for environments like that.  However, many
large clusters I have worked with are dedicated to a single application,
and RAM root fits the bill perfectly.

Like Ashley said, all the host name info is configured via DHCP.  Many people
also put arguments in the PXE boot file to pass additional parameters.  I
think the old Red Hat stateless did NFSROOT=, for example.  I have also seen
many homegrown setups that throw everything but the kitchen sink in as
arguments.
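
As a rough sketch of how a boot script on the node might pick such an argument back out, the function below scans a kernel command line for a KEY=VALUE pair. The function name cmdline_value and the sample NFSROOT value are hypothetical; only the NFSROOT= key comes from the text above, and real stateless tools may parse their arguments differently.

```shell
# Hypothetical helper: print the value of the first KEY=VALUE argument
# found on a kernel command line. With no second argument it reads the
# running kernel's command line from /proc/cmdline.
cmdline_value() {
    _key="$1"
    _cmdline="${2:-$(cat /proc/cmdline)}"
    # Unquoted on purpose: split the command line on whitespace.
    for _arg in $_cmdline; do
        case "$_arg" in
            "$_key"=*)
                # Strip everything up to and including the first "=".
                echo "${_arg#*=}"
                return 0
                ;;
        esac
    done
    return 1    # key not present
}

# Example with a made-up command line:
cmdline_value NFSROOT 'ro quiet NFSROOT=10.3.0.1:/export/root'
```

Returning non-zero when the key is missing lets the caller fall back to a default, e.g. `root=$(cmdline_value NFSROOT) || root=none`.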

In addition, for configuring other devices (like InfiniBand IP addresses),
instead of just IPADDR=10.3.0.201 in the config file there would be a small
script that derives the address from the host name, so that node201 gets
10.3.0.201:  IPADDR=10.3.0.$(hostname | sed 's/node//')
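
A slightly more defensive version of the same idea might look like the sketch below. The function name ib_ipaddr, the handling of zero-padded names such as node007, and the 1-254 range check are my own assumptions; the "node" naming and the 10.3.0.x subnet are taken from the example above.

```shell
# Hypothetical helper: map a host name such as "node201" to 10.3.0.201.
# With no argument it uses the output of hostname.
ib_ipaddr() {
    _host="${1:-$(hostname)}"
    _host="${_host%%.*}"        # drop any domain part
    _n="${_host##*[!0-9]}"      # keep only the trailing digits
    # Strip leading zeros so "node007" yields 7 rather than 007.
    _n=$(printf '%s' "$_n" | sed 's/^0*//')
    # Reject names with no usable number, or one that does not fit a /24.
    if [ -z "$_n" ] || [ "$_n" -gt 254 ]; then
        echo "cannot derive a host octet from: $_host" >&2
        return 1
    fi
    echo "10.3.0.$_n"
}

# Example: in a config script one could write
# IPADDR=$(ib_ipaddr) || exit 1
ib_ipaddr node201
```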

Those are the tricks I see most often.  You may also want to look into two
projects that do stateless/diskless booting:  xCAT and Perceus.  Both of
them support all three methods described above.  There may be others as
well.

Hope that helps somewhat.

On Wed, Jul 7, 2010 at 8:02 AM, John Hearns <hearnsj at googlemail.com> wrote:

> On 5 July 2010 06:36, Holden Dapenor <holden.dapenor at gmail.com> wrote:
> > How does diskless clustering work for those aspects of the OS that need to
> > be unique for each node?
>
> As Ashley says, you use DHCP for the network configuration.
> There is very little else you should need to configure differently on
> each individual host - for instance batch scheduler systems store
> information on batch nodes in a central place. All the node needs to do is
> be configured to know its batch master, and to start the batch system
> daemon, then wait for the jobs to come in.
> Any changes to (say) /etc/pbs.conf are generally made to all the
> cluster nodes identically.

-- 
Vallard
http://sumavi.com