diskless node + g98?

Martin Siegert siegert at sfu.ca
Thu Jan 23 10:29:31 PST 2003


On Thu, Jan 23, 2003 at 11:43:29AM -0600, lmathew at okstate.edu wrote:
> Beowulf list readers:
> 
> I have a Beowulf cluster (12 diskless nodes, 1 fileserver/master) with
> 26 processors (total) that is configured to run computational simulations
> in both parallel and serial (pretty standard for this list).  I am
> interested in utilizing my cluster to run a series of serial g98
> calculations on each node.  These calculations (as many of you know)
> require a "scratch" space.  How can this scratch space be provided to a
> diskless node?  Here are a few options that I have identified.

I am running a 96 node (192 processor) cluster as a multi-purpose
research facility for a university. I have a lot of g98 jobs running
on that cluster. All of my nodes have /tmp on a local disk with 15GB of
scratch space.

> 1).  Mount a LARGE ram drive?  (1GB in size if possible??) 
Almost certainly not good enough: most of the g98 jobs that I see on
my cluster need more than 1GB of scratch space.

> 2).  Install hard disk drives in each of the slave nodes?  (unattractive)
By far the best solution.

> 3).  Use a drive mounted via NFS/PVNFS?  (large amount of communication)
Very bad. At first (because I did not know anything about g98) I had g98
configured to write its scratch files to the user's home directory
over NFS. This not only drove g98 performance towards zero but, worse,
made life miserable for everybody on the cluster (NFS timeouts, etc.).
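
A job wrapper could guard against this mistake before g98 starts. The
following is only a minimal sketch in Python, not a description of any
existing setup: the helper names (fs_type_for, pick_scratch_dir,
configure_scratch) and the list of network filesystem types are
hypothetical, and it assumes a Linux node whose mount table is in
/proc/mounts format. GAUSS_SCRDIR is the environment variable Gaussian
reads to locate its scratch directory.

```python
import os
import shutil

# Assumed list of filesystem types to treat as "network"; extend as needed.
NETWORK_FS_TYPES = {"nfs", "nfs4", "pvfs", "pvfs2"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fstype
    return best_type


def pick_scratch_dir(candidates, min_free_bytes, mounts_text):
    """Return the first candidate that is a local directory with enough free space."""
    for path in candidates:
        if not os.path.isdir(path):
            continue
        if fs_type_for(path, mounts_text) in NETWORK_FS_TYPES:
            continue  # scratch over NFS is exactly what we want to avoid
        if shutil.disk_usage(path).free >= min_free_bytes:
            return path
    return None


def configure_scratch(env, candidates, min_free_bytes, mounts_text):
    """Set GAUSS_SCRDIR in env, or fail loudly if no local scratch qualifies."""
    scratch = pick_scratch_dir(candidates, min_free_bytes, mounts_text)
    if scratch is None:
        raise RuntimeError("no local scratch directory with enough free space")
    env["GAUSS_SCRDIR"] = scratch
    return scratch
```

On a node, mounts_text would be the contents of /proc/mounts and env
would be os.environ (or the environment passed to the g98 process),
with candidates such as ["/tmp"] and a threshold chosen for the jobs
at hand.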

> Has anyone encountered this?  If so...what was the workaround that was
> implemented?  I am open to any suggestions and comments.   :)

I am going to stick my neck out here: configuring a multi-purpose
cluster with diskless nodes is a misconfiguration. Consider going
diskless only if you know that you will never run a job with
significant I/O on your cluster. Otherwise, stay away from it.
(You could install a high-performance file server on your cluster; we
actually have a NetApp NFS server. But for g98 your network becomes
the bottleneck. Furthermore, this is definitely more expensive than
installing local disks.)
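
The bottleneck argument can be made concrete with a back-of-envelope
calculation. This is a rough sketch under stated assumptions, not a
measurement: the function name per_node_share is hypothetical, the
link speeds are nominal figures chosen for illustration (1 Gbit/s is
about 125 MB/s, 100 Mbit/s about 12.5 MB/s), and protocol overhead is
ignored. Only the node counts (96 and 12) come from this thread.

```python
def per_node_share(server_link_mb_per_s, n_nodes):
    """Upper bound on scratch bandwidth per node when all nodes hit one server link."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return server_link_mb_per_s / n_nodes


# Assumed nominal link speeds in MB/s; real NFS throughput will be lower.
GIGABIT_MB_PER_S = 125.0
FAST_ETHERNET_MB_PER_S = 12.5

for label, link, nodes in [
    ("96 nodes, gigabit server link", GIGABIT_MB_PER_S, 96),
    ("12 nodes, 100 Mbit/s server link", FAST_ETHERNET_MB_PER_S, 12),
]:
    print(f"{label}: at most {per_node_share(link, nodes):.2f} MB/s per node")
```

Under these assumptions each node gets on the order of 1 MB/s when
every node writes scratch data at once, which is the sense in which
the network, rather than the file server, limits g98.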

Just my $0.02

Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
========================================================================


