cluster setup - handling user home areas - from main public network storage device

Jeff Layton jeffrey.b.layton at lmco.com
Thu Mar 28 03:25:11 PST 2002


Shin,

   Since nobody has jumped in yet, I guess I will :)



shin at guss.org.uk wrote:

> Hi,
>
> I'm just starting out with my first Beowulf, and have mostly worked
> out how I'm going to set things up, but I'm unsure how to deal with
> user home areas.
>
> The cluster will probably be configured as a number of nodes on a
> class B private address (it's not going to grow too much), with the
> front end node (FEN) sitting on the private network and also the
> main public network.
>
> The FEN will allow ssh only and will NFS export a number of s/w
> packages to the nodes (to save installing s/w on each node). It will
> also have a small scratch area - as does each node.
>
> The main problem I'm looking for advice on is how to handle user
> home areas. All the users have a large storage allocation on our
> main RAID (connected to Sun/Solaris kit, quota'd and backed up
> regularly), which sits on the main network. Users currently produce
> very large data files (gigabytes' worth), and the smallish area on
> the cluster won't suffice, as I expect similarly sized output files
> on the cluster.
>
> Should I:
>
> 1. Automount the user's home area from the RAID onto the FEN when
> each user logs in - but then how do I cope with the fact that the
> nodes are on the private network? Do I get the FEN to re-export the
> home area to the nodes so that jobs can write the data back? Or is
> NAT somehow the answer here?

AFAIK you can't re-export. I remember someone saying that you
could maybe re-export using the user-space nfsd, but not with the
kernel-space nfsd. I wouldn't recommend it anyway.

Another option, if you can do it, is to add a NIC to the RAID box
to connect it directly to the cluster switch. Of course, you would
have to take down the box to install the hardware, but it should work
pretty easily.
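
If you go that route, the Solaris-side export might look something
like the sketch below. The /export/home filesystem, the raid-priv
hostname, and the 172.16.0.0 private network are all made-up names
here; check share_nfs(1M) for the exact access-list syntax on your
Solaris release.

    # /etc/dfs/dfstab on the RAID box: export the home filesystem
    # read-write to the cluster's private network only
    share -F nfs -o rw=@172.16.0.0 /export/home

    # /etc/fstab on each compute node, mounting over the private net
    raid-priv:/export/home  /home  nfs  rw,hard,intr  0 0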

I think a simpler solution would be to add a few good-sized disks to
the FEN (if you can). Good-sized IDE drives are fairly cheap, and
some of the smaller SCSI drives are pretty reasonably priced as
well. Then just have the users stream their data off of the FEN onto
the RAID box as part of their job (that's the way we operate - we
have about 120 GB on the FEN and then stream that off to some NAS
boxes). Just make sure there's a good network connection between the
FEN and the RAID box.
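
For example, the tail end of a job script might look something like
this (the solver name, hostnames, and paths are all made up):

    #!/bin/sh
    # run the job, writing output to local scratch on the FEN
    ./solver < input.dat > /scratch/$USER/run42.out

    # when the run finishes, stream the results over to the RAID box
    # and clean up the scratch copy
    scp /scratch/$USER/run42.out raidbox:/home/$USER/results/ \
        && rm /scratch/$USER/run42.out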


>
>
> 2. Set up lots of scratch space on the FEN (or even use PVFS or
> similar across all the nodes' local disks) which each node can
> write to, and have users use scp to transfer files to/from the
> RAID. I expect users to balk at the idea of using scp, though.

PVFS is really intended to be a high-speed filesystem, not a place
to put home directories. The idea is to use it as scratch space for
jobs that need reasonably high-speed I/O and then move the data
from PVFS onto a more permanent filesystem. In addition, I don't
think you can run binaries out of PVFS quite yet. There has been
some work along those lines, but it will probably be a while before
this happens (you can't have symlinks either).
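
In other words, the usual pattern is stage in, compute, stage out.
A rough sketch of a job doing that (the /pvfs mount point and all
the paths are just examples):

    # stage the input onto PVFS scratch
    cp /home/$USER/input.dat /pvfs/$USER/

    # run from the NFS-mounted area (binaries can't live on PVFS),
    # doing the heavy I/O against PVFS
    /home/$USER/bin/solver /pvfs/$USER/input.dat /pvfs/$USER/output.dat

    # stage the results back to a permanent filesystem and clean up
    mv /pvfs/$USER/output.dat /home/$USER/results/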

>
>
> Also, should I allow users to run jobs (interactively?) on the FEN,
> or should it be used exclusively for logging in, NFS, etc.?

We let users run on the FEN if they need to. If it gets to be too
much (rarely happens) I just yell at them and then help them
find a place to run :)

>
>
> Additionally the main network uses NIS for authentication - and I
> wanted to try something similar on the cluster (which will have a
> far smaller number of users than the main network) so I was planning
> on running a separate small NIS domain on the cluster (with the FEN
> as master), rather than trying to sync passwd etc across nodes.

There was a good thread on this list about NIS and larger clusters.
I think the final conclusion was that for larger clusters (100+
nodes?) NIS starts to eat network bandwidth quickly. I understand
why people want to use NIS, since you can push lots of configuration
maps very easily and account maintenance is also easy. But, in light
of the comments that NIS can start eating bandwidth, we have just
stuck to copying the password/group files to the compute nodes.
Since we don't reconfigure our cluster, the other important files in
/etc have no real reason to be copied to the nodes. We keep a copy
of all of the relevant nodal information on the FEN (with backups,
of course), so rebuilding a node is fairly trivial.
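
The push itself is trivial - something along these lines (the
/etc/cluster-nodes list file is made up; use whatever node naming
scheme your cluster has):

    #!/bin/sh
    # push the account files from the FEN to every compute node
    for node in `cat /etc/cluster-nodes`; do
        scp -p /etc/passwd /etc/group /etc/shadow ${node}:/etc/
    done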

In general I REALLY believe in KISS for account maintenance and
nodal creation.

Good Luck!

Jeff Layton

Lockheed-Martin


>
>
> Any ideas or practical advice on how others are handling user home
> areas would be appreciated.
>
> Many TIA
> Shin
>



