[Beowulf] Project Planning: Storage, Network, and Redundancy Considerations
John Hearns
john.hearns at streamline-computing.com
Mon Mar 19 09:42:44 PDT 2007
Brian R. Smith wrote:
> Hey list,
>
> 1. Proprietary parallel storage systems (like Panasas, etc.): It
> provides the per-node bandwidth, aggregate bandwidth, caching
> mechanisms, fault-tolerance, and redundancy that we require (plus having
> a vendor offering 24x7x365 support & 24-hour turnaround is quite a breath
> of fresh air for us). The price point is a little high for the amount of
> storage that we will get, though - little more than double our current
> overall capacity. As far as I can tell, I can use this device as a
> permanent data store (like /home) and also as the users' scratch space
> so that there is only a single point for all data needs across the
> cluster. It does, however, require the installation of vendor kernel
> modules, which often add overhead to system administration (as they
> need to be compiled, linked, and tested before every kernel update).
If you like Panasas, go with them.
The kernel module thing isn't that big a deal - they are quite
willing to 'cook' the modules for you.
But YMMV.
>
> Our final problem is a relatively simple one though I am definitely a
> newbie to the H.A. world. Under this consolidation plan, we will have
> only one point of entry to this cluster and hence a single point of
> failure. Have any beowulfers had experience with deploying clusters
> with redundant head nodes in a pseudo-H.A. fashion (heartbeat
> monitoring, fail-over, etc.) and what experiences have you had in
> adapting your resource manager to this task? Would it simply be more
> feasible to move the resource manager to another machine at this point
> (and have both headnodes act as submit and administrative clients)? My
> current plan is unfortunately light on the details of handling SGE in
> such an environment. It includes purchasing two identical 1U boxes
> (with good support contracts). They will monitor each other for
> availability and the goal is to have the spare take over if the master
> fails. While the spare is not in use, I was planning on dispatching
> jobs to it.
I have constructed several clusters using HA.
I believe Joe Landman has too - as you are in the States, why not give
some thought to contacting Scalable and getting them to do some more
detailed designs for you?
The ones I have built use Linux-HA and Heartbeat. This is an
active/passive setup, with a primary and a backup head node; on
failover, the backup head node starts up the cluster services.
Failing over SGE is (relatively) easy - the main part is making sure
that the cluster spool directory is on shared storage, and mounting
that shared storage on one machine or the other :-)
The harder part is failing over NFS - again, I've done it.
I gather there is a wrinkle or two with NFSv4 on Linux-HA type systems.
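To give a flavour of the first approach, here is a minimal Heartbeat v1
(haresources-style) sketch. The hostnames, NICs, shared device, mount
point and init script names are all placeholders, and I am leaving out
authkeys and the NFS state directory (/var/lib/nfs has to live on the
shared disk too):

   # /etc/ha.d/ha.cf - identical on both head nodes
   keepalive 2          # heartbeat interval in seconds
   deadtime 30          # declare the peer dead after 30s of silence
   bcast eth1           # dedicated heartbeat link 1
   bcast eth2           # dedicated heartbeat link 2
   auto_failback off    # stay on the backup until an admin intervenes
   node head1
   node head2

   # /etc/ha.d/haresources - resources that follow the active node:
   # the virtual IP, the shared spool filesystem, then NFS and SGE
   # (nfs-kernel-server/sgemaster being whatever your init scripts are called)
   head1 IPaddr::192.168.1.10/24/eth0 Filesystem::/dev/sdb1::/cluster::ext3 nfs-kernel-server sgemaster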
The second way to do this would be to use shared storage together with
the Grid Engine queue master (shadow master) failover mechanism. This is
a different approach, in that you have two machines running, using
either a NAS-type storage server or Panasas/Lustre. The SGE spool
directory lives on that shared storage, and a qmaster will start on the
second machine if the first stops updating its heartbeat file.
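Roughly, the shadow master setup looks like this - assuming SGE 6.x, the
default cell, classic spooling on the shared filesystem (local Berkeley
DB spooling won't work for this), and head1/head2 as example hostnames:

   # $SGE_ROOT, including the qmaster spool, sits on the shared storage,
   # mounted in the same place on both candidate master hosts.

   # List the hosts allowed to take over the qmaster role, in order:
   cat > $SGE_ROOT/default/common/shadow_masters <<EOF
   head1
   head2
   EOF

   # The standby must be an admin host:
   qconf -ah head2

   # Start the shadow daemon on the standby (with the SGE environment
   # sourced). It watches the heartbeat file in the qmaster spool and,
   # if that stops being updated, starts its own qmaster and rewrites
   # act_qmaster so clients find the new master:
   $SGE_ROOT/bin/lx24-amd64/sge_shadowd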
PS. 1U boxes? Think something a bit bigger - with hot-swap PSUs.
You might also have to fit a second network card for your HA heartbeat
links (plural - you need two) plus a SCSI card, so think about
slightly bigger boxes for the two head nodes.
You can spec 1U boxes as interactive login/compile/job-submission
nodes. Maybe you could run DNS round-robin load balancing across these
for redundancy - they should all be identical, and if one stops
working then ho-hum.
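For example, in the BIND zone for the cluster (names and addresses made
up), a round-robin "login" alias is just repeated A records:

   login    IN  A   10.1.1.11   ; login1
   login    IN  A   10.1.1.12   ; login2
   login    IN  A   10.1.1.13   ; login3

Bear in mind that plain round robin does not notice a dead box - you (or
a health-checking load balancer) still have to pull the record when one
of them stops working.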
PPS. "While the spare is not in use, I was planning on dispatching jobs
to it."
We also do a cold failover setup which works just like that: the backup
node is used for running jobs when it is idle.
--
John Hearns
Senior HPC Engineer
Streamline Computing,
The Innovation Centre, Warwick Technology Park,
Gallows Hill, Warwick CV34 6UW
Office: 01926 623130 Mobile: 07841 231235