Two heads are better than one! :)

Bob Drzyzgula bob at drzyzgula.org
Thu Oct 31 17:56:06 PST 2002


This question has recently come up at my office
as well. After a few years of running small
"research/development" clusters, one of our groups
of users is contemplating using the technology for a
"production", or business-critical application. For such
systems, we usually provide some level of redundancy in
the design. While clearly the cluster of compute nodes
provides excellent redundancy -- node failures would most
likely only marginally reduce the overall throughput of the
cluster -- the question of how to provide for redundancy
at the controller node was less obvious to people.

Thus, the question becomes whether any of the various
cluster APIs and services such as PVM, MPI, BPROC, PBS,
etc. are dependent on the selection of a single, exclusive
master. Clearly if multiple simultaneously operating
masters are tolerated in the API, you can just have
multiple head nodes which are available all the time. If
an API requires a single master, one might have to effect
some sort of manual switch-over in the event of a head
node failure; this would then raise the question of the
complexity of such a switch-over, e.g. would compute node
reconfiguration be required or would it simply be a matter
of starting up the controller service on a new system.
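
To make the manual switch-over case a bit more concrete, here
is a minimal sketch in Python of the decision itself: probe a
priority-ordered list of candidate head nodes and use the first
one that answers. Everything named here is an assumption for
illustration (the host names head1/head2, the choice of TCP
port 22 as a liveness probe, and the function names); none of
it is part of PVM, MPI, BPROC or PBS. It covers only how one
might detect a dead head node and pick a replacement, not
whether a given API will tolerate the change.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_alive(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Crude liveness probe (assumption): can we open a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name lookup failed.
        return False


def pick_active_head(
    heads: Sequence[str],
    is_alive: Callable[[str], bool] = tcp_alive,
) -> Optional[str]:
    """Return the first head node, in priority order, that passes the probe.

    Returns None if no candidate responds.
    """
    for host in heads:
        if is_alive(host):
            return host
    return None
```

A wrapper script could call pick_active_head(["head1", "head2"])
before starting the controller service or submitting work. If
the API in question needs compute-node reconfiguration after a
switch, that step would have to be added separately.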

Personally, I am much more familiar with the hardware
aspects of all this than I am with the programming
and administrative aspects, so I would be interested in
people's opinions as to the difficulties that might
arise in attempting to provide this sort of redundancy
for any traditional cluster APIs. FWIW, the application
of interest is currently using MPI but I have a feeling
that this is just the tip of the iceberg as far as
production applications moving in this direction, so
I don't feel that I can limit the question in this way...

Thanks,
--Bob Drzyzgula


On Thu, Oct 31, 2002 at 02:59:41PM -0500, Aaron Lott wrote:
> 
> What kind of things do you want the 2 head nodes to do?
> 
> On Thu, 31 Oct 2002 robnash at rogers.com wrote:
> 
> > Hello Everyone,
> > 
> > Does anyone know if it's possible to have two active
> > head nodes in a Beowulf style cluster? I would like
> > to have two physical access points into one cluster.
> > Any help regarding this would be greatly appreciated.
> > 
> > Rob.


