[Beowulf] Transparent clustering

Mark Hahn hahn at mcmaster.ca
Thu Nov 13 14:52:20 PST 2008

> Service). I have always thought that it should be possible to build a
> cluster which works like a single system. So that when I open an SSH session
> to the cluster I get a connection as normal while in fact I am connecting to
> the clustered system.

the ssh connection is not interesting; what happens after that _is_.
so you ssh to a cluster, and through lvs or similar, you get put onto 
some node.  then what?  is it supposed to still "work like" a single 
system?  it does as long as you don't need more than one node, but 
that's not only boring but also raises the question: why a cluster at all?
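
as a rough sketch of that first step: the toy below hands out nodes in
round-robin order, which is one policy a balancer might use.  it is not
how lvs is actually implemented, and the node names and the
`make_dispatcher` / `pick_node` helpers are made up for illustration.

```python
from itertools import cycle

# hypothetical node names; a real cluster would read these from its own config
NODES = ["node01", "node02", "node03"]


def make_dispatcher(nodes):
    """Return a function that hands out nodes in round-robin order."""
    ring = cycle(nodes)

    def pick_node():
        return next(ring)

    return pick_node


pick_node = make_dispatcher(NODES)

# each incoming ssh session lands on exactly one node
sessions = [pick_node() for _ in range(4)]
print(sessions)
```

every session ends up on one node and sees only that node's processes,
memory and cpus, which is exactly the "then what?" problem.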

making a cluster really "work like" a single system means that no
thread should be aware of the fact that some other thread is on 
a different node.  this means a single pid space, transparent shared
memory, etc.  and even then (an SGI Altix is such a machine), none
of this is transparent in a strong sense (i.e., a thread can still 
tell when a cacheline is remote, because access to it is slower...)

> I started reading a lot, and it seems as if this can
> be done with beowulf. I just wonder if the head node would make things more
> difficult (since that can go down as well). Is this at all possible (using
> beowulf) and how would I go about configuring this?

are you confusing high-availability with clustering?  avoiding single points
of failure is laudable, but you quickly start to move away from anything 
that resembles high-performance (and necessarily start relying on more 
replication and thus cost...)

> I know this isn't very clear (I am just exploring), so please ask away.

well, "what do you mean?" pretty much covers it.  it's certainly possible
to avoid a single login node as a single point of failure.  it's also 
possible to use HA techniques to avoid other SPOFs (such as where the 
scheduler runs, or filesystems, etc).  but "working like a single system"
is much harder.
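
for the login-node case, here is a minimal sketch of client-side
failover, assuming a list of equivalent login nodes.  the hostnames and
the `tcp_probe` / `first_reachable` helpers are hypothetical; a real
setup might instead use a floating ip or round-robin dns.

```python
import socket


def tcp_probe(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name-resolution failures
        return False


def first_reachable(hosts, probe=tcp_probe):
    """Return the first host the probe accepts, or None if all fail."""
    for host in hosts:
        if probe(host):
            return host
    return None


# hypothetical login nodes, tried in order
LOGIN_NODES = ["login1.example.org", "login2.example.org"]
```

calling `first_reachable(LOGIN_NODES)` gives you a node to ssh to as long
as at least one login node is up.  note that this only removes the login
node as a SPOF; it does nothing to make the cluster behave like a single
system.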
