[Beowulf] Cluster Novice

Mark Hahn hahn at physics.mcmaster.ca
Mon Jan 17 11:32:58 PST 2005


> Does there exist any type of cluster that gives redundancy and even a parity check?
> I don't want to invent the wheel twice :-)

in general, this level of checking is done only for bank central-offices,
and usually is implemented with lockstep/quorum hardware.  for instance,
HP NonStop (née Tandem) computers.
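
to make the quorum idea concrete, here is a minimal software sketch of majority voting over redundant replicas. it is only an illustration of the principle, not how NonStop or any lockstep hardware is actually built; the names `quorum_vote`, `run_replicated`, `good_replica` and `flaky_replica` are all made up for this example.

```python
from collections import Counter


def quorum_vote(results):
    """Return the value that a strict majority of replicas agree on."""
    if not results:
        raise ValueError("no replica results to vote on")
    value, count = Counter(results).most_common(1)[0]
    # A strict majority is required; a tie or a split means we cannot trust any answer.
    if count * 2 <= len(results):
        raise RuntimeError("no majority: replicas disagree")
    return value


def run_replicated(replicas, x):
    """Run the same computation on every replica and vote on the answers."""
    return quorum_vote([replica(x) for replica in replicas])


def good_replica(x):
    # Hypothetical healthy replica.
    return x * x


def flaky_replica(x):
    # Hypothetical replica that silently returns a wrong answer.
    return x * x + 1


if __name__ == "__main__":
    # Two healthy replicas outvote the faulty one.
    print(run_replicated([good_replica, good_replica, flaky_replica], 7))
```

with three replicas, one wrong answer is outvoted; with only two, a disagreement can be detected but not resolved, which is why the sketch raises an error in that case.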

at the other extreme, if you simply assume that crashes are easy to detect,
and are not interested in the more "byzantine" modes of failure, it's rather
easy to set up high-availability (HA) clusters.  for instance, you might
simply have a small set of servers which elect a master (or load-balance),
and if a "heartbeat" fails, some number of servers get turned off.
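
a rough sketch of that heartbeat scheme, assuming crashes show up as missed heartbeats and nothing more subtle. the class `HeartbeatMonitor`, the 3-second timeout, and the "lowest name among the live nodes is master" election rule are assumptions made for illustration; a real HA stack would also power the fenced node off rather than just mark it.

```python
class HeartbeatMonitor:
    """Track heartbeats, fence nodes that go silent, and pick a master."""

    def __init__(self, nodes, timeout=3.0):
        self.timeout = timeout
        # Time of the last heartbeat seen from each node (all start at 0.0).
        self.last_seen = {node: 0.0 for node in nodes}
        self.fenced = set()

    def beat(self, node, now):
        """Record a heartbeat; a fenced node stays out until an operator intervenes."""
        if node not in self.fenced:
            self.last_seen[node] = now

    def check(self, now):
        """Fence every node whose heartbeat is overdue; return the newly fenced set."""
        overdue = {
            node
            for node, seen in self.last_seen.items()
            if node not in self.fenced and now - seen > self.timeout
        }
        self.fenced |= overdue
        return overdue

    def master(self):
        """Assumed election rule: the lowest-named live node wins."""
        alive = sorted(set(self.last_seen) - self.fenced)
        return alive[0] if alive else None


if __name__ == "__main__":
    mon = HeartbeatMonitor(["a", "b", "c"])
    for node in ("a", "b", "c"):
        mon.beat(node, now=1.0)
    # Node "a" goes silent; "b" and "c" keep beating.
    mon.beat("b", now=5.0)
    mon.beat("c", now=5.0)
    print(mon.check(now=5.0), mon.master())
```

time is passed in explicitly so the logic can be exercised without real clocks or sockets; in practice the heartbeats would arrive over the network and `check` would run on a timer.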

> My interest is in a "virtual" cpu that can run over N machines, and if
> the "virtual" cpu crashes, the other N-1 take over.

well, it depends on your assumptions.  for instance, how do you detect a
crash?  NonStop provides a much more paranoid view of "failure" than 
a simpler, software-based approach like STONITH ("shoot the other node
in the head") HA clusters.

> I'm not looking for speed, only stability.

beowulf is about speed, not HA.

> Does this exist already, or do I need to program one myself?

the answer to that is almost always that someone else has already done it.
it's a big world.  (that doesn't mean that existing wheels are perfect!)

regards, mark hahn.



