[Beowulf] New member, upgrading our existing Beowulf cluster
Joshua Baker-LePain
jlb17 at duke.edu
Thu Dec 3 11:35:45 PST 2009
On Thu, 3 Dec 2009 at 2:29pm, Mark Hahn wrote
>>> if a single node goes down, you need to take down all the
>>> nodes in the chassis before you can remove the dead node. Not very
>>> practical.
>>
>> Eh? What's so hard about marking the other nodes as unusable in your
>> batch system, and waiting for them to become free?
>
> depends on your max job length. but yeah, idling three nodes for a week
> is not going to be noticeable in anything but a quite small cluster...
But doesn't the engineer in you just bristle at the (admittedly, rather
slight) inefficiency? Call me OCD (you wouldn't be the first), but it
just bugs me.
--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF