[Beowulf] New member, upgrading our existing Beowulf cluster

Chris Samuel csamuel at vpac.org
Mon Dec 7 16:55:09 PST 2009

----- "Håkon Bugge" <h-bugge at online.no> wrote:

> What we did in Platform (Scali) MPI was to drain
> the HPC interconnect, then close it down. The problem
> was then reduced to checkpointing N processes
> (e.g. using BLCR).

I suspect this is what Open MPI does too, but I
don't know whether the VM-based systems can migrate
such jobs without this application-layer support.
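
For what it's worth, the drain-then-checkpoint sequence Håkon
describes can be sketched roughly as below. This is a toy Python
model, not Platform MPI or Open MPI code: the Rank and Interconnect
classes and checkpoint_all() are names made up for illustration, and
copy.deepcopy stands in for a real per-process checkpointer such
as BLCR.

```python
import copy
from collections import deque


class Rank:
    """One MPI-like process: local state plus an inbox (hypothetical)."""

    def __init__(self, rank_id):
        self.rank_id = rank_id
        self.state = {"received": []}
        self.inbox = deque()


class Interconnect:
    """Toy stand-in for the HPC interconnect; holds in-flight messages."""

    def __init__(self, ranks):
        self.ranks = ranks
        self.in_flight = deque()
        self.open = True

    def send(self, dst, payload):
        if not self.open:
            raise RuntimeError("interconnect is closed")
        self.in_flight.append((dst, payload))

    def drain(self):
        # Deliver every in-flight message so that no application
        # state lives only inside the network.
        while self.in_flight:
            dst, payload = self.in_flight.popleft()
            self.ranks[dst].inbox.append(payload)
        # Let each rank consume its inbox into local state.
        for rank in self.ranks:
            while rank.inbox:
                rank.state["received"].append(rank.inbox.popleft())

    def close(self):
        if self.in_flight:
            raise RuntimeError("drain the interconnect before closing it")
        self.open = False


def checkpoint_all(ranks, net):
    """Drain, close, then save each of the N ranks independently."""
    net.drain()
    net.close()
    # With the network quiet, this is just N single-process
    # checkpoints; deepcopy is a placeholder for the real thing.
    return [copy.deepcopy(rank.state) for rank in ranks]


ranks = [Rank(i) for i in range(3)]
net = Interconnect(ranks)
net.send(1, "halo-from-0")
net.send(2, "halo-from-1")
images = checkpoint_all(ranks, net)
```

The point of the ordering is that once the interconnect is drained
and closed, each saved image is self-contained, so the N processes
can be checkpointed without any coordination between them.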

Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
