[Beowulf] anyone using SALT on your clusters?

Christopher Samuel samuel at unimelb.edu.au
Fri Jun 28 22:07:20 PDT 2013


On 28/06/13 18:45, Jonathan Barber wrote:

> The problem with SSH based approaches is when you have failed nodes
> - normally they cause the entire command to hang until the attempted
>  connection times out.

xdsh in xCAT can handle that for you: passing the -v option tells it to
use each node's status as monitored, so that down nodes are avoided.

I might suggest to the xCAT developers an environment variable to enable
that by default, rather than having to remember to add the option.
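
Until something like that exists, a small local wrapper could get the same
effect. The following is only a rough sketch: the XDSH_SKIP_DOWN variable
and the build_xdsh_argv helper are hypothetical names made up for this
example and are not part of xCAT. It assumes the usual
"xdsh <noderange> [options] <command>" calling form.

```python
import os


def build_xdsh_argv(noderange, command, env=None):
    """Build an xdsh argument list, adding -v when the site opts in.

    XDSH_SKIP_DOWN is a hypothetical, site-local variable invented for
    this sketch; xCAT itself does not read it.
    """
    env = os.environ if env is None else env
    argv = ["xdsh", noderange]
    # Only add -v (the option discussed above) when explicitly enabled.
    if env.get("XDSH_SKIP_DOWN", "") == "1":
        argv.append("-v")
    argv.append(command)
    return argv
```

The resulting list could be handed to subprocess.run() from a wrapper
script, so that exporting XDSH_SKIP_DOWN=1 in a shell profile turns the
behaviour on without anyone having to type the option each time.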

-- 
  Christopher Samuel        Senior Systems Administrator
  VLSCI - Victorian Life Sciences Computation Initiative
  Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
  http://www.vlsci.org.au/      http://twitter.com/vlsci

