[Beowulf] anyone using SALT on your clusters?
jonathan.barber at gmail.com
Mon Jul 1 06:09:29 PDT 2013
On 29 June 2013 06:07, Christopher Samuel <samuel at unimelb.edu.au> wrote:
> On 28/06/13 18:45, Jonathan Barber wrote:
> > The problem with SSH based approaches is when you have failed nodes
> > - normally they cause the entire command to hang until the attempted
> > connection times out.
> xdsh in xCAT can handle that for you: passing the -v option tells it to
> use the nodes' monitored status to avoid down nodes.
That's interesting, I hadn't noticed that option before.
Looking at what it does, the option makes xCAT run "nmap -PE" (i.e. it
sends an ICMP echo request to each host) before connecting. So it will
still hang if a host answers pings but its sshd blocks for some reason
(such as with my past NFS woes).
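
One way round that would be to probe the SSH port itself rather than rely on ICMP. A rough sketch in Python, not anything xCAT does: the names ssh_banner_ok and responsive_nodes and the 3 second timeout are my own choices for illustration. It only checks that sshd sends its identification string in time, so a login that hangs later (say, on a stuck NFS home directory) would still need a timeout on the command itself.

```python
import socket


def ssh_banner_ok(host, port=22, timeout=3.0):
    """Return True if host sends an SSH identification string in time.

    Hypothetical helper: a host that answers pings but whose sshd never
    sends its banner is treated as down.
    """
    try:
        # The timeout applies to the connect and to the recv that follows.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            banner = sock.recv(64)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    return banner.startswith(b"SSH-")


def responsive_nodes(nodes, timeout=3.0):
    """Filter a node list down to those whose sshd answered the probe."""
    return [n for n in nodes if ssh_banner_ok(n, timeout=timeout)]
```

The probes run one after another, so a list with many dead nodes costs up to one timeout per dead node; on a big cluster you would want to run them in parallel.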
> I might suggest to them an environment variable to enable that by
> default, rather than having to remember to add it.
> Christopher Samuel Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/ http://twitter.com/vlsci
Jonathan Barber <jonathan.barber at gmail.com>