[Beowulf] IPoIB failure
Chris Samuel
samuel at unimelb.edu.au
Wed Jan 28 03:06:11 PST 2015
On Wed, 28 Jan 2015 11:51:16 AM Peter Kjellström wrote:
> The problem is most easily demonstrated by restarting the SM and then
> bringing up new ipoib interfaces on 6.6 hosts. This creates islands of
> connectivity.
Hmm, we have managed switches, so we don't restart the SMs on them unless
we have to do a complete power-out for the machine room, which is very rare.
Could well explain why we're not seeing this problem!
All the best,
Chris
--
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci