[Beowulf] How to know if infiniband network works?

Gus Correa gus at ldeo.columbia.edu
Wed Aug 2 09:58:07 PDT 2017

Hi Faraz

1) lsmod | grep ib should show whether the InfiniBand kernel modules are loaded.

2) InfiniBand normally uses remote DMA (RDMA) through "verbs".
You should see an "ib" module with "verbs" in the name (typically ib_uverbs).
That is the preferred/faster mode for MPI.
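
Points 1) and 2) can be sketched as a small shell helper. This is only a rough sketch: it assumes the core and verbs modules are named ib_core and ib_uverbs (the usual names on Linux RDMA stacks, but check your distribution), and the function name check_ib_modules is made up for illustration. It reads lsmod-style text on stdin, so it also works on saved output.

```shell
#!/bin/sh
# check_ib_modules: hypothetical helper, not part of any package.
# Reads lsmod-style output on stdin and reports whether the modules
# ib_core and ib_uverbs appear in the first column.
# Assumption: those are the module names on your system.
check_ib_modules() {
    awk '
        $1 == "ib_core"   { core = 1 }
        $1 == "ib_uverbs" { verbs = 1 }
        END {
            print "ib_core: "   (core  ? "loaded" : "missing")
            print "ib_uverbs: " (verbs ? "loaded" : "missing")
            # Exit status 0 only if both modules were seen.
            exit !(core && verbs)
        }
    '
}

# Usage on a live node:
#   lsmod | check_ib_modules
```

A non-zero exit status means at least one of the two modules was not found, which would suggest the verbs path is not available to MPI on that node.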

3) However, you can also use InfiniBand for TCP/IP (IP over InfiniBand), which is slower.
As the output of your ifconfig shows, your ib0 interface is
also configured for TCP/IP.

4) You may have two interfaces (one card with two ports, or two cards) in the
nodes. One of them (ib1) may not be connected to a switch. Check the back of
your nodes.

5) How to check whether MPI is using it depends a bit on which MPI library
you're using.
Which one? Open MPI, MVAPICH2, some vendor/proprietary one?
If it is Open MPI, the command "ompi_info" will tell.
With Open MPI there are also ways to enable/disable
InfiniBand at runtime.
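
For Open MPI, point 5) might look roughly like the sketch below. It assumes InfiniBand support shows up in the ompi_info output either as the "openib" BTL (Open MPI releases up to 4.x) or as the "ucx" PML (UCX-based builds); which one applies depends on how your Open MPI was built. The function name ompi_has_ib and the program name ./my_mpi_app are made up for illustration.

```shell
#!/bin/sh
# ompi_has_ib: hypothetical helper, not part of Open MPI.
# Reads `ompi_info` output on stdin and reports which
# InfiniBand-capable components were built in.
# Assumption: verbs support appears as "MCA btl: openib" or "MCA pml: ucx".
ompi_has_ib() {
    awk '
        /MCA btl: openib/ { print "openib BTL present"; found = 1 }
        /MCA pml: ucx/    { print "ucx PML present";    found = 1 }
        END {
            if (!found) print "no InfiniBand component found"
            exit !found
        }
    '
}

# Usage on a live node:
#   ompi_info | ompi_has_ib
#
# Runtime selection (openib-era Open MPI; ./my_mpi_app is a placeholder):
#   mpirun --mca btl openib,self,vader -np 4 ./my_mpi_app   # require InfiniBand
#   mpirun --mca btl ^openib           -np 4 ./my_mpi_app   # exclude InfiniBand
#   mpirun --mca btl tcp,self,vader    -np 4 ./my_mpi_app   # force TCP
```

Requiring openib explicitly is a useful test: if the InfiniBand fabric is not usable, the job should fail at startup rather than quietly fall back to TCP.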

6) Some InfiniBand diagnostics may also help (normally in /usr/sbin).
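
As one example for point 6), and assuming the infiniband-diags tools are installed, ibstat prints the logical and physical state of each port. The sketch below condenses that output to one line per port; the function name ib_port_states is made up, and the parsing assumes the usual ibstat layout ("CA '<name>'", "Port <n>:", "State:", "Physical state:").

```shell
#!/bin/sh
# ib_port_states: hypothetical helper, not part of infiniband-diags.
# Reads `ibstat` output on stdin and prints one line per port:
#   <device> port <n>: <State>/<Physical state>
# Assumption: the input follows the usual ibstat layout.
ib_port_states() {
    awk '
        # Device header, e.g.: CA (quote)mlx4_0(quote); strip the quotes.
        /^CA /                   { gsub(/[^A-Za-z0-9_]/, "", $2); ca = $2 }
        # Port header, e.g.: Port 1:
        /^[ \t]*Port [0-9]+:/    { sub(/:/, "", $2); port = $2 }
        # Logical link state, e.g.: State: Active
        /^[ \t]*State:/          { state = $2 }
        # Physical state closes the pair, e.g.: Physical state: LinkUp
        /^[ \t]*Physical state:/ { print ca " port " port ": " state "/" $3 }
    '
}

# Usage on a live node:
#   ibstat | ib_port_states
```

A healthy, connected port should read Active/LinkUp. A port stuck at Down/Polling usually has no cable or no switch on the other end, and a port that is LinkUp but not Active may be waiting for a subnet manager.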



OK, this is my pedestrian view of InfiniBand.
Now let's hear from the experts on the list for deeper insights. :)

I hope this helps,
Gus Correa

On 08/02/2017 12:44 PM, Faraz Hussain wrote:
> I have inherited a 20-node cluster that supposedly has an InfiniBand
> network. I am testing some MPI applications and am seeing no performance
> improvement with multiple nodes. So I am wondering if the InfiniBand
> network even works?
> The output of ifconfig -a shows an ib0 and an ib1 interface. I ran ethtool
> ib0 and it shows:
>          Speed: 40000Mb/s
>          Link detected: no
> and for ib1 it shows:
>          Speed: 10000Mb/s
>          Link detected: no
> I am assuming this means it is down? Any idea how to debug further and 
> restart it?
> Thanks!
