[Beowulf] How to know if infiniband network works?

Faraz Hussain info at feacluster.com
Thu Aug 3 09:50:09 PDT 2017


Here are the results from the tcp and rdma tests. I take them to mean that
the IB network is performing at the expected speed.

[hussaif1 at lustwzb5 ~]$ qperf lustwzb4 -t 30 tcp_lat tcp_bw
tcp_lat:
     latency  =  24.2 us
tcp_bw:
     bw  =  1.19 GB/sec
[hussaif1 at lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc_bw
rc_lat:
     latency  =  7.76 us
rc_bw:
     bw  =  4.56 GB/sec
[hussaif1 at lustwzb5 ~]$
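
As a rough cross-check of the numbers above, the sketch below converts the
two bandwidth figures to Gbit/sec and computes the RDMA-to-TCP ratios. It
assumes qperf's GB means 10^9 bytes; the helper names gb_to_gbit and ratio
are made up for this example.

```shell
#!/bin/sh
# Figures copied from the qperf runs above (GB/sec and microseconds).
tcp_bw=1.19; rc_bw=4.56
tcp_lat=24.2; rc_lat=7.76

# Convert GB/sec to Gbit/sec (assumption: 1 GB = 10^9 bytes).
gb_to_gbit() { awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb * 8 }'; }

# Ratio of two numbers, rounded to two decimals.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

echo "tcp_bw: $(gb_to_gbit "$tcp_bw") Gbit/sec"
echo "rc_bw:  $(gb_to_gbit "$rc_bw") Gbit/sec"
echo "bandwidth ratio rc/tcp: $(ratio "$rc_bw" "$tcp_bw")"
echo "latency ratio tcp/rc:   $(ratio "$tcp_lat" "$rc_lat")"
```

Whether the rc_bw figure counts as the expected speed depends on the link
rate, which is not stated in this thread; ibstat reports it in its Rate
field.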


Quoting Jeff Johnson <jeff.johnson at aeoncomputing.com>:

> Faraz,
>
> I didn't notice any tests where you actually tested the ip layer. You
> should run some iperf tests between nodes to make sure ipoib functions.
> Your infiniband/rdma can be working fine and ipoib can be dysfunctional.
> You need to ensure the ipoib configuration, like any ip environment, is
> configured the same on all nodes (network/subnet, netmask, mtu, etc) and
> that all of the nodes are configured for the same mode (connected vs
> datagram). If you can't run iperf then there is something broken in the
> ipoib configuration.
>
> --Jeff
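
A minimal sketch of the per-node check described above: it prints the IPoIB
mode and MTU of one interface so the output can be compared across nodes.
The interface name ib0 comes from this thread; the function name
ipoib_summary and the SYSFS_NET override (only there so the function can be
pointed at a test directory) are assumptions for illustration.

```shell
#!/bin/sh
# Print the IPoIB settings that should be identical on every node.
ipoib_summary() {
    iface="${1:-ib0}"
    sys="${SYSFS_NET:-/sys/class/net}/$iface"
    if [ ! -d "$sys" ]; then
        echo "$iface: not present"
        return 1
    fi
    # IPoIB interfaces expose "datagram" or "connected" in the mode file.
    mode=$(cat "$sys/mode" 2>/dev/null || echo unknown)
    mtu=$(cat "$sys/mtu" 2>/dev/null || echo unknown)
    echo "$iface mode=$mode mtu=$mtu"
}

ipoib_summary "${1:-ib0}" || true
```

If the output matches on all nodes, an iperf run (iperf -s on one node,
iperf -c with the other node's IPoIB address on the second) exercises the
ip layer itself. The IPoIB addresses are not given in this thread.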
>
> On Thu, Aug 3, 2017 at 8:41 AM, Faraz Hussain <info at feacluster.com> wrote:
>
>> Thanks for everyone's help. Using the Ohio State tests, qperf and
>> perfquery, I am convinced the IB network is working. The only thing that
>> still bothers me is that I cannot get mpirun to use the tcp network. I
>> tried all combinations of --mca btl to no avail. It is not important, it
>> is more a matter of curiosity.
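
One possible explanation, not confirmed anywhere in this thread, is that
Open MPI selected a PML other than ob1, in which case the btl list may have
no effect on point-to-point traffic. The sketch below only prints a command
line that forces the ob1 PML together with the TCP BTL and raises BTL
verbosity so the selected components are logged. The process count, the
./osu_bw path, and the function name tcp_only_cmd are assumptions; the host
names are the two nodes from the qperf runs.

```shell
#!/bin/sh
# Print (not run) an mpirun command line pinned to the TCP BTL.
tcp_only_cmd() {
    echo mpirun -np 2 --host lustwzb4,lustwzb5 \
        --mca pml ob1 --mca btl tcp,self \
        --mca btl_base_verbose 30 "$@"
}

tcp_only_cmd ./osu_bw
```

The TCP BTL can also pick up ib0 through IPoIB; the btl_tcp_if_include
parameter restricts it to named interfaces.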
>>
>>
>>
>> Quoting Michael Di Domenico <mdidomenico4 at gmail.com>:
>>
>>> On Thu, Aug 3, 2017 at 10:10 AM, Faraz Hussain <info at feacluster.com> wrote:
>>>
>>>> Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and got
>>>> the results below. What is confusing is that I get the same result
>>>> whether I use tcp or openib (by passing --mca btl openib|tcp,self to my
>>>> mpirun command). I also tried setting the environment variable:
>>>> export OMPI_MCA_btl=tcp,self,sm
>>>> The results are the same regardless of tcp or openib.
>>>>
>>>> And when I do ifconfig -a I still see zero traffic reported for the ib0
>>>> and ib1 networks.
>>>>
>>>
>>> If Open MPI uses RDMA for the traffic, ib0/ib1 will not show it;
>>> you have to use perfquery.
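
To illustrate the point above, here is a sketch that reads a port's
transmit-data counter from sysfs before and after a job and converts the
difference to bytes. The device name mlx4_0, the port number, the function
names, and the SYSFS_IB override are assumptions for illustration;
ibstat -l lists the device names actually present.

```shell
#!/bin/sh
# Read the transmit-data counter for one HCA port from sysfs.
read_xmit_words() {
    dev="${1:-mlx4_0}"; port="${2:-1}"
    cat "${SYSFS_IB:-/sys/class/infiniband}/$dev/ports/$port/counters/port_xmit_data"
}

# The data counters are in units of 4 octets, so scale the delta by 4.
delta_bytes() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (b - a) * 4 }'
}

# Usage sketch (run on a node with an HCA):
#   before=$(read_xmit_words mlx4_0 1)
#   ... run the MPI job ...
#   after=$(read_xmit_words mlx4_0 1)
#   delta_bytes "$before" "$after"
```

A delta that grows with the message sizes sent by the job suggests the
traffic is going over the IB port even though ifconfig shows nothing.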
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>
>>
>>
>>
>
>
>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.johnson at aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001   f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite D - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage




