<div dir="ltr"><div dir="ltr"><div dir="ltr"><div>It may be using IPoIB (TCP/IP over IB), not verbs/rdma. <br></div><div>You can force it to use openib (verbs, rdma) with (vader is for in-node shared memory):</div><div><br></div><div><pre class="gmail-de1"><span class="gmail-co4"></span><span class="gmail-kw2">mpirun</span> <span class="gmail-re5">--mca</span> btl openib,self,vader ...<br><br></pre><pre class="gmail-de1">These flags may also help tell which btl (byte transport layer) is being used:<br><br> <code>--mca btl_base_verbose 30</code></pre><pre class="gmail-de1">See these FAQ:<br><a href="https://www.open-mpi.org/faq/?category=openfabrics#ib-btl">https://www.open-mpi.org/faq/?category=openfabrics#ib-btl</a><br><a href="https://www.open-mpi.org/faq/?category=all#tcp-routability-1.3">https://www.open-mpi.org/faq/?category=all#tcp-routability-1.3</a><br></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif">Better really ask more details in the Open MPI list. They are the pros!<br></font></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif">My two cents,<br>Gus Correa<br></font></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif"><br></font></pre><pre class="gmail-de1"><br></pre></div><div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 30, 2019 at 3:57 PM Faraz Hussain <<a href="mailto:info@feacluster.com">info@feacluster.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thanks, after buidling openmpi 4 from source, it now works! However it <br>
still gives this message below when I run Open MPI with the verbose setting:

No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:       lustwzb34
  Local device:     mlx4_0
  Local port:       1
  CPCs attempted:   rdmacm, udcm
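
To see which transport actually gets picked, a sketch combining the two MCA flags Gus suggested with the benchmark run shown below (same hostfile):

    # force openib (plus self/vader) and raise BTL verbosity to log the chosen transport
    mpirun --mca btl openib,self,vader --mca btl_base_verbose 30 \
        -np 2 -hostfile ./hostfile ./osu_latency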
<br>
However, the results from my latency and bandwith tests seem to be <br>
what I would expect from infiniband. See:<br>
<br>
[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile <br>
./osu_latency<br>
# OSU MPI Latency Test v5.3.2
# Size          Latency (us)
0                       1.87
1                       1.88
2                       1.93
4                       1.92
8                       1.93
16                      1.95
32                      1.93
64                      2.08
128                     2.61
256                     2.72
512                     2.93
1024                    3.33
2048                    3.81
4096                    4.71
8192                    6.68
16384                   8.38
32768                  12.13
65536                  19.74
131072                 35.08
262144                 64.67
524288                122.11
1048576               236.69
2097152               465.97
4194304               926.31
<br>
[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile ./osu_bw<br>
# OSU MPI Bandwidth Test v5.3.2
# Size      Bandwidth (MB/s)
1                       3.09
2                       6.35
4                      12.77
8                      26.01
16                     51.31
32                    103.08
64                    197.89
128                   362.00
256                   676.28
512                  1096.26
1024                 1819.25
2048                 2551.41
4096                 3886.63
8192                 3983.17
16384                4362.30
32768                4457.09
65536                4502.41
131072               4512.64
262144               4531.48
524288               4537.42
1048576              4510.69
2097152              4546.64
4194304              4565.12
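
(A rough sanity check, assuming this is a 40 Gb/s port: 4565 MB/s × 8 bits is about 36.5 Gbit/s, which is in the right range once protocol overhead is accounted for.)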
<br>
When I run ibv_devinfo I get:<br>
<br>
[hussaif1@lustwzb34 pt2pt]$ ibv_devinfo
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.36.5000
        node_guid:                      480f:cfff:fff5:c6c0
        sys_image_guid:                 480f:cfff:fff5:c6c3
        vendor_id:                      0x02c9
        vendor_part_id:                 4103
        hw_ver:                         0x0
        board_id:                       HP_1360110017
        phys_port_cnt:                  2
        Device ports:
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet

                port:   2
                        state:          PORT_DOWN (1)
                        max_mtu:        4096 (5)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet
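
One detail that may matter (my reading, worth confirming on the list): both ports report link_layer: Ethernet, which would mean the adapter is running RoCE rather than native InfiniBand, and if I understand the Open MPI FAQ correctly, RoCE requires the rdmacm CPC in the openib BTL. A quick filter to pull out just the port state and link layer:

    # show only port number, state, and link layer for each port
    ibv_devinfo | grep -E 'port:|state|link_layer'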
<br>
I will ask the openmpi mailing list if my results make sense?!<br>
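
Before posting, one more check (a sketch; ompi_info ships with Open MPI) to confirm the new build actually includes the openib BTL component:

    # an "MCA btl: openib ..." line should appear if OpenFabrics support was compiled in
    ompi_info | grep btl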
<br>
<br>
Quoting Gus Correa <<a href="mailto:gus@ldeo.columbia.edu" target="_blank">gus@ldeo.columbia.edu</a>>:<br>
<br>
> Hi Faraz<br>
><br>
> By all means, download the Open MPI tarball and build from source.<br>
> Otherwise there won't be support for IB (the CentOS Open MPI packages most<br>
> likely rely only on TCP/IP).<br>
><br>
> Read their README file (it comes in the tarball), and take a careful look<br>
> at their (excellent) FAQ:<br>
> <a href="https://www.open-mpi.org/faq/" rel="noreferrer" target="_blank">https://www.open-mpi.org/faq/</a><br>
> Many issues can be solved by just reading these two resources.<br>
><br>
> If you hit more trouble, subscribe to the Open MPI mailing list, and ask<br>
> questions there,<br>
> because you will get advice directly from the Open MPI developers, and the<br>
> fix will come easy.<br>
> <a href="https://www.open-mpi.org/community/lists/ompi.php" rel="noreferrer" target="_blank">https://www.open-mpi.org/community/lists/ompi.php</a><br>
><br>
> My two cents,<br>
> Gus Correa<br>
><br>
> On Tue, Apr 30, 2019 at 3:07 PM Faraz Hussain <<a href="mailto:info@feacluster.com" target="_blank">info@feacluster.com</a>> wrote:<br>
><br>
>> Thanks, yes I have installed those libraries. See below. Initially I<br>
>> installed the libraries via yum. But then I tried installing the rpms<br>
>> directly from Mellanox website (<br>
>> MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tar ). Even after doing<br>
>> that, I still got the same error with openmpi. I will try your<br>
>> suggestion of building openmpi from source next!<br>
>><br>
>> root@lustwzb34:/root # yum list | grep ibverbs<br>
>> libibverbs.x86_64 41mlnx1-OFED.4.5.0.1.0.45101<br>
>> libibverbs-devel.x86_64 41mlnx1-OFED.4.5.0.1.0.45101<br>
>> libibverbs-devel-static.x86_64 41mlnx1-OFED.4.5.0.1.0.45101<br>
>> libibverbs-utils.x86_64 41mlnx1-OFED.4.5.0.1.0.45101<br>
>> libibverbs.i686 17.2-3.el7<br>
>> rhel-7-server-rpms<br>
>> libibverbs-devel.i686 1.2.1-1.el7<br>
>> rhel-7-server-rpms<br>
>><br>
>> root@lustwzb34:/root # lsmod | grep ib<br>
>> ib_ucm 22602 0<br>
>> ib_ipoib 168425 0<br>
>> ib_cm 53141 3 rdma_cm,ib_ucm,ib_ipoib<br>
>> ib_umad 22093 0<br>
>> mlx5_ib 339961 0<br>
>> ib_uverbs 121821 3 mlx5_ib,ib_ucm,rdma_ucm<br>
>> mlx5_core 919178 2 mlx5_ib,mlx5_fpga_tools<br>
>> mlx4_ib 211747 0<br>
>> ib_core 294554 10<br>
>><br>
>> rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib<br>
>> mlx4_core 360598 2 mlx4_en,mlx4_ib<br>
>> mlx_compat 29012 15<br>
>><br>
>> rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib<br>
>> devlink 42368 4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core<br>
>> libcrc32c 12644 3 xfs,nf_nat,nf_conntrack<br>
>> root@lustwzb34:/root #<br>
>><br>
>><br>
>><br>
>> > Did you install libibverbs (and libibverbs-utils, for information and<br>
>> > troubleshooting)?<br>
>><br>
>> > yum list |grep ibverbs<br>
>><br>
>> > Are you loading the ib modules?<br>
>><br>
>> > lsmod |grep ib<br>
>><br>
>><br>
<br>
<br>
<br>
</blockquote></div>