<div dir="ltr"><div dir="ltr"><div dir="ltr"><div>It may be using IPoIB (TCP/IP over IB), not verbs/rdma. <br></div><div>You can force it to use openib (verbs, rdma) with (vader is for in-node shared memory):</div><div><br></div><div><pre class="gmail-de1"><span class="gmail-co4"></span><span class="gmail-kw2">mpirun</span> <span class="gmail-re5">--mca</span> btl openib,self,vader ...<br><br></pre><pre class="gmail-de1">These flags may also help tell which btl (byte transport layer) is being used:<br><br> <code>--mca btl_base_verbose 30</code></pre><pre class="gmail-de1">See these FAQ:<br><a href="https://www.open-mpi.org/faq/?category=openfabrics#ib-btl">https://www.open-mpi.org/faq/?category=openfabrics#ib-btl</a><br><a href="https://www.open-mpi.org/faq/?category=all#tcp-routability-1.3">https://www.open-mpi.org/faq/?category=all#tcp-routability-1.3</a><br></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif">Better really ask more details in the Open MPI list. They are the pros!<br></font></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif">My two cents,<br>Gus Correa<br></font></pre><pre class="gmail-de1"><font face="arial,helvetica,sans-serif"><br></font></pre><pre class="gmail-de1"><br></pre></div><div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 30, 2019 at 3:57 PM Faraz Hussain <<a href="mailto:info@feacluster.com">info@feacluster.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thanks, after buidling openmpi 4 from source, it now works! However it  <br>
On Tue, Apr 30, 2019 at 3:57 PM Faraz Hussain <info@feacluster.com> wrote:

Thanks, after building openmpi 4 from source, it now works! However it
still gives this message below when I run openmpi with the verbose setting:

No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

   Local host:           lustwzb34
   Local device:         mlx4_0
   Local port:           1
   CPCs attempted:       rdmacm, udcm

However, the results from my latency and bandwidth tests seem to be
what I would expect from InfiniBand. See:

[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile ./osu_latency
# OSU MPI Latency Test v5.3.2
# Size          Latency (us)
0                       1.87
1                       1.88
2                       1.93
4                       1.92
8                       1.93
16                      1.95
32                      1.93
64                      2.08
128                     2.61
256                     2.72
512                     2.93
1024                    3.33
2048                    3.81
4096                    4.71
8192                    6.68
16384                   8.38
32768                  12.13
65536                  19.74
131072                 35.08
262144                 64.67
524288                122.11
1048576               236.69
2097152               465.97
4194304               926.31

[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile ./osu_bw
# OSU MPI Bandwidth Test v5.3.2
# Size      Bandwidth (MB/s)
1                       3.09
2                       6.35
4                      12.77
8                      26.01
16                     51.31
32                    103.08
64                    197.89
128                   362.00
256                   676.28
512                  1096.26
1024                 1819.25
2048                 2551.41
4096                 3886.63
8192                 3983.17
16384                4362.30
32768                4457.09
65536                4502.41
131072               4512.64
262144               4531.48
524288               4537.42
1048576              4510.69
2097152              4546.64
4194304              4565.12
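
Rerunning either test with the verbose flag suggested above should confirm
which BTL actually carried the traffic. A sketch, assuming the diagnostics
go to stderr as usual:

    mpirun --mca btl_base_verbose 30 -np 2 -hostfile ./hostfile ./osu_bw 2>&1 | grep -i btl

If only the tcp BTL shows up there, the numbers above went over the kernel
TCP stack rather than RDMA.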

When I run ibv_devinfo I get:

[hussaif1@lustwzb34 pt2pt]$ ibv_devinfo
hca_id: mlx4_0
         transport:                      InfiniBand (0)
         fw_ver:                         2.36.5000
         node_guid:                      480f:cfff:fff5:c6c0
         sys_image_guid:                 480f:cfff:fff5:c6c3
         vendor_id:                      0x02c9
         vendor_part_id:                 4103
         hw_ver:                         0x0
         board_id:                       HP_1360110017
         phys_port_cnt:                  2
         Device ports:
                 port:   1
                         state:                  PORT_ACTIVE (4)
                         max_mtu:                4096 (5)
                         active_mtu:             1024 (3)
                         sm_lid:                 0
                         port_lid:               0
                         port_lmc:               0x00
                         link_layer:             Ethernet

                 port:   2
                         state:                  PORT_DOWN (1)
                         max_mtu:                4096 (5)
                         active_mtu:             1024 (3)
                         sm_lid:                 0
                         port_lid:               0
                         port_lmc:               0x00
                         link_layer:             Ethernet
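
Note that link_layer is Ethernet on both ports, so this adapter is running
RoCE rather than native InfiniBand. If I read the OpenFabrics FAQ linked
above correctly, the openib BTL can only use the rdmacm CPC on RoCE ports,
so pinning the CPC explicitly may make the warning go away. A sketch:

    mpirun --mca btl openib,self,vader --mca btl_openib_cpc_include rdmacm \
        -np 2 -hostfile ./hostfile ./osu_latency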

I will ask the Open MPI mailing list whether my results make sense!
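
One more sanity check, to be sure the benchmark traffic really went through
the RDMA stack and not the kernel TCP path, is to watch the port's RDMA byte
counters, which should increase noticeably after a run. A sketch, assuming
the mlx4 sysfs layout for the device shown above:

    cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_data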


Quoting Gus Correa <gus@ldeo.columbia.edu>:

> Hi Faraz
>
> By all means, download the Open MPI tarball and build from source.
> Otherwise there won't be support for IB (the CentOS Open MPI packages most
> likely rely only on TCP/IP).
>
> Read their README file (it comes in the tarball), and take a careful look
> at their (excellent) FAQ:
> https://www.open-mpi.org/faq/
> Many issues can be solved by just reading these two resources.
>
> If you hit more trouble, subscribe to the Open MPI mailing list, and ask
> questions there,
> because you will get advice directly from the Open MPI developers, and the
> fix will come easy.
> https://www.open-mpi.org/community/lists/ompi.php
>
> My two cents,
> Gus Correa
>
> On Tue, Apr 30, 2019 at 3:07 PM Faraz Hussain <info@feacluster.com> wrote:
>
>> Thanks, yes I have installed those libraries. See below. Initially I
>> installed the libraries via yum. But then I tried installing the rpms
>> directly from Mellanox website (
>> MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tar ). Even after doing
>> that, I still got the same error with openmpi. I will try your
>> suggestion of building openmpi from source next!
>>
>> root@lustwzb34:/root # yum list | grep ibverbs
>> libibverbs.x86_64                     41mlnx1-OFED.4.5.0.1.0.45101
>> libibverbs-devel.x86_64               41mlnx1-OFED.4.5.0.1.0.45101
>> libibverbs-devel-static.x86_64        41mlnx1-OFED.4.5.0.1.0.45101
>> libibverbs-utils.x86_64               41mlnx1-OFED.4.5.0.1.0.45101
>> libibverbs.i686                       17.2-3.el7
>> rhel-7-server-rpms
>> libibverbs-devel.i686                 1.2.1-1.el7
>> rhel-7-server-rpms
>>
>> root@lustwzb34:/root # lsmod | grep ib
>> ib_ucm                 22602  0
>> ib_ipoib              168425  0
>> ib_cm                  53141  3 rdma_cm,ib_ucm,ib_ipoib
>> ib_umad                22093  0
>> mlx5_ib               339961  0
>> ib_uverbs             121821  3 mlx5_ib,ib_ucm,rdma_ucm
>> mlx5_core             919178  2 mlx5_ib,mlx5_fpga_tools
>> mlx4_ib               211747  0
>> ib_core               294554  10
>> rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
>> mlx4_core             360598  2 mlx4_en,mlx4_ib
>> mlx_compat             29012  15
>> rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
>> devlink                42368  4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core
>> libcrc32c              12644  3 xfs,nf_nat,nf_conntrack
>> root@lustwzb34:/root #
>>
>>
>>
>> > Did you install libibverbs  (and libibverbs-utils, for information and
>> > troubleshooting)?
>>
>> > yum list |grep ibverbs
>>
>> > Are you loading the ib modules?
>>
>> > lsmod |grep ib