[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

Gus Correa gus at ldeo.columbia.edu
Tue Apr 30 10:20:42 PDT 2019


Did you install libibverbs (and libibverbs-utils, for information and
troubleshooting)?

yum list installed | grep ibverbs
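
Since the error names a specific device and port (mlx4_0, port 1), it may
also help to look at what ibv_devinfo (shipped in libibverbs-utils) reports
for that port. Below is a minimal sketch, not a definitive check: the helper
name summarize_ports is hypothetical, and it assumes the usual ibv_devinfo
layout with "hca_id:", "port:", "state:" and "link_layer:" fields. A state
other than PORT_ACTIVE, or a link_layer you did not expect, would be a
reasonable first thing to investigate.

```shell
# Hypothetical helper (not part of any package): reads ibv_devinfo output on
# stdin and prints one line per port: "<device> port <n>: <state> <link_layer>".
# Assumes link_layer is printed after state for each port.
summarize_ports() {
    awk '
        $1 == "hca_id:"     { dev = $2 }     # device name, e.g. mlx4_0
        $1 == "port:"       { port = $2 }    # port number
        $1 == "state:"      { state = $2 }   # e.g. PORT_ACTIVE or PORT_DOWN
        $1 == "link_layer:" { print dev " port " port ": " state " " $2 }
    '
}
```

On a node with the card installed, this would be run as:
ibv_devinfo | summarize_ports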

Are you loading the ib modules?

lsmod |grep ib
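
A plain grep for "ib" also matches unrelated modules, so a more targeted
check can be easier to read. The sketch below is only an illustration: the
helper name missing_ib_modules is hypothetical, and the module list is my
assumption of what a ConnectX-3 (mlx4) card typically needs on a RHEL 7
system, so adjust it to your setup. It reads lsmod output and prints the
expected modules that are not loaded.

```shell
# Hypothetical helper: reads lsmod-style output on stdin and prints each
# expected module that is NOT loaded. The module list is an assumption for
# an mlx4-based card; edit it as needed.
missing_ib_modules() {
    # First column of lsmod is the module name; skip the header line.
    loaded=$(awk 'NR > 1 { print $1 }')
    for mod in mlx4_core mlx4_ib ib_core ib_uverbs rdma_cm rdma_ucm; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$loaded" | grep -qxF "$mod" || echo "$mod"
    done
}
```

On a real node this would be run as:
lsmod | missing_ib_modules
No output means every module in the list is loaded.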


On Tue, Apr 30, 2019 at 10:14 AM Faraz Hussain <info at feacluster.com> wrote:

> I installed RedHat 7.5 on two machines with the following Mellanox cards:
>
> 87:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
>
> I followed the steps outlined here to verify RDMA is working:
>
>
> https://community.mellanox.com/s/article/howto-enable-perftest-package-for-upstream-kernel
>
> However, I cannot seem to get Open MPI 3.0.2 to work. When I run it, I
> get this error:
>
> --------------------------------------------------------------------------
> No OpenFabrics connection schemes reported that they were able to be
> used on a specific port. As such, the openib BTL (OpenFabrics
> support) will be disabled for this port.
>
>   Local host:      lustwzb34
>   Local device:    mlx4_0
>   Local port:      1
>   CPCs attempted:  rdmacm, udcm
> --------------------------------------------------------------------------
>
> Then it just hangs until I press Ctrl-C.
>
> I understand this may be an issue with Red Hat, Open MPI, or Mellanox.
> Any ideas on how to narrow down which one is at fault?
>
> Thanks!
>