[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?
Gus Correa
gus at ldeo.columbia.edu
Tue Apr 30 10:20:42 PDT 2019
Did you install libibverbs (and libibverbs-utils, for information and
troubleshooting)?

    yum list | grep ibverbs

Are the IB kernel modules loaded?

    lsmod | grep ib
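
The module check can be made a bit more targeted. Below is a minimal sketch,
assuming a ConnectX-3 (mlx4) card. The module list and the helper name
missing_ib_modules are assumptions for illustration, not something Open MPI
or Mellanox prescribes. It reads lsmod-style output on stdin and prints the
listed modules that are not loaded.

```shell
#!/bin/sh
# Sketch only: the module list is an assumption for ConnectX-3 (mlx4)
# hardware; adjust it for your driver stack.

missing_ib_modules() {
    # Read lsmod-style text on stdin and keep only the module-name column.
    loaded=$(awk '{print $1}')
    for mod in mlx4_core mlx4_ib ib_core ib_uverbs rdma_cm rdma_ucm; do
        # Print the module name if no line matches it exactly.
        printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
    done
}

# On a real node:
#   lsmod | missing_ib_modules
```

Empty output means every module in the list is loaded. Anything printed is a
candidate for modprobe, although whether it explains the rdmacm/udcm failure
is only a guess at this point.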
On Tue, Apr 30, 2019 at 10:14 AM Faraz Hussain <info at feacluster.com> wrote:
> I installed RedHat 7.5 on two machines with the following Mellanox cards:
>
> 87:00.0 Network controller: Mellanox Technologies MT27520 Family
> [ConnectX-3 Pro]
>
> I followed the steps outlined here to verify RDMA is working:
>
> https://community.mellanox.com/s/article/howto-enable-perftest-package-for-upstream-kernel
>
> However, I cannot seem to get Open MPI 3.0.2 to work. When I run it, I
> get this error:
>
> --------------------------------------------------------------------------
> No OpenFabrics connection schemes reported that they were able to be
> used on a specific port. As such, the openib BTL (OpenFabrics
> support) will be disabled for this port.
>
>   Local host:      lustwzb34
>   Local device:    mlx4_0
>   Local port:      1
>   CPCs attempted:  rdmacm, udcm
> --------------------------------------------------------------------------
>
> Then it just hangs until I press Ctrl-C.
>
> I understand this may be an issue with RedHat, Open MPI or Mellanox.
> Any ideas to debug which place it could be?
>
> Thanks!
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>