[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?
Faraz Hussain
info at feacluster.com
Thu May 2 08:40:56 PDT 2019
Thanks. Before I go down the path of installing things willy-nilly, is
there some guide I should be following instead? I obviously have a
problem with my Mellanox drivers combined with "user error".
So should I be paying Mellanox to help? Or is it a Red Hat issue? Or is
it our hardware vendor, HP, who should be involved?
Looks like I need support on how to get support :-)
Quoting Christopher Samuel <chris at csamuel.org>:
>> root@lustwzb34:/root # systemctl status rdma
>> Unit rdma.service could not be found.
>
> You're missing this RPM then, which might explain a lot:
>
> $ rpm -qi rdma-core
> Name : rdma-core
> Version : 17.2
> Release : 3.el7
> Architecture: x86_64
> Install Date: Tue 04 Dec 2018 03:58:16 PM AEDT
> Group : Unspecified
> Size : 107924
> License : GPLv2 or BSD
> Signature : RSA/SHA256, Tue 13 Nov 2018 01:45:22 AM AEDT, Key ID 24c6a8a7f4a80eb5
> Source RPM : rdma-core-17.2-3.el7.src.rpm
> Build Date : Wed 31 Oct 2018 07:10:24 AM AEDT
> Build Host : x86-01.bsys.centos.org
> Relocations : (not relocatable)
> Packager : CentOS BuildSystem <http://bugs.centos.org>
> Vendor : CentOS
> URL : https://github.com/linux-rdma/rdma-core
> Summary : RDMA core userspace libraries and daemons
> Description :
> RDMA core userspace infrastructure and documentation, including initscripts,
> kernel driver-specific modprobe override configs, IPoIB network scripts,
> dracut rules, and the rdma-ndd utility.
>
> --
> Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
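
For reference, the check Chris suggests in the quoted reply can be sketched as a small shell snippet. This is only a rough illustration, assuming an RPM-based system; check_pkg is a hypothetical helper name, not part of rpm or any other tool.

```shell
#!/bin/sh
# Sketch only: report whether an RPM package is installed.
# check_pkg is a hypothetical helper name used for illustration.
check_pkg() {
    # rpm -q exits non-zero when the package is not installed;
    # if rpm itself is unavailable, the package is reported as missing.
    if command -v rpm >/dev/null 2>&1 && rpm -q "$1" >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

echo "rdma-core: $(check_pkg rdma-core)"
```

If this prints "missing", that would be consistent with the "Unit rdma.service could not be found" message quoted above.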