[Beowulf] How to know if infiniband network works?
Gus Correa
gus at ldeo.columbia.edu
Thu Aug 3 09:37:45 PDT 2017
Hi Faraz
+1 to John's suggestion of joining the Open MPI list.
Your questions are now veering towards Open MPI specifics,
and you will get great feedback on this topic there.
If you want to use TCP/IP instead of RDMA
(say, over IPoIB or Gigabit Ethernet cards),
you can do so by telling Open MPI not to use openib
(--mca btl ^openib).
You can also specify which interfaces to use.
These could be your IPoIB interfaces (ib0 or the corresponding subnet
addresses), or Ethernet.
Please see the Open MPI FAQ on TCP interface selection:
https://www.open-mpi.org/faq/?category=tcp#tcp-selection
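As a rough sketch of what such a command line could look like (the
process count, the hostfile name "hosts", and the ./osu_bw path are
assumptions made up for illustration; btl_tcp_if_include is the TCP BTL
parameter for choosing interfaces):

```shell
# Assumed values, not taken from the thread: adjust to your cluster.
NP=2
HOSTFILE=hosts
APP=./osu_bw

# "^openib" excludes the openib BTL; btl_tcp_if_include restricts the
# TCP BTL to the named interface (here the IPoIB interface ib0).
TCP_CMD="mpirun -np $NP --hostfile $HOSTFILE --mca btl ^openib --mca btl_tcp_if_include ib0 $APP"

# Print the command for review before launching it by hand.
echo "$TCP_CMD"
```

If the traffic really goes over IPoIB, the ib0 counters shown by
ifconfig should start to move, unlike in the RDMA case.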
The FAQ about InfiniBand is also worth a look:
https://www.open-mpi.org/faq/?category=openfabrics
Verbosity can be turned on with mca parameters that you can find with:
ompi_info --all | grep verbose
btl_base_verbose is a good start.
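A minimal sketch of how this might be used (the verbosity level 30,
the process count, and the ./osu_bw path are assumptions chosen for
illustration, not values from this thread):

```shell
# List the verbosity-related MCA parameters, if Open MPI is on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --all | grep verbose || true
fi

# Assumed verbosity level; higher values print more detail.
VERBOSE_LEVEL=30

# Ask the BTL framework to report which transports it opens and selects.
VERBOSE_CMD="mpirun -np 2 --mca btl_base_verbose $VERBOSE_LEVEL ./osu_bw"

# Print the command for review before launching it by hand.
echo "$VERBOSE_CMD"
```

The extra output should indicate which btl components were actually
selected, which is one way to check whether tcp or openib is in use.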
Note that Open MPI also uses network interfaces for its
startup, management, and wrap-up communications.
This uses another framework, "out of band" (oob), separate from the
"btl" (byte transfer layer), with a corresponding
set of mca parameters that look like this: "--mca oob ..."
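For instance, a sketch along the lines of what John describes below
(MPI traffic over InfiniBand, control traffic over ib0) might look like
this. The hostfile name, process count, and ./osu_bw path are
assumptions, and oob_tcp_if_include is, as far as I know, the oob
counterpart of btl_tcp_if_include in the Open MPI releases of this era:

```shell
# Assumed values, not taken from the thread: adjust to your cluster.
NP=2
HOSTFILE=hosts
APP=./osu_bw

# oob_tcp_if_include pins the out-of-band (startup/control) traffic to
# ib0, while the btl list keeps the MPI traffic itself on openib.
OOB_CMD="mpirun -np $NP --hostfile $HOSTFILE --mca oob_tcp_if_include ib0 --mca btl openib,self,sm $APP"

# Print the command for review before launching it by hand.
echo "$OOB_CMD"
```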
Overall, their FAQ has very good information, as does the
README file in their tarball.
https://www.open-mpi.org/faq/
https://github.com/open-mpi/ompi/blob/master/README
I hope this helps,
Gus Correa
On 08/03/2017 11:59 AM, John Hearns via Beowulf wrote:
> Faraz, do you mean the IPOIB tcp network, ie the ib0 interface?
> Good question. I would advise joining the Openmpi list. They are very
> friendly over there.
> I have always seen polite and helpful replies even to dumb questions
> there (such as the ones I ask).
>
> I actually had to do something similar recently - we have nodes with
> only IB, so I had to run OpenMPI over Infiniband,
> but also say that the control connection had to use the ib0 interface.
>
>
>
> On 3 August 2017 at 17:41, Faraz Hussain <info at feacluster.com> wrote:
>
> Thanks for everyone's help. Using the Ohio State tests, qperf and
> perfquery I am convinced the IB network is working. The only thing
> that still bothers me is I can not get mpirun to use the tcp
> network. I tried all combinations of --mca btl to no avail. It is
> not important, more just curiosity.
>
>
>
> Quoting Michael Di Domenico <mdidomenico4 at gmail.com>:
>
> On Thu, Aug 3, 2017 at 10:10 AM, Faraz Hussain
> <info at feacluster.com> wrote:
>
> Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and
> got the results below. What is confusing is I get the same result if
> I use tcp or openib ( by doing --mca btl openib|tcp,self with my
> mpirun command ). I also tried changing the environment variable:
> export OMPI_MCA_btl=tcp,self,sm .
> Results are the same regardless of tcp or openib..
>
> And when I do ifconfig -a I still see zero traffic reported for the
> ib0 and ib1 network.
>
>
> if openmpi uses RDMA for the traffic ib0/ib1 will not show traffic,
> you have to use perfquery
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> <mailto:Beowulf at beowulf.org> sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> <http://www.beowulf.org/mailman/listinfo/beowulf>