[Beowulf] MPI + IB question
Jörg Saßmannshausen
j.sassmannshausen at ucl.ac.uk
Thu Nov 15 05:10:27 PST 2012
Dear all,
thanks for the feedback. I actually did a test of the IB network after
installation (the typical ping-pong), with the results attached to this email.
Also, a different program (cp2k) runs without any problems when I am
using the IB network (as enforced with --mca btl ^tcp).
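For reference, a typical invocation on my side looks roughly like the following
(the hostfile and the cp2k binary name are only placeholders for whatever is
actually used):

$ /opt/openmpi/gfortran/1.4.5/bin/mpirun -np 16 --hostfile ./hosts \
      --mca btl ^tcp ./cp2k.popt input.inp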
I agree; what I find puzzling is that OpenMPI is not able to use the IB
network, although support for it is compiled in (unless I misread that):
$ /opt/openmpi/gfortran/1.4.5/bin/ompi_info | less
[ ... ]
MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4.5)
MCA btl: ofud (MCA v2.0, API v2.0, Component v1.4.5)
MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.5)
MCA btl: self (MCA v2.0, API v2.0, Component v1.4.5)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.5)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.5)
[ ... ]
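Given that openib shows up in that list, one thing I could try (if I understand
the MCA selection correctly) is to restrict a test run to the openib/sm/self
BTLs instead of only excluding tcp; that way mpirun should fail with an explicit
error if openib cannot actually be initialised, rather than quietly falling back
to another transport. The test program here is only a placeholder:

$ /opt/openmpi/gfortran/1.4.5/bin/mpirun -np 2 --hostfile ./hosts \
      --mca btl openib,sm,self ./some_mpi_test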
The libraries/drivers for the IB were installed via apt-get, so they are in
the same place on both clusters. Basically, my clusters are all clones to make life a bit
easier (there are small variations, but the basic installation is the same here).
And yes, I know the cards are different as well! :-)
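In case it helps to compare the two installations directly, the relevant
packages can be listed on a node of each cluster (the exact package names may
of course differ between the Debian releases involved):

$ dpkg -l | grep -i -E 'ibverbs|openmpi|ofed'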
One thing came to mind: when I disable the TCP BTL, does that mean
that the lifeline will also go over the IB network, or does OpenMPI still use
the TCP network to connect to the other nodes and then use the IB network for
the communication?
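If I understand the framework split correctly, that runtime lifeline is handled
by the separate oob framework (which is TCP-based) rather than by a btl, so
excluding the tcp btl should not touch it. Its settings can at least be
inspected with:

$ /opt/openmpi/gfortran/1.4.5/bin/ompi_info --param oob tcp

and, assuming an IPoIB interface called ib0 exists and I remember the parameter
name correctly, it could be pinned to that interface with
--mca oob_tcp_if_include ib0 on the mpirun line.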
As I said, since the IB ping-pong was working and cp2k works with the --mca btl
^tcp switch, I am still a bit puzzled by all of that, unless I am overlooking
something. The only other thing I could think of is that I am running IPoIB on
the QLogic cluster but not on the Voltaire cluster. However, I would not have
thought that this makes a difference here.
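Should that matter, the IPoIB and port state is quick to check on a node of
each cluster (ib0 is only my guess at the interface name; ibv_devinfo comes
with the libibverbs utilities):

$ ip addr show ib0
$ ibv_devinfo | grep -E 'hca_id|state'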
Thanks for your help!
Jörg
On Thursday 15 November 2012 10:51:56 you wrote:
> Hello Joerg,
>
> I just received your message on [Beowulf] and was thinking that my comment
> might be helpful. Your log clearly indicates that OpenMPI is seeing only the
> `self' and `sm' btl components but not the InfiniBand one. This is clearly a
> question of the OpenMPI installation on the Voltaire cluster.
>
> Since the binary does not carry the MPI libraries over with it, it is the MPI
> library as installed on the two clusters that makes the difference. Are you
> aware of any program compiled with OpenMPI that runs on the Voltaire
> cluster over IB?
>
> If there is no such program, an MPI bandwidth test such as the Intel
> MPI Benchmarks might make a good test of the OpenMPI configuration.
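>
> A minimal run could look roughly like this (the IMB binary location and the
> hostfile are placeholders here, and the two ranks need to land on two
> different nodes so that the openib path is actually exercised):
>
> $ mpirun -np 2 --hostfile ./two_nodes --mca btl openib,sm,self \
>       ./IMB-MPI1 PingPong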
>
> Best of luck,
> Dima
--
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ
email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
-------------- next part --------------
InfiniBand:
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.1, MPI-1 part
#---------------------------------------------------
# Date : Wed May 16 10:46:56 2012
# Machine : x86_64
# System : Linux
# Release : 2.6.32-5-amd64
# Version : #1 SMP Sun May 6 04:00:17 UTC 2012
# MPI Version : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE
# Calling sequence was:
# IMB-MPI1 PingPong
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
      #bytes  #repetitions       t[usec]   Mbytes/sec
           0          1000          4.97         0.00
           1          1000          4.73         0.20
           2          1000          4.74         0.40
           4          1000          4.75         0.80
           8          1000          4.83         1.58
          16          1000          4.86         3.14
          32          1000          4.93         6.19
          64          1000          5.13        11.90
         128          1000          6.22        19.64
         256          1000          7.29        33.50
         512          1000          8.22        59.38
        1024          1000         10.08        96.91
        2048          1000         12.43       157.10
        4096          1000         16.45       237.46
        8192          1000         25.12       310.97
       16384          1000         46.05       339.29
       32768          1000         70.35       444.22
       65536           640        119.77       521.84
      131072           320        217.19       575.53
      262144           160        418.11       597.93
      524288            80        816.68       612.23
     1048576            40       1618.95       617.68
     2097152            20       3264.62       612.63
     4194304            10       7146.55       559.71
Gigabit network:
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.1, MPI-1 part
#---------------------------------------------------
# Date : Wed May 16 10:49:44 2012
# Machine : x86_64
# System : Linux
# Release : 2.6.32-5-amd64
# Version : #1 SMP Sun May 6 04:00:17 UTC 2012
# MPI Version : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE
# Calling sequence was:
# IMB-MPI1 PingPong
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
      #bytes  #repetitions       t[usec]   Mbytes/sec
           0          1000         46.99         0.00
           1          1000         47.22         0.02
           2          1000         47.23         0.04
           4          1000         47.05         0.08
           8          1000         47.10         0.16
          16          1000         47.08         0.32
          32          1000         47.21         0.65
          64          1000         47.46         1.29
         128          1000         48.12         2.54
         256          1000         60.43         4.04
         512          1000         67.49         7.23
        1024          1000         81.41        12.00
        2048          1000        101.38        19.27
        4096          1000        126.08        30.98
        8192          1000        163.49        47.78
       16384          1000        248.18        62.96
       32768          1000        413.34        75.60
       65536           640        810.73        77.09
      131072           320       1384.05        90.31
      262144           160       2528.66        98.87
      524288            80       4782.93       104.54
     1048576            40       9272.86       107.84
     2097152            20      18186.55       109.97
     4194304            10      35994.60       111.13