[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes
siegert at sfu.ca
Mon Nov 16 15:27:57 PST 2009
On Mon, Nov 16, 2009 at 04:55:51PM -0500, Gus Correa wrote:
> Hi Martin
> We didn't know which compiler you used.
> So what Michael sent you ("mmodel=memory_model")
> is the Intel compiler flag syntax.
> (PGI uses the same syntax, IIRR.)
Now that was really stupid: I am using gcc-4.3.2 and even looked up
the correct syntax for the memory model, but nevertheless pasted the
Intel syntax into my configure script ... sorry.
> Gcc/gfortran use "-mcmodel=memory_model" for x86_64 architecture.
> I only used this with Intel ifort, hence I am not sure,
> but "medium" should work fine for large data/not-so-large program
> in gcc/gfortran.
> The "large" model doesn't seem to be implemented by gcc (4.1.2)
> (Maybe it is there in newer gcc versions.)
> The darn thing is that gcc says "medium" doesn't support building
> shared libraries,
> hence you may need to build OpenMPI static libraries instead,
> I would guess.
> (Again, check this if you have a newer gcc version.)
> Here's an excerpt of my gcc (4.1.2) man page:
> -mcmodel=small
>     Generate code for the small code model: the program and its
>     symbols must be linked in the lower 2 GB of the address space.
>     Pointers are 64 bits. Programs can be statically or dynamically
>     linked. This is the default code model.
> -mcmodel=kernel
>     Generate code for the kernel code model. The kernel runs in the
>     negative 2 GB of the address space. This model has to be used for
>     Linux kernel code.
> -mcmodel=medium
>     Generate code for the medium model: The program is linked in the
>     lower 2 GB of the address space but symbols can be located anywhere
>     in the address space. Programs can be statically or dynamically
>     linked, but building of shared libraries are not supported with the
>     medium model.
> -mcmodel=large
>     Generate code for the large model: This model makes no assumptions
>     about addresses and sizes of sections. Currently GCC does not
>     implement this model.
I recompiled openmpi with -mcmodel=medium and -mcmodel=large. The program
still fails. The error message changes, however:
id=1: calling irecv ...
id=0: calling isend ...
mlx4: local QP operation err (QPN 340052, WQE index 0, vendor syndrome 70, opcode = 5e)
[[55365,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 282498416 opcode 11046 vendor error 112 qp_idx 3
(strerror(112) is "Host is down", which is certainly not correct).
This now points to system libraries - libmlx4. Am I correct in assuming that
this is either an OFED problem or OpenMPI exceeding some buffers in OFED
libraries without checking?
> If you are using OpenMPI, "ompi-info -config"
> will tell the flags used to compile it.
> Mine is 1.3.2 and has no explicit mcmodel flag,
> which according to the gcc man page should default to "small".
Are you - in fact, is anybody - able to run my test program? I am
hoping that there is some stupid misconfiguration on the cluster
that can be fixed easily, without reinstalling/recompiling all
> Are you using 16GB per process or for the whole set of processes?
I am running the two processes on different nodes (and nothing else
on the nodes), thus each process has the full 16GB available.
> I hope this helps,
> Gus Correa
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> Martin Siegert wrote:
>> Hi Michael,
>> On Mon, Nov 16, 2009 at 10:49:23AM -0700, Michael H. Frese wrote:
>>> Could it be that your MPI library was compiled using a small memory
>>> model? The 180 million doubles sounds suspiciously close to a 2 GB
>>> addressing limit.
>>> This issue came up on the list recently under the topic "Fortran Array
>>> size question."
>> I am running MPI applications that use more than 16GB of memory - I do not
>> believe that this is the problem. Also -mmodel=large
>> does not appear to be a valid argument for gcc under x86_64:
>> gcc -DNDEBUG -g -fPIC -mmodel=large conftest.c >&5
>> cc1: error: unrecognized command line option "-mmodel=large"
>> - Martin
>>> At 05:43 PM 11/14/2009, Martin Siegert wrote:
>>>> I am running into problems when sending large messages (about
>>>> 180000000 doubles) over IB. A fairly trivial example program is attached.
>>>> # mpicc -g sendrecv.c
>>>> # mpiexec -machinefile m2 -n 2 ./a.out
>>>> id=1: calling irecv ...
>>>> id=0: calling isend ...
>>>> [[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2
>>>> error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for
>>>> wr_id 199132400 opcode 549755813 vendor error 105 qp_idx 3
>>>> This is with OpenMPI-1.3.3.
>>>> Does anybody know a solution to this problem?
>>>> If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
>>>> and never returns.
>>>> I asked on the openmpi users list but got no response ...
>>>> Martin Siegert
>>>> Head, Research Computing
>>>> WestGrid Site Lead
>>>> IT Services phone: 778 782-4691
>>>> Simon Fraser University fax: 778 782-4242
>>>> Burnaby, British Columbia email: siegert at sfu.ca
>>>> Canada V5A 1S6