[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes

Martin Siegert siegert at sfu.ca
Mon Nov 16 15:27:57 PST 2009


Hi,

On Mon, Nov 16, 2009 at 04:55:51PM -0500, Gus Correa wrote:
> Hi Martin
>
> We didn't know which compiler you used.
> So what Michael sent you ("mmodel=memory_model")
> is the Intel compiler flag syntax.
> (PGI uses the same syntax, IIRR.)

Now that was really stupid: I am using gcc-4.3.2 and even looked up
the correct syntax for the memory model, but nevertheless pasted the
Intel syntax into my configure script ... sorry.

> Gcc/gfortran use "-mcmodel=memory_model" for x86_64 architecture.
> I only used this with Intel ifort, hence I am not sure,
> but "medium" should work fine for large data/not-so-large program
> in gcc/gfortran.
> The "large" model doesn't seem to be implemented by gcc (4.1.2)
> anyway.
> (Maybe it is there in newer gcc versions.)
> The darn thing is that gcc says "medium" doesn't support building
> shared libraries,
> hence you may need to build OpenMPI static libraries instead,
> I would guess.
> (Again, check this if you have a newer gcc version.)
> Here's an excerpt of my gcc (4.1.2) man page:
>
>
>        -mcmodel=small
>            Generate code for the small code model: the program and its
>            symbols must be linked in the lower 2 GB of the address space.
>            Pointers are 64 bits.  Programs can be statically or
>            dynamically linked.  This is the default code model.
>
>        -mcmodel=kernel
>            Generate code for the kernel code model.  The kernel runs in
>            the negative 2 GB of the address space.  This model has to be
>            used for Linux kernel code.
>
>        -mcmodel=medium
>            Generate code for the medium model: The program is linked in
>            the lower 2 GB of the address space but symbols can be located
>            anywhere in the address space.  Programs can be statically or
>            dynamically linked, but building of shared libraries are not
>            supported with the medium model.
>
>        -mcmodel=large
>            Generate code for the large model: This model makes no
>            assumptions about addresses and sizes of sections.  Currently
>            GCC does not implement this model.

I recompiled OpenMPI with -mcmodel=medium and -mcmodel=large. The program
still fails. The error message changes, however:

id=1: calling irecv ...
id=0: calling isend ...
mlx4: local QP operation err (QPN 340052, WQE index 0, vendor syndrome 70, opcode = 5e)
[[55365,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 282498416 opcode 11046  vendor error 112 qp_idx 3

(strerror(112) is "Host is down", which is certainly not correct).
This now points to system libraries - libmlx4. Am I correct in assuming that
this is either an OFED problem or OpenMPI exceeding some buffers in OFED
libraries without checking?

> If you are using OpenMPI, "ompi-info -config"
> will tell the flags used to compile it.
> Mine is 1.3.2 and has no explicit mcmodel flag,
> which according to the gcc man page should default to "small".

Are you - in fact, is anybody - able to run my test program? I am
hoping that there is some stupid misconfiguration on the cluster
that can be fixed easily, without reinstalling/recompiling all
apps ...

> Are you using 16GB per process or for the whole set of processes?

I am running the two processes on different nodes (and nothing else
on the nodes), thus each process has the full 16GB available.
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------

Thanks!

- Martin

> Martin Siegert wrote:
>> Hi Michael,
>>
>> On Mon, Nov 16, 2009 at 10:49:23AM -0700, Michael H. Frese wrote:
>>> Martin,
>>>
>>> Could it be that your MPI library was compiled using a small memory 
>>> model?  The 180 million doubles sounds suspiciously close to a 2 GB 
>>> addressing limit.
>>>
>>> This issue came up on the list recently under the topic "Fortran Array 
>>> size question."
>>>
>>>
>>> Mike
>>
>> I am running MPI applications that use more than 16GB of memory - I do not 
>> believe that this is the problem. Also -mmodel=large
>> does not appear to be a valid argument for gcc under x86_64:
>> gcc -DNDEBUG -g -fPIC -mmodel=large   conftest.c  >&5
>> cc1: error: unrecognized command line option "-mmodel=large"
>>
>> - Martin
>>
>>> At 05:43 PM 11/14/2009, Martin Siegert wrote:
>>>> Hi,
>>>>
>>>> I am running into problems when sending large messages (about
>>>> 180000000 doubles) over IB. A fairly trivial example program is attached.
>>>>
>>>> # mpicc -g sendrecv.c
>>>> # mpiexec -machinefile m2 -n 2 ./a.out
>>>> id=1: calling irecv ...
>>>> id=0: calling isend ...
>>>> [[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 199132400 opcode 549755813  vendor error 105 qp_idx 3
>>>>
>>>> This is with OpenMPI-1.3.3.
>>>> Does anybody know a solution to this problem?
>>>>
>>>> If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
>>>> and never returns.
>>>> I asked on the openmpi users list but got no response ...
>>>>
>>>> Cheers,
>>>> Martin
>>>>
>>>> --
>>>> Martin Siegert
>>>> Head, Research Computing
>>>> WestGrid Site Lead
>>>> IT Services                                phone: 778 782-4691
>>>> Simon Fraser University                    fax:   778 782-4242
>>>> Burnaby, British Columbia                  email: siegert at sfu.ca
>>>> Canada  V5A 1S6

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services                                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6


