[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes

Martin Siegert siegert at sfu.ca
Mon Nov 16 18:38:09 PST 2009


On Mon, Nov 16, 2009 at 05:20:48PM -0800, Greg Lindahl wrote:
> On Mon, Nov 16, 2009 at 10:49:23AM -0700, Michael H. Frese wrote:
> 
> > Could it be that your MPI library was compiled using a small memory  
> > model?  The 180 million doubles sounds suspiciously close to a 2 GB  
> > addressing limit.
> >
> > This issue came up on the list recently under the topic "Fortran Array 
> > size question."
> 
> If you need a memory model other than the default small, you'll get a
> particular error message at link time; here's an example courtesy of
> the Intel software forums, but I bet that every compiler for Linux
> includes an example in their manual:
> 
> /tmp/ifort3X7vjE.o: In function `sph':
> sph.f:41: relocation truncated to fit: R_X86_64_PC32 against `.bss'
> sph.f:94: relocation truncated to fit: R_X86_64_PC32 against `.bss'
> sph.f:94: relocation truncated to fit: R_X86_64_PC32 against `.bss'
> sph.f:94: relocation truncated to fit: R_X86_64_PC32 against `.bss'
> 
> And it's only when your BSS is too big, not variables on the stack or
> allocated/malloced. I really doubt this is the problem either now or
> before.

Thanks, that's good to know. I certainly do not see any such messages,
neither with the Intel compiler nor with gcc.
Furthermore, compiling Open MPI with -mcmodel=medium or -mcmodel=large
does not make a difference.
(My previous email about the error message changing was mistaken:
the error message changes when l is 268435456 or larger.)

Also: compiling Open MPI with ofed-1.4.1 does not make a difference.
May I conclude that this just does not work? Or can anybody actually
send an array of 180000000 doubles?

- Martin


