[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes
Martin Siegert
siegert at sfu.ca
Sat Nov 14 16:43:27 PST 2009
Hi,
I am running into problems when sending large messages (about
180000000 doubles, i.e., roughly 1.4 GB) over InfiniBand. A fairly trivial
example program (sendrecv.c) is attached.
# mpicc -g sendrecv.c
# mpiexec -machinefile m2 -n 2 ./a.out
id=1: calling irecv ...
id=0: calling isend ...
[[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 199132400 opcode 549755813 vendor error 105 qp_idx 3
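Since the attachment was scrubbed in the archive, here is a minimal sketch of the kind of test that produces the output above; the buffer size matches the count quoted in the post, but the tag, error handling, and overall structure are my assumptions, not necessarily the exact attached sendrecv.c:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 180000000  /* ~1.4 GB of doubles, as described in the post */

int main(int argc, char **argv)
{
    int id;
    double *buf;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);

    buf = malloc((size_t)N * sizeof(double));
    if (buf == NULL) {
        fprintf(stderr, "id=%d: malloc failed\n", id);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    if (id == 0) {
        printf("id=%d: calling isend ...\n", id);
        MPI_Isend(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (id == 1) {
        printf("id=%d: calling irecv ...\n", id);
        MPI_Irecv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```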
This is with OpenMPI-1.3.3.
Does anybody know a solution to this problem?
If I use MPI_Allreduce instead of MPI_Isend/MPI_Irecv, the program just
hangs and never returns.
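For reference, the MPI_Allreduce variant that hangs would look roughly like the sketch below; the count matches the post, but the reduction operation (MPI_SUM) and the use of separate send/receive buffers are my assumptions:

```c
#include <mpi.h>
#include <stdlib.h>

#define N 180000000  /* same count as the Isend/Irecv case, ~1.4 GB */

int main(int argc, char **argv)
{
    double *in, *out;

    MPI_Init(&argc, &argv);

    in  = calloc((size_t)N, sizeof(double));
    out = malloc((size_t)N * sizeof(double));
    if (in == NULL || out == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    /* With Open MPI 1.3.3 over the openib BTL this call reportedly
       never returns for counts this large. */
    MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}
```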
I asked on the Open MPI users list but got no response ...
Cheers,
Martin
--
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services
Simon Fraser University
Burnaby, British Columbia
Canada V5A 1S6
phone: 778 782-4691
fax: 778 782-4242
email: siegert at sfu.ca
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sendrecv.c
Type: text/x-c++src
Size: 1054 bytes
Desc: not available
URL: <http://www.beowulf.org/pipermail/beowulf/attachments/20091114/8a57ca20/attachment.c>