[Beowulf] Problems with HPMPI under Infiniband

Joshua mora acosta joshua_mora at usa.net
Mon Oct 2 20:09:11 PDT 2006


Hello.
I am trying to run HPMPI over Infiniband: more precisely, HPMPI v2.2 for
RHEL4 U3, with the Infiniband stack from Silverstorm.
It seems to start and execute the applications without problems.
However, termination is where I face the problems. I run into two scenarios,
but I think both have the same cause:
i) It takes a really long time to terminate after all the work is done (I
guess the problem happens in MPI_Finalize), but all processes do terminate.
ii) Some processes do not terminate and issue the following error:
'unable to unpinn memory'.
The command I issue is:
    mpirun -UDAPL -f appfile
with these environment variables set, among others:
    MPI_HASIC_UDAPL=1
    MPI_ICLIB_UDAPL=/lib64/libdat.so
My main concern is that, apart from not terminating gracefully, the job ends
up crashing the affected node.
Other clues:
I am running on AMD64 processors, on a quad socket motherboard.
I would appreciate hearing from anyone who has faced this type of problem
and knows a solution or workaround for it.

Best regards,
Joshua Mora.






