[Beowulf] Problems with HPMPI under Infiniband
Joshua mora acosta joshua_mora at usa.net
Mon Oct 2 20:09:11 PDT 2006
- Previous message: [Beowulf] Off Topic: HPC Training Courses
- Next message: [Beowulf] More cores/More processors/More nodes?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hello.

I am trying to run HPMPI under Infiniband. More precisely, HPMPI v2.2 for rhel4u3, with the Infiniband stack from Silverstorm.

The applications seem to start and execute without problems. However, termination is where I face the problems. I run into two scenarios, but I think both are the same problem:

i) It takes a really long time to terminate after all the work is done (I guess the problem happens under MPI_Finalize), but all processes do terminate.
ii) Some processes do not terminate and issue the following error: 'unable to unpinn memory'.

The command I issue is

    mpirun -UDAPL -f appfile

and there are some env variables set, like

    MPI_HASIC_UDAPL=1
    MPI_ICLIB_UDAPL=/lib64/libdat.so

My main concern is that, apart from not terminating gracefully, it ends up crashing the specific node.

Other clues: I am running on AMD64 processors, on a quad-socket motherboard.

I would appreciate hearing from anyone who has faced this type of problem and knows a solution/workaround for it.

Best regards,
Joshua Mora.
