[Beowulf] Simple MPI programs hang

Geoff Jacobs gdjacobs at gmail.com
Wed Mar 19 18:13:56 PDT 2008


Gregg Germain wrote:
> Hi everyone,
> 
> I've created a 2-node cluster running FC8.  I've installed MPICH2
> 1.0.6p1 on both (local installs, not NFS-shared).
> 
> The master, Ragnar, is a 64-bit machine; olaf is a 32-bit machine.
> 
> I set up the ring, and mpdtrace shows:
> 
> $ mpdtrace -l
> Ragnar_37601 (192.168.0.2)
> olaf_45530 (192.168.0.5)
> $
> 
> I run a VERY simple MPI program and it hangs:
> #include "mpi.h"
> #include <stdio.h>
> #include <math.h>
> #include <string.h>
> 
> int main( int argc, char *argv[] )
> {
>   MPI_Init(&argc,&argv);
>   printf("Hello!\n");
>   MPI_Finalize();
>   return 0;
> }
> 
> The program outputs the two lines for the two nodes and then hangs. I
> have to Ctrl-C out of it:
> 
> [gregg@Ragnar ~/BEOAPPS]$ mpiexec -l -n 2 mpibase
> 0: Hello!
> 1: Hello!
> 
> It would sit there forever if I didn't bail. Other simple tests work fine.
> 
> For example, a simple "hostname" test:
> 
> $ mpiexec -l -n 2 hostname
> 0: Ragnar
> 1: olaf
> $
> 
> Now I run a Hello World (no MPI):
> #include <stdio.h>
> #include <math.h>
> 
> int main(int argc,char *argv[])
> {
>   printf("\nHello World!\n");
>   return 0;
> }
> 
> $ mpiexec -l -n 2 ../HelloWorld
> 0:
> 0: Hello World!
> 1:
> 1: Hello World!
> $
> 
> Any help with this would be appreciated
> 
> Gregg

Last time I checked, MPICH2 did not support mixing machine architectures
within one job. If Ragnar is using an AMD64 build of MPICH2 and olaf is
using a build targeting IA32, you are most likely seeing an ABI conflict.

You can get around this by using a 32-bit compiler and MPICH2 library on
Ragnar, by building in a 32-bit development environment inside a chroot or
a hosted 32-bit VM image, or by simply reinstalling Ragnar as 32-bit only.

Or you can go shopping for a different MPI library. The Open MPI people
look like they're actively working on this functionality, for example.

-- 
Geoffrey D. Jacobs



