AGAIN: mpi-prog from lam -> scyld beompi DIES
Peter Beerli
beerli at genetics.washington.edu
Sat Dec 8 12:36:25 PST 2001
Some time ago I asked about a problem with my MPI program on a Scyld
Beowulf cluster and got no real response.
- Has nobody ever ported a LAM-MPI program to a Scyld Beowulf cluster?
- Did I miss the right keywords, or what information is missing?
Any hints? I am adding my post again.
Peter
On Wed, 28 Nov 2001, Peter Beerli wrote:
> Hi,
> I have a program developed using MPI-1 under LAM.
> It runs fine on several LAM-MPI clusters with different architectures.
> A user wants to run it on a Scyld Beowulf cluster, and there it fails.
> I did a few tests myself, and it seems
> that the program stalls if run on more than 3 nodes, but seems to work for
> 2-3 nodes. The program has a master-slave architecture in which the master
> is mostly idle. Some reports are sent to stdout from any node
> (but this seems to work in beompi the same way as in LAM).
> There are several things unclear to me
> because I have no clue about the beompi system, beowulf and scyld in
> particular.
>
> (1) if I run "top" why do I see 6 processes running when I start
> with mpirun -np 3 migrate-n ?
Here I received a useful response, but it does not solve my problem.
This point is settled: it is just the way MPICH handles process startup and I/O.
But do these different processes have different MPI IDs? If so, that would
be a problem.
>
> (2) The data phase stalls on the slave nodes.
> The master node reads the data from a file and then broadcasts
> a large char buffer to the slaves. Is this wrong? Is there a better way
> to do that? [I do not know in advance how big the data is, and it is a
> complex mix of strings, numbers, etc.]
>
> void
> broadcast_data_master (data_fmt * data, option_fmt * options)
> {
>   long bufsize;
>   char *buffer;
>   buffer = (char *) calloc (1, sizeof (char));
>   bufsize = pack_databuffer (&buffer, data, options);
>   MPI_Bcast (&bufsize, 1, MPI_LONG, MASTER, comm_world);
>   MPI_Bcast (buffer, bufsize, MPI_CHAR, MASTER, comm_world);
>   free (buffer);
> }
In case you wonder about the size of the buffer: it starts at one byte
and gets expanded in pack_databuffer().
>
> void
> broadcast_data_worker (data_fmt * data, option_fmt * options)
> {
>   long bufsize;
>   char *buffer;
>   MPI_Bcast (&bufsize, 1, MPI_LONG, MASTER, comm_world);
>   buffer = (char *) calloc (bufsize, sizeof (char));
>   MPI_Bcast (buffer, bufsize, MPI_CHAR, MASTER, comm_world);
>   unpack_databuffer (buffer, data, options);
>   free (buffer);
> }
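One thing that may be worth ruling out in the two functions above: MPI_Bcast
takes its count as an int, while bufsize is a long, so the value is converted
silently at the call. A minimal sketch of a guard follows. bufsize_to_count is
a hypothetical helper, not something in the program, and nothing here shows
that this is the cause of the stall.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper (not part of the original program): convert the long
   buffer size into the int count that MPI_Bcast expects.
   Returns 0 on success, -1 if the value is negative or too large for an int. */
int
bufsize_to_count (long bufsize, int *count)
{
  assert (count != NULL);
  if (bufsize < 0 || bufsize > (long) INT_MAX)
    {
      fprintf (stderr, "bufsize %ld cannot be used as an MPI count\n",
               bufsize);
      return -1;
    }
  *count = (int) bufsize;
  return 0;
}
```

Both broadcast functions could call this before the second MPI_Bcast and abort
with a message if it returns -1, which would at least turn a silent failure
into a visible one.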
>
> the master and the first node seem to read the data fine
> but the others either don't and wait or silently die.
>
> (3) What is the easiest way to debug this? With LAM I just attached gdb
> to the pids on the different nodes, but here the nodes are transparent to me
> [but as I said I have never used a beowulf cluster before].
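A common workaround when the nodes cannot be reached directly is to make each
process announce its pid and pause until a debugger attaches. The sketch below
assumes that gdb can attach to that pid from wherever the process is visible
on the cluster, which is untested on Scyld. wait_for_debugger,
debugger_attached and the WAIT_FOR_GDB environment variable are made-up names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for gethostname() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Set to 1 from inside gdb ("set var debugger_attached = 1") to continue. */
volatile int debugger_attached = 0;

/* Hypothetical helper: if the (assumed) environment variable WAIT_FOR_GDB is
   set, print this process's pid and host name to stderr and sleep until
   debugger_attached becomes nonzero.
   Returns 1 if it waited, 0 if the variable was not set. */
int
wait_for_debugger (void)
{
  char host[256];

  if (getenv ("WAIT_FOR_GDB") == NULL)
    return 0;
  if (gethostname (host, sizeof host) != 0)
    host[0] = '\0';
  host[sizeof host - 1] = '\0'; /* gethostname may not terminate the name */
  fprintf (stderr, "pid %ld on %s waiting for gdb\n", (long) getpid (), host);
  fflush (stderr);
  while (!debugger_attached)
    sleep (1);
  return 1;
}
```

Called right after MPI_Init, this would hold every process until it is
released by hand, so it is only practical for a small number of processes.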
>
>
> Can you give me pointers or hints?
>
> thanks
> Peter
>
--
Peter Beerli, Genome Sciences, Box #357730, University of Washington,
Seattle WA 98195-7730 USA, Ph:2065438751, Fax:2065430754
http://evolution.genetics.washington.edu/PBhtmls/beerli.html