[Beowulf] MPI programming question: Interleaved MPI_Gatherv?

Michael Gauckler maillists at gauckler.ch
Fri Mar 4 14:30:26 PST 2005


Dear List, 

thank you for all your replies concerning my question about interleaved
gathers. ("Interleaved" was meant in terms of memory layout, not
time of arrival of the messages.)

Yes, there is a solution to this problem: change the lower bound and
extent of the receive datatype with the help of MPI_Type_create_resized.

Through the lam-mpi mailing list I got a reply from Josh, which I would
like to share with you because it even includes the source of a demo
application (see below).

Thank you very much! Yours, 
 Michael

___


From: Josh Hursey
Date: Tue, 1 Mar 2005 09:50:43 -0500 (15:50 CET)

Yes, this can be achieved in an elegant way with MPI_Gather, but you 
need to adjust the receive datatype. You will need to create a new 
MPI_Datatype that will stride as you need it to. The trick is to shift 
the lower and upper bounds on this new strided data type so it will 
interleave values. Something like:

     /* Create a datatype to receive into. */
     MPI_Type_vector( NUM_LOCAL_ELE,  /* # of blocks */
                      1, /* # of datatypes in a block (one for this array) */
                      gsize, /* Stride between successive blocks */
                      MPI_CHAR, /* Type of each block */
                      &old_type);
     MPI_Type_commit( &old_type);

     /* Resize the type to allow interleaving,
      * so make it only one MPI_CHAR wide
      */
     MPI_Type_create_resized(old_type,
                             0, /* Lower Bound */
                             1, /* Extent in bytes: one MPI_CHAR wide */
                             &new_type);
     MPI_Type_commit( &new_type);

Then use the new_type as the receive type argument to the MPI_Gather 
function. I attached a sample code that does exactly this, and produces 
the following output:
$ mpirun -np 4 gather_interleave
Rank 0  A       A       A       A       A       A       A       A       A       A       A       A
Rank 1  B       B       B       B       B       B       B       B       B       B       B       B
Rank 2  C       C       C       C       C       C       C       C       C       C       C       C
Rank 3  D       D       D       D       D       D       D       D       D       D       D       D
Final:
        A       B       C       D       A       B       C       D       A       B       C       D
        A       B       C       D       A       B       C       D       A       B       C       D
        A       B       C       D       A       B       C       D       A       B       C       D
        A       B       C       D       A       B       C       D       A       B       C       D

Hope this helps.

Josh

-------<Code>-------------
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#include <unistd.h>   /* sleep */
#include <mpi.h>

#define NUM_LOCAL_ELE  12

int main(int argc, char *argv[]){
     int rank, gsize, i, j;
     char local_array[NUM_LOCAL_ELE];
     char *collected_array = NULL; /* only allocated on rank 0 */
     MPI_Datatype new_type, old_type;

     /* Initialize */
     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &gsize);

     /* Create a datatype to receive into. */
     MPI_Type_vector( NUM_LOCAL_ELE,  /* # of blocks */
                      1, /* # of datatypes in a block (one for this array) */
                      gsize, /* Stride between successive blocks */
                      MPI_CHAR, /* Type of each block */
                      &old_type);
     MPI_Type_commit( &old_type);

     /* Resize the type to allow interleaving,
      * so make it only one MPI_CHAR wide
      */
     MPI_Type_create_resized(old_type,
                             0, /* Lower Bound */
                             1, /* Extent in bytes: one MPI_CHAR wide */
                             &new_type);
     MPI_Type_commit( &new_type);

     /* Initialize local array with characters:
      * Rank 0 = A A A...
      * Rank 1 = B B B...
      * Rank 2 = C C C...
      * ...
      */
     for(i = 0; i < NUM_LOCAL_ELE; ++i ) {
         local_array[i] = 'A' + rank;
     }

     /* Print out local array */
     sleep(rank * 1);
     printf("Rank %d", rank);
     for(i = 0; i <  NUM_LOCAL_ELE;  ++i) {
         printf("\t%c", local_array[i]);
     }
     printf("\n");

     if(rank == 0)
         collected_array = (char *)malloc(gsize * NUM_LOCAL_ELE * sizeof(char));

     MPI_Gather( local_array, NUM_LOCAL_ELE, MPI_CHAR, collected_array,
                 1, new_type, 0, MPI_COMM_WORLD);

     /* Print out Gathered array */
     if(rank == 0) {
         printf("Final:\n");
         for(i = 0; i <  gsize;  ++i) {
             for(j = 0; j < NUM_LOCAL_ELE; ++j) {
                 printf("\t%c", collected_array[i*NUM_LOCAL_ELE+j]);
             }
             printf("\n");
         }
     }

     if (rank == 0)
         free(collected_array);

     /* Release the derived datatypes */
     MPI_Type_free(&new_type);
     MPI_Type_free(&old_type);

     MPI_Finalize();

     return 0;
}
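
To see why shrinking the extent to one byte interleaves the data, the
placement rule can be mimicked in plain C without MPI. This is only a
sketch under assumptions: simulate_gather is a hypothetical helper, not
an MPI call. It models the root as placing rank r's contribution at
r times the extent of the receive type, with element j a further
j times the stride along.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of MPI): mimic where a gather with
 * receive count 1 and a strided receive type puts each element.
 *   send_all     - all local arrays back to back, rank 0 first
 *   recv         - receive buffer on the root
 *   stride_bytes - distance between successive elements of one rank
 *   extent_bytes - distance between the starting points of two ranks
 */
void simulate_gather(const char *send_all, char *recv,
                     int nprocs, int nlocal,
                     size_t stride_bytes, size_t extent_bytes)
{
    assert(nprocs > 0 && nlocal > 0);
    for (int r = 0; r < nprocs; ++r) {
        for (int j = 0; j < nlocal; ++j) {
            /* rank r starts at r * extent, element j is j strides further */
            recv[(size_t)r * extent_bytes + (size_t)j * stride_bytes] =
                send_all[(size_t)r * nlocal + j];
        }
    }
}
```

With 4 ranks, 3 elements each, a stride of 4 and an extent of 1, the
input "AAABBBCCCDDD" is placed as "ABCDABCDABCD". Without the resize,
the extent of the vector type would be (nlocal - 1) * stride + 1 bytes,
so rank 1 would start only after rank 0's last element.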






On Tuesday, 01.03.2005, at 07:44 +0100, Michael Gauckler wrote:
> Dear List, 
> 
> I would like to gather the data from several processes. 
> Instead of the commonly used stride, I want to interleave 
> the data:
> 
> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> Rank 1: BBBBB ----^---^---^---^---^
> Rank 2: CCCCC -----^---^---^---^---^
> Rank 3: DDDDD ------^---^---^---^---^
> 
> Since the stride of the receive type is indicated 
> in multiples of its mpi_type, no interleaving is 
> possible (the smallest striping factor leads to 
> AAAAABBBBBCCCCCDDDDD).
> 
> Is there a way to achieve this behaviour in an 
> elegant way, as MPI_Gather promises it? Or do
> I need to do Send/Recv with self-aligned offsets?
> 
> Thank you for your help!
> 
>  Michael
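
For comparison, placing the data by hand, as the question above
considers, could look roughly like the sketch below. Note that this is
a different technique from the resized-datatype approach: it assumes
the data was first gathered in plain rank order (AAAAABBBBB...) and is
then rearranged locally on the root, which costs a second buffer and an
extra copy. The name interleave_blocks is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: turn a rank-ordered buffer (AAAAA BBBBB ...)
 * into an interleaved one (ABCD ABCD ...). The two buffers must not
 * overlap; each holds nprocs * nlocal chars.
 */
void interleave_blocks(const char *contig, char *out,
                       int nprocs, int nlocal)
{
    assert(contig != out);
    for (int r = 0; r < nprocs; ++r) {
        for (int j = 0; j < nlocal; ++j) {
            /* element j of rank r moves to slot j * nprocs + r */
            out[(size_t)j * nprocs + r] = contig[(size_t)r * nlocal + j];
        }
    }
}
```

Called with 4 ranks and 5 elements each on "AAAAABBBBBCCCCCDDDDD", this
yields "ABCDABCDABCDABCDABCD", the layout asked for in the question.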




