<br><br>
<div><span class="gmail_quote">On 2/21/08, <b class="gmail_sendername">Mark Hahn</b> <<a href="mailto:hahn@mcmaster.ca">hahn@mcmaster.ca</a>> wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">> submit the jobs through a job scheduler (LSF in this case). We used the<br>> machinefile option with mpirun to order the nodes on which the processes has<br>
> to be started.<br>><br>> But i am not able to do this with the current setup where LSF is used for<br>> scheduling and SLURM for resource management.<br>> I have tried a few of the options like using the -m options to bsub for<br>
> specifying the preference and so on. But of no success.<br><br>this sounds like our HP-XC systems. but I'm a bit mystified:<br>you can get the node assignment from LSF, and then use srun -m hostfile<br>to force slurm to set up the rank-node mappings as you like.<br>
(note: not -m to LSF.) did you try that?<br></blockquote></div><br>Yes, it is an HP-XC system, and I have also tried the -m option to srun.
<div> </div>
<div><em>This is what I tried, using a sample MPI program that prints each rank and the node it runs on:</em></div>
<div>
<p><em>#include "stdio.h"<br>#include "mpi.h"</em></p>
<p><em>int main(int argc, char *argv[]) {</em></p>
<p><em>int rank, size, len;<br>char name[MPI_MAX_PROCESSOR_NAME]; /* buffer size required by MPI_Get_processor_name */</em></p>
<p><em>MPI_Init(&argc, &argv);</em></p>
<p><em>MPI_Comm_size(MPI_COMM_WORLD,&size);<br>MPI_Comm_rank(MPI_COMM_WORLD,&rank);<br>MPI_Get_processor_name(name,&len);</em></p>
<p><em>printf("This is %d out of %d: %s \n", rank,size,name);<br>MPI_Finalize();</em></p>
<p><em>return 0;</em></p>
<p><em>}</em><br></p>
<p>I submitted it to LSF with:</p>
<p><em> bsub -n 4 -e errfile -ext "SLURM[nodelist=n2,n1,n4,n3]" /opt/hpmpi/bin/mpirun -srun -m hostfile ./a.out</em></p>
<p>The environment variable SLURM_HOSTFILE was set to the path of the hostfile, which lists the nodes in the order in which the ranks should be placed: n2, n1, n4, n3.</p>
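<p>For reference, the hostfile setup described above can be sketched roughly as follows. This is only an illustration: the one-node-per-line layout and the location of the file in the current directory are assumptions, and the loop at the end merely prints the rank order I was hoping to get.</p>

```shell
#!/bin/sh
# Assumption: the hostfile holds one node name per line, in the desired
# rank order. The node names and the file name "hostfile" are the ones
# used in this post; the current directory as its location is assumed.
printf '%s\n' n2 n1 n4 n3 > hostfile

# Export the absolute path so the job can find the file from any
# working directory.
SLURM_HOSTFILE="$(pwd)/hostfile"
export SLURM_HOSTFILE

# Print the rank-to-node order the hostfile is meant to produce.
rank=0
while read -r node; do
    echo "rank $rank -> $node"
    rank=$((rank + 1))
done < "$SLURM_HOSTFILE"
```

<p>With this in place, rank 0 is expected on n2, rank 1 on n1, rank 2 on n4 and rank 3 on n3.</p>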
<p>I got the following error in my error file:</p>
<p><em>a.out: MPI_Init: node to rank map is not correct myrank :0 mynode:1<br>a.out: MPI_Init: node to rank map is not correct myrank :1 mynode:0<br>a.out: MPI_Init: MPI_MPIRUN has wrong nodemap format<br>a.out: MPI_Init: MPI_MPIRUN has wrong nodemap format<br>
a.out: MPI_Init: Cannot set srun startup protocol<br>a.out: MPI_Init: node to rank map is not correct myrank :3 mynode:2<br>a.out: MPI_Init: MPI_MPIRUN has wrong nodemap format<br>a.out: MPI_Init: Cannot set srun startup protocol<br>
a.out: MPI_Init: Cannot set srun startup protocol<br>srun: error: n2: task0: Exited with exit code 1<br>a.out: MPI_Init: node to rank map is not correct myrank :2 mynode:3<br>a.out: MPI_Init: MPI_MPIRUN has wrong nodemap format<br>
a.out: MPI_Init: Cannot set srun startup protocol<br>srun: Terminating job</em></p></div><br clear="all"><br>-- <br>Best Regards,<br>Balamurugan. R