[Beowulf] Determining PID of a multi-processor PBS job ; mpi cleanup of tasks
Rayson Ho
rayson2 at eseenet.com
Tue Jul 6 17:45:59 PDT 2004
You can use either of these:
1) mpiexec
It uses the PBS TM interface to control the MPI jobs, and you can easily add
code to clean up the shared memory segments and semaphores.
2) The mpicleanup program:
http://bellatrix.pcl.ox.ac.uk/~ben/pbs/mpicleanup.c
"Cleanup of MPICH/PBS jobs"
http://bellatrix.pcl.ox.ac.uk/~ben/pbs/
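To make the shared memory and semaphore cleanup step concrete, here is a minimal Python sketch. It is not taken from mpiexec or mpicleanup.c. It assumes a Linux node that exposes /proc/sysvipc/shm and /proc/sysvipc/sem with a header row naming the columns (uid, nattch, shmid, semid), and that the user has no other job on the node that still needs its semaphores. The function names are hypothetical, and the code only builds ipcrm command lines; it does not run them.

```python
def parse_sysvipc(text):
    """Parse a /proc/sysvipc/{shm,sem} table into a list of dicts.

    The first non-empty line is treated as the header, so columns are
    looked up by name rather than by position.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def cleanup_commands(shm_text, sem_text, uid):
    """Build ipcrm command lines for IPC objects owned by `uid`.

    Shared memory segments are selected only when no process is attached
    (nattch == 0). Semaphore sets have no attach count, so every set owned
    by `uid` is selected; this is only safe under the assumption that the
    user has nothing else running on the node.
    """
    commands = []
    for seg in parse_sysvipc(shm_text):
        if int(seg["uid"]) == uid and int(seg["nattch"]) == 0:
            commands.append(["ipcrm", "-m", seg["shmid"]])
    for sem in parse_sysvipc(sem_text):
        if int(sem["uid"]) == uid:
            commands.append(["ipcrm", "-s", sem["semid"]])
    return commands


def dry_run(uid):
    """Read the live tables on this node and print the commands, without running them."""
    with open("/proc/sysvipc/shm") as f:
        shm_text = f.read()
    with open("/proc/sysvipc/sem") as f:
        sem_text = f.read()
    for cmd in cleanup_commands(shm_text, sem_text, uid):
        print(" ".join(cmd))
```

Calling dry_run(os.getuid()) on each node of the job (for example from a PBS epilogue script, if your site uses one) prints the candidate ipcrm commands for review before anything is removed.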
Rayson
>I was trying to write a script that cleans up orphaned
>mpi tasks, their associated shared memory segments and
>semaphores of a particular mpi job submitted via PBS
>(actually openpbs).
>
>For this I need to determine the PID associated with
>the particular PBS MPI job. Let's say the MPI job runs
>on 5 processors, I am able to parse through the
>several <proc_id> in the /proc directory on the 1st
>processor, search for the variable PBS_JOBNAME in the
>"/proc/<proc_id>/environ" file and determine the
>corresponding parent id of the pbs job. But, the same
>trick does not work on the other processors.
>
>Question: Is there an easy way of determining all
>the PPIDs on the different hosts corresponding to a
>multi-processor PBS MPI job?
>
>Help: If someone can actually mail me an mpi-cleanup
>script (and install instructions) that cleans up all
>orphaned mpi processes and their associated shm and
>semaphores, it will save me a lot of trouble writing
>the script. And I will be indebted to you for life.
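The /proc scan described in the quoted question can be sketched roughly as follows. This is an illustration only, with hypothetical function names, assuming a Linux /proc layout and read permission on the environ files (normally your own processes, or all processes when run as root). It only sees processes on the host where it runs, so it would have to be run on every node of the job, and it only finds processes whose environment actually contains the PBS variable.

```python
import os


def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def parse_ppid(stat_text):
    """Extract the parent PID from the contents of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces or ')',
    # so split on the last ')'. The fields after it are: state, ppid, ...
    fields = stat_text.rsplit(")", 1)[1].split()
    return int(fields[1])


def find_job_processes(variable, value, proc_root="/proc"):
    """Return sorted (pid, ppid) pairs for processes whose environment
    has `variable` set to `value`, e.g. ("PBS_JOBNAME", "myjob")."""
    matches = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        base = os.path.join(proc_root, name)
        try:
            with open(os.path.join(base, "environ"), "rb") as f:
                env = parse_environ(f.read())
            if env.get(variable) != value:
                continue
            with open(os.path.join(base, "stat")) as f:
                ppid = parse_ppid(f.read())
        except (OSError, IndexError, ValueError):
            continue  # process exited, or we lack permission to read it
        matches.append((int(name), ppid))
    return matches if not matches else sorted(matches)
```

For example, find_job_processes("PBS_JOBNAME", "myjob") lists the matching PIDs and their parents on the local node. The proc_root parameter exists only so the function can be pointed at a test directory.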