[Beowulf] Determining PID of a multi-processor PBS job ; mpi cleanup of tasks

Shriram R shriram1976 at yahoo.com
Mon Jul 5 10:39:02 PDT 2004


I am trying to write a script that cleans up orphaned
MPI tasks, along with their associated shared memory
segments and semaphores, for a particular MPI job
submitted via PBS (actually OpenPBS).

For this I need to determine the PID associated with
the particular PBS MPI job.  Let's say the MPI job runs
on 5 processors.  On the 1st processor I am able to
parse through the several <proc_id> entries in the
/proc directory, search for the variable PBS_JOBNAME in
the "/proc/<proc_id>/environ" file, and determine the
corresponding parent ID of the PBS job.  But the same
trick does not work on the other processors.
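
To make the /proc scan concrete, here is a minimal sketch of what I
do on the 1st processor.  The function names parse_environ and
find_job_pids are only illustrative, and the proc_root parameter is
an assumption added so the scan can be pointed at a test directory;
it is a sketch, not a finished cleanup tool.

```python
import os


def parse_environ(raw: bytes) -> dict:
    """Split the NUL-separated contents of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_job_pids(job_name: str, proc_root: str = "/proc") -> list:
    """Return the sorted PIDs whose environment has PBS_JOBNAME == job_name."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, name, "environ"), "rb") as fh:
                env = parse_environ(fh.read())
        except OSError:
            continue  # process exited, or we may not read its environ
        if env.get("PBS_JOBNAME") == job_name:
            pids.append(int(name))
    return pids if not pids else sorted(pids)
```

Calling find_job_pids("myjob") on a node returns only the processes
that still carry PBS_JOBNAME in their environment, and only those
whose environ file the calling user is allowed to read.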

Question: Is there an easy way of determining all
the PPIDs on the different hosts corresponding to a
multi-processor PBS MPI job?

Help: If someone can actually mail me an mpi-cleanup
script (and install instructions) that cleans up all
orphaned MPI processes and their associated shm
segments and semaphores, it will save me a lot of
trouble writing the script.  And I will be indebted to
you for life.
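
For reference, a rough sketch of the shared-memory half of such a
cleanup.  It assumes a Linux node where /proc/sysvipc/shm is
available with a header row naming the shmid, uid and nattch
columns.  The function names are illustrative, and treating
"owned by this uid with nothing attached" as orphaned is itself an
assumption, since it cannot tell which job created a segment.  The
sketch only builds the ipcrm commands and does not run them.

```python
def orphaned_shm_ids(table_text: str, uid: int) -> list:
    """Return shmids owned by uid that have no attached processes."""
    lines = table_text.strip().splitlines()
    if not lines:
        return []
    header = lines[0].split()
    col = {name: i for i, name in enumerate(header)}  # column name -> index
    ids = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < len(header):
            continue  # skip short or malformed rows
        owned = int(fields[col["uid"]]) == uid
        unattached = int(fields[col["nattch"]]) == 0
        if owned and unattached:
            ids.append(int(fields[col["shmid"]]))
    return ids


def dry_run_commands(shm_ids: list) -> list:
    """Build (but do not execute) one 'ipcrm -m <shmid>' command per segment."""
    return ["ipcrm -m %d" % shmid for shmid in shm_ids]
```

On a node this would be fed with open("/proc/sysvipc/shm").read()
and os.getuid().  Semaphores would need similar handling, and are
not covered here.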

Thanks in advance,

