preemption in PBSPro with MPICH on Linux

Gary Stiehr gary at umsl.edu
Thu Apr 25 11:48:24 PDT 2002


Hi,

Does anyone have experience using the preemption feature of PBSPro with 
MPICH jobs on Linux clusters?  I believe the release notes for PBSPro 
5.2 say that it sends SIGSTOP only to the process that PBS started 
(i.e., the mpirun process).  Therefore, if that process has started 
other processes (as is the case with my MPICH job), those other 
processes will continue to run.  Does anyone know of a way to suspend 
all processes started from an MPICH job?
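
To illustrate the kind of thing I am after, here is a minimal Python 
sketch that signals a whole process group rather than a single PID.  The 
helper names are hypothetical and this is not something PBSPro provides. 
It assumes that mpirun and the processes it starts stay in the same 
process group, and it can only reach processes on the local node; MPICH 
processes started on other nodes would not be affected.

```python
import os
import signal


def suspend_group(pgid):
    """Send SIGSTOP to every process in the process group `pgid`.

    Hypothetical helper: it assumes all of the job's local processes
    share one process group, which may not hold for every MPICH device.
    """
    os.killpg(pgid, signal.SIGSTOP)


def resume_group(pgid):
    """Send SIGCONT to every process in the process group `pgid`."""
    os.killpg(pgid, signal.SIGCONT)
```

Given the PID of mpirun, the group ID would come from os.getpgid(pid).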

I need to do this because some MPICH jobs run for several weeks, and 
smaller jobs submitted in the meantime would have to wait for them to 
finish if I do not use preemption.  I suppose another method would be to 
make sure that the long MPICH job checkpoints and then just have PBS 
kill the job after a certain amount of time, so that it can be 
resubmitted and resume from its last checkpoint.
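
As a rough sketch of that checkpoint-and-restart idea, assuming the 
application can save and reload its own state: the function name, the 
JSON file format, and the stop_after parameter (which merely stands in 
for PBS killing the job) are all made up for illustration.

```python
import json
import os


def run(total_steps, ckpt_path, stop_after=None):
    """Run from the last checkpoint and return the last completed step."""
    step = 0
    if os.path.exists(ckpt_path):
        # Resume from where the previous run left off.
        with open(ckpt_path) as f:
            step = json.load(f)["step"]

    done_this_run = 0
    while step < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            break  # stands in for the job being killed by the scheduler
        step += 1  # placeholder for one unit of real work
        done_this_run += 1
        # Write to a temporary file and rename it, so that a kill in the
        # middle of a write cannot leave a truncated checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"step": step}, f)
        os.replace(tmp_path, ckpt_path)
    return step
```

A first run that is cut short leaves the checkpoint file behind, and a 
second run with the same ckpt_path continues from that step rather than 
from zero.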

Any experiences and/or suggestions would be appreciated.

Thanks,
Gary Stiehr
Information Technology Services
University of Missouri - St. Louis
gary at umsl.edu