[Beowulf] the solution for qdel fail.....
Angel de Vicente
angelv at iac.es
Tue Jan 18 00:14:10 PST 2005
Hi Chris,
Chris Samuel writes:
> On Tue, 11 Jan 2005 02:49 am, Jerry Xu wrote:
>
> > Hi, William, Thanks for your information. Just in case somebody still
> > needs it for openPBS configuration, here is my epilogue file. It should be
> > located in $pbshome/mom_priv/ on each node, and it needs to be set as
> > executable and owned by root. Others may have better epilogue
> > scripts...
>
> Hmm, the only thing that worries me about that is that for those of us with
> SMP clusters it is possible for a user to have two different jobs running on
> the same node, one per CPU, so an epilogue script that kills all of a user's
> processes on a node would accidentally kill an innocent job.
We have an SMP cluster, and to avoid killing innocent processes we use the
script from the section "Cleanup of MPICH/PBS jobs" at
http://bellatrix.pcl.ox.ac.uk/%7Eben/pbs/
It doesn't always work, and some jobs are occasionally left lingering (some day
I hope to have the time to find out why), but at least it doesn't kill
innocent jobs.
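To illustrate the general idea of killing only the finished job's processes
rather than everything the user owns on the node, here is a minimal Python
sketch. It is not the script from the page above. It assumes the epilogue is
given the job's session ID (in OpenPBS this is, as far as I know, the fifth
argument to the epilogue, but check your version), and the function names
pids_in_session, read_proc_table and kill_job_session are made up for this
example.

```python
import os
import signal


def pids_in_session(proc_table, session_id):
    """Return the PIDs whose session ID equals the job's session ID.

    proc_table maps PID -> session ID. A second job of the same user
    runs in a different session, so its PIDs are never selected.
    """
    return sorted(pid for pid, sid in proc_table.items() if sid == session_id)


def read_proc_table():
    """Build a PID -> session ID map from /proc (Linux only)."""
    table = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        pid = int(entry)
        try:
            table[pid] = os.getsid(pid)
        except OSError:
            continue  # process exited or is not accessible
    return table


def kill_job_session(session_id, sig=signal.SIGTERM):
    """Signal every process in the given session except this script."""
    own_pid = os.getpid()
    for pid in pids_in_session(read_proc_table(), session_id):
        if pid == own_pid:
            continue
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone
```

An epilogue built on this would call kill_job_session() with the session ID it
was handed. Note that a process which has started its own session is not
matched by this test, so a sketch like this can also leave processes behind.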
Hope it helps. Cheers,
Angel de Vicente
--
----------------------------------
http://www.iac.es/galeria/angelv/
PostDoc Software Support
Instituto de Astrofisica de Canarias