[Beowulf] the solution for qdel fail.....
Jerry Xu
jerry at oban.biosc.lsu.edu
Thu Jan 6 12:33:39 PST 2005
Hey, Huang:
I found a solution that works for me; you can try it and see
whether it works for you.
In your PBS script, add the "--gm-kill 5" option to the mpirun command,
between the processor count and your program,
like this:
mpirun -machinefile $PBS_NODEFILE -np $NPROCS --gm-kill 5 myprogram
it works for me.
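
For context, here is a minimal sketch of a complete PBS job script built around that line. It assumes mpirun is the MPICH-GM (Myrinet) launcher, where, as far as I know, "--gm-kill N" asks mpirun to kill the remaining processes N seconds after the first one exits. The job name, the resource request, the count_procs helper and ./myprogram are placeholders for illustration, not something taken from your setup.

```shell
#!/bin/sh
#PBS -N myjob
#PBS -q dque
#PBS -l nodes=2:ppn=2

# Hypothetical helper: count the processors PBS allocated.
# The PBS node file has one line per allocated processor.
count_procs() {
    wc -l < "$1" | tr -d ' '
}

# Launch only when running under PBS (PBS_NODEFILE is set by the server).
if [ -n "${PBS_NODEFILE:-}" ]; then
    cd "${PBS_O_WORKDIR:-.}" || exit 1
    NPROCS=$(count_procs "$PBS_NODEFILE")
    # --gm-kill 5: assumed MPICH-GM option, see the note above.
    mpirun -machinefile "$PBS_NODEFILE" -np "$NPROCS" --gm-kill 5 ./myprogram
fi
```

Submit it with qsub as usual. If your mpirun is not the MPICH-GM one, it will most likely reject the option, so check "mpirun --help" first.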
Jerry.
/**********************************************************
Hi,
We have a new system set up. The vendor set up the PBS for us. For
administration reasons, we created a new queue "dque" (set to default)
using the "qmgr" command:
create queue dque queue_type=e
s q dque enabled=true, started=true
I was able to submit jobs using the "qsub" command to queue "dque".
However, when I use "qdel" to kill a job, the job disappears from the
job list shown by "qstat -a", but the executable is still running on
the compute nodes. Every time, I have to log in to the corresponding
compute node and kill the running job.
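
The manual cleanup described above could be scripted roughly as follows. This is only a sketch: the cleanup_nodes helper, the DRY_RUN switch and the node names are made up for illustration, and it assumes ssh access to the compute nodes, the same user name on every node, and a pkill that supports -u and -x. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: kill leftover copies of a program on a list of nodes.
# Usage: cleanup_nodes PROGRAM NODE...
# With DRY_RUN=1 (the default) the ssh commands are printed, not executed.
cleanup_nodes() {
    prog=$1
    shift
    user=$(id -un)   # assumes the same user name on the compute nodes
    for node in "$@"; do
        # -x matches the process name exactly, so the remote shell that
        # runs pkill is not matched by accident.
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $node pkill -u $user -x $prog"
        else
            ssh "$node" pkill -u "$user" -x "$prog"
        fi
    done
}
```

For example, "cleanup_nodes myprogram node01 node02" prints the two ssh commands; setting DRY_RUN=0 beforehand actually runs them.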
I am wondering if I missed something in setting up the queue so that I
am unable to kill the job completely using "qdel".
Thanks.