[Beowulf] the solution for qdel fail.....


Jerry Xu jerry at oban.biosc.lsu.edu
Thu Jan 6 12:33:39 PST 2005


Hey, Huang:

  I found one solution that works for me. Maybe you can try it and see
whether it works for you.

In your PBS script, try adding the "--gm-kill 5" option between the
processor count and your program name, like this:

mpirun -machinefile $PBS_NODEFILE -np $NPROCS --gm-kill 5 myprogram

it works for me.
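
To show where that line sits in a whole job script, here is a minimal
sketch. It assumes the mpirun is the Myrinet GM (MPICH-GM) one, since
that is where the --gm-kill option comes from. The job name, the
resource request, and the count_procs helper are made up for
illustration; the queue name and "myprogram" come from the messages in
this thread. Adjust everything to your site.

```shell
#!/bin/sh
# Hypothetical job name and resource request; the queue name comes from
# the quoted message below.
#PBS -N myjob
#PBS -q dque
#PBS -l nodes=2:ppn=2

# Count allocated processors: PBS writes one line per processor
# to the node file.
count_procs() {
    wc -l < "$1" | tr -d ' '
}

# Only launch when running under PBS, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    NPROCS=$(count_procs "$PBS_NODEFILE")
    # --gm-kill 5 goes between the processor count and the program name.
    mpirun -machinefile "$PBS_NODEFILE" -np "$NPROCS" --gm-kill 5 myprogram
fi
```

Submit it with qsub as usual. Outside of a PBS job the script does
nothing, because PBS_NODEFILE is not set.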

Jerry.

/**********************************************************
Hi,

We have a new system set up. The vendor set up the PBS for us. For
administration reasons, we created a new queue "dque" (set to default)
using the "qmgr" command:

create queue dque queue_type=e
s q dqueue enabled=true, started=true

I was able to submit jobs using the "qsub" command to queue "dque".
However, when I use "qdel" to kill a job, the job disappears from the
job list shown by "qstat -a", but the executable is still running on
the compute nodes. Every time, I have to log in to the corresponding
compute node and kill the running job.

I am wondering if I missed something in setting up the queue so that I
am unable to kill the job completely using "qdel".

Thanks.



