[Beowulf] [EXTERNAL] Re: PBS question
Ellis H. Wilson III
ellis at ellisv3.com
Tue Oct 29 14:26:03 PDT 2019
On 10/29/19 4:49 PM, Lux, Jim (US 337K) via Beowulf wrote:
> True, there’s tons of info in qstat -f, however, doesn’t qstat stop
> showing my job after it completes, though? Maybe there’s a switch that
> retrieves “last data”?
Hi Jim,
I think you're looking for tracejob. Without sufficient permissions you
won't be able to read the accounting logs, but you should still get the
info you need from the other logs it queries.
Here's real usage of it, albeit snipped extensively. It shows memory
and cpu usage at the end, though it won't say how many cores you used.
IMHO that's something you design for. If you find cpu usage to be way
lower than runtime, and your code scales out to the number of cores
available, you can request fewer cores until your cpu time roughly
approximates your runtime.
ellisw at snip ~ $ sudo tracejob -n1 2100762.snip.panasas.com
/var/spool/torque/mom_logs/20191029: No matching job records located
/var/spool/torque/sched_logs/20191029: No such file or directory
Job: 2100762.snip.panasas.com
10/29/2019 16:33:32 S enqueuing into route, state 1 hop 1
10/29/2019 16:33:32 S dequeuing from route, state QUEUED
10/29/2019 16:33:32 S enqueuing into eng, state 1 hop 1
10/29/2019 16:33:32 S Job Queued at request of
snip at snip.panasas.com, owner = snip at snip.panasas.com, job name =
pr_one_run, queue = eng
10/29/2019 16:33:32 A queue=route
10/29/2019 16:33:32 A queue=eng
10/29/2019 17:16:03 S Job Run at request of root at snip.panasas.com
10/29/2019 17:16:06 S Not sending email: job requested no e-mail
10/29/2019 17:16:06 A user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=snip at snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00
10/29/2019 17:17:17 S Not sending email: job requested no e-mail
10/29/2019 17:17:17 S Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb resources_used.vmem=2817552kb
resources_used.walltime=00:01:14
10/29/2019 17:17:17 A user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=snip at snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00 session=21205 end=1572383837
Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb
resources_used.vmem=2817552kb resources_used.walltime=00:01:14
10/29/2019 17:17:18 S dequeuing from eng, state COMPLETE
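If you'd rather not eyeball the cput-versus-walltime comparison, here is a minimal Python sketch that pulls the resources_used fields out of a record like the one above and computes the ratio. The helper names (hms_to_seconds, parse_resources_used, cpu_to_walltime_ratio) are made up for illustration and are not part of Torque, and the sketch assumes the key=value layout and HH:MM:SS times shown in this particular output.

```python
import re

# Text of the final "S" record from the tracejob output above, joined
# onto one logical line. Assumption: other jobs use the same key=value layout.
RECORD = (
    "Exit_status=0 resources_used.cput=00:00:11 "
    "resources_used.mem=1092436kb resources_used.vmem=2817552kb "
    "resources_used.walltime=00:01:14"
)


def hms_to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_resources_used(text):
    """Return the resources_used.* key=value pairs found in text as a dict."""
    return dict(re.findall(r"resources_used\.(\w+)=(\S+)", text))


def cpu_to_walltime_ratio(text):
    """Return cput divided by walltime for one record (hypothetical helper)."""
    used = parse_resources_used(text)
    return hms_to_seconds(used["cput"]) / hms_to_seconds(used["walltime"])


if __name__ == "__main__":
    print(parse_resources_used(RECORD))
    print(f"cput/walltime = {cpu_to_walltime_ratio(RECORD):.2f}")
```

For the job above that is 11 s of cpu time over 74 s of walltime, so roughly 0.15 on the single core (ppn=1) it requested.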
Best,
ellis
--
Ellis H. Wilson III, Ph.D.
www.ellisv3.com