[Beowulf] job scheduler and accounting question
Joe Landman
landman at scalableinformatics.com
Tue Jul 14 12:06:18 PDT 2015
Hi folks:
It's been a few years since we've had a good use case for a job
scheduler, and I'll freely admit I've not paid nearly enough attention
to what is currently out there.
We are investigating options for a cluster/cloud scenario where I
need to keep track of CPU, memory, and disk usage during the runs. This
usage data should be available via command line tools (preferably
in JSON/XML/CSV output that I can easily parse).
The last time we did anything in this space, I used Torque and wrote
my own account summary tool:
https://scalability.org/2011/03/quick-accounting-tool-for-torque/ , and
prior to that, I did something similar for SGE:
https://arc.liv.ac.uk/pipermail/gridengine-users/2006-October/011846.html
Main requirements on the scheduler are
a) shell access. We need to be able to quickly launch a shell and
limit CPU/memory usage. Cgroup control/monitoring would be terrific.
b) the aforementioned accounting/usage bits. Happy to write my own data
extractor (likely will need to for this project anyway) as long as I can
get the data via CLI/API/...
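
To make (b) concrete, here is a rough sketch of the kind of extractor I have in mind. It assumes the scheduler can dump per-job records as pipe-delimited text with a header row; the column names (User, JobID, CPUTimeRAW, MaxRSS) are loosely modeled on what Slurm's `sacct --parsable2` can emit, and both the sample rows and the helper names (`to_bytes`, `summarize`) are made up for illustration. Real output from any given scheduler will likely need extra handling.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample records; the layout is an assumption, not real scheduler output.
SAMPLE = """\
User|JobID|CPUTimeRAW|MaxRSS
alice|101|3600|2048K
alice|102|1800|1M
bob|103|7200|512K
"""

# Binary multipliers for size suffixes such as "2048K" or "1M".
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a size string like '2048K' to bytes; empty strings count as 0."""
    text = text.strip()
    if not text:
        return 0
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(float(text[:-1]) * _UNITS[suffix])
    return int(float(text))


def summarize(stream):
    """Aggregate job count, total CPU seconds, and peak memory per user."""
    totals = defaultdict(
        lambda: {"jobs": 0, "cpu_seconds": 0, "max_rss_bytes": 0}
    )
    for row in csv.DictReader(stream, delimiter="|"):
        entry = totals[row["User"]]
        entry["jobs"] += 1
        entry["cpu_seconds"] += int(row["CPUTimeRAW"] or 0)
        entry["max_rss_bytes"] = max(
            entry["max_rss_bytes"], to_bytes(row["MaxRSS"])
        )
    return dict(totals)


if __name__ == "__main__":
    # Emit JSON so downstream tools can parse the summary easily.
    print(json.dumps(summarize(io.StringIO(SAMPLE)), indent=2, sort_keys=True))
```

In practice the `io.StringIO(SAMPLE)` stream would be replaced with the captured stdout of whatever accounting command the chosen scheduler provides.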
Ones I think I should be looking at include:
1) SLURM
2) OpenLava
3) Torque
What else? Has the gridengine mess ever been sorted out? And on a
related note, are there any updated pages listing pros/cons of the
modern implementations of these? Again, I've not paid attention to
schedulers for a while, so things may have changed a bit in a few years ...
Thx!
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615