[Beowulf] picking out a job scheduler

Robert G. Brown rgb at phy.duke.edu
Tue Jan 2 15:44:50 PST 2007

On Tue, 2 Jan 2007, Chris Dagdigian wrote:

>> (3) It's likely that in the future I'll have part-time access to another 
>> cluster of dual-boot (XP/linux) machines.  The machines will default to 
>> booting to Linux, but will occasionally (5-20 hours a week) be used as 
>> windows workstations by a console user (when a user is finished, they'll 
>> restart the machine and it will boot back to linux).  If cluster nodes are 
>> available in this sort of unpredictable and intermittent way, can they be 
>> used as compute nodes in some fashion? Will gridengine/PBS/??? take care of 
>> this sort of process migration?
> Grid Engine will not transparently preserve and migrate running jobs off of 
> machines that get bounced suddenly.  This sort of transparent and automatic 
> checkpointing and migration is actually pretty hard to do in practice.  If 
> you know in advance which machines are going to be shut down and rebooted 
> into windows then there are tools in all the common scheduling packages for 
> "draining" a particular machine or queue.  You can also "kill and reschedule"

For what it is worth, the current generation of Condor can, for some
code that is linked with its own migration library, permit transparent
checkpointing and code migration.  It also has a very complex
"policy" engine that lets one specify in great detail how to turn jobs on
and off as users/owners use the systems in the pool.  It has recently
become "true open source", although the download website is still a PITA
to navigate and requires a kind of "registration", and its license is
still not a straight GPL.

This is kind of funny because as I read it, the toolset can now be
wrapped up in source RPMs and distributed as a standard component of
e.g. FC in extras or elsewhere without violating any aspect of its
license agreement.  Doing this (for Duke, but if it is in one of Duke's
public repos it is pretty public) is on my list of things to do this
week or next.

One of the bitches that I and many others have about all of the
alternatives is that they are too damn complicated.  Many sites -- I
won't say most but many -- have very, very simple needs for a
scheduler/queuing system.  Needs that could be met without requiring the
admin to read a 1000-page manual, join a mailing list, work through a
really complicated build, and try to figure out several distinct
security models and policy models.  What is really needed is a fully
open source "scheduler lite" that pretty much sets up a simple queue for
a simple list of machines with a simple cron-like policy statement,
maybe all defined with an XMLish config file that permits classes of
machines (like a bunch that belong to user A) to share a policy.
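
To make the "scheduler lite" idea a bit more concrete, here is a minimal
Python sketch of what such a thing might look like.  It is only an
illustration under stated assumptions, not an existing tool: the XML
layout, the element and attribute names (cluster, class, machine, policy,
hours, days), the host names, and the functions in_range, load, available
and dispatch are all invented for this example.  The "cron-like" policy
is reduced to an inclusive hour range and weekday range (0 = Monday), and
a range such as 20-7 is taken to wrap past midnight.

```python
import xml.etree.ElementTree as ET
from collections import deque
from datetime import datetime

# Hypothetical config: every element and attribute name here is made up.
CONFIG = """
<cluster>
  <class name="dedicated" policy="always">
    <machine>node01</machine>
    <machine>node02</machine>
  </class>
  <class name="userA-desktops" policy="nights">
    <machine>ws01</machine>
    <machine>ws02</machine>
  </class>
  <policy name="always" hours="0-23" days="0-6"/>
  <policy name="nights" hours="20-7" days="0-6"/>
</cluster>
"""


def in_range(value, spec):
    """True if value is in the inclusive range 'lo-hi'; lo > hi wraps around."""
    lo, hi = (int(x) for x in spec.split("-"))
    if lo <= hi:
        return lo <= value <= hi
    return value >= lo or value <= hi


def load(text):
    """Parse the config into {policy: (hours, days)} and {host: policy}."""
    root = ET.fromstring(text)
    policies = {
        p.get("name"): (p.get("hours"), p.get("days"))
        for p in root.findall("policy")
    }
    machines = {}
    for cls in root.findall("class"):
        # Every machine in a class shares that class's policy.
        for m in cls.findall("machine"):
            machines[m.text.strip()] = cls.get("policy")
    return policies, machines


def available(policies, machines, when):
    """Return the sorted hosts whose policy allows jobs at time 'when'."""
    hosts = []
    for host, pname in machines.items():
        hours, days = policies[pname]
        if in_range(when.hour, hours) and in_range(when.weekday(), days):
            hosts.append(host)
    return sorted(hosts)


def dispatch(jobs, hosts):
    """Pair queued jobs with hosts in FIFO order; leftover jobs stay queued."""
    assignments = []
    for host in hosts:
        if not jobs:
            break
        assignments.append((jobs.popleft(), host))
    return assignments


if __name__ == "__main__":
    policies, machines = load(CONFIG)
    queue = deque(["job-a", "job-b", "job-c"])
    free = available(policies, machines, datetime.now())
    for job, host in dispatch(queue, free):
        print(f"{job} -> {host}")
```

A real version would still have to start the jobs (over ssh, say), notice
when they finish, and cope with a node that vanishes in the middle of a
run, none of which is attempted here.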

Some people on the list (Mark Hahn, e.g.) have IIRC even written their own
lightweight schedulers out of sheer pique with this situation.  However,
I don't know if any of them have been developed to the point where they
are moderately portable and packageable for general use.


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
