[Beowulf] scheduler policy design

Wed Apr 25 08:15:49 PDT 2007

Pretty much all of the very large sites that I've been at (SGE users
for the most part) that really needed reservation and backfill
capabilities had invested effort in writing local job submission
wrappers and front ends that programmatically wrote the job scripts
and handled job submission. Since the front end system was writing
the job script and handling the "qsub" arguments, it was easy to make
all of the specific resource reservation and attribute requests that
were necessary.
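
To make the wrapper idea concrete, here is a minimal sketch of a front
end that assembles the "qsub" arguments itself so users never type
them. The -N, -R y and -l options and the h_rt runtime resource are
standard SGE; the function name, the site defaults and the "io_load"
attribute are assumptions made up for illustration, not anything a
particular site actually runs.

```python
import shlex


def build_qsub_command(script_path, job_name, runtime="01:00:00",
                       extra_resources=None, reserve=True):
    """Return the qsub argument list for one job (hypothetical site policy)."""
    # Every job gets a hard runtime request; backfill needs a runtime estimate.
    resources = {"h_rt": runtime}
    resources.update(extra_resources or {})

    cmd = ["qsub", "-N", job_name]
    if reserve:
        # Ask the scheduler to consider this job for resource reservation.
        cmd += ["-R", "y"]
    # Emit requests in sorted order so generated commands are reproducible.
    for name, value in sorted(resources.items()):
        cmd += ["-l", f"{name}={value}"]
    cmd.append(script_path)
    return cmd


if __name__ == "__main__":
    # "io_load" is a hypothetical site-defined consumable.
    cmd = build_qsub_command("render.sh", "demo_render_0001",
                             runtime="02:00:00",
                             extra_resources={"io_load": 10})
    print(" ".join(shlex.quote(part) for part in cmd))
```

A real wrapper would hand the list to subprocess.run() and would
usually write the job script as well; the point is only that the
resource requests come from one place.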

Mind you, they did not invest all this effort into wrapping SGE just  
to "hide complexity" from the users or even just to get backfill  
working efficiently. By rigidly controlling the syntax of the job  
submission commands they were able to squeeze a lot of value out of  
their workflows -- simple things like having a consistent and 100%  
uniform job naming scheme made processing the accounting logs,  
debugging and troubleshooting far more efficient.
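
As a rough illustration of why a uniform naming scheme pays off, the
sketch below builds job names from a fixed pattern and then parses
them back, so a list of job names pulled out of the accounting data
can be grouped mechanically. The project_stage_sequence pattern and
all function names are assumptions for this example, not the scheme
any real site uses.

```python
import re
from collections import Counter

# Hypothetical convention: <project>_<stage>_<4-digit sequence>, lowercase,
# with project and stage starting with a letter.
JOB_NAME_RE = re.compile(
    r"^(?P<project>[a-z][a-z0-9]*)_(?P<stage>[a-z][a-z0-9]*)_(?P<seq>\d{4})$"
)


def make_job_name(project, stage, seq):
    """Build a job name and refuse anything that breaks the convention."""
    name = f"{project.lower()}_{stage.lower()}_{seq:04d}"
    if not JOB_NAME_RE.match(name):
        raise ValueError(f"cannot build a conforming job name from {name!r}")
    return name


def parse_job_name(name):
    """Return the name's components as a dict, or None if it does not conform."""
    match = JOB_NAME_RE.match(name)
    if match is None:
        return None
    return {"project": match["project"],
            "stage": match["stage"],
            "seq": int(match["seq"])}


def jobs_per_project(job_names):
    """Count jobs per project; non-conforming names land in 'UNPARSED'."""
    counts = Counter()
    for name in job_names:
        parsed = parse_job_name(name)
        counts[parsed["project"] if parsed else "UNPARSED"] += 1
    return counts
```

With hand-typed job names the "UNPARSED" bucket tends to dominate; with
a wrapper generating every name it stays empty.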

The only site doing what I mentioned above that I know I can talk
about is described a bit more fully here:
http://gridengine.info/pages/profile-DNA-Productions

For sites where wrapping applications and workflows is out of the  
question here are some Grid Engine (SGE) specific bits that would be  
involved in a system where users were expected to request IO, memory  
or runtime resources ...

Option A
-----------
Create and define a user-requestable, consumable resource that is
appropriate to what you want to meter or make scheduling decisions on.
Then associate that resource with a specific queue or the global
execution host context.
Then, edit the SGE complex to set the "requestable" field of your
custom attribute to "FORCED".

The "FORCED" setting is the key in this scheme. Users who do not request
this resource are not allowed to run a job either globally, per-host
or per-queue (depending on where you stick the attribute value). So if
your users do not characterize their IO needs or runtime needs or
whatever, under this scheme they will either not be allowed to submit
a job at all or (in the much more common case) they will only be allowed
to submit to some default queue and won't be allowed access to the
higher priority queues that may be offering reservation and backfill.
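
Here is a toy model of the Option A behavior, just to show the logic.
It is not SGE code. The attribute name "io_load", the queue names and
the function are hypothetical, and the complex line in the comment
follows the column order as I understand it from complex(5), so check
it against "qconf -sc" on your own system. The model only looks at
custom attributes and ignores everything else the scheduler weighs.

```python
# Assumed shape of the complex entry (edited via "qconf -mc"):
#   name     shortcut  type  relop  requestable  consumable  default  urgency
#   io_load  io        INT   <=     FORCED       YES         0        0

# Hypothetical queues mapped to the FORCED attributes attached to each one.
QUEUES = {
    "default.q": set(),          # nothing forced: anyone may land here
    "reserve.q": {"io_load"},    # offers reservation/backfill, demands io_load
}


def eligible_queues(job_requests, queues=QUEUES):
    """Return the queues a job may run in, given its custom resource requests.

    job_requests maps attribute name -> requested amount.
    """
    requested = set(job_requests)
    return sorted(
        name for name, forced in queues.items()
        if forced <= requested      # every FORCED attribute was requested
        and requested <= forced     # the queue actually offers what was asked
    )


if __name__ == "__main__":
    print(eligible_queues({}))                # job that characterizes nothing
    print(eligible_queues({"io_load": 5}))    # job that states its IO needs
```

A job that requests nothing is confined to the default queue, while a
job that states its io_load gets into the queue offering reservation
and backfill.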

Option B
-----------
Create the same user-requestable resource as mentioned above.
Then, create a default value for that resource that is very "high" or
"expensive".

The idea here in option B is that you have a metered value and you  
are applying a really "expensive" default value that applies to any   
user or job who does not bother to actually request the specific  
resource via the job script or the command line. The end result is  
that users who do not characterize their needs end up getting  
penalized in the backfill/reservation/whatever scheduling scheme  
because they get socked with the high default value. They can  
override the default value by making the appropriately sized request  
at job submission time.
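
The effect of an "expensive" default can be sketched with a few lines
of arithmetic. This is a simplified model and not how the SGE
scheduler is implemented: the capacity of 100, the default of 100 and
the greedy in-order packing are all assumptions chosen to make the
penalty obvious.

```python
HOST_CAPACITY = 100   # hypothetical units of the metered resource per host
DEFAULT_IO = 100      # hypothetical "expensive" default for lazy submitters


def effective_request(requested=None, default=DEFAULT_IO):
    """Amount charged to a job: its own request, or the default if it made none."""
    return default if requested is None else requested


def pack_jobs(requests, capacity=HOST_CAPACITY):
    """Greedy, in-order packing: indexes of the jobs that fit on one host at once."""
    running, used = [], 0
    for index, requested in enumerate(requests):
        need = effective_request(requested)
        if used + need <= capacity:
            running.append(index)
            used += need
    return running


if __name__ == "__main__":
    # Jobs 0, 1 and 3 sized their requests; job 2 did not bother.
    print(pack_jobs([10, 20, None, 30]))
```

The job with no request is charged the whole host and has to wait,
while the three jobs that sized their requests run side by side.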

Implementing this stuff tends to be site specific or workflow
specific. There is no easy one-size-fits-all solution. It depends on
your apps, your execution host OS and your scheduling system (and many
other factors).

People have all sorts of pie in the sky impressions as to how this
stuff "should" work but their ideas tend to smash against the hard
reality that very few applications can currently be seamlessly
checkpointed, suspended, restarted and migrated without error. If
you can't easily freeze an application and transparently move it to
another node then the fancy academic ideas about advanced
reservation, backfill etc. get real inefficient real fast in
production computing environments.

My $.02 of course!

Regards,
Chris

On Apr 25, 2007, at 3:42 AM, Toon Knapen wrote:
>
>
> Interesting. However this approach requires that the IO profile of  
> the application is known. Additionally it requires the users of the  
> application (which are generally not IT guys) to know and  
> understand this info and pass it on to the scheduler when they  
> launch their app.
> In your experience, do you manage to convince real-life users to  
> provide this info?