[Beowulf] SGE + policy

Jim Lux James.P.Lux at jpl.nasa.gov
Thu May 27 09:14:26 PDT 2004


At 10:19 AM 5/27/2004 -0400, Robert G. Brown wrote:
>Dear Perfect Masters of Grid Computing:
>
>Economics is preparing to set up a small pilot cluster at Duke and the
>following question has come up.
>
>Primary tasks:  matlab and stata jobs, run either interactively/remote
>or (more likely) in batch mode.  Jobs include both "short" jobs that
>might take 10-30 minutes run by e.g. 1-2nd year graduate students as
>part of their coursework and "long" jobs that might take hours to days
>run by more advanced students, postdocs, faculty.
>
>Constraint:  matlab requires a license managed by a license manager.
>There are a finite number of licenses (currently less than the number of
>CPUs) spread out across the pool of CPUs.

I can't speak to the interaction between Matlab and SGE, but I do have 
some practical experience with competing for shared Matlab licenses 
(particularly for some toolboxes).

In some ways, what you are dealing with is a common problem in real-time 
systems with phenomena like deadlocks and priority inversion, where a 
long-running low-priority task locks out higher-priority tasks because it 
holds a lock on a shared resource (here, a license).

Perhaps a way to deal with the long-running, resource-consuming type of job 
is to require that such jobs be periodically checkpointed, stopped, and 
restarted, allowing a "rebidding" for the limited resource.  Good 
software design practice should inspire designs that support this fairly 
transparently (i.e., it's foolish to write a program that must run 
uninterrupted for many hours).  If the time granularity is sufficiently 
fine (comparable to the run time of the short jobs, say), then it should 
work OK.
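
As a rough illustration of the rebidding idea, here is a minimal Python 
sketch.  It is a toy simulation only, not tied to SGE or to the Matlab 
license manager; the function name, the job fields (priority, work, 
arrival) and the rule that the lowest priority number wins are all 
assumptions made up for the example.  Each job runs for one time slice, 
checkpoints, gives the license back, and competes again.

```python
def simulate_rebidding(jobs, licenses, slice_len):
    """Toy model of checkpoint-and-rebid scheduling for a scarce license.

    jobs: name -> {"priority": int, "work": int, "arrival": int}
          (hypothetical fields; a lower priority number wins the bid)
    licenses: number of licenses available in each time slice
    slice_len: time between checkpoints, in the same units as "work"
    Returns a list of (start_time, [names holding a license]) per slice.
    """
    remaining = {name: job["work"] for name, job in jobs.items()}
    timeline = []
    t = 0
    while remaining:
        # Rebid: every unfinished job that has arrived competes again.
        ready = [n for n in remaining if jobs[n]["arrival"] <= t]
        ready.sort(key=lambda n: (jobs[n]["priority"], n))
        holders = ready[:licenses]
        for name in holders:
            # Run one slice, then checkpoint and release the license.
            remaining[name] -= slice_len
            if remaining[name] <= 0:
                del remaining[name]
        timeline.append((t, holders))
        t += slice_len
    return timeline


if __name__ == "__main__":
    demo = {
        "long":  {"priority": 5, "work": 6, "arrival": 0},
        "short": {"priority": 1, "work": 2, "arrival": 2},
    }
    for start, holders in simulate_rebidding(demo, licenses=1, slice_len=1):
        print(start, holders)
```

With one license, the short job that arrives while the long job is running 
gets the license at the next checkpoint instead of waiting for the long job 
to finish.  Strict priority like this can starve the long job if short jobs 
keep arriving, so a real policy would need some aging or quota on top.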

There was some research back in the 1970s on optimizing the allocation of 
scarce computing resources among various classes of users with different 
needs (interactive timesharing users with low compute demand but a need 
for fast response, vs. high-compute but batch-oriented users).  Some 
successful strategies implemented a form of bidding for resources.  The 
idea is that each "user or consumer" gets a certain regular salary, and 
they can either explicitly or automatically enter a dynamic market for the 
resources.  The market can have some "allocations or quotas" to prevent 
things analogous to priority inversions.
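
To make the salary-and-market idea concrete, here is a small Python sketch 
of one possible market round.  It is not a reconstruction of any of the 
1970s schemes; the function names, the (price, wanted) bid format and the 
per-user quota rule are assumptions chosen for illustration.

```python
def pay_salaries(accounts, salaries):
    """Credit each user's regular salary (hypothetical currency units)."""
    for name in accounts:
        accounts[name] += salaries[name]


def market_round(accounts, bids, licenses, quota):
    """Award licenses to the highest bidders, subject to a per-user quota.

    accounts: name -> balance (updated in place)
    bids:     name -> (price_per_license, licenses_wanted)
    licenses: number of licenses on offer this round
    quota:    maximum licenses any one user may hold this round
    Returns name -> number of licenses won.
    """
    awarded = {}
    free = licenses
    # Highest price first; ties are broken by name to keep it deterministic.
    ranked = sorted(bids.items(), key=lambda kv: (-kv[1][0], kv[0]))
    for name, (price, wanted) in ranked:
        if free == 0:
            break
        affordable = int(accounts[name] // price) if price > 0 else 0
        n = min(wanted, quota, free, affordable)
        if n > 0:
            accounts[name] -= n * price   # winners pay what they bid
            awarded[name] = n
            free -= n
    return awarded


if __name__ == "__main__":
    accounts = {"faculty": 10.0, "student": 4.0}
    bids = {"faculty": (3.0, 4), "student": (2.0, 1)}
    print(market_round(accounts, bids, licenses=3, quota=2))
    print(accounts)
```

In the demo, the quota of two keeps the better-funded bidder from taking 
all three licenses, so the student still gets one.  That is the market 
analogue of preventing a priority inversion.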


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875



