[Beowulf] SGE + policy
Jim Lux
James.P.Lux at jpl.nasa.gov
Thu May 27 09:14:26 PDT 2004
At 10:19 AM 5/27/2004 -0400, Robert G. Brown wrote:
>Dear Perfect Masters of Grid Computing:
>
>Economics is preparing to set up a small pilot cluster at Duke and the
>following question has come up.
>
>Primary tasks: matlab and stata jobs, run either interactively/remote
>or (more likely) in batch mode. Jobs include both "short" jobs that
>might take 10-30 minutes run by e.g. 1-2nd year graduate students as
>part of their coursework and "long" jobs that might take hours to days
>run by more advanced students, postdocs, faculty.
>
>Constraint: matlab requires a license managed by a license manager.
>There are a finite number of licenses (currently less than the number of
>CPUs) spread out across the pool of CPUs.
I can't speak to the interaction between Matlab and SGE, but I do have
some practical experience with competing for shared Matlab licenses
(particularly for some toolboxes).
In some ways, what you are dealing with is a common problem in real-time
systems, related to phenomena like deadlock and priority inversion: a
long-running low-priority task locks out higher-priority tasks because it
holds a lock on a shared resource.
Perhaps a way to deal with the long-running, resource-consuming type of job
is to require that it be periodically checkpointed, stopped, and restarted,
allowing a "rebidding" for the limited resource. Good software design
practice should inspire designs that support this fairly transparently
(i.e., it's foolish to write a program that must run uninterrupted for many
hours). If the time granularity between checkpoints is sufficiently fine,
short jobs never wait long for a license, and it should work OK.
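The checkpoint/stop/restart idea can be sketched roughly as follows. This is only an illustration in Python, not anything Matlab- or SGE-specific: `LicensePool` is a hypothetical stand-in for the license manager, the checkpoint is a small JSON file, and the "computation" is a placeholder sum. The point is that the job holds the license for only one slice at a time and has to win it again before continuing.

```python
import json
import os
import threading


class LicensePool:
    """Hypothetical stand-in for a license manager with a fixed seat count."""

    def __init__(self, seats):
        self._sem = threading.BoundedSemaphore(seats)

    def acquire(self, timeout=None):
        return self._sem.acquire(timeout=timeout)

    def release(self):
        self._sem.release()


def run_slice(pool, checkpoint_path, total_steps, steps_per_slice):
    """Run one time slice of a long job; return True once the job is done."""
    # Resume from the last checkpoint if one exists.
    state = {"step": 0, "acc": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    if not pool.acquire(timeout=1.0):
        return False  # no license this round; try again later
    try:
        stop = min(state["step"] + steps_per_slice, total_steps)
        for i in range(state["step"], stop):
            state["acc"] += i  # placeholder for the real computation
        state["step"] = stop
        # Write the checkpoint atomically before giving the license back.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, checkpoint_path)
    finally:
        pool.release()  # others may now compete for the seat
    return state["step"] >= total_steps


def run_to_completion(pool, checkpoint_path, total_steps, steps_per_slice):
    """Keep re-competing for the license until the job finishes."""
    attempts = 1
    while not run_slice(pool, checkpoint_path, total_steps, steps_per_slice):
        attempts += 1
    with open(checkpoint_path) as f:
        return json.load(f)["acc"], attempts
```

In a real setup the slice length would be a wall-clock limit enforced by the scheduler rather than a step count, and the restart would be a resubmission of the job rather than a loop in one process.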
There was some research back in the 1970s on allocating scarce computing
resources among various classes of users with different needs (interactive
timesharing users with low compute demand but a need for fast response vs.
batch-oriented users with high compute demand). Some successful strategies
implemented a form of bidding for resources. The idea is that each "user or
consumer" gets a certain regular salary, and they can either explicitly or
automatically enter a dynamic market for the resources. The market can
have some "allocations or quotas" to prevent things analogous to priority
inversions.
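To make the salary-and-bidding idea concrete, here is a toy sketch in Python. It is not a reconstruction of any particular 1970s system; the `User` class, the `run_round` function, the credit amounts, and the "reserved seats for short jobs" quota are all assumptions made up for illustration. Each round, every user is paid a salary, bids are capped at the bidder's balance, the highest bids win the available licenses, and a quota holds some seats for short jobs so that wealthy long-job users cannot lock them out entirely.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    salary: int              # credits paid to the user every round
    balance: int = 0
    job_class: str = "long"  # "short" or "long" (illustrative classes)


def run_round(users, bids, seats, reserved_short=0):
    """Allocate `seats` licenses for one round and return the winners' names.

    `bids` maps user name -> credits offered. `reserved_short` seats are
    offered to "short" bidders first; this is the quota.
    """
    for u in users:
        u.balance += u.salary  # pay the regular salary
    by_name = {u.name: u for u in users}

    # A bid can never exceed what the user actually has.
    offers = {}
    for name, bid in bids.items():
        amount = min(bid, by_name[name].balance)
        if amount > 0:
            offers[name] = amount

    def top(candidates, k):
        # Highest offer first; ties broken by name for determinism.
        return sorted(candidates, key=lambda n: (-offers[n], n))[:max(0, k)]

    short = [n for n in offers if by_name[n].job_class == "short"]
    winners = top(short, min(reserved_short, seats))
    rest = [n for n in offers if n not in winners]
    winners += top(rest, seats - len(winners))

    for n in winners:
        by_name[n].balance -= offers[n]  # winners pay what they bid
    return winners
```

With two seats, one of them reserved for short jobs, a student bidding 2 credits still gets a license alongside a faculty member bidding 10, while a postdoc bidding 8 keeps the credits and can bid more next round. With no reservation, the two highest bidders take both seats.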
James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875