R: [Beowulf] SGE + policy

Lombard, David N david.n.lombard at intel.com
Thu May 27 08:29:59 PDT 2004


[Forwarding to list]

From: Robert G. Brown,
> Dear Perfect Masters of Grid Computing:

Robert, you are one of the few "masters" on this list ;^)

> It seems like this would be a common problem in shared environments
> with a highly mixed workload and lots of users (and indeed is the
> problem addressed by e.g. the kernel scheduler in almost precisely the
> same context on SMP or UP machines).  Recognizing that the license
> management problem will almost certainly be beyond the scope of any
> solution without some hacking and human-level policy, are there any
> well known solutions to this well known problem?  Can SGE actually
> automagically control jobs (stopping and starting jobs as a sort of
> coarse-grained scheduler to permit high priority jobs to pass through
> long running low priority jobs)?  Is there a way to solve this with
> job classes or wrapper scripts that is in common use?

LSF is sometimes used for this very purpose, i.e., to manage access to
licensed jobs so that no job fails due to lack of licenses. LSF has
"exits" that can handle the license accounting, e.g., checking for
available licenses, fully automating the process.


For SGE, with which I have only passing familiarity, could you create a
matlab-specific queue and limit the number of concurrent jobs in it?
Certainly the various PBS implementations can do this.  Alternatively,
perhaps each job could be required to request a matlab-specific
resource, so that the number of available units limits concurrency?
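
A minimal sketch of the resource-counting idea, in Python and not tied
to SGE, PBS, or LSF: a job starts only when enough units of a counted
resource (here, licenses) are free, and otherwise waits in a queue.  The
class name, method names, and the capacity of 2 are all hypothetical,
chosen only for illustration; a real batch system would do this
accounting itself.

```python
from collections import deque


class ConsumableScheduler:
    """Toy dispatcher: start a job only if enough resource units are free."""

    def __init__(self, capacity):
        self.capacity = capacity   # total units, e.g. license count (assumed)
        self.in_use = 0            # units held by running jobs
        self.pending = deque()     # FIFO of (job_name, units) waiting to start
        self.running = {}          # job_name -> units held

    def submit(self, job, units=1):
        """Queue a job that needs `units` of the resource, then try to start it."""
        if units > self.capacity:
            raise ValueError("job requests more units than exist")
        self.pending.append((job, units))
        self._dispatch()

    def finish(self, job):
        """Release the units held by a running job and start any waiters."""
        self.in_use -= self.running.pop(job)
        self._dispatch()

    def _dispatch(self):
        # Strict FIFO: start jobs from the head of the queue while they fit.
        while self.pending and self.in_use + self.pending[0][1] <= self.capacity:
            job, units = self.pending.popleft()
            self.running[job] = units
            self.in_use += units


if __name__ == "__main__":
    sched = ConsumableScheduler(capacity=2)
    for name in ("a", "b", "c"):
        sched.submit(name)
    print(sorted(sched.running))   # "c" waits until a unit is released
    sched.finish("a")
    print(sorted(sched.running))
```

The queue here is strictly first-in, first-out, so it says nothing about
priorities or about suspending low priority jobs; it only shows how a
counted resource keeps jobs from starting without a license.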

-- 
David N. Lombard
 
My comments represent my opinions, not those of Intel Corporation.



