Best Of Breed Batch Systems & Schedulers

Ron Chen ron_chen_123 at yahoo.com
Wed May 22 18:36:53 PDT 2002


Hello,

> On top of OpenPBS, there are multiple schedulers
> (Maui, UnderLord).

GridEngine (SGE) also has Maui support, and there is
an interface/API that lets external/3rd-party schedulers
interact with SGE:

http://gridengine.sunsource.net/unbranded-source/browse/~checkout~/gridengine/source/daemons/schedd/schedd.html?content-type=text/html

I know Silver works with Maui. Since Maui supports
SGE, Silver should presumably work with SGE too. There
are other components like QBank as well, but I need to do
more research to find out the level of support.

> Grid schedulers include Silver and OverLord.

And Globus supports SGE and PBS.


IMHO, PBS has lots of users, but Veridian puts most of
the fixes into PBSPro rather than OpenPBS. SGE used to be
commercial. It has gained users quickly since going open
source, mainly because of better fault tolerance (you
can configure one or more "shadow masters" to take over
if the master host crashes), better scheduling
algorithms (including share tree, deadline, and
functional), and better security (communications between
daemons can be encrypted, and interactive job I/O can be
encrypted). Also, load sensors can be used to collect
resource information (like the number of licenses available),
and you can define the start/stop/checkpoint methods so that
special actions can be taken when stopping jobs,
which is useful in EDA environments, where the license
needs to be released before a job is stopped.
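
To make the load sensor idea concrete, here is a minimal sketch in
Python. It assumes the usual load sensor convention as I understand it:
the execution daemon writes a line to the sensor's stdin whenever it
wants fresh values, the sensor answers with a begin/end block of
host:name:value lines, and it exits when it reads "quit". The attribute
name lic_free and the probe count_free_licenses() are hypothetical
placeholders, not anything shipped with SGE.

```python
import io
import sys


def count_free_licenses():
    # Hypothetical probe: a real sensor would query the license server here.
    return 4


def format_report(host, values):
    """Build one begin/end report block of host:name:value lines."""
    lines = ["begin"]
    for name, value in values.items():
        lines.append(f"{host}:{name}:{value}")
    lines.append("end")
    return "\n".join(lines) + "\n"


def run_sensor(inp, out, probe=count_free_licenses):
    """Answer each request line with a report until 'quit' arrives."""
    for line in inp:
        if line.strip() == "quit":
            break
        # "global" marks a cluster-wide value rather than a per-host one.
        out.write(format_report("global", {"lic_free": probe()}))
        out.flush()


if __name__ == "__main__":
    # Demo with canned input; a deployed sensor would instead call
    # run_sensor(sys.stdin, sys.stdout).
    demo_out = io.StringIO()
    run_sensor(io.StringIO("\n\nquit\n"), demo_out)
    sys.stdout.write(demo_out.getvalue())
```

For the reported value to be usable in scheduling, lic_free would also
have to be defined as a resource attribute in the cluster configuration;
check the SGE docs for the exact steps.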

AFAIK, SGE scales better: it is common to have a large
number of jobs submitted to SGE (on the order of
10^5 or 10^6). It also handles failed nodes better.
In PBS, if a node fails, the scheduler hangs for 1 to
2 minutes, then times out and dies, and a new scheduler
process is created. In SGE, the scheduler does not
hang, and SGE can reschedule jobs that were
scheduled to the failed node onto another running node.
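
As a rough illustration of that rescheduling behaviour (a toy model
only, not SGE's actual algorithm), the sketch below moves jobs that
were placed on a failed node onto whichever live node currently holds
the fewest jobs. The function name, the job and node names, and the
least-loaded policy are all assumptions made for the example.

```python
def reschedule(assignments, alive_nodes):
    """Return a new job -> node map with jobs moved off dead nodes.

    assignments: dict mapping job name to the node it was scheduled on.
    alive_nodes: iterable of nodes that are still running.
    Jobs stay pending (None) if no node is alive.
    """
    # Count the jobs already sitting on each live node.
    load = {node: 0 for node in alive_nodes}
    for node in assignments.values():
        if node in load:
            load[node] += 1

    result = {}
    for job, node in sorted(assignments.items()):
        if node in load:
            result[job] = node            # node is fine, leave the job alone
        elif load:
            # Pick the least-loaded live node; ties go to the lowest name.
            target = min(sorted(load), key=lambda n: load[n])
            load[target] += 1
            result[job] = target
        else:
            result[job] = None            # nothing alive, job stays pending
    return result


if __name__ == "__main__":
    placed = {"j1": "n1", "j2": "n2", "j3": "n2", "j4": "n3"}
    print(reschedule(placed, ["n1", "n3"]))
```

The point is only that the jobs on the dead node (n2 here) get spread
over the surviving nodes without the scheduler itself having to
restart.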

I am mainly an SGE user now, so I can't tell you much
about OpenPBS.

If you need more info about SGE, go to the open source
project site:

http://gridengine.sunsource.net


-Ron




More information about the Beowulf mailing list