Best Of Breed Batch Systems & Schedulers
Ron Chen ron_chen_123 at yahoo.com
Wed May 22 18:36:53 PDT 2002
Hello,

> On top of OpenPBS, there are multiple schedulers
> (Maui, UnderLord).

GridEngine (SGE) also has Maui support. There is also an interface/API that lets external/3rd party schedulers interact with SGE --
http://gridengine.sunsource.net/unbranded-source/browse/~checkout~/gridengine/source/daemons/schedd/schedd.html?content-type=text/html

I know Silver works with Maui. Since Maui supports SGE, Silver should presumably work with SGE too. There are also other components like QBank, but I need to do more research to find out the level of support.

> Grid schedulers include Silver and OverLord.

And Globus supports SGE and PBS.

IMHO, PBS has lots of users, but Veridian puts most of the fixes into PBSPro and not into OpenPBS.

SGE used to be commercial. It has been gaining users quickly since it became open source, mainly because of better fault tolerance (you can configure one or more "shadow masters" to take over if the master host crashes), better scheduling algorithms (including share tree, deadline, and functional), and better security (communication between daemons can be encrypted, and interactive job I/O can be encrypted). Also, load sensors can be used to collect resource information (like the number of licenses available), and you can define the start/stop/checkpoint methods so that special actions are taken when a job is stopped. This is useful in EDA environments, where the license needs to be released before a job is stopped.

AFAIK, SGE scales better: it is common to have a large number of jobs submitted to SGE (on the order of 10^5 or 10^6). It also handles failed nodes better. In PBS, if a node fails, the scheduler hangs for 1 to 2 minutes, then times out and dies, and a new scheduler process is created. In SGE, the scheduler does not hang, and SGE can reschedule jobs that were scheduled to the failed node onto another running node.

I am mainly an SGE user now, so I can't tell you much about OpenPBS.
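To make the load sensor idea a bit more concrete, here is a rough sketch in Python. As far as I understand it, a load sensor is a long-running script that waits for a line on stdin, answers with a block of host:resource:value lines wrapped in "begin" and "end", and exits when it reads "quit". Everything named below is an assumption made for illustration: the host name "node01", the resource name "free_licenses" (it would have to be defined in your own complex configuration), and the count_free_licenses() probe, which just returns a placeholder instead of querying a real license server.

```python
import io
from typing import Callable, Dict, TextIO


def count_free_licenses() -> int:
    """Hypothetical probe; a real sensor would query the license server here."""
    return 4  # placeholder value, not real data


def format_report(host: str, values: Dict[str, object]) -> str:
    """Build one report: 'begin', one host:name:value line per resource, 'end'."""
    lines = ["begin"]
    lines += [f"{host}:{name}:{value}" for name, value in values.items()]
    lines.append("end")
    return "\n".join(lines) + "\n"


def run_sensor(inp: TextIO, out: TextIO, host: str,
               probes: Dict[str, Callable[[], object]]) -> None:
    """Answer every line read from inp with a report, until 'quit' or EOF."""
    while True:
        line = inp.readline()
        if line == "" or line.strip() == "quit":
            return
        report = {name: probe() for name, probe in probes.items()}
        out.write(format_report(host, report))
        out.flush()  # the caller reads the report as soon as it is written


# Demo with in-memory streams: two polls, then "quit".
demo_out = io.StringIO()
run_sensor(io.StringIO("\n\nquit\n"), demo_out, "node01",
           {"free_licenses": count_free_licenses})
print(demo_out.getvalue(), end="")
```

To try it as an actual sensor you would call run_sensor with sys.stdin and sys.stdout instead of the in-memory streams, and replace the probe with something that really asks your license manager.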
If you need more info about SGE, go to the open source project site -- http://gridengine.sunsource.net

-Ron
More information about the Beowulf mailing list
