[Beowulf] Reliable Job Queueing and Notification

Reuti reuti at staff.uni-marburg.de
Thu Oct 18 02:06:40 PDT 2007


Hi,

On 16.10.2007 at 16:08, Sean Ward wrote:

> I've started work on a web service which contains several  
> potentially long running processing steps (molecular dynamics),  
> which are perfect to farm out to the fairly large (90 node) Beowulf  
> I have access to. The primary issue is translating requests from  
> the event driven web service, to job queues, and back again upon  
> completion. Specifically, the major queuing systems I have  
> immediate access to (Sun Grid Engine and Condor) only support
> e-mail-based notification of job completion. Starting jobs isn't an
> issue, as my service can simply ssh over and execute shell scripts
> as needed to start things up; the problem is reliably being
> informed when the jobs fail or complete, via any programmatic  
> method (such as executing a shell script, calling a web service via  
> SOAP/etc, or an asynchronous message library). My other problem,  
> ensuring that these web service requests don't starve in house jobs  
> on the Beowulf is easily handled via the priority levels built into  
> all the various job managers, although being able to checkpoint a  
> long running job would be a plus (such as is supported by Condor).

If it's possible to compile your program with Condor checkpointing,
then you can even compile it as a standalone application which
includes checkpointing. You can then use it with any other queuing
system. It's of course best to choose one which supports
checkpointing, like SGE. Unlike Condor, SGE has no built-in
checkpointing, but it has extra setup options to make handling it easy.
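
For the notification side, one option that doesn't depend on the
queuing system at all is to submit a small wrapper in place of the job
itself: it runs the real command, then reports the exit status back to
your web service. Below is only a minimal sketch under assumptions, not
a tested setup. The callback URL, the JSON field names and the function
names are all hypothetical, and nothing in it is specific to SGE or
Condor.

```python
"""Run a job command, then report its exit status to a callback URL."""
import json
import subprocess
import sys
import time
import urllib.request


def build_report(job_id, returncode):
    """Build the status report for a finished job (field names are made up)."""
    return {
        "job_id": job_id,
        "exit_code": returncode,
        "status": "completed" if returncode == 0 else "failed",
    }


def _http_post(url, report):
    """POST the report as JSON; raises OSError (URLError) on failure."""
    data = json.dumps(report).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        resp.read()


def notify(url, report, retries=5, delay=10.0, post=None):
    """Send the report, retrying on errors; return True once it got through."""
    post = post or _http_post
    for attempt in range(retries):
        try:
            post(url, report)
            return True
        except OSError:
            # Web service unreachable or returned an HTTP error: wait, retry.
            if attempt < retries - 1:
                time.sleep(delay)
    return False


def run_and_notify(job_id, url, command, post=None, retries=5, delay=10.0):
    """Run the command, report the outcome, and return the job's exit code."""
    try:
        returncode = subprocess.call(command)
    except OSError:
        # The command could not be started at all; report it as a failure.
        returncode = 127
    notify(url, build_report(job_id, returncode),
           retries=retries, delay=delay, post=post)
    return returncode


def main(argv):
    """Entry point: JOB_ID CALLBACK_URL COMMAND [ARGS...]"""
    if len(argv) < 4:
        print("usage: JOB_ID CALLBACK_URL COMMAND [ARGS...]", file=sys.stderr)
        return 2
    return run_and_notify(argv[1], argv[2], argv[3:])
```

To use it as a script, you would add `sys.exit(main(sys.argv))` at the
bottom and submit the wrapper with the real command as its arguments.
Note that if the node itself dies, the wrapper dies with it, so the web
service should still treat a job it hasn't heard from for a long time
as suspect and check the queue.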

-- Reuti


> I am currently investigating modifications to either Condor (more  
> complex to update, but checkpoint is useful) or Ruby Queue (very  
> easy to update for reliable notification) to solve this issue, but  
> wanted to be sure I wasn't overlooking any existing solutions to  
> programmatic based queuing and receiving notifications on jobs in a  
> Beowulf environment...
>
> -Sean



