[Beowulf] doubt about job queuing systems

Chris Dagdigian dag at sonsorol.org
Fri Apr 17 11:36:58 PDT 2009


On Apr 17, 2009, at 9:07 AM, Jin, Yao wrote:

> Dear all,
>
> We've just bought a program named CST Microwave Studio 2009 (CST MWS).
> An engineer from CST says their program can be scheduled by Platform
> LSF. Our cluster is built with Rocks, so only Torque or SGE is
> available. We cannot afford LSF as it is too expensive.


> An engineer from Dell (the vendor of our cluster) says Torque, SGE
> and LSF have essentially the same functionality; what differs is
> stability, scheduling algorithm and price.
>

The differences are mainly in capability, cost and internal  
architecture. For the purposes of just running an MPI application you  
could treat them as roughly similar.

> We don't have LSF on hand, and the engineer from CST refuses to
> demonstrate running CST MWS with LSF. (They say they don't have LSF
> either; the test was carried out on another customer's cluster.)
>
>

The vendor should expect to provide "howtos" or best-practice
whitepapers if they expect to sell their code onto clusters. If they
are releasing an MPI application to the market then it is insane that
they don't already know this.



> CST MWS is traditionally a program run on a single PC. Version 2009
> is the first release with MPI parallel code. It uses a modified MPI
> library.

If you run the code successfully on your Rocks cluster without SGE or
LSF then you should be able to take the final step of integrating it  
into the cluster scheduler. You may have to do this yourself but there  
is no special magic required. Both LSF and SGE handle graphical X11  
applications without much problem.
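To make the X11 point concrete, here is a minimal Python sketch of how
a batch wrapper could give a GUI-bound solver a display on a compute
node when the application does not arrange one itself: if DISPLAY is
not set, the command is prefixed with xvfb-run (the helper script that
ships with Xvfb on many distributions; check that your nodes have it).
The solver name "cst_solver_placeholder" and its argument are
hypothetical stand-ins, not the real CST command line.

```python
def headless_command(solver_argv, env):
    """Return an argv list that gives the solver an X display.

    solver_argv: the solver command as a list of strings (hypothetical here).
    env: a mapping of environment variables, e.g. os.environ.
    """
    if env.get("DISPLAY"):
        # A display is already available (e.g. X11 forwarding); run as-is.
        return list(solver_argv)
    # No display: let xvfb-run start a throwaway in-memory X server.
    # "-a" asks xvfb-run to pick a free server number automatically.
    return ["xvfb-run", "-a"] + list(solver_argv)


# Hypothetical solver invocation, for illustration only.
argv = headless_command(["cst_solver_placeholder", "model.cst"], env={})
print(" ".join(argv))
```

The resulting list can be handed to subprocess.run() inside whatever
script the scheduler launches.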

>
>
> Even console execution starts the GUI. If no X server is
> found, it redirects graphical output to Xvfb (a virtual X11
> server that performs all graphical operations in memory).
>
> When I argue that many other computational electromagnetics programs
> such as FEKO can run in batch mode without any X output and can be
> scheduled directly by an external job queueing system, the engineer
> from CST declares that a program that does not start a GUI is
> old-fashioned...
>

The engineer is clueless and is talking from a single-workstation GUI  
world. Nobody who computes at significant automated or batch scale  
would even consider a GUI for most applications (except to manage  
workflow).


> Are they telling the truth, or just lying?

I suspect they just have not made the proper transition from their  
single workstation model to the MPI/cluster world yet. You may be one  
of their first customers for this.


You have a few options:

- Test the code on your ROCKS cluster with a plain MPI install, outside
SGE or LSF. If the product works and is something you need then
consider integrating it into SGE on your own. Generally not a huge
problem if you have a decent SGE admin

- Ask on internet forums for other customers who use this product.  
There is a good chance that others are already running this on a  
cluster and can share their tips with you

- Ask the vendor for a discount or free software licenses if you offer  
to publish the methods by which you integrated the code with ROCKS/SGE

- Tell the vendor that if they are serious about selling an MPI  
application into a cluster world they will have to get their head  
around the fact that GUIs are bad and the product must be entirely  
scriptable
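
For the first option, the SGE side can be as small as a generated job
script. Below is a rough Python sketch that writes one out. It assumes
a parallel environment named "mpi" exists on your cluster ("qconf -spl"
lists the ones actually configured), and the solver command line is a
hypothetical placeholder: how the real product accepts a slot count or
host list is something the vendor would have to tell you.

```python
def sge_job_script(job_name, pe_name, slots, solver_cmd):
    """Build the text of an SGE job script for a parallel run.

    pe_name and solver_cmd are site- and product-specific assumptions.
    """
    lines = [
        "#!/bin/bash",
        "#$ -S /bin/bash",              # shell used to run the job
        f"#$ -N {job_name}",            # job name shown in qstat
        "#$ -cwd",                      # run from the submission directory
        "#$ -j y",                      # merge stderr into stdout
        f"#$ -pe {pe_name} {slots}",    # request slots in a parallel environment
        "",
        "# SGE sets $NSLOTS (granted slots) and $PE_HOSTFILE (granted hosts).",
        solver_cmd,
    ]
    return "\n".join(lines) + "\n"


script = sge_job_script(
    job_name="mws_test",
    pe_name="mpi",  # assumption: check your own PE names
    slots=8,
    # Hypothetical command line, not the real CST interface.
    solver_cmd="cst_solver_placeholder --np $NSLOTS model.cst",
)
print(script)
```

Save the output to a file and submit it with qsub; if the solver
insists on a display, the command can be run under Xvfb in the same
script.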



my $.02 of course


-Chris







