[Beowulf] Please help to setup Beowulf


Reuti reuti at staff.uni-marburg.de
Fri Feb 20 12:58:26 PST 2009


On 20.02.2009 at 15:37, Bogdan Costescu wrote:

> On Fri, 20 Feb 2009, Glen Beane wrote:
>
>> I looked into SGE a long time ago, but I found the MPI support  
>> terrible when compared to TORQUE/PBS Pro
>
> Indeed, and AFAIK it is still in a similar state today. There was talk
> for a long time on the SGE devel list of adding a TM API, but
> it seems that this is not considered a high-priority feature.

That is fair, since they have a replacement called qrsh for the usual
rsh/ssh calls (as you know, but maybe others on the list do not).
Although in former times it just used a special version of rsh,
it was in the end under the full control of SGE. In such a setup, the
traditional rsh/ssh can be disabled completely inside the cluster
(or ssh limited to admin staff only).

Nowadays it's replaced by a builtin startup method which is more  
scalable.
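
To make the qrsh idea concrete for others on the list, here is a minimal
sketch of how a launcher might build such a call instead of an rsh/ssh
one. The helper name `qrsh_command` and the host name `node01` are
hypothetical, made up for illustration. `qrsh -inherit` is the form used
for starting tasks inside an already running parallel job; it is only
expected to work on hosts that SGE granted to that job.

```python
import shlex


def qrsh_command(host, remote_argv):
    """Build the argv that starts remote_argv on host via SGE's qrsh.

    Hypothetical helper: it only assembles the command line and does
    not run anything, so it can be inspected outside a cluster.
    """
    if not remote_argv:
        raise ValueError("remote_argv must not be empty")
    # "-inherit" asks qrsh to start the task inside the current job,
    # so the process stays under the control of the queueing system.
    return ["qrsh", "-inherit", host] + list(remote_argv)


if __name__ == "__main__":
    # "node01" is a placeholder host name.
    print(shlex.join(qrsh_command("node01", ["hostname"])))
```

A parallel library that insists on rsh/ssh could in principle be pointed
at a small wrapper doing the same thing, so that its remote processes
stay accounted for.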

Having both a TM interface and a tightly integrated rsh/ssh replacement
would of course be best. Linda (which is Gaussian's parallel library)
starts only via rsh/ssh. I see sites that, for exactly this reason, run
a "cleaner" script in their Torque-operated cluster to get rid of such
processes, as Torque can't know what was started by rsh/ssh on the
nodes.
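
The core of such a "cleaner" script can be sketched roughly as follows.
This is not any site's actual script, only an illustration of the idea:
a process is treated as stray when its owner has no job on the node. The
function name, the `(pid, user)` input format and the list of exempt
system users are all assumptions; a real script would have to get the
process list from the node and the job owners from the batch system.

```python
def find_stray_pids(processes, users_with_jobs, system_users=("root",)):
    """Return the PIDs whose owner has no job on this node.

    processes       -- iterable of (pid, user) pairs seen on the node
    users_with_jobs -- users the batch system reports as running here
    system_users    -- accounts that must never be cleaned up
    """
    allowed = set(users_with_jobs) | set(system_users)
    return sorted(pid for pid, user in processes if user not in allowed)


if __name__ == "__main__":
    # Made-up sample data: bob has processes here but no job.
    seen = [(101, "root"), (202, "alice"), (303, "bob")]
    print(find_stray_pids(seen, {"alice"}))
```

Note that this cannot tell apart two jobs of the same user, which is one
reason why control by the queueing system itself is preferable.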

> I've not only looked but actually used SGE for about 1 year (IIRC,  
> about 5 years ago) during which I had to spend time fixing the  
> interactions with LAM/MPI and many of the parallel applications  
> that were used on that cluster - and finally gave up.

Its successor, Open MPI, calls qrsh directly when it discovers that
it is running under SGE. It just checks some environment variables.
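
A check of this kind could look roughly like the sketch below. The
variable names are an assumption on my side: they are variables SGE sets
for a parallel job, but the exact set that Open MPI tests may differ
between versions. The function name is hypothetical.

```python
import os

# Assumed set of variables; SGE exports these for parallel jobs.
SGE_VARS = ("SGE_ROOT", "ARC", "PE_HOSTFILE", "JOB_ID")


def running_under_sge(env=None):
    """Return True if all assumed SGE variables are set and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in SGE_VARS)


if __name__ == "__main__":
    print(running_under_sge())
```

Passing a plain dict instead of the real environment makes it easy to
try the check outside of a cluster.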

-- Reuti


> On the plus side, during the time that SGE was used, I never
> saw a process left behind from a job, and the queueing system
> itself seemed very stable - something that I could not say for the
> OpenPBS/Torque that I also tested at that time.
>
> -- 
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
> E-mail: bogdan.costescu at iwr.uni-heidelberg.de



