[Beowulf] Please help to setup Beowulf


Reuti reuti at staff.uni-marburg.de
Wed Feb 18 01:42:19 PST 2009


On 18.02.2009 at 07:32, Mark Hahn wrote:

>> searches. Without array task scheduling this would require 500,000  
>> individual job submissions. The fact that I never met a serious  
>> PBS shop that had not
>
> what's wrong with 500k job submissions?  to me, the existence of
> "array jobs" is an admission that the job/queueing system is
> inefficient.

Compare this with C, for example:

- a loop like:

     for (i=1;i<=100000;i++)
         printf("Hello from run %d.\n", i);

- and, as you can guess, the same statement written out 100,000 times:

     printf("Hello from run %d.\n", i++);


While the execution time is nearly the same, the first one compiles
far faster. So is the existence of a "for" statement in C also an
admission that the compiler can't handle sequential code very well
and generates a big executable? The compiler has to read the whole
source, just as SGE has to read and store the job requirements for
every single job, again and again.

-- Reuti

(PS: not to mention that a for-loop/array job is easier for the user
to handle.)
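
For anyone who wants to try the comparison, here is a minimal sketch of a generator that writes both variants as C source. It is not part of the original message. The function names write_loop_version and write_unrolled_version are hypothetical helpers made up for this illustration, and the generated programs mirror the two snippets above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the compact variant, a single for-loop
 * that runs n times. Returns the number of printf statements that
 * appear in the generated source (always 1). */
int write_loop_version(FILE *out, int n)
{
    assert(out != NULL && n > 0);
    fprintf(out, "#include <stdio.h>\n\nint main(void)\n{\n    int i;\n");
    fprintf(out, "    for (i = 1; i <= %d; i++)\n", n);
    fprintf(out, "        printf(\"Hello from run %%d.\\n\", i);\n");
    fprintf(out, "    return 0;\n}\n");
    return 1;
}

/* Hypothetical helper: write the unrolled variant, the same printf
 * statement repeated n times. Returns the number of printf statements
 * that appear in the generated source (n). */
int write_unrolled_version(FILE *out, int n)
{
    int k;

    assert(out != NULL && n > 0);
    fprintf(out, "#include <stdio.h>\n\nint main(void)\n{\n    int i = 1;\n");
    for (k = 0; k < n; k++)
        fprintf(out, "    printf(\"Hello from run %%d.\\n\", i++);\n");
    fprintf(out, "    return 0;\n}\n");
    return n;
}
```

Calling each function on its own output file with n = 100000 from a small main(), and then timing the compiler on the two results, should show the difference in source size and compile time. The exact numbers depend on the compiler and its flags.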


>   if you're saying that the issue is not per-job overhead of
> submission, but rather that jobs are too short, well, I think
> that's a user problem.  I think it's entirely reasonable to require
> user jobs to consume some minimum cpu time (say, a few minutes).
>
>> - Policy and resource allocation features are very important to  
>> people deploying these systems
>
> so I'm curious what that means.  things like "dept A needs to be
> guaranteed N cpus, but dept B gets to use whatever is left over"?
> or node choice based on amount of free disk?  I don't really see
> why these sorts of issues would be less important to more parallel
> environments.
>
>> - Storage speed is often more important than network speed or
>> latency
>
> which makes me wonder: do bio types consider using map-reduce-like
> frameworks?  that is, basically distributing the work to the data.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit  
> http://www.beowulf.org/mailman/listinfo/beowulf



