[Beowulf] Torque Error: multi-req PBS jobs not allowed


Rahul Nabar rpnabar at gmail.com
Mon Apr 20 15:29:55 PDT 2009


On Mon, Apr 20, 2009 at 5:11 PM, Greg Lindahl <lindahl at pbm.com> wrote:
> On Mon, Apr 20, 2009 at 04:59:31PM -0500, Rahul Nabar wrote:
>
>> Why would PBS-Torque not allow this and my previous threads
>> "JOBNODEMATCHPOLICY EXACTNODE" by default? Are there any reasons not
>> to use them? The compromise is not obvious to me.
>
> In general it's most efficient to have a job use whole nodes. If you
> use a partial node, jobs will likely interfere with each other,
> reducing overall performance.
>
> Now if your code only runs on N**2 nodes, using whole nodes can be
> painful. But with M*N nodes, you're usually OK.

My code (depending on the specific job at hand) parallelizes well
when the process count is a multiple of a small integer: mostly
(a) multiples of 4 or (b) multiples of 9.

Since I have 8 CPUs per server, job type (a) is great, but job type (b)
requires a trade-off. If I wanted to request whole nodes I'd have to
shoot for 8x9=72 processes, but that degree of parallelization is
overkill: at that scale the parallelization is no longer so efficient.
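
As a quick check of that arithmetic, here is a minimal Python sketch. The
function name is made up for illustration; it only computes the least
common multiple of the cores per node and the multiple the code prefers.

```python
from math import gcd

def smallest_whole_node_job(cores_per_node, multiple):
    """Smallest process count that is a multiple of `multiple`
    and also fills whole nodes (the least common multiple)."""
    return cores_per_node * multiple // gcd(cores_per_node, multiple)

# Job type (a): multiples of 4 on 8-core servers fit in a single node.
print(smallest_whole_node_job(8, 4))
# Job type (b): multiples of 9 on 8-core servers need 72 processes.
print(smallest_whole_node_job(8, 9))
```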

The ideal situation for our cluster, then, seems to me to be: request
full nodes plus a fragment. If I could selectively allow these
partial-node fragments only on a specific pool of nodes, I could
minimize my overall cluster fragmentation.
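
To make the "full nodes plus a fragment" idea concrete, here is a rough
Python sketch that splits a process count into whole 8-core nodes and one
leftover fragment, and formats it as a multi-part Torque "nodes=" resource
string. The helper name and the "fragpool" node property are hypothetical;
the property would have to be defined for the chosen pool of nodes, and
whether the scheduler accepts such a multi-part request is exactly the
open question in this thread.

```python
def split_request(nprocs, cores_per_node=8, fragment_property=None):
    """Build a 'nodes=' string: whole nodes first, then one partial node.

    fragment_property is an optional, hypothetical node property used to
    steer the partial-node fragment onto a dedicated pool of nodes.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    full_nodes, leftover = divmod(nprocs, cores_per_node)
    parts = []
    if full_nodes:
        parts.append(f"{full_nodes}:ppn={cores_per_node}")
    if leftover:
        fragment = f"1:ppn={leftover}"
        if fragment_property:
            fragment += f":{fragment_property}"
        parts.append(fragment)
    return "nodes=" + "+".join(parts)

# 18 processes: two full nodes plus a 2-core fragment from the pool.
print(split_request(18, fragment_property="fragpool"))
```

With this split, an 18-process job leaves only 6 idle cores on one pool
node instead of forcing a jump to 72 processes.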

But I've no clue how to implement this. Has anybody tried such a solution?

-- 
Rahul


