[Beowulf] assigning cores to queues with torque

Mon Mar 8 09:20:32 PST 2010

Micha Feigin wrote: 

>The problem: 
> 
>I want to allow gpu related jobs to run only on the gpu
>equipped nodes (i.e. more jobs than GPUs will be queued),
>I want to run other jobs on all nodes with either:
>
> 1. a priority to use the gpu equipped nodes last
> 2. or better, use only two out of four cores on the gpu equipped nodes

In PBS Pro you would do the following (Torque may have something
similar):

1. Create a custom resource called "ngpus" in the resourcedef 
file as in: 

ngpus type=long flag=nh 

2. Then set this resource explicitly on each node that has GPUs,
to the number of GPUs it has:

set node compute-0-5 resources_available.ncpus = 8 
set node compute-0-5 resources_available.ngpus = 2 

Here I have set the number of cpus per node (8) explicitly to defeat
hyper-threading, and set ngpus to the actual number of gpus on the node (2).
On the other nodes (compute-0-6, say) you might have:

set node compute-0-6 resources_available.ncpus = 8
set node compute-0-6 resources_available.ngpus = 0

indicating that there are no gpus to allocate.
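
Steps 1 and 2 could be scripted roughly as follows. This is only a sketch: the
helper name emit_node_settings, the "name ncpus ngpus" inventory format and the
second node name are made up for illustration. It only prints the "set node"
directives so they can be reviewed first; it does not contact the PBS server.

```shell
#!/bin/sh
# Sketch: print qmgr "set node" directives from a small inventory.
# The helper name and the inventory format (name ncpus ngpus) are
# assumptions for illustration; nothing here talks to the PBS server.

emit_node_settings() {
    # Reads "node ncpus ngpus" triples from stdin, one per line.
    while read -r node ncpus ngpus; do
        # Skip blank lines and comment lines.
        case "$node" in ''|'#'*) continue ;; esac
        echo "set node $node resources_available.ncpus = $ncpus"
        echo "set node $node resources_available.ngpus = $ngpus"
    done
}

# Example inventory: one GPU node and one non-GPU node (names hypothetical).
emit_node_settings <<'EOF'
compute-0-5 8 2
compute-0-6 8 0
EOF
```

Once the output looks right, it could presumably be piped into qmgr on the
server host, since qmgr reads directives from standard input.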

3. You would then use the '-l select' option in your job file as follows: 

#PBS -l select=4:ncpus=2:ngpus=2 

This requests 4 PBS resource chunks, each with 2 cpus and 2 gpus.
Because the resource request is "chunked", the 2 cpus and 2 gpus of each chunk
are placed together on one physical node. Because step 2 marked some nodes as
having 2 gpus and others as having 0, only the nodes that have gpus will get
allocated. ngpus is a consumable resource, so as soon as 2 are allocated on a
node the number available there drops to 0. In total you have asked for 4
chunks spread over 4 physical nodes (because only one of these chunks fits on
a single node). This also ensures a 1:1 mapping of cpus to gpus, although it
does nothing about tying each cpu to a different socket. You would probably
have to do that in the script with numactl.
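
To make the "one chunk per node" arithmetic concrete, and to show what the
numactl step might look like, here is a rough sketch. The helper names
chunks_per_node and numa_prefix are hypothetical, and the binding assumes one
NUMA node per socket, which may not hold on your hardware.

```shell
#!/bin/sh
# Sketch: how many select chunks fit on one node, and a numactl prefix
# for pinning a process to a socket. Helper names are hypothetical.

# chunks_per_node NODE_CPUS NODE_GPUS CHUNK_CPUS CHUNK_GPUS
# A chunk must fit entirely on one node, so the scarcer resource decides.
# CHUNK_CPUS is assumed to be at least 1.
chunks_per_node() {
    by_cpu=$(( $1 / $3 ))
    if [ "$4" -eq 0 ]; then
        echo "$by_cpu"      # chunk asks for no gpus: only cpus limit it
        return
    fi
    by_gpu=$(( $2 / $4 ))
    if [ "$by_cpu" -lt "$by_gpu" ]; then
        echo "$by_cpu"
    else
        echo "$by_gpu"
    fi
}

# numa_prefix INDEX SOCKETS
# Prints a numactl prefix binding cpus and memory to NUMA node INDEX mod SOCKETS.
numa_prefix() {
    socket=$(( $1 % $2 ))
    echo "numactl --cpunodebind=$socket --membind=$socket"
}

chunks_per_node 8 2 2 2   # GPU node from step 2: prints 1
chunks_per_node 8 0 2 2   # non-GPU node: prints 0
numa_prefix 1 2           # prints: numactl --cpunodebind=1 --membind=1
```

In a job script the prefix would go in front of the real command, for example
$(numa_prefix 0 2) ./my_gpu_app, where my_gpu_app stands in for your own binary.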

There are other ways to approach this, such as tying physical nodes to queues,
which you might wish to do to set up a dedicated slice for GPU development. You
may also be able to do this in PBS using the v-node abstraction. There might be
some reason to have two production routing queues that map to slightly different
parts of the system.
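
As a rough illustration of the node-to-queue approach in PBS Pro, the sketch
below prints the qmgr directives I would expect to need. The queue name gpuq
and the helper emit_queue_binding are hypothetical, and the directives should
be checked against your PBS version before use; nothing is sent to the server.

```shell
#!/bin/sh
# Sketch: print qmgr directives that create an execution queue and
# associate nodes with it. Queue and helper names are hypothetical.

# emit_queue_binding QUEUE NODE [NODE...]
emit_queue_binding() {
    queue=$1
    shift
    echo "create queue $queue queue_type = execution"
    echo "set queue $queue enabled = true"
    echo "set queue $queue started = true"
    # Tie each listed node to the queue.
    for node in "$@"; do
        echo "set node $node queue = $queue"
    done
}

emit_queue_binding gpuq compute-0-5
```

A node tied to a queue this way should only accept jobs from that queue, which
gives a dedicated slice but also keeps non-GPU jobs off those nodes entirely.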

Not sure how this could be approximated in Torque, but perhaps this will give you 
some leads. 

rbw 
_______________________________________________ 
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing 