[Beowulf] MPI OSCAR 3.0 on the BEOWULF cluster

John Bushnell bushnell at ultra.chem.ucsb.edu
Thu Nov 18 03:04:57 PST 2004


One way to control how the nodes are allocated is simply to
use brute force. :-)

One user gave me the tip that you can specify exactly which nodes
to use with something like the following:

#PBS -l nodes=node001+node002+node003+node004

This would allocate one processor on each of the nodes named
"node001", "node002", "node003" and "node004" in this example.  It
should work well if you have the cluster all to yourself for
testing purposes.  Alternatively, you could do something like:

#PBS -l nodes=node001:ppn=2+node002:ppn=2+node003:ppn=2+node004:ppn=2

and then launch only a single process on each node.  That way you grab
both of the CPUs on each machine, so they won't be allocated to
another job during your test run.  But then it would be up to you to
fake out your program if it happens to just swallow the PBS_NODEFILE
(or whatever it's called) to figure out where it wants to run.
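
One way to do that faking (just a rough csh sketch, assuming the usual
sort utility is available and that PBS lists each host once per allocated
processor; "onepernode" and "your_program" are placeholder names) is to
collapse the nodefile down to one entry per host before starting the run:

# Keep only one entry per host, so only one process runs on each node
# even though PBS reserved both CPUs per node.
sort -u $PBS_NODEFILE > onepernode
set NP = `cat onepernode | wc -l`
# If the program insists on reading PBS_NODEFILE itself, point it at
# the reduced list instead:
setenv PBS_NODEFILE `pwd`/onepernode
mpirun -machinefile onepernode -np $NP ./your_program

The scheduler still holds both CPUs on each node, but MPI only runs one
process per machine.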

  Hope this helps  -  John

PS:  The server attribute node_pack may also be of interest, though
somewhat less direct.  I prefer the direct approach.
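
If you do want to poke at node_pack instead, my understanding (assuming an
OpenPBS/Torque-style server and that you have qmgr access) is that it is
set with something like:

qmgr -c "set server node_pack = false"

which asks the scheduler to spread jobs across nodes rather than packing
them onto as few nodes as possible.  For a controlled scalability test,
though, the explicit node lists above leave less to chance.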

On Wed, 17 Nov 2004, Antonio Parodi wrote:

> Good morning,
> I am using a cluster with the following characteristics:
> 
> BEOWULF 
> MPI OSCAR 3.0
> RED HAT 9
> 11 NODES: 1 head node
>           10 subnodes: each subnode has two 2.8 GHz P4 Xeon processors
>           no hyperthreading
>           2 GB dual-channel RAM
>           each subnode has a 200 GB EIDE disk (7200 RPM, 8 MB buffer)
> 
> I want to use this cluster to test the scalability of a numerical code
> using 1, 2, 4, and 8 processors. For example, I would like to test the
> code with 4 processors, but I am not able to force it to use 4 subnodes
> (that is, one processor on each subnode) instead of 2 subnodes (2
> processors on each subnode), which is what the cluster does. That way
> the cluster creates local contention and memory-sharing problems on each
> subnode, degrading the code's performance.
> This surprises me, since I use the following script to run the
> simulations, in which line 3 requests one processor per subnode if
> possible:
>             
> #!/bin/csh
> #PBS -m e
> #PBS -l nodes=4:ppn=1
> #PBS -l walltime=9999:00:00
> #PBS -M user at domain
> #PBS -j oe
> #PBS -o rb.out
> #PBS -N rb
> #PBS
> limit coredumpsize 0
> set NN = `cat $PBS_NODEFILE | wc -l`
> echo "NN = "$NN
> #cd $PBS_O_WORKDIR
> cd /home/antonio/test_paper_numerico/RB1E5
> pwd
> cat $PBS_NODEFILE  > newlist
> date
> time mpirun  -machinefile newlist -np $NN rb > nav2.log
> date
> 
> I hope that someone can help me
> Ciao
> Antonio
> 
> 
