Parallel batch jobs on beowulf?

Eray Ozkural (exa) erayo at cs.bilkent.edu.tr
Tue Oct 2 00:38:47 PDT 2001


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Gary,

On Tuesday 02 October 2001 01:34 am, Gary Stiehr wrote:
> Hi,
>    PBS is perfectly acceptable for this type of use.  Each job can
> request all of the nodes or just a subset of them.  The job is queued
> until the requested number of nodes is available.  You are still able to
> run MPI/PVM programs on the nodes in an interactive way when using PBS
> but this may be confusing/misleading to the users of the cluster.
>

Thanks for your suggestion. I'll look into OpenPBS.

I have one question though. How would I prevent users from starting parallel
jobs while a PBS job is running? Since parallel codes don't run in a flash,
it's quite likely that another user might unknowingly interfere with a
parallel job and perturb its wall-clock performance. Is there a nice way to
'lock' a node when a parallel job starts on it and 'release' it when the job
terminates, so that no other user process can be started on it while the
parallel job is running?

Since all nodes are accessed via rsh, it could be done at that level, or at a
lower level I guess, but I'm uncertain how this should be implemented in a
reliable way (so that it does not disrupt otherwise normal operations).
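
To make the 'lock'/'release' idea concrete, here is a rough sketch of the
semantics I have in mind, written in Python. It is not how PBS or rsh does
anything; it only illustrates an atomic lock file per node. The function
names, the lock path, and the job labels are all made up for illustration,
and something would still have to call these at job start and job end and
make rsh honour the lock.

```python
import os
import tempfile


def acquire_node_lock(lock_path, owner):
    """Try to lock a node by atomically creating a lock file.

    Returns True if the lock was taken, False if someone already holds it.
    O_CREAT | O_EXCL makes the create-if-absent step atomic on a local
    filesystem (hypothetical helper, not part of PBS).
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(owner + "\n")  # record who holds the node
    return True


def release_node_lock(lock_path, owner):
    """Release the lock, but only if `owner` is the recorded holder."""
    try:
        with open(lock_path) as f:
            holder = f.read().strip()
    except FileNotFoundError:
        return False  # nothing to release
    if holder != owner:
        return False  # held by a different job
    os.remove(lock_path)
    return True


if __name__ == "__main__":
    # Demonstration in a temporary directory; a real setup would need a
    # fixed per-node path (an assumption, not specified anywhere above).
    path = os.path.join(tempfile.mkdtemp(), "node.lock")
    print(acquire_node_lock(path, "job-1"))   # first job gets the node
    print(acquire_node_lock(path, "job-2"))   # second job is refused
    print(release_node_lock(path, "job-1"))   # holder releases the node
```

The part I don't know how to do reliably is the enforcement: the lock only
helps if every way of starting a process on the node checks it first.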

>     You can also set PBS to use your nodes in a "time-shared" manner.
> This way, PBS can allocate more than one job to each node.  When you
> need to have only one job per node, you can configure the nodes as
> "exclusive" nodes.  I hope this helps.

It seems I'd use exclusive nodes. Could I also allocate them for a specific
time interval? For instance, having batch jobs run only at night... But I
don't know if that would be a good solution.
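
By "a specific time interval" I mean something like the check below, a small
Python sketch only. The function name and the 22:00 to 06:00 window are
assumptions picked for illustration; I don't know yet how (or whether) PBS
expresses such a window itself.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` falls in the half-open window [start, end).

    Handles windows that wrap past midnight, e.g. 22:00 to 06:00.
    (Hypothetical helper for illustration, not a PBS feature.)
    """
    if start <= end:
        # Same-day window, e.g. 09:00 to 17:00
        return start <= now < end
    # Overnight window: late evening or early morning
    return now >= start or now < end


if __name__ == "__main__":
    night_start, night_end = time(22, 0), time(6, 0)  # assumed batch window
    for t in (time(23, 30), time(3, 0), time(12, 0)):
        print(t, in_window(t, night_start, night_end))
```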

Thanks,

- -- 
Eray Ozkural (exa) <erayo at cs.bilkent.edu.tr>
Comp. Sci. Dept., Bilkent University, Ankara
www: http://www.cs.bilkent.edu.tr/~erayo
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7uW8IfAeuFodNU5wRAlutAJ9Wgh9MZQ3hrHdXk1YXp3X/cg+0mwCfflp3
sBKq6HiRlooC/8AsjoLWLt4=
=H78k
-----END PGP SIGNATURE-----



