[scyld-users] bpsh in background: defunct processes

Donald Becker becker at scyld.com
Mon Sep 29 18:32:10 PDT 2003


On Mon, 29 Sep 2003, Anand Bedekar wrote:

> I'm trying to run bpsh in a script that calls bpsh in
> a loop, like this:
> 
> for i in 1 2 3
> do
>     bpsh -n $i run.sh &
> done

Suggestion: you should be using 'beomap' to get a dynamic schedule:

for i in `beomap --np 3`; do ...
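
A fuller sketch of that loop, for illustration only. It assumes beomap prints the node numbers separated by colons or whitespace (check the output on your release), and split_nodes and run_on_nodes are hypothetical helper names, not part of any Scyld tool. The trailing 'wait' makes the script collect the exit status of every background bpsh before it exits:

```shell
# Hypothetical helper: turn a node list such as "0:1:2" or "0 1 2"
# into one node number per line.
split_nodes() {
    printf '%s\n' "$1" | tr ': ' '\n\n' | grep -v '^$'
}

# Hypothetical helper: start run.sh on each node in the background,
# then block until every background bpsh has exited.
run_on_nodes() {
    for i in $(split_nodes "$1"); do
        bpsh -n "$i" run.sh &
    done
    wait
}

# On a cluster this would be called as:
#   run_on_nodes "$(beomap --np 3)"
```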

> What happens is that all the processes called within
> run.sh seem to go into a "defunct" state without
> finishing cleanly. This is making the process table
> fill up, so that no more processes can be run. 

This sounds like a long-fixed bug in BProc.  The status and
termination messages were being processed in reverse order.
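
To see how many defunct processes have piled up, you can filter the state column of ps: a state beginning with Z marks a process that has exited but has not yet been reaped by its parent. A minimal sketch using standard ps and awk follows; list_defunct is a hypothetical helper name:

```shell
# Hypothetical helper: read "pid ppid stat comm" lines on stdin and
# print only the lines whose state field starts with Z (zombie/defunct).
list_defunct() {
    awk '$3 ~ /^Z/ { print }'
}

# On a live system:
#   ps -eo pid,ppid,stat,comm | list_defunct
```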

> Is this usual behaviour when calling bpsh to run a
> shell script, given the way I am calling 
> 'bpsh -n $i run.sh &' ? Is there some other way to run
> it? 

Our new release includes a command named 'beorun' that automatically
combines the scheduler mapping with efficient control of the
resulting processes:
   beorun --np 3 command

> Unfortunately all the nodes in the cluster are
> currently out of action because the process table is
> full on all of them, due to the above.

You should be able to restart the cluster nodes in about a second...

> So I can't report on which version of scyld has been installed,
> until the sysadmin reboots the whole thing. I do know
> the machines are P3 running RedHat 7.0, kernel version
> 2.2.19.

That doesn't sound like a Scyld release.

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993



