how to tell when jobs are finished

Pedro Díaz Jiménez pdiaz88 at terra.es
Wed Aug 1 11:50:11 PDT 2001


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I agree with Sean about the use of waitpid(). As for the daemon, I don't think
it is necessary. If I haven't misunderstood you, what you want to do is execute
a certain number of programs and know when any one of those programs has exited.
Here is my proposal, in pseudo-code form:

1. Read the list of programs to execute from somewhere.
2. For each program to run, create a child using fork() (the master
creates all the children).
3. (Optional) You may want to redirect each child's output to a file.
4. In the master, record each child's pid (fork() returns it to the parent,
so no extra IPC is needed) and store the pids in an array or similar (I would
use a linked list, or a search tree if you will have lots of pids).
5. Finally, each child calls exec*() and replaces its memory image with the
desired program; that is, it executes the program.
6. Now you have to know when each of the programs you started has exited.
For simplicity, let's assume you want to printf something like "Hey!, PID
XXXX finished!". You can do this in two ways (to my knowledge):
	a) Loop until all the programs have exited, using waitpid() with WNOHANG
to poll each pid. Advantage: simple. Disadvantage: you can't do other
productive things while waiting.
	b) Install a handler for the alarm signal (SIGALRM) and test, say, once a
second for completion of any of the pids in your list. When one has completed,
print the message and remove that pid from the list. Disable the alarm when
the list is empty and re-enable it when the list has at least one element.
Advantage: you can do other productive things, like launching more processes.
Disadvantage: a little more complicated. If you use this option, see
sigaction(2) and signal(2).


I hope I didn't misunderstand you and that the above helps.

Cheers
Pedro

On Wednesday 01 August 2001 15:52, Nicholas Henke wrote:
> I think that might work--- what I am trying to do is start the job via
> bpsh or brexec ( to be determined...), from each I can get the pid. I am
> wondering what is the RightWay(tm) to tell that the job is no longer
> executing. I think wait may work, but I wonder about the scalability of
> that. Am I right in assuming that I would need to have a daemon that
> monitors the entire list of pids at a certain interval?
>
> Thanks :-)
> Nic
>
> On Wed, 1 Aug 2001, Sean Dilda wrote:
> > On Wed, 01 Aug 2001, Nicholas Henke wrote:
> > > Hello--
> > > 	I am writing a resource manager based on top of bproc, and I am
> > > working on job execution. I am wondering if anyone has any ideas on how
> > > to tell when a job is finished executing. The only solution that I have
> > > thought of is to wrap the command in a shell script that tells the
> > > resource manager that the job is done executing.
> > >
> > > Any help would be greatly appreciated
> >
> > I'm going to assume you're wanting to know how to do this from a
> > programming level.  My advice is to save the pid of the job, then use
> > waitpid() with the WNOHANG option to check if the job has finished or
> > not.  'man 2 wait' for more information on using waitpid().
> >
> > If this isn't what you're looking for, please give me more information
> > on what exactly you are trying to do and I'll try to help you out.

- -- 

  __________________________________________________
 /                                                  \
 | Pedro Diaz Jimenez                               |
 |                                                  |
 | pdiaz88 at terra.es      pdiaz at acm.asoc.fi.upm.es   |
 |                                                  |
 |                                                  |
 | http://planetcluster.org                         |
 | Clustering & H.P.C. news and documentation       |
 |                                                  |
 | There are no stupid questions, but there're a    |
 | lot of inquisitive idiots                        |
 |        Anonymous                                 |
 |                                                  |
 | "I find your lack of faith disturbing."          |
 |        Darth Vader, Star Wars Episode IV         |
 \__________________________________________________/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7aE9pnu53feEYxlERAuPWAKCLw1kkSXPx+wkTLjJmXD5l66brEgCgmXBf
buzfDFpqPt6aOIq8hQumKM8=
=us/v
-----END PGP SIGNATURE-----



