Running two MPI jobs simultaneously
Eray Ozkural
eozk at bicom-inc.com
Tue Dec 10 01:10:49 PST 2002
On Tuesday 10 December 2002 03:56 pm, Miska Le Louarn wrote:
>
> BUT when I try to run these two programs at the same time, one of them
> hangs. It just stops doing anything and sits there without crashing
> until the other program is completed. Then it starts to work again.
Is this with LAM?
I'm not sure, but I think I have seen this behavior a couple of times.
[snip]
> So does anybody have any idea why this is ? Is it a Linux scheduler
> "feature" related to the network communication between the nodes (if I
> launch 2 non-MPI jobs, I get the standard slow-down) ? Or maybe
> interference inside MPI between the two processes ?
>
> Any tests I could do to see what is going on ?
>
I guess you could turn on the debugging features of the MPI
implementation you're using and see what really happens. I would also
write a very simple program that replicates the bug in a simpler setting,
for example one that sends a dummy message back and forth many times. Then
check whether multiple simultaneous instances of that program show the
same behavior.
Thanks,
--
Eray Ozkural <eozk at bicom-inc.com>
Software Engineer, BICOM Inc.
GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C