Running two MPI jobs simultaneously

Eray Ozkural eozk at bicom-inc.com
Tue Dec 10 01:10:49 PST 2002


On Tuesday 10 December 2002 03:56 pm, Miska Le Louarn wrote:
>
> BUT when I try to run these two programs at the same time, one of them
> hangs. It just stops doing anything and sits there without crashing
> until the other program is completed. Then it starts to work again.

Is this LAM?

I'm not sure, but I think I have seen this behavior a couple of times.

[snip]

> So does anybody have any idea why this is ? Is it a Linux scheduler
> "feature" related to the network communication between the nodes (if I
> launch 2 non-MPI jobs, I get the standard slow-down) ? Or maybe
> interference inside MPI between the two processes ?
>
> Any tests I could do to see what is going on ?
>

I guess you could try turning on the debugging features of the MPI
implementation you're using to see what really happens. I would also write a
very simple program that replicates the bug in a simpler environment, for
example one that sends a dummy message back and forth many times. Then you
should check whether multiple instances of this program, running at the same
time, show similar behavior.

Thanks,

-- 
Eray Ozkural <eozk at bicom-inc.com>
Software Engineer, BICOM Inc.
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C



