Trouble with Bonding Broken MPI

Ricardo hraa at
Wed Jul 3 05:17:26 PDT 2002


Take a look at the MPI machines file and verify that the host names
in it point to the right network.
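
One way to do that check, as a rough sketch only: resolve every host name
in the machines file and flag any that do not land in the subnet carried
by bond0. The file layout is an assumption (one host per line, optionally
"host:ncpus" as in MPICH or "host cpu=N" as in a LAM boot schema, with
"#" comments), and the function names, file path, and subnet below are
made up for illustration.

```python
import ipaddress
import socket


def parse_hosts(machines_text):
    """Return the host names listed in a machines file, one per line."""
    hosts = []
    for line in machines_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        # Assumed layout: the first token is the host; strip a ":ncpus" suffix.
        hosts.append(line.split()[0].split(":", 1)[0])
    return hosts


def hosts_outside_network(machines_text, network_cidr,
                          resolve=socket.gethostbyname):
    """Return (host, address) pairs that do not resolve into network_cidr.

    The address is None when the name cannot be resolved at all.
    """
    network = ipaddress.ip_network(network_cidr)
    outside = []
    for host in parse_hosts(machines_text):
        try:
            address = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            outside.append((host, None))
            continue
        if ipaddress.ip_address(address) not in network:
            outside.append((host, address))
    return outside


if __name__ == "__main__":
    # Hypothetical path and subnet; substitute your own.
    with open("machines.LINUX") as handle:
        for host, address in hosts_outside_network(handle.read(),
                                                   "192.168.1.0/24"):
            print(host, "->", address)
```

Any host it prints is one that mpirun would reach over some address
other than the bonded one (or not at all), which may be worth ruling out
before digging into the switches.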

On Tue, 25 Jun 2002, Todd Broucksou wrote:

> All,
>   I have a Red Hat 7.2 (kernel 2.4.9-31) cluster of 8 Athlon MPs. I recently
> added a second 3c590 NIC to bring up channel bonding. According to ifconfig
> -a, bond0 is up. I can ping across it without any packet loss, and I can ftp
> within the cluster without any loss. But mpirun is broken under both LAM and
> MPICH. I can add servers with lamboot or pg4 but cannot run mpirun; I get a
> network error. LAM and MPICH both worked before the "upgrade" to channel
> bonding.
>   I have tried two separate Cisco switches, and one Cisco switch running
> EtherChannel; mpirun-type programs still do not work.
>   I have even tried recompiling MPICH, without success.
>   Any help would be appreciated.

More information about the Beowulf mailing list