Trouble with Bonding Broken MPI
Ricardo
hraa at lncc.br
Wed Jul 3 05:17:26 PDT 2002
Hi,
Take a look at the MPI machines file and verify that the host names
in it point to the right network.
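A rough sketch of that check in Python, in case it helps. It assumes a machines file with one host per line, '#' comments, and an optional ":N" or " cpu=N" suffix. The file path and the bonded subnet in the usage part are placeholders, not values from your cluster, and the function names are only illustrative:

```python
import ipaddress
import socket


def parse_machines(text):
    """Return the host names from a machines-file style listing.

    Assumed format: one host per line, '#' starts a comment, and the
    host may carry a ':N' or ' cpu=N' suffix, which is dropped.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        first = line.split()[0]
        hosts.append(first.split(":", 1)[0])
    return hosts


def check_hosts(hosts, network, resolve=socket.gethostbyname):
    """Resolve each host and report whether it lies inside `network`.

    Returns a list of (host, address_or_None, in_network) tuples.
    """
    net = ipaddress.ip_network(network)
    report = []
    for host in hosts:
        try:
            addr = resolve(host)
        except OSError:
            # Name did not resolve at all.
            report.append((host, None, False))
            continue
        report.append((host, addr, ipaddress.ip_address(addr) in net))
    return report


if __name__ == "__main__":
    # Hypothetical path and subnet: replace with your own.
    with open("machines.LINUX") as fh:
        names = parse_machines(fh.read())
    for host, addr, ok in check_hosts(names, "192.168.1.0/24"):
        print(host, addr, "ok" if ok else "NOT on the bonded network")
```

Any host reported as not on the bonded network is a name that resolves to an address outside the subnet you gave, or does not resolve at all.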
On Tue, 25 Jun 2002, Todd Broucksou wrote:
> All,
> I have a Red Hat 7.2, kernel 2.4.9-31 cluster on 8 Athlon MPs. I recently
> added a second 3c590 NIC to bring up channel bonding. According to ifconfig
> -a my bond0 is up. I can ping across without any packet loss, and I can ftp
> internal to the cluster without any loss. But both the LAM and MPICH
> mpirun are broken. I can add servers with lamboot or pg4 but cannot run
> mpirun. I get a network error. LAM and MPICH did work before the "upgrade" to
> channel bonding.
> I have tried two separate Cisco switches and one Cisco switch running
> EtherChannel; mpirun-type programs still do not work.
> I have even tried recompiling MPICH, without any change.
> Any help would be appreciated.
>