Trouble with Bonding Broken MPI


Todd Broucksou tbroucks at cnidr.org
Tue Jun 25 11:17:16 PDT 2002


All,
  I have a RedHat 7.2 (kernel 2.4.9-31) cluster of 8 Athlon MPs. I recently
added a second 3c590 NIC to bring up channel bonding. According to
ifconfig -a, bond0 is up. I can ping across it without any packet loss, and
I can ftp within the cluster without any loss. But mpirun is broken under
both LAM and MPICH. I can add servers with lamboot or pg4, but I cannot run
mpirun; I get a network error. Both LAM and MPICH worked before the
"upgrade" to channel bonding.
  I have tried two separate Cisco switches, as well as one Cisco switch
running EtherChannel; mpirun-type programs still do not work.
  I have even tried recompiling MPICH, without success.
  Any help would be appreciated.




More information about the Beowulf mailing list