[Beowulf] tcp error: Need ideas!

Kilian CAVALOTTI kilian.cavalotti.work at gmail.com
Thu Jan 22 01:15:18 PST 2009


Hi Gerry,

On Wednesday 21 January 2009 23:40:26 Gerry Creager wrote:
> History/background/description of the cluster
> * 126 node Dell 1950 cluster with dual-quad core Xeons
> * bnx2 module loaded for the Broadcom onboard nics

> Received disconnect from 192.168.200.154: 2: Bad packet length 808464432.
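
As an aside, that length looks like text rather than a real length. SSH 
reads the first four bytes of a packet as a big-endian 32-bit integer, and 
808464432 is 0x30303030, i.e. the four ASCII characters "0000". A minimal 
Python sketch to check this (the helper name is made up for illustration); 
it only decodes the number and says nothing about where the stray bytes 
come from:

```python
import struct


def length_as_bytes(length):
    """Return the 4 bytes of a 32-bit big-endian length field."""
    return struct.pack(">I", length)


raw = length_as_bytes(808464432)
print(hex(808464432), raw)  # prints: 0x30303030 b'0000'
```

So something that looks like payload data ended up where the packet header 
was expected, which would be consistent with corruption somewhere below ssh.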

It may also be worth making sure you're using the latest bnx2 version. 
Running "modinfo bnx2" should give you that, the latest one being 1.7.6b3.
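
If you want to collect that from all 126 nodes, something along these lines 
could be run on each of them. It is only a sketch: the function name is 
hypothetical, and it assumes Python 3.7 or later and a modinfo that supports 
the -F option to print a single field.

```python
import subprocess


def module_version(module="bnx2"):
    """Return the version string modinfo reports, or None if unavailable."""
    try:
        result = subprocess.run(
            ["modinfo", "-F", "version", module],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # modinfo is not installed or could not be executed
        return None
    if result.returncode != 0:
        # modinfo does not know this module
        return None
    return result.stdout.strip() or None


print("bnx2 version:", module_version())
```

Note that modinfo describes the module file on disk, so a node that has not 
reloaded the driver since an upgrade may still be running an older version.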

I've been using those PE1950s for a while, and have had my share of weird 
issues with them, including complete kernel panics under high traffic load. 
Upgrading to the Dell-provided bnx2 version proved helpful. 

If you use SuSE or a Redhat-ish distro, the Dell Linux repository 
<http://linux.dell.com/repo/hardware/> may be useful, and upgrading the module 
version would be extremely simple to handle with the help of DKMS.

Cheers,
-- 
Kilian


