[Beowulf] HP 2848 switch woes

Bill Wichser bill at Princeton.EDU
Mon Jan 17 05:54:16 PST 2005


Well, I submitted this about a week ago!  Not very timely...  Maybe it's 
just my mailer seeing it this morning for the first time.

The cause, found with help from the CentOS mailing list, is that the 
HP switches come preconfigured with LACP active on every port.  When the 
network interface is reset, the extra delay on the switch port causes 
the node's request to be lost, so the head node never sees it and never 
responds.

The solution is to disable LACP on every port.  From the switch's 
configuration command-line interface:

no int all lacp
write mem

And then the switch functions fine.
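
For anyone repeating this, here is a rough sketch of the whole session 
with the commands spelled out in full.  It assumes the usual ProCurve 
command set: "configure" to enter the config context and "show lacp" to 
list the per-port LACP state.  I have not checked either of those against 
firmware I.08.55, so treat them as assumptions and adjust as needed:

configure
no interface all lacp
write memory
show lacp

If "show lacp" is available, it should no longer report LACP as enabled 
on the ports.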

Bill


Bill Wichser wrote:
> Trying to install a new cluster of Tyan 2881 motherboards with CentOS 
> 3.3, kernel 2.4.21-27.0.1.ELsmp (Opteron).
> 
> When running through this switch (Firmware:I.08.55, ROM:I.08.04), the 
> system is forced to do a manual install as a failure occurs in what I 
> believe is the initial discovery phase after the kernel boots.
> 
> When a direct connection is made to the head node, everything proceeds 
> as normal.
> 
> During the initial booting, after PXE, the system sends a request out to 
> the network asking for the IP address that matches its MAC address.  
> Right before this time, the network card appears to be reset by the OS.  
> This appears to be the normal progression from within the kernel.
> 
> On a direct cable, the rarp is seen and the compute node receives the 
> info via the head node, right after the network card is reset.  Through 
> the switch though, the rarp is never seen by the head node.
> 
> At first I thought it was something with autodetection and so set the 
> switch up for just Gig.  It certainly isn't the case that rarps don't 
> work as the initial tftp boot works fine, the vmlinuz is downloaded and 
> booting proceeds.  Only after the network card is reset during the boot 
> phase does communication somehow fail.
> 
> I've set the timeout in the switch for 15 minutes, made sure spanning 
> tree was off, connected the cables to adjacent ports, all to no avail.
> 
> If anyone has any suggestions I am all ears as I have run out of ideas 
> at this point.  HP just suggests updating the firmware, which I have 
> done to no avail.
> 
> Thanks,
> 
> Bill
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf




