[Beowulf] Help with inconsistent network performance

Patrick Geoffray patrick at myri.com
Tue Dec 18 15:21:35 PST 2007


Hi Joe, Brendan

Joe Landman wrote:
>> Since it is a full duplex switched network, there should not be any
>> collisions happening.  Since the image is less than 1 MB total, I don't
> 
> There could be blocking ...  if one unit grabs the single network pipe 
> of the display node while another node tries to send data, then the 
> late node will back off (well with TCP it will) in a pre-determined manner.

It definitely looks like ordinary switch contention (an N->1 pattern). 
However, how TCP reacts depends on how the switch itself handles 
contention. If hardware flow control is turned off, packets are dropped 
in the switch and TCP quickly shrinks its send window: big hiccup. If 
hardware flow control is turned on, the sending NICs are paused and 
(hopefully) no packets are dropped. TCP is not aware of the 
backpressure, and the send window may even grow a bit because of the 
pausing delay: no big hiccup.
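
To make the two cases concrete, here is a toy model in Python. It is 
only a sketch under loud assumptions: time is cut into ticks, every 
sender offers one frame per tick, the congested egress port drains one 
frame per tick, and the switch buffer holds a fixed number of frames. 
The function name, its parameters and the numbers are made up for 
illustration. It does not model TCP (no windows, no retransmits) or the 
Procurve; it only shows where frames get lost or held back.

```python
def simulate(n_senders, frames_per_sender, buffer_frames, flow_control):
    """Toy N->1 contention model (hypothetical, not any real switch).

    Returns counts of delivered and dropped frames, how many times a
    sender was held back, and the number of ticks until all is drained.
    """
    remaining = [frames_per_sender] * n_senders  # frames still at each sender
    queue = 0          # frames sitting in the egress buffer
    delivered = 0
    dropped = 0
    paused = 0         # sender-ticks spent waiting because of backpressure
    ticks = 0

    while any(remaining) or queue:
        ticks += 1
        for i in range(n_senders):
            if remaining[i] == 0:
                continue
            if queue < buffer_frames:
                queue += 1            # frame accepted into the buffer
                remaining[i] -= 1
            elif flow_control:
                paused += 1           # sender is paused, frame stays at the sender
            else:
                dropped += 1          # buffer full: frame is lost in the switch
                remaining[i] -= 1     # (real TCP would have to retransmit it)
        if queue:
            queue -= 1                # egress port sends one frame per tick
            delivered += 1

    return {"delivered": delivered, "dropped": dropped,
            "paused": paused, "ticks": ticks}


if __name__ == "__main__":
    for fc in (False, True):
        print("flow control on " if fc else "flow control off",
              simulate(n_senders=8, frames_per_sender=100,
                       buffer_frames=16, flow_control=fc))
```

With flow control off, the model loses every frame that arrives at a 
full buffer, which is what TCP would see as loss. With flow control on, 
nothing is lost and the senders simply wait, which TCP would only see 
as extra delay.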

I don't know how hardware flow control is implemented in the Procurve 
2848; it may simply be off by default, as on most Ethernet switches. 
FWIW, the 10GigE Procurve switch that I have played with had no working 
hardware flow control, even when it was turned on.

Patrick


