[Beowulf] Help with inconsistent network performance

Patrick Geoffray patrick at myri.com
Tue Dec 18 18:05:41 PST 2007


Hi Greg,

Greg Lindahl wrote:
> ethtool -a eth0
> 
> and it says RX/TX pause are on, doesn't that mean that the switch
> supports it?

No, it just means the NIC supports it. RX means that the NIC will send
PAUSE packets if the host does not consume packets fast enough (rare),
and TX means that the NIC will stop sending when it receives a PAUSE
packet (more likely). Both are independent of the switch's flow-control
settings.

> My dumb Netgear 24-port 1-gig switch supports hw flow control.  Sounds
> like things are a bit more difficult with low-end 10gigE ports.

For RX hardware flow-control, you need enough buffer space to hold one
full frame plus the data in flight over the latency of the longest wire,
for every port. It is a bit more expensive to do with 10GigE, because
you need faster memory and more of it. Some recent 10GigE chips use a
shared SRAM buffer that is not big enough for the worst case with 9K
packets: it works fine as long as only a few ports are blocked, but
beyond that it happily collapses and drops packets.
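
As a rough sketch of the sizing rule above: the function below adds one
maximum-size frame to the bytes still in flight on the wire. The numbers
are assumptions chosen for illustration, not figures for any particular
switch: about 5 ns per metre of propagation delay, a 100 m cable, 24
ports, and the round trip (not just one way) counted as the wire latency.
The function name and parameters are hypothetical.

```python
def pause_headroom_bytes(line_rate_bps, max_frame_bytes, cable_m, ns_per_m=5):
    """Per-port buffer needed after a PAUSE is sent (illustrative model).

    Assumes the sender may have just started a full frame, and that data
    keeps arriving for one round trip on the wire before the PAUSE takes
    effect. ns_per_m=5 is an assumed propagation delay, not a measured one.
    """
    round_trip_ns = 2 * cable_m * ns_per_m
    bits_in_flight = line_rate_bps * round_trip_ns  # scaled by 1e9, divided below
    divisor = 8 * 10**9                             # bits -> bytes, ns -> s
    bytes_in_flight = (bits_in_flight + divisor - 1) // divisor  # round up
    return max_frame_bytes + bytes_in_flight


if __name__ == "__main__":
    per_port = pause_headroom_bytes(10 * 10**9, 9000, 100)
    print("per port:", per_port, "bytes")
    print("24 ports:", 24 * per_port, "bytes")
```

Under these assumptions the 9K frame dominates the per-port figure, and
the total scales with the number of ports that can be blocked at once,
which is where a shared buffer sized for the common case falls short.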

Flow-control is not for everyone, and that's why it is often turned off
by default. When a sender is paused, it stops sending everything,
including packets for other destinations. Dropping packets is expensive
to recover from, but it keeps things moving.
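
The trade-off can be illustrated with a toy model. This is only a sketch
under strong simplifying assumptions: one packet slot per tick, no
buffering in the switch, and a single destination that is congested for
a fixed number of ticks. The function name and its parameters are
hypothetical.

```python
from collections import deque


def deliver(packets, blocked_dest, congested_ticks, total_ticks, flow_control):
    """Toy model of one sender facing a temporarily congested destination.

    packets is the sender's FIFO queue of destination names. While
    tick < congested_ticks, blocked_dest cannot accept packets.
    Returns (delivered, dropped) as lists of destination names.
    """
    pending = deque(packets)
    delivered, dropped = [], []
    for tick in range(total_ticks):
        if not pending:
            break
        congested = tick < congested_ticks
        if congested and pending[0] == blocked_dest:
            if flow_control:
                # Paused: the head packet stays queued and everything
                # behind it waits, whatever its destination.
                continue
            # No flow control: the packet is lost, the queue keeps moving.
            dropped.append(pending.popleft())
            continue
        delivered.append(pending.popleft())
    return delivered, dropped


if __name__ == "__main__":
    queue = ["A", "B", "A", "A"]
    print(deliver(queue, "B", 3, 4, flow_control=True))
    print(deliver(queue, "B", 3, 4, flow_control=False))
```

In this model the paused sender gets one packet to A through during the
four ticks and loses nothing, while the unpaused sender delivers all
three packets for A and leaves the one dropped packet for B to be
recovered by a higher layer.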

Patrick


