[Beowulf] raw ethernet
Keith D. Underwood
kdunder at sandia.gov
Sat Jul 24 06:41:00 PDT 2004
> are connected by a point-to-point gigabit link). I'm interested in
> understand why I have a packet loss (interrupt management, full rx ring,
> rx buffer overflow, each of them combined?) Now I'm trying to modulate the
You lose packets because you can. That sounds sarcastic, but it is the
very unfortunate reality. I have a bit of low-level GigE experience,
and I am pretty sure that is the primary reason.
For example, say you have a high end GigE switch that guarantees sub-5
microsecond LIFO latency. Now, assume you have enabled flow control
(yes, flow control was defined in the GigE spec) and that you know the
cards have a decent amount of buffer. Next, simultaneously send 35
packets to one destination with origins distributed evenly among 7
sources. Want to know what will happen? Most of those packets will get
dropped. Why? The switch should have enough buffer to handle that.
Well... as best I could tell, the answer is that Ethernet can drop
packets (it is defined as an unreliable protocol) and the switch
recognized a momentary flood that prevented it from meeting latency
guarantees, so it started dropping packets. Because the protocol is
inherently unreliable, "drop" is a valid design decision at any point
along the path.
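
To put rough numbers on the scenario above, here is a toy model in Python. It is only a sketch under stated assumptions: the 1500-byte frame size, the idea that all 35 packets arrive at the same instant, and the drop rule itself (discard any frame that would wait longer than the latency budget) are illustrative guesses, not the documented behavior of any real switch. The function name simulate_burst is made up for this example.

```python
# Toy model only: the drop policy below is an assumption made for
# illustration, not the documented behavior of any real switch.

LINK_BPS = 1_000_000_000   # gigabit egress port
FRAME_BYTES = 1500         # assumed frame size (not stated in the post)


def simulate_burst(n_packets, latency_budget_us, frame_bytes=FRAME_BYTES):
    """Return (delivered, dropped) for n_packets arriving at the same instant.

    Assumed policy: a frame is dropped if it would have to wait in the
    egress queue longer than latency_budget_us before transmission starts.
    """
    # Time one frame occupies the wire, in microseconds (12 us for 1500 bytes).
    serialization_us = frame_bytes * 8 / LINK_BPS * 1e6
    delivered = 0
    dropped = 0
    port_free_at_us = 0.0   # when the egress port finishes its current queue
    for _ in range(n_packets):
        wait_us = port_free_at_us   # every packet arrives at t = 0
        if wait_us > latency_budget_us:
            dropped += 1
        else:
            delivered += 1
            port_free_at_us += serialization_us
    return delivered, dropped


# 35 packets (5 from each of 7 sources) against a 5 microsecond budget.
print(simulate_burst(35, 5.0))      # (1, 34)
# The same burst with a budget large enough to queue everything.
print(simulate_burst(35, 1000.0))   # (35, 0)
```

Under these assumptions a full-size frame holds the port for about 12 microseconds, so the last of 35 queued frames would wait roughly 400 microseconds. A switch that refuses to exceed a 5 microsecond budget has little choice but to drop most of the burst, however much buffer it has.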