[Beowulf] Re: typical latencies for gigabit ethernet

Patrick Geoffray patrick at myri.com
Mon Jun 29 15:22:23 PDT 2009


Dave, Scott,

Dave Love wrote:
> Scott Atchley <atchley at myri.com> writes:
>
>> When I test Open-MX, I turn interrupt coalescing off. I run  
>> omx_pingpong to determine the lowest latency (LL). If the NIC's driver  
>> allows one to specify the interrupt value, I set it to LL-1.

Note that tuning the coalescing delay against the ping-pong latency is 
only meaningful for ping-pong-style traffic. To optimize for all latency 
cases, you just want interrupt coalescing to be off.
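
On Linux that usually means the ethtool coalescing parameters; something 
along these lines, with eth0 just as an example and the exact meaning of 
0 being driver-dependent (some drivers want rx-frames 1 instead):

  ethtool -c eth0                              # show current coalescing settings
  ethtool -C eth0 adaptive-rx off rx-usecs 0   # turn rx coalescing off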

> results apart, probably, from the minor kernel version.  If I set
> rx-frames=0, I see this:
> 
> rx-usec    latency (µs)
> 20         34.6
> 12         26.3
> 6          20.0
> 1          14.8
> 
> whereas if I just set rx-frames=1, I get 14.7 µs, roughly independently
> of rx-usec.  (Those figures are probably ±∼0.2µs.)

rx-usecs specifies the minimum time between interrupts, whereas 
rx-frames specifies the number of frames (packets) between interrupts. 
So, if you set rx-frames to 1, there will be an interrupt after each 
packet. Not many devices implement rx-frames, since counting frames does 
not distinguish between small and large ones. Adaptive coalescing 
methods do look at the size of the frames to figure out whether the 
traffic is mostly latency- or bandwidth-sensitive, but it's just a guess.
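
For concreteness, assuming a driver that honours both knobs through 
ethtool (eth0 is a placeholder, and plenty of drivers silently ignore 
rx-frames):

  ethtool -C eth0 rx-usecs 20              # at most one rx interrupt every 20 us
  ethtool -C eth0 rx-usecs 0 rx-frames 1   # interrupt after every received frame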

>> The downside is lower throughput for large messages on 10G Ethernet. I  
>> don't think it matters on gigabit.
> 
> It doesn't affect the ping-pong throughput significantly, but I don't
> know if it has any effect on the system overall (other cores servicing
> the interrupts) on `typical' jobs.

On GigE, each 1500-byte frame takes more than 10us on the wire, so even 
with interrupt coalescing turned off, you won't get more than 100K 
interrupts per second. It used to be a problem, but it's no big deal on 
recent machines. However, you can get a lot more interrupts when 
receiving smaller packets, although the interrupt overhead itself would 
limit the interrupt load to well below 1 million per second. In the 
worst case, you lose one core if you don't let the OS move the interrupt 
handler around to balance the load. What is one core these days? :-)
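
(Back-of-the-envelope for the 100K figure, ignoring the preamble and 
inter-frame gap, which only add a few percent:

  1500 bytes * 8       = 12,000 bits per frame
  12,000 bits / 1 Gb/s = 12 us of wire time per frame
  1 s / 12 us         ~= 83K full-size frames, hence at most ~83K
                         interrupts, per second)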

Patrick


