[Beowulf] Performance tuning for Jumbo Frames

Bogdan Costescu bcostescu at gmail.com
Mon Dec 14 02:59:21 PST 2009

On Sat, Dec 12, 2009 at 7:59 AM, Rahul Nabar <rpnabar at gmail.com> wrote:
> I have seen a considerable performance boost for my codes by using
> Jumbo Frames. But are there any systematic tools or strategies to
> select the optimum MTU size? I have it set as 9000.

I have played with this several times as well and got variable
results. At one time, memory allocation proved to be the limiting
factor: because the page size was 4K, a packet with an MTU smaller
than that would fit into one page, while a packet of 9000 bytes would
require 3 contiguous pages, making the search for free memory more
time consuming. When plotting the bandwidth vs. MTU, it peaked at just
below 4K, so an increased MTU was beneficial compared with the default
of 1500 bytes, but only as long as the packet fit in one page. At
another time, the switch was more likely to drop large frames under
high load (maybe something to do with internal memory management), so
the 9000-byte frames worked most of the time while the 1500-byte ones
worked all the time... At yet another time, the high interrupt load
generated by the 1500-byte frames would make the computer unstable
(probably an Athlon MP based system, but memory is fuzzy), so larger
frames and/or interrupt coalescing were the only way to actually use
that computer.
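
As a rough illustration of the page-size and interrupt-load arguments
above, here is a minimal Python sketch. It assumes 4 KiB pages, a
receive buffer of MTU plus 18 bytes of Ethernet header and FCS (real
drivers add their own padding and alignment), a 1 Gb/s link, and one
interrupt per frame with no coalescing. The function names are made up
for this example.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB pages, as in the case described
L2_OVERHEAD = 14 + 4    # Ethernet header + FCS stored with the payload
WIRE_OVERHEAD = 38      # header 14 + FCS 4 + preamble/SFD 8 + inter-frame gap 12


def pages_per_frame(mtu, page_size=PAGE_SIZE):
    """Contiguous pages needed to hold one full-size frame (simplified)."""
    buffer_bytes = mtu + L2_OVERHEAD
    return -(-buffer_bytes // page_size)   # integer ceiling division


def frames_per_second(mtu, link_bps=1_000_000_000):
    """Full-size frames per second at line rate.

    Without interrupt coalescing this is also an upper bound on the
    interrupt rate the host has to handle.
    """
    return link_bps / (8 * (mtu + WIRE_OVERHEAD))


if __name__ == "__main__":
    for mtu in (1500, 4000, 9000):
        print(mtu, pages_per_frame(mtu), round(frames_per_second(mtu)))
```

Under these assumptions an MTU just below 4K still fits in a single
page while cutting the frame rate to well under half of what 1500-byte
frames produce, which is consistent with the bandwidth peak mentioned
above.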

The MTU can be set to values higher than 9000 bytes if all the
components involved support it - switch, network cards and driver. I
remember seeing, 2-3 years ago, an MTU of 16K for some network
equipment - but again the memory is fuzzy on what equipment that was -
so it's definitely possible to go higher than 9000 bytes. Setting too
large an MTU usually shows up in bandwidth testing - if frames above a
certain size are dropped or only partly transferred, there will be
retransmissions and the useful bandwidth will drop significantly (and
for a trained eye, the statistics of the network driver and stack will
provide clues as well).
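
One way to look at those driver statistics systematically is to
snapshot the interface counters before and after a bandwidth run and
see which ones moved. The sketch below assumes a Linux host, where the
per-interface counters are exposed as text files under
/sys/class/net/<iface>/statistics; the helper names and the choice of
counters are mine, and a given driver may report drops elsewhere
(e.g. in its own ethtool statistics).

```python
from pathlib import Path

# Standard Linux per-interface counters; an illustrative selection only.
COUNTERS = ("rx_dropped", "rx_errors", "rx_length_errors",
            "tx_dropped", "tx_errors")


def read_counters(iface, root="/sys/class/net"):
    """Return the current value of each counter for one interface."""
    stats_dir = Path(root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTERS}


def counter_deltas(before, after):
    """Return only the counters that changed between two snapshots."""
    return {name: after[name] - before[name]
            for name in before
            if after[name] != before[name]}


# Typical use (the interface name "eth0" is an assumption):
#   before = read_counters("eth0")
#   ... run the bandwidth test at the MTU under study ...
#   print(counter_deltas(before, read_counters("eth0")))
```

Counters that grow at a large MTU but stay flat at 1500 bytes point at
the frame size rather than at the link itself.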

> Also, are there any switch side parameters that can affect the
> performance of HPC codes?

Most (all?) switches do their job in hardware, in order to achieve
wire speed. There is usually nothing that can be set to affect the way
the forwarding engine works.

