low-latency high-bandwidth OS bypass user-level messaging for commodity(linux) clusters with commodity NICs(<$200), HELP! (GAMMA/EMP/M-VIA/etc.)

Donald Becker becker at scyld.com
Tue Dec 17 07:13:37 PST 2002

On 17 Dec 2002, Patrick Geoffray wrote:
> On Mon, 2002-12-16 at 19:31, Donald Becker wrote:
> > > BIP
> >    Magic protocol using custom Myrinet firmware.  Grumble: early
> >    performance numbers were not reproducible (I got exactly 50% of tech report
> >    numbers on same hardware and software).
> This is weird. I was involved in BIP back when I was a student in
> France, and Loic (BIP's author and Myrinet guru at large) was very
> careful about publishing real and reproducible numbers. I have myself
> confirmed these numbers many times.
> Try to get a hand on 2 recent NICs and test the latest BIP (0.99u) from
> http://www.ens-lyon.fr/LIP/BIP/.

As I said, "early numbers", probably 4-5 years ago.  This was a comment
that reflected a few days' work back then.  We had exactly the same
Pentium Pro PR440FX motherboards, Myrinet cards and used the same
reported kernel version.  I'm fairly certain that we created a
near-duplicate test environment, and we were trying hard to reproduce,
not refute, the numbers.

Today it would be much more difficult to reproduce the exact
environment.  Just take the 2.4 kernel.  It
  - modifies the PCI bridge and bus master parameters based on many
    input variables,
  - configures and uses the IOAPIC based on BIOS tables,
  - is usually patched by the distribution, with many modifications being
    to the PCI quirks table,
  - may be compiled with widely varying GCC versions (2.96, 3.0, 3.2).
And unlike 2.2 and earlier kernels, the performance under heavy I/O load
can depend heavily on the initial pattern of interrupts to the APIC.
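
To make the point concrete, here is a minimal sketch of recording the
parts of the environment listed above so that two test beds can be
compared.  It is an illustration only, not a tool anyone in this thread
used.  It assumes a Linux host where /proc/version (which names the
compiler that built the kernel), /proc/interrupts and /proc/cmdline are
readable; the function names are made up for the example.

```python
import platform
from pathlib import Path

# Files assumed to exist on a Linux host; missing ones are recorded as None.
PROC_FILES = {
    "kernel_build": "/proc/version",    # includes the compiler that built the kernel
    "interrupts": "/proc/interrupts",   # per-CPU interrupt counts and routing
    "cmdline": "/proc/cmdline",         # boot parameters (e.g. noapic)
}


def read_or_none(path):
    """Return the file's text, or None if it cannot be read."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError:
        return None


def snapshot_environment():
    """Collect details that often differ between two test beds."""
    snap = {"kernel_release": platform.release(), "machine": platform.machine()}
    for key, path in PROC_FILES.items():
        snap[key] = read_or_none(path)
    return snap


def differing_keys(a, b):
    """List the keys whose values differ between two snapshots."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


if __name__ == "__main__":
    for key, value in snapshot_environment().items():
        first_line = value.splitlines()[0] if value else "(unavailable)"
        print(f"{key}: {first_line}")
```

Interrupt counts change constantly, so a difference in "interrupts"
between two machines is expected; what matters there is which CPU and
which controller each device line is routed to.  PCI bridge settings and
distribution patches are not captured by this sketch.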

I didn't mean for this posting to be about BIP.  I do feel it was fair to
put in a short note reflecting our experience.

> I am sure Loic would help you if you
> cannot reproduce good numbers. Last time I heard about it, BIP was
> getting <4 us on L9 (not reliable though. Reliability was planned but
> never implemented, the curse of all academic projects).

Implementing reliability is essential to understanding the effectiveness
of the approach.  There is a big gap between "put the next packet into
this memory location, call it done, and assume everything goes according
to plan" and a TCP/IP socket.
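
A toy illustration of that gap, as a sketch under stated assumptions: it
simulates a link that silently drops a fraction of transmissions and
compares "write it once and call it done" with a simple stop-and-wait
scheme (sequence numbers, acks, retransmission).  This is neither BIP's
nor TCP's actual mechanism, and LossyLink, send_unreliable and
send_stop_and_wait are hypothetical names invented for the example.

```python
import random


class LossyLink:
    """Hypothetical link that silently drops a fraction of what it carries."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def deliver(self):
        """Return True if this transmission (data or ack) survives."""
        return self.rng.random() >= self.loss_rate


def send_unreliable(messages, link):
    """Send each message once and call it done; losses go unnoticed."""
    return [m for m in messages if link.deliver()]


def send_stop_and_wait(messages, link, max_tries=50):
    """Retransmit each message until its ack arrives; drop duplicates."""
    received = []
    expected_seq = 0      # receiver state: next sequence number to accept
    transmissions = 0     # data packets put on the wire, including retries
    for seq, msg in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            if not link.deliver():
                continue                  # data lost: sender times out, retries
            if seq == expected_seq:       # first copy seen: accept it
                received.append(msg)
                expected_seq += 1
            # A duplicate copy is not appended, but it is still acked.
            if link.deliver():
                break                     # ack arrived: go to the next message
            # Ack lost: the sender retries and the receiver sees a duplicate.
        else:
            raise RuntimeError(f"gave up on message {seq}")
    return received, transmissions


if __name__ == "__main__":
    msgs = list(range(1000))
    got = send_unreliable(msgs, LossyLink(0.2, seed=1))
    rec, tx = send_stop_and_wait(msgs, LossyLink(0.2, seed=1))
    print(f"unreliable: {len(got)} of {len(msgs)} delivered in {len(msgs)} sends")
    print(f"stop-and-wait: {len(rec)} of {len(msgs)} delivered in {tx} sends")
```

Even this toy version needs per-message state on both ends, extra
transmissions, and a wait for each ack, and those are the costs a
latency figure for an unreliable path leaves out.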

Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993
