low-latency high-bandwidth OS bypass user-level messaging for commodity (Linux) clusters with commodity NICs (<$200), HELP! (GAMMA/EMP/M-VIA/etc.)

Donald Becker becker at scyld.com
Tue Dec 17 11:10:29 PST 2002

On 17 Dec 2002, Patrick Geoffray wrote:
> On Tue, 2002-12-17 at 10:13, Donald Becker wrote:

> To come back to the original post, the fact that an
> OS-bypass/zero-copy/whatever communication layer is often tied to
> specific hardware is not only a way to secure an effective source of
> revenue. It is also aimed at making the life of the software developers
> much easier: having a common closed hardware environment reduces the
> dependencies on the host.

I would go further -- the only way OS-bypass is efficient enough to use,
especially on SMP machines where the page tables and cache must be
consistent, is to exploit special features of the hardware.

The SMP issue is a very big one.  Converting a standard driver to SMP is
a matter of figuring out which serialization assumptions have been made
and which can still be made.  Converting an OS-bypass driver to SMP
requires redesigning the structure into something much more complex.
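
As a rough illustration of what a "serialization assumption" looks like,
here is a minimal user-space sketch in C11, not taken from any real
driver.  The names (tx_ring, tx_claim_up, tx_claim_smp, tx_complete,
TX_RING_SIZE) are all hypothetical.  The first claim routine is only
correct if a single context runs it at a time; the second makes the
claim explicit with a compare-and-swap so two CPUs cannot take the same
slot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TX_RING_SIZE 4u  /* hypothetical size; power of two so counter wrap is harmless */

struct tx_ring {
    atomic_uint head;  /* free-running count of slots claimed */
    atomic_uint tail;  /* free-running count of slots completed */
};

void tx_ring_init(struct tx_ring *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Uniprocessor style: silently assumes nothing else touches *head
 * between the fullness check and the increment. */
bool tx_claim_up(unsigned *head, unsigned tail, unsigned *slot)
{
    if (*head - tail >= TX_RING_SIZE)
        return false;                    /* ring full */
    *slot = (*head)++ % TX_RING_SIZE;
    return true;
}

/* SMP style: the claim is a compare-and-swap, so if another CPU got
 * there first the exchange fails, head is refreshed, and we retry. */
bool tx_claim_smp(struct tx_ring *r, unsigned *slot)
{
    assert(r != NULL && slot != NULL);
    unsigned head = atomic_load(&r->head);
    for (;;) {
        unsigned tail = atomic_load(&r->tail);
        if (head - tail >= TX_RING_SIZE)
            return false;                /* ring full */
        if (atomic_compare_exchange_weak(&r->head, &head, head + 1u)) {
            *slot = head % TX_RING_SIZE;
            return true;
        }
    }
}

/* Called when the (imaginary) hardware has finished with the oldest slot. */
void tx_complete(struct tx_ring *r)
{
    atomic_fetch_add(&r->tail, 1u);
}
```

This covers only slot allocation.  An OS-bypass design also has to keep
page tables and caches consistent across processors, which this sketch
does not attempt to show.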

> One big question when designing a non-TCP layer for Ethernet is how deep
> to go in trying to exploit the hardware. If you try to be generic, you
> will quickly realize that the existing driver architecture does a very
> decent job. If you start to use hardware-specific functionality, you
> will either lock yourself into a few hardware solutions (scary when you do
> not control the future of the hardware line) or the amount of work
> needed to support a large set of GigE NICs at such a low level
> explodes. Add to that the fact that GigE chipsets have quite a short
> life cycle and that vendors are reluctant to provide details about their
> chips, and I understand why there is no such product today.

A good example is easy to find: the Intel GigE NIC series has added
about a half dozen new PCI IDs in the past year.  Not just revision
numbers, which we don't track, but the major ID number.  Some of those
new versions appear to have significant changes in the feature set and
in the way they handle small packets to reduce latency.

That's a life cycle of about three months, which is shorter than the
time needed to decide that a device driver is stable.
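
To make the maintenance burden concrete, here is a small sketch in C of
the kind of ID table a driver has to keep up to date.  It is a
user-space illustration, not the layout of any real driver.  0x8086 is
Intel's PCI vendor ID; the device IDs, names, and feature flags below
are placeholders invented for the example, and nic_lookup is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL 0x8086u           /* Intel's PCI vendor ID */

/* Hypothetical per-chip feature flags. */
#define FEAT_NONE          0u
#define FEAT_TX_CSUM       (1u << 0)
#define FEAT_SMALL_PKT_OPT (1u << 1)   /* changed small-packet handling */

struct nic_id {
    uint16_t vendor;
    uint16_t device;      /* major device ID, not the revision number */
    unsigned features;
    const char *name;
};

/* Placeholder entries: every new device ID means a new row here, plus
 * whatever code is needed behind any new feature flag. */
static const struct nic_id nic_table[] = {
    { VENDOR_INTEL, 0xA001u, FEAT_NONE,                         "example-gige-a" },
    { VENDOR_INTEL, 0xA002u, FEAT_TX_CSUM,                      "example-gige-b" },
    { VENDOR_INTEL, 0xA003u, FEAT_TX_CSUM | FEAT_SMALL_PKT_OPT, "example-gige-c" },
};

/* Return the table entry for a vendor/device pair, or NULL if the
 * chip is newer than the table. */
const struct nic_id *nic_lookup(uint16_t vendor, uint16_t device)
{
    size_t n = sizeof(nic_table) / sizeof(nic_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (nic_table[i].vendor == vendor && nic_table[i].device == device)
            return &nic_table[i];
    }
    return NULL;
}
```

A generic driver only needs the new row.  A layer that depends on the
hardware-specific behavior behind each flag has to be revalidated for
every entry, which is where the short life cycle hurts.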

Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993
