FNN vs GigabitEther & Myrinet

Donald Becker becker at scyld.com
Thu Oct 25 11:30:03 PDT 2001


On Thu, 25 Oct 2001, Bogdan Costescu wrote:

> On Wed, 24 Oct 2001, Greg Lindahl wrote:
> > One big problem with both the GAMMA and MVIA projects is that
> > originally, neither was doing any kind of error correction, simply
> > telling the application "whoops! there's a problem."

It seems that every networking project, including most of the commercial
ones, initially assumes that errors will not occur.  Adding error
detection and recovery typically more than doubles the latency and
overhead.

> Why don't they get some support from the big guys ? For example, I
> always wondered after Scyld announced its distribution if they are going
> to include some kind of low-latency package, be it GAMMA, MVIA or
> something else... Scyld certainly has the knowledge (but I don't know
> about resources) to make it happen.

The network device drivers we've written cover over 95% of the commodity
market.  Scyld is perhaps uniquely qualified to cause something like
GAMMA to be widely adopted -- we could put the proper hooks into the
drivers.  But that's a large effort that would be a resource and
financial drain, without any associated revenue.

> However, IMHO doing this in Linux at the moment is not future-proof; apart
> from the ever changing VM subsystem, some people favour iovecs to do

The VM subsystem changes alone are a huge problem.  The VM design has
been replaced three times since the beginning of the year, and the
current implementation continues to be reworked with three kernel
releases in the past three weeks.

> kernel-userland data passing; OTOH when the zero-copy network changes were
> introduced (around 2.4.3 or so), there were discussions about using iovecs
> - some networking guys said that they are too complex and time-consuming
> to set up and use and so they introduced yet another mechanism which is
> now specific to networking...

Yes.  The "zero-copy" networking changes were dropped into the
supposedly-stable 2.4.3 kernel.  I'm not opposed to those changes, but
the decision process that allowed the interface change to go in with
little discussion or notice means that any OS-bypass implementation must
have substantial resources that have a full-time focus on updates.

Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993