low-latency high-bandwidth OS bypass user-level messaging for commodity (linux) clusters with commodity NICs (<$200), HELP! (GAMMA/EMP/M-VIA/etc.)

jon jcmcknny at uiuc.edu
Mon Dec 16 02:46:12 PST 2002


Hi Donald Becker, master of all that is networking!  And anyone else
that can help :)

Perhaps this isn't the best way to get ahold of you, but I've also sent
this to the Beowulf list.  I've noted your comments on OS Bypass drivers
in the past.  But isn't there some room for non-TCP/IP related traffic,
such as with computing clusters?  We don't need no stinking TCP!  No
associated revenue?  You could replace Myrinet in the thousands of nodes
we have here ALONE at NCSA.

We (UIUC theoretical astrophysics group) are in the midst of purchasing
a $50K cluster (I know, small, but big for us! :)) and I've done all the
research as to what we should be getting.  We ended up going with an
Intel Desktop gigabit board and P4, but have found the tests to be very
poor.  We only have 4 nodes right now because we were worried about this
very thing.

Anyways, our problem is we are willing to pay for a commercial product,
but not at any cost, perhaps up to $200 per board.  Basically, we see all
these solutions such as:

Giganet using VIA
ServernetII using VIA
InfiniBand
U-Net
AM II
LPC
PM
FM
GigaE-PM
BIP
EMP
GAMMA
M-VIA

Half of these are seemingly dead; those that seem relatively alive are:

M-VIA: http://www.nersc.gov/research/ftg/via/
Only supports a few devices, and only 1 expensive ($500) gigabit board
that's still available (the SysKonnect).

GAMMA: http://www.disi.unige.it/project/gamma/index.html
Depending on what part of their website you are at, they support
different devices.  The Alteon TIGON-II results seem to suck for
latency, which is our biggest problem.  The Netgear GA621 looks great!
But we already bought a $5000 copper gigabit switch!  We are stuck with
it! (HP Procurve 5308xl).  Whether they support the GA622 is kinda open
or at least untested according to the website.  No luck getting in touch
with the driver writer about that.  Besides, the EMP guy says the GA622
sucks!


EMP: http://www.osc.edu/~pw/emp/
Seems to be interesting, although the available 3Com 3C996, of which we
have 3 to test, is only said to be "maybe" supported since it's Tigon 3
and not Tigon 2.  And will it suck in latency just like the Tigon 2?
The EMP guy says the GA-622T sucks with its ns chipset, and that was one
option with GAMMA, assuming he really did write the driver for both 621
and 622 (their website isn't clear about this, and no emails from the
guys there), since the 622 "was" an option.

Basically, after all my testing (about 2 months of light testing and the
last 2 weeks of hard-core 24-hour-a-day testing) I realized TCP sucks and
I need an OS Bypass or user-level communication driver.

My questions are:

1) Is there a commercial product for a not so expensive board that
provides what GAMMA/EMP/M-VIA provide?  Any other OS-bypass driver/MPI
layer I don't know about?

2) Is there a solution I'm missing?  Has to be copper gigabit for linux,
OS-bypass like GAMMA, with MPI on top of that GAMMA-like layer.  No dead
boards,
etc.  Why are there no commercial products?  MPI/Pro is just a funny MPI
still on top of TCP, no?

I basically want 20us latency for zero-size messages and peak bandwidth,
for $100-$200 per board on gigabit.  Not too much to ask? :)   I know it's
certainly possible.

Currently with Myrinet on P4 I get 17us latency and 80MB/sec bandwidth;
gigabit on P4 gets 70us latency and 80MB/sec.  On Xeons I get 50us
latency and 95MB/sec bandwidth with the latest 3com bcm5700 or latest
intel e1000 driver.
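
To show why latency matters more to us than peak bandwidth, here is a
rough sketch using the numbers above.  It assumes the usual simple model
where sending n bytes takes (latency + n / bandwidth), and that 1 MB means
10^6 bytes, so 1 MB/sec is 1 byte per microsecond.  The model and the
function names are my own illustration, not measurements and not part of
any of the packages mentioned.

```python
# Simple linear cost model: time = latency + size / bandwidth.
# Assumption: 1 MB = 10**6 bytes, so MB/sec equals bytes per microsecond.

def transfer_time_us(nbytes, latency_us, bandwidth_mb_s):
    """Estimated one-way time in microseconds for a message of nbytes."""
    return latency_us + nbytes / bandwidth_mb_s

def effective_bandwidth_mb_s(nbytes, latency_us, bandwidth_mb_s):
    """Bandwidth actually achieved for a message of nbytes, in MB/sec."""
    return nbytes / transfer_time_us(nbytes, latency_us, bandwidth_mb_s)

def half_performance_size(latency_us, bandwidth_mb_s):
    """Message size in bytes at which half the peak bandwidth is reached."""
    return latency_us * bandwidth_mb_s

# (latency in us, peak bandwidth in MB/sec), taken from the figures above.
INTERCONNECTS = {
    "Myrinet on P4": (17.0, 80.0),
    "gigabit on P4": (70.0, 80.0),
    "gigabit on Xeon": (50.0, 95.0),
}

for name, (lat, bw) in INTERCONNECTS.items():
    t_1k = transfer_time_us(1024, lat, bw)
    n_half = half_performance_size(lat, bw)
    print(f"{name}: 1 KB takes about {t_1k:.1f} us, "
          f"half of peak bandwidth needs about {n_half:.0f} byte messages")
```

Under this model the two P4 setups have the same peak bandwidth, but the
gigabit one needs messages roughly four times larger before it reaches
half of that peak, which is the problem for codes that send many small
messages.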

I'm going to try EMP with my 3c996, but honestly his instructions are
damn vague and confusing (i.e. WHAT snapshot of gcc/binutils should I
use, what the heck do I do?, etc.)  Might try GAMMA too since EMP says
it may work as a Tigon processor.  GAMMA seems a bit less crazy.

Honestly, I can't really figure out what Scyld does.  Is it just a linux
distribution?  Does it actually have OS-bypass networking?  Does
anything?

Why is OS-bypass so hard?  If you don't want TCP support, isn't it
easier than writing a standard linux driver? (like you've done a lot!)

Thanks!
Jonathan McKinney
University of Illinois at Urbana-Champaign
Center for Theoretical Astrophysics
NCSA




