low-latency high-bandwidth OS bypass user-level messaging for commodity(linux) clusters with commodity NICs(<$200), HELP! (GAMMA/EMP/M-VIA/etc.)
Donald Becker becker at scyld.com
Mon Dec 16 16:31:40 PST 2002
On Mon, 16 Dec 2002, jon wrote:

> Perhaps this isn't the best way to get ahold of you, but I've also sent
> this to the Beowulf list. I've noted your comments on OS Bypass drivers
> in the past. But isn't there some room for non-TCP/IP related traffic,
> such as with computing clusters? We don't need no stinking TCP! No
> associated revenue? You could replace Myrinet in the thousands of nodes
> we have here ALONE at NCSA.

Very likely not. Myrinet has both a cost and increasing performance
advantage over gigabit Ethernet when the switch is larger than about
96 ports.

> We (UIUC theoretical astrophysics group) are in the midst of purchasing
> a $50K cluster (I know, small, but big for us! :)) and I'm done all the
> research as to what we should be getting. We ended up going with a
> Intel Desktop gigabit board and P4, but have found the tests to be very
> poor. We only have 4 nodes right now because we worried about this very
> thing.

Latency or bandwidth? And what are you using to test?

> InfiniBand

Hardware is just now appearing, after a rapid committee-driven
complexity increase. The initial price is well above Myrinet to capture
the value to the "must-have" crowd. The question is if there is the
motivation to travel down the price-volume curve.

> Giganet using VIA
> ServernetII using VIA

Both effectively dead hardware products, although Giganet is still
shipping its current hardware.

> U-Net

Dead software effort, pre-dated VIA protocol.

> BIP

Magic protocol using custom Myrinet firmware. Grumble: early
performance numbers were not reproducible (I got exactly 50% of tech
report numbers on same hardware and software).

> PM, FM

The other Myrinet custom protocols? Dead. With a communication
processor to do the work at the other end, you can do magic
application-specific things. And when you write the paper, the
programming effort was minimal and the performance astonishing.

> EMP

The only thing I knew of by this name was a predecessor to IPMI.

> LPC

A Google search shows only a low-speed serial communication project.

> Half of these are seemingly dead, those that seem relatively alive are:

Only half? Do you have the same number of fingers on both hands?

A general guideline is that building a safe, reliable, general purpose
communication protocol is always much more difficult than getting
something that only works in perfect conditions. You must implement
checksums, sequence numbers, and recovery for failed endpoints. If you
are directly writing to a remote process memory space, you have to take
into account VM page table tracking and cache coherency. These can
quickly erase any performance advantage of "zero copy".

> M-VIA: http://www.nersc.gov/research/ftg/via/
> Only support a few devices, and only 1 expensive ($500) gigabit board
> that's still available (the SysKonnect).
> GAMMA: http://www.disi.unige.it/project/gamma/index.html

The top project for current support.

> Depending on what part of their website you are at, they support
> different devices. The Alteon TIGON-II results seem to suck for
> latency, which is our biggest problem. The Netgear GA621 looks great!
> But we already bought a $5000 copper gigabit switch! We are stuck with
> it! (HP Procurve 5308xl). Whether they support the GA622 is kinda open
> or at least untested according to the website. No luck getting in touch
> with driver writer about that. Besides, EMP guy says the GA622 sucks!

While there are better Gigabit chips than the DP83820, most of its bad
reputation comes from the poor performance of the other drivers out
there. We get quite reasonable performance from it with the Scyld
ns820.c driver. Others have reported a 2.5-3X performance improvement
over the driver written by Red Hat.

> EMP: http://www.osc.edu/~pw/emp/
> Seems to be interesting, although the available 3Com 3C996, of which we
> have 3 to test, is only said to be "maybe" supported since it's Tigon 3
> and not Tigon 2.

And will it suck in latency just like the Tigon 2?

> EMP guys says the GA-622T sucks with it's ns chipset and that was one
> option with GAMMA, assuming he really did write the driver for both 621
> and 622 (their website isn't clear about this, and no emails from the
> guys there), since the 622 "was" an option.
...
> My questions are:
>
> 1) Is there a commercial product for a not so expensive board that
> provides what GAMMA/EMP/M-VIA provide? Any other OS-bypass driver/MPI
> layer I don't know about?

No commercial company is likely to support a communication protocol
unless they can pay for it (and have a hope of it working!) by bundling
it with expensive hardware. We would support something on a best-effort
or time-and-materials basis.

> 2) Is there a solution I'm missing? Has to be copper gigabit for linux,
> OS-bypass like GAMMA, MPI on top of that GAMMA-like. No dead boards,
> etc. Why are there no commercial products? MPI/Pro is just a funny MPI
> still on top of TCP, no?

With custom versions available -- whatever you are willing to pay for.

> Honestly, I can't really figure out what Scyld does. Is it just a linux
> distribution? Does it actually have OS-bypass networking? Does
> anything?

We are a Linux distribution specifically designed for clustering.
We have various modifications for higher network performance, but that's
actually used against us! Our competitors say "Look, Scyld modifies the
kernel while we ship you a completely standard system." Then, when
things don't work (as so often happens with complex systems), they get
to say "that's the standard Linux behavior, it's not our problem".

> Why is the OS-bypass so hard? If wanting no TCP support, isn't it
> easier than writing standard linux driver? (like you've done a lot!)

It took 20 years to get TCP right...

-- 
Donald Becker                           becker at scyld.com
Scyld Computing Corporation             http://www.scyld.com
410 Severn Ave. Suite 210               Scyld Beowulf cluster system
Annapolis MD 21403                      410-990-9993
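
The reply above notes that even a minimal non-TCP protocol still needs checksums and sequence numbers. As a rough illustration only, here is a small Python sketch of a receiver that rejects damaged, duplicated, and out-of-order frames. The frame layout, the `make_frame` helper, and the `Receiver` class are hypothetical names made up for this example; they are not taken from GAMMA, EMP, M-VIA, or any Scyld code. The sketch leaves out retransmission, timeouts, sequence-number wraparound, and recovery for failed endpoints, which is where most of the real difficulty lies.

```python
import struct
import zlib

# Hypothetical frame layout (an assumption for this sketch, not a real protocol):
#   4-byte sequence number | 4-byte CRC32 over (sequence number + payload) | payload
HEADER = struct.Struct("!II")


def make_frame(seq, payload):
    """Build a frame carrying a sequence number and a CRC32 checksum."""
    crc = zlib.crc32(struct.pack("!I", seq) + payload)
    return HEADER.pack(seq, crc) + payload


class Receiver:
    """Accepts frames strictly in order; reports anything else by name."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []

    def accept(self, frame):
        if len(frame) < HEADER.size:
            return "corrupt"          # too short to even hold a header
        seq, crc = HEADER.unpack_from(frame)
        payload = frame[HEADER.size:]
        if zlib.crc32(struct.pack("!I", seq) + payload) != crc:
            return "corrupt"          # checksum mismatch: drop the frame
        if seq < self.expected_seq:
            return "duplicate"        # already delivered: drop, do not redeliver
        if seq > self.expected_seq:
            return "gap"              # a frame was lost; a real protocol must recover
        self.delivered.append(payload)
        self.expected_seq += 1
        return "ok"


if __name__ == "__main__":
    rx = Receiver()
    f0 = make_frame(0, b"hello")
    f1 = make_frame(1, b"world")
    damaged = f1[:-1] + bytes([f1[-1] ^ 0x01])  # flip one bit in the payload
    print(rx.accept(f0))       # ok
    print(rx.accept(f0))       # duplicate
    print(rx.accept(damaged))  # corrupt
    print(rx.accept(f1))       # ok
```

Every received frame pays for a checksum pass and a sequence check before it can be handed to the application, which is part of why the advantage over a tuned TCP path can shrink once reliability is added.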
More information about the Beowulf mailing list
