[Beowulf] The move to gigabit - technical questions

Robert G. Brown rgb at phy.duke.edu
Wed Mar 16 07:52:07 PST 2005


On Wed, 16 Mar 2005, Robert G. Brown wrote:

> On Tue, 15 Mar 2005, Vincent Diepeveen wrote:
> 
> > At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
> > >Gigabit will be a little faster than 100Mbit on a small cluster, but not 
> > >a lot.
> > 
> > What is 'not a lot'.
> > 
> > I would guess it's factor 10 faster in bandwidth?

I hate to reply to myself, but I should have noted that the below
applies to BANDWIDTH-dominated, not latency-dominated, communications.  It was
implicit from Vincent's reply, but I should have made it explicit.  For
lots of small packets gigabit's advantage probably won't be 10x, and
this is another case where a higher-end network is indicated.  However,
the latency probably won't change a lot with different switches or
switch arrangements, either, except for the worse along paths with
multiple switch hops in between.
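
As a rough sketch of that point (a toy model, not a measurement), one can
treat the time to move a message as a fixed latency plus size divided by
bandwidth.  The 50 microsecond latency below is an assumed, purely
illustrative figure, and it is assumed to be the same on both networks;
real numbers should come from measuring your own hardware.

```python
# Toy model: transfer time = fixed latency + size / bandwidth.
# The latency value is a hypothetical placeholder, not a measured figure.

def transfer_time(size_bytes, bandwidth_bits_per_s, latency_s):
    """Seconds to deliver one message under the latency + size/bandwidth model."""
    return latency_s + size_bytes * 8 / bandwidth_bits_per_s

def speedup(size_bytes, latency_s=50e-6):
    """Ratio of 100 Mbit time to gigabit time for one message of this size."""
    slow = transfer_time(size_bytes, 100e6, latency_s)
    fast = transfer_time(size_bytes, 1e9, latency_s)
    return slow / fast

if __name__ == "__main__":
    # Small messages are dominated by latency, large ones by bandwidth.
    for size in (64, 1500, 64 * 1024, 8 * 1024 * 1024):
        print(f"{size:>9d} bytes: {speedup(size):.2f}x")
```

Under these assumptions the ratio stays close to 1 for tiny messages and
only approaches 10 for messages large enough that the latency term is
negligible.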

I should also have pointed out to the original poster that there are
nice tools (e.g. netperf, netpipe, lmbench) that will help him analyze
his raw network performance outside of a particular application that
might well have poor "networking" performance for reasons that have
nothing to do with the actual network.  There are also lots of articles
out there in the list archives, in Cluster World magazine back
issues, in Linux Magazine back issues, and on various websites
(including mine and brahma's) that can really help one understand just
what ethernet is and how it works and what its numbers should be.  It is
the most widely implemented and widely understood network, good, bad,
and ugly features notwithstanding.
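
For a back-of-envelope check of what the numbers should be, the sketch
below combines the 80-95% estimate from my earlier message (quoted below)
with the traffic Vincent describes (6 messages a second of 8MB each) and
the theoretical peak of a 32-bit, 33 MHz PCI bus.  It assumes MB means
10^6 bytes, and it says nothing about what a particular NIC, driver, or
shared PCI bus will actually sustain; tools like netperf or netpipe
answer that.

```python
# Back-of-envelope arithmetic only; MB is assumed to mean 10**6 bytes.

GIGABIT_WIRE_MB_S = 1e9 / 8 / 1e6   # raw gigabit wire rate in MB/s
PCI_32_33_MB_S = 4 * 33e6 / 1e6     # 4 bytes x nominal 33 MHz, theoretical peak

def payload_range(wire_mb_s, low=0.80, high=0.95):
    """Payload bandwidth range if 80-95% of the wire rate is achieved."""
    return wire_mb_s * low, wire_mb_s * high

def required_mb_s(messages_per_s, message_mb):
    """Aggregate bandwidth needed for a steady stream of fixed-size messages."""
    return messages_per_s * message_mb

if __name__ == "__main__":
    low, high = payload_range(GIGABIT_WIRE_MB_S)
    need = required_mb_s(6, 8)
    print(f"gigabit wire rate:        {GIGABIT_WIRE_MB_S:.0f} MB/s")
    print(f"estimated payload range:  {low:.0f}-{high:.2f} MB/s")
    print(f"PCI 32/33 peak (shared):  {PCI_32_33_MB_S:.0f} MB/s")
    print(f"requested traffic:        {need} MB/s")
```

On these assumptions the requested rate is well under the estimated
payload range, but the PCI peak is a ceiling shared with every other
device on the bus, so the bus rather than the wire may be the thing to
measure on an older machine.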

   rgb

> 
> (Maybe, you don't get QUITE 100% of the raw clock advantage in all
> applications on all hardware, Vincent;-).  However, for most
> applications on most hardware you >>should<< get a significant advantage
> -- 80-95% of 10x, or 8-9.5x.  Not just "a little".
> 
> A really, really cheap switch might have problems with bisection
> bandwidth and chop this down for simultaneous flat-out bidirectional
> data streams, but relatively few parallel applications engage in
> flat-out bidirectional communications.  Even if yours does, your problem is
> more likely to be with resource contention (e.g. two hosts trying to
> talk to a third at the same time) than it is with actual bandwidth
> oversubscription.  This is what Vincent is suggesting that you look into
> (or let us look into:-) below.
> 
> If your particular usage pattern does create resource contention, then
> you might well need to either hand-optimize the pattern to avoid
> saturating your cheap hardware, create a network with cheap components
> that effectively breaks up the pathological communications pattern
> (which it sounds like is what you actually did) or buy better hardware
> (either better gigE switches or a "real" HPC network).
> 
> However, you shouldn't really trash gigE itself -- it isn't at fault and
> your results aren't typical.
> 
>     rgb
> 
> > 
> > >I ended up using 5 cheap gigabit switches to make a gigabit concentrator 
> > >for my 12 node cluster.
> > >It eliminated the tendency for the network to saturate under a heavy load.
> > 
> > Very interesting, can you post a connection scheme and routing table?
> > 
> > >It also let me use gigabit network cards in my I/O node and controlling 
> > >node with a small improvement in file I/O.
> > 
> > Streaming i/o or random access?
> > 
> > cheapo disk arrays get what is it, 400MB/s handsdown or so?
> > 
> > that's raid5 readspeed, plenty security at a raid5 array.
> > 
> > >The compute nodes remained with 100 Mbit to conserve power. The setup 
> > >works rather nicely.
> > 
> > what type of software do you run at it,
> > embarrassingly parallel software?
> > 
> > Vincent
> > 
> > >Glen
> > >
> > >Vincent Diepeveen wrote:
> > >
> > >>Good evening,
> > >>
> > >>It's interesting to investigate what gigabit can do for small home clusters.
> > >>
> > >>Any latency oriented approach is doomed to fail obviously at gigabit. But
> > >>they're cheap. For 40 euro i see several getting offered already.
> > >>
> > >>First important question is of course how much system time those NIC's eat
> > >>when fully loading their bandwidth.
> > >>
> > >>Example, i have an old dual k7 here with pci 2.2 (32-bit, 33 MHz).
> > >>Suppose i put a gigabit card in it.
> > >>
> > >>In say 6 messages a second i ship 8MB data at a time. Send and receive in turn.
> > >>
> > >>So it ships a packet of 8MB, then receives a packet of 8MB.
> > >>
> > >>Other than the cost of the thread to store the packet to RAM, does such a
> > >>card in any way stop or block the cpu's which are 100% loaded with
> > >>searching software (my chessprogram diep in this case)?
> > >>
> > >>What penalty other than that thread handling the message is there in terms
> > >>of system time reduction to the 2 processes searching?
> > >>
> > >>Oh btw, i assume that gigabit can handle 48MB/s of user data?
> > >>
> > >>Vincent
> > >>
> > >>_______________________________________________
> > >>Beowulf mailing list, Beowulf at beowulf.org
> > >>To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
> > >>
> > >>  
> > >>
> > >
> > >-- 
> > >Glen E. Gardner, Jr.
> > >AA8C
> > >AMSAT MEMBER 10593
> > >Glen.Gardner at verizon.net
> > >
> > >
> > >http://members.bellatlantic.net/~vze24qhw/index.html
> > >
> > >
> > >
> > >
> > >
> > 
> 
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




