large clusters and topologies

Fri Jul 28 09:57:10 PDT 2000

On Fri, 28 Jul 2000, Alan Ward wrote:

> This discussion we've been having about > 100 node clusters
> makes me wonder about two points:
> 
> - I suppose Ethernet is not all too efficient with this
> size, even with switches instead of hubs. What's the largest
> practical size for an Ethernet 'wulf? Any alternatives to
> Myrinet upwards of that?

This depends (of course -- you know this already if you think about
it:-) on the problem.  

A 10BT wulf with hub stacks connected by a stack of switches could scale
to 1000's of nodes for EP, master/slave, or suitably coarse grained real
parallel problems.  Fine grained problems with a bad
computation/communication ratio might actually slow down run on two
nodes.  Then there are classes of problems that run "slower" (in a
manner of speaking) on two more more nodes but won't fit at all or run
out of REALLY slow swap on a single node, so it is worth parallelizing
them even with a negative "parallel speedup" (although if you think
about it, the real speedup might be infinity viewed as the ratio between
not run at all in an infinite amount of time and run, however slowly,
split up:-).  Real problems span everything in between.

It is better to turn the question around.

What >>kinds<< of practical problems can be run on a wulf with various
node/os/networking types and speeds?

This actually has an answer, if one qualifies the question and answer
carefully enough.  For example, you could ask what can be run on and
scale with increasing speedup on all nodes of a beowulf consisting of 80
500 MHz PIII nodes, each with 512 MB of main memory, a "generic" local
IDE disk, and a 3c905 NIC interconnected with an HP ProCurve FE switch,
running RH 6.2 (with its stock gcc, net drivers, etc.).  That has a
fairly specific answer, although the answer still depends on a wealth of
details -- which algorithms were used, PVM or MPI or raw sockets for
IPC's, program organization and scale.  At least the >>method<< for
answering the question in a meaningful way for a given problem can then
be specified.

> - Which type of algorithm can take direct advantage of a
> cube, 
> toroid, ... topology? I would think even a "dumb" job like 
> matrix multiplication might benefit from a 2D lattice
> (instead
> of the typical all-to-all unique segment). Comments?

The web-book I referenced earlier in this discussion has chapters and
sections that discuss algorithms and topology, especially hypercubes.
See:

  http://www-unix.mcs.anl.gov/dbpp/

or look on the brahma page (www.phy.duke.edu/brahma) if you forget.

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu