[Beowulf] torus versus (fat) tree topologies
Dan Kidger
daniel.kidger at quadrics.com
Sun Nov 14 08:07:49 PST 2004
On Saturday 13 November 2004 12:55 am, Greg Lindahl wrote:
> On Fri, Nov 12, 2004 at 06:02:08PM -0500, Patrick Geoffray wrote:
> > How do you close your Torus without long cables ? Unless you stack your
> > nodes in a circle, you will need long cables.
>
> BlueGene's trick is to attach every other node to the end, and then
> come back with the unused ones. So:
>
> Nodes: 0 1 2 3 4
>
> Connections: 0 -> 2 -> 4 -> 3 -> 1 -> 0
>
> No long cable. This is probably a classic solution not invented by IBM.
This is pretty common and certainly not an IBM invention. I believe most
Dolphin SCI Linux clusters are cabled like this. (In these cases I have also
seen the nodes numbered following the interconnect position rather than
the physical position in the rack.)
The disadvantage is getting your head round the cabling in the second
dimension - you need a good diagram to follow. (and if a 3d torus...)
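The folded ordering quoted above can be sketched in a few lines of Python. This is only an illustration: the function names `folded_ring` and `cable_spans` are made up here, and it assumes nodes sit in a single row at positions 0..n-1 with cable length proportional to the distance between positions.

```python
def folded_ring(n):
    """Ring visiting order for n nodes in a row: even positions on the way
    out, odd positions on the way back (hypothetical helper)."""
    out = list(range(0, n, 2))
    back = [i for i in range(n - 1, 0, -1) if i % 2 == 1]
    return out + back


def cable_spans(order):
    """Physical distance, in node positions, covered by each cable of the
    ring, including the cable that closes it."""
    n = len(order)
    return [abs(order[i] - order[(i + 1) % n]) for i in range(n)]


order = folded_ring(5)
print(order)               # [0, 2, 4, 3, 1]
print(cable_spans(order))  # [2, 2, 1, 2, 1]

# Naive cabling 0 -> 1 -> 2 -> 3 -> 4 -> 0 needs one cable spanning the row.
print(max(cable_spans(list(range(5)))))  # 4
```

With the folded order no cable spans more than two positions however long the row is, whereas the naive order always needs one cable spanning n-1 positions to close the ring.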
When discussing topologies don't forget to include the hypercube topology. Many
systems used this - most notably the SGI Origin 2000 and the Origin
3000. The successor is the SGI Altix (a Linux cluster) but this uses a
switched fat-tree. Small Altix 350 configurations however use a 1d ring to
link the compute nodes.
Finally, when comparing interconnect topologies consider the subject of
performance tuning. With a fat-tree or a full-crossbar, MPI performance
should be identical no matter which set of nodes of your cluster you use for
your application. For a hypercube or a 2d/3d torus, performance varies
depending on which nodes your application uses for a particular run - making
it very hard to get repeatable timings and hence optimise your code.
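To make the placement effect concrete, here is a rough Python sketch that compares the mean hop count between the nodes of a 4-node job on an 8x8 torus for two allocations. The helper names, the torus size and the two node sets are assumptions chosen for illustration, and hop count is only a crude stand-in for real latency and bandwidth.

```python
from itertools import combinations


def torus_hops(a, b, dims):
    """Minimum hop count between coordinates a and b on a torus, taking
    the shorter way round in each dimension."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def mean_hops(nodes, dims):
    """Average hop count over all pairs of nodes in one job."""
    pairs = list(combinations(nodes, 2))
    return sum(torus_hops(a, b, dims) for a, b in pairs) / len(pairs)


dims = (8, 8)                                  # assumed 8x8 2d torus
compact = [(0, 0), (0, 1), (1, 0), (1, 1)]     # job placed in one corner
scattered = [(0, 0), (0, 4), (4, 0), (4, 4)]   # same job spread out

print(mean_hops(compact, dims))    # about 1.33 hops
print(mean_hops(scattered, dims))  # about 5.33 hops
```

The same four-process job sees roughly four times the average path length in the second placement, which is the kind of run-to-run variation described above.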
Likewise, with a full fat tree your application performance is not affected
by other jobs running on the same cluster (since there are independent routes
through the network). With the other two topologies, application runs can be
slowed down by other people's parallel jobs. I remember this annoying users
who paid for their HPC usage by wallclock.
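A toy Python sketch of that interference on the simplest case, a 1d ring: it counts the links that two jobs have in common. It assumes shortest-path routing with ties broken clockwise and treats each link as a single shared resource; both are simplifications, and the names `ring_route` and `job_links` are hypothetical.

```python
from itertools import combinations


def ring_route(src, dst, n):
    """Set of links on the shortest path from src to dst around an n-node
    ring. Ties go clockwise (an assumption, not any vendor's routing)."""
    forward = (dst - src) % n
    step = 1 if forward <= n - forward else -1
    links, node = set(), src
    while node != dst:
        nxt = (node + step) % n
        links.add(frozenset((node, nxt)))
        node = nxt
    return links


def job_links(nodes, n):
    """All links used by traffic between any pair of nodes in one job."""
    used = set()
    for a, b in combinations(sorted(nodes), 2):
        used |= ring_route(a, b, n)
    return used


n = 8
# Interleaved placement: the two jobs' routes overlap.
print(len(job_links({0, 4}, n) & job_links({2, 6}, n)))  # 2 shared links
# Contiguous placement: no link is used by both jobs.
print(len(job_links({0, 1}, n) & job_links({2, 3}, n)))  # 0 shared links
```

On a full fat tree the equivalent count would ideally be zero for any placement, which is the point made above.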
Daniel.
--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505
----------------------- www.quadrics.com --------------------