[Beowulf] torus versus (fat) tree topologies
hahn at physics.mcmaster.ca
Sun Nov 14 12:52:45 PST 2004
> I guess I haven't mentioned it yet but I'm a PhD student in the
> Mechanical and Aerospace Engineering department at Syracuse University
> in upstate New York. Prior to my arrival here I only had superficial
> knowledge of clustering and have subsequently spent the last year
> researching, reading, configuring, testing, etc ... all of this while
> working on my PhD research (CFD). So I'm essentially the administrator
> _and_ major user of it. I have to admit it's kinda nice to have almost
> exclusive use of that much horsepower (64 Opteron 242's) for my work!
that's quite reasonable if you're planning to use off-the-shelf hardware.
that is, if you're an engineer doing research using HPC. if you're
actually doing research into the implementation of CFD using HPC,
then you should probably look a bit closer at adaptivity, for instance,
which winds up making FEM much less nearest-neighbor...
> benefit to torus topology for this case it might be an option. BTW, a
> managed HP Pro/Curve (forget model) 36-port gigabit switch is currently
> used, which possibly may also be hindering performance.
the port count indicates that's an older-generation switch, probably
with poorer bandwidth than current models.
> support for Fluent with their product. Dolphin has been _extremely_
> helpful in this respect, providing an SCI cluster for me to test Fluent
> and offering suggestions for running it (thanks Simen).
I'd be astonished if all of the tier-1 vendors didn't have a test cluster
available for the asking, probably with Fluent installed.
> simply because there are not enough users of it. As a consequence, I am
> looking to find the "best" interconnect solution which will allow a few
> people use of most or all of the CPUs for the jobs we run.
there's always a danger of over-benchmarking, but you should probably
see if you can get access to an IB cluster. for CFD, I'm a little
surprised you appear to care so much about latency, since I'd expect
your workload to have the usual volume/surface-area scaling, and
thus doing a lot of work in a single node, and needing only moderate
bursts of bandwidth for nontrivial problem sizes.
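
to put rough numbers on the volume/surface-area argument, here is a minimal
sketch. it assumes each node owns a cubic n x n x n block of a structured
grid and exchanges a one-cell-deep layer on each of its six faces; the
function name and block sizes are made up for illustration, and a real
Fluent decomposition (unstructured, partitioned by a graph partitioner)
will not be this tidy.

```python
def halo_to_work_ratio(n):
    """Face cells exchanged per cell computed, for an n x n x n block.

    Assumes a cubic block and a one-cell-deep exchange on all six faces
    (an illustrative simplification, not how any particular code partitions).
    """
    volume = n ** 3        # cells this node computes on
    surface = 6 * n ** 2   # cells on the six faces that must be communicated
    return surface / volume


# Illustrative block sizes: bigger blocks per node mean relatively less traffic.
for n in (10, 50, 100):
    print(n, halo_to_work_ratio(n))
```

the ratio is simply 6/n, so communication per unit of computation shrinks as
each node's share of the problem grows.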
from looking at list prices on the web, Myrinet, IB and Dolphinics have
similar per-port prices which are noticeably lower than Quadrics
but also dramatically higher than gigabit. I suspect most people
would agree that Quadrics is a latency specialist, at least for
applications that are not purely nearest-neighbor. OTOH, for cheap nodes, you
should probably consider whether spending 50% of the node price
on the interconnect makes sense for the performance boost. (I see 242-based servers
starting at around $2k list, and your total gigabit cost would be
less than $100/port.)
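
one way to frame that tradeoff is as a break-even speedup: for a fixed
budget, how much faster must each node run with the expensive interconnect
to match simply buying more gigabit nodes? the sketch below uses the rough
figures above ($2k per node, $100/port as an upper bound for gigabit, and
50% of the node price for the fast interconnect). the function name is
hypothetical, and it assumes throughput scales linearly with node count,
which is optimistic for gigabit.

```python
def breakeven_speedup(node_price, cheap_port, fast_port):
    """Per-node speedup the fast interconnect must deliver to match
    spending the same money on more nodes with the cheap interconnect.

    Assumes linear scaling with node count (an idealization).
    """
    return (node_price + fast_port) / (node_price + cheap_port)


NODE = 2000.0        # approximate list price of a 242-based server
GIGABIT = 100.0      # upper bound on gigabit cost per port
FAST = 0.5 * NODE    # fast interconnect at 50% of the node price

print(round(breakeven_speedup(NODE, GIGABIT, FAST), 2))
```

with these numbers the fast interconnect has to make each node roughly 1.43
times as productive before it pays for itself.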