[Beowulf] Correct networking solution for 16-core nodes

Mark Hahn hahn at physics.mcmaster.ca
Tue Aug 1 21:44:10 PDT 2006

> Following your suggestions 8 months ago, we bought a Tyan S4881 server
> with 8 dual-core Opteron CPUs and 64GB RAM. Now we will buy new ones (2
> more for the time being), and we eventually plan to form a cluster
> from these servers, which will have at most 8 boxes. As you can guess, the
> critical question we need answered is how to interconnect these
> fat nodes for max efficiency. Our parallel program is heavily loaded with
> many small and large P2P communications.

what does that mean, exactly?  you need to quantify your interconnect needs:
how often are programs waiting on file IO, and is that due to
the fileserver being slow or the net?  similarly, how often are programs
waiting on communication among MPI ranks, and how much of that
wait is due to big messages (bandwidth) rather than small ones (latency)?
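
as a rough way to frame that question, here is a minimal sketch of the usual
"latency + size/bandwidth" cost model for a single message.  the function
names and the 5 us / 1 GB/s figures are illustrative assumptions, not
measurements of any particular interconnect; plug in numbers from your own
ping-pong tests.

```python
# rough cost model for one message: time = latency + size / bandwidth
# all numbers used below are illustrative assumptions, not measurements

def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to deliver one message of the given size."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

def latency_fraction(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the message time spent on latency rather than transfer."""
    return latency_s / message_time(size_bytes, latency_s, bandwidth_bytes_per_s)

def crossover_size(latency_s, bandwidth_bytes_per_s):
    """Message size (bytes) at which latency and transfer time are equal."""
    return latency_s * bandwidth_bytes_per_s

if __name__ == "__main__":
    latency = 5e-6    # assumed: 5 us one-way latency
    bandwidth = 1e9   # assumed: 1 GB/s sustained bandwidth
    for size in (64, 4096, 1 << 20):
        frac = latency_fraction(size, latency, bandwidth)
        print(f"{size:>8} B: {frac:.0%} of message time is latency")
```

messages well below the crossover size are latency-bound, messages well above
it are bandwidth-bound; a histogram of your program's message sizes tells you
which side matters more.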

> I feel that we need a
> high performance network like Myrinet, IB, or Quadrics, so as not to waste the 1us
> latency and high bandwidth that we observe between the 16 CPU cores. If anyone

I intuit (totally without rigor!) that fatter nodes do increase bandwidth
needs, but don't necessarily change the latency picture.

if I had three fat nodes, I'd probably try to use a dual-port IB card 
in each, directly connected (no switches)...
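
a quick sketch of why the switchless approach fits three nodes but not eight,
assuming "directly connected" means every pair of nodes gets its own cable
(a full mesh).  the helper names are hypothetical, and partial topologies
such as a ring are not modelled here.

```python
# port and cable counts for a switchless full mesh
# (assumption: every pair of nodes has its own dedicated link)

def ports_needed_full_mesh(num_nodes):
    """Ports each node needs to reach every other node directly."""
    return max(num_nodes - 1, 0)

def cables_full_mesh(num_nodes):
    """Total point-to-point cables in a full mesh of num_nodes."""
    return num_nodes * (num_nodes - 1) // 2

def can_direct_connect(num_nodes, ports_per_node):
    """True if a full mesh fits within the available ports per node."""
    return ports_needed_full_mesh(num_nodes) <= ports_per_node

if __name__ == "__main__":
    for n in (3, 8):
        print(n, "nodes:",
              ports_needed_full_mesh(n), "ports per node,",
              cables_full_mesh(n), "cables,",
              "fits dual-port cards:", can_direct_connect(n, 2))
```

with dual-port cards, three nodes form a triangle; at eight nodes a full mesh
would want seven ports per node, so a switch (or a non-mesh topology) becomes
necessary.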

