[Beowulf] Selection from processor choices; Requesting Guidance

Thu Jun 15 22:43:09 PDT 2006

> > > Initially, we are deciding to use Gigabit ethernet switch and 1GB of
> > > RAM at each node.
> 
> that seems like an odd choice.  it's not much ram, and gigabit is 
> extremely slow (relative to alternatives, or in comparison to on-board
> memory access.)

This is a common misconception (and Mark is one of the
best on this list). I'm not directing my comments at Mark, but using his
comment as a platform for my soapbox :) The misconception is that you need a
low-latency network for CFD codes because of the message sizes.

Let me spew some benchmarks I've seen.

- One CFD code that I've benchmarked got over 80% scaling at 200 CPUs
(2 CPUs per node) with plain GigE.
- This same code was only about 12% faster on Myrinet 2G than on
GigE. With IB it was only a few percent better than with Myrinet 2G. Infinipath
was about the same as IB but maybe a bit faster.
- The same code only lost about 1% in performance in switching to dual-core
compared to single-core CPUs out to about 16 or 32 CPUs total (that was the limit of
the testing). This isn't a network related benchmark, but I thought I would
throw it out for fun :)
- On Overflow2, we've seen IB and IP to be about the same (at least well
within the noise) for the size problems we've tested (fairly large) and the
range of CPU counts.
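For anyone who wants to put their own timings into the same terms as the numbers above, here is a minimal sketch of the arithmetic. The function names and the wall-clock times in the usage note are made up for illustration; they are not from any of the benchmarks I mentioned.

```python
def parallel_efficiency(t_base, n_base, t_n, n):
    """Scaling efficiency of a run on n CPUs relative to a baseline on n_base CPUs.

    t_base and t_n are wall-clock times for the same problem.
    A value of 1.0 means perfect (linear) scaling.
    """
    measured_speedup = t_base / t_n
    ideal_speedup = n / n_base
    return measured_speedup / ideal_speedup


def relative_gain(t_slow, t_fast):
    """Fractional speed advantage of the faster run, e.g. 0.12 means 12% faster."""
    return t_slow / t_fast - 1.0
```

As a usage example with hypothetical timings: if a case takes 1000 s on 2 CPUs and 12.5 s on 200 CPUs, parallel_efficiency(1000.0, 2, 12.5, 200) comes out at 0.8, i.e. 80% scaling. Likewise, relative_gain(112.0, 100.0) is about 0.12, which is the kind of number I mean by "12% faster".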

   We have some experience with other CFD codes such as Star-CD, Fluent,
CFD++, CFL3D, Overflow2, etc. and they all show about the same gross
trends, although there are some
differences. Many of the differences have to do with the algorithms used and
the implementations of the algorithms. For example, is the code structured or
unstructured? Does it do overlapping (chimera) grids? How "overlapped" are
the grids? How load-balanced is the problem and the algorithm? Are you doing
node-centered or cell-centered (in the case of unstructured)? There are
many things to consider. In many cases GigE is good enough and you can't
beat the price. Level 5 looks very attractive and may be the price/performance
king. GAMMA is pretty cool as well although I don't have any benchmarks (yet).
Myrinet 2G is good as well and is close to the price/performance king. IB,
IP, Quadrics are all good as well, but they may not be the best in terms of
price/performance (I even have one benchmark where IB is slower than GigE.
I'm still trying to explain that one :)  ).
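A rough way to think about the price/performance argument: if price/performance is total cluster cost divided by delivered performance, a faster network only wins when the cluster cost grows by a smaller fraction than the performance does. The sketch below just encodes that arithmetic. The function names and the base cluster cost in the usage note are hypothetical; plug in your own quotes and your own measured speedup.

```python
def price_performance(total_cost, relative_performance):
    """Cost per unit of performance (lower is better)."""
    return total_cost / relative_performance


def max_network_premium(base_cluster_cost, speedup_factor):
    """Largest extra spend on a faster network that still ties the baseline.

    speedup_factor is the measured application speedup over the baseline
    network, e.g. 1.12 for a code that runs 12% faster.
    """
    return base_cluster_cost * (speedup_factor - 1.0)
```

For example, with a 12% application speedup and a made-up GigE cluster cost of 100000, max_network_premium(100000.0, 1.12) is about 12000. Spend more than that on the interconnect and the GigE cluster has the better price/performance for that code.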

Here are some other observations:

- The network/MPI combination is fairly critical to good performance and to
price/performance. I have done some benchmarks where the right MPI library
on GigE produces faster results than a bad MPI library on Myrinet. Seems
counter-intuitive, but I've seen it.
- The MPI is pretty important to good performance, particularly on GigE. One
benchmark I did showed over a 2X improvement in performance with LAM compared to
MPICH1. I know MPICH1 is really old and we have a shiny new MPICH2, but
you would be surprised how many people start with MPICH1 and how many
people stick with it.

So, I'm not picking on Mark but I wanted to throw out some random observations
I've made over the years. Not that I'm an expert but I've got a few CFD/cluster
bruises and thought I would show people why I got them :)

Jeff