[Beowulf] recommendations for cluster upgrades

Jan Heichler jan.heichler at gmx.net
Sun May 17 04:04:39 PDT 2009

Hello Tiago,

On Sunday, May 17, 2009, you wrote:

On Sat, May 16, 2009 at 11:56 PM, Rahul Nabar <rpnabar at gmail.com> wrote:

On Sat, May 16, 2009 at 2:34 PM, Tiago Marques <a28427 at ua.pt> wrote:
> One of the codes, VASP, is very bandwidth limited and runs best on a
> core count that is a multiple of 3. The 5400s are also very bandwidth
> limited (both memory and FSB), so they sometimes don't scale well above 6
> cores. They are very fast per core, as someone mentioned, when compared to
> AMD cores.

Thanks Tiago. This is super useful info. VASP is one of our major
"users" too. Possibly 40% of the cpu-time. Rest is a similar
computational chemistry code, DACAPO.

It would be interesting to compare my test-run times on our
AMD-Opterons (Barcelona). Is it possible to share what your benchmark
job was?

I'll try to talk to the user who crafted it for me before, but it should be no problem to pass it to you after.


Since you mention VASP is bandwidth limited, do you mean memory
bandwidth or the interconnect? Maybe this question itself is naive.
Not sure. What interconnect do you use? We have gigabit ethernet dual

Memory bandwidth, as you can see from the performance gain when going from 1066 MHz to 1600 MHz, even with looser timings IIRC.
Of course interconnects also play a role, even internal ones, which in the case of the Xeons was a very slow FSB.
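
For a rough sense of scale, here is a minimal sketch of the theoretical peak bandwidth at the two speeds mentioned above. It assumes a 64-bit (8-byte) data path and treats "1066" and "1600" as effective transfer rates in MT/s; both are assumptions not stated in this thread, and sustained bandwidth in practice will be lower.

```python
# Back-of-the-envelope peak bandwidth for a single bus/channel.
# Assumption: 64-bit (8-byte) data path; rates are effective MT/s.
BUS_WIDTH_BYTES = 8


def peak_bandwidth_gb_s(mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return mega_transfers_per_s * 1e6 * BUS_WIDTH_BYTES / 1e9


slow = peak_bandwidth_gb_s(1066)
fast = peak_bandwidth_gb_s(1600)
print(f"1066: {slow:.2f} GB/s  1600: {fast:.2f} GB/s  ratio: {fast / slow:.2f}x")
```

The ratio is only an upper bound on what a bandwidth-bound code could gain; looser timings and other bottlenecks reduce the real improvement.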

I use single GbE because, as far as I could benchmark, I hardly found anything that could use more than one node efficiently, and no one - not even here - could help me with that. It seems I need InfiniBand.
I only managed a 33% performance increase with two nodes, and that was with a really huge job (100k+ atoms) in Gromacs.
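
To put the 33% gain on two nodes in perspective, here is a small sketch that turns it into a parallel efficiency and, under Amdahl's law, an implied non-scaling fraction. The function names are hypothetical, and lumping communication overhead over GbE into a single "serial" fraction is a simplifying assumption, not a measurement.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / nodes


def amdahl_serial_fraction(speedup: float, nodes: int) -> float:
    """Solve speedup = 1 / (s + (1 - s) / nodes) for the non-scaling fraction s."""
    return (nodes / speedup - 1) / (nodes - 1)


speedup = 1.33  # 33% faster on two nodes than on one
nodes = 2
print(f"efficiency: {parallel_efficiency(speedup, nodes):.1%}")
print(f"implied non-scaling fraction: {amdahl_serial_fraction(speedup, nodes):.1%}")
```

Under these assumptions roughly half of the run time does not benefit from the second node, which fits the suspicion that the interconnect is the limiting factor.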

For VASP you should look at ConnectX or InfiniPath. InfiniHost III scaled badly in the scenarios I saw, probably because of the use of collectives.

Which brings me to a point that I forgot to mention to you. When considering Intel machines, you can always get a compiler license for $2000, give or take,

2000 USD sounds rather expensive. Node-locked licenses are usually cheaper... Look for the package with compilers, MKL and MPI - the Cluster Toolkit. It is definitely worth it (when buying more than just a single machine).

