[Beowulf] Opteron performance
Greg Lindahl
lindahl at pathscale.com
Sat Nov 27 16:18:49 PST 2004
On Fri, Nov 26, 2004 at 11:29:29AM -0000, Kozin, I (Igor) wrote:
> What bothers me primarily is that you have to run a benchmark
> many more times than usual to get the best performance on an Opteron.
We haven't observed this, but we do follow a few basic rules:
1) 2.6 kernels handle NUMA better than 2.4 kernels.
2) Set your BIOS to "node interleave off" to improve
scaling, at the potential cost of less uniformity.
3) Before running a program that uses a lot of memory, pre-run
a program that touches a lot of memory; this pages out existing
pages and lets the kernel balance everything as it comes
back in. This is worth a few percent even for serial SPECcpu
runs, and is a must for multi-CPU things like SPECrate.
4) If a single process uses more memory than is attached to
one CPU, expect trouble.
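
The pre-run in rule 3 can be sketched roughly as follows. This is only an illustration, not the tool used for the runs described here: the function name touch_memory and the 64 MiB default are made up for the example, and it assumes that writing one byte per page of an anonymous mapping is enough to make the kernel back each page with real memory.

```python
import mmap
import sys


def touch_memory(n_bytes, page_size=mmap.PAGESIZE):
    """Map n_bytes of anonymous memory and write one byte per page.

    Hypothetical helper for illustration. Returns the number of pages
    written. n_bytes must be greater than zero.
    """
    buf = mmap.mmap(-1, n_bytes)  # anonymous mapping, not backed by a file
    try:
        pages = 0
        for offset in range(0, n_bytes, page_size):
            buf[offset] = 1  # the write forces the kernel to back this page
            pages += 1
        return pages
    finally:
        buf.close()  # release the mapping so the real job can reuse the memory


if __name__ == "__main__":
    # Assumed usage: size in MiB as the first argument, default 64.
    mib = int(sys.argv[1]) if len(sys.argv) > 1 else 64
    print(touch_memory(mib * 1024 * 1024), "pages touched")
```

To have any effect along the lines of rule 3, the size would need to be a large share of the machine's memory, and the script would run to completion just before the benchmark starts.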
There's nothing really new here; you've had to do similar things
on SGI machines and big SMPs for ages.
Following these rules, we've gotten excellent repeatability and
scaling on 2-CPU and 4-CPU boxes on a lot of codes, both MPI and
OpenMP.
-- greg