gromacs benchmark and quad Xeons
math at velocet.ca
Tue Jun 18 22:31:18 PDT 2002
On Tue, Jun 18, 2002 at 04:28:04PM -0500, Richard Walsh's all...
> On Tue, Jun 18, 2002 at 02:14:02PM -0400, Velocet wrote:
> > What's wrong here? Any ideas?
> My first reaction is to wonder what your memory foot print is ...
> the bandwidth of all quad's I have seen (except the ES-45 with
> Compaq's typhoon chipset and cross-bar) has been lousy. If you
> are not running mostly/completely in cache this could be an
> issue. Competition from all four CPUs for bandwidth could kill
> Have you tried running on one CPU? Can you shrink the test
> case to make sure it is in cache and then look at the performance?
Well, we went and recompiled everything by hand with all proper options
and got it to run.
We did some quick tests on the d.dppc benchmark with only 500 steps ('cuz
we don't have all night :)
We were seeing this:
         total         per cpu     scaling vs 1cpu   scaling vs 2cpu
1 cpu     94 ps/day    94 / cpu    100%
2 cpu    144 ps/day    72 / cpu     77%              100%
4 cpu    228 ps/day    57 / cpu     61%               79%
Per-CPU throughput drops to about 75-80% of its previous value each time the
number of CPUs doubles. Quite odd. I'm sure we must have something
misconfigured, or, as you say, it could be that the memory bandwidth is being
thrashed. 94 ps/day for a single CPU is already phenomenal - my 1.33 GHz
Tbirds were giving me 63 or so on my Tyan 2466s and 60.4 on the PcChips M817
LMR. 94 is 1.5x faster for a mere 1.2x increase in clock. (1.6 GHz Xeons here,
with their sweet 256/512/1M l1/2/3 caches)
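The per-CPU and scaling columns in the Xeon table can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the function name `efficiency` and the dict layout are made up for illustration, and the only inputs are the total ps/day figures quoted above.

```python
# Total ps/day on the quad Xeon box, keyed by number of CPUs used.
# These are the figures from the table above; nothing else is measured here.
xeon = {1: 94, 2: 144, 4: 228}


def efficiency(results, base_cpus):
    """Per-CPU throughput at each CPU count, relative to the per-CPU
    throughput of the run that used base_cpus CPUs."""
    base_rate = results[base_cpus] / base_cpus
    return {
        n: (total / n) / base_rate
        for n, total in results.items()
        if n >= base_cpus
    }


vs_1cpu = efficiency(xeon, 1)
vs_2cpu = efficiency(xeon, 2)

for n, total in xeon.items():
    line = f"{n} cpu  {total:3d} ps/day  {total / n:5.1f} / cpu  {vs_1cpu[n]:.0%}"
    if n in vs_2cpu:
        line += f"  {vs_2cpu[n]:.0%}"
    print(line)
```

The same function can be pointed at any other set of totals, as long as the baseline CPU count is one of the keys.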
For comparison, with dual CPU Tyan 2466s and 1.333 GHz Tbirds (test setup to
see if Tbirds worked on 2460s/66s) with NS83820 GbE (direct connect, no
switch) we saw:
         total         per cpu     scaling vs 1cpu          scaling vs 2cpu
1 cpu     63 ps/day    63 / cpu    100%
2 cpu    179 ps/day    90 / cpu    141% (superlinear!)      100%
4 cpu    310 ps/day    78 / cpu    123% (still super)        87%
So perhaps that's why.
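To put the two sets of numbers side by side, here is a rough sketch that uses only the totals and clock speeds quoted in this post. The variable names are invented for illustration, and 63 ps/day is the "63 or so" Tyan figure, so treat the last digits loosely.

```python
# Total ps/day from the two tables in this post, keyed by CPU count.
xeon = {1: 94, 2: 144, 4: 228}      # quad 1.6 GHz Xeon box
athlon = {1: 63, 2: 179, 4: 310}    # dual 1.333 GHz Tbird nodes, GbE link

# Ratio of total throughput at equal CPU count (> 1 means the Xeon box wins).
xeon_vs_athlon = {n: xeon[n] / athlon[n] for n in xeon}

# Single-CPU speedup compared with the clock-speed ratio.
single_cpu_speedup = xeon[1] / athlon[1]
clock_ratio = 1.6 / 1.333
per_clock_gain = single_cpu_speedup / clock_ratio

for n, ratio in xeon_vs_athlon.items():
    print(f"{n} cpu: Xeon/Athlon total throughput = {ratio:.2f}")
print(f"single-CPU speedup {single_cpu_speedup:.2f}x "
      f"for a {clock_ratio:.2f}x clock, "
      f"{per_clock_gain:.2f}x per clock")
```

On these figures the Xeon is ahead only in the single-CPU run; at 2 and 4 CPUs the Athlon setup delivers more total ps/day.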
Here's the machine: (1.6 GHz Xeons, 8 GB RAM)
Not that it's cost effective for a cluster ;)