[Beowulf] Selection from processor choices; Requesting Guidance

Geoff Jacobs gdjacobs at gmail.com
Sat Jun 17 11:21:32 PDT 2006


Mark Hahn wrote:
>>>> desktop (32 bit PCI) cards. I managed to get 14.6 HPL GFLOPS
>>>> and 4.35 GROMACS GFLOPS out of 8 nodes consisting of hardware
>>> ...
>>>> As a point of reference, a quad opteron 270 (2GHz) reported
>>>> 4.31 GROMACS GFLOPS.
>>> that's perplexing to me, since the first cluster has semp/2500's,
>>> right?  that's a 1.75 GHz K8 core with 128K L2 and 64b memory
>>> interface.  versus the same number of 2.0 GHz, 1M cores each with
>>> 4x 128b memory.  I really wouldn't expect them to be that close - 
>>> any speculation on why GROMACS runs so poorly on the much better
>>> SMP machine?
>> <googling for motherboard specs>
>> Aha, Socket 462. The Semprons he used are K7 based.
> 
> OK, even more so - how does an even older cpu with lower clock,
> slower memory and only gigabit interconnect beat a quad-opt.
> it seems like some other factors were determining performance.
Well, each Opteron core would have to split its local memory pool with
its sister, so per-core bandwidth would be similar. The on-die memory
controller on the Opteron would give a latency bonus, but the registered
DIMMs would incur a penalty. The Socket A motherboards are using a SiS
chipset which might be a little more tuned.
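
To put rough numbers on the bandwidth point, here is a back-of-the-envelope
sketch. The bus widths (64b and 128b) are the ones Mark quoted; the memory
speeds (DDR333 on the Socket A boards, DDR400 on the Opterons) are my
assumptions and are not confirmed anywhere in this thread. These are
theoretical peaks, not measured figures.

```python
def peak_bandwidth_gb_s(bus_bits, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s (10^9 bytes per second)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000.0

# ASSUMPTION: Socket A Sempron with a 64-bit interface running DDR333.
sempron_per_core = peak_bandwidth_gb_s(64, 333)

# ASSUMPTION: each Opteron 270 socket has a 128-bit interface running DDR400,
# and its two cores share that bandwidth equally.
opteron_per_socket = peak_bandwidth_gb_s(128, 400)
opteron_per_core = opteron_per_socket / 2

print(f"Sempron per core: {sempron_per_core:.2f} GB/s")
print(f"Opteron per core: {opteron_per_core:.2f} GB/s")
print(f"Ratio (Opteron / Sempron): {opteron_per_core / sempron_per_core:.2f}")
```

Under those assumptions the Opteron cores keep an edge in peak per-core
bandwidth, but a far smaller one than the raw socket figures suggest.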

If the application largely factored out the interconnect, I could accept
the results being this close. But you're right: HyperTransport is so much
better than gigabit for inter-process communication, and GROMACS should
derive a big advantage from it.

-- 
Geoffrey D. Jacobs

Go to the Chinese Restaurant,
Order the Special


