[Beowulf] Selection from processor choices; Requesting Guidance

Mon Jun 19 09:08:49 PDT 2006

On Sat, 2006-06-17 at 11:34, Mark Hahn wrote:
> > >> desktop (32 bit PCI) cards. I managed to get 14.6 HPL GFLOPS
> > >> and 4.35 GROMACS GFLOPS out of 8 nodes consisting of hardware
> > > ...
> > >> As a point of reference, a quad opteron 270 (2GHz) reported
> > >> 4.31 GROMACS GFLOPS.
> > > 
> > > that's perplexing to me, since the first cluster has semp/2500's,
> > > right?  that's a 1.75 GHz K8 core with 128K L2 and 64b memory
> > > interface.  versus the same number of 2.0 GHz, 1M cores each with
> > > 4x 128b memory.  I really wouldn't expect them to be that close - 
> > > any speculation on why GROMACS runs so poorly on the much better
> > > SMP machine?
> > 
> > <googling for motherboard specs>
> > Aha, Socket 462. The Semprons he used are K7 based.
> 
> OK, even more so - how does an even older cpu with lower clock,
> slower memory and only gigabit interconnect beat a quad-opt.
> it seems like some other factors were determining performance.

  Could anyone comment on which version of GROMACS was being used?  The
code relies on computational kernels written in assembly, and until 3.3
was released there were no x86_64 assembly kernels.  The configure
script wasn't smart enough to figure out that an Opteron could run the
x86 assembly, so it would use auto-generated Fortran or C code instead.

  Hacking the configure script to use the x86 assembly and compiling
32-bit resulted in about a 2x speedup on Opteron.  I believe the new
x86_64 kernels and 64-bit compilation in GROMACS 3.3 are even faster,
but I don't have numbers on hand.
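
  For a rough sense of scale, here is a back-of-the-envelope sketch in
Python using the GROMACS figures quoted above.  It assumes 8 cores on
each side (8 single-core Sempron nodes, 4 dual-core Opteron 270s), that
the Opteron run fell back to the non-assembly kernels (unconfirmed),
and that the ~2x factor applies uniformly.  The function names are
only illustrative.

```python
# GROMACS GFLOPS figures as quoted earlier in this thread.
SEMPRON_CLUSTER_GFLOPS = 4.35  # 8 nodes, one single-core Sempron each
QUAD_OPTERON_GFLOPS = 4.31     # quad Opteron 270 (assumed 8 cores total)
CORES = 8                      # assumption: same core count on both sides
ASSUMED_SPEEDUP = 2.0          # "about a 2x speedup" with x86 assembly kernels


def per_core(gflops, cores=CORES):
    """Average GFLOPS per core for an aggregate figure."""
    return gflops / cores


def projected(gflops, speedup=ASSUMED_SPEEDUP):
    """Aggregate GFLOPS if the assumed kernel speedup applied uniformly."""
    return gflops * speedup


if __name__ == "__main__":
    print("Sempron cluster, per core: %.3f" % per_core(SEMPRON_CLUSTER_GFLOPS))
    print("Quad Opteron, per core:    %.3f" % per_core(QUAD_OPTERON_GFLOPS))
    print("Quad Opteron, projected:   %.2f" % projected(QUAD_OPTERON_GFLOPS))
```

  If those assumptions hold, the assembly kernels alone would put the
quad Opteron at roughly twice the Sempron cluster's figure, which is
closer to what the hardware difference would suggest.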

-Kevin
