[Beowulf] Multicore Is Bad News For Supercomputers
Bogdan Costescu
Bogdan.Costescu at iwr.uni-heidelberg.de
Mon Dec 8 11:32:07 PST 2008
On Fri, 5 Dec 2008, Prentice Bisbal wrote:
> Dell and others advertise systems that support up to 128 GB RAM, but
> I have yet to meet someone who can afford to put all 128 GB RAM in a
> single box.
Rather than saying "we've been doing this for a long time", I'll mention
that we've had lots of problems with some AMD Opteron based systems.
We've always filled up all possible memory slots with the highest
capacity (but still affordable ;-)) memory modules in mainboards with 4
or 8 sockets; this allowed e.g. reaching 64GB in 2006 and 128GB in
2007, but created lots of problems with instability under load.
Although we've been given many assurances that the configurations were
fully supported by CPU, mainboard and memory manufacturers, in
practice random memory errors occurred and they could only be
eliminated by running the memory at a lower speed or halving the
memory size - unacceptable as these computers were by contract
required to run the full memory at the full speed. Some of the
involved manufacturers denied any knowledge of problems on similar
configurations, only to say 6 months later that such problems do exist
in many cases; after having many memory modules, CPUs and mainboards
exchanged, we could have arrived at the same conclusion by ourselves ;-|
For the latest purchase of this type, we have chosen a Tier 1 vendor
and also changed the memory architecture to Intel shared bus - but for
a different reason - and so far the 128GB hasn't shown any errors. Hope
it stays that way ;-)
--
Bogdan Costescu
IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
E-mail: bogdan.costescu at iwr.uni-heidelberg.de