[Beowulf] Multicore Is Bad News For Supercomputers

Bogdan Costescu Bogdan.Costescu at iwr.uni-heidelberg.de
Mon Dec 8 11:32:07 PST 2008

On Fri, 5 Dec 2008, Prentice Bisbal wrote:

> Dell and others advertise systems that support up to 128 GB RAM, but 
> I have yet to meet someone who can afford to put all 128 GB RAM in a 
> single box.

Rather than saying "we've been doing this for a long time", I'll mention
that we've had lots of problems with some AMD Opteron based systems.
We've always filled all available memory slots with the highest
capacity (but still affordable ;-)) memory modules in mainboards with 4
or 8 sockets; this allowed e.g. reaching 64GB in 2006 and 128GB in
2007, but created lots of problems with instability under load.
Although we've been given many assurances that the configurations were 
fully supported by CPU, mainboard and memory manufacturers, in 
practice random memory errors occurred and they could only be
eliminated by running the memory at a lower speed or halving the 
memory size - unacceptable as these computers were by contract 
required to run the full memory at the full speed. Some of the 
involved manufacturers denied any knowledge of problems on similar 
configurations, only to say 6 months later that such problems do exist 
in many cases; after having many memory modules, CPUs and mainboards 
exchanged, we could have arrived at the same conclusion by ourselves ;-|

For the latest purchase of this type, we have chosen a Tier 1 vendor
and also changed the memory architecture to Intel shared bus - but for
a different reason - and so far the 128GB hasn't shown any errors. Hope
it stays that way ;-)

Bogdan Costescu

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
E-mail: bogdan.costescu at iwr.uni-heidelberg.de
