Memory selection

Robert G. Brown rgb at phy.duke.edu
Thu Jan 16 08:59:20 PST 2003


On Thu, 16 Jan 2003, Dave Lane wrote:

> Hi all,
> 
> I'm going to be building an AMD MP-based beowulf system this spring and have 
> a question or two about memory selection for the nodes.
> 
> First question is which type of memory to choose: ECC, non-ECC, Registered 
> or not. My understanding is that registered memory is slower, but allows for 
> more capacity (more chips) on a DIMM. This should not be an issue for me, 
> since I expect that 512M in each node will be enough.

Read AMD's webpage(s) for their memory recommendations and follow them.
AMD MP's in my experience are very sensitive, period, to just about
everything in their engineering recommendations.  Use an "approved"
power supply, approved memory, approved motherboard.

My recollection (without digging out a motherboard manual or rechecking
their website myself) is that the recent dual AMD's all require
registered ECC, and pretty high quality registered ECC at that.  As always YMMV
and somebody will probably chime in with how they tried other memory
types and it worked, but our luck in that regard has not been good.

There are still plenty of memory vendors that make DIMMs that meet AMD's
specs.  ECC runs a bit ($50?) above non-ECC for 512 MB PC2100 DIMMs;
registered ECC costs about $10 more than that, or in the ballpark of
$200 for 512 MB (probably less if you pricewatch for bleeding edge lows
-- these are OTC retail prices).

Quite a bit more than SDRAM...

> Second question: if memory errors occur with ECC memory, does Linux know 
> about them and report them in the logs? (This does occur on Sun Solaris 
> systems.)

I think this is a FAQ -- do a search on the list archives, as I think
Don Becker (?) may have answered it two or three times now, and it
has been discussed fairly extensively. ;-)

If I recall the discussion correctly, some listvolken swear by ECC and
run code on systems where they see memory errors turn up.  Others use
any-old memory and don't see the errors, but neither do they see
overwhelming evidence that their systems are constantly becoming
corrupted.  But I could be misremembering.
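
For anyone who wants to check for themselves whether ECC errors are being
reported, here is a minimal sketch.  It assumes a Linux kernel that ships
the EDAC subsystem with a memory-controller driver loaded for the chipset
(that is my assumption, not something established in the discussion
above).  On such systems, per-controller corrected and uncorrected error
counters are exposed under /sys/devices/system/edac/mc/.  The function
name read_edac_counts is made up for illustration; if the directory is
absent the function simply returns nothing.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC counters.

    Assumes the EDAC sysfs layout (mcN/ce_count, mcN/ue_count).
    Returns an empty dict if no EDAC driver is loaded.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts  # no EDAC support visible on this machine
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

Run periodically on each node (from cron, say), a nonzero and growing
corrected count would be the sort of evidence the ECC advocates describe.
An empty result means only that no counters were found, not that the
memory is clean.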

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu





