Memory type? (ECC vs non-ECC)

alvin at Maggie.Linux-Consulting.com
Fri Aug 17 15:11:06 PDT 2001


hi ya

ecc or not ? dunno

I don't think ecc memory will help us in our case...
- just had another system that died (crashed) last week....
  ( a regular PC off the shelf )
  it's been up and running for about 5 yrs.. and we think it's
  bad memory....
	- it'd e2fsck the disk...over and over during the
	reboot...

	- problem is bad memory....so it'd e2fsck the good disk
	with bad data in memory.... it wound up trashing the disk

--- get good quality memory... 
	going cheap is NOT the way to save a few $$$ when your
	data integrity is at stake

	- ie know the brand name of the memory sticks (simms)
	and of the chips used on the sticks

	- we prefer the more expensive "Century xxMb SDRAM" 

-- backup your "important data"...

have fun
alvin
http://www.Linux-1U.net




On Fri, 17 Aug 2001, Jared Hodge wrote:

> Dan,
> 	Ok, first with what ECC is.  Error Correcting Code.  How will this
> affect performance?  As far as speed, they run about the same (ECC may
> even be a little slower).  The issue is reliability.  We had a few
> rounds of E-mails on how often errors occur in non-ECC memory chips a
> few months ago (and it's affected by climate, altitude, EMI radiation,
> solar flares, someone breaking wind near the machines, bla, bla, bla). 
> Anyway, I don't think you want us to launch that conversation again, but
> the thing is that with a single machine, non-ECC is typically fine
> (except for mission critical servers, etc.) since the time between
> errors is so great.  The problem is that with a cluster, you have so
> many memory chips that the time between failures (of any one of them) is
> significantly less.  I guess the question is how big is the cluster and
> how much do you lose if you have to restart?  We've got an 8 node
> cluster, 4 GB RAM total that seems to work fine without ECC.  We're
> getting a larger cluster with 24 GB RAM total and going with ECC.  The
> larger the cluster, the more you need ECC.  Also, if you're running
> problems that take many days to complete, go with ECC.  If you're
> running checkpoints, or individual problems only take a few hours, you
> can go with non-ECC.  Hope this helps.
> 
> Jared
> 
> Dan Kirkpatrick wrote:
> > 
> > We're finalizing our specs for our next beowulf cluster... and I had a
> > question...
> > 
> > ECC or non-ECC memory?  Motherboard "supports" ECC memory mode... although
> > non-ecc memory is cheaper so we can get more...
> > How does this realistically affect performance?
> > 
> > comments?
> > Thanks
> > 
> > =======================================================
> > Dan Kirkpatrick                   dkirk at physics.syr.edu
> > Computer Systems Manager
> > Department of Physics
> > Syracuse University, Syracuse, NY
> > http://www.physics.syr.edu/help/    Fax:(315) 443-9103
> > =======================================================
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> -- 
> Jared Hodge
> Institute for Advanced Technology
> The University of Texas at Austin
> 3925 W. Braker Lane, Suite 400
> Austin, Texas 78759
> 
> Phone: 512-232-4460
> Fax: 512-471-9096
> Email: Jared_Hodge at iat.utexas.edu
> 
> 
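Jared's scaling argument above (more memory in the cluster means less time
between errors somewhere in it, and longer runs are more exposed) can be
put into rough numbers. The following is only a minimal sketch, not anything
from the thread: the error rate is a made-up placeholder, the function names
are hypothetical, and it assumes errors are independent and arrive at a
constant rate per GB. The 4 GB and 24 GB totals are the two cluster sizes
Jared mentions; the run lengths are arbitrary examples.

```python
import math

# HYPOTHETICAL rate, chosen only for illustration: one memory error
# per GB per 10,000 hours. Real rates vary widely; plug in your own.
ERRORS_PER_GB_HOUR = 1.0 / 10_000


def mean_hours_between_errors(total_gb, rate=ERRORS_PER_GB_HOUR):
    """Mean time between errors anywhere in the cluster, in hours."""
    return 1.0 / (rate * total_gb)


def prob_error_during_run(total_gb, run_hours, rate=ERRORS_PER_GB_HOUR):
    """Chance of at least one error during a run of run_hours,
    assuming independent errors at a constant rate (Poisson model)."""
    return 1.0 - math.exp(-rate * total_gb * run_hours)


if __name__ == "__main__":
    # Cluster sizes from the thread; run lengths are example values.
    for total_gb in (4, 24):
        mtbe = mean_hours_between_errors(total_gb)
        print(f"{total_gb} GB total: mean {mtbe:.0f} h between errors")
        for run_hours in (3, 72):
            p = prob_error_during_run(total_gb, run_hours)
            print(f"  {run_hours} h run: P(at least one error) = {p:.3f}")
```

Under this model the mean time between errors shrinks in proportion to
total RAM, so the absolute numbers matter less than the ratio between
cluster sizes. Checkpointing effectively replaces the full run length with
the interval between checkpoints, which is why short or checkpointed jobs
are less exposed.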




