[Beowulf] Curious about ECC vs non-ECC in practice
deadline at eadline.org
Fri May 20 09:35:26 PDT 2011
While this is somewhat anecdotal, it may be helpful.
This was not a large-ish cluster, but as you may guess, I wondered
about this for Limulus.
I wrote a script (will post it if anyone is interested)
that runs memtester until you stop it or it finds
an error. I ran it on several Core2 Duo systems
with Kingston DDR2-800 PC2-6400 memory.
As I recall, I ran it on 2-3 systems, and only
one showed an error. I stopped the others
after about three weeks. Here is an example of the
script output when it fails (it logs the
There was an error, inspect memtest-1178
Start Date was: Mon Apr 20 16:04:35 EDT 2009
Failure Date was: Fri May 8 17:55:43 EDT 2009
Test ran 1178 times failing after 1561868 Seconds
(26031 Minutes or 433 Hours or 18 Days)
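
For anyone who wants to try something similar, here is a minimal sketch
of such a loop in Python. It is not the script described above, which
was not posted. The function names, the per-run log name memtest-N, and
the "1024M 1" arguments to memtester are assumptions for illustration;
it relies only on memtester exiting non-zero when a test fails.

```python
import subprocess
import time
from datetime import datetime


def summarize(runs, seconds):
    """Format the run count and elapsed time like the report above."""
    return (f"Test ran {runs} times failing after {seconds} Seconds\n"
            f"({seconds // 60} Minutes or {seconds // 3600} Hours "
            f"or {seconds // 86400} Days)")


def run_logged(cmd, log_path):
    """Run cmd, write its output to log_path, return the exit status."""
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=log,
                              stderr=subprocess.STDOUT).returncode


def run_until_failure(cmd, run=run_logged, clock=time.time, max_runs=None):
    """Repeat cmd until it exits non-zero.

    Returns (runs, elapsed_seconds), counting the failing run, or None
    if max_runs passes complete without a failure.
    """
    start = clock()
    runs = 0
    while max_runs is None or runs < max_runs:
        runs += 1
        # Hypothetical log naming: one file per pass, memtest-<pass number>.
        if run(cmd, f"memtest-{runs}") != 0:
            return runs, int(clock() - start)
    return None


def main():
    # Assumed invocation: test 1024 MB for one loop per pass. memtester
    # usually needs root to lock that much memory.
    started = datetime.now().ctime()
    result = run_until_failure(["memtester", "1024M", "1"])
    if result is not None:
        runs, seconds = result
        print(f"There was an error, inspect memtest-{runs}")
        print(f"Start Date was: {started}")
        print(f"Failure Date was: {datetime.now().ctime()}")
        print(summarize(runs, seconds))
```

Calling main() starts the loop and runs until a pass fails or you
interrupt it. The elapsed-time line uses integer division, which
matches the figures in the report above (1561868 seconds is 26031
minutes, 433 hours, or 18 days when rounded down).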
My experience in running small clusters
without ECC has been very good. IMO it is also
a question of the quality of the memory vendor.
I never had an issue when running tests and
benchmarks, which I do quite a bit on new
> Hi folks
> Does anyone run a large-ish cluster without ECC ram? Or with ECC
> turned off at the motherboard level? I am curious if there are numbers
> of these, and what issues people encounter. I have some of my own data
> from smaller collections of systems, I am wondering about this for
> larger systems.
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> email: landman at scalableinformatics.com
> web : http://scalableinformatics.com
> phone: +1 734 786 8423 x121
> fax : +1 866 888 3112
> cell : +1 734 612 4615
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.