[Beowulf] Not quite Walmart, or, living without ECC?

Joe Landman landman at scalableinformatics.com
Tue Nov 27 12:35:53 PST 2007


David Mathog wrote:
> Michael Will wrote:
> 
>> We have found that linpack is by far the better memory tester than
>> Memtest86+. 
> 
> So now we have a report of a second method that finds more memory 
> problems than memtest86+.  Can somebody please shed some light on why

Hi David:

   We have been using some GAMESS runs for about 3 years now.  They 
cause systems to generate MCEs (machine check exceptions) at prodigious 
rates if the memory system is flaky.  These are memory systems that pass 
memtest86+ and the other "acid" tests that the vast majority of 
cheap-box-shippers use as their only burn-in test.

> these two programs find defects in memory that memtest86+ doesn't?
> Or is it that they find defects in other parts of the hardware,
> external to the actual RAM, which manifest as memory errors?  The
> key distinction being that swapping memory sticks will cure the
> former but not the latter.
> 
> In any case I'd like to know what it is about linpack/memtester which
> lets them find memory faults that memtest86+ doesn't.  Presumably

Memtest and fellow travelers access memory in a very regular manner, 
which is unlike the way most programs access memory.  If you are going 
to test, it makes sense to test the way you are going to use the system. 
Not only will the performance data be more realistic*, but the system 
test will in fact be based upon the real runs that you intend to use the 
systems for.  This is one of the best ways around to uncover real 
design/implementation issues.
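
To make the "regular versus real-world access" point concrete, here is a 
minimal sketch in Python.  It is not how memtest86+, Linpack, or GAMESS 
work internally, and it is not a real memory tester: an interpreted 
script has no control over physical addresses, caches, or bus load.  The 
function names and the pattern value are made up for illustration.  All 
it shows is the same write-then-verify check driven by two different 
index orders, one a regular sweep and one scattered.

```python
import random
from array import array


def sequential_order(n):
    """Regular sweep: touch words 0, 1, 2, ... in address order."""
    return list(range(n))


def scattered_order(n, seed=0):
    """Irregular sweep: the same words, visited in a shuffled order."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


def check_memory(n_words, order, pattern=0xA5A5A5A5):
    """Write a per-word value in the given order, then read it back.

    Returns the indices whose read-back value did not match.
    The pattern is an arbitrary illustrative choice.
    """
    buf = array("Q", [0]) * n_words          # contiguous 64-bit words
    for i in order:
        buf[i] = pattern ^ i                 # value depends on the address
    return [i for i in order if buf[i] != (pattern ^ i)]


if __name__ == "__main__":
    n = 1 << 16
    for name, order in (("sequential", sequential_order(n)),
                        ("scattered", scattered_order(n))):
        bad = check_memory(n, order)
        print(f"{name}: {len(bad)} mismatches")
```

On healthy hardware both orders should report zero mismatches.  The 
only point is that the order and mix of accesses is a test parameter in 
its own right, and the most representative choice is the access pattern 
of the application you actually intend to run.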

* I am not aware of anyone whose job is to run Linpack all day long, so 
as a test case it is at best artificial, and the "performance" data you 
get from it may not carry over to what it is you are doing.


-- 

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615



