[Fwd: Re: [Beowulf] ECC exerciser/exorciser?]

Joe Landman landman at scalableinformatics.com
Mon Jan 26 15:33:06 PST 2009


Tony Travis wrote:
> Excluded by a SPAM filter + reposted, by the list owner's request :-)

Hi Tony:

   I am on gmail.com as joe.landman if spam filters are doing bad things ...


> Joe Landman wrote:
>> Mark Hahn wrote:
>>
>>> - do you have or know of a good exerciser for testing ECC's?  yes, I
>>> know about memtest86, but I'm more curious about a load that could be
>>> run under linux.  my thinking is that ecc's are triggered by bad
>>> reads, so something which allocates all memory and then continually
>>> reads it would be best.
>> That's memtest.  We found it doesn't trigger MCEs, and it will often
>> report a system as good that, once it leaves the lab, generates lots of
>> MCEs on customer code.  So we run specific codes (GAMESS and others) to
>> burn in the machine.
> 
> Hello, Joe.
> 
> Do you mean Memtester?
> 
>         http://pyropus.ca/software/memtester/

There are two that I know of ... memtest and memtest86, one of which is 
a fork of the other.  While I like both for coarse testing, we run a 
bunch of GAMESS runs to burn nodes in.  Some folks like HPL for this.  I 
like large dense matrix computations that pound on the memory subsystem.
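
A rough sketch of the "allocate and keep re-reading" load Mark describes, 
for illustration only.  The function name exercise_memory, its parameters 
and the two bit patterns are hypothetical choices, not anything from 
memtest or from our burn-in.  Note that a corrected ECC error would not 
show up as a mismatch here; it would only appear in the kernel's MCE/EDAC 
reporting.  The buffer also needs to be much larger than the CPU caches 
for the reads to reach DRAM at all.

```python
import array


def exercise_memory(size_bytes, passes=1, reads_per_fill=4):
    """Fill a buffer with a bit pattern, re-read it several times,
    and return the number of 64-bit words that did not match."""
    # Alternating-bit patterns (an assumption; any known pattern works).
    patterns = (0x5555555555555555, 0xAAAAAAAAAAAAAAAA)
    words = size_bytes // 8
    mismatches = 0
    for _pass in range(passes):
        for pattern in patterns:
            # Allocate and fill: one unsigned 64-bit word per 8 bytes.
            buf = array.array("Q", [pattern]) * words
            for _read in range(reads_per_fill):
                # count() walks the whole buffer, so every word is read.
                mismatches += words - buf.count(pattern)
    return mismatches
```

Calling exercise_memory(256 * 1024 * 1024, passes=10) would walk 256MB ten 
times.  Pure Python is slow at this, so it is a much gentler load than a 
dense matrix code; treat it as a description of the idea, not a burn-in tool.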


> I stress test non-ECC memory in our compute nodes by running 100
> memtester passes on 128MB of the available RAM. This test often reveals
> problems in the memory management system that an initial 24h memtest86+
> burn-in on all the memory on a node doesn't detect. Memtester is a more

This is good to hear (that others find memtest86 and the like don't 
trigger the errors that end users/customers see in the field).
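
One way to check whether a given load actually provokes ECC events is to 
compare the kernel's EDAC counters before and after the run.  The sketch 
below assumes a Linux box with an EDAC driver loaded, so that 
/sys/devices/system/edac/mc/mc*/ce_count and ue_count exist; the helper 
names read_edac_counts and new_errors are made up for this example.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": n, "ue_count": n}} from sysfs."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts


def new_errors(before, after):
    """Return per-controller counter increases between two snapshots."""
    delta = {}
    for mc, entry in after.items():
        for name, value in entry.items():
            grew = value - before.get(mc, {}).get(name, 0)
            if grew > 0:
                delta.setdefault(mc, {})[name] = grew
    return delta
```

Take one snapshot with read_edac_counts() before the burn-in job, another 
afterwards, and pass both to new_errors(); an empty result means the 
counters did not move during the run.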

> empirical stress test than Memtest86+, but I believe it's more realistic
> and I chose 128MB as typical for the type of jobs running on our system.

Excellent.
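
For reference, Tony's recipe (100 memtester passes over 128MB) could be 
driven from a script along these lines.  This is a sketch under 
assumptions: memtester is installed and on the PATH, the wrapper names 
memtester_command and run_memtester are hypothetical, and the defaults are 
simply the numbers from Tony's message.  memtester usually needs enough 
privilege to lock the memory it tests.

```python
import subprocess


def memtester_command(megabytes=128, loops=100):
    """Build the argument list: memtester <size>M <loops>."""
    return ["memtester", f"{megabytes}M", str(loops)]


def run_memtester(megabytes=128, loops=100):
    """Run memtester and return True if it exited with status 0."""
    result = subprocess.run(memtester_command(megabytes, loops))
    # memtester exits non-zero if it could not allocate/lock the memory
    # or if any of its tests failed.
    return result.returncode == 0
```

Calling run_memtester() with no arguments reproduces the 128MB, 100-pass 
run; several copies can be started side by side to cover more of the RAM.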

I really like running end user code as a test.  GAMESS is one, probably 
some Gromacs and other similar things (NAMD, BLAST, HMMer) as well. 
Combined with Octobonnie, it makes for some really good loads on machines :)

Right now we have customers hammering on JackRabbits using 15-20 
simultaneous bonnies over channel bonded gigabit.  A little stress test. 
I prefer to stress it in the lab, because it's harder to fix it in the field.

Joe

> 
> Bye,
> 
>         Tony.
> --
> Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition
> and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK
> tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk
> mailto:a.travis at abdn.ac.uk, http://bioinformatics.rri.sari.ac.uk/~ajt
> 


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


