[Beowulf] Tyan S2882

serguei.patchkovskii at sympatico.ca
Mon Oct 2 08:31:24 PDT 2006

Thomas Geibhardt wrote:
: cpu family      : 15
: model           : 33

This is the same model as our 875s. It should support 8 ranks of DDR400 (and not just DDR333), so your BIOS seems to be a little overcautious.

: It is very difficult to test that since we cannot trigger the 
: crashes reliably. The cluster is now running stable for more 
: than a week. If I slowed down the memory bus speed it would
: take months to get a statistically significant conclusion. On 
: the other hand an average rate of 2 crashes per week is rather
: annoying.

We found that enabling L2 and memory scrub greatly increases the probability that slightly dodgy DIMMs fail, which makes them much quicker to identify. We use scrub intervals of 655 microseconds for L2 and 81.9 microseconds for main memory. We also enable memory scrub redirect (which seems to be necessary to avoid POST failures on cold boot) and chipkill.
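To get a feel for what these intervals mean in practice, here is a rough back-of-the-envelope sketch. It assumes that each scrub event covers one 64-byte block and that the interval is the time between successive events. That is my reading of how the setting works, not something I have verified for every BIOS. The 1 MiB L2 and 4 GiB of memory per node are illustrative figures only, so substitute your own.

```python
# Rough estimate of how long one full scrub pass takes.
# Assumption: one scrub event touches one 64-byte block, and the
# configured interval is the time between successive events.

MIB = 1024 ** 2
GIB = 1024 ** 3


def full_pass_seconds(region_bytes, interval_us, block_bytes=64):
    """Seconds needed to scrub every block of a region once."""
    blocks = region_bytes // block_bytes
    return blocks * interval_us * 1e-6


# Illustrative sizes (assumed, not taken from the cluster in question).
l2_pass = full_pass_seconds(1 * MIB, 655.0)    # L2 scrub at 655 us
dram_pass = full_pass_seconds(4 * GIB, 81.9)   # DRAM scrub at 81.9 us

print(f"L2 full pass:   {l2_pass:.1f} s")
print(f"DRAM full pass: {dram_pass / 60:.1f} min")
```

Under these assumptions every memory location gets visited on a timescale far shorter than the days between your crashes, which would be consistent with marginal DIMMs showing up much sooner once scrubbing is on.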


