[Beowulf] ECC

Reuti reuti at staff.uni-marburg.de
Mon Nov 5 09:33:30 PST 2012

On 05.11.2012 at 18:23, Douglas Eadline wrote:

>> On 05.11.2012 at 17:50, Douglas Eadline wrote:
>>> --snip--
>>>>> More interesting is the ECC discussion.
>>>>> ECC is simply a requirement IMHO, not a 'luxury thing' as some
>>>>> hardware engineers see it.
>>>> Depends on your computational model.  Would you rather spend money on
>>>> ECC or on more processors?
>>>> ECC comes at a cost in speed as well.  There is some non-zero time
>>>> required to compute the syndrome bits and do the correction on the read.
>>>> Sure, you can pipeline it, but there's some extra latency inevitably added.
>>> I find it interesting that many users thought GPUs could not be
>>> a research tool unless they had ECC memory. I have one associate who
>>> turns it off because they get 10% better performance on their
>>> Amber runs.
>> Turned it off in the BIOS or installed non-registered memory? With my
>> tests I couldn't see any difference in execution time whether the
>> installed ECC memory is switched off or on (or even which type of error
>> correction I set up in the BIOS). Comparing registered and non-registered
>> memory would be a more understandable difference in execution time.
>> Several CPUs also slow down memory access if many DIMMs are installed, so
>> it seems to be better to use larger and hence fewer memory modules - which
>> might be more expensive though.
> Turned off in GPU BIOS, see bottom of page:
>  http://ambermd.org/gpus/#Max_Perf

Ah, thanks. You were referring to GPU memory, while I meant ordinary main memory. The question then becomes: who verifies the ECC on a GPU card?
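
For reference, the "compute the syndrome bits and do the correction on the read" step quoted above can be sketched with a toy Hamming(7,4) code. This is only an illustration under stated assumptions: real ECC DIMMs and GPU memory use wider codes implemented in hardware, and the function names below (encode, syndrome, correct, decode) are made up for this example.

```python
# Toy Hamming(7,4) single-error-correcting code.
# Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4

def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def syndrome(c):
    """Return 0 if all parity checks pass, else the 1-based position
    of the single flipped bit."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s3

def correct(c):
    """Recompute the syndrome on 'read' and flip the bit it points at.
    Returns the repaired codeword and the error position (0 = none)."""
    pos = syndrome(c)
    fixed = list(c)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed, pos

def decode(c):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [c[2], c[4], c[5], c[6]]

if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[4] ^= 1  # simulate a single bit flip at position 5
    repaired, pos = correct(damaged)
    print("error position:", pos)
    print("data recovered:", decode(repaired))
```

The extra XORs on every read are the "non-zero time" mentioned in the quoted text. Whether that shows up as a measurable slowdown depends on how the hardware pipelines the check, which this sketch does not model.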

-- Reuti
