Take any two: motherboard performance, compatibility, value

Jakob Østergaard jakob at ostenfeld.dtu.dk
Wed Jun 28 13:28:58 PDT 2000


On Wed, 28 Jun 2000, Josip Loncaric wrote:

> Hello,
> 
...
> 
> So we are back to square one, i.e. 440BX SMP motherboards and PC100
> RAM.  This is Not Good.  High RAM bandwidth is essential, particularly
> on dual P3/800 machines (faster clock, smaller cache)...

We bought a new dual 550 PIII at work recently, and ended up using the good
old Asus P2B-D (BX-based).  It seemed to be the only really affordable
and known-stable solution.

> BTW, I see that ECC corrects about one single bit error per month in
> 12GB of RAM.  Our total system will have close to 40GB, so errors could
> pop up weekly, which is why we need ECC.  
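
The scaling in the quoted estimate can be checked with a few lines of
Python.  This is only a rough sketch: it assumes the error rate grows
linearly with the amount of RAM and that a month has 30 days, neither of
which is stated in the original message, and the function name is made up
for illustration.

```python
# Rough sketch: scale an observed corrected-error rate to a larger system.
# Assumptions (not from the original post): the rate is proportional to
# the amount of RAM, and one month is 30 days.

DAYS_PER_MONTH = 30.0


def errors_per_month(observed_errors, observed_gb, target_gb):
    """Scale an observed monthly error count linearly with RAM size."""
    return observed_errors * target_gb / observed_gb


# Figures from the quoted text: about 1 error per month in 12GB,
# and a planned total of close to 40GB.
rate = errors_per_month(1.0, 12.0, 40.0)
days_between = DAYS_PER_MONTH / rate

print("expected errors per month: %.2f" % rate)
print("expected days between errors: %.1f" % days_between)
```

Under those assumptions the result is a little over three errors per
month, which is in line with the "weekly" figure quoted above.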

Are you absolutely certain that ECC RAM on PC hardware actually *corrects*
bit errors?

There was a short discussion on this subject on the linux-kernel list some
weeks ago, where someone stated that ECC RAM (for PCs) can only *detect* a
parity error and give you an NMI when that occurs.  No one seemed to object
to this.

Yes, I know what ECC stands for, but think about it:  ECC RAM costs about
the same as normal parity RAM.  Why?  It seemed that the conclusion was
that if you wanted error correction you should go for non-PC hardware.

The statement came up in a memory-detection discussion, where someone who had
hooked a logic analyzer up to a motherboard found that NT detects the amount of
memory available on a system by  1)  disabling the RAM parity check,  2)  writing
until it sees an error,  and then *never* enabling the RAM parity check again.
Someone found it amusing that NT never re-enabled the parity check; then
someone else pointed out the above, that ECC didn't help you much anyway except
for ensuring that the kernel would die when a bitflip occurred.  The latter may,
of course, still be preferable to random unnoticed bitflips in data.  But the
essence of the argument was that ECC RAM did *not* correct bit errors if it
was PC ECC RAM.

Does anyone have further information on this?  I don't know anything about
this myself, but the price argument seems reasonable, and I guess you could
count the number of chips on your RAM modules to find out if they really have
enough bits for error correction, or only the extra one needed for parity.
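
To make the chip-counting idea concrete, here is a small sketch of how
many extra bits each scheme would need.  It assumes a Hamming-based
SECDED code (single error correction, double error detection) and one
parity bit per byte; the messages discussed above do not say which code
PC chipsets actually use, so treat both as assumptions.  The function
names are made up for illustration.

```python
# Sketch: extra bits needed per data word for byte parity versus a
# Hamming-based SECDED code.  Both schemes are assumptions made for
# illustration, not something stated in the thread.


def hamming_check_bits(data_bits):
    """Smallest r such that 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """Hamming check bits plus one overall parity bit for double-error detection."""
    return hamming_check_bits(data_bits) + 1


def byte_parity_bits(data_bits):
    """One parity bit per 8 data bits."""
    return data_bits // 8


for width in (8, 32, 64):
    print("%2d data bits: byte parity needs %d, SECDED needs %d"
          % (width, byte_parity_bits(width), secded_check_bits(width)))
```

Under these assumptions, a 64-bit data word needs 8 extra bits in both
cases, so on a 64-bit memory bus the chip count alone may not tell parity
and ECC apart; on narrower words the two counts do differ.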

-- 
................................................................
: jakob at ostenfeld.dtu.dk  : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:



