[Beowulf] OT: recoverable optical media archive format?

Thanassis Tsiodras ttsiodras at gmail.com
Sun Jun 13 00:19:56 PDT 2010


I am the author of the updated "rsbep" package that was mentioned in
this thread, and I was contacted by David Mathog (just "mathog"
henceforth) about the issues he had.

"mathog" reported a difference in the decoded output sizes, when
directly using "rsbep" - but, as David N. Lombard mentioned, the
usage scenario that "mathog" followed is not a sanctioned one: both
my site's article and the package instructions (README) refer to the
"freeze"/"melt" scripts, which decode the shielded data into the
correct output size.

The reason for what mathog experienced is a bit complex, but clearly
explained on my site
(http://users.softlab.ece.ntua.gr/~ttsiod/rsbep.html), and it boils
down to this: in order to withstand storage errors, an interleaving of
the Reed-Solomon encoded data has to take place.

Basically, the x86 ASM code of Reed-Solomon that I "inherited" from
the original "rsbep" and use in my package adds 32 bytes of parity
data to each block of 223 bytes of input, turning it into a 255-byte
block. These parity bytes allow detection and correction of up to 16
erroneous bytes, or detection (without correction) of up to 32
erroneous bytes, in each encoded 255-byte block. On its own, however,
this won't work for storage media, since they work or fail on sector
boundaries (512 bytes for disks and 2048 bytes for CDs/DVDs): a
single lost sector would wipe out at least one whole 255-byte block.
So my package interleaves the encoded data inside blocks of 1040400
bytes (each containing 4080 of the Reed-Solomon-encoded 255-byte
blocks). This way, the loss of a 512-byte sector impacts only ONE
byte in each of the 512 encoded "blocks" that pass through it (due to
the interleaving). If interested, you can read more details on my
page, where I explain how the idea works.
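
As a rough illustration of the interleaving idea (this is NOT the
actual "distribute" function from rsbep.c; the simple "byte 0 of every
block, then byte 1 of every block, ..." layout and all names below are
assumptions made for the sketch), the following counts how many bytes
each encoded block loses when one 512-byte sector is destroyed:

```python
BLOCK_LEN = 255                      # bytes per Reed-Solomon-encoded block
NUM_BLOCKS = 4080                    # encoded blocks per interleaved frame
FRAME_LEN = BLOCK_LEN * NUM_BLOCKS   # 1040400 bytes
SECTOR = 512                         # disk sector size in bytes


def interleaved_offset(block, byte):
    """Frame offset of byte number `byte` of encoded block number `block`.

    Assumed layout: byte 0 of every block first, then byte 1 of every
    block, and so on (hypothetical; the real package may differ).
    """
    return byte * NUM_BLOCKS + block


def damage_per_block(start, length):
    """Map block number -> bytes lost if frame[start:start+length] is destroyed."""
    damage = {}
    for offset in range(start, min(start + length, FRAME_LEN)):
        block = offset % NUM_BLOCKS      # inverse of interleaved_offset()
        damage[block] = damage.get(block, 0) + 1
    return damage


# Destroy the fourth sector of the frame.
lost = damage_per_block(start=3 * SECTOR, length=SECTOR)
print(len(lost), max(lost.values()))     # prints: 512 1
```

Under this assumed layout, the lost sector touches 512 different
encoded blocks and costs each of them a single byte, which is well
within what the Reed-Solomon code can correct.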

The end result is that:
- the interleaved stream can lose 127 contiguous sectors (65024
contiguous bytes) and still be recoverable.
- the interleaved stream can lose 128-255 contiguous sectors, and the
error will be detected (and reported, but not fixed).
- Beyond that amount of loss (which corresponds, after
de-interleaving, to more than 32 bytes in an encoded 255-byte block),
the Reed-Solomon code can no longer be relied upon. Given the
interleaving that my package performs on the encoded bytes, the only
way for this to happen is to lose a contiguous stream of more than
32x4080 bytes, i.e. 130560 bytes. A storage error that causes this
much loss (255 contiguous sectors!) is a lost cause anyway - at least
as far as my needs go. If you want to be able to recover from this or
even larger amounts of loss, you can do it by increasing the block
size from my chosen 255x4080 (1040400 bytes) to something even bigger,
and by adapting my interleaving code (rsbep.c, "distribute" function).
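
The sector counts above follow from simple arithmetic, which can be
sketched as below. The function name and its parameters are made up
for illustration; the last call assumes a hypothetical doubling of the
number of interleaved blocks, not anything the package ships with.

```python
def burst_tolerance(bytes_per_block, num_blocks=4080, sector=512):
    """Largest contiguous loss that costs any one encoded block at most
    `bytes_per_block` bytes, as (bytes, whole sectors)."""
    burst_bytes = bytes_per_block * num_blocks
    return burst_bytes, burst_bytes // sector


print(burst_tolerance(16))                   # correctable: (65280, 127)
print(burst_tolerance(32))                   # detectable:  (130560, 255)
print(burst_tolerance(16, num_blocks=8160))  # hypothetical deeper interleave: (130560, 255)
```

In other words, the correctable burst length grows linearly with the
number of encoded blocks that are interleaved together.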

To summarize, "mathog"'s pockmark app is not representative of what
happens in storage media: they NEVER fail at the byte level - they
fail at the sector level.

So what should you do, if you want to be 100% sure of failure detection?

Simple:
By reviewing my freeze/melt scripts, you will see that all I do to the
"to-be-shielded-stream" is (a) add a "magic marker" and (b) add the
file size, so that "melt.sh" can chop the output down to the right
size. If you want bullet-proof validity checks, you can easily add the
MD5 or SHA sum of the input data to the "to-be-shielded-stream", so
that the "melting" process can verify it and be 100% certain of either
correct restoration or detected failure, even in the face of extreme
stream corruption (more than 130K lost).
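
A minimal sketch of that idea follows. It does not reproduce the
header that freeze.sh/melt.sh actually write: the marker value, the
field order, the 8-byte size field and the choice of SHA-256 are all
assumptions made for the example.

```python
import hashlib
import struct

MAGIC = b"RSBEPDEMO"           # hypothetical marker, not the one freeze.sh uses
HEADER_LEN = len(MAGIC) + 8 + 32


def frame(payload):
    """Prefix payload with marker, 8-byte big-endian size and SHA-256 digest."""
    digest = hashlib.sha256(payload).digest()
    return MAGIC + struct.pack(">Q", len(payload)) + digest + payload


def unframe(stream):
    """Return the original payload, or raise ValueError if it cannot be trusted."""
    if len(stream) < HEADER_LEN or not stream.startswith(MAGIC):
        raise ValueError("magic marker not found")
    (size,) = struct.unpack(">Q", stream[len(MAGIC):len(MAGIC) + 8])
    digest = stream[len(MAGIC) + 8:HEADER_LEN]
    payload = stream[HEADER_LEN:HEADER_LEN + size]   # chop off any padding
    if len(payload) != size or hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch: data was not restored intact")
    return payload
```

Here frame() would run before the data is shielded and unframe() after
it is decoded, so trailing padding is dropped and any corruption that
got past the Reed-Solomon layer shows up as a checksum mismatch.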

Note, however, that this is not necessary if you use an algorithm that
can detect errors in the decoded stream (which is how I use my rsbep,
i.e. always on a stream generated by gzip, bzip2, etc.).
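
For instance, the gzip format stores a CRC-32 and the length of the
uncompressed data, so a damaged stream is normally rejected when it is
decompressed. The sketch below uses Python's standard gzip module on
made-up sample data; the helper name is hypothetical.

```python
import gzip
import zlib


def decompresses_cleanly(compressed):
    """True if gzip accepts the stream, False if it reports corruption."""
    try:
        gzip.decompress(compressed)
    except (OSError, EOFError, zlib.error):
        return False
    return True


# Made-up sample data standing in for an archive.
original = gzip.compress(b"".join(b"record %d\n" % i for i in range(1000)))

damaged = bytearray(original)
damaged[len(damaged) // 2] ^= 0xFF      # flip one byte in the compressed data

print(decompresses_cleanly(original))
print(decompresses_cleanly(bytes(damaged)))
```

The first call should report a clean stream; the second should fail,
either because the compressed data no longer parses or because the
stored CRC no longer matches.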

Hope this clarifies things.

Kind regards,
Thanassis Tsiodras, Dr.-Ing.

-- 
What I gave, I have; what I spent, I had; what I kept, I lost. -Old Epitaph


