[Beowulf] OT: recoverable optical media archive format?

Thu Jun 10 13:11:36 PDT 2010

On Jun 10, 2010, at 3:20 PM, David Mathog wrote:

> Jesse Becker and others suggested:
> 
>>    http://users.softlab.ntua.gr/~ttsiod/rsbep.html
> 
> I tried it and it works, mostly, but definitely has some warts.
> 
> To start with I gave it a negative control - a file so badly corrupted
> it should NOT have been able to recover it.
> 
> % ssh remotePC 'dd if=/dev/sda1 bs=8192' >img.orig
> % cat img.orig      | bzip2 >img.bz2.orig
> % cat img.bz2.orig  | rsbep > img.bz2.rsbep
> % cat img.bz2.rsbep | pockmark -maxgap 100000 -maxrun 10000 >img.bz2.rsbep.pox
> % cat img.bz2.rsbep.pox | rsbep -d -v >img.bz2.restored
> rsbep: number of corrected failures   : 9725096
> rsbep: number of uncorrectable blocks : 0
> 
> img.orig is a Windows XP partition with all empty space filled with
> 0x0 bytes.  That is then compressed with bzip2, then run
> through rsbep (the one from the link above), then corrupted
> with pockmark.  Pockmark is my own little concoction; when used as
> shown it stamps 0x0 bytes starting after a gap of 1 to MAXGAP bytes, for
> a run of 1 to MAXRUN bytes.  The gap and run lengths are chosen at
> random from those ranges for each new gap/run.
> This should corrupt around 10% of the file, which I assumed would render
> it unrecoverable.  Notice in the file sizes below that the overall size
> did not change when the file was run through pockmark.  rsbep did not
> note any errors it couldn't correct. However, the
> size of the restored file is not the same as the orig.
> 
> 4056976560 2010-06-08 17:51 img.bz2.restored
> 4639143600 2010-06-08 16:19 img.bz2.rsbep.pox
> 4639143600 2010-06-08 16:13 img.bz2.rsbep
> 4056879025 2010-06-08 14:40 img.bz2.orig
> 20974431744 2010-06-07 15:23 img.orig
> 
> % bunzip2 -tvv img.bz2.restored
>  img.bz2.restored: 
>    [1: huff+mtf data integrity (CRC) error in data
> 
> So at the very least rsbep sometimes says it has recovered a file when
> it has not.  I didn't really expect it to rescue this particular input,
> but it really should have handled it better.

I have never used this tool, but I wonder whether your pockmark tool damaged the rsbep metadata, specifically one or more of the metadata segment lengths. Corruption of the metadata is not beyond the realm of possibility, and I assume that the rsbep metadata is not replicated or otherwise protected.
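
For anyone wanting to repeat the corruption step quoted above without the original tool, here is a minimal Python sketch of a pockmark-like function. It is a guess at the behaviour from David's description (zero-filled runs of 1 to MAXRUN bytes separated by gaps of 1 to MAXGAP bytes), not his code; the names pockmark and expected_damage and the seed parameter are my own. It works on an in-memory buffer, so it suits small test files rather than a 4 GB archive.

```python
import random


def pockmark(data: bytes, maxgap: int, maxrun: int, seed: int = 0) -> bytes:
    """Zero out random runs of bytes; the overall length is unchanged."""
    rng = random.Random(seed)              # seeded so a test run is repeatable
    out = bytearray(data)
    pos = 0
    while True:
        pos += rng.randint(1, maxgap)      # leave an intact gap of 1..maxgap bytes
        if pos >= len(out):
            break
        run = rng.randint(1, maxrun)       # damage a run of 1..maxrun bytes
        end = min(pos + run, len(out))
        out[pos:end] = bytes(end - pos)    # stamp 0x00 bytes over the run
        pos = end
    return bytes(out)


def expected_damage(maxgap: int, maxrun: int) -> float:
    """Long-run fraction overwritten: mean run / (mean gap + mean run)."""
    mean_gap = (1 + maxgap) / 2
    mean_run = (1 + maxrun) / 2
    return mean_run / (mean_gap + mean_run)


if __name__ == "__main__":
    print(f"{expected_damage(100_000, 10_000):.3f}")
    print(f"{expected_damage(1_000_000, 10_000):.4f}")
```

Under these assumptions the expected damage is roughly 9% for -maxgap 100000 and roughly 1% for -maxgap 1000000, which is in line with the "around 10%" and "almost exactly 1%" figures in the quoted message.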

> I reran it with a less damaged file like this:
> 
> % cat img.bz2.rsbep | pockmark -maxgap 1000000 -maxrun 10000 >img.bz2.rsbep.pox2
> % cat img.bz2.rsbep.pox2 | rsbep -d -v >img.bz2.restored2
> rsbep: number of corrected failures   : 46025036
> rsbep: number of uncorrectable blocks : 0
> % bunzip2 img.bz2.restored2
> bunzip2: Can't guess original name for img.bz2.restored2 -- using img.bz2.restored2.out
> bunzip2: img.bz2.restored2: trailing garbage after EOF ignored
> % md5sum img.bz2.restored2.out img.orig
> 7fbaec7143c3a17a31295a803641aa3c  img.bz2.restored2.out
> 7fbaec7143c3a17a31295a803641aa3c  img.orig
> 
> This time it was able to recover the corrupted file, but again, it
> created an output file which was a different size.  Is this always the
> case?   Seems to be at least for the size file used here:
> 
> % cat img.bz2.orig | rsbep | rsbep -d > nopox.bz2
> 
> nopox.bz2 is also 4056976560.   The decoded output is always 97535 bytes
> larger than the original, which may bear some relation to the
> z=ERR_BURST_LEN parameter as:
> 
> 97535 /765 = 127.496732
> 
> which is suspiciously close to 255/2.  Or that could just be a coincidence.
> 
> In any case, bunzip2 was able to handle the crud on the end, but this
> would have been a problem for other binary files.

This is most likely a requirement of the underlying Reed-Solomon library, which needs equal-length blocks. If your original file is N bytes and N % M != 0, where M is the block size, I imagine it pads the last block with 0s so that it is M bytes long. It should not affect bunzip2, since the compressed stream marks its own end and bunzip2 ignores anything tacked on after it.
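
The padding idea can be checked against the file sizes in the quoted listing. The sketch below assumes a Reed-Solomon code with 223 data bytes per 255-byte codeword; that is a common parameter choice and my assumption, not something confirmed in this thread. The helper name padded_length is also mine.

```python
def padded_length(n: int, m: int) -> int:
    """Length after zero-padding n bytes up to a whole number of m-byte blocks."""
    return n + (-n) % m


ORIG = 4056879025       # img.bz2.orig
RESTORED = 4056976560   # img.bz2.restored and nopox.bz2
ENCODED = 4639143600    # img.bz2.rsbep

print(RESTORED - ORIG)                    # 97535 extra bytes after decoding
print(RESTORED % 223, ENCODED % 255)      # 0 0
print(RESTORED // 223 == ENCODED // 255)  # True: same number of codewords
print(padded_length(ORIG, 223) - ORIG)    # 84
```

So the restored size is an exact multiple of 223 and the encoded size is the same multiple of 255, which fits zero-padding to whole blocks. Padding to a single 223-byte block would add only 84 bytes, though, so the 97535 extra bytes suggest the tool pads to some larger frame; I have not worked out what that frame is.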

At a quick glance, his website claims that the length should be the same. However, he shows only the md5sums and not the ls -l output.
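
For binary formats that do not tolerate trailing bytes, one workaround is to record the original length before encoding and truncate after decoding. The sketch below is a hypothetical wrapper around the encode/decode pipeline, not a feature of rsbep; the 8-byte header layout and the names wrap and unwrap are my own invention.

```python
import struct

# Hypothetical framing: an 8-byte big-endian length in front of the payload.
HEADER = struct.Struct(">Q")


def wrap(payload: bytes) -> bytes:
    """Prepend the payload length; feed the result to the encoder."""
    return HEADER.pack(len(payload)) + payload


def unwrap(decoded: bytes) -> bytes:
    """Read the recorded length and drop any padding after the payload."""
    (n,) = HEADER.unpack_from(decoded, 0)
    body = decoded[HEADER.size:HEADER.size + n]
    if len(body) != n:
        raise ValueError("decoded stream is shorter than the recorded length")
    return body
```

Because the header passes through the encoder along with the data, it gets the same error correction as the payload rather than sitting outside it unprotected.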

Scott

> The other thing that is frankly bizarre is the number of "corrected"
> failures for the 2nd case vs. the first.    The 2nd should have 10X
> fewer bad bytes than the first, but the rsbep status messages
> indicate 4.73X MORE.  However, the number of bad bytes in the 2nd is
> almost exactly 1%, as it should be.  All of this suggests that rsbep
> does not handle correctly files which are "too" corrupted.  It gives the
> wrong number of corrected blocks and thinks that it has corrected
> everything when it has not done so.  Worse, even when it does work the
> output file was never (in any of the test cases) the same size as the
> input file.
> 
> I think this program has potential but it needs a bit of work to sand
> the rough edges off.  I will have a look at it, but won't have a chance
> to do so for a couple of weeks.
> 
> Regards,
> 
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech