[Beowulf] dedupe filesystem

Rahul Nabar rpnabar at gmail.com
Thu Jun 25 11:09:19 PDT 2009

On Tue, Jun 2, 2009 at 12:39 PM, Ashley Pittman <ashley at pittman.co.uk> wrote:

> Fdupes scans the filesystem looking for files whose sizes match. If they
> do, it md5's them and checks for matches, and if those match too it
> finally does a byte-by-byte compare to be 100% sure.
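
For reference, the three stages described in the quote could be sketched roughly as follows. This is only an illustration under stated assumptions, not fdupes' actual code: the function names `md5_of` and `find_duplicates` are made up here, and symlinks are simply skipped.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups (lists of paths) with identical contents.

    Hypothetical helper illustrating size -> MD5 -> byte-by-byte filtering.
    """
    # Stage 1: only files of the same size can possibly be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        # Stage 2: among same-size files, group by MD5 digest.
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[md5_of(path)].append(path)
        for candidates in by_hash.values():
            # Stage 3: confirm each digest match with a full content compare.
            confirmed = []
            for path in candidates:
                for group in confirmed:
                    if filecmp.cmp(group[0], path, shallow=False):
                        group.append(path)
                        break
                else:
                    confirmed.append([path])
            groups.extend(g for g in confirmed if len(g) > 1)
    return groups
```

Calling `find_duplicates("/some/dir")` returns groups of paths; the cheap size check prunes most files before any hashing happens, and the final compare only ever runs on files whose digests already agree.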

Why is a full byte-by-byte comparison needed even after the MD5 sums match? I
know there is a vulnerability in MD5, but that's more of a security thing, and
a collision by random chance is super unlikely, right?

Or why not use another checksum that is not yet known to be vulnerable, such
as SHA-1 or SHA-2? Or are they way too expensive to compute?
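
One way to get a feel for the relative cost is to time the hash functions on your own machine. Below is a rough sketch using Python's `hashlib`. The helper name `time_hashes` and the 64 MiB buffer size are arbitrary choices made for illustration, and the numbers will vary with CPU and library build. For large files, disk I/O may dominate the hashing cost anyway.

```python
import hashlib
import time


def time_hashes(data, algorithms=("md5", "sha1", "sha256")):
    """Return {algorithm: seconds} for hashing `data` once per algorithm.

    Hypothetical helper; a single run is only a rough indication.
    """
    timings = {}
    for name in algorithms:
        start = time.perf_counter()
        hashlib.new(name, data).hexdigest()
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    payload = b"\0" * (64 * 1024 * 1024)  # 64 MiB of zeros, an arbitrary test size
    for name, seconds in time_hashes(payload).items():
        print(f"{name}: {seconds:.3f} s")
```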

Just curious....
