[Beowulf] Rsync - checksums

Bill Wichser bill at princeton.edu
Tue Jun 18 06:16:39 PDT 2019


Stock RH 7 version, rsync-3.1.2-6.el7_6.1.x86_64.  We've tried a number 
of recompiles, with both gcc and the Intel compiler.  The only difference 
between otherwise identical builds was md4 vs. md5.

/bin/rsync -lptgoDAH -v --numeric-ids -d --relative --delete 
--delete-after --files-from=...

I'm not asking for help, just whether anyone has attempted to swap the 
checksum algorithm for something much faster.

I refer you to the xxHash project, https://cyan4973.github.io/xxHash/, 
which has a table of speeds.  Regardless of what anyone might speculate, 
we are pursuing this route of swapping out the algorithm.  Maybe it's 
all for naught.  Maybe it isn't.  But in a few weeks we'll hopefully 
know.
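
For anyone who wants a rough feel for raw digest cost on their own 
hardware, here is a minimal sketch using only Python's standard hashlib.  
The function name throughput_mb_s and the buffer sizes are made up for 
illustration.  It times the hash alone on an in-memory buffer, so it says 
nothing about rsync's rolling checksum, disk, or network, and md4 may be 
unavailable if the local OpenSSL build disables it.

```python
import hashlib
import time


def throughput_mb_s(name, size_mb=16, chunk_kb=64):
    """Return hashing throughput in MB/s, or None if the algorithm is unavailable.

    Hypothetical helper for illustration; not part of rsync or xxHash.
    """
    try:
        h = hashlib.new(name)
    except ValueError:
        # hashlib raises ValueError for algorithms the local build does not
        # provide (md4 is commonly disabled in newer OpenSSL builds).
        return None

    chunk = b"\0" * (chunk_kb * 1024)
    n_chunks = (size_mb * 1024) // chunk_kb

    start = time.perf_counter()
    for _ in range(n_chunks):
        h.update(chunk)
    h.digest()
    elapsed = time.perf_counter() - start

    return size_mb / elapsed


if __name__ == "__main__":
    for name in ("md4", "md5"):
        rate = throughput_mb_s(name)
        print(name, "unavailable" if rate is None else f"{rate:.0f} MB/s")
```

Numbers from a script like this only bound the best case for a checksum 
swap; the end-to-end gain depends on how much of the transfer time is 
actually spent hashing.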

Thanks all,
Bill

On 6/18/19 9:02 AM, Ellis H. Wilson III wrote:
> On 6/18/19 6:59 AM, Bill Wichser wrote:
>> Just for clarity here, we are NOT using the -c option.  The checksums 
>> happen whenever there is a transfer between the rsync source and the 
>> rsyncd on the other end.
> ...snip...
>> This is not some trivial rsync running at the top level.  There is 
>> code we wrote as well as integration with Jenkins.  When we recompiled 
>> rsync using MD4 instead of the MD5 we see a 20% increase in 
>> performance across the board.  This is what sparked my question.
> 
> We need more details to be of much use:
> 
> 1. Specific rsync version and command line used.
> 
> 2. Compilation options both normally and with your md4 changes.
> 

