<div dir="ltr">Are you rsyncing over ssh? If so, get HPN-SSH and use the NONE cipher. MUCH faster again :)</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 18, 2019 at 11:00 PM Bill Wichser &lt;<a href="mailto:bill@princeton.edu">bill@princeton.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Well thanks for THAT pointer! Using --checksum-choice=none results in <br>
a speedup of somewhere between 2 and 3 times. That's my validation of the <br>
checksum theory things have been pointing towards. Now to get xxhash <br>
into rsync and I think we are all set.<br>
<br>
Thanks,<br>
Bill<br>
<br>
On 6/18/19 9:57 AM, Ellis H. Wilson III wrote:<br>
> On 6/18/19 9:16 AM, Bill Wichser wrote:<br>
>> Stock RH 7 version, rsync-3.1.2-6.el7_6.1.x86_64. We've tried a <br>
>> number of recompiles. gcc, Intel. The only thing between identical <br>
>> compiles was the md4 vs md5.<br>
>><br>
>> /bin/rsync -lptgoDAH -v --numeric-ids -d --relative --delete <br>
>> --delete-after --files-from=...<br>
>><br>
>> I'm not asking for help. Just if anyone had attempted to change the <br>
>> algorithm into something much faster.<br>
>><br>
>> I refer you to this project <a href="https://cyan4973.github.io/xxHash/" rel="noreferrer" target="_blank">https://cyan4973.github.io/xxHash/</a> where <br>
>> there is a table of speeds. Regardless of what anyone might <br>
>> speculate, we are pursuing this route of changing out the algorithm. <br>
>> Maybe it's all for naught. Maybe it isn't. But in a few weeks <br>
>> hopefully we'll have determined.<br>
> <br>
> Very interesting. From the rsync man page:<br>
> <br>
> "Note that rsync always verifies that each transferred file was <br>
> correctly reconstructed on the receiving side by checking a <br>
> whole-file checksum that is generated as the file is transferred, but <br>
> that automatic after-the-transfer verification has nothing to do with <br>
> this option’s before-the-transfer "Does this file need to be updated?" <br>
> check."<br>
> <br>
> So it sounds like you have sufficient churn in large files that the <br>
> checksum validation post-transfer is your bottleneck. Short of hacking <br>
> rsync to use a faster algorithm, your remaining choice is to use the <br>
> --checksum-choice=STR and set it to none, and then perform your own <br>
> hashing out-of-band to check the transferred data using the list you <br>
> have provided via --files-from. This will nerf rsync's ability to do <br>
> delta-transfer, which may be ok depending on the nature of your churning <br>
> files. If your pipes are huge (atypical for DR), your CPU is weak, and <br>
> your churning data is mostly completely new or completely changed files, <br>
> --checksum-choice=none may work very well for you.<br>
> <br>
> Best,<br>
> <br>
> ellis<br>
> <br>
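<br>
The out-of-band hashing Ellis describes above could be sketched roughly as follows. This is only an illustration, not anything from rsync itself: the function names are hypothetical, and BLAKE2b from Python's standard hashlib stands in for whatever fast hash (xxhash, say) would really be used. It assumes the list is the same one-relative-path-per-line file handed to --files-from.<br>
<pre>
```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Return the hex BLAKE2b digest of one file, read in fixed-size chunks."""
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_file_list(list_path):
    """Read a --files-from style list: one relative path per line, blanks skipped."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def build_manifest(root, relative_paths):
    """Map each listed path to its digest; None marks anything that is not a regular file."""
    manifest = {}
    for rel in relative_paths:
        # Strip leading slashes so os.path.join keeps the root prefix.
        full = os.path.join(root, rel.lstrip("/"))
        manifest[rel] = hash_file(full) if os.path.isfile(full) else None
    return manifest


def mismatches(source_manifest, dest_manifest):
    """Return the sorted paths whose digests differ between source and destination."""
    return sorted(
        rel for rel, digest in source_manifest.items()
        if dest_manifest.get(rel) != digest
    )
```
</pre>
In practice each side would presumably run build_manifest locally and ship only the small manifest across, so the comparison costs almost no bandwidth; anything reported by mismatches would then be re-sent.<br>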
_______________________________________________<br>
Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org" target="_blank">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>
To change your subscription (digest mode or unsubscribe) visit <a href="https://beowulf.org/cgi-bin/mailman/listinfo/beowulf" rel="noreferrer" target="_blank">https://beowulf.org/cgi-bin/mailman/listinfo/beowulf</a><br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">Dr Stuart Midgley<br><a href="mailto:sdm900@gmail.com" target="_blank">sdm900@gmail.com</a></div></div>