[Beowulf] copying data between clusters

Hearns, John john.hearns at mclaren.com
Fri Mar 5 10:05:38 PST 2010

> I'd like to parallelize that across multiple nodes to drive the
> aggregate up
> I was hoping someone would pop up and say, hey use this magical piece
> of software (which I'm unable to locate).
My recommendation also would be to use an external storage device - a
USB drive would be useful, and I have been involved in a couple of
industrial projects where data has been brought to a cluster on an
external USB drive. It is, as people say, quite an efficient way to
transfer the data.
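
As a rough back-of-envelope illustration of why a shipped drive can compete with a network link, the sketch below compares the effective throughput of a courier-delivered drive with a sustained WAN transfer. All figures (a 2 TB drive, 24-hour delivery, a 100 Mbit/s link) are hypothetical and not taken from this thread, and the estimate ignores the time spent copying data onto and off the drive.

```python
def effective_mbit_per_s(size_bytes: float, seconds: float) -> float:
    """Average throughput in Mbit/s for moving size_bytes in the given time."""
    return size_bytes * 8 / seconds / 1e6


def transfer_hours(size_bytes: float, link_mbit_per_s: float) -> float:
    """Hours needed to push size_bytes over a link at a sustained rate."""
    return size_bytes * 8 / (link_mbit_per_s * 1e6) / 3600


# Hypothetical numbers, chosen only for illustration.
DRIVE_BYTES = 2e12            # 2 TB external drive
COURIER_SECONDS = 24 * 3600   # overnight delivery
WAN_MBIT_PER_S = 100.0        # sustained WAN rate

shipped = effective_mbit_per_s(DRIVE_BYTES, COURIER_SECONDS)
wan_hours = transfer_hours(DRIVE_BYTES, WAN_MBIT_PER_S)
print(f"shipped drive: ~{shipped:.0f} Mbit/s effective")
print(f"WAN link: ~{wan_hours:.1f} hours for the same data")
```

With these assumed numbers the shipped drive comes out ahead, but the balance shifts quickly with a faster link or a smaller data set.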

I gather that for high def digital cinema a RAID array is physically
shipped to the cinema - I guess that also helps with data security, as
you could do some sort of encryption on the drives, though I might be
wrong.

In the digital media world, there are some fast parallel SCP boxes which
are an industry standard - I gather they cost $$$$ but do make transfers
I forget the name, and if they don't really do parallel SCP, forgive me -
it's something along those lines.

Re. moving data to/from a cluster over a WAN link, I did look at this
You can set up a FUSE filesystem running over SSH (sshfs, for example).
This actually works quite well from the point of view of ease of setup
and usability, but I didn't try any serious data transfer over it - and
of course it cannot be faster than SSH itself!
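
A minimal sketch of what such a setup might look like, assuming the sshfs client and fusermount are installed on a Linux machine. The host name and paths are hypothetical placeholders, and the helper functions are illustrative rather than part of any library. The script only prints the commands unless dry_run is turned off.

```python
import subprocess
from typing import List


def sshfs_mount_cmd(remote: str, mountpoint: str) -> List[str]:
    """Build the argv for mounting remote (user@host:/path) at mountpoint."""
    # "-o reconnect" asks sshfs to re-establish a dropped SSH connection.
    return ["sshfs", remote, mountpoint, "-o", "reconnect"]


def sshfs_unmount_cmd(mountpoint: str) -> List[str]:
    """Build the argv for unmounting a FUSE mount on Linux."""
    return ["fusermount", "-u", mountpoint]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical remote and mount point; replace with real ones.
    run(sshfs_mount_cmd("user@cluster.example.org:/scratch/data", "/mnt/cluster"))
    run(sshfs_unmount_cmd("/mnt/cluster"))
```

Once mounted, the remote directory behaves like a local one, so ordinary tools such as cp can be used against it.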

I did also have a look at the types of tools used by grids for bulk data
transfer, but not much more than looking.
Here's an interesting link I found:  http://fasterdata.es.net/tools.html

P.S. You don't say how you are transferring the data - if via rsync, have
you looked at the encryption options you are using?
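
To make the rsync point concrete, here is a hedged sketch that builds rsync command lines with an explicit SSH cipher and deals the file list out over several streams, which is one simple way to run transfers in parallel from multiple nodes. The cipher aes128-ctr is only an assumed example (ssh -Q cipher lists what a given OpenSSH build supports, and whether a different cipher helps depends on the CPUs involved). The file names, host, and helper functions are hypothetical.

```python
import shlex
from typing import List


def split_round_robin(paths: List[str], n_streams: int) -> List[List[str]]:
    """Deal paths into n_streams groups, one group per transfer stream."""
    if n_streams < 1:
        raise ValueError("n_streams must be at least 1")
    return [paths[i::n_streams] for i in range(n_streams)]


def rsync_cmd(paths: List[str], dest: str, cipher: str = "aes128-ctr") -> List[str]:
    """Build an rsync argv that runs over ssh with an explicit cipher."""
    # rsync's -e option sets the remote shell; ssh's -c option picks the cipher.
    return ["rsync", "-a", "-e", f"ssh -c {cipher}", *paths, dest]


# Hypothetical file list and destination, for illustration only.
files = [f"/data/run{i:02d}.dat" for i in range(6)]
for group in split_round_robin(files, 3):
    if group:  # skip empty groups when there are fewer files than streams
        print(shlex.join(rsync_cmd(group, "user@remote.example.org:/data/")))
```

Each printed command could be launched on a different node. Round-robin splitting balances the number of files, not their sizes, so a real setup might want to sort or weight by file size first.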

John Hearns
