[Beowulf] fast file copying

Bill Broadley bill at cse.ucdavis.edu
Thu May 10 14:51:20 PDT 2007


Felix Rauch Valenti wrote:
> On 04/05/07, Bill Broadley <bill at cse.ucdavis.edu> wrote:
>> Geoff Galitz wrote:
>> > During an HPC talk some years ago, I recall someone mentioned a tool
>> > which can copy large datasets across a cluster using a ring topology.
>> > Perhaps someone here knows of this tool?
>>
>> Not sure about a ring topology, seems kinda silly...
> 
> Why would that be silly?

The usual ring-based disadvantages: reliability and performance.

In the case where the head node and the client nodes have the same
network speed, all clients are present, all clients are idle, and
all clients survive until the end of the transfer, you can get great
performance.  Something like 90% or so of line speed certainly seems
possible.
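
To make the pipeline idea concrete, here's a rough Python sketch of
what each node in the ring would do (the port handling, chunk size,
and function names are my assumptions, not nettee's actual code):
read chunks from the upstream neighbor, write them to local disk,
and forward them downstream.

import socket

CHUNK = 1 << 20  # 1 MB relay buffer; size is an assumption

def relay(listen_port, downstream_host, downstream_port, out_path):
    # Accept the connection from our upstream neighbor in the ring.
    srv = socket.socket()
    srv.bind(("", listen_port))
    srv.listen(1)
    upstream, _ = srv.accept()
    # Connect to our downstream neighbor before data starts flowing.
    down = socket.create_connection((downstream_host, downstream_port))
    with open(out_path, "wb") as out:
        while True:
            buf = upstream.recv(CHUNK)
            if not buf:          # upstream closed: transfer complete
                break
            out.write(buf)       # local disk write stays sequential
            down.sendall(buf)    # pass the same bytes along the ring
    down.close()
    upstream.close()

The last node in the ring would just skip the forwarding step.  The
catch is that one slow node anywhere stalls everything behind it,
which is exactly the worst case I'm worried about below.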

Any number of things could make the ring-based approach a poor
choice, since the worst case of the ring can dramatically slow down
the whole transfer.  Things like:
* The head node's network connection is 10 times faster than the clients'
* A single node dies during the transfer
* A single node joins late
* A single node is very busy (I/O-, memory-, or CPU-constrained)

A BitTorrent-like approach would handle all of the above relatively
gracefully.  The nettee approach does have the advantage that all
disk accesses are sequential, but with a large chunk size of, say,
64 MB (when transferring a file of a few GB), seeks shouldn't be a
major issue.
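
Back-of-the-envelope, with illustrative numbers I'm assuming here
(not measurements):

chunk_mb = 64.0       # chunk size from above
disk_mb_s = 100.0     # assumed aggregate disk throughput
seek_ms = 10.0        # assumed average seek + rotational latency

seeks_per_sec = disk_mb_s / chunk_mb           # ~1.6 seeks/sec
time_seeking = seeks_per_sec * seek_ms / 1000  # ~1.6% of each second
print(seeks_per_sec, time_seeking)

So even at 100 MB/sec of disk traffic, 64 MB chunks cost under 2% in
seek overhead.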

I've seen 15 MB/sec per client with the default chunk size (fairly
small); when I wrote the file to a better disk system I managed
30 MB/sec.  I've yet to try larger chunk sizes on normal compute-node
disk systems.

I'll do some more testing.

> More advantages of the ring topology: It uploads every block on every

Sounds like BitTorrent.

> node exactly once, no prefetching and no seeks are required (if you
> replicate a whole partition or a single large file).

BitTorrent does seek more, but it seems trivial to reduce the seeks
so that they're not a performance impact... say, one per second.
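
One way to get there, as a rough sketch (the buffering policy and
class name are my assumptions, not anything a real client does):
hold completed chunks in memory and flush them in offset order about
once a second, so the disk sees one short burst of seeks per second
instead of one seek per chunk.

import time

class CoalescingWriter:
    def __init__(self, f, interval=1.0):
        self.f = f                 # file opened for binary writing
        self.interval = interval   # target: ~1 flush per second
        self.pending = {}          # offset -> chunk bytes
        self.last_flush = time.time()

    def add(self, offset, data):
        # Buffer the chunk instead of seeking immediately.
        self.pending[offset] = data
        if time.time() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        # Writing in ascending offset order keeps head movement short.
        for offset in sorted(self.pending):
            self.f.seek(offset)
            self.f.write(self.pending[offset])
        self.pending.clear()
        self.last_flush = time.time()

The memory cost is bounded by roughly one second's worth of incoming
chunks.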

> If you are interested in more details about the technology, like
> models and performance measurements (somewhat old by now), check out
> the second paper in this list:
> 
> http://www.cs.inf.ethz.ch/cops/patagonia/#relmat

Interesting paper; I'll try a run with GigE so I can compare fairly.


