[Beowulf] fast file copying
Felix Rauch Valenti
felix.rauch.valenti at gmail.com
Sun May 6 20:20:38 PDT 2007
On 04/05/07, Bill Broadley <bill at cse.ucdavis.edu> wrote:
> Geoff Galitz wrote:
> > During an HPC talk some years ago, I recall someone mentioned a tool
> > which can copy large datasets across a cluster using a ring topology.
> > Perhaps someone here knows of this tool?
>
> Not sure about a ring topology, seems kinda silly...
Why would that be silly? To clarify: The transmission through the ring
happens in parallel, i.e., while a node n receives the data stream
from node n-1, it writes the stream to disk and at the same time
forwards it to node n+1.
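To illustrate the forwarding step described above, here is a minimal Python sketch of what a single node in the ring does with the stream. It is not the code of any particular tool: the function name `relay`, its parameters, and the chunk size are assumptions made for illustration, and in-memory buffers stand in for the network connections and the disk.

```python
import io

def relay(upstream, local_sink, downstream=None, chunk_size=64 * 1024):
    """Copy a byte stream from upstream to local_sink and, if given, downstream.

    upstream, local_sink and downstream are file-like objects (hypothetical
    stand-ins for the socket from node n-1, the local disk, and the socket
    to node n+1). The last node in the ring passes downstream=None.
    """
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:              # end of stream
            break
        local_sink.write(chunk)    # keep a local copy
        if downstream is not None:
            downstream.write(chunk)  # pass the same chunk on to the next node
        total += len(chunk)
    return total

# Usage: in-memory buffers stand in for sockets and disk.
source = io.BytesIO(b"x" * 200_000)
disk = io.BytesIO()
next_node = io.BytesIO()
copied = relay(source, disk, next_node, chunk_size=4096)
```

Because each chunk is forwarded as soon as it has been read, node n+1 can start receiving long before node n has the complete file, which is what lets all nodes work in parallel. A real implementation would also need error handling and would overlap disk and network I/O.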
I have yet to see a tool that achieves better data rates in practice
for reliable, high-speed, large-scale data distribution in clusters.
> > More to the point, we are pushing around datasets that are about
> > 1Gbyte. The datasets are pushed out to dozens of nodes all at once and
>
> How often? I just bit-torrented a 1GB file to 165 nodes in 3 minutes,
> 1.5 minutes was the lazy way I launched it (the last node didn't
> start until 1.5 minutes into the run). BTW, 140 or so of those nodes
> already had 1 job per CPU running.
1 GB file in 1.5 minutes translates to about 11 MB/s, which sounds a
lot like Fast Ethernet (100 Mbps). By today's standards that's
relatively slow and it's quite likely that the network will be the
bottleneck for almost any tool.
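The arithmetic behind that estimate can be checked with a few lines of Python. The helper names are made up for this example, and 1 GB is taken as 1024 MB, which is an assumption about the units used.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average data rate in MB/s for a transfer of size_mb in the given time."""
    return size_mb / seconds

def link_capacity_mb_per_s(mbit_per_s):
    """Raw link capacity in MB/s (8 bits per byte, protocol overhead ignored)."""
    return mbit_per_s / 8

rate = throughput_mb_per_s(1024, 90)        # 1 GB in 1.5 minutes
fast_ethernet = link_capacity_mb_per_s(100)  # Fast Ethernet, 100 Mbps

print(f"observed: {rate:.1f} MB/s, Fast Ethernet limit: {fast_ethernet:.1f} MB/s")
```

The observed rate comes out just below the raw capacity of a 100 Mbps link, which is consistent with the network being the limiting factor.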
> There are various ways to maximize I/O with bit-torrent. Various
> seeders allow uploading each block only once (usually called super
> seeder mode). Assuming you have a few GB ram on the file server
> you could even prefetch the file before torrenting (i.e. dd if=file_to_server
> of=/dev/null) since the limit on bit-torrent bandwidth is often how
> quickly you can seek.
>
> Additionally you can make the chunk size larger to reduce the number
> of seeks. On the client side preallocation can greatly reduce
> the number of seeks.
More advantages of the ring topology: It uploads every block on every
node exactly once, no prefetching and no seeks are required (if you
replicate a whole partition or a single large file).
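To see why uploading every block exactly once per node matters, here is a rough back-of-the-envelope comparison in Python. It is an idealized sketch, not the model from the paper: it assumes a full-duplex switched network where every node has the same link speed, disks that keep up with the network, and no protocol overhead. The function names and the 1 MB chunk size are assumptions for illustration.

```python
def star_time_s(size_mb, n_receivers, link_mb_per_s):
    """One server sends a separate full copy to each receiver over its single uplink."""
    return n_receivers * size_mb / link_mb_per_s

def ring_time_s(size_mb, n_receivers, link_mb_per_s, chunk_mb=1.0):
    """Pipelined ring: one full stream time, plus one chunk of delay per extra hop."""
    stream_time = size_mb / link_mb_per_s
    pipeline_fill = (n_receivers - 1) * chunk_mb / link_mb_per_s
    return stream_time + pipeline_fill

# Example: 1 GB (1024 MB) to 10 nodes over Fast Ethernet (12.5 MB/s).
star = star_time_s(1024, 10, 12.5)
ring = ring_time_s(1024, 10, 12.5)
```

Under these assumptions the ring finishes in roughly the time needed to send the data once, almost independent of the number of nodes, while the single-server approach grows linearly with the node count.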
If you are interested in more details about the technology, like
models and performance measurements (somewhat old by now), check out
the second paper in this list:
http://www.cs.inf.ethz.ch/cops/patagonia/#relmat
- Felix