
[Beowulf] fast file copying


Bill Broadley bill at cse.ucdavis.edu
Fri May 11 02:28:55 PDT 2007


Geoff Galitz wrote:
> - enter deployment phase
>  - check for member nodes that are alive
>  - dynamically build the config file
>  - bring the ring up

The torrent equivalent is building the .torrent file (basically a list
of checksums) and launching the tracker.
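
To make the "list of checksums" concrete: a .torrent file stores one SHA-1 digest per fixed-size piece of the payload. The following is only a minimal sketch of that hashing step, not a full .torrent builder. The function name piece_hashes and the 4MB default (chosen to match the chunk size mentioned later in this message) are illustrative assumptions.

```python
import hashlib

# Assumed piece size: 4MB, matching the chunk size used in the test run.
PIECE_LENGTH = 4 * 1024 * 1024


def piece_hashes(fileobj, piece_length=PIECE_LENGTH):
    """Return the SHA-1 digest of each fixed-size piece read from fileobj.

    fileobj must be opened in binary mode. The last piece may be shorter
    than piece_length.
    """
    digests = []
    while True:
        piece = fileobj.read(piece_length)
        if not piece:
            break
        digests.append(hashlib.sha1(piece).digest())
    return digests
```

A 1GB image with 4MB pieces yields 256 digests of 20 bytes each, so the checksum list itself stays small (about 5KB) compared with the payload.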

>  - start the transfer

Launching the bt client.

> We use pdsh to do as much of the configuration and command execution as 
> possible.  This made dolly a better choice for us rather than nettee as 
> we can issue the exact same command to all nodes in parallel.  Nettee 
> required more specific commands on each node.

Sounds like bt would be most like dolly in that respect.  Nodes could
even join after the transfer starts.

> In our testing environment, we're getting as much as 45MB/sec and as 
> little as 11MB/sec in our various scenarios (mismatched hardware, busy 

How many nodes in your 11-45MB/sec runs?  How much data are you
distributing to each node?

My last test took 36 seconds for a 1GB file (28MB/sec) with 4MB chunks.
Bonnie++ measures between 20 and 40MB/sec, which is somewhat disappointing
actually.  Not sure why it's so slow sometimes; it seems to vary
significantly from node to node even if I mkfs right before the bonnie++.

It's easiest to tell when a node is done from the tracker (since the
client doesn't exit).  Keep in mind that a client will report in when done
and often again to report any additional uploading it does.  So to count
when a client is done, look for the first "left=0" in the log for
a given IP address.
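
A rough sketch of that "first left=0 per IP" rule follows. The log layout is an assumption, not the tracker's documented format: each line is taken to start with the client IP, carry a timestamp in square brackets, and include the announce request with a left=N parameter. Adjust the pattern to whatever your tracker actually writes.

```python
import re

# Hypothetical log layout: "<ip> ... [<timestamp>] ... ?...&left=<bytes>..."
LINE_RE = re.compile(
    r"^(?P<ip>\S+) .*?\[(?P<ts>[^\]]+)\].*?[?&]left=(?P<left>\d+)"
)


def first_completion(lines):
    """Map each client IP to the timestamp of its first left=0 announce.

    Later left=0 reports from the same IP (extra upload reports) are ignored.
    """
    done = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("left") == "0" and m.group("ip") not in done:
            done[m.group("ip")] = m.group("ts")
    return done
```

Calling first_completion(open(logfile)) on the tracker log then gives one completion time per node, and the run is finished once every expected IP appears in the result.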

My logs from the run show the launch:
  11/May/2007:02:00:25

clients finishing at:
11/May/2007:02:01:00
11/May/2007:02:01:01
11/May/2007:02:00:55
11/May/2007:02:00:49
11/May/2007:02:00:51
11/May/2007:02:00:58
11/May/2007:02:00:55
11/May/2007:02:00:50
11/May/2007:02:00:56
11/May/2007:02:00:50
11/May/2007:02:00:56
11/May/2007:02:00:52
11/May/2007:02:00:53
11/May/2007:02:00:56
11/May/2007:02:00:55
11/May/2007:02:00:53
11/May/2007:02:00:54
11/May/2007:02:00:59
11/May/2007:02:00:54
11/May/2007:02:00:55
11/May/2007:02:00:57
11/May/2007:02:00:55
11/May/2007:02:00:55
11/May/2007:02:01:01
11/May/2007:02:00:52
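
For reference, the per-client elapsed times can be worked out from the timestamps above. This is just an illustrative calculation on the numbers in this message. Treating the 1GB file as 1024 MB is an assumption.

```python
from datetime import datetime

LAUNCH = "11/May/2007:02:00:25"
FINISHES = """
11/May/2007:02:01:00 11/May/2007:02:01:01 11/May/2007:02:00:55
11/May/2007:02:00:49 11/May/2007:02:00:51 11/May/2007:02:00:58
11/May/2007:02:00:55 11/May/2007:02:00:50 11/May/2007:02:00:56
11/May/2007:02:00:50 11/May/2007:02:00:56 11/May/2007:02:00:52
11/May/2007:02:00:53 11/May/2007:02:00:56 11/May/2007:02:00:55
11/May/2007:02:00:53 11/May/2007:02:00:54 11/May/2007:02:00:59
11/May/2007:02:00:54 11/May/2007:02:00:55 11/May/2007:02:00:57
11/May/2007:02:00:55 11/May/2007:02:00:55 11/May/2007:02:01:01
11/May/2007:02:00:52
""".split()


def time_of_day(stamp):
    # Parse only HH:MM:SS; every timestamp in this run falls on the same day.
    return datetime.strptime(stamp.split(":", 1)[1], "%H:%M:%S")


start = time_of_day(LAUNCH)
elapsed = sorted(int((time_of_day(s) - start).total_seconds()) for s in FINISHES)

FILE_MB = 1024  # assumption: the 1GB test file counted as 1024 MB
print(f"{len(elapsed)} clients, fastest {elapsed[0]} s, slowest {elapsed[-1]} s")
print(f"throughput at slowest client: {FILE_MB / elapsed[-1]:.1f} MB/s")
```

This reports 25 clients finishing between 24 and 36 seconds after launch, and the slowest client works out to about 28.4 MB/s, in line with the 36 second / 28MB/sec figure quoted earlier.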

> network, different types of data).  We did achieve our primary goal in 
> reducing load on the master/server system.  In our old setup, our load 
> would increase to 25+.  With dolly, our load never exceeds 1.5.

Cool, so you have a solution.

> I plan on also making the same test with torrent.

Great.


