[Beowulf] fast file copying

Bill Broadley bill at cse.ucdavis.edu
Fri May 11 02:28:55 PDT 2007


Geoff Galitz wrote:
> - enter deployment phase
>  - check for member nodes that are alive
>  - dynamically build the config file
>  - bring the ring up

The torrent equivalent is building the .torrent file (basically a list
of checksums) and launching the tracker.
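
To make the "list of checksums" concrete: a .torrent stores one SHA-1 digest per fixed-size piece of the payload. Below is a minimal sketch of just that hashing step, not a full .torrent writer. The function name piece_hashes and the 4MB default are illustrative assumptions.

```python
import hashlib

def piece_hashes(data, piece_size=4 * 1024 * 1024):
    """Return one SHA-1 digest per fixed-size piece of data.

    The last piece may be shorter than piece_size.
    """
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]

# Tiny demonstration with an artificially small piece size.
digests = piece_hashes(b"abcdefghij", piece_size=4)
print(len(digests))  # three pieces: "abcd", "efgh", "ij"
```

A real tool would stream the file from disk rather than hold it in memory, and would also write the tracker URL and file metadata into the .torrent.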

>  - start the transfer

Launching the bt client.

> We use pdsh to do as much of the configuration and command execution as 
> possible.  This made dolly a better choice for us rather than nettee as 
> we can issue the exact same command to all nodes in parallel.  Nettee 
> required more specific commands on each node.

Sounds like bt would be most like dolly in that respect.  Nodes could
even join after the transfer starts.

> In our testing environment, we're getting as much as 45MB/sec and as 
> little as 11MB/sec in our various scenarios (mismatched hardware, busy 

How many nodes in your 11-45MB/sec runs?  How much data are you
distributing to each node?

My last test took 36 seconds for a 1GB file (28MB/sec) with 4MB chunks.
Bonnie++ measures between 20 and 40MB/sec, which is somewhat disappointing
actually.  Not sure why it's so slow sometimes; it seems to vary
significantly from node to node even if I mkfs right before the bonnie++.

It's easiest to tell when a node is done from the tracker (since the
client doesn't exit).  Keep in mind that a client will report in when done
and often again to report any additional uploading it does.  So to
determine when a client finished, look for the first "left=0" in the log
for a given IP address.
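
A rough sketch of that "first left=0 per IP" scan follows. The log line layout is an assumption (client IP as the first field, an Apache-style timestamp, and the announce query string on the same line); adjust the patterns to whatever your tracker actually writes. The names first_completions, LINE_RE and LEFT0_RE are made up for illustration.

```python
import re

# Assumed layout: <ip> ... [DD/Mon/YYYY:HH:MM:SS] "GET /announce?...&left=0&..."
LINE_RE = re.compile(r'^(\S+).*?\[?(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})')
# Match left=0 exactly, not left=01234.
LEFT0_RE = re.compile(r'[?&]left=0(?:&|\s|"|$)')

def first_completions(lines):
    """Map each client IP to the timestamp of its first left=0 report.

    Assumes the log is in chronological order, so later left=0
    reports from the same IP (extra upload stats) are ignored.
    """
    done = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match and LEFT0_RE.search(line):
            ip, stamp = match.group(1), match.group(2)
            done.setdefault(ip, stamp)
    return done
```

Feed it the open log file (for example first_completions(open(path)), where path is your tracker log) and compare the number of keys against the number of nodes you launched.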

My logs from the run show the launch:
  11/May/2007:02:00:25

clients finishing at:
11/May/2007:02:01:00
11/May/2007:02:01:01
11/May/2007:02:00:55
11/May/2007:02:00:49
11/May/2007:02:00:51
11/May/2007:02:00:58
11/May/2007:02:00:55
11/May/2007:02:00:50
11/May/2007:02:00:56
11/May/2007:02:00:50
11/May/2007:02:00:56
11/May/2007:02:00:52
11/May/2007:02:00:53
11/May/2007:02:00:56
11/May/2007:02:00:55
11/May/2007:02:00:53
11/May/2007:02:00:54
11/May/2007:02:00:59
11/May/2007:02:00:54
11/May/2007:02:00:55
11/May/2007:02:00:57
11/May/2007:02:00:55
11/May/2007:02:00:55
11/May/2007:02:01:01
11/May/2007:02:00:52
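
For reference, the arithmetic behind the 36 second / 28MB/sec figure can be checked from those timestamps. This is a small sketch that takes 1GB as 1024MB and uses only the launch time plus the earliest and latest finish times from the list above; elapsed_seconds is an illustrative helper name.

```python
from datetime import datetime

FMT = "%d/%b/%Y:%H:%M:%S"  # timestamp format used in the tracker log

def elapsed_seconds(launch, finishes):
    """Seconds from launch to each finish timestamp."""
    start = datetime.strptime(launch, FMT)
    return [
        int((datetime.strptime(stamp, FMT) - start).total_seconds())
        for stamp in finishes
    ]

launch = "11/May/2007:02:00:25"
# Earliest and latest completions from the list above.
finishes = ["11/May/2007:02:00:49", "11/May/2007:02:01:01"]

for secs in elapsed_seconds(launch, finishes):
    print(f"{secs} s -> {1024 / secs:.1f} MB/sec per node")
```

The slowest client sets the overall figure: 36 seconds for the whole run.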

> network, different types of data).  We did achieve our primary goal in 
> reducing load on the master/server system.  In our old setup, our load 
> would increase to 25+.  With dolly, our load never exceeds 1.5.

Cool, so you have a solution.

> I plan on also making the same test with torrent.

Great.


