[Beowulf] copying big files
Abhishek Kulkarni abbyzcool at gmail.com
Sun Aug 10 12:03:18 PDT 2008
Yes, you might want to take a look at XGET, which is part of the XCPU
clustering framework (http://www.xcpu.org). It was primarily designed to
transfer boot images (kernel and initrd) across the nodes in a cluster in a
very scalable manner, but it can be used to transfer any big files or
directories across the network. It creates an ad-hoc tree at runtime wherein
a client can also "act as a server" for the other clients. Boot image
distribution for over 1024 nodes has been done in less than 10 seconds.

More recently, Perceus (http://www.perceus.org) has been using XGET as the
default mechanism for scalable VNFS transfer across nodes, and it comes
bundled with XCPU modules that make configuring it a lot easier.

-- Abhishek
(who wonders how it can take more than 3 days for such posts to get through
on the Beowulf ML)

On Fri, Aug 8, 2008 at 9:37 AM, Henning Fehrmann
<henning.fehrmann at aei.mpg.de> wrote:

> Hi everybody,
>
> Copying a big file onto all nodes in a cluster is a rather common problem.
> I would have thought that there might be a standard tool for
> distributing the files in an efficient way. So far, I haven't found one.
>
> Assume one has a network design which allows non-blocking full-duplex
> wire-speed connections between N/2 pairs of nodes, where N is the number
> of nodes in the cluster. It is basically a non-blocking core switch.
>
> In this case the following scheme would be convenient and rather simple:
>
> The file is placed on node n1 and one builds a chain of nodes
> n1, n2, ..., nN.
>
> One splits the file into many packets (p1..pM); let's say a fragment fits
> into one TCP packet. In the first step n1 transmits the packet p1 to
> node n2.
> In the second step n1 transmits the packet p2 to n2, and n2 transmits p1
> to node n3.
>
> The transmission of a single packet is fast. The time for passing a
> particular packet through the whole chain of nodes is short compared with
> the time of the entire copying process. E.g., using jumbo frames a packet
> can have a size of ca. 10 kB.
> In a Gb network the transmission time of a single packet between nodes is
> of the order of 0.1 ms. Even in a cluster with 1024 nodes it takes,
> in the ideal case, just 0.1 s to pass a packet from node n1 through all
> nodes to n1024.
>
> On each node the packet is stored and, in the end, one reassembles the
> file.
> For big files (size >> 10Mb) the required time is approximately
> the same as one needs for copying the file between two nodes, plus 0.1 s.
>
> One basically needs a daemon which handles copying requests and
> establishes the connection to the next node in the chain.
>
> Has somebody written such a tool?
>
> Cheers,
> Henning Fehrmann
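
The timing argument in the quoted message can be checked with a short sketch. This is only an illustration of the idealized chain scheme, not how XGET or any existing tool works; the function names, the 10 kB packet size, and the 1 Gb/s link speed are assumptions taken from the figures in the message, and latency, protocol overhead, and disk speed are ignored.

```python
import math

def chain_copy_time(file_bytes, n_nodes, packet_bytes=10_000, link_bits_per_s=1e9):
    """Idealized time in seconds to copy a file down a chain n1 -> n2 -> ... -> nN."""
    if n_nodes < 2:
        return 0.0
    packets = math.ceil(file_bytes / packet_bytes)
    step = packet_bytes * 8 / link_bits_per_s  # time to send one packet over one hop
    hops = n_nodes - 1
    # n1 needs `packets` steps to send everything; the last packet then
    # needs hops - 1 further steps to reach the end of the chain.
    return (packets + hops - 1) * step

def simulate_steps(packets, n_nodes):
    """Count lock-step rounds until the last node in the chain holds every packet."""
    have = [packets] + [0] * (n_nodes - 1)  # number of packets held by each node
    steps = 0
    while have[-1] < packets:
        new = have[:]
        for i in range(n_nodes - 1):
            # A node forwards one packet per round if its successor is behind it.
            if have[i] > have[i + 1]:
                new[i + 1] += 1
        have = new
        steps += 1
    return steps

if __name__ == "__main__":
    two_nodes = chain_copy_time(10**9, 2)
    full_chain = chain_copy_time(10**9, 1024)
    print(f"1 GB, 2 nodes:    {two_nodes:.3f} s")
    print(f"1 GB, 1024 nodes: {full_chain:.3f} s")
    print(f"extra for chain:  {full_chain - two_nodes:.3f} s")
```

Under these assumptions the chain adds (N - 2) packet times to a plain two-node copy, which for 1024 nodes is on the order of the 0.1 s estimated in the message. `simulate_steps` is there only to confirm the step count M + N - 2 on small cases.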
