[Beowulf] copying big files

Lombard, David N dnlombar at ichips.intel.com
Thu Aug 14 06:27:09 PDT 2008


On Wed, Aug 13, 2008 at 04:14:50PM -0700, Bernard Li wrote:
> Hi:
> 
> On Tue, Aug 12, 2008 at 10:10 AM, Lombard, David N
> <dnlombar at ichips.intel.com> wrote:
> 
> > See Brent Chun's pcp at <http://www.theether.org/work.html>
> > You'll want pcp, authd, and libe. Get gexec while you're at it...
> 
> Dave!!  You beat me to mentioning pcp! :-)
> 
> I really wish somebody would pick up the project and make it better.

It's now part of Ganglia.  It was *always* intended to support Ganglia,
hence the g of gexec, but now it's in the Ganglia tree at SourceForge.

>  I tested it a few years back and it was really fast.  The
> general idea is that the host that has the original file will send to
> the neighbour and before that transfer completes the neighbour will
> send to its neighbour and so on and so forth.

It's essentially a file transfer pipeline through the hosts, so the total
transfer time is roughly O(1) in the number of hosts.  gexec, by contrast,
uses a tree.
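
To make the pipeline idea concrete, here is a minimal in-memory sketch.
It is not pcp's code: the names relay, pipeline_copy and pipeline_steps
are made up for illustration, "hosts" are just Python lists, and the step
count assumes an idealized chain where every link moves one chunk per
time step.

```python
from typing import Iterable, Iterator, List


def relay(incoming: Iterable[bytes], stored: List[bytes]) -> Iterator[bytes]:
    """Keep a local copy of each chunk, then pass it downstream at once."""
    for chunk in incoming:
        stored.append(chunk)  # stands in for "write to local disk"
        yield chunk           # forwarded before the rest of the file arrives


def pipeline_copy(data: bytes, n_hosts: int, chunk_size: int = 4) -> List[bytes]:
    """Push data through a chain of n_hosts and return each host's copy."""
    chunks = (data[i:i + chunk_size] for i in range(0, len(data), chunk_size))
    stores: List[List[bytes]] = [[] for _ in range(n_hosts)]
    stream: Iterator[bytes] = chunks
    for store in stores:
        # Each host reads from its upstream neighbour's output.
        stream = relay(stream, store)
    for _ in stream:
        pass  # draining the tail of the chain pulls chunks through every host
    return [b"".join(store) for store in stores]


def pipeline_steps(n_chunks: int, n_hosts: int) -> int:
    """Idealized chunk-time steps until the last host holds the last chunk."""
    return n_chunks + n_hosts - 1
```

With many more chunks than hosts, pipeline_steps is dominated by the
chunk count, which is why the elapsed time barely grows as hosts are
added to the chain.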

>                                                But it has some
> shortcomings:
> 
> 1) If one host in the path goes down, you need to start over again

Yup

> 2) The command option/interface is a little bit awkward; if I remember
> correctly, you need to specify all the hosts manually on the command
> line, etc.

Running via ganglia eases this.  I once provided patches to allow for
the various naming shortcuts, e.g., 'n[1-42]', but Brent was more focused
on Ganglia integration than general purpose tools.
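
For anyone unfamiliar with that kind of shortcut, a rough sketch of
expanding a spec such as 'n[1-42]' into host names follows.  This is not
the patch mentioned above; expand_hosts is a hypothetical helper, and
handling a single numeric range per name (with zero padding taken from
the lower bound) is an assumption made to keep it short.

```python
import re
from typing import List

# One bracketed numeric range per spec, e.g. "n[1-42]" or "rack[01-16]-ib".
_RANGE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<lo>\d+)-(?P<hi>\d+)\](?P<suffix>[^\[\]]*)$"
)


def expand_hosts(spec: str) -> List[str]:
    """Expand 'n[1-3]' to ['n1', 'n2', 'n3']; plain names pass through."""
    match = _RANGE.match(spec)
    if match is None:
        return [spec]
    lo, hi = int(match["lo"]), int(match["hi"])
    if lo > hi:
        raise ValueError(f"empty host range in {spec!r}")
    # Preserve zero padding when the lower bound is written as e.g. "01".
    width = len(match["lo"]) if match["lo"].startswith("0") else 0
    return [
        f"{match['prefix']}{str(i).zfill(width)}{match['suffix']}"
        for i in range(lo, hi + 1)
    ]
```

The resulting list could then be handed to whatever tool wants its hosts
spelled out one by one.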

> And yes, I agree BitTorrent is a simple solution that works well, that
> is why we integrated BitTorrent as part of the distribution mechanism
> of OS images in SystemImager.

Yes, BT is an outstanding application-level multicast.  BUT, it depends
on having a suitable number of contributors to the stream, and in this
usage they are only built up over time.  Having said that, it really is a
good capability.

>                                However, it would be nice if someone
> actually wrote an application which eliminates the manual setup of the
> tracker, seed, etc. -- better yet, code something from scratch as
> BitTorrent cannot handle user/file permissions, and thus the way
> around it is to tar up the files you want to transfer, and untar them
> afterwards, which adds additional overhead.

Hmmm...

> I look forward to trying out XGET, though.

me2

-- 
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.


