[Beowulf] copying big files (Henning Fehrmann)

Sun Aug 10 04:57:23 PDT 2008

On Aug 9, 2008, at 5:03 PM, Reuti wrote:

> Hi,
>
> Am 09.08.2008 um 20:53 schrieb jitesh dundas:
>
>> We could try to implement resuming of broken downloads, as some
>> software such as Download Accelerator and BitTorrent already does.
>>
>> I hope my views can help, so here goes:-
>>
>> When a file is being downloaded, we can keep a stack of all of these
>> downloads in progress at a centralized repository, preferably where
>> the user has kept his file hosted for download or on the machine where
>> the download is to be done.
>>
>> Next, we can keep track of the point at which the download stopped
>> and store it in the repository. If the user then restarts the
>> download, we can retrieve that record and resume from the end point
>> of the previous download.
>>
>> The end point for each file can be recorded as an offset in bits or
>> bytes, as a percentage, or as a piece index. Next time we can break
>> the file into pieces or percentages (as needed) and start the
>> download from the nearest point that is best suited for the user.
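
A minimal sketch of the resume idea above, in Python. It assumes the
simplest possible "end point": the size of the partial destination file,
rather than a record in a central repository. It also assumes the partial
file is an intact prefix of the source. The name resume_copy is made up
for illustration.

```python
import os

def resume_copy(src, dst, chunk_size=1 << 20):
    """Copy src to dst, continuing from however many bytes dst already holds.

    Returns the offset at which copying resumed (0 for a fresh copy).
    Assumption: any existing dst is a valid prefix of src.
    """
    offset = os.path.getsize(dst) if os.path.exists(dst) else 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(offset)            # skip what was already transferred
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)       # append mode keeps the existing prefix
    return offset
```

In practice one would also verify the existing prefix (for example with
a per-piece hash) before trusting it, which is roughly what piece-based
tools do.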
>
> Regarding user transmission of big files, maybe even between sites,
> I would look into splitting the files and adding parity (recovery)
> files like .par or .par2.
>
> http://sourceforge.net/projects/parchive
>
> Even if one part doesn't make it to the other node, you can still
> assemble the complete file thanks to the added parity files.
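
To illustrate why a lost part can be rebuilt, here is a toy sketch using
a single XOR parity block. This is not the PAR2 format or algorithm
(PAR2 uses Reed-Solomon coding and can repair several missing blocks);
plain XOR can only repair one. The function names are made up for the
example.

```python
def split(data, n):
    """Split data into n equal-sized parts, zero-padding the last one."""
    size = -(-len(data) // n)  # ceiling division
    return [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(n)]

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def recover(parts, parity):
    """Rebuild at most one missing part (marked None) from the parity block."""
    missing = [i for i, p in enumerate(parts) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one missing part")
    parts = list(parts)
    if missing:
        present = [p for p in parts if p is not None]
        parts[missing[0]] = xor_blocks(present + [parity])
    return parts
```

The sender would ship the parts plus xor_blocks(parts); the receiver
calls recover() and trims the zero padding using the known file length.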
>
> But the original question was about copying files inside a cluster to
> thousands of nodes. Since 1000 nodes already imply a sizable budget,
> what about looking into something like IBM's GPFS and their SAN
> switch, and connecting all nodes to this switch?
>
> -- Reuti

You may want to look at http://loci.cs.utk.edu. If you need to
distribute large files within a cluster or across the WAN, you can use
the LoRS tools to stripe the file over multiple servers; the
clients then try pulling blocks off of each server in parallel. Using
Internet2 with one client at Vanderbilt and a couple of servers at Univ
of Tennessee, they were able to saturate UT's ~400 Mb/s I2 link (much to
the disbelief of the Vandy IT staff). I have seen ~5 Gb/s within a
cluster using good 10G NICs. :-)
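
For a rough feel of the pattern (this is not the LoRS API, just a plain
Python sketch), the following pulls fixed-size blocks in parallel,
rotating over several sources. It assumes each entry in "replicas" is a
local path holding a full copy of the file, standing in for a server;
striped_fetch and read_block are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def read_block(path, offset, length):
    """Read one block from one source (a local path stands in for a server)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def striped_fetch(replicas, total_size, block_size=4 << 20, workers=8):
    """Fetch a file block by block, spreading requests round-robin over replicas."""
    offsets = range(0, total_size, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(read_block, replicas[i % len(replicas)], off,
                        min(block_size, total_size - off))
            for i, off in enumerate(offsets)
        ]
        # Collect results in submission order so the blocks are reassembled correctly.
        return b"".join(f.result() for f in futures)
```

This keeps the whole file in memory, which is fine for a demo; for
really big files you would write each block to its offset in the output
file instead.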

Scott