[Beowulf] copying big files (Henning Fehrmann)

Reuti reuti at staff.uni-marburg.de
Sat Aug 9 14:03:57 PDT 2008


Hi,

On 09.08.2008 at 20:53, jitesh dundas wrote:

> We could try to implement resuming of broken downloads, as is done
> in some software such as Download Accelerator and BitTorrent.
>
> I hope my views can help, so here goes:
>
> When a file is being downloaded, we can keep a stack of all of these
> downloads in progress at a centralized repository, preferably where
> the user has kept his file hosted for download or on the machine where
> the download is to be done.
>
> Next, we can keep track of the point at which the download stopped
> and store it in the repository. If the user then tries to start the
> download again, we can retrieve that record and get the end point of
> the previous download.
>
> The end point for each file can include the file details in terms of
> bits and bytes (0 & 1), or even in percentages or pieces. Next time we
> can break our file based on pieces or percentages (as needed) and
> start the download from the nearest point that is best suited for the
> user.
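
The bookkeeping described above (remember where the transfer stopped,
continue from there) can be sketched roughly as follows. This is only a
minimal illustration, not anything taken from Download Accelerator or
BitTorrent: it assumes plain local files, it uses the size of the
partial destination file as the stored "end point" instead of a
separate repository, and it assumes the bytes that already arrived are
intact. The names resume_copy and max_bytes are made up for the
example; max_bytes exists only to simulate an interrupted transfer.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice


def resume_copy(src, dst, chunk=CHUNK, max_bytes=None):
    """Copy src to dst, continuing after the bytes dst already holds.

    max_bytes (hypothetical, for demonstration) stops the copy early to
    simulate a broken transfer. Returns the bytes written in this call.
    """
    # The "end point" of the previous attempt is the partial file's size.
    done = os.path.getsize(dst) if os.path.exists(dst) else 0
    written = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(done)  # skip what already arrived
        while max_bytes is None or written < max_bytes:
            want = chunk if max_bytes is None else min(chunk, max_bytes - written)
            block = fin.read(want)
            if not block:
                break  # end of source reached
            fout.write(block)
            written += len(block)
    return written
```

Calling resume_copy(src, dst) again after an interruption picks up at
the old offset. Tracking pieces rather than one offset would
additionally need a per-piece record, which this sketch leaves out.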

Regarding user transmission of big files, maybe even between sites, I
would look into splitting the files and adding checksum/parity files
like .par or .par2.

http://sourceforge.net/projects/parchive

Even if one part doesn't make it to the other node, you can still
assemble the complete file thanks to the added recovery files.
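
To illustrate why a lost part need not be fatal, here is a small
sketch of the idea. It is not the PAR/PAR2 format (PAR2 uses
Reed-Solomon codes and can repair several missing blocks); as a
simplified stand-in it uses a single XOR parity piece, which can
rebuild exactly one missing or corrupt part, plus SHA-256 digests to
detect which part is bad. All function names are made up for the
example.

```python
import hashlib


def _xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data, n_parts):
    """Split data into n_parts equal pieces (zero-padded) plus XOR parity."""
    size = -(-len(data) // n_parts)  # ceiling division
    parts = [data[i * size:(i + 1) * size].ljust(size, b"\0")
             for i in range(n_parts)]
    parity = bytes(size)  # all zeros to start
    for p in parts:
        parity = _xor(parity, p)
    digests = [hashlib.sha256(p).hexdigest() for p in parts]
    return parts, parity, digests


def reassemble(parts, parity, digests, length):
    """Rebuild the data; at most one entry of parts may be None or corrupt."""
    bad = [i for i, p in enumerate(parts)
           if p is None or hashlib.sha256(p).hexdigest() != digests[i]]
    if len(bad) > 1:
        raise ValueError("a single XOR parity piece can repair only one part")
    parts = list(parts)
    if bad:
        rebuilt = parity
        for i, p in enumerate(parts):
            if i != bad[0]:
                rebuilt = _xor(rebuilt, p)  # XOR of the rest yields the lost part
        if hashlib.sha256(rebuilt).hexdigest() != digests[bad[0]]:
            raise ValueError("repair failed; parity piece may be damaged")
        parts[bad[0]] = rebuilt
    return b"".join(parts)[:length]  # drop the zero padding
```

For example, after split_with_parity(data, 5) any one of the five
parts can be replaced by None and reassemble() still returns the
original data, at the cost of transferring one extra piece.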

But the original question was about copying files inside a cluster to
thousands of nodes. As 1000 nodes already means a fair amount of money
to spend, what about looking into something like IBM's GPFS and their
SAN switch, and connecting all nodes to this switch?

-- Reuti


> I hope this helps...
> I request your feedback...
>
> Thanks,
> Jitesh Dundas
> Mobile- +91-9860925706
> http://jiteshbdundas.blogspot.com
>
>
> On 8/9/08, Carsten Aulbert <carsten.aulbert at aei.mpg.de> wrote:
>> Hi
>>
>> Perry E. Metzger wrote:
>>
>>> Is there a reason bittorrent isn't suited to this application?
>>>
>>
>> Our investigations so far showed that bittorrent is only good if the
>> files to be transferred fit well into main memory. If you exceed about
>> 90-95% of the RAM, your disks will be accessed a lot and the
>> performance breaks down badly (we saw close to wire speed in the
>> beginning, and in the end we were crawling at a mere few tens of
>> kByte/s).
>>
>> Cheers
>>
>> Carsten
>>
>> --
>> Dr. Carsten Aulbert - Max Planck Institute for Gravitational Physics
>> Callinstrasse 38, 30167 Hannover, Germany
>> Phone/Fax: +49 511 762-17185 / -17193
>> http://www.top500.org/system/9234 | http://www.top500.org/connfam/6/list/31
>>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit  
> http://www.beowulf.org/mailman/listinfo/beowulf



