[Beowulf] Torrents for HPC

Mon Jun 11 10:49:23 PDT 2012

On Fri, 8 Jun 2012 at 5:06pm, Bill Broadley wrote

> Do you think it's worth bundling up for others to use?
>
> This is how it works:
> 1) User runs publish <directory> <name> before they start submitting
>    jobs.
> 2) The publish command makes a torrent of that directory and starts
>    seeding that torrent.
> 3) The user submits an arbitrary number of jobs that need that
>    directory.  Inside the job they "$ subscribe <name>"
> 4) The subscribe command launches one torrent client per node (not per
>    job) and blocks until the directory is completely downloaded
> 5) /scratch/<user>/<name> has the user's data
>
> Not nearly as convenient as having a fast parallel filesystem, but seems
> potentially useful for those who have large read only datasets, GigE and
> NFS.
>
> Thoughts?

I would definitely be interested in a tool like this.  Our situation is 
about as you describe -- we don't have the budget or workload to justify 
any interconnect higher-end than GigE, but have folks who pound our 
central storage to get at DBs stored there.
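
The "one client per node, every job blocks until the data is there" 
behaviour in steps 4 and 5 could be sketched roughly as below.  This is 
only an illustration, not Bill's implementation: the subscribe() 
signature, the fetch callable (standing in for whatever torrent client 
is launched), the lock file and the ".complete" marker are all 
assumptions made for the example, and fcntl limits it to POSIX systems.

```python
import fcntl
from pathlib import Path


def subscribe(name, scratch_root, fetch):
    """Block until scratch_root/name is fully present on this node.

    fetch(dest) is a hypothetical callable that downloads the dataset
    into dest (for example by running a torrent client) and returns
    only once the download is complete.
    """
    dest = Path(scratch_root) / name
    dest.mkdir(parents=True, exist_ok=True)
    lock_path = Path(scratch_root) / f".{name}.lock"   # assumed naming
    done_marker = dest / ".complete"                   # assumed marker

    with open(lock_path, "w") as lock_file:
        # Node-local exclusive lock: the first job on the node does the
        # download, later jobs wait here until it has finished.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if not done_marker.exists():
                fetch(dest)
                done_marker.touch()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return dest
```

A job would call something like subscribe("mydb", "/scratch/" + user, 
fetch) before touching the data.  The lock only coordinates jobs on the 
same node if the scratch directory is on node-local disk.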

-- 
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF