[Beowulf] copying data between clusters
kyron kyron at neuralbs.com
Fri Mar 5 08:18:54 PST 2010
On Fri, 05 Mar 2010 11:00:03 -0500, Joe Landman <landman at scalableinformatics.com> wrote:
> Michael Di Domenico wrote:
>> How does one copy large (20TB) amounts of data from one cluster to
>> another?
>>
>> Assuming that each node in the cluster can only do about 30MB/sec
>> between clusters and i want to preserve the uid/gid/timestamps, etc
>>
>> I know how i do it, but i'm curious what methods other people use...

Could you clarify? Are you actually sending from NodeXX-ClusterA to
NodeXX-ClusterB? Are we to assume an aggregate bandwidth of Node*BW (as
long as you don't saturate the switch fabric)? Also, given my comment
below, I am assuming the 20TB of data is actually segmented
(20TB/NodeCount) across the nodes and not 20TB*NodeCount.

> I am biased of course, but Fedex-net with one of these:
> http://scalableinformatics.com/jackrabbit
>
> 1GB @ 30 MB/s is about 33s. 1TB @ 30 MB/s is about 33000s. Or more
> than 1/3 of a day. 20TB @ 30 MB/s ... you are looking at ~7 days to write.
>
> If you have a 1GB/s disk write speed (less than the above unit can do),
> 1TB takes ~1000s, 20TB takes 20000s, about 1/4 of a day.
>
> If the clusters are close enough (same data center) this could be a
> shared storage but you will need a fast network between them. If the
> clusters are far enough to avoid direct connection, chances are 30 MB/s
> may be optimistic on getting data between them.
>
> BTW: 30 MB/s sounds suspiciously like either a) 1GbE sustained NFS speed
> for some nodes or b) the speed of an IDE drive.

Given that I haven't seen single 20TB drives out there yet, I doubt (b)
is the case. I wouldn't throw in NFS as a limiting factor (just yet), as
I have been able to sustain 250MB/s data transfer rates (2xGigE using
channel bonding). That figure is without jumbo frames, so I do have some
protocol overhead loss. The sending server is a PERC 5/i RAID with
4*300G*15kRPM drives, while the receiving end, well... was loading onto
RAM ;)

Eric Thibodeau
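
For anyone who wants to redo the back-of-envelope numbers quoted above, here is a minimal sketch in Python. It assumes decimal units (1 TB = 1,000,000 MB) and, for the Node*BW case, perfectly linear scaling with no switch-fabric saturation. The function names and the 16-node count are made up for illustration; nobody in the thread gave a node count.

```python
# Rough transfer-time estimates for the figures discussed in the thread.
# Assumptions (not from the thread): decimal units, ideal linear scaling
# across parallel streams, and a hypothetical node count of 16.

MB_PER_TB = 1_000_000
SECONDS_PER_DAY = 86_400


def transfer_seconds(size_tb, rate_mb_s, streams=1):
    """Seconds to move size_tb terabytes at rate_mb_s MB/s per stream."""
    if rate_mb_s <= 0 or streams < 1:
        raise ValueError("rate must be positive and streams >= 1")
    return size_tb * MB_PER_TB / (rate_mb_s * streams)


def transfer_days(size_tb, rate_mb_s, streams=1):
    """Same estimate as transfer_seconds, expressed in days."""
    return transfer_seconds(size_tb, rate_mb_s, streams) / SECONDS_PER_DAY


if __name__ == "__main__":
    # Single 30 MB/s stream.
    print(f"1 TB  @ 30 MB/s: {transfer_seconds(1, 30):,.0f} s")
    print(f"20 TB @ 30 MB/s: {transfer_days(20, 30):.1f} days")
    # 1 GB/s disk write speed (the ship-a-storage-box case).
    print(f"20 TB @ 1 GB/s:  {transfer_seconds(20, 1000):,.0f} s")
    # Aggregate Node*BW with a hypothetical 16 nodes at 30 MB/s each.
    print(f"20 TB @ 16 x 30 MB/s: {transfer_days(20, 30, streams=16):.2f} days")
```

The point of the `streams` parameter is only to show how much the answer depends on whether the 30 MB/s limit is per node or for the whole link, which is the question asked above.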
