getting data to nodes....

Felix Rauch rauch at inf.ethz.ch
Fri Feb 14 12:09:38 PST 2003


On Fri, 14 Feb 2003, ambrose lewis wrote:
> I'm thinking about using an MPI code to analyze a
> large remote sensing dataset.
> I'm wondering the best way to distribute the data to
> the individual nodes (NFS, MPI message, other)????

If you are distributing the same data to all nodes, the data is large
(several hundred MB), and you have a switched network, then you might
consider our tool "Dolly" [1,2].

Dolly forms a virtual TCP chain through all the nodes and then sends
the data through this chain. On a Fast Ethernet network with a good
switch we achieve full wire speed (about 11 MB/s, no matter how many
nodes participate in the distribution). I still have to update the
web page, since the version of Dolly online is older than the most
recent one. If you want to try the newer version, feel free to mail
me and I will send you the source.
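
To make the chain idea concrete, here is a rough sketch in Python. It
is not Dolly's code, only an illustration of the relay pattern: every
node keeps a local copy of each block it receives and passes the block
on to the next node. The names relay, distribute and CHUNK are made up
for this example, and socket.socketpair() stands in for the TCP
connections that would link real machines, so the whole chain runs in
one process.

```python
import socket
import threading

CHUNK = 64 * 1024  # hypothetical block size, not taken from Dolly


def relay(upstream, downstream, sink):
    """Read blocks from upstream, keep a local copy, forward downstream."""
    while True:
        block = upstream.recv(CHUNK)
        if not block:          # upstream closed: end of data
            break
        sink.extend(block)     # "write to local disk" in a real tool
        if downstream is not None:
            downstream.sendall(block)
    upstream.close()
    if downstream is not None:
        downstream.close()     # signals end of data to the next node


def distribute(data, n_nodes):
    """Simulate a chain of n_nodes in one process; return each node's copy."""
    # links[i] connects node i-1 (or the sender, for i == 0) to node i.
    links = [socket.socketpair() for _ in range(n_nodes)]
    sinks = [bytearray() for _ in range(n_nodes)]
    threads = []
    for i in range(n_nodes):
        upstream = links[i][1]
        downstream = links[i + 1][0] if i + 1 < n_nodes else None
        t = threading.Thread(target=relay,
                             args=(upstream, downstream, sinks[i]))
        t.start()
        threads.append(t)
    head = links[0][0]
    head.sendall(data)         # the sender transmits the data only once
    head.close()
    for t in threads:
        t.join()
    return [bytes(s) for s in sinks]
```

Calling distribute(payload, 4) should return four identical copies of
payload. The point of the pattern is that the sender and every node
transmit the data exactly once, so no single link carries it more than
once in each direction.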

NFS is probably the simplest method, but it is slow for sending large
amounts of identical data to many nodes, since the server has to send
the same data over and over, once per node.
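
As a back-of-the-envelope comparison, the following Python sketch
estimates the transfer time for both approaches. It rests on a
deliberately simple model that is my assumption, not a measurement:
the server's single link is the only bottleneck, it runs at the
11 MB/s quoted above, and protocol overhead and the chain's pipeline
start-up are ignored. The function names are made up for this example.

```python
def server_fanout_seconds(size_mb, n_nodes, link_mb_per_s=11.0):
    """Server sends one full copy per node over its single link."""
    return n_nodes * size_mb / link_mb_per_s


def chain_seconds(size_mb, n_nodes, link_mb_per_s=11.0):
    """Chain: every link carries the data once, all links in parallel.

    n_nodes is accepted only to mirror the other function; under this
    simple model it does not affect the result.
    """
    return size_mb / link_mb_per_s
```

For 550 MB and 10 nodes this model gives about 500 seconds for the
server sending every copy itself, against about 50 seconds for the
chain.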

If you don't have a switched network (only hubs/repeaters), but your
network is multicast capable, then you could try UDPcast [3]. This
tool sends data using IP multicast.

- Felix

[1] http://www.cs.inf.ethz.ch/cops/patagonia/dolly.html
[2] http://www.cs.inf.ethz.ch/stricker/CoPs/patagonia/
    Project page, Dolly is near the bottom of the page.
[3] http://alain.knaff.lu/udpcast/
-- 
Felix Rauch                      | Email: rauch at inf.ethz.ch
Institute for Computer Systems   | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H18             | Phone: +41 1 632 7489
CH - 8092 Zuerich / Switzerland  | Fax:   +41 1 632 1307



