[Beowulf] Parallel Programming Question
hahn at mcmaster.ca
Wed Jun 24 08:44:03 PDT 2009
> In an MPI parallel code which of the following two is a better way:
> 1) Read the input data from input data files only by the master process
> and then broadcast it to the other processes.
> 2) All the processes read the input data directly from input data files
> (no need of broadcast from the master process). Is it possible?
2 is certainly possible; whether it's any advantage depends too much
on your filesystem, the size of the data, etc. to say in general. I'd
expect 2 to be faster only if your file setup is peculiar - for
instance, if you can expect all nodes to have the input files cached
already. Otherwise, with a FS like NFS, 2 will lose, since an MPI
broadcast is almost certainly more time-efficient than N nodes all
fetching the file separately.
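
To put rough numbers on the NFS argument, here is a back-of-envelope
sketch, not a measurement. Everything in it is an assumption: the
function names and bandwidth parameters are made up, reads are treated
as purely server-bound with nothing cached, and the broadcast is
modelled as a simple binomial tree of ceil(log2(N)) rounds (real MPI
libraries often do better for large messages).

```python
def tree_rounds(n_ranks):
    """Rounds of a binomial-tree broadcast: ceil(log2(n_ranks)), 0 for one rank."""
    return (n_ranks - 1).bit_length()


def read_then_broadcast_time(size_bytes, n_ranks, fs_bw, net_bw):
    """Option 1 (assumed model): one rank reads the file, then tree-broadcasts it.

    fs_bw  -- bytes/s the file server can deliver in total (hypothetical)
    net_bw -- bytes/s of one point-to-point link (hypothetical)
    """
    read = size_bytes / fs_bw
    bcast = tree_rounds(n_ranks) * size_bytes / net_bw
    return read + bcast


def all_read_time(size_bytes, n_ranks, fs_bw):
    """Option 2 (assumed model): every rank fetches the whole file itself,
    so the server has to ship n_ranks copies through the same fs_bw."""
    return n_ranks * size_bytes / fs_bw


if __name__ == "__main__":
    # Illustrative inputs only: 1 GB file, 64 ranks, 100 MB/s server and links.
    size, ranks, bw = 1e9, 64, 1e8
    print("read + broadcast:", read_then_broadcast_time(size, ranks, bw, bw), "s")
    print("all ranks read:  ", all_read_time(size, ranks, bw), "s")
```

With those made-up inputs the model gives 70 s for option 1 against
640 s for option 2. If the files are already cached on every node, the
server-bound assumption no longer holds and the comparison can flip.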
But you should ask whether the data involved is large, and whether
each rank actually needs all of it. If each rank needs only a different
subset of the data, then reading separately could easily be faster.
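
For the subset case, a minimal sketch of how each rank could work out
and read only its own piece. This assumes the input is a flat file of
fixed-size records split into contiguous blocks; `rank_slice` and
`read_slice` are hypothetical helpers, and in a real MPI code `rank`
and `n_ranks` would come from the communicator (MPI_Comm_rank and
MPI_Comm_size).

```python
def rank_slice(n_records, n_ranks, rank):
    """Return (start, count) of the contiguous block of records owned by `rank`.

    Records are divided as evenly as possible; the first
    (n_records % n_ranks) ranks get one extra record each.
    """
    base, extra = divmod(n_records, n_ranks)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return start, count


def read_slice(path, record_size, start, count):
    """Read `count` fixed-size records beginning at record index `start`."""
    with open(path, "rb") as f:
        f.seek(start * record_size)
        return f.read(count * record_size)
```

Each rank then pulls roughly 1/N of the file, so the server ships the
data once in total rather than N times, and no broadcast is needed.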