Dolphin Wulfkit
Steffen Persvold sp at scali.com
Sat May 4 05:43:52 PDT 2002
On Sat, 4 May 2002, joachim wrote:

> [Greg Lindahl:]
> > If a benchmark measures process-to-process bandwidth and latency
> > between only 2 processes, it matters whether or not they are on the
> > same node (shmem) or not. It's not a matter of round robin vs. groups.
>
> I need to refer to the disclaimer again: most interesting is not the
> p2p-performance, as we all know, but the scaling behaviour. If you look at
> the other PMB result Patrick has posted, it's obvious that these values
> are very heavily dependent on the optimized setup of the hard- and software.
> The scaling behaviour usually stays the same.
>
> > BTW, groups _are_ strongly preferred, but that's not the issue. Why?
> > If you have a program that does nearest neighbor communication, where
> > are your neighbors? With round robin you are accessing your array in
> > the wrong order.
>
> Agreed - SCI-MPICH (i.e.) does "grouping" no matter in which order the
> nodes are specified because it always makes sense (counter example?).
>
> Don't know about the details of the ScaMPI mapping
> algorithm - I'm not Scali...
>

Well, ScaMPI does "grouping" too, I guess. Examples:

  # /opt/scali/bin/mpimon someapp -- node1 node2 node1 node2

Here ranks 0 and 2 run on node1, and when they communicate with each
other shm is used; the same goes for ranks 1 and 3 on node2.

  # /opt/scali/bin/mpimon someapp -- node1 2 node2 2

Here ranks 0 and 1 run on node1 while ranks 2 and 3 run on node2. The
same rule applies to this grouping.

Bottom line: MPI processes located on the same node use shm, no matter
how you group them on the command line (kinda logical, actually).

AFAIK the PMB benchmark uses ranks 0 and 1 for its ping-ping and
ping-pong tests, so the two examples above would use different
transports (example 1 would use SCI, example 2 would use shm) and could
therefore get very different results.
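To make the rank placement in the two examples concrete, here is a minimal Python sketch. It is not ScaMPI code: the helper names `expand_hosts` and `transport` are hypothetical, and the host-list rule (a number after a node name means that many ranks on that node) is only inferred from the two command lines above, not taken from the mpimon documentation.

```python
def expand_hosts(tokens):
    """Return one node name per rank, in rank order.

    Assumed rule, inferred from the two examples in this message only:
    a bare node name places one rank there; a number following a node
    name means that node gets that many consecutive ranks in total.
    """
    placement = []
    for token in tokens:
        if token.isdigit() and placement:
            # The preceding node already holds one rank; add the rest.
            placement.extend([placement[-1]] * (int(token) - 1))
        else:
            placement.append(token)
    return placement


def transport(placement, rank_a, rank_b):
    """Same node -> shared memory (shm); different nodes -> SCI."""
    return "shm" if placement[rank_a] == placement[rank_b] else "SCI"


# Example 1: round robin, so ranks 0 and 2 share node1.
round_robin = expand_hosts("node1 node2 node1 node2".split())
# Example 2: grouped, so ranks 0 and 1 share node1.
grouped = expand_hosts("node1 2 node2 2".split())

# A two-process test on ranks 0 and 1 sees a different transport in each case.
print(transport(round_robin, 0, 1))  # SCI
print(transport(grouped, 0, 1))      # shm
```

The point of the sketch is only that the same pair of ranks (0 and 1) lands on different nodes in one layout and on the same node in the other, which is why the two runs are not comparable.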
Regards,
--
  Steffen Persvold    | Scalable Linux Systems | Try out the world's best
mailto:sp at scali.com | http://www.scali.com   | performing MPI implementation:
Tel: (+47) 2262 8950  | Olaf Helsets vei 6     | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951  | N0621 Oslo, NORWAY     | >320MBytes/s and <4uS latency
More information about the Beowulf mailing list
