[Beowulf] IB DDR: mvapich2 vs mvapich performance
Mikhail Kuzminsky kus at free.net
Thu Apr 24 10:37:07 PDT 2008
In message from "Tom Elken" <tom.elken at qlogic.com> (Thu, 24 Apr 2008 09:31:16 -0700):
>> I have up to 1453 MB/s for 4MB message size ... on osu_bw test w/Mellanox DDR IB
>> (Mellanox version of OFED-1.2, w/binary mvapich-0.9.9); OpenMPI-1.2.2-1 gives even a
>> bit more (1470 MB/s - more exactly, 1469.753328, 1469.447179,
>> 1469.977840 for 3 subsequent test runs).
>>
>> The SC'07 message of D.K.Panda
>> http://mvapich.cse.ohio-state.edu/publications/sc07_mpich2_bof.pdf
>> informs us about 1405 MB/s.
>>
>> Is this throughput difference the result of MPI-2 vs MPI
>> implementation or should I believe that this difference (about 4% for
>> my mvapich vs mvapich2 at SC'07) is not significant - in the sense
>> that it is simply because of some measurement errors (inaccuracies)?
>
>The way to see if there is a real throughput difference between a
>MPI-2 implementation and a MPI-1 implementation is to measure it on
>your pair of machines. :-)

Of course - but I have a problem setting up mvapich2 (from the binary
Mellanox/ofed-1.2). When I try to run mpdboot (/etc/mpd.conf contains
the same MPD_SECRETWORD on both nodes; MV2_DEFAULT_DAPL_PROVIDER=ib0),
I see

mpdboot -v -n 2 -f /where/is/mpihosts
mpdroot: perror msg: No such file or directory
running mpdallexit on <node1_shortname>
LAUNCHED mpd on <node1_shortname> via
RUNNING: mpd on <node1_shortname>
LAUNCHED mpd on <node2_FQDN> via <node1_shortname>
RUNNING: mpd on <node2_FQDN>
=================================================
/var/log/messages contains:
Apr 22 21:20:53 <node1_shortname> python2.4: mpdallexit: mpd_uncaught_except_tb handling:
  exceptions.TypeError: not all arguments converted during string formatting
  /usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdlib.py 899 __init__
    mpd_print(1,'forked process failed; status=' % status)
  /usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py 44 mpdallexit
    conSock = MPDConClientSock(mpdroot=mpdroot,secretword=parmdb['MPD_SECRETWORD'])
  /usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py 59 ?
    mpdallexit()
Apr 22 21:20:53 <node1_shortname> mpd: mpd starting; no mpdid yet
Apr 22 21:20:53 <node1_shortname> mpd: mpd has mpdid=<node1_shortname>_40611 (port=40611)
Apr 22 21:21:01 <node1_shortname> kernel: ib0: multicast join failed for ff12:601b:ffff:0000:0000:0001:ff22:e50d, status -22
Apr 22 21:21:33 c5ws7 last message repeated 2 times
etc
=====================================================

... and I don't understand (even from the strace output :-)) which file
mpdboot/mpdroot wants :-(

>Comparing your results to published results is difficult because nearly
>all the variables need to be the same for the comparison to be valid.
>Variables like which of the following were used in the two tests:
>- Mellanox IB DDR adapter
>- PCIe interface type
>- CPU model and speed
>- PCIe chipset
>- OFED version, ...
>
>Certainly the MPI flavor and version is important, but it is not, in
>general, the most important of these factors.
>Note for example these two results on the OSU MVAPICH web pages:
>
>MVAPICH2 1-sided put throughput, measured with osu_bw:
>1405 MB/s: ConnectX DDR, PCIe x8, EM64T 2.33 GHz quad-core CPU
>http://mvapich.cse.ohio-state.edu/performance/mvapich2/em64t/MVAPICH2-em64t-gen2-ConnectX-DDR-1S.shtml
>
>1481 MB/s: MT25208 HCA silicon, PCIe x8, Intel Xeon 3.6 GHz, EM64T
>http://mvapich.cse.ohio-state.edu/performance/mvapich2/em64t/MVAPICH2-em64t-gen2-DDR-1S.shtml
>
>Both are DDR IB adapters. ConnectX is the newer silicon. But because
>of system differences, the older adapter is faster, in this case.

Thanks for this reference! I thought that with my older HCA hardware
(Infinihost III Lx PCI-e x8 MHGS18-XTC), older CPU/mobo/... (Opteron
246/2 GHz/...), and older Linux, OFED and mvapich/mvapich2 versions I
would get a lower throughput value ...

Mikhail

>
>-Tom
>
>
>>
>> Mikhail Kuzminsky
>> Computer Assistance to Chemical Research Center
>> Zelinsky Institute of Organic Chemistry
>> Moscow
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe)
>> visit http://www.beowulf.org/mailman/listinfo/beowulf
>>
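
A note on the TypeError in the /var/log/messages excerpt: it appears to come from the error-reporting line itself. The format string shown at mpdlib.py 899 has no conversion specifier, so `% status` has nothing to fill, and the handler fails before it can print the real status. The sketch below reproduces that behaviour in plain Python; the status value 256 and the function names are made up for illustration, and the "fixed" form is only an assumption about what the line was meant to do. It does not say which file mpdroot could not find.

```python
# Reproduces the TypeError from the traceback: a '%' format with no placeholder.
# The status value is a hypothetical example, not taken from the log.
status = 256


def broken_message(status):
    # Same shape as the line shown at mpdlib.py 899 in the traceback.
    return 'forked process failed; status=' % status


def fixed_message(status):
    # Assumed intent: a %s conversion so the status can be formatted.
    return 'forked process failed; status=%s' % status


try:
    broken_message(status)
    error_text = None
except TypeError as exc:
    error_text = str(exc)

print(error_text)
print(fixed_message(status))
```

With the placeholder in place, the log would at least carry the exit status of the forked process, which may help narrow down the "No such file or directory" message.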
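
On the question of whether the gap could be measurement error, the figures quoted in the thread allow a rough check. The sketch below uses only the numbers given above (1453 MB/s for mvapich, the three OpenMPI runs, 1405 MB/s from the SC'07 slides); the variable and function names are illustrative. It compares the run-to-run spread of the three OpenMPI runs with the gap to the published figure. This only bounds the noise on one system and says nothing about the noise of the published measurement.

```python
# Figures quoted in the thread (MB/s, osu_bw, 4 MB messages).
openmpi_runs = [1469.753328, 1469.447179, 1469.977840]
mvapich_local = 1453.0
mvapich2_sc07 = 1405.0


def percent_diff(value, reference):
    """Relative difference of value against reference, in percent."""
    return 100.0 * (value - reference) / reference


openmpi_mean = sum(openmpi_runs) / len(openmpi_runs)

# Spread of the three repeated runs, relative to their mean.
run_spread_pct = 100.0 * (max(openmpi_runs) - min(openmpi_runs)) / openmpi_mean

# Gaps between the local results and the published MVAPICH2 figure.
mvapich_gap_pct = percent_diff(mvapich_local, mvapich2_sc07)
openmpi_gap_pct = percent_diff(openmpi_mean, mvapich2_sc07)

print("run-to-run spread: %.3f%%" % run_spread_pct)
print("mvapich vs SC'07:  %.2f%%" % mvapich_gap_pct)
print("OpenMPI vs SC'07:  %.2f%%" % openmpi_gap_pct)
```

The spread of the repeated runs is well under a tenth of a percent, while the gaps to the published number are several percent, so run-to-run noise on this pair of machines alone is unlikely to explain the difference. That is consistent with Tom's point that system differences dominate.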
