[Beowulf] Performance characterising a HPC application
Greg Lindahl
greg.lindahl at qlogic.com
Mon Mar 26 08:21:28 PDT 2007
On Fri, Mar 23, 2007 at 11:53:14AM -0700, Gilad Shainer wrote:
> and not only on pure latency or pure bandwidth. QLogic till recently (*)
> had the lowest latency number, but when it comes to applications, the CPU
> overhead is too high.
QLogic's overhead is lower than Mellanox's; how low do you want it to be?
Please see http://www.pathscale.com/pdf/overhead_and_app_perf.pdf
This shows the MPI overhead per byte transferred, as measured by Doug
at Sandia in 2005. How can ours be lower? The InfiniBand APIs are
unnecessarily complicated, and are a poor match to MPI compared to
everyone else's APIs: MX, Tports, InfiniPath's PSM.
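To make the mismatch concrete: MPI receives match on (source, tag,
communicator), and MX, Tports, and PSM all expose matched sends and
receives directly, so MPI maps onto them almost one-to-one. Verbs
hands the MPI library nothing but ordered completions on a queue
pair, so the library has to build the matching (plus memory
registration and a rendezvous protocol) itself. A rough schematic of
that matching work, not any particular MPI's code:

/* the matched-queue semantics MPI needs, schematically: an MPI built
 * on Verbs has to implement this itself; MX, Tports, and PSM hand it
 * to you as the native interface */
#include <stddef.h>
#include <string.h>

#define ANY (-1)                       /* stands in for MPI_ANY_SOURCE/TAG */

struct posted_recv {
    int src, tag;
    void *buf;
    size_t len;
    struct posted_recv *next;
};

static struct posted_recv *posted_q;   /* receives the app has posted */

static int matches(const struct posted_recv *r, int src, int tag)
{
    return (r->src == ANY || r->src == src) &&
           (r->tag == ANY || r->tag == tag);
}

/* run for every message completion the transport delivers */
void on_arrival(int src, int tag, const void *data, size_t len)
{
    struct posted_recv **pp;
    for (pp = &posted_q; *pp; pp = &(*pp)->next) {
        if (matches(*pp, src, tag)) {
            struct posted_recv *r = *pp;
            *pp = r->next;             /* unlink the matched receive */
            memcpy(r->buf, data, len < r->len ? len : r->len);
            return;                    /* ...and mark r complete */
        }
    }
    /* no match: copy to an unexpected-message queue (elided), searched
     * again whenever the application posts a new receive */
}

int main(void)                         /* tiny smoke test of the queue */
{
    static char rbuf[16];
    struct posted_recv r = { 3, 7, rbuf, sizeof rbuf, NULL };
    posted_q = &r;
    on_arrival(3, 7, "hello", 6);      /* matches and copies */
    return posted_q == NULL ? 0 : 1;
}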
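And for anyone who wants to reproduce the flavor of those Sandia
numbers: the usual approach is a post-work-wait loop. Post a
nonblocking send, do a growing amount of independent compute, then
wait; whatever time can't be hidden behind the compute is host
overhead. A minimal sketch (my reconstruction of the general method,
not Doug's actual benchmark):

/* post-work-wait host overhead sketch -- schematic only.
 * build: mpicc -O2 overhead.c -o overhead ; run on 2 ranks */
#include <mpi.h>
#include <stdio.h>

#define NBYTES (64*1024)
#define ITERS  1000

static volatile double sink;

static void work(long n)               /* independent CPU work */
{
    double x = 1.0;
    for (long i = 0; i < n; i++)
        x += 1e-9 * i;
    sink = x;
}

int main(int argc, char **argv)
{
    int rank;
    static char buf[NBYTES];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (long n = 1024; n <= 1L << 20; n *= 4) {
        /* calibrate: how long does the work take with no messaging? */
        double w0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++)
            work(n);
        double wt = (MPI_Wtime() - w0) / ITERS;

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Request req;
                MPI_Isend(buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &req);
                work(n);                /* compute while the NIC works */
                MPI_Wait(&req, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            }
        }
        double t = (MPI_Wtime() - t0) / ITERS;

        /* once the work dominates the transfer time, t - wt is the
         * per-message host overhead the CPU couldn't hide */
        if (rank == 0)
            printf("work %.1f us  send+work %.1f us  overhead %.1f us\n",
                   wt * 1e6, t * 1e6, (t - wt) * 1e6);
    }
    MPI_Finalize();
    return 0;
}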
The next slide shows a graph of the LS-Dyna results recently submitted
to topcrunch.org, showing that InfiniPath SDR beats Mellanox DDR on
the neon_refined_revised problem, both running on 3.0 GHz Woodcrest
dual-socket, dual-core nodes.
I look forward to showing the same advantage vs ConnectX, whenever
it's actually available.
-- greg
(P.S. ggv has issues with the fonts in this PDF, so try xpdf or (yech)
acroread instead.)