[Beowulf] Intel buys QLogic InfiniBand business
landman at scalableinformatics.com
Fri Jan 27 18:24:10 PST 2012
On 01/27/2012 05:27 PM, Greg Lindahl wrote:
> I'm not surprised, as this 10ge adapter is aimed at the same part of
> the market that uses fibre channel, which isn't that common in HPC. It
> doesn't have the kind of TCP offload features which have been
> (futilely) marketed in HPC; it's all about running the same fibre
> channel software most enterprises have run for a long time, but having
> the network be ethernet.
That makes sense.
>> Haven't looked much at FDR or EDR latency. Was it a huge delta (more
>> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
>> for a while, and switches are still ~150-300ns port to port.
> Are you talking about the latency of 1 core on 1 system talking to 1
> core on one system, or the kind of latency that real MPI programs see,
> running on all of the cores on a system and talking to many other
> systems? I assure you that the latter is not 0.8 for any IB system.
I am looking at these things from a "best of all possible cases"
scenario. So when someone comes at me with new "best of all possible
cases" numbers, I can compare. Sadly this seems to be the state of many
In storage, we see small disk form factor SSDs marketed generally, with
statements like 50k IOPs, and 500 MB/s. They neglect to mention
several specific issues with these, such as the tests writing all zeros, or that the
75k IOPs are sequential IOPs you get from taking the 600 MB/s interface
and dividing by 8k byte operations on a sequential read. Actually do a real
random read and write and you get very ... very different results,
especially with non-zero (real) data.
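
As a rough sketch of the arithmetic behind that kind of headline number (assuming decimal units, i.e. 1 MB = 10^6 bytes and "8k" = 8000 bytes, which the figures above imply but do not state; the function name is made up for illustration):

```python
# Bandwidth-bound "IOPS": what you get by dividing interface bandwidth by the
# operation size. This is a ceiling for streaming (sequential) transfers; it
# says nothing about random read/write behaviour or about real, non-zero data.
# Unit conventions (decimal MB, 8000-byte ops) are an assumption.

def bandwidth_bound_iops(interface_mb_per_s, op_size_bytes):
    """Upper bound on ops/second if every op moves at full interface speed."""
    return interface_mb_per_s * 1_000_000 / op_size_bytes

headline_iops = bandwidth_bound_iops(600, 8000)
print(f"{headline_iops:,.0f} IOPS")  # 75,000 IOPS
```

The point is that this figure is a property of the interface, not of the device doing random I/O.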
>> At some
>> point I think you start hitting a latency floor, bounded in part by "c",
> Last time I did the computation, we were 10X that floor. And, of
> course, each increase in bandwidth usually makes latency worse, absent
> heroic efforts of implementers to make that headline latency look
I think that's the point though, that moving that performance "knee" down
to lower latency involves (potentially) significant cost, for a modest
return ... in terms of real performance benefit to a code.
Thanks for the pointer on the computation. If we are 1000x off the
floor, we can probably come up with a way to do better. At 10x, it's
probably much harder than we think and not necessarily worth the effort.
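
For anyone wanting to play with the "c" floor themselves, here is a minimal sketch. The path lengths and the velocity factor are hypothetical inputs, not numbers from this thread, and the result depends entirely on them, so this does not reproduce the computation referred to above.

```python
# Propagation-delay floor set by the speed of light, and how far a measured
# latency sits above it. Distances and velocity factor are assumed inputs.

C_M_PER_S = 299_792_458  # speed of light in vacuum (exact, by SI definition)

def propagation_floor_ns(distance_m, velocity_factor=1.0):
    """One-way propagation time in ns over distance_m.

    velocity_factor is the signal speed as a fraction of c
    (1.0 = vacuum; real cables and fibre are slower).
    """
    return distance_m / (C_M_PER_S * velocity_factor) * 1e9

def ratio_to_floor(measured_ns, distance_m, velocity_factor=1.0):
    """How many times larger the measured latency is than the floor."""
    return measured_ns / propagation_floor_ns(distance_m, velocity_factor)

if __name__ == "__main__":
    measured_ns = 900.0  # e.g. a 0.9 us headline latency
    for distance_m in (1.0, 10.0, 30.0):  # hypothetical end-to-end path lengths
        floor = propagation_floor_ns(distance_m, velocity_factor=0.7)
        ratio = ratio_to_floor(measured_ns, distance_m, velocity_factor=0.7)
        print(f"{distance_m:5.1f} m: floor {floor:7.1f} ns, measured/floor = {ratio:6.1f}x")
```

Short paths put the floor far below the headline number; longer paths through several switches close the gap quickly.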
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615