[Beowulf] Q: IB message rate & large core counts (per node)?
Brian Dobbins bdobbins at gmail.com
Tue Feb 23 15:23:59 PST 2010
Hi Greg,

> > Well, clearly we hope to move more towards hybrid methods -all that's old
> > is new again?-
>
> If you want bad performance, sure. If you want good performance, you
> want a device which supports talking to a lot of cores, and then
> multiple devices per node, before you go hybrid. The first two don't
> require changing your code. The last does.
>
> The main reason to use hybrid is if there isn't enough parallelism in
> your code/dataset to use the cores independently.

Actually, it's often *for* performance that we look towards hybrid methods,
albeit in an indirect way - with RAM amounts per node increasing at the same
or lesser rate than cores, and with each MPI task on *some* of our codes
having a pretty hefty memory footprint, using fewer MPI processes and more
threads per task lets us fully utilize nodes that would otherwise have cores
sitting idle due to a lack of available memory. Sure, we could rewrite the
code to tackle this, too, but in general it seems easier to add threading in
than to rework a complicated parallel decomposition, shared buffers, etc.

In a nutshell, even if a hybrid mode *costs* me 10-20% over a direct mode
with an equal number of processors, if it allows me to use 50% more cores in
a node, it works out well for us. But yes, ignoring RAM constraints,
non-hybrid parallelism tends to be nicer at the moment.

> > But getting back to a technical vein, is the multiplexing an issue due to
> > atomic locks on mapped memory pages? Or just because each copy reserves its
> > own independent buffers? What are the critical issues?
>
> It's all implementation-dependent. A card might have an on-board
> memory limit, or a limited number of "engines" which process
> messages. Even if it has a option to store some data in main memory,
> often that results in a scalability hit.

Thanks. I guess I need to read up on quite a bit more and set up some tests.
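To put rough numbers on the memory argument above, here is a minimal Python sketch. The node size (12 cores, 24 GB), the 3 GB per-rank footprint, and the function names are all hypothetical, chosen only so that the hybrid layout reaches 50% more cores. It also assumes that adding threads to a rank adds no memory, which real codes will only approximate.

```python
def usable_cores(cores_per_node, ram_gb, gb_per_rank, threads_per_rank=1):
    """Cores a node can keep busy when each MPI rank needs gb_per_rank of RAM.

    Assumption: threads within a rank share that rank's memory footprint.
    """
    ranks_by_memory = int(ram_gb // gb_per_rank)        # limit set by RAM
    ranks_by_cores = cores_per_node // threads_per_rank  # limit set by cores
    ranks = min(ranks_by_memory, ranks_by_cores)
    return ranks * threads_per_rank


def hybrid_gain(pure_cores, hybrid_cores, hybrid_efficiency):
    """Node throughput of hybrid relative to pure MPI.

    hybrid_efficiency is per-core efficiency relative to pure MPI,
    e.g. 0.8 for a 20% hybrid penalty.
    """
    return hybrid_cores * hybrid_efficiency / pure_cores


if __name__ == "__main__":
    # Hypothetical node: 12 cores, 24 GB RAM, 3 GB per MPI rank.
    pure = usable_cores(12, 24, 3)                         # memory-bound
    hybrid = usable_cores(12, 24, 3, threads_per_rank=2)   # core-bound
    print(f"pure MPI cores used: {pure}, hybrid cores used: {hybrid}")
    for eff in (0.8, 0.9):
        print(f"efficiency {eff:.1f}: gain {hybrid_gain(pure, hybrid, eff):.2f}x")
```

With these made-up numbers, pure MPI is capped at 8 ranks by memory, while 6 ranks of 2 threads each use all 12 cores; a 10-20% hybrid penalty then still leaves a net gain of roughly 1.2x to 1.35x per node.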

Cheers,
 - Brian
