<div dir="ltr"><div dir="ltr"><div><br></div><div>Hi Lance,</div><div><br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>For single node jobs MPI can be run with the MPI binary from the container with native performance for the shared memory type messages. This has worked without issue since the very early days of Singularity. The only tricky part has been multi-node and multi-container.</div></div></blockquote><div><br></div><div>  Thanks for the reply - I guess I'm curious where the 'tricky' bits are at this point.  For cross-node, container-per-rank jobs, I think the ABI compatibility work ensures (even if not done automagically) that you get 'native' performance, but the same-node, container-per-rank case is where I'm still unsure what happens.  In theory, since a container run is just a process, it <i>should</i> be doable, but I don't know whether there's some glue that still needs to happen or has already happened.</div><div><br></div><div>  If nobody knows offhand, it's on my to-do list to test this; I just haven't found the time yet.  I'll do so and update the list once I'm able.<br></div><div><br></div><div>  Cheers,</div><div>  - Brian<br></div><div> </div></div></div>