[Beowulf] The Case for an MPI ABI
Joachim Worringen
joachim at ccrl-nece.de
Thu Feb 24 05:30:09 PST 2005
Greg Lindahl wrote:
> I don't think it's "much reduced" by this, but I think it's clear this
> would be a matter of opinion. What you'll definitely be able to do is
> run an application built on a particular Linux version with different
> MPI libraries compiled for that same Linux version. You are correct
> that if the MPI library was built for a wildly different Linux distro
> than the app, you can't necessarily put them together.
Leaving that problem aside, do you know of ISVs that would at least be
willing to consider supporting an MPI ABI regardless of implementation
and interconnect, rather than a specific MPI library? Because this is
what matters.
For open source software packages alone, an ABI is not of critical
importance, as people with a TCP/IP cluster can use pre-linked packages,
and people with a high-performance interconnect cluster typically have
enough competence to compile the software themselves.
>>Another problem is, e.g., vendor-specific assertions that could conflict.
>>A solution for this could be "numerical namespaces" for such extensions,
>>but how should they be managed?
>
> This is certainly something that a committee would discuss. There are
> plenty of examples of this problem being solved successfully by
> handing out numeric ranges.
Well, for MAC addresses, PCI device IDs etc., there are professional
organisations that take care of this. For MPI, there is no such
institution. ANL? Maybe.
But maybe there's another technical solution: if the linked library
could somehow know which variant of mpi.h the code was compiled against,
that would determine the meaning of all assertions beyond 1024 (or
some other limit). Something coded into MPI_Init() or its arguments
might be a way... hacky, hacky.
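To make the idea a bit more concrete, here is a rough sketch in C. It is
not taken from any existing MPI implementation: the abi_/ABI_ names, the
vendor ids and the meaning attached to value 1025 are all invented for
illustration. The header-side macro silently passes a vendor tag to the
library's init routine, and the library then uses that tag to interpret
assertion values beyond the common limit.

```c
#include <assert.h>
#include <stddef.h>

/* Assertion values up to this limit would mean the same for everybody. */
#define ABI_COMMON_LIMIT 1024

/* Hypothetical vendor tags; a real scheme would need someone to assign them. */
enum abi_vendor { ABI_VENDOR_NONE = 0, ABI_VENDOR_A = 1, ABI_VENDOR_B = 2 };

/* Hypothetical result codes for the lookup below. */
enum abi_meaning {
    ABI_MEANING_COMMON,
    ABI_MEANING_VENDOR_A_EXT,
    ABI_MEANING_VENDOR_B_EXT,
    ABI_MEANING_UNKNOWN
};

/* Each vendor's mpi.h variant would define its own tag.
 * Here we pretend the application was compiled against vendor A's header. */
#ifndef ABI_HEADER_VENDOR
#define ABI_HEADER_VENDOR ABI_VENDOR_A
#endif

/* Library side: remembers which header variant the application used. */
static enum abi_vendor abi_app_vendor = ABI_VENDOR_NONE;

/* Library entry point that takes the extra, hidden vendor tag. */
int abi_init_tagged(int *argc, char ***argv, enum abi_vendor header_vendor)
{
    (void)argc;   /* unused in this sketch */
    (void)argv;
    abi_app_vendor = header_vendor;
    return 0;
}

/* Header side: the application still writes an ordinary two-argument init call. */
#define ABI_Init(argc, argv) abi_init_tagged((argc), (argv), ABI_HEADER_VENDOR)

/* Interpret an assertion value according to the recorded header variant. */
enum abi_meaning abi_assertion_meaning(int assertion)
{
    assert(assertion >= 0);
    if (assertion <= ABI_COMMON_LIMIT)
        return ABI_MEANING_COMMON;
    /* Beyond the limit, the same number may mean different things per vendor. */
    switch (abi_app_vendor) {
    case ABI_VENDOR_A:
        return assertion == 1025 ? ABI_MEANING_VENDOR_A_EXT : ABI_MEANING_UNKNOWN;
    case ABI_VENDOR_B:
        return assertion == 1025 ? ABI_MEANING_VENDOR_B_EXT : ABI_MEANING_UNKNOWN;
    default:
        return ABI_MEANING_UNKNOWN;
    }
}
```

After ABI_Init(&argc, &argv) in an application built against the "vendor A"
header, abi_assertion_meaning(1025) yields the vendor A extension, while
the same value from an application built against the "vendor B" header
would be read as vendor B's. This still requires the vendor tags
themselves to be handed out by someone, so it only shrinks the
namespace problem rather than removing it.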
>>And what about the different calling-conventions in Fortran?
>
> The calling conventions differences (in Linux) revolve around the
> f2c-abi issue, and it so happens that no MPI routines trip on this
> issue, as it only affects functions that return REAL*4 or COMPLEX
> types. Did I miss a function that has those return types?
I was not thinking of this, but rather of issues like "string as an
argument", since the way the string length is passed is not standardized.
Then there are issues with getting access to global variables from COMMON
blocks etc., which are hard (if possible at all) to solve with one shared
object file for multiple compilers. We currently need to link a small
extra object file depending on the compiler used.
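As an illustration of the string problem, here is a hedged sketch in C of
the same Fortran-callable routine written for two conventions: one where
the hidden string length is appended after all declared arguments, and
one where it directly follows the string's address. The routine names,
the "node01" value and the fstrlen_t typedef are made up for this
example; which convention (and which integer type for the length) a
given compiler uses has to be checked in its documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the hidden length is a plain int. Some compilers use a
 * wider type, which is yet another source of incompatibility. */
typedef int fstrlen_t;

/* Copy a C string into a Fortran CHARACTER buffer:
 * no terminating NUL, padded with blanks, truncated if too long. */
static void c_to_fortran_string(const char *src, char *dst, size_t dst_len)
{
    size_t n = strlen(src);
    if (n > dst_len)
        n = dst_len;
    memcpy(dst, src, n);
    memset(dst + n, ' ', dst_len - n);
}

/* Stand-in for whatever the library would really return. */
static const char *demo_name(void)
{
    return "node01";
}

/* Convention 1: hidden length appended after all declared arguments. */
void demo_get_name_len_at_end_(char *name, int *resultlen, int *ierr,
                               fstrlen_t name_len)
{
    assert(name != NULL && resultlen != NULL && ierr != NULL);
    assert(name_len >= 0);
    const char *s = demo_name();
    size_t n = strlen(s);
    c_to_fortran_string(s, name, (size_t)name_len);
    /* Report how many characters of the name actually fit. */
    *resultlen = (n < (size_t)name_len) ? (int)n : (int)name_len;
    *ierr = 0;
}

/* Convention 2: hidden length passed right after the string's address.
 * Same work, different argument order, hence a separate entry point. */
void demo_get_name_len_after_str_(char *name, fstrlen_t name_len,
                                  int *resultlen, int *ierr)
{
    demo_get_name_len_at_end_(name, resultlen, ierr, name_len);
}
```

From Fortran, both would be reached by the same source line, something
like CALL DEMO_GET_NAME(NAME, RESULTLEN, IERR); only the compiler decides
where the length of NAME ends up in the argument list. A single shared
library therefore needs either one entry point per convention (plus
matching symbol-name mangling) or a small compiler-specific glue object.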
This does not mean that we should stop thinking about an ABI,
but there's more to it than unifying mpi.h if we want to be able to use
a single shared library.
Joachim
--
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de