[Beowulf] [OT] MPI-haters

C Bergström cbergstrom at pathscale.com
Sun Mar 6 16:15:32 PST 2016

On Mon, Mar 7, 2016 at 5:58 AM, Justin Y. Shi <shi at temple.edu> wrote:
> Peter:
> Thanks for the questions.
> It has been theoretically proved that it is impossible to implement
> reliable communication in the face of [either sender or receiver]
> crashes.  Therefore, any parallel or distributed computing API that
> forces the runtime system to generate fixed program-processor
> assignments is theoretically incorrect. This answer is also related to
> your second question: the impossibility means 100% reliable
> communication is impossible.
> Ironically, 100% reliable packet transmission is theoretically and
> practically possible as proved by John and Nancy for John's dissertation.
> These two seemingly conflicting results are in fact complementary. They
> basically say that distributed and parallel application programming cannot
> rely on the reliable packet transmission as all of our current distributed
> and parallel programming APIs assume.
> Thus, MPI cannot be cost-effective in proportion to reliability, because
> of the impossibility. The same applies to all other APIs that allow direct
> program-program communications. We have found that the <key, value> APIs are
> the only exceptions for they allow the runtime system to generate dynamic
> program-device bindings, such as Hadoop and Spark. To solve the problem
> completely, the application programming logic must include the correct
> retransmission discipline. I call this Statistic Multiplexed Computing or
> SMC. The Hadoop and Spark implementations did not go this far. If we do
> complete the paradigm shift, then there will be no single point of failure
> regardless of how the application scales. This claim covers all computing and
> communication devices. This is the ultimate extreme scale computing
> paradigm.
> These answers are rooted in the statistic multiplexing protocol research
> (packet switching). It has been proven in theory and practice that 100%
> reliable and scalable communications are indeed possible. Since all HPC
> applications must deploy a large number of computing units via some sort of
> interconnect (HP's The Machine may be the only exception), the only correct
> APIs for extreme scale HPC are the ones that allow for complete
> program-processor decoupling at runtime. Even the HP machine will benefit
> from this research. Please note that the 100% reliability is conditioned by
> the availability of the "minimal viable set of resources". In computing and
> communication, the minimal set size is 1 for every critical path.
> My critics argued that there is no way a statistic multiplexed computing
> runtime can compete against bare metal programs, such as MPI. We have
> evidence to prove the opposite. In fact, the SMC runtime allows dynamic
> adjustments of processing granularity without reprogramming. We can
> show faster performance not only with heterogeneous processors but
> also with homogeneous processors. We see this capability as critical for
> extracting efficiency out of HPC clouds.

I always get lost in the fancy words of research papers - is the
source to any of this open? How could a normal guy like me
reproduce or independently verify your results?

I'm not sure at what level you're talking sometimes - at the network
level we have things like TCP (instead of UDP) when it comes to
packet level reliability - at the data level you have ACID compliant
databases for storage. There is a lot of technology on the "web" side
which is commonly used and required since the "internet" is
inherently unstable/unreliable. In my mind "exascale" machines will
need to be programmed with a more open view of what is or isn't.
Should we continue to model everything around the communication, or
switch focus more to resolving data dependencies and locality?
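
To make the question concrete, here is a minimal sketch of how I read
the quoted idea: tasks are <key, value> pairs, no task is bound to a
particular worker, and a task whose worker dies is simply handed out
again. This is my own toy illustration, not the SMC runtime and not how
Hadoop or Spark do it. Every name in it (make_worker, run,
WorkerCrashed, fail_after) is made up for the example, and the "crash"
is simulated with an exception where a real system would see a timeout.

```python
from collections import deque


class WorkerCrashed(Exception):
    """Stands in for a timeout: the master never hears back."""


def make_worker(name, fail_after=None):
    """Return a hypothetical worker that 'crashes' after fail_after tasks."""
    state = {"done": 0}

    def worker(key, value):
        if fail_after is not None and state["done"] >= fail_after:
            raise WorkerCrashed(name)
        state["done"] += 1
        return key, value * value  # the "computation" is just squaring

    return worker


def run(tasks, workers):
    """Process every <key, value> task with retry on worker failure.

    A failed task goes back into the pending queue and is given to
    whichever worker is next; the failed worker is dropped.
    """
    pending = deque(tasks.items())
    alive = deque(workers)
    results = {}
    while pending:
        if not alive:
            raise RuntimeError("no workers left to run the remaining tasks")
        key, value = pending.popleft()
        worker = alive.popleft()
        try:
            k, r = worker(key, value)
        except WorkerCrashed:
            pending.append((key, value))  # re-issue the task
            continue                      # dead worker is not re-queued
        results[k] = r
        alive.append(worker)
    return results


if __name__ == "__main__":
    tasks = {i: i for i in range(6)}
    workers = [make_worker("w0", fail_after=1), make_worker("w1")]
    print(run(tasks, workers))
```

In this toy, all six tasks complete as long as one worker survives,
which is how I understand the "minimal set size is 1" remark. Whether
that carries over to tightly coupled codes, where the data dependencies
and locality matter, is exactly what I am asking.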
