[Beowulf] [OT] MPI-haters

Justin Y. Shi shi at temple.edu
Sun Mar 6 20:06:06 PST 2016

Sorry for the fancy words... for my lack of analogies.

Data processing in HPC requires solving the CAP puzzle (or CAP theorem).
You might be interested in my SC15 workshop talk that addressed both
compute and data intensive HPC:

An older version of the source is open at this link:

This version has a dynamic daemon generation technique suitable for running
smaller experiments in traditional HPC clusters and HPC clouds.

A new version is still under development. I am looking for serious
collaborators for both legacy HPC code re-engineering and data intensive
HPC challenges. We are looking at the new TACC challenges. If you are
interested, we can talk.


On Sun, Mar 6, 2016 at 7:15 PM, C Bergström <cbergstrom at pathscale.com> wrote:

> On Mon, Mar 7, 2016 at 5:58 AM, Justin Y. Shi <shi at temple.edu> wrote:
> > Peter:
> >
> > Thanks for the questions.
> >
> > It has been proved theoretically that it is impossible to implement
> > reliable communication in the face of [either sender or receiver]
> > crashes. Therefore, any parallel or distributed computing API that
> > forces the runtime system to generate fixed program-processor
> > assignments is theoretically incorrect. This answer is also related to
> > your second question: the impossibility means 100% reliable
> > communication is impossible.
> >
> > Ironically, 100% reliable packet transmission is theoretically and
> > practically possible, as proved by John and Nancy for John's
> > dissertation. These two seemingly conflicting results are in fact
> > complementary. They basically say that distributed and parallel
> > application programming cannot rely on reliable packet transmission in
> > the way that all of our current distributed and parallel programming
> > APIs assume.
> >
> > Thus, MPI cannot be cost-effective in proportion to reliability,
> > because of the impossibility. The same applies to all other APIs that
> > allow direct program-program communications. We have found that the
> > <key, value> APIs, such as those of Hadoop and Spark, are the only
> > exceptions, for they allow the runtime system to generate dynamic
> > program-device bindings. To solve the problem completely, the
> > application programming logic must include the correct retransmission
> > discipline. I call this Statistic Multiplexed Computing or SMC. The
> > Hadoop and Spark implementations did not go this far. If we do complete
> > the paradigm shift, then there will be no single point of failure
> > regardless of how the application scales. This claim covers all
> > computing and communication devices. This is the ultimate extreme scale
> > computing paradigm.
> >
> > These answers are rooted in the statistic multiplexing protocol
> > research (packet switching). It has been proven in theory and practice
> > that 100% reliable and scalable communications are indeed possible.
> > Since all HPC applications must deploy a large number of computing
> > units via some sort of interconnect (HP's The Machine may be the only
> > exception), the only correct APIs for extreme scale HPC are the ones
> > that allow for complete program-processor decoupling at runtime. Even
> > the HP machine will benefit from this research. Please note that the
> > 100% reliability is conditional on the availability of the "minimal
> > viable set of resources". In computing and communication, the minimal
> > set size is 1 for every critical path.
> >
> > My critics argued that there is no way a statistic multiplexed
> > computing runtime can compete against bare metal programs, such as MPI.
> > We have evidence to prove the opposite. In fact, the SMC runtime allows
> > dynamic adjustments of processing granularity without reprogramming. We
> > can demonstrate faster performance not only with heterogeneous
> > processors but also with homogeneous processors. We see this capability
> > as critical for extracting efficiency out of HPC clouds.
>
> I always get lost in the fancy words of research papers. Is the
> source to any of this open? How could just a normal guy like me
> reproduce or independently verify your results?
>
> I'm not sure at what level you're talking sometimes. At the network
> level we have things like TCP (instead of UDP) to ensure packet-level
> reliability; at the data level you have ACID-compliant databases for
> storage. There is a lot of technology on the "web" side that is
> commonly used and required, since the "internet" is inherently
> unstable/unreliable. In my mind, "exascale" machines will need to be
> programmed with a more open view of what is or isn't. Should we
> continue to model everything around the communication, or switch
> focus to resolving data dependencies and locality?
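
The quoted argument about <key, value> APIs and a "retransmission discipline" may be easier to follow with a toy example. The Python below is only a sketch under stated assumptions: it is not the SMC runtime, nor Hadoop or Spark code. The names `TaskPool` and `run`, the logical clock, the lease length, and the crash model are all hypothetical, chosen for illustration. Tasks are identified by key rather than bound to a particular worker, and a task whose lease expires is simply handed out again, so the job finishes as long as at least one worker survives.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TaskPool:
    """Hypothetical <key, value> task pool with lease-based re-issue."""
    lease_ticks: int
    pending: Dict[str, int] = field(default_factory=dict)             # key -> input
    leased: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # key -> (input, deadline)
    results: Dict[str, int] = field(default_factory=dict)             # key -> result

    def put(self, key: str, value: int) -> None:
        self.pending[key] = value

    def take(self, now: int) -> Optional[Tuple[str, int]]:
        # "Retransmit": any task whose lease has expired goes back to pending,
        # because its worker is presumed lost.
        for key, (value, deadline) in list(self.leased.items()):
            if deadline <= now:
                del self.leased[key]
                self.pending[key] = value
        if not self.pending:
            return None
        key = next(iter(self.pending))
        value = self.pending.pop(key)
        self.leased[key] = (value, now + self.lease_ticks)
        return key, value

    def complete(self, key: str, result: int) -> None:
        # Idempotent: the first result wins, late duplicates are ignored.
        self.leased.pop(key, None)
        self.pending.pop(key, None)
        self.results.setdefault(key, result)

    def done(self) -> bool:
        return not self.pending and not self.leased


def run(num_tasks: int = 6, lease_ticks: int = 2, crash_after: int = 2) -> Dict[str, int]:
    """Simulate one flaky and one steady worker squaring integers."""
    pool = TaskPool(lease_ticks=lease_ticks)
    for i in range(num_tasks):
        pool.put(f"task-{i}", i)

    flaky_alive, flaky_taken, now = True, 0, 0
    while not pool.done():
        if flaky_alive:
            item = pool.take(now)
            if item is not None:
                flaky_taken += 1
                if flaky_taken >= crash_after:
                    flaky_alive = False  # crashes while still holding the lease
                else:
                    pool.complete(item[0], item[1] * item[1])
        item = pool.take(now)  # the steady worker never crashes
        if item is not None:
            pool.complete(item[0], item[1] * item[1])
        now += 1  # advance the logical clock by one tick
    return pool.results


if __name__ == "__main__":
    print(run())
```

In this sketch no task is ever addressed to a specific worker, which is the decoupling the quoted message argues for; the cost is that a task may be executed more than once, so the work has to be safe to repeat.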