[Beowulf] [OT] MPI-haters

Peter St. John peter.st.john at gmail.com
Sun Mar 6 14:02:26 PST 2016


Justin,
I'm unsure just what you mean by some of what you said.

"Any fixed program-processor binding is a single point failure"
I'm troubled by the word "any". What about running two copies of a program,
each with its own copy of the same data, on two processors (e.g. on a
Tandem machine)? Surely that is not a single point of failure, yet is it
not a "fixed program-processor binding"?
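
To put a number on that counterexample, here is a minimal sketch. It assumes
each processor fails independently with probability q during a run; that
assumption, the function name and the sample values are mine, for
illustration only.

```python
def all_replicas_fail(q: float, replicas: int) -> float:
    """Probability that every replica fails, assuming independent failures."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a probability in [0, 1]")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return q ** replicas


# One fixed binding versus two fixed bindings running the same program.
single = all_replicas_fail(0.01, 1)
pair = all_replicas_fail(0.01, 2)
```

Under the independence assumption the pair only fails when both copies fail,
so each binding is fixed but neither one is a single point of failure.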

"... it is impossible to implement reliable communication in the face of..."
If by "reliable" you mean "perfectly reliable" then the thesis is trivial
and does not require proof. Reliability is a measurable quantity with
costs; the cost is space (e.g. for error-correcting codes) or time (e.g.
for re-transmissions) or whatever. Do you mean that MPI is not
cost-effective for the reliability it delivers? If so, why?
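
As a rough sketch of the time cost, assume each transmission is lost
independently with probability p_loss (my assumption, as are the names
below). The chance that k attempts all fail is then p_loss**k, so any
nonzero failure target costs a finite number of attempts, while a target of
exactly zero cannot be met at any cost.

```python
def attempts_needed(p_loss: float, target_failure: float) -> int:
    """Smallest k such that p_loss**k <= target_failure.

    Assumes each attempt is lost independently with probability p_loss.
    """
    if not 0.0 < p_loss < 1.0:
        raise ValueError("p_loss must be strictly between 0 and 1")
    if target_failure <= 0.0:
        # "Perfectly reliable" would need infinitely many attempts.
        raise ValueError("target_failure must be greater than 0")
    k = 1
    residual = p_loss  # probability that all k attempts so far were lost
    while residual > target_failure:
        residual *= p_loss
        k += 1
    return k


# With a 50% loss rate, a residual failure probability of at most 1%
# takes 7 attempts, since 0.5**7 = 0.0078125.
k = attempts_needed(0.5, 0.01)
```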

Thanks,
Peter

On Sun, Mar 6, 2016 at 11:10 AM, Justin Y. Shi <shi at temple.edu> wrote:

> Actually, my interest in your group is not so much about "hate" or "love"
> of MPI or any other API. I am more interested in the "correctness" of
> parallel APIs.
>
> Three decades ago, not doing "bare metal" computing was impossible for
> effective parallel processing. Today, insisting on "bare metal" computing
> is detrimental to extreme scale efforts.
>
> Any fixed program-processor binding is a single point failure. The problem
> only shows when the application scales. And it is impossible to implement
> reliable communication in the face of crashes [Alan Fekete, Nancy Lynch and
> John Spinelli's '93 JACM paper proved this theoretically]. Therefore, any
> direct program-to-program communication API is theoretically incorrect for
> extreme-scale applications.
>
> The <key, value> pair API seems to be the only theoretically correct
> parallel programming API that can take us out of the abyss of
> impossibilities. However, systems like Hadoop and Spark have only shown
> the great promise of program-device decoupling; they were not really
> designed for tackling HPC applications, and their runtime implementations
> leave the decoupling incomplete.
>
> I proposed a Statistic Multiplexed Computing idea leveraging the successes
> of <key, value> API systems and old Tuple Space semantics. My GitHub
> contribution is called Synergy3.0+. You are welcome to check it out and do
> a "bare metal" comparison against MPI and any other.
>
> Our latest development is AnkaCom, which was designed to tackle
> data-intensive HPC without scaling limits.
>
> My apologies in advance for my shameless self-advertising.  I am looking
> for serious collaborators who are interested in breaking this decade-old
> barrier.
>
> Justin Y. Shi
> shi at temple.edu
> SERC 310
> Temple University
> +1(215)204-6437
>
>
>
> On Fri, Mar 4, 2016 at 10:14 AM, C Bergström <cbergstrom at pathscale.com>
> wrote:
>
>> A few people have subscribed and it's great to see some interest -
>> hopefully we can start some interesting discussions. Actually - my
>> background is more on the "web" side of HPC. I took a big jump when I
>> started working @pathscale - Over the past 6 years I've cringed more
>> than once when I've seen designs that look ***worse*** (I didn't think
>> it possible) than Hibernate with tons of outer joins and evil XML
>> configs.. (Java references for anyone unfortunate enough to get what
>> I'm saying)
>>
>>
>>
>>
>> On Fri, Mar 4, 2016 at 10:05 PM, Justin Y. Shi <shi at temple.edu> wrote:
>> > Thank you for creating the list. I have subscribed.
>> >
>> > Justin
>> >
>> > On Fri, Mar 4, 2016 at 5:43 AM, C Bergström <cbergstrom at pathscale.com>
>> > wrote:
>> >>
>> >> Sorry for the shameless self indulgence, but there seems to be a
>> >> growing trend of love/hate around MPI. I'll leave my opinions aside,
>> >> but at the same time I'd love to connect and host a list where others
>> >> who are passionate about scalability can vent and openly discuss ideas.
>> >>
>> >> Despite the comical name, I've created the mpi-haters mailing list
>> >>
>> >> http://lists.pathscale.com/mailman/listinfo/mpi-haters_lists.pathscale.com
>> >>
>> >> To start things off - Some of the ideas I've been privately bouncing
>> >> around
>> >>
>> >> Can current directive-based approaches (OMP/ACC) be extended to scale
>> >> out? (I've seen some research out of Japan on this or similar)
>> >>
>> >> Is Chapel's C-like syntax similar enough to implement easily in clang?
>> >>
>> >> Can one low-level library succeed at creating a clean interface across
>> >> all popular industry interconnects? (libfabric vs UCX)
>> >>
>> >> Real-world success or failure of "exascale" runtimes? (What's your
>> >> experience - let's not pull any punches)
>> >>
>> >> I won't claim to have seen ridiculous scalability in most web
>> >> applications I've worked on, but they had so many tools available. Why
>> >> have I never heard of memcache being used in a supercomputer, and/or
>> >> why isn't sharding ever mentioned?
>> >>
>> >> Everyone is welcome and let's keep it positive and fun - invite your
>> >> friends
>> >>
>> >>
>> >> ./C
>> >>
>> >> ps - Apologies if you get this message more than once.
>> >> _______________________________________________
>> >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> >> To change your subscription (digest mode or unsubscribe) visit
>> >> http://www.beowulf.org/mailman/listinfo/beowulf
>> >
>> >
>>
>
>