[Beowulf] Re: Spark, Julia, OpenMPI etc. - all in one place

Oddo Da oddodaoddo at gmail.com
Wed Oct 14 08:39:22 PDT 2020


On Wed, Oct 14, 2020 at 11:32 AM Douglas Eadline <deadline at eadline.org>
wrote:

>
> IMO, both Hadoop and Spark did not use MPI because they had
> a highly defined algorithm with specific performance goals.
> Many MR jobs, like those with Hadoop are dynamic, requiring a
> varied resource load over the course of their lifetime.
> (Mapping uses a lot of resources, Reducing usually uses much less)
>
> Thus, the Hadoop scheduler, YARN, can dynamically reduce or
> increase the resources assigned to a running job. MPI does not
> provide such a dynamic resource allocation.
> Basically, MPI did not address their project goals.
> The authors were certainly aware of MPI (I worked with
> some of them on a book about YARN)
>

Doug, I agree. Just to clarify, I did not ask why Spark or Hadoop did not
start with MPI, but why the whole data science/ML/AI world did not look at
MPI first and try to use it as the underlying mechanism (you answered that
as well). I have nothing against MPI. If you look at the world of DS/ML/AI,
you also have things like Akka, which is basically message passing but
with the added bonus of being usable in "actor" settings which can be
persisted through time (think of models that just use new information to
add onto existing knowledge derived from previous information). Akka also
gets the benefits of Scala: strong typing, correctness, lazy evaluation,
reasoning about code, etc. Of course, something like Akka would never be
applicable to the traditional HPC world, since we still live in the
timeline and setting dictated by the 1970s (1960s?) concept of a job.
Nothing wrong with the job concept either (Spark/Hadoop also live in that
context), I just thought I'd mention it.
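
To make the "actor" idea a bit more concrete, here is a rough sketch in
plain Python. It is not Akka and does not use Akka's API; the class and
method names (RunningMeanActor, tell, stop) are made up for illustration.
It only shows the pattern: state that is private to the actor, changed
solely by messages from a mailbox, and updated incrementally as new
information arrives. The state here lives in memory only; real
persistence would need to be added on top.

```python
import queue
import threading


class RunningMeanActor:
    """Toy actor: private state, changed only by messages from its mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self.count = 0    # observations seen so far
        self.mean = 0.0   # current estimate (the "existing knowledge")
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def tell(self, value):
        """Fire-and-forget send; callers never touch the state directly."""
        self._mailbox.put(value)

    def stop(self):
        """Queue a stop marker and wait until earlier messages are handled."""
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        while True:
            value = self._mailbox.get()
            if value is None:
                break
            # Incremental update: fold the new observation into the old mean.
            self.count += 1
            self.mean += (value - self.mean) / self.count


actor = RunningMeanActor()
for x in (2.0, 4.0, 6.0):
    actor.tell(x)
actor.stop()
print(actor.count, actor.mean)
```

Unlike a batch job, nothing here has a fixed end: the actor simply keeps
folding in whatever arrives until someone tells it to stop.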