[Beowulf] Re: Spark, Julia, OpenMPI etc. - all in one place

Oddo Da oddodaoddo at gmail.com
Wed Oct 14 16:12:43 PDT 2020


Michael, thank you, you have given me quite a lot to think about.

On Wed, Oct 14, 2020 at 2:28 PM Michael Di Domenico <mdidomenico4 at gmail.com>
wrote:

> On Wed, Oct 14, 2020 at 2:07 PM Oddo Da <oddodaoddo at gmail.com> wrote:
> >
> > You stated that Spark/Hadoop approach can code for everything that MPI
> > can code for and vice versa. If this is all true and it is that easy,
> > nobody would have "invented" them since we already had MPI/C/C++ to solve
> > all our problems ;-).
>
> i'm not sure i meant it as pointedly as you have stated it here.  this is
> the difference between whether something can be done and whether it
> should be.  yes, you can solve dense linear algebra on a Spark cluster,
> but you shouldn't
>
> > I disagree. I think yes, there is old code that does not churn but there
> > are always new people/grad students coming into the field. They too are
> > being pointed in the same direction of how to do things, which is what we
> > are discussing here ;-)
>
> I'm not sure I agree.  I interact with a LOT of post-docs, many have
> no idea what MPI is, let alone how to use it.  but i'm not entirely in
> academia so i can't say that for certain
>
> > It seems that in your world nothing new ever gets written? You are
> > talking only about re-writes ;).
>
> not entirely.  you're making my point a little more pointed than i
> intended.  but if you look at the big traditional heavy hpc code, i
> think you'll find "re-writes" are uncommon.  but if you parallel the
> "cloud" world, re-writing the entire code base of some module because
> it's tuesday happens more often than it should
>
> > This is probably true. What is the rest of the 80% of the load in your
> > HPC world?
>
> we run the gamut of stuff, everything from ML frameworks to user code
> in C/python/etc to stuff like magma and matlab
>
> > Programming languages are a part of it and I have said this before -
> > languages like Julia can incorporate MPI as an underlying (or one of
> > underlying) mechanisms/libraries to distribute computation. I have nothing
> > against MPI (as I have stated before). I have something - curiosity - about
> > what is holding a field in a certain state. Spark is a framework but I
> > think it is much more than MPI, by the way - as it is both a way to
> > distribute computation, but there is also lazy evaluation, resilient
> > datasets, Scala, functional programming etc.
>
> but see you're comparing three entirely different things, Spark =
> framework, Scala = language, MPI = library.  If you wanted to compare
> Spark to HPC, there's probably a parallel application but i can't
> think of one off the top of my head.
>
> i think the stuck state you're interpreting is a misrepresentation
> that HPC is full of stodgy greybeards who only want to run MPI code
> written in 1970's fortran.  i don't think that's the case anymore.
> HPC has branched out and includes a lot of ancillary paths, but it
> still holds onto its heritage, which is something I appreciate.  HPC
> has never been about flash, it's about solving the world's hardest
> problems.  You don't always need a porsche, sometimes a yugo works
> just as well
>