[Beowulf] Re: Spark, Julia, OpenMPI etc. - all in one place

Douglas Eadline deadline at eadline.org
Tue Oct 13 14:03:16 PDT 2020


> On Tue, Oct 13, 2020 at 3:54 PM Douglas Eadline <deadline at eadline.org>
> wrote:
>
>>
>> It really depends on what you need to do with Hadoop or Spark.
>> IMO many organizations don't have enough data to justify
>> standing up a 16-24 node cluster system with a PB of HDFS.
>>
>
> Excellent. If I understand what you are saying, there is simply no demand
> to mix technologies, esp. in the academic world. OK. In your opinion and
> independent of Spark/HDFS discussion, why are we still only on openMPI in
> the world of writing distributed code on HPC clusters? Why is there
> nothing
> else gaining any significant traction? No innovation in exposing higher
> level abstractions and hiding the details and making it easier to write
> correct code that is easier to reason about and does not burden the writer
> with too much of a low level detail. Is it just the amount of investment
> in
> an existing knowledge base? Is it that there is nothing out there to
> compel
> people to spend the time on it to learn it? Or is there nothing there? Or
> maybe there is and I am just blissfully unaware? :)
>


I have been involved in HPC and parallel computing since the 1980s.
Prior to MPI, every vendor had its own message-passing library.
PVM (Parallel Virtual Machine) from Oak Ridge was initially developed
so there would be some standard API for creating parallel codes.
It worked well, but more was needed. MPI was developed so parallel
hardware vendors (not many back then) could standardize on a
messaging framework for HPC. Since then, not a lot has moved the
needle forward.

Of course there are things like OpenMP, but these are not distributed
tools.

Another issue is the difference between "concurrent code" and
parallel execution. Not everything that is concurrent needs
to be executed in parallel, and indeed, depending on
the hardware environment you are targeting, these decisions
may change. And it is not something you can figure out by
looking at the code.
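To make that distinction concrete, here is a toy Python sketch (my own
illustration, not from the discussion above): the tasks below are
independent and therefore concurrent, but whether they actually execute
in parallel is a choice made by the executor and the hardware, not by
the code that defines the tasks.

```python
# Concurrent structure vs. parallel execution: the same set of
# independent tasks can be run serially or handed to an executor.
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # An independent task; safe to run in any order or in parallel.
    return x * x

inputs = list(range(8))

# Serial execution of the concurrent tasks.
serial = [work(x) for x in inputs]

# The very same tasks handed to a thread pool; on some runtimes and
# hardware this runs in parallel, on others it is effectively serial.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(work, inputs))

assert serial == threaded  # same answer either way
```

The point is that nothing in `work()` itself says how it should be
scheduled; that decision lives outside the code, which is exactly why
it is hard to recover by reading the source.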
Parallel computing is a hard problem, and no one has
really come up with a general-purpose way to write software for it.
MPI works; however, I still consider it a "parallel machine code"
that requires some careful programming.

The good news is that most of the popular HPC applications
have been ported and will run using MPI (as well as their algorithms
allow). So from an end-user perspective, almost everything
works. Of course, more applications could be ported
to MPI, but it all depends. Maybe end users can get enough
performance with a CUDA version and some GPUs, or an
OpenMP version on a 64-core server.

Thus the incentive is not really there. There is no huge financial
push behind HPC software tools like there is with data analytics.

Personally, I like Julia and believe it is the best new language
to enter technical computing. One of the issues it addresses is
the "two language problem." The first cut of something is often written
in Python; then, if it gets to production and is slow and does
not have an easy parallel pathway (local multi-core or distributed),
the code is rewritten in C/C++ or Fortran with MPI, CUDA, or OpenMP.

Julia is fast out of the box and provides a growth path for
parallelism: one version, with no need to rewrite. Plus,
it has something called "multiple dispatch" that provides
unprecedented code flexibility and portability. (Too long a
discussion for this email.) Basically, it keeps the end user closer
to their "problem" and further away from the hardware minutiae.
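For readers who have not met the idea: multiple dispatch selects a
method based on the runtime types of all of a function's arguments, not
just the first (as single-dispatch object orientation does). Julia
provides this natively and far more generally; the following is only a
hand-rolled Python toy of mine to show the mechanism, with illustrative
names that belong to no real library.

```python
# Toy multiple dispatch: a table maps the tuple of argument types
# to the implementation to call.
dispatch_table = {}

def register(*types):
    # Decorator that records an implementation for a type signature.
    def wrap(fn):
        dispatch_table[types] = fn
        return fn
    return wrap

def combine(a, b):
    # Pick the method based on the types of BOTH arguments.
    fn = dispatch_table[(type(a), type(b))]
    return fn(a, b)

@register(int, int)
def _combine_ints(a, b):
    return a + b

@register(str, str)
def _combine_strs(a, b):
    return a + " " + b

print(combine(2, 3))            # 5
print(combine("fast", "code"))  # fast code
```

In Julia the compiler specializes and optimizes each such method for
the concrete types involved, which is part of why generic code there
stays fast; this sketch shows only the dispatch idea, not that
performance story.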

That is enough for now. I'm sure others have opinions worth
hearing.


--
Doug



> Thanks!
>

