[Beowulf] [External] Spark, Julia, OpenMPI etc. - all in one place

Michael Di Domenico mdidomenico4 at gmail.com
Tue Oct 13 06:10:01 PDT 2020


On Tue, Oct 13, 2020 at 8:39 AM Oddo Da <oddodaoddo at gmail.com> wrote:
>
> Michael, thank you for the insight. I think Hadoop in general is mostly dying, Spark is really the derivative that took off. Basically, what you are saying is that there is no demand on your infra for this kind of work. Do you have any insights as to why not? Do the AI/DS/ML guys just know that they cannot use your resources to run standard loads and go straight to the cloud or local ethernet clusters?

Part of the reason it didn't take off is that we're just not a
bigdata shop, and doing math inside the hadoop world was hard.  some of
what we do does revolve around parsing through large swaths of data,
but once that 'grep' was done the users wanted to do some complex
math on the data, and either hadoop/java didn't have the right
libraries or people had to learn java, which they weren't willing to
do.  the abstraction languages like pig (and others whose names i
forget) made things a little easier, but overall it was just too
complicated.  and frankly i think this is exactly what you're seeing.
outside of 'industry' aka 'internet world' the 'hadoop architecture'
really doesn't have much utility, and mpi and its ilk really are
better suited.  whether julia can/should displace traditional C/mpi,
who knows.

> In your estimate, how many of your users write code in Julia vs MPI vs Python?

it varies.  mostly it depends on the person working on the project.  i
try to support everything across the center's entire compute
infrastructure, but we leave it up to the user to figure out how best
to scale their program to the machines we have.  Even though we have
primarily a traditional HPC setup, there's nothing we can't run, from
AI to CUDA to C/Fortran MPI or even just simple python programs.  We
still run 'bigdata' programs, it's just that the users have found
other ways to do it that don't require hadoop.

