[Beowulf] Beowulf Cluster VS Hadoop/Spark

Fri Dec 30 19:38:06 PST 2016

Until an industry has had at least a decade of countries and institutions
spending millions and millions of dollars designing systems to compete for
a spot on a voluntary list based on arbitrary synthetic benchmarks, how can
it possibly be taken seriously?

I do sort of recall the early days of Hadoop, but at the time I thought it
was a cool idea that, sadly, no one I supported was interested in using.
With hindsight it seems like it was a Kodak moment for HPC that was missed
because of our maturity. Get off my lawn, so to speak.

jbh

On Dec 30, 2016 11:24 PM, "Douglas Eadline" <deadline at eadline.org> wrote:

> The "data science" area has some maturing to do which should be exciting
> and fun for all of us :)

I am curious what you mean by "maturing." The problem space is quite
different and the goals are quite different, which necessitates
different designs. Are you aware that the Hadoop project once used
Torque and Maui as the scheduler? They developed their own
because they needed the ability to add and subtract resources
(containers) at run time, and they needed a way to schedule with
data locality in mind.