<div dir="ltr"><div dir="ltr">On Tue, Oct 13, 2020 at 3:54 PM Douglas Eadline <<a href="mailto:deadline@eadline.org">deadline@eadline.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>It really depends on what you need to do with Hadoop or Spark.<br>

IMO many organizations don't have enough data to justify<br>

standing up a 16-24 node cluster system with a PB of HDFS.<br></blockquote><div><br></div><div>Excellent. If I understand you correctly, there is simply no demand to mix technologies, especially in the academic world. OK. In your opinion, and independent of the Spark/HDFS discussion, why are we still only on Open MPI when it comes to writing distributed code for HPC clusters? Why is nothing else gaining significant traction? Is there no innovation in exposing higher-level abstractions that hide the details, making it easier to write correct code that is easier to reason about and does not burden the writer with too much low-level detail? Is it just the amount of investment in the existing knowledge base? Is it that nothing out there is compelling enough for people to spend the time learning it? Or is there simply nothing there? Or maybe there is, and I am just blissfully unaware? :)<br></div><div><br></div><div>Thanks!<br></div></div></div>