<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="mso-fareast-language:EN-US">Hi All,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US">What sets Julia apart is that it is not an ahead-of-time compiled language but a just-in-time (JIT) compiled one. I am still getting into it, but it seems to be geared to complex and large data sets. As mentioned previously,
I am still working with a colleague on this prototype. With Julia there is at least an IDE of sorts: it is based on the Atom editor, with a package installed specifically for Julia.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US">I will keep the list updated on Julia and my experiences with it, but from the little I have seen of the language, it is easy to write code in. It's still in its infancy, as the latest
version, I believe, is 1.0.1.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US">Jonathan<o:p></o:p></span></p>
<p class="MsoNormal"><span style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span lang="EN-US"> Beowulf &lt;beowulf-bounces@beowulf.org&gt;
<b>On Behalf Of </b>Scott Atchley<br>
<b>Sent:</b> 14 March 2019 01:17<br>
<b>To:</b> Douglas Eadline &lt;deadline@eadline.org&gt;<br>
<b>Cc:</b> Beowulf Mailing List &lt;beowulf@beowulf.org&gt;<br>
<b>Subject:</b> Re: [Beowulf] Large amounts of data to store and process<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">I agree with your take about slower progress on the hardware front and that software has to improve. DOE funds several vendors to do research to improve technologies that will hopefully benefit HPC, in particular, as well as the general
market. I am reviewing a vendor's latest report on micro-architectural techniques to improve performance (e.g., lower latency, increase bandwidth). For this study, they use a combination of DOE mini-apps/proxies as well as commercial benchmarks. The techniques
that this vendor investigated showed potential improvements for commercial benchmarks but much less, if any, for the DOE apps, which are highly optimized.<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I will state that I know nothing about Julia, but I assume it is a higher-level language than C/C++ (or Fortran for numerical codes). I am skeptical that a higher-level language (assuming Julia is) can help. I believe the vendor's techniques
that I am reviewing benefited commercial benchmarks because they are less optimized than the DOE apps. Using a high-level language relies on the language's compiler/interpreter and runtime. The developer either has no idea what is happening there or lacks the ability
to improve it if profiling shows that the issue is in the runtime. I believe that if you need more performance, you will have to work for it in a lower-level language and there is no more free lunch (i.e., hoping the latest hardware will do it for me).<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Hope I am wrong.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">On Wed, Mar 13, 2019 at 5:23 PM Douglas Eadline &lt;<a href="mailto:deadline@eadline.org">deadline@eadline.org</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal"><br>
I realize it is bad form to reply to one's own post and<br>
I forgot to mention something.<br>
<br>
Basically the HW performance parade is getting harder<br>
to celebrate. Clock frequencies have been slowly<br>
increasing while cores are multiplying rather quickly.<br>
Single core performance boosts are mostly coming<br>
from accelerators. Add to that the fact that speculation<br>
technology, when managed for security, slows things down.<br>
<br>
What this means is that the focus on software performance<br>
and optimization is going to increase because we can't just<br>
buy new hardware and improve things anymore.<br>
<br>
I believe languages like Julia can help with this situation.<br>
For a while.<br>
<br>
--<br>
Doug<br>
<br>
>> Hi All,<br>
>> Basically I have sat down with my colleague and we have opted to go down<br>
>> the route of Julia with JuliaDB for this project. But here is an<br>
>> interesting thought that I have been pondering: if Julia is an up-and-coming<br>
>> fast language for working with large amounts of data, how will that<br>
>> affect HPC and the way it is currently used and HPC systems created?<br>
><br>
><br>
> First, IMO good choice.<br>
><br>
> Second a short list of actual conversations.<br>
><br>
> 1) "This code is written in Fortran." I have been met with<br>
> puzzled looks when I say the word "Fortran." Then it<br>
> comes, "... ancient language, why not port to modern ..."<br>
> "If you are asking that question, young Padawan, you have<br>
> much to learn, maybe try web pages"<br>
><br>
> 2) I'll just use Python because it works on my Laptop.<br>
> Later, "It will just run faster on a cluster, right?"<br>
> and "My little Python program is now kind-of big and has<br>
> become slow, should I use TensorFlow?"<br>
><br>
> 3) &lt;mcoy&gt;<br>
> "Dammit Jim, I don't want to learn/write Fortran, C, C++ and MPI.<br>
> I'm a (fill in domain specific scientific/technical position)"<br>
> &lt;/mcoy&gt;<br>
><br>
> My reply, "I agree and wish there was a better answer to that question.<br>
> The computing industry has made great strides in HW with<br>
> multi-core, clusters etc. Software tools have always lagged<br>
> hardware. In the case of HPC it is a slow process and<br>
> in HPC the whole programming "thing" is not as "easy" as<br>
> it is in other sectors; warp drives and transporters<br>
> take a little extra effort."<br>
><br>
> 4) Then I suggest Julia, "I invite you to try Julia. It is<br>
> easy to get started, fast, and can grow with your application."<br>
> Then I might say, "In a way it is HPC BASIC, if you are old<br>
> enough you will understand what I mean by that."<br>
><br>
> The question with languages like Julia (or Chapel, etc) is:<br>
><br>
> "How much performance are you willing to give up for convenience?"<br>
><br>
> The goal is to keep the programmer close to the problem at hand<br>
> and away from the nuances of the underlying hardware. Obviously<br>
> the more performance needed, the closer you need to get to the hardware.<br>
> This decision goes beyond software tools, there are all kinds<br>
> of cost/benefits that need to be considered. And, then there<br>
> is IO ...<br>
><br>
> --<br>
> Doug<br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
>> Regards,<br>
>> Jonathan<br>
>> -----Original Message-----<br>
>> From: Beowulf &lt;<a href="mailto:beowulf-bounces@beowulf.org" target="_blank">beowulf-bounces@beowulf.org</a>&gt; On Behalf Of Michael Di Domenico<br>
>> Sent: 04 March 2019 17:39<br>
>> Cc: Beowulf Mailing List &lt;<a href="mailto:beowulf@beowulf.org" target="_blank">beowulf@beowulf.org</a>&gt;<br>
>> Subject: Re: [Beowulf] Large amounts of data to store and process<br>
>> On Mon, Mar 4, 2019 at 8:18 AM Jonathan Aquilina &lt;<a href="mailto:jaquilina@eagleeyet.net" target="_blank">jaquilina@eagleeyet.net</a>&gt; wrote:<br>
>>> As previously mentioned we don't really need to have anything<br>
>>> indexed, so I am thinking flat files are the way to go; my only concern is the<br>
>>> performance of large flat files.<br>
>> potentially, there are many factors in the work flow that ultimately<br>
>> influence the decision as others have pointed out. my flat file example<br>
>> is only one, where we just repeatedly blow through the files.<br>
>>> Isn't that what HDFS is for, to deal with large flat files?<br>
>> large is relative. a 256GB file isn't "large" anymore. i've pushed TB<br>
>> files through hadoop and run the terabyte sort benchmark, and yes it can<br>
>> be done in minutes (time-scale), but you need an astounding amount of<br>
>> hardware to do it (the last benchmark paper i saw, it was something like 1000<br>
>> nodes). you can accomplish the same feat using less and less<br>
>> complicated hardware/software<br>
>> and if your devs are willing to adapt to the hadoop ecosystem, you're sunk<br>
>> right off the dock.<br>
>> to get a more targeted answer from the numerous smart people on the list,<br>
>> you'd need to open up the app and workflow to us. there's just too many<br>
> variables _______________________________________________<br>
>> Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org" target="_blank">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>
> To change your subscription (digest mode or unsubscribe) visit<br>
>> <a href="http://www.beowulf.org/mailman/listinfo/beowulf" target="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</a><br>
><br>
><br>
> --<br>
> Doug<br>
><br>
><br>
><br>
><br>
><br>
<br>
<br>
--<br>
Doug<br>
<br>
<o:p></o:p></p>
</blockquote>
</div>
</div>
</body>
</html>