[Beowulf] HPC workflows

mark somers m.somers at chem.leidenuniv.nl
Wed Nov 28 04:51:05 PST 2018


Well, please be careful in naming things:

http://cloudscaling.com/blog/cloud-computing/grid-cloud-hpc-whats-the-diff/

(Note: the author has only heard of MPI and does not consider SMP-based codes using e.g. OpenMP, but he did understand that
different things are being talked about.)

Now I am all for connecting diverse and flexible workflows to true HPC systems and grids, which feel different if you are not
experienced with them (otherwise, what is the use of a computer if there are no users making use of it?), but do not make the
mistake of thinking everything is cloud, or will become cloud, that fast.

Bear with me for a second:

There are some very fundamental problems when running large-scale parallel programs (e.g. OpenMP) on virtual machines (i.e. most
of the cloud). Google for papers on co-scheduling. All VM specialists I know and have talked with generally state that using
more than 4 cores in a VM is not smart and that one should switch to bare metal at that point. Don't believe it? Google for it, or
just try it yourself by doing a parallel scaling experiment and fitting Amdahl's law through your measurements.
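For the curious, a minimal sketch of such a scaling fit; the (cores, speedup) pairs below are made-up illustration numbers, not real measurements:

```python
# Fit Amdahl's law, S(n) = 1 / ((1 - p) + p/n), to scaling measurements.
# Linearising: 1/S(n) = 1 + p*(1/n - 1), so each point at n > 1 yields an
# estimate p = (1/S - 1) / (1/n - 1); we simply average those estimates.
# The measurements here are hypothetical illustration data.

measurements = {1: 1.0, 2: 1.9, 4: 3.5, 8: 5.8, 16: 8.7, 32: 11.6}

estimates = [(1.0 / s - 1.0) / (1.0 / n - 1.0)
             for n, s in measurements.items() if n > 1]
p = sum(estimates) / len(estimates)

print(f"estimated parallel fraction p ~= {p:.3f}")
print(f"Amdahl speedup ceiling 1/(1-p) ~= {1.0 / (1.0 - p):.1f}")
```

If the fitted ceiling is well below the core count of the VM you are paying for, the extra cores are buying you nothing.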

So one could say bare-metal clouds have arisen mostly because of this, but they too come with expenses. Somehow I find that a
simple rule always seems to apply: if more people in a scheme need to be paid, the scheme is probably more expensive than the
alternatives, if available. Or, stated differently: if you can do things yourself, that is always a cheaper option than having
others do them for you (under normal 'open market' rules, and excluding the option of slavery :)).

Nice read for some background:

http://staff.um.edu.mt/carl.debono/DT_CCE3013_1.pdf

One has to note that in academia one is often in the situation that grants are obtained to buy hardware and that running costs
(i.e. electricity and rack space) are matched by the university, which makes spending the grant money on paying Amazon or
Google to do your 'compute' not so sensible if you can do things yourself. Also, given the ease of deploying an HPC cluster
nowadays with OpenHPC, or something commercial like Qlustar or Bright, one will be hard pressed to justify long-term bare-metal
cloud usage in these settings.
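As a back-of-the-envelope illustration of that reasoning; every price, node count and utilisation figure below is a hypothetical placeholder, not a quote:

```python
# Break-even sketch: grant-funded on-premises cluster vs. rented bare-metal
# cloud nodes. In the academic setting described above, power and rack space
# are matched by the university, so they are left out of the on-prem side.
# All numbers are hypothetical placeholders.

nodes = 16
cluster_capex = 160_000.0        # hypothetical purchase price (EUR) from a grant
lifetime_years = 5
cloud_rate_per_node_hour = 1.50  # hypothetical bare-metal cloud rate (EUR)
utilisation = 0.70               # fraction of hours the nodes actually run jobs

busy_hours = lifetime_years * 365 * 24 * utilisation
on_prem_per_node_hour = cluster_capex / (nodes * busy_hours)

print(f"on-prem ~ {on_prem_per_node_hour:.2f} EUR per node-hour")
print(f"cloud   ~ {cloud_rate_per_node_hour:.2f} EUR per node-hour")
```

With numbers like these the on-premises node-hour comes out several times cheaper; the crossover only appears at very low utilisation, which is exactly the constraint one should state before arguing either way.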

Those were some technical and economic considerations that play a role in things.

There is also another aspect when, for example, dealing with sensitive data you are to be held responsible for. The cloud model is
not so friendly under those circumstances either: again, your data is put "on someone else's computer". Think of the GDPR and
such.

So, back to the point: some 'user-driven' workloads might end up on clouds or on bare-metal on-premises clouds (which seem to be
the latest fad right now), but clearly not everything. Especially if the workloads are not 'user-driven' but technology-driven
(or economically or socially driven), i.e. there is no other way of doing it except using some type of (specialised) technology
(or it is simply not allowed). I am therefore also of the opinion that cloud computing is not true (traditional) HPC, and that
the term HPC has been diluted over the years by commercial interests / marketing speak.

BTW, on a side note / rant: the mathematics we are dealing with here is that of constraints to be met when optimising things. The
constraints actually determine the final optimum (https://en.wikipedia.org/wiki/Lagrange_multiplier), yet people tend to
'ignore' or not specify the constraints in their arguments about what is the best or optimal thing to do. So what I have done
here is give you some examples of constraints (technical, economic and social) in the 'everything will be cloud' rhetoric to
keep an eye on before drawing any conclusions about what the future might bring :).
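For reference, the textbook form of the Lagrange-multiplier condition that remark alludes to: optimising f(x) subject to a constraint g(x) = c.

```latex
% At a constrained optimum x*, the gradients of objective and constraint
% are parallel -- the constraint co-determines the result:
\nabla f(x^{*}) = \lambda \,\nabla g(x^{*}), \qquad g(x^{*}) = c
```

Change the constraint g(x) = c and you generally change the optimum x*, which is the whole point about unstated constraints above.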

just my little opinion though...

Disclaimer; I could be horribly wrong :).

-- 
mark somers
tel: +31715274437
mail: m.somers at chem.leidenuniv.nl
web:  http://theorchem.leidenuniv.nl/people/somers


More information about the Beowulf mailing list