[Beowulf] Bright Cluster Manager

Chris Samuel chris at csamuel.org
Fri May 4 07:43:35 PDT 2018


On Thursday, 3 May 2018 11:04:38 PM AEST Douglas Eadline wrote:

> Here is where I see it going
> 
> 1. Computer nodes with a base minimal generic Linux OS
>    (with PR_SET_NO_NEW_PRIVS in kernel, added in 3.5)

Depends on your containerisation method; some don't need to rely on that, as 
they proactively disarm containers of dangerous abilities (setuid/setgid/
capabilities) before the user gets near them.

That said, even RHEL6 has support for that, so you'd be hard pressed to find an 
up-to-date system that doesn't have that ability.
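
As a rough illustration of checking for that ability on a compute node, here is a minimal Python sketch that asks the kernel for the calling process's no_new_privs flag via prctl(2). The constants are the values from <linux/prctl.h>; the function name is hypothetical, and the sketch assumes a Linux libc reachable through ctypes. A return of None means the query failed, which on an old kernel would suggest the feature is missing (though a sandbox blocking prctl would look the same).

```python
import ctypes
import sys

# Values from <linux/prctl.h>.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39


def no_new_privs_status():
    """Return 0 or 1 (the current no_new_privs flag), or None if it cannot be queried.

    Hypothetical helper for illustration only.
    """
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        # The kernel requires the unused arguments to be zero.
        zero = ctypes.c_ulong(0)
        result = libc.prctl(ctypes.c_int(PR_GET_NO_NEW_PRIVS), zero, zero, zero, zero)
    except (OSError, AttributeError):
        return None
    return result if result in (0, 1) else None


if __name__ == "__main__":
    print("no_new_privs:", no_new_privs_status())
```

Setting the flag would be the same call with PR_SET_NO_NEW_PRIVS and a first argument of 1; it is inherited across fork and execve and cannot be cleared once set.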

> 2. A Scheduler (that supports containers)
> 
> 3. Containers (Singularity mostly)
> 
> All "provisioning" is moved to the container. There will be edge cases of
> course, but applications will be pulled down from
> a container repos and "just run"

This then relies on people building containers that have the right libraries 
for the hardware you are using.  For instance I tried to use some Singularity 
containers on our system for MPI work but can't because the base OS is too old 
to include support for our OmniPath interconnect.

The other issue is that it encourages people to build generic binaries rather 
than optimised binaries to broaden the systems the container can run on and/or 
because they don't have a proprietary compiler (or the distro has a version of 
GCC too old to optimise for the hardware).
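
To make the generic-versus-optimised point concrete, here is a small Python sketch that lists which SIMD extensions a host advertises beyond the x86-64 baseline (which guarantees SSE2 but not AVX). It assumes an x86 Linux node, where /proc/cpuinfo has a "flags" line; the shortlist of extensions and both function names are hypothetical choices for illustration. Anything it reports is hardware capability that a binary built for the generic baseline would not use unless the code does its own runtime dispatch.

```python
from pathlib import Path

# Hypothetical shortlist of extensions beyond the x86-64 baseline.
INTERESTING = ("avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def beyond_baseline(flags):
    """Return the shortlisted extensions that this host advertises."""
    return [name for name in INTERESTING if name in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print(beyond_baseline(parse_cpu_flags(path.read_text())))
```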

I would argue that there is a place for that sort of work, but that it's the 
cloud rather than HPC (as cloud users aren't trying to get the most out of the 
hardware).

I'm conflicted on this because I also have great sympathies for the 
reproducibility side of the coin!

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


