[Beowulf] experience with HPC running on OpenStack

Chris Samuel chris at csamuel.org
Tue Jun 30 22:05:11 PDT 2020


On 29/6/20 5:09 pm, Jörg Saßmannshausen wrote:

> we are currently planning a new cluster and this time around the idea was to
> use OpenStack for the HPC part of the cluster as well.
> 
> I was wondering if somebody has some first hand experiences on the list here.

At $JOB-2 I helped a group set up a cluster on OpenStack (they were 
resource constrained, they had access to OpenStack nodes and that was 
it).  In my experience it was just another added layer of complexity for 
no added benefit and resulted in a number of outages due to failures in 
the OpenStack layers underneath.

Given that Slurm, which was being used there, already had mature cgroups 
support, there really was no advantage to them in having a layer of 
virtualisation on top of the hardware, especially as (if I'm remembering 
properly) in the early days the virtualisation layer didn't properly 
understand the Intel CPUs we had and so didn't expose the correct 
capabilities to the VM.
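
For anyone wanting to check whether a VM is seeing the CPU features they 
expect, something like the following is a minimal sketch and not what was 
used on that cluster. It assumes an x86 Linux guest where /proc/cpuinfo 
has a "flags" line, and the function names and the example flag list are 
purely illustrative.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Assumes the x86 Linux layout, where each processor stanza has a
    line of the form "flags : fpu vme ...". Only the first such line
    is used.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, expected):
    """Return, sorted, the expected flags that the guest does not report."""
    return sorted(set(expected) - parse_cpu_flags(cpuinfo_text))


def missing_flags_on_this_host(expected, path="/proc/cpuinfo"):
    """Convenience wrapper that reads the local cpuinfo file."""
    with open(path) as f:
        return missing_flags(f.read(), expected)
```

Running missing_flags_on_this_host(["avx", "avx2", "fma"]) inside the VM 
and again on the bare metal host, with whatever flags your codes actually 
depend on, would show any features the virtualisation layer is hiding.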

All that said, these days it's likely improved, and I know that back then 
people were thinking about OpenStack "Ironic", which was a way for it to 
manage bare metal nodes.

But I do know the folks in question eventually managed to move to a purely 
bare metal solution and seemed a lot happier for it.

As for IB, I suspect that depends on the capabilities of your 
virtualisation layer, but I do believe that is quite possible. This 
cluster didn't have IB (when they started getting bare metal nodes they 
went RoCE instead).

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

