[Beowulf] cluster deployment and config management

Carsten Aulbert carsten.aulbert at aei.mpg.de
Mon Sep 4 23:57:48 PDT 2017


Hi

On 09/05/17 08:43, Stu Midgley wrote:
> Interesting.  Ansible has come up a few times.
> 
> Our largest cluster is 2000 KNL nodes and we are looking towards 10k...
> so it needs to scale well :)
> 
We went with ansible at the end of 2015 until, after a few months, we
hit a road block caused by it not using a client daemon. With a few 1000
states to perform on each client, the lag for initiating the next state
centrally from the server was quite noticeable - in the end a single run
took more than half an hour without any changes (for a single host!).
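
To put that lag in perspective, here is a rough back-of-the-envelope
sketch. Only "a few 1000 states" and "more than half an hour" come from
the numbers above; the exact count of 2000 states and the helper name
per_state_overhead are assumptions made purely for illustration.

```python
def per_state_overhead(total_seconds: float, n_states: int) -> float:
    """Average wall-clock time spent per state in a run that changes nothing."""
    if n_states <= 0:
        raise ValueError("n_states must be positive")
    return total_seconds / n_states


# Assumed inputs: 2000 states (hypothetical stand-in for "a few 1000")
# and a 30 minute run (the lower bound of "more than half an hour").
overhead = per_state_overhead(30 * 60, 2000)
print(f"{overhead:.2f} s per state")
```

Under those assumptions each state costs a bit under a second even when
nothing changes, which is the order of magnitude one would expect if
every state needs its own round trip from the central server.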

After that we re-evaluated, and the outcome was salt stack, which scales
well enough for our O(2500) clients.

Note, I have not tracked if and how ansible has progressed over the past
~2 years, so it may or may not exhibit the same problems today.

Cheers

Carsten

-- 
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185

