[Beowulf] cluster deployment and config management

Carsten Aulbert carsten.aulbert at aei.mpg.de
Mon Sep 4 23:57:48 PDT 2017


On 09/05/17 08:43, Stu Midgley wrote:
> Interesting.  Ansible has come up a few times.
> Our largest cluster is 2000 KNL nodes and we are looking towards 10k...
> so it needs to scale well :)
We went with ansible at the end of 2015 until, after a few months, we
hit a road block caused by it not using a client daemon. With a few
thousand states to apply on each client, the lag for initiating the next
state centrally from the server was quite noticeable - in the end a
single run took more than half an hour without any changes (for a single
host!).
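
As a rough back-of-envelope illustration of that lag, the sketch below
divides the run time by the number of states. The figures are
assumptions, not measurements: "more than half an hour" is taken as
1800 s, "a few 1000 states" as 3000, and the function name is
hypothetical.

```python
def per_state_lag_seconds(run_seconds: float, num_states: int) -> float:
    """Average wall-clock time spent per state in a no-change run."""
    if num_states <= 0:
        raise ValueError("num_states must be positive")
    return run_seconds / num_states


# Assumed inputs: 30 minutes for one host, 3000 states (hypothetical count).
lag = per_state_lag_seconds(30 * 60, 3000)
print(f"{lag:.2f} s per state")  # prints "0.60 s per state"
```

Under those assumptions each state costs roughly half a second of pure
round-trip overhead, which is small per state but adds up when every
state is initiated from the server.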

After that we re-evaluated, and the outcome was salt stack, which scales
well enough for our O(2500) clients.

Note, I have not tracked if and how ansible has progressed over the past
~2 years, so it may or may not exhibit the same problems today.



Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185
