[Beowulf] cluster deployment and config management
remy.dernat at umontpellier.fr
Tue Sep 5 04:52:50 PDT 2017
On 05/09/2017 at 08:57, Carsten Aulbert wrote:
> On 09/05/17 08:43, Stu Midgley wrote:
>> Interesting. Ansible has come up a few times.
>> Our largest cluster is 2000 KNL nodes and we are looking towards 10k...
>> so it needs to scale well :)
> We went with ansible at the end of 2015 until we hit a road block with
> it not using a client daemon after a few months. When having a few 1000
> states to perform on each client, the lag for initiating the next state
> centrally from the server was quite noticeable - in the end a single run
> took more than half an hour without any changes (for a single host!).
> After that we re-evaluated, with SaltStack being the outcome, scaling
> well enough for our O(2500) clients.
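
To put rough numbers on the lag described above, here is a back-of-the-envelope sketch. It assumes a simple linear model (one server-to-client round trip per state, states dispatched one at a time for a single host), and the figure of 2000 states is only a hypothetical stand-in for "a few 1000 states". Neither the model nor the function names come from the original measurements.

```python
# Back-of-the-envelope model (illustrative assumption, not a measurement):
# in a push model without a client daemon, each state costs one
# server-to-client round trip, so run time grows linearly with state count.


def run_time_seconds(num_states: int, per_state_lag_s: float) -> float:
    """Total wall time for one host when states are dispatched one by one."""
    return num_states * per_state_lag_s


def implied_lag_seconds(total_run_s: float, num_states: int) -> float:
    """Per-state lag implied by an observed total run time."""
    return total_run_s / num_states


if __name__ == "__main__":
    states = 2000      # hypothetical stand-in for "a few 1000 states"
    run_s = 30 * 60    # "more than half an hour", taken here as 30 minutes
    lag = implied_lag_seconds(run_s, states)
    print(f"implied lag per state: {lag:.2f} s")
    print(f"run time at that lag: {run_time_seconds(states, lag) / 60:.0f} min")
```

Under these assumptions, a lag of a bit under one second per state is already enough to produce a half-hour no-change run, which is why a resident client daemon that evaluates states locally makes such a difference.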
+1 for SaltStack here. It really performs very well on large
infrastructure (from doc.
and allows complex rules with reactors and orchestrators (including some
ways to manage post-reboot/connections).
There is also a GitHub project that can deploy a cluster from
scratch with SaltStack, on a CentOS base, with PXE, dhcp, dns,
Personally, I will use (it works, but it needs some additional tests)
SaltStack with FAI ( https://fai-project.org/ ) to deploy my nodes. Or
maybe I will switch to banquise, but for now this project is still a
bit too young and I need a Debian base OS (but I know it is planned;
waiting for the preseed config management through Salt). I am using
gitfs as a SaltStack backend and I also keep some config files in
another git repository (e.g. environment modules files).
> Note, I have not tracked if and how ansible progressed over the past
> ~2yrs which may or may not exhibit the same problems today.