[Beowulf] cluster deployment and config management
sdm900 at gmail.com
Mon Sep 4 23:43:56 PDT 2017
Interesting. Ansible has come up a few times.
Our largest cluster is 2000 KNL nodes and we are looking towards 10k... so
it needs to scale well :)
On Tue, Sep 5, 2017 at 1:46 PM, Lachlan Musicman <datakid at gmail.com> wrote:
> On 5 September 2017 at 15:24, Stu Midgley <sdm900 at gmail.com> wrote:
>> Morning everyone
>> I am in the process of redeveloping our cluster deployment and config
>> management environment and wondered what others are doing?
>> First, everything we currently have is basically home-grown.
>> Our cluster deployment is a system that I've developed over the years and
>> is pretty simple - if you know BASH and how PXE booting works. It has
>> everything from setting the correct parameters in the BIOS, ZFS RAM disks
>> for the OS, Lustre for state files (usually in /var) - all in the initrd.
>> We use it to boot cluster nodes, lustre servers, misc servers and
>> We basically treat everything like a cluster.
>> However... we do have a proliferation of images... and all need to be
>> kept up-to-date and managed. Most of the changes from one image to the
>> next are config files.
>> We don't have good config management (which might, hopefully, reduce
>> the number of images we need). We tried Puppet, but it seems everyone
>> hates it. Is it too complicated? Not the right tool?
>> I was thinking of using git for config files, dumping a list of RPMs,
>> dumping the active services from systemd and somehow munging all that
>> together in the initrd, i.e. git checkout the server to get config files
>> and systemctl enable/start the appropriate services etc.
>> It started to get complicated.
>> Any feedback/experiences appreciated. What works well? What doesn't?
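The git + RPM list + systemd services idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual system: the per-role manifest files (`<role>.rpms`, `<role>.services`), the one-branch-per-role git layout, the `/tmp/config` scratch path and the function names are all hypothetical. The sketch only builds the list of commands that an initrd script might run, so it can be inspected without touching a node.

```python
from pathlib import Path


def read_list(path):
    """Return the non-empty, non-comment lines of a text file."""
    lines = Path(path).read_text().splitlines()
    stripped = (ln.strip() for ln in lines)
    return [ln for ln in stripped if ln and not ln.startswith("#")]


def build_plan(role, repo_url, manifest_dir, root="/"):
    """Build the commands that would turn a generic image into `role`.

    Assumed (hypothetical) layout:
      <manifest_dir>/<role>.rpms      one package name per line
      <manifest_dir>/<role>.services  one systemd unit per line
      config files live on a git branch named after the role
    Nothing is executed here; the caller decides what to do with the plan.
    """
    manifest_dir = Path(manifest_dir)
    packages = read_list(manifest_dir / f"{role}.rpms")
    services = read_list(manifest_dir / f"{role}.services")

    plan = [
        # Fetch only the tip of the role's branch to keep the initrd step small.
        ["git", "clone", "--depth", "1", "--branch", role, repo_url, "/tmp/config"],
        # Overlay the config files onto the target root, skipping git metadata.
        ["rsync", "-a", "--exclude", ".git", "/tmp/config/", root],
    ]
    if packages:
        plan.append(["yum", "install", "-y", f"--installroot={root}", *packages])
    for unit in services:
        # "enable" with --root works on an offline tree; units start on boot.
        plan.append(["systemctl", f"--root={root}", "enable", unit])
    return plan
```

Each entry in the returned plan is an argument list that could be passed to `subprocess.run` or printed for review. Keeping one generic image plus small per-role manifests is one possible way to reduce the number of images, at the cost of a slower first boot.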
> We are a small installation, with manageable needs. In our first step up
> from where you are, we ended up on:
> - Katello/Foreman (Red Hat's version is called Satellite) for management of
> software repositories, in discrete sets and slices. We started with
> Spacewalk but it is a little old and fusty and just isn't appropriate.
> - git for config management of environment module files
> - Ansible for easy day to day management of servers
> We no longer manage configs as such, since there is a shared data store,
> and the Ansible/Katello mix means we can rebuild any server from scratch.
> Note that Ansible and Katello/Foreman can be integrated - we haven't gone
> that far yet. We are quite happy with the two being apart. That will change in
> the near future I think.
> "The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic
> civics is the insistence that we cannot ignore the truth, nor should we
> panic about it. It is a shared consciousness that our institutions have
> failed and our ecosystem is collapsing, yet we are still here — and we are
> creative agents who can shape our destinies. Apocalyptic civics is the
> conviction that the only way out is through, and the only way through is
> together. "
> *Greg Bloom* @greggish https://twitter.com/greggish/
Dr Stuart Midgley
sdm900 at gmail.com