[Beowulf] CentOS 7.x for cluster nodes

m.somers at chem.leidenuniv.nl m.somers at chem.leidenuniv.nl
Fri Dec 30 03:32:08 PST 2016


Hi,

We have been using CentOS 7 for a year now on a ~120-node cluster and on
a more recent ~40-node cluster.

Details can be found at http://theor.lic.leidenuniv.nl/facilities

Experiences:

CentOS 7 does just fine. systemd is a slight annoyance, mostly due to
muscle memory of the service scripts I was used to for so many years. We
moved away from Torque/Maui to SLURM, and with CentOS 7 we also moved away
from SystemImager. Getting that to run on CentOS 6 already needed
specialized post-install scripts to deal with software RAID, UIDs, etc.
With CentOS 7 we now use a simple, straightforward 100-line kickstart file
to provision the nodes using PXE.
NFS is only used for /home. GlusterFS is used for global scratch and for
big stuff you want to keep for some time but not forever, so there is no
need for backups. We do not invest in InfiniBand, as we also have easy
access to national infrastructure that has it. Our 'stack' is currently
(after 15 years of doing clustering and HPC) fairly stable and complete
for our purposes, so there is no need for OS redeployments and such during
the production phase of these clusters. We thus see uptimes of years on
our nodes. Sometimes the head node is rebooted for kernel updates etc.,
but long-running jobs on the nodes are not affected because we provision
the OS on local disks in software RAID 1 on the nodes. We do everything
we can to keep those jobs running :).
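
To illustrate the PXE plus kickstart provisioning mentioned above, here is
a minimal Python sketch that generates per-node pxelinux configuration
files pointing the installer at a kickstart file. It is not the setup used
on these clusters: the node names, MAC addresses, the URL
http://headnode/ks/node.cfg and the kernel/initrd file names are all
hypothetical. The "01-<mac>" file naming is the pxelinux convention, and
inst.ks= is the CentOS 7 installer boot option for a kickstart location.

```python
import tempfile
from pathlib import Path

# Hypothetical values for illustration only; not taken from the original post.
KS_URL = "http://headnode/ks/node.cfg"
NODES = {
    "node001": "52:54:00:AB:CD:01",
    "node002": "52:54:00:AB:CD:02",
}


def pxe_filename(mac: str) -> str:
    """Return the per-host pxelinux config name: '01-' + lowercase MAC with dashes."""
    return "01-" + mac.lower().replace(":", "-")


def pxe_config(ks_url: str) -> str:
    """Return a pxelinux entry that boots the installer with a kickstart URL."""
    return "\n".join([
        "default install",
        "prompt 0",
        "label install",
        "  kernel vmlinuz",  # assumed installer kernel name on the TFTP server
        f"  append initrd=initrd.img inst.ks={ks_url}",
        "",
    ])


def write_configs(nodes: dict, ks_url: str, cfg_dir) -> list:
    """Write one config file per node into cfg_dir and return the paths."""
    cfg_dir = Path(cfg_dir)
    cfg_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for _name, mac in sorted(nodes.items()):
        path = cfg_dir / pxe_filename(mac)
        path.write_text(pxe_config(ks_url))
        written.append(path)
    return written


if __name__ == "__main__":
    # Write into a temporary directory instead of a real tftpboot tree.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_configs(NODES, KS_URL, tmp):
            print(path.name)
```

On a real head node the output directory would be the pxelinux.cfg
directory under the TFTP root, and the kickstart file itself would carry
the partitioning (for example software RAID 1) and package selection.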

Building a new single-rack cluster from scratch (given the hardware) takes
me about three days for the software setup, but I usually take a week of
'play' time to run benchmarks and test the new versions of stuff like
Open MPI. Racking the hardware usually takes another day per rack. I think
I have built and maintained about 10 of these clusters in the last 15
years. So, all in all, CentOS is very cool and important for us, and
CentOS 7 just works as normal for us.

For those starting out and reading along, you might want to check out the
OpenHPC project:

http://www.openhpc.community/

or ROCKS (but they do not seem to be working on a CentOS 7 version?):

http://www.rocksclusters.org

For those interested, I have a personal checklist for building a cluster,
but it is rough and partly in Dutch. That, together with some example
config files, allows me to set up a cluster from scratch within three days.
Please mail me if you are interested.

m.

-- 
mark somers
tel: +31715274437
mail: m.somers at chem.leidenuniv.nl
web:  http://theorchem.leidenuniv.nl/people/somers





