[Beowulf] Cluster consistency checks

John Hearns John.Hearns at xma.co.uk
Tue Mar 22 13:16:14 PDT 2016


Jim

There is indeed a version included with OpenHPC.

How do I know? Well, I have been working this week on setting up OpenHPC and Cluster Checker to run burn-in and installation checks on newly delivered compute nodes.

________________________________
From: James Cownie <jcownie at gmail.com>
Sent: 22/03/2016 20:42
To: Olli-Pekka Lehto <Olli-Pekka.Lehto at csc.fi>
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] Cluster consistency checks


On 22 Mar 2016, at 15:32, Olli-Pekka Lehto <Olli-Pekka.Lehto at csc.fi> wrote:

Hi,

I finally got around to writing down my cluster-consistency checklist that I've been planning for a long time:

https://github.com/oplehto/cluster-checks/

The goal is to make the baseline installation of a cluster as consistent as possible and make vendors work for their money. :) Hopefully, publishing this will also help vendors catch some of the issues that slip through the cracks before clusters are handed over. It's also a good idea to run these types of checks during the lifetime of the system, as there's always some consistency creep as hardware gets replaced.
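
A minimal sketch of the kind of consistency check described above, assuming per-node facts have already been collected into a dictionary. The function name, attribute names, and sample values are hypothetical illustrations and are not taken from the cluster-checks repository. The sketch treats the most common value of each attribute as the baseline and reports the nodes that deviate from it.

```python
from collections import Counter


def find_inconsistencies(node_facts):
    """Return {attribute: {node: value}} for nodes that differ from the majority.

    node_facts maps a node name to a dict of attribute -> value.
    A missing attribute is treated as the value None.
    """
    attributes = sorted({attr for facts in node_facts.values() for attr in facts})
    report = {}
    for attr in attributes:
        values = {node: facts.get(attr) for node, facts in node_facts.items()}
        # The most common value is assumed to be the intended baseline.
        majority, _count = Counter(values.values()).most_common(1)[0]
        outliers = {node: v for node, v in values.items() if v != majority}
        if outliers:
            report[attr] = outliers
    return report


# Hypothetical sample data; in practice these values would be gathered
# from each node (firmware versions, memory size, and so on).
sample = {
    "node001": {"bios_version": "1.2", "memory_gb": 128},
    "node002": {"bios_version": "1.2", "memory_gb": 96},
    "node003": {"bios_version": "1.1", "memory_gb": 128},
    "node004": {"bios_version": "1.2", "memory_gb": 128},
}

for attr, outliers in find_inconsistencies(sample).items():
    for node, value in sorted(outliers.items()):
        print(f"{attr}: {node} reports {value!r}")
```

Using the majority value as the baseline is a design choice that suits a freshly delivered, nominally homogeneous cluster; a cluster with several intended node types would need the nodes grouped by type first.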

If someone is interested in contributing, pull requests or comments on the list are welcome. I'm sure that there's something missing. Right now it's just a text file, but nicer scripts and postprocessing for the output might follow at some point. All the examples are very HP-oriented at this point.

Best regards,
Olli-Pekka

Olli,

Have you looked at Intel Cluster Checker<https://clusterready.intel.com/intel-cluster-checker-version-3/>? It seems to be trying to do a lot of what you are also aiming at.
It doesn’t seem to be free (though it is bundled with some of the Parallel Studio products).
I’d have hoped that there’d be a version at OpenHPC, but I couldn’t see one.

(Full Disclosure: As you know, I work for Intel, though not on Cluster Checker…)

-- Jim
James Cownie <jcownie at gmail.com>
Mob: +44 780 637 7146
http://skiingjim.blogspot.com/


