[Beowulf] Cluster consistency checks

James Cownie jcownie at gmail.com
Tue Mar 22 12:41:02 PDT 2016


> On 22 Mar 2016, at 15:32, Olli-Pekka Lehto <Olli-Pekka.Lehto at csc.fi> wrote:
> 
> Hi,
> 
> I finally got around to writing down my cluster-consistency checklist that I've been planning for a long time: 
> 
> https://github.com/oplehto/cluster-checks/ 
> 
> The goal is to make the baseline installation of a cluster as consistent as possible and to make vendors work for their money. :) Hopefully, publishing this will also help vendors catch some of the issues that slip through the cracks before clusters are handed over. It's also a good idea to run these kinds of checks throughout the lifetime of the system, as there's always some consistency creep as hardware gets replaced.
> 
> If anyone is interested in contributing, pull requests or comments on the list are welcome; I'm sure something is missing. Right now it's just a text file, but nicer scripts and postprocessing for the output might follow at some point. All the examples are also very HP-oriented for now.
> 
> Best regards,
> Olli-Pekka

Olli,

Have you looked at Intel Cluster Checker <https://clusterready.intel.com/intel-cluster-checker-version-3/>? It seems to cover much of what you are aiming for.
It doesn’t seem to be free (though it is bundled with some of the Parallel Studio products).
I had hoped there would be a version in OpenHPC, but I couldn’t find one.

(Full disclosure: as you know, I work for Intel, though not on Cluster Checker…)

-- Jim
James Cownie <jcownie at gmail.com>
Mob: +44 780 637 7146
http://skiingjim.blogspot.com/


