[Beowulf] Cluster consistency checks

McMillan, Scott A scott.a.mcmillan at intel.com
Fri Mar 25 12:28:48 PDT 2016

Intel Cluster Checker defines a performance outlier as a value outside the range of the median +/- X median absolute deviations (MAD).  X is configurable; the default value is 6.
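
For anyone who wants to try the same rule on their own results, here is a minimal sketch in Python. It is not Intel Cluster Checker's code: the function name, the node names, and the bandwidth numbers are all made up for illustration, and it assumes the plain (unscaled) median absolute deviation, which may not match what the tool actually does.

```python
from statistics import median

def mad_outliers(results, x=6.0):
    """Return {node: value} for values outside median +/- x * MAD.

    results: dict mapping node name -> measured value (hypothetical layout).
    x:       number of median absolute deviations allowed (6 mirrors the
             default mentioned above).
    """
    values = list(results.values())
    med = median(values)
    # Unscaled MAD: the median of absolute deviations from the median.
    mad = median([abs(v - med) for v in values])
    lo, hi = med - x * mad, med + x * mad
    return {node: v for node, v in results.items() if v < lo or v > hi}

# Made-up STREAM TRIAD bandwidths in MB/s, one per node.
triad = {
    "n01": 61200.0, "n02": 60950.0, "n03": 61310.0,
    "n04": 60870.0, "n05": 52400.0, "n06": 61080.0,
}
print(mad_outliers(triad))  # {'n05': 52400.0}
```

With these numbers the median is 61015 and the MAD is 165, so the accepted range is 60025 to 62005 and only n05 falls outside it. One caveat with this sketch: if more than half the nodes report identical values, the MAD is 0 and any node that differs at all gets flagged.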



On 3/25/16, 1:39 PM, "Beowulf on behalf of Jeffrey Layton" <beowulf-bounces at beowulf.org on behalf of laytonjb at gmail.com> wrote:

Olli-Pekka, et al,

I took a look at your updated website and it looks very good. One thing I wanted to ask, and this is probably a question for the entire list: when you run a test across all of the nodes in the cluster, what process do you use to determine which nodes are "outliers" and need attention?

For example, one test you mention is to run STREAM and look at the TRIAD results for all of the nodes. If you run it across an entire cluster you end up with a collection of results. What do you do with those results? Do you look for nodes that are more than a certain percentage away from the mean? Or do you look for nodes that are more than one standard deviation from the mean?



P.S. I have my own ideas but I'm really curious what other people do.
