<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>For general monitoring of cluster usage we use:</p>
<p><a class="moz-txt-link-freetext" href="https://github.com/fasrc/slurm-diamond-collector">https://github.com/fasrc/slurm-diamond-collector</a></p>
<p>and pipe the results to Grafana. We also use XDMod:</p>
<p><a class="moz-txt-link-freetext" href="http://open.xdmod.org/7.0/index.html">http://open.xdmod.org/7.0/index.html</a></p>
<p>As for specific node alerting, we use the old standby of Nagios.</p>
<p>-Paul Edmon-<br>
</p>
<br>
<div class="moz-cite-prefix">On 10/7/2017 8:21 AM, Josh Catana
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJOKg0SisXROqZdWzyF1dKeY2dWruf0VuNaro=RL9kDMADvM3A@mail.gmail.com">
<div dir="auto">This may have been brought up in the past, but I
couldn't find much in my message archive.
<div dir="auto">What are people using for HPC cluster monitoring
and metrics lately? I've been short on time to add features to
my homegrown solution and am looking at some off-the-shelf (OTS) products.
<div dir="auto">I'm looking for something that can do
monitoring and alert on conditions, broken hardware, etc.</div>
<div dir="auto">Also something that does system resource
utilization metrics. If it has a plug-in for a scheduling
system like PBS where I can correlate a job ID to the
metrics of the systems it is currently running on or
previously ran on at the time, that would be an amazing
plus.</div>
<div dir="auto">Any of you beowulfers have any suggestions?</div>
</div>
</div>
<br>
<pre wrap="">_______________________________________________
Beowulf mailing list, <a class="moz-txt-link-abbreviated" href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit <a class="moz-txt-link-freetext" href="http://www.beowulf.org/mailman/listinfo/beowulf">http://www.beowulf.org/mailman/listinfo/beowulf</a>
</pre>
</blockquote>
<br>
</body>
</html>