[Beowulf] Monitoring and Metrics

lange at debian.org lange at debian.org
Sat Oct 7 05:29:16 PDT 2017


>>>>> On Sat, 7 Oct 2017 08:21:08 -0400, Josh Catana <jcatana at gmail.com> said:

    > This may have been brought up in the past, but I couldn't find much in my message archive.
    > What are people using for HPC cluster monitoring and metrics lately? I've been low on time to add features to my home grown solution and looking at

I'm using Ganglia for monitoring. No alerts, just node metrics like
CPU and network load, but it's nice for looking at what happened in the past.

-- 
regards Thomas

