aggregate cluster stats
siegert at sfu.ca
Mon Jan 7 11:47:11 PST 2002
On Mon, Jan 07, 2002 at 01:41:25PM -0500, Andrew Fant wrote:
> I am currently running a 118 processor cluster, with bigbrother and larrd to monitor
> system status and gather performance and utilization data on a node by node basis.
> However, my management is now requesting aggregate statistics, and a web page
> showing load, etc, across the entire cluster.
> Has anybody hacked something like this themselves? I would rather stick close to
> bigbrother and larrd, just to simplify implementation, but I have been playing with
> SGI's open source release of PCP, and I am not averse to switching to another
> (free) solution if it can simplify the process.
I am doing this using bigbrother and larrd. You only need to change the script
that calculates the load (probably bb-local.sh) to use ruptime instead of
uptime and then simply add up the numbers. I don't know whether you want to
do fancier things, but I just have a bigbrother client running on the master
node and collect all data from the slave nodes using r-commands. The results
are sent to the webserver.
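
For illustration only, here is a rough sketch of the "add up the numbers" step. It is in Python rather than the shell of the actual bigbrother script, and it is not the script I run. The function name aggregate_load, the sample host names, and the assumed ruptime line layout (host, "up", uptime, users, then "load" with three averages) are all assumptions; check them against what ruptime prints on your system.

```python
import re

# Assumed ruptime line layout (may differ between implementations):
#   node01   up  12+03:45,  2 users,  load 0.10, 0.20, 0.30
#   node02 down   1+02:03
# Only lines for hosts reported "up" carry load figures.
LOAD_RE = re.compile(
    r"^(\S+)\s+up\b.*load\s+([\d.]+),\s*([\d.]+),\s*([\d.]+)"
)


def aggregate_load(ruptime_text):
    """Return (nodes_up, [sum_1min, sum_5min, sum_15min]) over hosts that are up."""
    totals = [0.0, 0.0, 0.0]
    nodes_up = 0
    for line in ruptime_text.splitlines():
        m = LOAD_RE.match(line.strip())
        if m is None:
            # Host is down, or the line is not in the assumed layout.
            continue
        nodes_up += 1
        for i in range(3):
            totals[i] += float(m.group(i + 2))
    return nodes_up, totals


# Hypothetical sample output, used here instead of calling ruptime.
SAMPLE = """\
node01        up   12+03:45,     2 users,  load 1.00, 0.50, 0.25
node02        up    3+10:02,     0 users,  load 2.00, 1.50, 0.75
node03      down    1+02:03
"""

if __name__ == "__main__":
    up, totals = aggregate_load(SAMPLE)
    print("nodes up:", up)
    print("aggregate load (1, 5, 15 min):", totals)
```

On the master node the sample text would be replaced by the real output, for example from subprocess.run(["ruptime"], capture_output=True, text=True).stdout, and the totals handed to whatever reports to bigbrother.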
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6