aggregate cluster stats

Martin Siegert siegert at sfu.ca
Mon Jan 7 11:47:11 PST 2002


On Mon, Jan 07, 2002 at 01:41:25PM -0500, Andrew Fant wrote:
> 
> I am currently running a 118 processor cluster, with bigbrother and larrd to monitor
> system status and gather performance and utilization data on a node by node basis. 
> However, my management is now requesting aggregate statistics, and a web page
> showing load, etc, across the entire cluster.
> 
> Has anybody hacked something like this themselves?  I would rather stick close to
> bigbrother and larrd, just to simplify implementation, but I have been playing with
> SGI's open source release of PCP, and I am not averse to switching to another
> (free) solution if it can simplify the process.

I am doing this using bigbrother and larrd. You only need to change the script
that calculates the load (probably bb-local.sh) to use ruptime instead of 
uptime and then simply add up the numbers. I don't know whether you want to
do fancier things, but I just have a bigbrother client running on the master
node and collect all data from the slave nodes using r-commands. The results
are sent to the webserver.
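
A minimal sketch of the "use ruptime and add up the numbers" step, written in
Python rather than as a change to bb-local.sh itself. It assumes rwhod is
running on the nodes so that ruptime has data, and that ruptime prints lines of
the usual form "host up 12+03:14, 0 users, load 0.98, 1.01, 1.00". The names
aggregate_load and read_ruptime are made up for illustration and are not part
of bigbrother or larrd.

```python
import re
import subprocess

# Matches ruptime lines for hosts that are up, e.g.
#   node01  up 12+03:14, 0 users, load 0.98, 1.01, 1.00
# The exact column layout is an assumption; adjust the pattern if yours differs.
LOAD_RE = re.compile(
    r"^(\S+)\s+up\b.*\bload\s+([\d.]+),\s*([\d.]+),\s*([\d.]+)"
)


def aggregate_load(ruptime_text):
    """Sum the 1-, 5- and 15-minute load averages over all hosts that are up.

    Returns (number_of_hosts_up, (sum_1min, sum_5min, sum_15min)).
    """
    totals = [0.0, 0.0, 0.0]
    hosts_up = 0
    for line in ruptime_text.splitlines():
        match = LOAD_RE.match(line.strip())
        if match is None:
            # Host is reported down, or the line is not in the expected format.
            continue
        hosts_up += 1
        for i in range(3):
            totals[i] += float(match.group(i + 2))
    return hosts_up, tuple(totals)


def read_ruptime():
    """Run ruptime on the master node and return its output as text."""
    result = subprocess.run(
        ["ruptime"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling aggregate_load(read_ruptime()) on the master node gives the number of
nodes that are up and the summed load figures, which the monitoring script can
then report in place of the single-node uptime value.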

Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
========================================================================



More information about the Beowulf mailing list