[Beowulf] Cluster Metrics? (Upper management view)

Stuart Barkley stuartb at 4gh.net
Mon Aug 23 07:39:04 PDT 2010

Thanks for the various comments.  There are some good ideas suggested.

To somewhat clarify: The management metric request is being made of
all parts of the entire agency.  They probably don't know (or care)
what HPC really means.

We are lucky that we do get to define the metrics to measure success.
They don't want a whole lot of statistics, just a couple of "simple
numbers".  I'm just not sure how best to do that.  I've been thinking
more about the possible/useful metrics needed to manage the clusters.

These are large general-purpose shared clusters.  One is primarily for
MPI (with InfiniBand) and the other for serial jobs or single-node
threaded jobs.

Now that some results are being seen, other groups are starting to
fund expansions.  This will further complicate things since the groups
doing the funding will want some guarantee of access (QoS, dedicated
nodes, enhanced fairshare) and reporting on their usage share.  I'm
working on that now and think most of it can be accomplished.  This
is where the pretty graphs and quarterly reports will come in.
Bill Rankin's advice sounds very helpful there.
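
As a rough illustration of the usage-share reporting mentioned above,
here is a minimal sketch.  It is not tied to any particular scheduler:
the record format (group, cores, wall-clock hours), the group names,
and the usage_share function are all hypothetical, and real accounting
logs would have to be parsed into this shape first.

```python
from collections import defaultdict


def usage_share(records):
    """Return each group's fraction of the total core-hours consumed.

    records: iterable of (group, cores, wall_hours) tuples.  This is a
    hypothetical format chosen for illustration, not a scheduler's own.
    """
    totals = defaultdict(float)
    for group, cores, wall_hours in records:
        # Core-hours for one job: cores allocated times hours of wall time.
        totals[group] += cores * wall_hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no usage recorded, so no shares to report
    return {group: hours / grand_total for group, hours in totals.items()}


# Made-up job records, for illustration only.
jobs = [
    ("group_a", 64, 2.0),   # 128 core-hours
    ("group_b", 8, 4.0),    # 32 core-hours
    ("group_a", 16, 2.5),   # 40 core-hours
]
shares = usage_share(jobs)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.1%}")
```

A per-group percentage like this is probably close to the kind of
"simple number" a quarterly report could show next to each group's
funded share.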

Stuart Barkley
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone
