[Beowulf] Beowulf on Demand

Robert G. Brown rgb at phy.duke.edu
Tue May 29 13:48:26 PDT 2007

On Mon, 28 May 2007, Juan Camilo Hernandez wrote:

> The idea we have is that every group contributes an amount of money
> that depends on its use of the cluster's flops, something like
> "computacion bajo demanda" (computing on demand).
> I would like to ask: do you think this is the best way to measure
> each user's use of the cluster, so that we can charge every user
> accordingly?

This is a very tricky model to manage, because cluster utilization isn't
just "FLOPs".  A user's programs can block access to the full value of
the resource by consuming network, memory, disk channels, and other
bottleneck/limiting resources as well as just "CPU cycles".
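
To make the point concrete, here is a minimal sketch (in Python) of one
possible way to weight users by whichever resource they pressed
hardest, rather than by CPU alone.  Nothing here comes from a real
accounting package: the resource names, the capacity figures, and the
functions dominant_share() and charge_weights() are all made up for
illustration, and whether "largest share of any one resource" is a fair
basis for charging is a policy decision for your site.

```python
def dominant_share(used, capacity):
    """Largest fraction of any single resource that one user consumed.

    `used` and `capacity` are dicts keyed by hypothetical resource names.
    A resource missing from `used` counts as zero consumption.
    """
    return max(used.get(name, 0.0) / capacity[name] for name in capacity)


def charge_weights(usage_by_user, capacity):
    """Normalize each user's dominant share so the weights sum to 1."""
    shares = {
        user: dominant_share(used, capacity)
        for user, used in usage_by_user.items()
    }
    total = sum(shares.values())
    if total == 0:
        return {user: 0.0 for user in shares}
    return {user: share / total for user, share in shares.items()}


# Illustrative numbers only (assumptions, not measurements).
CAPACITY = {"cpu_hours": 1000.0, "net_gb": 512.0}
USAGE = {
    "group_a": {"cpu_hours": 250.0},                   # CPU-bound
    "group_b": {"cpu_hours": 100.0, "net_gb": 384.0},  # network-bound
}

if __name__ == "__main__":
    print(charge_weights(USAGE, CAPACITY))
```

With these numbers group_b used fewer CPU hours than group_a but took
three quarters of the network, so it ends up with the larger weight.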

> Could you let me know which software I can use to carry out this
> kind of control?

There are several approaches you can take.  One is to turn on process
accounting on the nodes.  This is done as root by e.g.

  accton /var/log/acct

You will need to rotate and groom /var/log/acct if you do this.  It
grows rapidly on a busy system, although that warning dates from when
disks were much smaller than they are today.

Once accounting is on, running a command like this on each node:

  sa -m /var/log/acct

will yield a table like:

rgb@lilith|B:904#sa -m /var/log/acct
 	173	7.49re 0.00cp 876k
root	156	7.49re 0.00cp 801k
rgb	16	0.00re 0.00cp 1540k
smmsp	1	0.00re 0.00cp 1887k

The first numeric field is the number of commands run, "re" is elapsed
real (wall-clock) time, "cp" is CPU time (user plus system, probably
the one you want), and the last field is the CPU-time-averaged memory
("core") usage in 1k units.  By default sa reports the times in
minutes.  You can see that since I turned on accounting I have done a
tiny amount of work, less than some automated background tasks run as
root, in terms of commands run and elapsed time, even though my
average memory usage is higher.  You'll have to really study the man
page to learn what these fields mean and what other fields are
available.  There are (or used to be) tools out there that can take
the output of sa and turn it into "cooked" reports.
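
If you end up cooking the reports yourself, a minimal sketch in Python
might look like the following.  It assumes the sa -m layout shown above
(optional user name, command count, then the "re", "cp" and "k"
columns, which the sa man page describes as real time, CPU time and
average memory in 1k units); other sa versions or flags may print
different columns.  The function names parse_sa_m() and split_cost(),
the "(total)" label and the cost figure are my own inventions for
illustration, not part of any existing tool.

```python
# Sample text copied from the table above (tab-separated, as sa prints it).
SAMPLE = (
    " \t173\t7.49re 0.00cp 876k\n"
    "root\t156\t7.49re 0.00cp 801k\n"
    "rgb\t16\t0.00re 0.00cp 1540k\n"
    "smmsp\t1\t0.00re 0.00cp 1887k\n"
)


def parse_sa_m(text):
    """Parse `sa -m` style output into a dict of per-user records."""
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:
            # The summary row has no user name; give it a made-up label.
            fields = ["(total)"] + fields
        if len(fields) != 5:
            continue  # blank lines, shell prompts, unexpected layouts
        user, count, real, cpu, mem = fields
        if not (real.endswith("re") and cpu.endswith("cp") and mem.endswith("k")):
            continue
        try:
            usage[user] = {
                "commands": int(count),
                "real_min": float(real[:-2]),
                "cpu_min": float(cpu[:-2]),
                "avg_mem_k": int(mem[:-1]),
            }
        except ValueError:
            continue  # a field was not numeric after all
    return usage


def split_cost(usage, total_cost, key="cpu_min"):
    """Split total_cost among users in proportion to usage[user][key]."""
    amounts = {u: rec[key] for u, rec in usage.items() if u != "(total)"}
    total = sum(amounts.values())
    if total == 0:
        return {u: 0.0 for u in amounts}
    return {u: total_cost * v / total for u, v in amounts.items()}


if __name__ == "__main__":
    per_user = parse_sa_m(SAMPLE)
    print(per_user)
    print(split_cost(per_user, 100.0, key="commands"))
```

You would run this over the sa output of every node and sum the records
per user before splitting any cost.  Note that with the sample above
all the "cp" values are zero, so a split on CPU time charges nobody
anything, which is one more reason to let accounting run for a while
before drawing conclusions.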

There are also typically accounting tools built into some of the batch
job schedulers, I believe, but you haven't indicated whether or not
you're using one (if you were, I would have expected that you'd already
found them).  You can also use a tool like xmlsysd and write your own
small application to take snapshots of system utilization and average
them out somehow.
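
The "snapshot and average" idea can be sketched in a few lines of
Python.  This is only the averaging half: the sample argument is a
hypothetical callable that you would write yourself to query xmlsysd
(or /proc, or whatever source you trust) and return a dict mapping
user name to some utilization figure.  I am not showing xmlsysd's
actual interface here, and average_usage() is a name I made up.

```python
import time


def average_usage(sample, count, interval=0.0):
    """Call sample() `count` times, `interval` seconds apart.

    Returns the per-user mean over all snapshots.  A user who is absent
    from a snapshot is counted as zero for that snapshot.
    """
    totals = {}
    for i in range(count):
        for user, value in sample().items():
            totals[user] = totals.get(user, 0.0) + value
        if i + 1 < count:
            time.sleep(interval)  # no need to wait after the last snapshot
    return {user: total / count for user, total in totals.items()}


if __name__ == "__main__":
    # Canned snapshots standing in for a real data source (made-up numbers).
    canned = iter([{"rgb": 1.0, "root": 0.5}, {"rgb": 3.0}])
    print(average_usage(lambda: next(canned), count=2))
```

How often to sample and what to measure are the hard parts; a job that
runs and exits between two snapshots is invisible to this approach,
which is where process accounting does better.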

Obviously RTM is a good idea, and GIYF, and so on, but you CAN set up
microscopic accounting if you like.  There are probably toolsets out
there that will do all of this for you, if you look for them.


> Do you have any suggestions or recommendations?
> Thank you very much.
> Best regards.

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
