[Beowulf] hpl size problems
Robert G. Brown
rgb at phy.duke.edu
Tue Sep 27 06:41:01 PDT 2005
Ashley Pittman writes:
> There is a wonderful tool written by LANL specifically for measuring
> this kind of background "jitter" on nodes, it's called 'whatelse' and is
> a perl script that samples node state before and after <something> and
> reports on the difference. <something> can either be an application or a
> sample time. It allows you to see precisely how many CPU cycles are
> free for the application to use.
> Running it on one (of the not particularly tuned) systems here I see
> 99.983% IDLE CPU time over a minute with two processes using JIFFIES and
> four page faults. My desktop did worse with 70% idle whilst writing
> this mail.
I'm very curious as to just what it does. Is it something different from
the /usr/bin/time command, or from what you can see by running e.g.
vmstat or top while the task is running?
Granted that a well-parallelized task is often a CPU bound task, seeing
how long a task spends in userspace, kernelspace and so on (and what the
overall system duty cycle is while it is running) is certainly useful,
but there are a lot of tools that can return this information already,
including at least one (xmlsysd/wulfstat) that can do so for a whole
cluster at once. What exactly are they parsing and looking at?
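
To make the question concrete, here is a minimal sketch of one way a
before/after sampler could arrive at an "idle CPU" percentage on Linux: read
the aggregate cpu line of /proc/stat twice and compare the jiffy counters.
This is only a guess at the general approach, not a description of what
whatelse actually parses. The function names are hypothetical, and it assumes
the usual field order (user, nice, system, idle, ...).

```python
import time


def parse_cpu_line(stat_text):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat.

    Assumption: fields are user, nice, system, idle, iowait, irq, softirq,
    steal. Only the first 8 are kept, because later guest fields are already
    counted inside user/nice on kernels that report them.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before_text, after_text):
    """Percentage of elapsed jiffies spent idle between two samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return 100.0 * deltas[3] / total  # index 3 is the idle counter


def sample_idle(seconds=60.0, path="/proc/stat"):
    """Sample the given file, sleep, sample again, and report idle percent."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return idle_percent(before, after)
```

Calling sample_idle(60) on an otherwise quiet node would give a figure
comparable in spirit to the one quoted above, although it says nothing
about which processes consumed the non-idle jiffies.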
I ask because if it is NOT something implicitly in xmlsysd/wulfstat,
I'll bet it is pretty easy to add. I already can parse fields out of
the pid structs -- I just haven't bothered returning utime, stime,
cutime, cstime because it wasn't clear that most users would have any
need for it while monitoring their tasks.
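
For reference, a minimal sketch of pulling those four fields out of
/proc/<pid>/stat. This is not the xmlsysd code, just an illustration under the
assumption of the standard Linux layout, in which utime, stime, cutime and
cstime are fields 14 through 17, counted in clock ticks. The function names
are hypothetical.

```python
import os


def parse_pid_times(stat_text):
    """Extract utime, stime, cutime, cstime (in clock ticks) from the
    contents of /proc/<pid>/stat.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or parentheses, so split only what follows the LAST ')'.
    """
    rparen = stat_text.rindex(")")
    rest = stat_text[rparen + 1:].split()
    # rest[0] is field 3 (state), so fields 14..17 are rest[11..14].
    utime, stime, cutime, cstime = (int(v) for v in rest[11:15])
    return {"utime": utime, "stime": stime,
            "cutime": cutime, "cstime": cstime}


def ticks_to_seconds(ticks, hz=None):
    """Convert clock ticks to seconds; hz defaults to the system tick rate."""
    if hz is None:
        hz = os.sysconf("SC_CLK_TCK")
    return ticks / hz


def read_pid_times(pid):
    """Read and parse the stat file for one process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_pid_times(f.read())
```

Sampling read_pid_times(pid) before and after an interval and subtracting
gives the user and system time a task (and its reaped children) accumulated
over that interval.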