[Beowulf] Re: Why Do Clusters Suck?

Wed Mar 23 08:13:33 PST 2005

On Wed, 23 Mar 2005, Joe Landman wrote:

> Robert G. Brown wrote:
> > On Tue, 22 Mar 2005, Andrew D. Fant wrote:
> 
> [...]
> 
> > Hmmm.  OK, how's this.  Just supposing that I finish building xmlbenchd
> > before an infinite amount of time elapses (I've once again gotten mired
> > in teaching and haven't had time to work on it for a week+).  Suppose
> > xmlbenchd can run any given program inside a fairly standard timing
> > wrapper (probably a perl script for maximum portability and ease of
> > use).  Suppose that the perl script will certainly contain the
> > command line for the application, which will therefore (for fixed random
> > number seeds where appropriate) produce some sort of fixed output.
> 
> Hmmm.  Have you seen the input and output to BBS (in retrospect, a 
> poorly named tool)?
> 
> In BBS (bioinformatics benchmark system), you create input experiments 

 <deleted>

> or similar.
> 
> FWIW:  BBS is at http://www.scalableinformatics.com/BBS , is GPL, and is 
> in active use by a number of groups/companies for testing purposes.

Yeah, like that.  Very much like that.  I'll look into this for sure.
Might be time to eliminate the leading "B" and just make it "BS";-).  I
have a functioning first cut on the benchmarking tags needed for more
general benchmarking contexts (still not adequate, but a starting point)
implemented in my working copy of benchmaster (micro, not macro) --
perhaps we can merge the two xmls without breaking either one of them,
or we can add my tags to yours where they are different.  For example
(output from the savage benchmark):

Content-Length: 978

<?xml version="1.0"?>
<benchml>
  <version>Benchmaster 1.1.2</version>
  <hostinfo>
    <hostname>lilith</hostname>
    <vendor_id> GenuineIntel</vendor_id>
    <CPU name> Celeron (Coppermine)</CPU name>
    <CPU clock units="Mhz"> 801.923</CPU clock>
    <l2cache units="KB"> 128 KB</l2cache>
    <memtotal units="KB">320940</memtotal>
    <memfree units="KB">53224</memfree>
    <nanotimer>cpu cycle counter nanotimer</nanotimer>
    <nanotimer_granularity units="nsec">86.694</nanotimer_granularity>
  </hostinfo>
  <benchmark>
    <name>savage</name>
    <command>./benchmaster</command>
    <args>-t 9</args>
    <description>xtest = tan(atan(exp(log(sqrt(xtest*xtest)))))</description>
    <iterations>1024</iterations>
    <time units="nsec">9.27e+02</time>
    <time_stddev units="nsec">4.63e-01</time_stddev>
    <min_time units="nsec">9.25e+02</min_time>
    <max_time units="nsec">9.38e+02</max_time>
    <rate units="10e+6">1.08e+00</rate>
  </benchmark>
</benchml>

Clearly very similar -- there will be a mirror input/configuration file
for xmlbenchd that contains e.g. the first 3 fields of <benchmark> to
tell it what to run, as well as tags to set scheduling policy and some
other stuff.  Note that benchmaster already does internal statistics on
a set of independent timing runs by default -- hence the
min/max/mean/stddev for timing.  This doesn't validate the result of
savage per se, although the savage code itself could easily do so.
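
For anyone who wants to see the shape of those statistics, here is a
minimal Python sketch of the idea: time several independent samples,
then reduce them to the same four numbers reported in the <benchmark>
block above.  The names summarize, time_runs and savage_step are
invented for illustration and are not benchmaster's code.  Python's
time.perf_counter_ns stands in for the cpu cycle counter nanotimer, and
using the sample standard deviation is an assumption (the output above
does not say which estimator benchmaster uses).

```python
import math
import statistics
import time


def savage_step(x=1.0):
    # The savage expression from the <description> tag above.
    return math.tan(math.atan(math.exp(math.log(math.sqrt(x * x)))))


def summarize(per_call_nsec):
    # Reduce a list of per-call timings to mean/stddev/min/max.
    # ASSUMPTION: sample standard deviation; the real tool may differ.
    return {
        "time": statistics.mean(per_call_nsec),
        "time_stddev": statistics.stdev(per_call_nsec),
        "min_time": min(per_call_nsec),
        "max_time": max(per_call_nsec),
    }


def time_runs(func, samples=16, iterations=1024):
    # Each sample times `iterations` back-to-back calls and records the
    # average cost of one call in nanoseconds.  Needs samples >= 2.
    per_call = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        for _ in range(iterations):
            func()
        per_call.append((time.perf_counter_ns() - start) / iterations)
    return summarize(per_call)
```

Calling time_runs(savage_step) returns a dict whose keys match the tag
names above.  The loop overhead is not subtracted in this sketch, so the
absolute numbers would not be comparable to benchmaster's.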

This is the easy part and most people will come up with very similar
tag sets for this sort of encapsulation.  What I'm working on is how to
both input and output tables/vectors of results, e.g. what one gets from
running benchmaster's encapsulation of stream for a series of command
line specified vector lengths.  I want to be able to encapsulate at
least a vector if not a 2D/3D table (performance line or performance
surface) for graphical presentation.
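
One possible encoding of a performance line, sketched in Python with
the standard xml.etree.ElementTree module.  The <table> and <row> tags
and the xlabel/yunits/x attributes are purely hypothetical; nothing
like them exists in benchmaster or BBS yet, and the final tag set may
look quite different.

```python
import xml.etree.ElementTree as ET


def vector_result_xml(name, x_label, x_values, y_units, y_values):
    # Build a <benchmark> element holding one (x, y) pair per <row>.
    # HYPOTHETICAL tag and attribute names, for illustration only.
    if len(x_values) != len(y_values):
        raise ValueError("x_values and y_values must have the same length")
    bench = ET.Element("benchmark")
    ET.SubElement(bench, "name").text = name
    table = ET.SubElement(bench, "table", {"xlabel": x_label, "yunits": y_units})
    for x, y in zip(x_values, y_values):
        row = ET.SubElement(table, "row", {"x": str(x)})
        row.text = f"{y:.3e}"
    return ET.tostring(bench, encoding="unicode")
```

A 2D/3D table could nest the same way (rows inside rows, or a second
index attribute), but which is easier to plot from is still an open
question.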

But we can talk offline about this.  It looks like we are indeed on the
same conceptual page here.

> > Then it would be trivial to add a segment to at LEAST diff the output
> > with the expected output, and not a horrible amount of work to actually
> > compute a chisq difference between the two.  I can easily introduce xml
> > tags for returning a validation score on the actual result (or even a
> > set of such scores) because extensibility IS useful during the time a
> > new thing is being invented (sorry, Don:-) if not beyond.  This would
> > permit the best of both worlds:
> 
> Could leverage what exists rather than re-inventing this particular 
> wheel.  Let me know if there are particular things you would like to see 
> in the output comparison.  Could do a chi-square, but this makes more 
> sense for numerical bits than non-numerical bits (BBS doesn't care, and 
> maybe the solution to this is analysis plug-ins that implement the 
> appropriate tests).

Absolutely.  As I said, perhaps we can do a merge of some sort, or (xml
being what it is) a hierarchical encapsulation.  I'm glad to see that
this tool IS out there -- this kind of memetic exchange is what makes
GPL development "interesting".
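
To make the two kinds of validation concrete, here is a rough Python
sketch: a line diff for non-numerical output and a chi-square score for
numerical output.  The function names are invented, and the per-value
uncertainties (sigma) are something the user would have to supply; this
is neither BBS's comparison code nor anything in benchmaster.

```python
import difflib


def diff_lines(expected_text, actual_text):
    # Unified diff of expected vs. actual output; empty list if identical.
    return list(
        difflib.unified_diff(
            expected_text.splitlines(),
            actual_text.splitlines(),
            "expected",
            "actual",
            lineterm="",
        )
    )


def chisq(expected, actual, sigma):
    # Sum of squared deviations, each scaled by its uncertainty.
    # ASSUMPTION: the caller supplies one nonzero sigma per value.
    if not (len(expected) == len(actual) == len(sigma)):
        raise ValueError("expected, actual and sigma must have equal length")
    return sum(((a - e) / s) ** 2 for e, a, s in zip(expected, actual, sigma))
```

Either result could be returned in a validation tag alongside the
timings, with an empty diff or a small chisq meaning the run reproduced
the reference output.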

> >   a) I expect to assemble a set of macro-level applications to function
> > somewhat like the spec suite does today but without the "corporation"
> > baggage, for distribution WITH the package.  At that point I will
> > actually solicit this list for good candidates for primary inclusion.
> > This set can actually be quite large, permitting users to preselect at
> > configuration time the ones to run for their particular site.  For the
> > ones that are selected, I will go ahead and do the validation test
> > when I wrap them up in the timing script.
> 
> Heh...
> 
> 	bbsrun --experiment "test1-1CPU" --debug < gamess.xml
> 
> Include the XML with your package, and bbs can (largely) do the rest.
> 
> > 
> >   b) Users who want to wrap their OWN application set up for automated
> > benchmarking inside the provided template script will then be able to
> > follow fairly simple instructions and (presuming that they know enough
> > perl to be able to parse their application's output file(s)) validate as
> > well as time to their heart's content.
> > 
> > This may not be sufficient for all users -- I'm probably not going to
> > write a core loop that would permit a sweep of an input parameter in the
> > command line, for example, and to test e.g. special function calls in
> > the GSL that change algorithms at certain breakpoints, that kind of thing
> > is really necessary.  However, folks with more advanced needs will
> > presumably be more advanced programmers and the perl to add such a sweep
> > and generate a more complex validation isn't terribly challenging.
> 
> :)
> 
> > 
> > Would that do?
> 
> As most of this exists in BBS now, and it is in active use, I would say 
> yes. :)

  I'll give it a look.  BBS does indeed look like it is within spittin'
distance of what is required on the operational front.  We'll see if the
xml's can be merged painlessly (probably so given that mine is defined
mostly in my head in a single working copy of benchmaster and hence NOT
in production, so there is little barrier to it being changed).  It
might require too much revision for BBS to remain unbroken, though, so
we may yet need to create "son-o-BBS".

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu