[Beowulf] standards for GFLOPS / power consumption measurement?

Timothy Bole tbole1 at umbc.edu
Tue May 10 11:59:23 PDT 2005


Robert:
your point is well made, and understood.  perhaps i should have included a
disclaimer stating that i wasn't *actually* advocating building a cluster
out of 386's to compete in GFLOP space with any new cluster.  the point i
was making was that it isn't fair to compare _construction cost_ per GFLOP
on machines constructed using parts that magical fairies are handing out
for free.  if you read Vincent's original post, he mentions buying used
parts and getting network parts for free, and then comparing the construction
cost per GFLOP to KRONOS, which kind of defeats the purpose of the article.

cheers,
twb


On Tue, 10 May 2005, Robert G. Brown wrote:

> On Tue, 10 May 2005, Timothy Bole wrote:
>
> > this seems to me, at least, to be a bit of an unfair comparison.  if
> > someone were to just give me a cluster with 80386 processors, then i would
> > tie for the lead forever, as 0/{any number>0}=0. {not counting if someone
> > were to *pay* me to take said cluster of 80386's}...
>
> Actually there is a lower bound determined by Moore's Law and the cost
> of baseline infrastructure.
>
> Let us suppose that it costs some amount to run a node for a year, more
> or less independent of what kind of node it is.  This cost is actually
> NOT flat -- older nodes cost more in human time and parts and newer
> nodes may use more power and AC, human management and administration
> costs vary somewhat -- but the modifications for specific actual systems
> are obvious and can be made in any real cost benefit analysis.  So let's
> make it flat, and furthermore assume that infrastructure costs I = 100
> GP ("gold pieces", a unit of money in World of Warcraft, to avoid
> getting too specific).
>
> Let us further assume that it costs us some multiple of this amount to
> buy a brand spanking new node, and that furthermore we will choose to
> compare apples to apples -- both the older nodes and newer nodes are
> "the same" as far as being rackmount vs shelfmount, single vs dual
> processor, and that both fortuitously have enough memory to run a
> satisfactory operating system and our favorite task (not likely to be
> true for older systems), enough disk and network capacity, etc, not
> because these other dimensions aren't ALSO important and potential show
> stoppers but because comparing raw CPU is enough.  Again, modifying for
> other bottlenecks or critical resources can be done although it gets
> less and less straightforward as one introduces additional complexity
> into the CBA.  To be specific, we'll assign a cost of N*I = 1000 GP for
> a brand new node.
>
> Finally, let us assume that Moore's Law holds and on average describes
> the evolution of raw CPU speed on the task of interest with a doubling
> time of 18 months.
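> (In other words, a node that is A years old delivers roughly 1/2^(A/1.5)
> of the throughput of a brand new one, so it takes 2^(A/1.5) old nodes to
> match a single new node.)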
>
> It is then straightforward to compute under these assumptions the
> break-even cost of older hardware.  For example, three year old systems
> are four times slower (so we need to buy four of them to equal one new
> system in terms of potential work throughput).  They also cost us four
> hits of I for infrastructure vs one hit of I for a single new system.
> The total cost of a new system is 1000 GP + 100 GP per year of
> operational life.  If we only consider a single year of expected
> operational life, this totals 1100 GP.
>
> To find out the break even point on the three year old systems, we
> subtract their single-year infrastructure cost and divide the result by
> four:
>
>   (1100 - 400)/4 = 700/4 = 175 GP
>
> If our estimate for I was accurate and all other things are equal, we
> break even if we buy the four 3 year old systems for 175 GP expecting to
> use them for only a year.
>
> If we plan to use them for two years:
>
>   (1200 - 800)/4 = 400/4 = 100 GP
>
> and we can spend at most 100 GP, although our assumption that the 3 year
> old systems will function out to year five starts getting a bit hairy.
>
> If we plan to use them for three years:
>
>   (1300 - 1200)/4 = 100/4 = 25 GP
>
> and somebody would pretty much have to give the systems to you in order
> for you to break even.  At this point the probability that four systems
> obtained in year three will survive to year six without additional costs
> is almost zero.
>
> Clearly there is an absolute break even point even for a single year of
> operation.  In this example, when the old systems are old enough that a
> brand new system is 11 times faster -- that is, 1.5*log_2(11) = 5.2 years
> old -- if someone GIVES you the systems to
> run for a single year, if all of the simplifying assumptions are
> correct, if all eleven five year old systems survive the year without a
> maintenance incident, then you break even.
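>
> Here is a quick Python sketch of the same bookkeeping -- nothing in it
> beyond the toy assumptions above (I = 100 GP/year per node, 1000 GP for a
> new node, an 18 month doubling time), so treat it as a back-of-the-envelope
> aid rather than a real TCO model:
>
>   from math import log2
>
>   I = 100.0          # infrastructure cost per node per year (GP)
>   NEW_NODE = 1000.0  # purchase price of a brand new node (GP)
>   DOUBLING = 1.5     # Moore's Law doubling time (years)
>
>   def break_even_price(age_years, years_of_use):
>       """Most you can pay per used node of the given age and still
>       match one new node's cost/throughput over the same period."""
>       old_nodes = 2 ** (age_years / DOUBLING)   # old nodes per new node
>       new_total = NEW_NODE + I * years_of_use   # buy and run the new node
>       old_infra = I * years_of_use * old_nodes  # just run the old nodes
>       return (new_total - old_infra) / old_nodes
>
>   for years in (1, 2, 3):
>       print(years, break_even_price(3, years))  # -> 175.0, 100.0, 25.0 GP
>
>   # Absolute break-even age for a single year of use: free systems only
>   # just pay their keep when 2**(age/1.5) * I equals NEW_NODE + I,
>   # i.e. at age = 1.5*log_2(11), about 5.2 years.
>   print(DOUBLING * log2((NEW_NODE + I) / I))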
>
> Now, this is all upper bound stuff.  In reality the boundaries for break
> even are much closer than five years -- this is simply the theoretical
> boundary for this particular set of assumptions.  I personally wouldn't
> accept four year old systems as a gift.  Three year old systems MIGHT be
> worth accepting to run for a year in production, or longer in a starter,
> home, or educational cluster (where performance/production per se isn't
> the point), although many a systems person I know wouldn't accept
> anything older than two years old unless it was 2+ year old bleeding
> edge (when new) hardware and so had some sort of performance boost
> relative to the assumptions above.
>
> These numbers aren't terribly unrealistic, except that allowing for
> Amdahl's law and nonlinear costs associated with the exponential
> increase in probability of maintenance and difficulty getting
> replacement parts and the human energy required to squeeze modern linux
> onto older systems and the difficulties one will have with networking
> and the space they take up and a fairer ratio of infrastructure cost to
> new system cost will all make you arrive at the conclusion that to
> REALLY optimize TCO and cost/benefit your cluster should almost
> certainly be rollover replaced every two to three years, four at the
> absolute outside.
>
> As for 386's -- a single 2.4 GHz AMD 64 box purchased for $500 is
> roughly 20 years ahead of the 386.  In raw numbers that is in the
> ballpark of 10,000 times faster (according to Moore's Law).  In raw
> clock it is about 1000x faster.  Then there are four (486, pentium,
> P6, 64 bit) processor generations in between, each contributing close
> to a factor of two relative to clock, for a factor of 16 more or
> roughly 16,000x in total.
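>
> (For the record, the Moore's Law estimate is just 2**(20/1.5) with the
> same 18 month doubling time as above, which a Python one-liner confirms:
>
>   print(2 ** (20 / 1.5))   # about 10,300 -- the "ballpark of 10,000"
>
> while the clock-times-generations estimate gives the 16,000 figure.)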
>
> It costs more for a systems person to LOOK AT a 386, let alone actually
> set it up or try to get anything at all to run on it than that system
> can contribute to actual production, compared to absolutely anything one
> can buy new today.  The same is true of all 80486's, all Pentiums, all
> Pentium Pro's, all PII's and PIII's, most Celerons (with a possible
> exception for brand new ones at the highest current clock, although I
> personally think the AMD 64 kills the venerable Celery dead).
>
> Brutal, sure, but it's just the way of the world given exponential
> growth in benefit at constant cost.
>
> This same sort of analysis can be extended to non-HPC TCO CBAs as well
> (although it is SO rare to see it done).  In for example a departmental
> or corporate LAN the issue is complicated by the complexity of the
> application space and productivity benefits associated with upgrades,
> which range from nil for people who predominantly use only, e.g., office
> type applications and web browsers, to high for people who actually "use"
> their computer's full capacity in some dimension to accomplish useful
> work.  The scaling of maintenance and infrastructure is also frequently
> dominated as much by human issues (support, training, and so on) as it
> is by hardware, in contrast to much of the HPC market.  So a much more
> informed and careful strategy is needed to optimize cost benefit and
> plan for rollover replacement.  Alas, most organizations just can't
> conceptually manage this degree of complexity and opt instead for a
> simplistic/flat "fair" policy that ends up being a wasteful and stupid
> way to allocate scarce resources nearly all of the time but which
> concentrates power in the hands of an entrenched bureaucracy and which
> reduces the need for human intelligence to support operations to near
> zero.
>
>    rgb
>
> >
> > having inhabited many an underfunded academic department, i have seen that
> > there are many places where there is just not money to throw at any
> > research labs, including computational facilities.  i think that the point
> > of the article was to demonstrate that one can build a useful beowulf for
> > a dollar amount that is not unreasonable to find at small companies and
> > universities.  not everyone can count on the generosity of strangers
> > handing out network cards and hubs.  so, the US$/GFLOP is a decent, but
> > *very* generic, means of getting the most of that generic dollar.
> >
> > of course, the bottom line is that a cost benefit analysis for any cluster
> > is really necessary, and the typical type of problem to be run on said
> > cluster should factor into this.  i applaud the work of the KRONOS team
> > for demonstrating the proof-of-principle that one can design and build a
> > useful beowulf for US$2500.
> >
> > cheers,
> > twb
> >
> >
> > On Tue, 10 May 2005, Vincent Diepeveen wrote:
> >
> > > How do you categorize systems bought second hand?
> > >
> > > I bought a third dual K7 mainboard + 2 processors for 325 euro.
> > >
> > > The rest I removed from old machines that would otherwise get thrown
> > > away, like an 8GB hard disk.  Amazingly, the biggest problem was getting
> > > a case to reduce noise :)
> > >
> > > Network cards I got for free, a very nice gesture from someone.
> > >
> > > So speaking of GFLOPS per dollar on Linpack, this will of course destroy
> > > any current $2500 record, especially for applications needing bandwidth
> > > to other processors, considering what I paid for this self-constructed
> > > beowulf.
> > >
> > > At 05:19 PM 5/9/2005 -0400, Douglas Eadline - ClusterWorld Magazine wrote:
> > > >On Thu, 5 May 2005, Ted Matsumura wrote:
> > > >
> > > >> I've noted that the orionmulti web site specifies 230 Gflops peak, 110
> > > >> sustained, ~48% of peak with Linpack, which works out to ~$909/Gflop?
> > > >> The Clusterworld value box with 8 Sempron 2500s specifies a peak Gflops
> > > >> by measuring CPU GHz x 2 (1 FADD, 1 FMUL), and comes out with a rating
> > > >> of 52% of peak using HPL @ ~$140/Gflop (sustained?)
> > > >
> > > >It is hard to compare. I don't know what sustained or peak means in the
> > > >context of their tests. There is the actual number (which I assume is
> > > >sustained) and then the theoretical peak (which I assume is peak).
> > > >
> > > >And our cost/Gflop does not take into consideration the construction
> > > >cost. In my opinion, when reporting these types of numbers there
> > > >should be two categories, "DIY/self assembled" and "turn-key". Clearly
> > > >Kronos is a DIY system and will always have an advantage over a
> > > >turnkey system.
> > > >
> > > >
> > > >>  So what would the orionmulti measure out with HPL? What would the
> > > >> Clusterworld value box measure out with Linpack?
> > > >
> > > >Other benchmarks are here (including some NAS runs):
> > > >
> > > >http://www.clusterworld.com/kronos/bps-logs/
> > > >
> > > >>  Another line item spec I don't get is rocketcalc's (
> > > >> http://www.rocketcalc.com/saturn_he.pdf ) "Max Average Load"?? What does
> > > >> this mean?? How do I replicate "Max Average Load" on other systems??
> > > >>  I'm curious if one couldn't slightly up the budget for the clusterworld
> > > >> box to use higher speed procs or maybe dual procs per node and see some
> > > >> interesting value with regards to low $$/Gflop?? Also, the clusterworld
> > > >> box doesn't include the cost of the "found" utility rack, but does
> > > >> include the cost of the plastic node boxes. What's up with that??
> > > >
> > > >This was explained in the article. We assumed that shelving was optional
> > > >because others may wish to just put the cluster on existing shelves or a
> > > >table top (or, with enough Velcro strips and wire ties, build a
> > > >standalone cube!)
> > > >
> > > >Doug
> > > >>
> > > >
> > > >----------------------------------------------------------------
> > > >Editor-in-chief                   ClusterWorld Magazine
> > > >Desk: 610.865.6061
> > > >Cell: 610.390.7765         Redefining High Performance Computing
> > > >Fax:  610.865.6618                www.clusterworld.com
> > > >
> > >
> >
> >
>
> --
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>
>

=========================================================================
Timothy W. Bole a.k.a valencequark
Graduate Student
Department of Physics
Theoretical and Computational Condensed Matter
UMBC
4104551924
reply-to: valencequark at umbc.edu

http://www.beowulf.org
=========================================================================


