[Beowulf] $2500 cluster. What it's good for?

Raymond Norris Raymond.Norris at mathworks.com
Wed Dec 22 20:16:39 PST 2004

Hi Jim-

I apologize for the top post.  My mail is not showing well who is
replying to whom.

I wanted to make a slight clarification on your comment about licensing
and cost of our distributed computing tools.  Previously, you would have
had to pay full price for MATLAB plus any toolboxes needed for each
node.  We have now released two products, the Distributed Computing
Toolbox (client) and the MATLAB Distributed Computing Engine (engine).
The client is sold per user, similar to a typical toolbox.  Engines
are sold in packs (8, 16, 32, etc).  A node typically runs one engine
(though it could run more), so each node that is running an engine
consumes one license from the pack.

However, for those who are familiar with the MathWorks pricing
structure, you will see that the average cost of an engine is less than
the cost of a single copy of MATLAB, with the average cost per engine
dropping as the number of engines per pack goes up.  In addition, the
engine is granted full use of any toolboxes that the client is licensed
for (with the exception of code generation toolboxes) at no extra
cost.

The MathWorks, Inc.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of Jim Lux
Sent: Monday, December 20, 2004 10:07 AM
To: Robert G. Brown
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] $2500 cluster. What it's good for?

----- Original Message -----
From: "Robert G. Brown" <rgb at phy.duke.edu>
To: "Jim Lux" <James.P.Lux at jpl.nasa.gov>
Cc: "Douglas Eadline, Cluster World Magazine" <deadline at linux-mag.com>;
<beowulf at beowulf.org>
Sent: Monday, December 20, 2004 6:55 AM
Subject: Re: [Beowulf] $2500 cluster. What it's good for?

> On Sun, 19 Dec 2004, Jim Lux wrote:
> > This brings up an interesting optimization question. Just like in
> > things (I'm thinking RF amplifiers in specific) it's generally
> >
> This has actually been discussed on list several times, and some
> answers posted.  The interesting thing is that it is susceptible to
> algebraic analysis and can actually be answered, at least in a best
> approximation (since there are partially stochastic delays that
> contribute to the actual optimal solution).
>   "TCO".  Gawd, I hate that term, because it is much-abused by
> marketeers, but truly it IS something to think about.  There are
> (economic) risks associated with building a cluster with bleeding-edge
> technology.  There are risks associated with mixing hardware from many
> low-bid vendors.  There are administrative costs (sometimes big ones)
> associated with mixing hardware architectures, even generally similar
> ones such as Intel and AMD or i386 and X86_64.  Maintenance costs are
> sometimes as important to consider as pure Moore's Law and hardware
> costs.  Human time requirements can vary wildly and are often
> when doing the CBA for a cluster.

And TCO with bleeding edge equipment is where the one vs many management
problem becomes so important.  Managing the idiosyncrasies of one high
machine may be within the realm of possibility. Managing 8/16/1024 is
probably unreasonable.  So, as you point out, there's a value/cost that
be associated with various generations of equipment with less bleeding
generally being lower cost (and the ever present potential for "having a
day" and getting a zillion copies of an unreliable component).

>   Infrastructure costs are also an important specific factor in TCO.  In
> fact, they (plus Moore's Law) tend to put an absolute upper bound on the
> useful lifetime of any given cluster node.  Node power consumption
> CPU) scales up, but it seems to be following a much slower curve than
> Moore's Law -- slower than linear.  A "node CPU" has cost in the
> ballpark of 100W for quite a few years now -- a bit over 100W for the
> highest clock highest end nodes, but well short of the MW that would be
> required if they followed anything like a ML trajectory from e.g. the
> original IBM PC.  Consequently, just the cost of the >>power<< to run
> and cool older nodes at some point exceeds the cost of buying and
> running a single new node of equivalent aggregate compute power.  This
> is probably the most predictable point of all -- a sort of "corollary"
> to Moore's Law.  If one assumes a node cost of $1000/CPU and a node
> power cost of $100/year (for 100W nodes) and a ML doubling time of 18
> months, then sometime between year four and year six -- depending on
> particular discrete jumps -- it will be break even to buy a new node for
> $1000 and pay $100 for its power versus operate 11 nodes for the year.
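
The break-even estimate quoted above can be checked with a few lines of
arithmetic. This is only a rough sketch under the figures given in the
post ($1000 per node, $100 per node per year for power, 18-month
doubling). The function names are made up for illustration, and treating
the doubling as smooth rather than as discrete product jumps is an
assumption.

```python
import math

NODE_COST = 1000.0     # $ per new node (figure from the post)
POWER_COST = 100.0     # $ per node per year for a 100 W node (from the post)
DOUBLING_YEARS = 1.5   # Moore's Law doubling time, 18 months (from the post)


def old_nodes_matched(years):
    """Old nodes that one new node replaces after `years`.

    Assumes smooth exponential growth, not discrete product generations.
    """
    return 2.0 ** (years / DOUBLING_YEARS)


def break_even_years():
    """Age at which one year of power for the old nodes costs as much as
    buying one equivalent new node and powering it for that year."""
    # N old nodes cost N * POWER_COST per year to run; the new node costs
    # NODE_COST + POWER_COST, so break even is at N = 1100 / 100 = 11.
    n_old = (NODE_COST + POWER_COST) / POWER_COST
    return DOUBLING_YEARS * math.log2(n_old)


print(round(break_even_years(), 2))  # 5.19
```

With these inputs the crossover is at 11 old nodes, a little past five
years, which falls inside the "between year four and year six" window.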

I'm going to guess that the 100W number derives from two things: the desire
to use existing power supply designs; and probably more important;
the desire to use standard IEC power cords, which are limited to 7 Amps,
decent design practice which would limit the "real" load to roughly half
that (say, 400-450W, peak, into the PS).  There are other regulatory
with building things that draw significant power.  The 7Amp cordset
component values and ratings for inexpensive components such as power
switches, relays, fuses, etc.
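
As a quick check on the 400-450W figure, the cord limit can be turned
into watts. The sketch below assumes 120 V mains, which is not stated in
the post, and reads "roughly half" as a 0.5 derating factor. The names
are illustrative only.

```python
MAINS_VOLTS = 120.0   # assumption: US mains voltage, not given in the post
CORD_AMPS = 7.0       # cord limit quoted in the post
DERATING = 0.5        # "roughly half", per the post


def design_load_watts(volts=MAINS_VOLTS, amps=CORD_AMPS, derating=DERATING):
    """Peak load into the power supply after derating the cord limit."""
    return volts * amps * derating


print(design_load_watts())  # 420.0
```

Under that voltage assumption the cord limit is 840 W and the derated
load is 420 W, consistent with the 400-450W range given above.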


> > > I also have an interest in seeing a cluster version of Octave or
> > > set to work like a server. (as I recall rgb had some reasons not
> > > these high level tools, but we can save this discussion for later)
> >
> > I'd be real interested in this... Mathworks hasn't shown much
> > accommodating clusters in the Matlab model, and I spend a fair amount
> > running Matlab code.
> I believe that there is an MPI library and some sort of compiler thing
> for making your own libraries, though.  I don't use the tool and don't
> keep close track, although that will change next year as I'll be using
> it in teaching.  The real problem is that people who CAN program
> to do stuff in parallel aren't the people who are likely to use matlab
> in the first place.  And since matlab is far, far from open source --
> actually annoyingly expensive to run and carefully licensed -- the
> people who might be the most inclined to invest the work don't/can't do
> so in a way that is generally useful.

I'll say it's annoyingly expensive.. from what I've been told, you need a
license for each cpu of the cluster.  That makes running matlab on the
JPL 1024 Xeon cluster a bit impractical.
