[Beowulf] Redmond is at it, again

Robert G. Brown rgb at phy.duke.edu
Thu Jun 3 07:39:40 PDT 2004

On Thu, 3 Jun 2004, Laurence Liew wrote:

> Hi,
> Thanks for the comments. I think from the discussion we can 
> separate the issue into 4 distinct issues:
> (1) providing the OS distribution and updates
> (2) providing a cluster toolkit which makes #1 into a working cluster, 
> ie cluster management tool, mpich, pvm, blas, scalapack etc (and updates)
> (3) OS support
> (4) cluster support
> For (1), for experienced cluster admins, paying a per-node price does not 
> make sense, as they can take ONE copy from a mirror (free or paid) and 
> duplicate it to hundreds and thousands of nodes. So in this scenario, the 
> price point is basically one copy (per year).
> For (2), some (and there are a lot of less capable sysadmins out there) 
> would be willing to pay for an easier way to do (1) above.
> For (3), it is assumed that in a university setting, the sysadmin would 
> be linux OS aware and can support himself. This is not really true in 
> some cases. We do get support calls like "how to setup printing from the 
> cluster".
> For (4), most linux sysadmins who manage a cluster are not cluster 
> admins, and cannot answer questions like "how come my mpi code does 
> not work" or "why is SGE not able to run my job". Typically we have to 
> answer those questions on behalf of the sysadmins. (Of course we do have 
> sysadmins that learn quickly and become self-supporting and less reliant 
> on us - but not always.)
> As clearly articulated by RGB, providing OS distros and updates should 
> cost very little, as most of the costs have been borne by others - I agree.
> And a cluster distro with some value add like integrated cluster 
> management tools, mpich, blas, scalapack... again a small fee is 
> acceptable for the work done in integrating and packaging it.

Beautifully and succinctly put.  I personally don't do succinct, alas;-)

Real added value is worth a fee, although it isn't clear that the fee
should scale like cluster size per se.  It is perfectly reasonable,
though, for a vendor to set up several price points that effectively
give a price break to somebody just getting started with a tiny cluster.

> So I guess for cluster vendors out there what the community would like 
> to see in any cluster distro pricing is:
> a) low per unit costs if any (say $10-$20/node) for a cluster distro 
> WITH VALUE ADDED, OR an annual site license for unlimited 
> distributions within the campus

I don't view $10 per node to be a low unit cost, and don't think that
this cost should in general scale with the size of the cluster or LAN or
whatever.  As has been pointed out several times, one buys "one copy" of
the product, which then can be used once or a hundred times.  There are
no additional marginal costs to you (the seller of the product) if it is
used on a two node cluster or a 2000 node cluster (issues like dealing
with the failure of tool X to scale to 2000 nodes aside, but those can
be addressed with SUPPORT charges below that scale with usage and time).

However, this is really a "what the market will bear" issue where my
opinion isn't important.  As you note in your summary and Joe has
remarked offline, there are a number of cluster markets emerging where
the administrators are not cluster experts (and sometimes not even
linux/unix experts) and who are perfectly happy to pay money on a per
node or workstation LAN client basis as an alternative to hiring an
experienced administrator or cluster programmer (which does come at a
nontrivial baseline cost of say $40-100K).  These are groups that might
not be able to set up e.g. a repository, PXE, kickstart, yum on their
own, groups that cannot build/package/assemble PVM, MPI etc on their
own, groups that don't know how to install and configure some canned
toplevel cluster package, e.g. BLAST, on their own.  They want a turnkey
solution that can run "unmanaged" locally or with minimal, fairly poorly
trained, local management.  They will pay for it.

Determining whether there are ENOUGH of them to support a company
providing the solution at prices they will pay, whether they will remain
customers or learn enough to do it on their own and save the money (high
prices being a strong incentive to do so), whether a company can survive
as "software only" or if the market is best pursued in the Penguin/Scyld
model of sales of hardware, software, and integrated hardware/software
(so one can make money any of the above ways when times are tough and
clients few in any particular dimension) -- well, that's what business
and risk is all about.  This particular list, of course, likely contains
relatively poor candidates for customers as most of the longtime members
are strictly DIY kinds of people who are used to squeezing dollars til
they squeak and who will only pay for things that provide real
measurable value compared to their doing things themselves or with staff
resources.  Universities in particular tend to have both the expertise
and opportunity cost time available to do a lot of things themselves.
However, you also see new list members who in some cases would be ideal
customers as they are clueless and need help.

Remember also that the market for integrated solutions largely exists
because of sheer laziness on the part of the NON-clueless cluster
persons around the world.  If every cluster tool were properly packaged
to be auto-rebuildable (and many of them are so packaged, but not all)
and hence could be included in an open distribution like Fedora or
Debian with "no marginal effort" beyond an e.g. rpm --rebuild on an
upgrade (plus any debugging/patching associated with actual library
changes) then ALL open distributions would be cluster distributions.
Fedora, for example, comes with pvm and lam MPI ready to roll (as did RH
before it).  SGE is lacking as SGE isn't distributed in cleanly
rebuildable source rpm form (why not, I don't know, as the exercise of
making it so would likely improve the product).  mpich is also missing,
which seems silly.  blas it has.  scalapack it doesn't have.  However,
it COULD have ALL of these things if somebody who needs them and builds
them into rpm's for local distribution anyway would simply "own" the
package and contribute/maintain it for the fedora core.
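To make the "no marginal effort" claim concrete, here is a minimal sketch of what owning such a package amounts to in practice, using circa-2004 Red Hat tooling. The package name, version, and paths below are illustrative, not taken from any actual Fedora package:

```shell
# Hypothetical example: rebuild a cleanly packaged source rpm against
# the current distribution's libraries -- this is the whole "marginal
# effort" if the spec file is properly maintained.
rpmbuild --rebuild mpich-1.2.5-1.src.rpm

# Publish the result in a locally served repository so nodes pick it
# up through their normal update cycle (createrepo for yum 2.1+, or
# yum-arch for older yum versions).
cp /usr/src/redhat/RPMS/i386/mpich-*.rpm /var/www/html/cluster-repo/
createrepo /var/www/html/cluster-repo/

# Any node pointed at the repository then just runs:
#   yum install mpich
```

The point being that once the spec file exists and is maintained upstream, the per-release work reduces to a rebuild plus whatever patching real library changes force on you.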

This isn't exactly a business opportunity, but to me it makes sense for
e.g. the NSF and DOE and NIH to get together and fund a cluster group
for the sole purpose of so "owning and maintaining" these and other core
packages "forever".  I don't mean developing MPICH per se -- I mean
doing the requisite work of packaging and maintaining MPICH for the open
distributions so that its stable version is primarily distributed THERE
instead of from its project specific websites.  This could be supported
by simply adding it to the charge of the various groups developing these
packages (many of which are likely supported by government grants
already) or by creating a group at some University or government lab or
a consortium of the above and funding it.

This is a better idea than the oscar/rocks/etc approach of building a
"cluster distribution", open or closed.  What is this "cluster
distribution"?  Nothing but a collection of packages.  These packages
can equally well be made a standard part of standard open distributions
so EVERY such distribution is a cluster distribution.  The "cluster
distributions", even funded commercial ones such as Scyld, tend to lag
the open or commercial linux distributions from which they are
inevitably derived by months to years, to the point where they are
insecure and no longer being maintained and missing all sorts of recent
software developments and improvements.  This is a serious, and largely
unnecessary, problem (although Scyld, which does enough additional work
to radically alter the base distribution, with a consequent need for
extensive testing and longer-term maintenance and stability, arguably
merits an exception).

With yum and kickstart the issue is even MORE moot.  A "cluster node"
fundamentally differs from a "desktop workstation" by nothing more than
the selection of packages installed (including a package with a suitable
%post to alter configuration or the same script run some other way).
The packages themselves can be collected into package groups to make
converting a workstation into a cluster node or a workstation that is
ALSO a cluster node a single command or line in a kickstart file plus a
reinstall.  Going this route makes not only a cluster, but the
institutional-scale linux IT environment scale close to the theoretical
limit of scalability.
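As a sketch of the "single line in a kickstart file" idea: a hypothetical node profile might look like the fragment below, where the `cluster-node` package group and the service names in `%post` are invented for illustration:

```
# Hypothetical kickstart fragment: a "cluster node" is just a package
# selection plus a bit of %post configuration.
%packages
@ base
@ cluster-node     # illustrative group: pvm, lam, mpich, blas, sge-execd...

%post
# Node-specific configuration: enable the batch scheduler client,
# disable services a headless node doesn't need.
chkconfig sge_execd on
chkconfig cups off
```

Converting a workstation into a dual-use workstation/node is then the same edit with a different package group, plus a reinstall.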

> b) separate out the support charges, so that experienced sysadmins can 
> pay less or opt to pay for support (and finger pointing), and not bundle 
> the support into the per node price.

HERE is where I think there is a real market.  As you say, 1) and 2)
"should" be cheap to free under any circumstances except perhaps where
you add real value in the form of locally developed and maintained
tools.  In most cases any price "should" be institutional with little to
no per unit scaling, but of course this "should" is really controlled by
Adam Smith's invisible hand, and if you, or Red Hat, can get folks to
pay for it per box have at it -- it is just like printing money on a
color laserprinter except that it is legal.  In the long run, though,
expect to have to deliver real value for real money, and don't expect to
get rich quick without a really new idea or killer app, as most people
will eventually decide to get a color laserprinter of their own.

3) and 4) in your list (and an unlimited redistribution update feed for
a "certified and ready to play" repository) are what I consider to be
"support" and worth paying for, and while I personally think that the
feed should be cheap cheap cheap (go for the mass market, viewing free
free free Fedora or Debian as the "competition") and scale only very
weakly indeed with institution size (e.g. "personal" for $5/year of
yum-driven updates and access to the repository or $15/year for
"household/departmental" rsync access so you can mirror the repository
for something like my household, to maybe $1000/year for an
institutional feed/mirror where an "institution" is something like
duke.edu with a couple of sensible intermediate points in between).
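For what a "feed" means mechanically on the client side: a yum 2.x-era configuration stanza pointing a subscriber at such a repository might look like the following, where the repository id, name, and URL are all invented:

```
# Hypothetical /etc/yum.conf stanza for a paid, certified update feed.
[cluster-updates]
name=Certified cluster update feed
baseurl=http://updates.example.com/cluster/$releasever/$basearch/
```

The vendor's marginal cost per subscriber is then essentially bandwidth, which is why the feed price can and should be low.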

However, even at Duke 4) in particular is a major issue.  The University
can support linux very well and does.  We have considerable linux and
Unix and general systems expertise on campus, and even have a decent
amount of cluster-specific expertise on campus and available on a
consultative basis to would-be cluster builders.  We have and support
both local/group level clusters, department level clusters, and
institutional shared cluster spaces where groups buy machines but don't
have to install them, care for them, feed and cool them, in local
infrastructure and with local administrative expertise.  So we can solve
1-3 very efficiently -- close to the theoretical limit of efficiently
although we do have to build and package various cluster tools that we
need but that aren't yet in distros.

4) we CANNOT solve in this way without maintaining a "cluster
programming group" of some sort with a very wide range of expertise
indeed, and reselling its services to all takers (or providing them "for
free" ditto).  This is unwieldy and not cost effective.  I and several
others on campus consult for groups on clustering issues, but I cannot
efficiently figure out a (say) parallelized quantum chemistry
application, or BLAST, or help somebody in the medical center
parallelize an antique fortran application that is the key to their
studies of electron transport processes in human tissue.

With that said, I don't know how easy it would be to get the groups
doing this sort of thing to buy the required services from an ISV.  Some
of them would rather hire a programmer and do it themselves, or are
willing to take the time to educate a postdoc or graduate student and
then maintain a chain where they educate incoming postdocs and grad
students (well known sources of slave labor, cheap relative to a "real"
programmer).

Still, I think this is where Joe and others in the business make money.
Here you aren't providing a "distribution" per se (although of course
you may be).  You aren't providing a cluster toolkit per se (although
you may be).  You may well be providing raw OS support to relatively
ignorant groups, although dealing with this inside a real institution
with its security and other constraints may or may not be possible in all
cases even where the group in question is ignorant (at Duke this would
be difficult, for example).  You are very definitely providing a
valuable service in terms of installing task-specific tools, training
people in their use, maintaining them (fixing them when they break,
updating them as appropriate), and supporting them (answering the 800
possibly silly questions their use generates per year).  And there is
plenty of room to do this more cheaply than the institution or group can
do it themselves and still make money, assuming e.g. 1/2 an FTE or more
to provide the service locally.

How to scale this I don't know.  The real costs of providing the service
still don't scale well with number of nodes -- it is more a matter of
number of users, level of ignorance of users, number of personality
disorders in the user base.  One really stupid and obnoxious person with
a four node cluster can easily generate more support work than thirty
competent and pleasant individuals working on a massive cluster.  Some
poorly packaged software tools might require per-node effort to install.
Providing it as part of a turnkey (preinstalled) cluster clearly
requires per node effort.  Again, the marketplace is what will decide,
in the long run, what you can charge for this sort of thing.  Again, it
is more like running a consultancy than a software business -- you can
make a good living, but you won't get rich: it is labor intensive, labor
DOESN'T scale all that well, and the client is likely to keep in-house
whatever parts of the labor do scale well, leaving you the remainder.


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu

More information about the Beowulf mailing list