[Beowulf] SC05 blogs and observations

Robert G. Brown rgb at phy.duke.edu
Wed Nov 23 10:05:27 PST 2005

On Wed, 23 Nov 2005, Jim Lux wrote:

> I'd say that a MS cluster will actually perform WORSE than that.. maybe 
> 60-70% of a Linux cluster in terms of "useful work done".  However, say 
> you've got a 100 node cluster.  You now have to buy 30 extra nodes at, say, 
> $3000 each (rolling in HVAC, room, etc).  So, you spend $100K on the extra 
> nodes.  You DON'T have to spend $300,000 for the Linux Cluster Specific 
> support staff, though.  The same old MS server jockeys can run your cluster 
> as run the rest of the corporate data center.   When the cluster throws up a 
> BSOD, they know what to do.
> The real question is whether there is a significant "business scale" market 
> for "non-scientific" cluster computing at all.  Or, is the market really HA 
> or transaction processing, which is a whole 'nother world.  Historically, MS 
> hasn't really been very interested in scientific computing anyway. (all those 
> cranky and principled iconoclastic scientists who have no budget but lots of 
> free time)
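
The arithmetic in the quoted paragraph can be checked with a rough sketch.
The figures (100 nodes, $3000 per node, $300,000 for linux support staff,
60-70% relative efficiency) come from the quote; the function names and the
assumption that throughput scales linearly with node count are illustrative
only. Under that assumption a cluster running at 70% needs about 43 extra
nodes rather than 30 to match the baseline, and the extra hardware only
costs as much as the support staff once efficiency drops to 50%.

```python
import math


def extra_nodes(base_nodes: int, relative_efficiency: float) -> int:
    """Nodes to add so a less efficient cluster matches baseline throughput.

    Assumes (hypothetically) that useful work scales linearly with nodes.
    """
    if not 0.0 < relative_efficiency <= 1.0:
        raise ValueError("relative_efficiency must be in (0, 1]")
    # round() guards against float noise pushing an exact result over ceil()
    needed = math.ceil(round(base_nodes / relative_efficiency, 9))
    return needed - base_nodes


def extra_hardware_cost(base_nodes: int, relative_efficiency: float,
                        cost_per_node: int) -> int:
    """Cost of the additional nodes at a flat all-in price per node."""
    return extra_nodes(base_nodes, relative_efficiency) * cost_per_node


# Figures taken from the quoted message
NODES = 100
COST_PER_NODE = 3000
LINUX_STAFF_COST = 300_000

if __name__ == "__main__":
    for eff in (0.7, 0.6, 0.5):
        cost = extra_hardware_cost(NODES, eff, COST_PER_NODE)
        verdict = "cheaper than" if cost < LINUX_STAFF_COST else "no cheaper than"
        print(f"efficiency {eff:.0%}: {extra_nodes(NODES, eff)} extra nodes, "
              f"${cost:,} ({verdict} the ${LINUX_STAFF_COST:,} staff)")
```

The quoted "$100K" for 30 nodes is a round-up of 30 x $3000 = $90K; either
way the conclusion of the quote holds for any efficiency above 50%.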

I'd agree with all of that (including the parts of your argument I
deleted).  And I think that there is a real question as to whether there
is a business scale HPC market for MS to address, at whatever relative
cost-efficiency it ultimately manages.  I honestly don't know.  I think
that they could "make" one, but whether or not they end up making money
from it is the question.

However (as I pointed out offline in another thread of this discussion)
they have a pretty strong incentive to do this even at a loss.  Just
invert your argument.

Right NOW linux is the only game in town, so people HAVE to buy that
$300K linux support team in order to build a cluster.  Once you have it,
you've "nucleated" linux in an organization, and (applying the metaphors
of physics to the problem) there are significant cost-benefit driven
forces (such as trivial PXE-based installation and maintenance,
stability, low per-seat costs, large free application base, extensive
and free community support mechanism, open source, yum maintainability,
securability) that can drive its growth into other computing arenas
within the organization.  This has been going on for years already in
the server arena -- linux servers are cheap to build, stable, extremely
functional and efficient, and secure.  Windows servers aren't any of the
above, in comparison, although they are efficient ENOUGH to hold
homogeneous clients.  Once an organization bites the bullet and puts in
linux systems ANYWHERE, Microsoft views this correctly as a kind of
"cancer" that can grow without bound, driven by the fact that the
organization actually saves money as it grows and may discover that
things work better and are more secure ALSO, leaving one with damn-all
reason to stick with Windows at the enterprise level.

It is too late to shut the barn door associated with servers -- that
horse done run away.  However, MS doubtless sees that if small clusters
DO start to become a commercially viable commodity even for very
specific applications in very specific venues, even if those clusters
are all provided (in the end) by a turnkey cluster provider that sets
things up with a nice GUI so that the user doesn't need any particular
expertise to use it, if the user simply KNOWS that the cluster is
running linux and the user gets SUPPORT from people who run linux, the
user will learn that there exists a viable alternative to Microsoft for
all sorts of things.  The user will get to see, e.g. KDE or Gnome
desktops.  The user might actually try OpenOffice.  The user will see
that Mozilla or Firefox work, that Evolution looks a whole lot like
Outlook (or any other mail GUI).  If the organization doesn't go turnkey
but tries to do it itself (possibly in a small department with a local
support staff person who knows a bit about linux to begin with anyway)
things are even worse, as ONE PERSON can set up a linux repository now
from which an entire organization can install linux on desktops or
servers alike at will.  One person cannot manage all the user and
hardware issues, but one person can (and at Duke, pretty much does)
manage LINUX for the entire campus including all the clusters.

That's where things get all wonky and scary for MS.  One of their major
selling points is that MS is "professionally manageable", but they are
dealing with clients that have BEEN through the Death of the Mainframe,
which happened in spite of IBM telling all its clients about the
wonderful benefits of central management and professional grade
software.  They know that computer vendors lie.  In actual fact,
Microsoft is not terribly cost-benefit advantageous in seats per FTE, seats per
server (and server license), cost per seat, MANAGEMENT cost per seat,
maintenance cost per seat.  It is expensive and requires a lot of well
trained people to run it to make it work at the enterprise level, and
when it DOES work it crashes all the time (at least in our labs here).

Modern linux, OTOH, with the ONE EXCEPTION of device driver support (my
own personal current bete noire in linux) and a smattering of inevitable
bugs in key tools, scales right out there at the theoretical edge of
scalability, whether it is in the context of a cluster, a server room,
or a workstation LAN.  It is still not really ready for unmanaged
desktop use, but MANAGED desktops can cost very, very little per seat
outside of the hardware and a tiny chunk of the OC of the management
staff.  Literally hundreds of systems per admin even at the human level,
one admin per thousands of systems at the software level. PXE, yum, and
linux are a very scary combination for Microsoft, and tools like
warewulf threaten to virtualize THE ENTIRE OPERATING SYSTEM.

You see, this is where I think the universe is going in managed
environments.  Even though thin clients sold on top of PROPRIETARY tools
never made it as disk was cheaper than the additional hassle, PXE on top
of ultrafast networks makes free thin client installations pretty simple
to set up and eliminates all sorts of management issues (such as desktop
crackability).  It loads the OS when it boots.  It goes away when it
reboots.  A user can even choose WHICH OS to load when it boots, and a
different user can make a different choice, or the same user could
choose a different operating system at a later time for a different
purpose.

Veeerrry verrry scary to MS.  If they OWNED this they'd LOVE it -- they
could sell MS pay-per-use (boot WinXX today, get billed $0.50).  In a
competitive environment (boot WinXX today to read your mail and get
billed $0.50, or boot into a Gnome desktop and read your mail and get
billed nothing, hmmm) things aren't quite so attractive for them.
That's why the .NET wars are playing now in a theater near you.  The
winner in this inevitable conflict (if it is to be MS) MUST have a
protected edge -- something it can do that everyone wants/needs to do
and that can only be done with an MS OS.  Open source open standard
webware is anathema to this, as web browsers become "the" de facto
interface to increasing chunks of application space.

So MS may well be forcing its way into the cluster market as part of a
long term strategy to take any measures necessary, including eating
losses, to keep linux-based nucleation points out of organizations.
Consequently it is no longer viable for Microsoft to say "we don't do
clusters"; they have to be able to provide clusters even if the
economics of clustering forces them to provide the cluster software at
very low (for them) margins or even at a loss.  Naturally they'll TRY to
actually make money from clusters, but it might be more about prestige,
marketing, and maintaining exclusivity on current Win-homogeneous
environments than it is about any particular actual clustering market.

None of which will (in my opinion) make any difference in the long run.
The current round of linux desktops are getting a lot of kudos for being
very, very usable at the (managed) desktop level.  They are still a
nightmare at the unmanaged desktop level, and will remain that way until
the kernel people find a way of working with hardware manufacturers who
want to provide binary-only device drivers that is somewhat less hostile
than the current one.  An unmanaged desktop user doesn't want to have to
know "anything" but how to click their way through obvious menu choices
when they buy a camera, a printer, a network card, and they don't want
to have to use Google for two hours trying to figure out what cards will
or won't work with their operating system.

This is really, in my opinion, the last thing preventing linux from
achieving Linus's dream of taking over the world.  However, it is an
absolute roadblock, a dam, an insurmountable mountain.  No matter how
lovely linux is in a managed environment, no matter how functional the
desktop is overall, if a "dumb" user cannot install commercial software or
commercial OTC hardware following simple directions, linux loses the
unmanaged desktop.  Period.  Make it so that linux can run ANY piece of
hardware that is ever released, give software vendors an incentive for
putting out a linux-only version of their product, and linux becomes an
irresistible competitor.

> But, have to go now.. Next installment for my devil's advocacy:
> "Why windows based clusters make sense for individual users"

The answers are, I imagine, the same -- device drivers and GUIs.
They're a bit fuzzier as there are some serious DISadvantages that the
user has to cope with. Right now WinXX is NOT terribly easy for users to
install even on a SINGLE box, for example. The main reason they end up
with "functional" windows installations is that the actual installation
is done by professionals at the store before they bring it home and
never upgraded over the lifetime of the hardware.  I have had to
(re)install WinXP on systems a few times recently, and it really, really
sucks.  Still.  Even for a certified card carrying geek such as myself
(I'm currently wearing my Cluster Monkey hat that Doug sent me home from
SC05 via Justin Moore, thanks Doug and Justin:-).

Just turning off all the automarketing is difficult, and I still haven't
figured out how to make the trial copies of Norton and/or McAfee just
shut up and go away.  I also don't know if it is doing anything like
auto-updating itself, even though it installed with a vendor-supplied
auto-updater of some sort doing all sorts of complex things that
required user intervention.  In reality, if you gave a WinXX CD to your
favorite computer luddite (in my case, to my wife) and pointed to a
naked box and said "make it work" -- well, it's just a cruel picture,
that's the word, cruel.  Not that linux is any better, and the point is
that ONCE WinXX is installed, she probably COULD install an OTC camera
so she could use it and preview and print out the pictures.  A task that
has brought ME to my knees in linux on more than one occasion.

I suspect that to make this so, a cluster would have to be able to
"install itself" on top of an existing pre-installed WinXX base, or be
so automatable that it can be installed on arbitrary hardware from a
GUI.  Seems a bit unlikely, really, although with PXE anything is
possible.  Right NOW just installing a WinXX cluster on arbitrary
hardware (at the user level) would be pretty amazingly difficult, I
imagine, not only because of the pain of getting through the install
menus and post-boot menus and configuration menus, but also because the
network and account and shared filespace management for a LAN (windows
or otherwise) is infinitely beyond most users' capabilities.

Then there are licensing issues -- WinXX can't just install, it has to
check for the RIGHT to be installed, for licenses and keys and such, and
managing THIS is pretty difficult for a single system install and bound
to be worse for a cluster install.  The license management issues for
third party applications.  The INCREDIBLE security implications; a
cluster turned to Viral Evil and SPAMming makes me shudder.

And finally, of course, there are the application issues, on top of all
the above.  Who's to blame if the user installs a cluster in order to
run some application and discovers that it has negative task scaling the
way the cluster is set up?

Or did you mean >>turnkey<< "windows based clusters", installed by third
parties, to run single applications on a correctly tuned design?  Only?


> Jim...

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
