[Beowulf] Why I want a microsoft cluster...

Jim Lux James.P.Lux at jpl.nasa.gov
Wed Nov 23 11:49:09 PST 2005

So here we go with some devil's advocacy...

From the user viewpoint, in a largish shop, but with a single user in mind:

The scenario is that I want to run some sort of analysis tool that is 
computationally intensive enough to require more crunch than I can get with 
a single desktop.  Applications that spring to mind are various forms of 
finite element modeling (electromagnetics, structures, etc.).  My work is a 
tiny fraction of the overall output of the business, most of which is the 
result of pedestrian office tools like word processing, spreadsheets, 
schedules, as well as some homegrown applications that are "business 
centric".  All this other work is done in MS Office and the like on MS 
Windows platforms, because it has to be interoperable with the division 
across the country, etc., and they use MS Office too.  For example, my 
monthly status reports must be prepared in MS PowerPoint, because they get merged 
in with the other 10 folks' status reports, and shipped up to management in 
that form.

So, whatever I do, my output is eventually going to wind up pasted or 
copied into some MS product, AND, it has to be "clean" enough that when the 
admin for the manager 3 levels above me tries to resize the images that 
I've cut and pasted, it doesn't choke (that means using WMF or EMF for 
graphics, for instance).

What does this sort of environment mean?  It means that a strategy where I 
run my analysis tool on a Linux box and then try to export the data back to 
my Windows box for doing the reports is a royal pain.  It's worse than 
sneakernet.  Sure, I can SSH into the Linux box from my Windows box, and 
even fire up an X server on the Windows box, but things like cutting and 
pasting just don't work very seamlessly, and it seems that Linux 
application creators consider generating Windows compatible file formats 
anathema (leaving aside the file format aspects..) because they might be 
considered "pandering to the dark side".  Folks.. uncompressed TIFF images 
don't hack it as an interchange medium.

And no, Open Office is not fully interoperable with MS Office.  There are 
always little hiccups with things that you really, really need (hmm.. 
equation editor?  footnotes?  change tracking?  outline mode?).  The typical 
scenario is that you're one of half a dozen folks working on a document, 
and you all pass it back and forth and make changes, and for all practical 
purposes, we ALL have to be using the same tools (even going back and forth 
between Mac and PC is problematic.. Those "big red X" things that appear in 
your ppt slides).

Let's be realistic.. as a hypothetical small user of a cluster for some 
analytical task, more than 50% of my time is going to be spent not doing 
the analysis, but in dealing with other aspects of the job: administrative 
reporting; writing budgets; generating reports; creating proposals for new 
work.  We leave aside here scenarios where I get to manage a cadre of 
cluster monkeys who I get to tell "do this analysis, produce this report, 
make it so", in which case I'm really not the cluster user, but rather the 
analysis buyer.

So, whatever applications I'm using on my cluster have to seamlessly 
integrate with the tools the "rest of the business world" is using, 
whether I like it or not.

Now, let's consider another practical detail..  I've got my cluster 
running, and I'm cracking through my work.  Something breaks (maybe a PC 
rolls over and dies).  I call the help desk.  The vast majority of problems 
are something simple (whether the cluster is Linux or Windows).  The IT 
organization has dozens of folks familiar with getting Windows PCs fixed 
and running: after all, they've got all those thousands of Windows desktops 
to support.  Probably any one of them can come and swap out disk drives on 
my cluster nodes, or bring up the spare node.  Say my IT support 
organization does actually support Linux too, but, in view of the 
realities, Linux is probably less than 5% of the installed base, so the 
support staff for Linux boxes is 1/20th of that for Windows. If you have 
10,000 installed Windows desktops, you probably have around 50-100 support 
people for those desktops, of which perhaps 10 are real crackerjack skilled 
ones who can take on the peculiarities of your cluster.  You might have 5 
who support Linux, and only 1 who might know something about clusters.

The odds of getting someone to fix my broken cluster, today or tomorrow, 
are much higher if it's Windows based, just because there are more folks 
around who are capable of doing it. If that 1 Linux cluster weenie happens 
to be on vacation, I'm dead... the odds of all 10 Windows cluster weenies 
being on vacation simultaneously are much lower.
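
A rough sketch of that vacation arithmetic, for anyone who wants to play 
with the numbers.  The 8% absence rate and the assumption that absences 
are independent are made up for illustration; only the head counts (1 and 
10) come from the scenario above.

```python
# Hypothetical illustration: chance that every qualified support person
# is away on the same day, assuming absences are independent.

def prob_all_away(n_experts: int, p_away: float) -> float:
    """Probability that all n_experts are absent at the same time."""
    if n_experts < 1:
        raise ValueError("need at least one expert")
    if not 0.0 <= p_away <= 1.0:
        raise ValueError("p_away must be a probability")
    return p_away ** n_experts

P_AWAY = 0.08  # assumed: roughly 4 weeks off out of 50 working weeks

for label, n in [("Linux cluster experts", 1), ("Windows cluster experts", 10)]:
    p = prob_all_away(n, P_AWAY)
    print(f"{label:24s} n={n:2d}  P(nobody available) = {p:.3g}")
```

Independence is optimistic (everybody takes the same holidays off), but 
even with heavily correlated absences the gap between 1 expert and 10 
stays large.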

Now let's talk security.  My speculative IT organization supports 10,000 
Windows desktops, and has fairly systematic and rigorous ways to deal with 
the patches that come out once a month, as well as hotfixes for 
vulnerabilities that get discovered.  My Windows based cluster isn't going 
to seem scary to the IT security folks.. it's just another 100 computers 
and represents an infinitesimal increase in the overall workload and a 
small increase in the complexity of their workload. The incremental cost to 
bring my cluster into the corporate fold, from a security standpoint, is small.

Say I wanted to install a Linux cluster.  Ooops.. they're not quite as 
familiar with that.  They don't have all the patch rollout stuff, they 
don't have a patch validation methodology, etc.  Sure, there's all kinds of 
patch management stuff for Linux, in a bewildering variety of options, but 
now we've got to have a Linux security expert, in addition to the cadre of 
MS security folks we already have. You mean your cluster uses a different 
distro than the other iconoclastic Linux desktop users have? You recompiled 
the kernel to get the latest whizbang high performance network support?

With MS, the choice is easy.. use what you're already using for the rest of 
the company (SMS probably).  Kernel or distro compatibility isn't an 
issue.. you use what you're given and suck up the inefficiencies and live 
with it.  If it's a performance dog, you go make the pitch to buy more nodes.

Then there's the whole "hooking my box to the corporate network" thing... 
Most corporate IT infrastructure folks get pretty picky about what's 
hanging on the net, especially if you're using some sort of tunnel to talk 
to it.  They might want you to put a third party firewall between your 
cluster and the network, which all of a sudden not only increases the cost, 
but also means that it might be hard for you to sit at your desktop machine 
and talk to your cluster down the hall.

So, all in all, there's a real case to be made for a Windows based cluster, 
even if the raw performance takes a big hit.  In terms of "getting the work 
done" for a fixed dollar allocation, you might be better off buying more 
nodes to make up for the performance than paying for all the extra stuff 
that corporate IT is going to require.
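
To make the "fixed dollar allocation" tradeoff concrete, here is a 
back-of-the-envelope sketch.  Every figure in it (budget, node price, 
overhead costs, the 0.7 relative speed) is a hypothetical placeholder, and 
it assumes throughput scales linearly with node count, which real 
clusters rarely manage.  Plug in your own numbers.

```python
# Hypothetical comparison: aggregate throughput bought with a fixed budget,
# after subtracting whatever fixed overhead corporate IT imposes.

def affordable_throughput(budget: float, node_cost: float,
                          fixed_overhead: float,
                          relative_node_speed: float) -> float:
    """Throughput in units of 'one full-speed node', assuming linear scaling."""
    if node_cost <= 0:
        raise ValueError("node_cost must be positive")
    remaining = budget - fixed_overhead
    nodes = int(remaining // node_cost) if remaining > 0 else 0
    return nodes * relative_node_speed

BUDGET = 200_000     # assumed total dollars available
NODE_COST = 2_000    # assumed price per node

# Assumed: extra firewall, separate patch process, Linux security expert time.
linux = affordable_throughput(BUDGET, NODE_COST,
                              fixed_overhead=80_000, relative_node_speed=1.0)
# Assumed: small integration cost, but each node delivers only 70% as much.
windows = affordable_throughput(BUDGET, NODE_COST,
                                fixed_overhead=10_000, relative_node_speed=0.7)

print(f"Linux cluster throughput:   {linux:.1f} node-equivalents")
print(f"Windows cluster throughput: {windows:.1f} node-equivalents")
```

With these invented inputs the slower Windows nodes still come out ahead, 
because the overhead eats 40 nodes' worth of budget on the Linux side. 
Shrink the overhead or widen the performance gap and the answer flips, 
which is the whole point: it's an accounting question, not a religious one.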

James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
