[Beowulf] SGI to offer Windows on clusters

Joe Landman landman at scalableinformatics.com
Wed Jan 17 08:54:16 PST 2007

Craig Tierney wrote:

>> My understanding of pricing (for the windows portion) is that it adds 
>> (as an OS) $500USD to each node.  So for a 32 node machine, this is an 
>> extra $16k USD "tax" added on.  Doesn't include the absolutely 
>> necessary antivirus, anti-spyware, ...  Calling all that roughly $4k 
>> USD (roughly $125/node), we are looking at something closer to $20k 
>> extra per 32 nodes.  So for 128 nodes, this adds $80k USD.  For 1024 
>> nodes, this adds $640k USD.
>> My question has been on the CBA side.  What do you get for that extra 
>> tax that you don't get now?
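
The per-node arithmetic quoted above is easy to check.  What follows is a
minimal sketch only: the two constants are the rough figures from the
quoted estimate ($500 for the OS, roughly $125 for antivirus and
anti-spyware), and the function name windows_overhead is a hypothetical
helper, not anything from a vendor price list.

```python
# Rough per-node figures taken from the quoted estimate (USD).
OS_LICENSE_PER_NODE = 500   # Windows OS license per node
SECURITY_PER_NODE = 125     # antivirus, anti-spyware, etc. (rough guess)


def windows_overhead(nodes: int) -> int:
    """Return the estimated extra cost in USD for a cluster of `nodes` nodes."""
    return nodes * (OS_LICENSE_PER_NODE + SECURITY_PER_NODE)


if __name__ == "__main__":
    for n in (32, 128, 1024):
        print(f"{n:5d} nodes: ${windows_overhead(n):,} USD")
```

The cost scales linearly at $625 per node, which reproduces the $20k,
$80k, and $640k figures for 32, 128, and 1024 nodes.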
> You mean like the "tax" that vendors who sell Redhat put on their
> systems because it adds an extra cost to each node? 

That is correct.

> What do you get for
> that?

Good question.  Why don't you ask the people that do this?

> I think that you would get a system that fits well into an existing
> MS environment.  I also see getting a system where you don't have to

Curiously, well designed and implemented Linux clusters also integrate 
quite nicely into MS environments.  They have for years.

> go through driver hell to get things working when the vendors don't
> (or can't) get their drivers in the kernel.  I know of some very large

[scratch scratch]  so you have cluster vendors delivering things where 
there are no drivers?  And you pay them for this?  Or am I missing what 
you are saying here?

If you are talking about driver hell, I presume you have not ever 
installed MS WinNT/2k/XP?

> and smart organizations that cannot get their IB, perfctr, and
> lustre patches working together correctly.  Why do I have to
> have a kernel engineer on staff to make this stuff work?

Ah.... ok.  You are talking about getting drivers into the kernel.  This 
is different.  Small/large windows vendors also never (ever) get their 
drivers into the kernel.  They are built as DLLs (a.k.a. kernel 
modules).  Are you holding Linux responsible when code cannot be built 
as a driver and instead requires kernel patches?  You are not 
blaming the driver authors?  I can build xfs as a kernel module.  Works 
fine on RH and similar systems where it is not included natively.

Which IB BTW?  IB is in the kernel now.

Looking up Perfctr inclusion, see http://lwn.net/Articles/203731/ at 
bottom.  It might be that the author does not wish to go through the 
process again.

> I see the MS solution attractive to the ISVs where they only
> have to build and test their code once.  No building

Oddly enough the ISVs tend to follow where the customers are going.  We 
haven't seen many customers ask for an all windows shop (even on 
computing systems).  They (ISVs) went to Linux as their customers asked 
them to.  In the process they were able to whittle away OSes that have 
effectively died from the perspective of HPC purchasing.  This had the 
net effect of reducing the ISVs' costs (reduction in supported platforms 
and testing).  Most of the ISVs we have spoken with are aiming at 2 
platforms going forward.  Many had been burned in the past by being on a 
single platform when that platform fell out of favor.

> for RH and Novell, actively ignoring Fedora, Debian, Gentoo,
> and Ubuntu, and then worrying about the interconnect and

This is a problem with all Linux now.  One I am personally frustrated 
with.  Linux != Redhat, despite RH's best efforts (and SuSE's, but that 
is another story).  If we can get people to write to the standards 
(LSB), things will work nicely.  Right now they are not doing that, 
which means that Linux is rapidly becoming RH in the eyes of the 
customers.  And RH uses positively ancient kernels.  New system support 
is painful if you use RH.  SATA anyone?  NUMA?

We are currently working with two different accelerator cards that work 
wonderfully under RH and related distros, and not at all under late 
model distros (including FCx where x>3).  It has to do with how they 
wrote it.  They built in lots of RH-isms.  Which warms the cockles of 
RH's heart.

(n.b.  I have nothing against RH.  I simply disagree with their choices 
to ignore good file systems in the face of ones that don't work as well 
for large volumes/systems/high speed/highly reliable IO.  That and that 
they have positively ancient kernels which tend to have all the bugs 
and few of the fixes of the old kernels ... which has been explained to 
me before, but did not make business/technological sense then or now.  I 
do like and use RHEL4 and free variants when appropriate).

> version of MPI that happens to be used.

This is a problem, and one I have complained about before.  So many 
MPIs.  Completely missing binary compatibility.  Massively exploding 
test matrix.  Settle on one and move forward.  This causes *everyone* 
grief.  And it costs money.  We have MPICH, MPICH2, LAM, mvapich, 
mvapich2, OpenMPI, ... .  Then you can build each of these with 
different compilers (gcc 3.x, 4.x, intel, PGI, PathScale).  This all 
before you hit the commercial variants (Scali, ...).
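
To put a number on the exploding test matrix, here is a minimal sketch
that counts only the MPI stacks and compilers named above.  The "..."
entries and the commercial variants are left out, so the result is a
lower bound, and the variable names are purely illustrative.

```python
from itertools import product

# Only the implementations and compilers explicitly named in this post.
mpi_stacks = ["MPICH", "MPICH2", "LAM", "mvapich", "mvapich2", "OpenMPI"]
compilers = ["gcc 3.x", "gcc 4.x", "intel", "PGI", "PathScale"]

# Every (MPI, compiler) pair is a separate build that needs its own testing.
builds = list(product(mpi_stacks, compilers))

if __name__ == "__main__":
    print(f"{len(mpi_stacks)} MPIs x {len(compilers)} compilers "
          f"= {len(builds)} builds to test")
```

That is 30 distinct builds before distros, interconnects, or commercial
MPIs enter the picture; each extra axis multiplies the count again.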

This is nuts, folks.  We need one binary interface and specification. 
One set of libraries to link to.  Been muttering about this for years 
... :(

> Most everyone on this list is smart and talented enough to solve
> these problems.  MS isn't selling to us.

I would like to say of course not, but from what I have seen, they are 
going after the places that quite a few of us on this list work at/with.

I do believe they can add value.  I am just not convinced they are going 
about it the right way.  Their HPC efforts appear to be just a tactic in 
the extension of the "crush linux" strategy.  We (my company) do believe 
that closer integration of HPC resources is important, and enabling end 
users easier use and management of HPC from their desktops, laptops, and 
PDA-phones is a good thing.  We agree with Microsoft on that part.

> And no, I don't have any interest in building an MS cluster
> for all of the other problems it introduces.

We follow our customers' requests.  Haven't had any for windows clusters 
to date.  Might happen, and if it does, we will execute against it. 
Sort of like the Solaris 10 clusters.

>> Microsoft could simply be subsidizing this for SGI.  Others have 
>> (cough cough) for them.
> You wouldn't?  Anyone trying to crack into a new market
> would do so.

Not anyone.  SGI has been subsidized by others before (including, 
briefly in the past, MSFT).  I haven't seen it ever result in anything 
other than a disaster for the subsidizer.  Then again, SGI is under 
mostly new leadership, so hopefully the mistakes of the past are 
actually in the past.


Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615
