[Beowulf] Why I want a microsoft cluster...

Sat Nov 26 19:40:13 PST 2005

Jim Lux wrote:
> At 09:04 AM 11/26/2005, Mark Hahn wrote:

[...]

>> why?  because low-end clusters are mostly a mistake.

I disagree on this for a number of reasons.  Larger machines may make 
sense for some university and centralized computing facility folks. 
However, one of the things I have seen with the large scale clusters 
out there is that most are designed for particular workloads, and those 
designs simply do not make sense for the workloads that other folks are 
buying their clusters for.  Specifically, few informatics folk need a 
low latency interconnect.  They need large numbers of processors, big 
memory systems, and large local IO facilities, not to mention very high 
speed data scatter across their cluster.  This has recently been one of 
the fastest growing segments of the HPC market, as well as one of the 
largest segments.  This is from IDC and other data we have seen recently.

Large clusters designed to run HPCC/HPL very well will generally be 
poorly designed for the type of work informatics folk do.  Of course, 
informatics is just one example out of several.

>> at a university like mine, for instance, nearly _everyone_ realizes 
>> that it's
>> insane for each researcher to buy/use/maintain his own little $50-500K
>> cluster.  I see three clear reasons for this:

Again, it is not insane if the large cluster was designed for different 
purposes.  More on that in a moment.

>>         - the value of a cluster is some superlinear function of its 
>> size.

Value is relative to the applicability of the particular system to the 
problem at hand, as well as how well it solves those particular problems.

>>         - the maintenance cost of a cluster is very sub-linear.
>>         - most workloads are bursty.

Hmmm... the clusters we have put into universities are quite loaded. 
Bursty loads may happen at some places (we see this with some of our 
commercial customers).

>> the first two factors encourage larger clusters; the latter means that 
>> bursts
>> can be overlapped in a shared resource very nicely.

I cannot comment on every bursty workload.  At some places I have seen a 
bursty load on a resource that was overspecified for the need.  There 
simply is not enough work to fill the cycles.   Waiting in queue is ok 
as long as those waits are not too long.  When wait times exceed a few 
hours/days you can have issues.  But at the same time, if a researcher 
requesting lots of resources always gets them right away from the large 
cluster, then while they may like it, the machine is likely a bit too 
large for the need.  This is really hard to generalize about, as most of 
our customers have computing needs that grow linearly or superlinearly 
in time.

> 
> 
> However, this sort of logic (the economies of scale) pushes clusters 
> towards the model of the "mainframe supercomputer" in its special 
> machine room shrine tended by white garbed acolytes making sure the 
> precious bodily fluids continue circulating.

Yes!!!  Precisely.    But worse than this, it usually causes university 
administrators to demand to know why department X is going off on their 
own to get their own machine versus using the anointed/blessed hardware 
in the data center.  As someone who had to help explain in the past why 
we should not be running on the same million dollar plus IBM 43xx 
mainframe that ran student records, and should buy a real supercomputer 
instead, I can tell you that this is a painful battle at best. 
Department X may be getting its own machine due to the central resource 
being inappropriate for the task, or the wait times being unacceptably 
long, or the inability to control the software load that you need on it.

There are possibilities to find a happy middle ground, where the central 
folks manage the resource, and allow cycles to be used by others when 
the owners of the machine are not using so much of it.   Moreover they 
can connect it to central disks, authentication, and so forth.  That is, 
the value of the centralized IT is realized, even if it is a separate 
resource.

Forcing all users to use the same resource usually winds up with 
uniformly unhappy users (apart from the machine designers who built it 
for a particular purpose).

> 
> One of the biggest values of the early clusters was that they let people 
> "fool around" with supercomputing and get real work done, without 
> hassling the institutional overheads.  Sure, they may have been 
> non-optimized (in a FLOPS/dollar sense, certainly, and in others), but 
> because they were *personal supercomputers*, nobody could complain.  
> They were invisible.

Not really invisible.  And not all workloads make sense to characterize 
in terms of FLOPS/$.  This is not a swipe at Mark or Jim, but 
fundamentally, while the HPL has been interesting for characterizing 
some workloads, it is effectively useless for wide swaths of workloads 
that have no fundamental reason to look at FLOPS w.r.t. anything. 
Supercomputing as such is much larger than just FP calculations, and 
even folks doing FP heavy Monte-Carlo loads probably don't care about 
low latency.

> There is, I maintain, a real market for smallish clusters intended to be 
> operated by and under the control of a single person.  In one sense, I'm 
> wasting compute resources, but, I'm also doing this because my desktop 
> CPU spends most of its time at <5% utilization.  Having that desktop 
> under my personal control means that if something isn't working right, I 
> can just abort it. Or, if I am seized by the desire to dump all the 
> other stuff running temporarily and run that big FEM model, I can do 
> that too.  No calling up the other users and asking if it's ok to kill 
> their jobs. No justifying processor utilization to a section level 
> committee, etc.

If you look at the numbers from IDC and others you realize something 
very interesting.  First, the real HPC hardware market (this is a 7B$ 
market today, growing at > 10% CAGR) has its largest segment by volume 
and its largest growth in the 25-50k$ region.  Second, the market for 
large systems is shrinking.  Again, this is not a slap against Jim and 
Mark.  The real HPC cluster market is moving down scale, and the 
larger-system segments are growing more slowly or shrinking.  This is 
going to apply some rather Darwinian forces to the market (and has been 
for a while).

This is not to say that there is not a big cluster market.  There is. 
It's real.  It's just not growing in dollar volume.  Some of us wish it 
would, but budgets get trimmed, and hard choices are made.  You have to 
go through fewer levels of management to justify spending 50k$ than you 
do 500k$, or 5M$.

> 
> To recapitulate from previous discussions: a cluster appliance

Yup.  A 100k$ cluster appliance won't sell well according to IDC.  If 
your appliance (board/card/box/...) sits in the 25-50k$ region and 
delivers 10-100x the processing power of your desktop, then you should 
see it sell well.  This is where the market growth is.  Companies 
retreating to the high end of the market risk the same fate as all the 
other companies that have tried this before.  Any exec pushing that 
strategy should take a long hard look at the HPC market and the players 
that have retreated to the high end; most of them are gone and buried.

[...]

> this is the real MS/Cluster disconnect... Clusters just don't seem to 
> fit in the MS .NET world.  On the other hand, MS has a lot of smart 
> people working there, and it's not out of the question that they'd 
> figure out some way to leverage the idea of "commodity computers with 
> commodity interconnects" in some MS useful way.

For one thing, I would expect that .NET will eventually == grid. 
Doesn't make it HPC, but this is what I expect.

What Microsoft could bring to the table is forcing people to build 
reasonable interfaces for the HPC systems.  Today we have lots of APIs 
for MPI, DRMAA for schedulers,....  Technology doesn't become useful for 
the masses until it becomes easy for them to use.  If Microsoft makes 
this simpler by creating a standard for, or forcing one upon, the cluster 
world, this might not be such a bad thing in the longer term.  I find 
nothing wrong with the notion that more HPC users means more HPC.

That said, I am not sure that they can do this without trying to force 
Windows upon the users.  The demo at SC was ok, but you can do stuff 
like that today with web services systems.  The issue is that there are 
no real standards associated with that.   Not W3C stuff, but higher 
level standards that make programming these things easier for end users.

Also it should be noted that the Unix world laughed rather heartily when 
Microsoft started attacking Unix servers with its lower end systems. 
That was a mistake on the Unix world's part.  Microsoft has effectively 
unlimited resources, and significant patience.  My take on this is that 
it would be better to engage them to get them to create the stuff that 
we really all need, and make sure it is cross platform a la Mono, 
because like it or not, the world will not run just one flavor of 
OS/language on clusters/HPC systems.  This is part of the reason IMO 
that things like Java will probably never take off as HPC glue the way 
Python/Perl/... do.  We can run Python/Perl/... on pretty much 
anything; good luck getting a late model Java supported on Itanium2, or 
some other beast of an uncommon processor.

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615