/. MS vs. university Beowulfs
Bari Ari
bari at onelabs.com
Fri Feb 22 08:32:48 PST 2002
I found these posts to be the most informative and to the point:
Windows Clusters <http://www.windowsclusters.org/> [windowsclusters.org].
"I must now put on the traditional monkey hat of shame, for the
naysayers are quite correct. There are TWO Microsoft products called
clustering. One is used by Windows 2000 Advanced Server to do load
balancing, and is, in fact, split into two parts, the first called
Clustering, the second Network Load Balancing... see this page
<http://microsoft.com/windows2000/advancedserver/evaluation/features/>
[microsoft.com], which includes the statement "Both [of the Windows 2000
Advanced Server] Clustering technologies are backwards compatible with
their Windows NT Server 4.0 predecessors". The other is High Performance
Clustering (HPC), in its current form called Computational Clustering
Technical Preview (CCTP), which I am certain has nothing to do with the
previous Clustering technology... I doubt it was available for Windows
NT 4.0, among other things (thus the Technical Preview status).
Notes for any and all interested in this; it's a technical preview,
which any other company would call a pre-Beta or an Alpha release. The
only way anyone sane would use this in a production system would be as
an Early Adoption Partner..."
"Microsoft has a few types of clustering:
1. Failover clustering. This is an OS service that servers like SQL
Server and Exchange plug into that allows Active/Passive or
Active/Active clustering over a shared SCSI/Fibre bus. In theory
you could write your app to use this service but I think it would
be overkill.
2. Network Load Balancing. This is just a software version of the
standard kinds of NLB found in Cisco boxes.
3. Component Load Balancing. This is the most suitable. It's provided
by Application Center and it allows you to deploy COM+ objects on
a cluster of machines and have the calls distributed according to
the load on those machines. You can control the threading and
lifetime of the objects and view the status of the machines pretty
easily using the Application Center MMC plugin (or SNMP, I
believe). You'd have to wrap the computational part of your
application into one or more COM objects. Once you've done that
then you can create and call those objects in the cluster as if it
were one machine - the clustering is transparent to the client
application. I played around with AC a bit when it was in beta for
a project that I was working on. We didn't go with it in the end
because the design of our application ended up not requiring it
(we just went with hardware load balancing), but it seemed like
pretty cool technology - if you're into the whole COM thing. It
has a really cool rolling deployment feature where you can
redeploy your components (and/or IIS application if you have one)
to your cluster incrementally while it's still running.
Here's some links to docs on MS's site: Introducing Windows 2000
Clustering Technologies
<http://www.microsoft.com/windows2000/techinfo/howitworks/cluster/introcluster.asp>
[microsoft.com]
Application Center home page
<http://www.microsoft.com/applicationcenter/> [microsoft.com]
Component Load Balancing
<http://microsoft.com/technet/prodtechnol/acs/reskit/acrkch5.asp?frame=true#f>
[microsoft.com] "
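The load-based call distribution described in item 3 can be illustrated with a small toy model. This is only a sketch of the general idea, not the Application Center or COM+ API: the class name, the method names, and the choice of "number of in-flight calls" as the load metric are all assumptions made for illustration.

```python
class LeastLoadedDispatcher:
    """Toy model of load-based call distribution (hypothetical, not the AC API)."""

    def __init__(self, nodes):
        # Assumed load metric: number of calls currently running on each node.
        self.active = {node: 0 for node in nodes}

    def acquire(self):
        """Pick the node with the fewest in-flight calls and mark one more call on it."""
        # min() returns the first minimum, so ties go to the earliest-listed node.
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node):
        """Mark one call on `node` as finished."""
        self.active[node] -= 1


dispatcher = LeastLoadedDispatcher(["node-a", "node-b"])
first = dispatcher.acquire()    # both idle, so the first listed node is chosen
second = dispatcher.acquire()   # node-a is busy, so node-b is chosen
dispatcher.release(first)
third = dispatcher.acquire()    # node-a is idle again
```

The client only ever talks to the dispatcher, which is the sense in which the cluster looks like one machine to the calling application.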
" For a computational cluster, the OS itself shouldn't really matter.
What matters is, do you have the tools you need, and does the
environment allow you to work with the cluster in a flexible way.
For a typical computational cluster, what determines the performance
will be the quality of your application. Only if you pick an OS with
some extremely poor basic functionality (like, horribly slow
networking), will the OS have an impact on performance.
People optimize how their application is parallelized (e.g. how well it
scales to more nodes). The OS doesn't matter in this regard. They
optimize how well the simple computational routines perform (like,
optimizing an equation solver for the current CPU architecture) - again,
the OS doesn't matter.
So, in this light, you might as well run your cluster on Windows instead
of Linux, or MacOS, or even DOS with a TCP/IP stack (if you don't need
more than 640K ;)
However, there's a lot more to cluster computing than just pressing
"start". You need to look at how your software performs. You need to
debug software on multiple nodes concurrently. You need to do all kinds
of things that require that your environment and your tools will allow
you to work on any node of the cluster, flexibly, as if that node was
the box under your desk.
And this is why people don't run MS clusters. Windows does not have
proper tools for software development (*real* software development, like
Fortran and C - VBScript hasn't really made its way into anything
resembling high performance (and god forbid it ever does)).
Furthermore, you cannot work with 10 Windows boxes concurrently, like
they were all sitting under your desk. Yes, I know terminal services
exist, and they're nice if you're a system administrator, but they are
*far* from being usable to run debuggers and tracing tools on a larger
number of nodes, interactively and concurrently.
Last but not least, there are no proper debugging and tracing tools for
Windows. Yes, they have a debugger, and third party vendors have
debuggers too. But anyone who's been thru the drill on Linux (using
strace, wc -l /proc/[pid]/maps, ...), and needed the same flexibility on
Windows, knows that there is a world of difference between what vendors
can put in a GUI and what you can do when you have a system that was
built for developers, by developers.
So sure - for a dog&pony show, Windows will perform similarly to any other
networked OS with regards to computational clusters. But for real-world
use? No, you need tools to work. "
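The point about working on many nodes at once can be made concrete with a short sketch. It assumes each node is reachable over ssh with key-based login already set up; the function names and the node names in the usage note are hypothetical, and the runner is injectable so the logic can be tried without a real cluster.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(node, command):
    """Run `command` on `node` over ssh and return its stdout.

    Assumes key-based login; BatchMode=yes makes ssh fail instead of prompting.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", node, command],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout.strip()


def run_on_nodes(nodes, command, runner=ssh_runner):
    """Run the same command on every node concurrently; return {node: output}."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # pool.map preserves input order, so outputs line up with nodes.
        outputs = pool.map(lambda node: runner(node, command), nodes)
        return dict(zip(nodes, outputs))
```

For example, `run_on_nodes(["node01", "node02"], "wc -l /proc/1234/maps")` would count the memory mappings of a process on each node in one step, where the node names and the PID 1234 are placeholders.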
"It seems to me that part of the beauty of a linux cluster is
1. Not having to buy a licence for each machine
2. Having an infinitely configurable system (meaning that you can load
as much or as little of the OS and libraries as you want/need)
3. The use of high quality, low/no cost development tools."
Eugene Leitl wrote:
>On Fri, 22 Feb 2002, Velocet wrote:
>
>>Have fun wading through the blind linux advocacy and silly jokes to try
>>and find any content.
>>
>
>Um, I wasn't suggesting you read it. I was suggesting writing a few
>comments, trying to educate slashdotties about the issues.
>
>>If someone can summarize any real information they grok from this massive
>>thread (ie how good is computational clustering on M$), please post it here.
>>
>
>NT 4.0 did suck bigtime on IP QoS, and I don't see any reason why Win2k
>should do any better. Not to mention node licenses and stability.
>