[Beowulf] HPC in Windows
Robert G. Brown rgb at phy.duke.edu
Tue Oct 12 08:35:05 PDT 2004
On Mon, 11 Oct 2004, Erik Paulson wrote:

> On Sat, Oct 09, 2004 at 06:11:01PM -0400, Robert G. Brown wrote:
> > On Sat, 9 Oct 2004, Rajiv wrote:
> >
> > > Dear All,
> > > Are there any Beowulf packages for windows?
> >
> > Not that I know of. In fact, the whole concept seems a bit oxymoronic,
> > as the definition of a beowulf is a cluster supercomputer running an
> > open source operating system.
> >
>
> It's really time that gave up on trying to hold a strong definition
> to "beowulf". It's like kleenex or hacker/cracker. The world doesn't
> care. Clusters of x86 PCs doing "HPC" = beowulf

Now look what you did. Used up my whole morning, just about. The easily bored can skip the rant below.

<rant id="389">

This (what's a beowulf?) is list discussion #389, actually. Or maybe it is that the discussion has occurred 389 times, I can't remember. I do remember that the first time I participated in it was around seven or eight years ago, that I advanced the point of view that you espouse here -- and that I changed my mind.

The definition of beowulf as OPPOSED to "just" a cluster of systems (nuttin' in the definition about them being "PC"s, just COTS systems) was given by the members of the original beowulf project with explicit reasons for each component. Note well that cluster supercomputing was at the time not new -- I'd been doing it myself by then for years (on COTS systems, for that matter, if Unix workstations can be considered off the shelf), and I was far, far from the first. At that time, there were already NOWs, COWs, PoPs and more. See Pfister's "In Search of Clusters" for a lovely, balanced, and not terribly beowulf-centric historical review.

Two things differentiated the beowulf from earlier cluster efforts.

a) Custom software designed to present a view of the cluster as "a supercomputer" in the same sense (precisely) that e.g.
an SP2 or SP3 is "a supercomputer" -- a single "head" that is identified as being "the computer", specialized communications channels to augment the speed of communications (then quite slow on 10 Mbps ethernet), stuff like bproc designed to support the member computers being "processors" in a multiprocessor machine rather than standalone computers. Note that this idea was NOT totally original to the beowulf project, as PVM already had incorporated much of this vision years earlier.

b) The fact that the beowulf utilized an open source operating system and was built on top of open source software. The reasons for this at the time were manifest, and really haven't changed. In order to realize their design goals that >>extended<< the concepts already in place in PVM, they had to write numerous kernel drivers (hard to do without the kernel source) as well as a variety of support packages. Don Becker wrote (IIRC) something like -- would that be all of the linux kernel's network drivers at the time or just 80% of them? -- hard to remember at this point, but a grep on Becker in /usr/src/linux/drivers/net is STILL pretty revealing. Now look for Sterling and Becker's contributions to the WinXX networking stack. Hmmmm....

The insistence on COTS hardware, actually, is what I'd consider the "weakest" component of the original definition, as it is the one component that was readily bent by the community in order to better realize the design goal of a parallel supercomputer capable of running fine grained parallel code competitively with "big iron" supercomputers. The beowulf community readily embraced non-commodity networks when they appeared. Note that I consider "commodity" as meaning multisourced with real competition holding down prices and generally built on an "open" standard, e.g.
ethernet is open and has many vendors, myrinet is not open and is available only from Myricom (although at all points there has been at least some generic competition between high end proprietary networks).

Myrinet historically was perhaps >>the<< key component that permitted beowulves to reach and even exceed the performance of so-called big iron supercomputers for precisely the kind of fine grained numerical problems that the supercomputers had historically dominated. I remember well Greg Lindahl, for example, showing graphs of Alpha/Myrinet speedup scaling compared to e.g. SP-series systems and others, with the beowulf model actually winning (at less than 1/3 the price, even using the relatively expensive hardware involved).

> And on the Beowulf on Windows bit -
> http://www.amazon.com/exec/obidos/tg/detail/-/0262692759/qid=1097514164/sr=8-1/ref=sr_8_xs_ap_i1_xgl14/104-7091285-1915902?v=glance&s=books&n=507846
>
> "Beowulf Cluster Computing with Windows (Scientific and Engineering Computation)
> by Thomas Sterling" - If Tom says that you can build a beowulf on
> Windows, I think you can.

I can only reply with:

  http://www.beowulf.org/community/column2.html

by Don Becker, in which he points out that when they first met, Sterling was "obsessed with writing open source network drivers". Or if you prefer, Question Number One of the beowulf FAQ:

  1. What's a Beowulf?

  Beowulf Clusters are scalable performance clusters based on commodity hardware, on a private system network, with open source software (Linux) infrastructure. Each consists of a cluster of PCs or workstations dedicated to running high-performance computing tasks. The nodes in the cluster don't sit on people's desks; they are dedicated to running cluster jobs. It is usually connected to the outside world through only a single node. Some Linux clusters are built for reliability instead of speed. These are not Beowulfs.
Or check out my "snapshot" of the original beowulf website, preserved in electronic amber (so to speak) from back when I ran a mirror:

  http://www.phy.duke.edu/resources/computing/brahma/Resources/beowulf/

The introduction and overview contains a number of lovely tidbits concerning the beowulf design and how it differs from a NOW. It makes it pretty clear that the only way a pile of WinXX boxes could be "a beowulf" (as opposed to a NOW) would be if Microsoft Made it So -- the WinXX kernels and networking stack and job scheduling and management are essentially inaccessible to developers in an open community, which is why WinXX clusters like Cornell's (however well they work) stand alone, supported only to the extent that MS or Cornell pay for it with little community synergy.

Nobody would argue, of course, that one can't build a NOW based on WinXX boxes. A number exist. WinXX boxes run PVM or MPI (and have been able to for many years, probably even predating the beowulf project although I'm too lazy to check the mod dates of the WinXX ifdefs in PVM).

One can also obviously build a grid with WinXX boxes in it, probably more easily than one can build a true parallel cluster. Grid-style clusters (a.k.a. "compute farms") predate even virtual supercomputers in cluster taxonomy, for all that they have a new name and a relatively new set of high-level support software (just as the beowulf has, in the form of bproc implemented in clustermatic and scyld). Those of us who used to "roll our own" gridware to permit the use of entire LANs of workstations on embarrassingly parallel problems find this (toplevel support software) a welcome development, and it has indeed blurred the lines between beowulfs and other NOWs to some degree, but if anything it is DIMINISHING the identification of all clusters as "beowulfs".
Look at all the Grid projects in the universe -- BioGRID, the smallpox grid, ATLAS grid, PatriotGrid -- grids are proliferating like crazy, but they aren't considered or referred to as beowulfs. In most cases "beowulf" isn't even mentioned in their toplevel documentation.

One of the fundamental reasons for differentiation is this very list. Few people who have been on the list for a long time and who have worked with beowulfs and other kinds of open source clusters for a long time have any particular interest in providing community support to cluster computing under Windows. For one thing, it is nearly impossible -- it requires somebody with trans-MCSE knowledge of Windows' kernels, libraries, drivers, networking stack, and tools including the various WinXX ports of key cluster software where it exists. For another, people who work in that community who DO have that level of expertise don't seem to want to share -- they want to sell. One has to pay to become an MCSE; one then expects a high rate of consultative return on the investment. One cannot easily obtain access to WinXX source code, and open or not, access to kernel-level source code turns out to be essential to getting maximal performance out of a true beowulf or even advanced non-beowulf style cluster.

Besides, nearly all the tools involved (beyond userspace stuff like PVM or MPI in certain flavors) are SOLD and supported by Microsoft (only) or other Microsoft-connected commercial developers and the only "benefit" we get back in the community from providing support for them is to increase their profits and to encourage them to turn around and resell us our own developments and ideas at a high cost. So let THEM provide the consultation and expertise and "intellectual property" they prize so highly; I will not contribute.
Contrast that with the really rather unbelievable level of support freely offered via this list to (yes) general cluster computer users and builders (not just "beowulf" builders by the strict definition). This support is predicated on the fundamental notions of open source software -- that effort expended on it comes back to you amplified tenfold as the COMMUNITY is strengthened in the open and free exchange of ideas. Consider the many tools and products that support beowulfery (or generalized cluster computer operation) that would simply be impossible to develop in a closed source proprietary model. People who participate in this sort of development have no desire to do all the work to create new tools and products only to have Microsoft and its software lackeys do its usual job of co-opting the tool, branding it, shifting the core standard from open to proprietary, and then squeezing out the original inventors (extended rant available on request:-).

For all of these reasons, I think that it is worthwhile to maintain the moderately strict definition of "a beowulf" as a particular isolated network arrangement of COTS systems running open source software and functioning as a cluster capable of running anything from fine grained parallel problems down to distributed single tasks with a single "view" of task ID space. This is a fairly open and embracing definition -- people on the list run "beowulfs" with a single head, multiple heads, many operating systems other than Linux (most of them open source -- WinXX users are subjected to fairly merciless teasing if nothing else ...hotter:-). It is differentiated from (recently emerging) definitions of Grid-style clusters, from my much older definition of a "distributed parallel supercomputer" (built largely of dual use workstations that function as desktop machines in a LAN while still permitting long-running numerical tasks to be run in the background), from MUCH older definitions of NOWs, COWs, Piles of PCs.
So, if somebody says they've "built a beowulf" out of a bunch of WinXX boxes, yes, I know what they mean, even though what they say is almost certainly not correct. The list is fairly tolerant of pretty much anybody doing any kind of cluster computing, even Windows based NOWs or Grids. "Extreme Linux" as a more general vehicle for linux cluster development never quite took off, and www.extremelinux.org continues to be a blank page as it has been for years now. As I said above, I personally don't even DO "real" beowulf computing and never have -- my clusters tend to be NOWs, although we're gradually shifting more towards a Grid model as improved software makes this the easy path support-wise.

As a final note, I personally view the original PVM team as the "inventors of commodity cluster computing" even more than Sterling and Becker (much as I revere their contributions). If a "beowulf" is a network of computers running e.g. PVM on top of proprietary software, Dongarra et al. beat Sterling and Becker to the punch by years. This isn't a crazy idea -- PVM already contains "out of the box" many of the design goals of the beowulf project -- a unified process id space (tids), a single control head that supports the "virtual machine" model, the ability to run on commodity hardware. It just does it in userspace, and hence has limits on what can be accomplished performance-wise, and has the usual PVM vs MPI problems with the older supercomputer programmers (who all used MPI, for interesting historical reasons).
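
For anyone who hasn't met the "virtual machine" model, here is a minimal toy sketch of the idea of a single head handing out task ids from one namespace that spans every host, with messages addressed by tid alone. It is NOT the PVM API: the class VirtualMachine, its methods, and the host names are all hypothetical, invented for illustration, and everything runs inside one Python process rather than over a network.

```python
class VirtualMachine:
    """Toy 'virtual machine': one head owns a single task-id space
    covering every host. Hypothetical illustration, not the PVM API."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.tasks = {}       # tid -> (host, task name)
        self.mailboxes = {}   # tid -> list of (sender tid, message)
        self._next_tid = 1

    def spawn(self, name):
        """Place a task on the next host (round robin) and return its tid."""
        tid = self._next_tid
        self._next_tid += 1
        host = self.hosts[(tid - 1) % len(self.hosts)]
        self.tasks[tid] = (host, name)
        self.mailboxes[tid] = []
        return tid

    def send(self, src, dst, message):
        """Address a message by tid only; the sender never names a host."""
        if dst not in self.tasks:
            raise KeyError(f"no such task id: {dst}")
        self.mailboxes[dst].append((src, message))

    def recv(self, tid):
        """Return the oldest pending (sender tid, message) pair for tid."""
        return self.mailboxes[tid].pop(0)


vm = VirtualMachine(["node0", "node1"])   # made-up host names
master = vm.spawn("master")
worker = vm.spawn("worker")
vm.send(master, worker, "integrate chunk 7")
print(vm.tasks[worker], vm.recv(worker))
```

The point of the toy is only that the user-visible handle is the tid, not the host; where the task actually lives is the head's business.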
(Interestingly, "old hands" in the beowulf/cluster business nearly all tell me that they used to use and still prefer PVM, while MPI is still the "commercially salable" parallel library that better favors the traditional big iron supercomputing model;-)

To what PVM already provided, Sterling and Becker contributed the notions of >>network isolation<< to achieve predictable network latency, >>channel bonding<< of network channels, built on top of open source network drivers, to improve network bandwidth (an accomplishment somewhat overshadowed by the rapid development of faster networks and low-latency networks), and eventually >>kernel-level modifications<< that truly converted a cluster of PCs into a "single machine" the components of which could no longer stand alone but were merely "processors" in a massively parallel system with a single user-level kernel interface.

So how in the world can Sterling argue that this >>beowulf<< software, developed by the original beowulf team, is available for Windows? Did I miss something? Network isolation, fine, that's a matter of trivial network arrangement that anybody with $50 for an OTC router/firewall can now accomplish, but channel bonded networks? Unified process id spaces? Kernel modifications that make nodes into virtual processors in a single "machine"? Not that I know of, anyway, and obviously impossible without fairly open access to Windows source code in any event.
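
To make "channel bonding" concrete for anyone who hasn't seen it, here is a minimal sketch of the round-robin striping idea, in userspace Python rather than in a kernel driver. It is a toy under stated assumptions, not the beowulf project's code: the function names, the explicit sequence tags, the 1500-byte (12000-bit) frame size, and the idealised timing model that ignores latency and protocol overhead are all mine. (A real bonding driver does not tag frames like this; reordering is left to the upper protocol layers.)

```python
def stripe(frames, n_channels):
    """Round robin: frame i goes out on channel i % n_channels,
    tagged with its sequence number (a simplification of this toy)."""
    channels = [[] for _ in range(n_channels)]
    for seq, frame in enumerate(frames):
        channels[seq % n_channels].append((seq, frame))
    return channels


def reassemble(channels):
    """Merge the per-channel queues back into the original order."""
    tagged = [item for channel in channels for item in channel]
    return [frame for _, frame in sorted(tagged, key=lambda t: t[0])]


def transfer_time(n_frames, frame_bits, link_bps, n_channels):
    """Idealised serialisation time in seconds: the busiest channel
    decides when the whole transfer is finished."""
    busiest = -(-n_frames // n_channels)   # ceiling division
    return busiest * frame_bits / link_bps


frames = [b"f0", b"f1", b"f2", b"f3", b"f4"]
assert reassemble(stripe(frames, 2)) == frames

# 1000 frames of 12000 bits over 10 Mbps links (assumed numbers)
for n in (1, 2, 3):
    print(n, "channel(s):", transfer_time(1000, 12000, 10e6, n), "s")
```

Under these assumptions two bonded 10 Mbps channels halve the serialisation time of one; note that nothing here touches latency, which bonding does not improve.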
At a guess, it would require such a violent modification even to the more modern and POSIX compliant WinXX's that the result could be called "Windows" only in the sense that linux running a windowing system can be called "Windows" -- pretty much a complete rewrite and de-integration of the GUI from the OS kernel would be required (something that Microsoft has argued in court is impossible, amusingly enough, as they have sought to convince an ignorant public that Internet Explorer -- a userspace program if ever there was one -- cannot be de-integrated from Windows:-).

Asserting that there are truly Windows-based beowulfs does not make it so, and coopting the term "beowulf" to apply to generic computing models and tools that preceded the project by years is a kind of newspeak. I'll have to just go on thinking of the idea as an oxymoronic one, at least until Microsoft opens its source code or somebody succeeds in rewriting history and the original definition and goals of the beowulf project.

> ps - define "supercomputer" :)

AT THE TIME of the beowulf project, the definition was actually pretty clear, if only by example. I'd say it is still pretty clear, actually. At that time (and still today, mostly) the generic term "computer" embraced:

a) Mainframes (the oldest example of "computer", still annoyingly common in business, industry and academe).

b) Minicomputers (e.g. PDP's, Vaxes, Harris's). Basically cheaper/smaller versions of mainframes that generally stood alone although of course a number of them were used as the core servers for Unix-based workstation LANs.

c) Workstations (e.g. Suns, SGIs). Typically desktop-sized computers in a client-server arrangement on a LAN. Server-class Suns and SGIs were sometimes refrigerator-sized units that were de facto minicomputers, blurring the lines between b) and c) in the case where both were running Unix flavors (or at least real multitasking operating systems).

d) Personal computers.
A "personal" computer was always a desktop sized unit, and the term "PC" generally applied to x86-family examples, although clearly Apples were (and continue to be) PCs as well. Note that PCs were sometimes as capable, hardware-wise, as workstations and had been networkable for years, so networking or hardware per se had nothing to do with being a PC vs a workstation. A PC really was differentiated from being a workstation by a key feature of its operating system -- the INability to login to the system remotely over a network. To use a PC, you had to sit at the PC's actual interface. (Note that aftermarket tools like "PC anywhere" did not a PC a workstation make).

e) Supercomputers. A supercomputer was (and continues to be) a generic term for a "computer" capable of doing numerical (HPC) computations much faster than the CURRENT GENERATION of a-d computers. Obviously a moving target, given Moore's Law. From the "first" so-called supercomputer, the 12 MFLOP Cray-1, through to today's top 500 list, the differentiating feature is obviously RELATIVE performance, as the Palm Tungsten C in my pocket (with its 400 MHz CPU) is faster than the Cray 1.

f) Today there is a weak association between "supercomputer" and "single task" HPC (so Grids and compute farms of various sorts are somewhat excluded, probably BECAUSE of the top500 list and its insistence on parallel linpack-y sorts of stuff as the relevant measure of supercomputer performance). So Grids have emerged as a kind of cluster in their own right that isn't ordinarily viewed as a supercomputer although a Grid is essentially unbounded from above in terms of aggregate floating point capacity in a way that supercomputers are not. One could make a grid of all the top500 supercomputers, in fact...
Note that historically supercomputers are differentiated from other a-d class computers not by being "mainframe" or not, not by being vector processor based vs interconnected parallel multiprocessor based, not by its operating system, not even by its underlying computational paradigm (e.g. shared memory vs message passing), certainly not by its ABSOLUTE performance, but strictly by relative numerical performance. My Palm a decade ago would have been an export-restricted munition supercomputer, usable by rogue nations to simulate nuclear blasts and build WMD. Today it is a casual tool used by businessmen to check the web and email and remind them of appointments, while other munitions-quality systems are now toys, used by my kids to race virtual motorcycles around hyperrealistically rendered city streets. Talk about swords into plowshares...;-)

The exact multiplier between "ordinary computer" performance and supercomputer performance is of course not terribly sharp. Over the years, a factor of order ten has often sufficed. In the original beowulf project, aggregating 16 80486DX processors (at best a few hundred aggregate FLOPS, again, my Palm probably would beat it at a walk) was enough. Nowadays perhaps we are jaded, and only clusters consisting of hundreds or thousands of CPUs, instead of tens, are in the running. Maybe only the top500 systems are "supercomputers". Maybe the term itself is really obsolete, as fewer and fewer systems that are anything BUT a beowulf style cluster (even if it is assembled and sold as a big iron "single system" with its internal cluster CPUs and IPC network and memory model hidden by a custom designed operating system) appear in the HPC marketplace.

Still, I think most people still know what "supercomputer" means. In fact, when one looks over the current top500, it appears that it has >>almost<< become synonymous with the term "beowulf";-) But not (note well!)
with the term "grid", as grids aren't architected to excel at linpack, and a grid is very definitely not a beowulf. As far as I can tell, just about 100% of the top500 are clusters (COTS or otherwise) architected along the lines laid out by the beowulf project, with 95% of them having lots of scalar processors and the remaining 5% having lots of vector processors.

Unfortunately, the top500 (which I continue to think of as being almost totally useless for anything but advertising) doesn't present us with a clear picture of the operating systems or systems software architectures in place on most of the clusters. In fact, it provides remarkably little useful information except the name of the cluster manufacturer/integrator/reseller (imagine that;-). Two clusters on the list (#146 at Cornell and #233 in Korea) are explicitly indicated as running Windows. Looking over the general cluster hardware architectures and manufacturer/integrator/resellers, I would guess that linux is overwhelmingly dominant, followed by freebsd and other (proprietary) flavors of Unix, with WinXX quite possibly dead last.

Open source development is an evolutionary model, capable of paradigm shifts, far jumps in parametric space, and N^3 advantage in searching high dimensional spaces. Proprietary software development is by its nature a gradient search process, prone to optimizing in perpetuity around a slowly evolving local minimum, making long jumps only when it steals fully developed memetic patterns (such as the Internet, cluster computing, and many more) more often than not produced by evolutionary communities. To be fair, new patterns are sometimes introduced a priori by brilliant individuals without clear roots in open communities (e.g. "Turbo" compilers), although that is less common in recent years as the open source development process has itself evolved. The individuals only RARELY work for major corporations any more, and the corporations that are famous as idea factories -- e.g.
Bell Labs -- created internal "open" communities of their very own where the new ideas were incubated and exchanged and kicked around.

It's just a matter of mathematics, you see.

  Linux = mammal (sorry, Tux:-)
    Evolving at a stupendous speed (compare everything from kernel to complete distributions over the last decade)

  WinXX = Great White Shark
    Evolutionarily frozen, remarkably efficient at what it does, immensely yet curiously vulnerable...

</rant>

Well, that's enough rant for the day. I've GOT to get some actual work done...

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
