[Beowulf] OS for 64 bit AMD
Robert G. Brown rgb at phy.duke.edu
Mon Apr 4 08:22:13 PDT 2005
On Sun, 3 Apr 2005, Joe Landman wrote:

> > FC is not a platform, Linux is. I'd be most curious to hear the explanation
> > of how an app gets to be dependent on RHEL and will not work on other
> > distributions which conform to the same API. or are you claiming that
> > there is no ABI?
>
> <sigh> What has this got to do with FC being production grade? The ABI
> for FC has shifted. The ABI for RHEL-x has shifted, though in a defined
> manner, and this ABI will remain constant for a 5 year interval after
> RHEL-x release. FC-x will shift when needed. These shifts of FC ABI,
> the functionality changes, the kernel changes that fundamentally alter
> the way drivers work define the purpose of the environment... all these
> contribute to the overall view of whether FC is a production ready or
> not. If you don't need commercial apps, or better still, your
> commercial apps are supported on "Linux" and not on "RHEL", then it
> doesn't matter what the OS underlying it is. More to the point, if the
> OS does not break drivers with the upgrades, does not break major
> functionality at each upgrade, then it is probably a production class
> OS. FC-x isnt that. One can easily make the same argument about a
> certain OS from the northwest US (I always kick myself after an upgrade,
> as they introduce something new that almost, but not quite, works the
> same as it did before, and usually manages to break compatibility with
> other bits).

I've been following your discussion with great interest, as you both make excellent points. Let me add a few comments.

a) Picky point, but an "upgrade" vs an "update" has always meant that binary compatibility is broken for at least some things. Sometimes major things, like libc. Sometimes minor things like libwhatever. In addition there tends to be fairly significant motion in application-land and GUI-land.

b) 90% of this discussion is occurring because vendors of commercial linux software don't understand the concept of linux packaging and participating in a dynamic process. This is neither here nor there -- it doesn't get Joe off his hook -- but it is an apropos observation, because if companies that sold software (open source or not) actually learned how to build for linux and participate in the various distribution forums, a lot of this problem would go away.

c) There is always Centos -- logo-free RHEL, basically. IIRC it follows RHEL by what, a few hours? I think it is perfectly reasonable for companies that sell commercial software for linux to build it for a commercial distribution like RHEL, and if you are in an industry where you MUST have both somebody to call and a line of responsibility should things fail in certain ways (e.g. banking, certain parts of the health care industry, etc.) it is almost certainly both wise and legally necessary to buy RH.

If you are setting up a research cluster or departmental LAN at a University, though, and don't want to pay RH (without getting into whether what you pay per node is "reasonable") there is always Centos -- no direct phone support but all the stability, and we never use any direct support anyway. This is usually fair -- just because it is "enterprise linux" doesn't mean that it is bug free, and Universities STILL are primary debugging entities for RH, SuSE, Debian, FC, and everything else.

d) I think that one place where you (Joe) and you (Mark) most fundamentally disagree is over what "beta" means. TECHNICALLY, beta testing refers to a very specific pre-release phase in a commercial software development cycle. As in:

  Alpha: Testing during active development by development team to ensure that the product "works". There can be a first and second stage (known as "black box" testing).

  Beta: Sampled testing in the community where the software is to be used, usually on a "pre-release" basis.

  Pilot: Like a beta but usually after or in parallel with the beta phase and intended to see if the product has commercial potential or if it is still missing desirable features that would enhance its commercial potential.

  Gamma: This is a joke term for software that is released but is still full of bugs so that the hapless public becomes de facto "beta testers" after the formal beta phase. Every major release of Windows has, for a while after its appearance, very much looked like a gamma release.

(From one of several sources, e.g. http://en.wikipedia.org/wiki/Beta_testing#Alpha_testing )

Note that this cycle formally refers to commercial code development, where there is a well-defined "team" and an organization capable of supporting the various levels of testing, feedback, and ultimately commercial exploitation.

One "feature" of open source software is that in a very deep sense it is all, always, beta/gamma software and all users are, in a very deep sense, beta/gamma testers. This is true in the commercial software world as well; it's just that THEY don't always acknowledge it and are sometimes about as responsive as molasses when you call them and point out that their platform doesn't work on this graphics adapter or crashes randomly every time a certain subtask is initiated, hence the "gamma" joke.

Other companies (like pathscale) are very responsive and really "get" points b) and d) and support things as broadly as possible in order to keep their market as broad as possible. As a consumer, I'm a lot more likely to buy pathscale's compilers if I don't "have" to buy RHEL for my entire enterprise first. The same is true for all sorts of WinXX software -- I might well buy some of it for my linux boxes if I didn't have to buy WinXX (and all that implies) first.

To be very clear, FC is not a "beta" linux distribution any more than RH itself, SuSE, Debian, Mandrake are beta distributions, but since ALL linux distributions are composed of hundreds of packages in varying states of ongoing development, ALL linux distributions involve feature changes and bugs that are revealed as they are implemented in ever richer and more complex environments. Just like the rest of the software universe. The primary difference is that linux FIXES those bugs VERY RAPIDLY so that >>any<< linux distribution with an update mechanism such as yum rapidly becomes stable in production, just as ALL of them are somewhat unstable and break things with new features when they are first released.

So what you are arguing about is a matter of degree, not an absolute. It is silly to call Red Hat "stable" and Fedora "unstable", just as it is silly to argue that there aren't real differences in their rates of change and the longevity of their support cycles. As a consumer, you can choose a level of both that suits you and your needs.

e) On this same line, let's be very careful to differentiate between the terms "commercial distribution", "stable", and "rapidly changing". Commercial linux distributions such as Red Hat are neither necessarily stable nor slowly varying, but by implication they are in a lot of the discussion so far. To use RH as a historical example, the 7.3->8->9 sequence is one of the most striking in recent years. Major libc changes, lots of stuff broken, very rapid release cycle, lots of pissed off humans. Hence commercial distributions can change rapidly or not, depending on what reasons there are either way.

Note also that just because RH promises to support stuff for N years doesn't mean that their consumers will actually be satisfied with that cycle, or that other (commercial) linux distros will pick the same cycle.
In fact, the REASON to get long-running support for a particular release is that things are NOT stable or bug free -- bugs appear in libraries for years after a release first comes out, some of them serious or security-related. At some point, though, the world needs to just move on.

f) To amplify, "slowly varying" is very much an ambivalent "advantage" for an operating system and distribution. It is an open invitation to stagnation and laziness on the part of commercial developers and administrators alike. How often on this list do we hear of people who are STILL running RH 7.3 based clusters and wonder aloud at the lack of driver support or the inability to run on Opterons? What's the knee-jerk response? Upgrade to something modern. Things improve, they get better, more secure, faster, more powerful.

So what one is really arguing about (I hope) isn't that we should all be running RH 5.2 just because there exist vendors somewhere who never bothered to port their application(s) to 6.x, 7.x, 8, 9, RHEL (or any of the other distros). You laugh, but I've corresponded with people repeatedly over the years who are locked into one or another of those numbers by some silly application. These individuals are to be viewed with sympathy and unwilling tolerance, not praised for helping to hold the world back...

To conclude, once one separates commercial, stable, rapidly/slowly varying from some deep correlation as in "Debian is not commercial and hence is both unstable and not rapidly varying enough", "Fedora is Red Hat's Beta testing distribution and horribly unstable as it changes too rapidly", or "Red Hat is commercial and hence is stable and each release will still be around, supported, when my youngest kid goes off to college", THEN one can assess a particular situation instead of throwing sweeping generalizations out there that are obviously false for some important set of potential applications.

* Some Fedora releases have changed things enough to break some customers' systems. Fine, so have some Red Hat releases, some Mandrake releases, some Windows releases. It is too early to determine if this is a trend, and in any event if RH >>does<< adopt things piloted in Fedora -- ever -- it is just a matter of time before those same things "break" their clients' systems. The evolution and development of new features and tools is as much a reason FOR using Fedora as it is AGAINST it, in most environments but with some clear exceptions.

* There is a considerable "cost" to a major upgrade in any organization (and for any distribution, commercial or non-commercial). Things have to be rebuilt, that final gamma stage of testing occurs, deep bugs and incompatibilities are revealed, you have to deal with the commercial package problem. This cost is balanced against the cost of doing nothing and staying with a single snapshot of a single distribution forever. Millions of WinXX users can attest with every bug, crash, or virus to the sterility of choosing to do nothing (or having no real choice but to do nothing). Most sites choose a sane middle ground here that is comfortable for their level of expertise, application set, and other resources.

  In our case we use FC on most desktops as users LIKE getting new desktop features and FC (under active development) tends to stabilize very rapidly so that early teething problems are rapidly resolved and yum-updated across an organization without further human effort. We avoid the 6 month cycle by only upgrading the linux at duke distribution every other upgrade, which gives us a lot of time to get used to the new features. We do use Centos on fault-intolerant servers, and have it available for people with a commercial "requirement" for RH's libraries. We ALSO have SuSE available (for a price) or RHEL itself (also for a price) for people who want either the support or who have particular commercial packages with library dependencies.
  Not everything is built for RHEL.

* This latter mix clearly indicates that THERE IS NO "RIGHT" ANSWER to this debate. FC is clearly perfectly acceptable as an operating system for a cluster, a LAN of workstations, my laptop ;-), a server farm. RHEL or Centos are also perfectly acceptable. At the cluster level, they are nearly indistinguishably acceptable; for a LAN server they might have a small edge, not because FC cannot be made stable enough for servers but because it is a PITA to upgrade servers, and so it is a good place to stick with a release as long as it is supported and security patched and has a functional span of the (usually small) set of e.g. server programs such as nfsd or httpd required by the server and applications. And Debian is equally reasonable, as are Scientific Linux, Caosity, etc.

* So it would be great if arguments in absolute terms were softened just a bit. There are some circumstances where it makes sense to use RHEL or Centos or SuSE. The burning need to use a software package that will only run on one or the other of them is a great reason to use it. In other circumstances it makes more sense to use FC*, or Debian, or Scientific Linux, because it is free, >>BECAUSE<< it is rapidly evolving and you need to use one of its rapidly evolving features, because you like using distributions under rapid development so bugs are fixed quickly and community scrutiny is strong. It is no more fair to say that "Fedora Core Sucks" and should never be installed on a cluster (not true, use it all the time, works great) than it is to say that "Fedora Core is Perfect" and that there are never good reasons to use RHEL or Centos.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
