[Beowulf] OS for 64 bit AMD
Bob Drzyzgula
bob at drzyzgula.org
Sun Apr 3 20:34:47 PDT 2005

I've been following this discussion, and I just wanted to
throw in my $0.02 on a couple of points:

* I think that it's possibly a bit disingenuous to
focus on the rapid cycling of FC-x releases. Perhaps
someone will correct me, but I wasn't aware that the
interfaces present in e.g. FC-2 have ever changed
all that much. AFAIK the big changes happen at release
boundaries, e.g. FC-2 to FC-3, like SELinux changing from
default off to default on, and even then an upgraded
machine will be less dramatically changed than one that
had its drive wiped before installing the new OS.

What is much more important in a true "production"
environment is the length of time one can expect to
obtain patches for the OS. No "production shop" that
really is running a "production application" is likely
to be replacing the OS on anything like the kind of
schedule that FC-x -- or even RHEL -- releases come
out. They are much more likely to qualify all their
applications on a specific OS release, move this new
image -- OS + applications -- into production, and run
it until there is some compelling reason to change,
and this compelling reason can be several years in
coming. Even OS patches would only be applied in
limited circumstances. These would be (a) to remedy a
locally-observed failure mode, (b) to support required
application updates, or (c) to address specific security
issues. In all cases except in the most severe security
problems, such patches would be applied after extensive
testing to verify that production activities would not
be affected.

Now, in principle there is no real reason why --
vendor support notwithstanding -- a production shop
could not be set up to run on e.g. FC-3. However, the
disappearance of the official patch stream after a few
months would, or at least should, give one pause. Of
course there is Fedora Legacy, and one can always
patch the RPMs oneself. But it all starts to get
pretty tenuous and labor-intensive after a while. By
contrast, Red Hat is promising update support for each
RHEL version for at least five years after release. *This*,
not the release cycle, is why production shops -- and
their application vendors -- will prefer RHEL over
FC-x. It really doesn't (or shouldn't) make a damn
bit of difference to a production shop how the OS is
characterized: "beta", "proving ground", "enterprise",
whatever. What really matters is the promises that are
made with respect to out-year support.

That being said, the product's longevity is a bit of
a double-edged sword. To the extent that any part of
a system's user base needs to move on -- to develop
and/or implement new applications -- the age of the OS
you are running can come back to hurt you. The latest
versions of your software or hardware may simply not
work with your rickety old OS. But this falls into the
category of "compelling reasons to change" as I said
above. Change control is a perpetual balancing act,
but that just makes the long update life that much more
important -- the last thing any production shop needs
is another reason to have to change.

This, of course, is how one finds out that one is not
really a "production shop", after all -- when the demand
for the latest and greatest is constantly trumping the
"production" applications' inertial pull in the other
direction. RHEL can suck pretty bad in a research
environment, where you are likely to wind up with half
of the RH-supplied packages supplemented with your own
builds of more recent stuff piling up in /usr/local.

* I get a bit frustrated at the hostility toward
commercial applications and closed hardware, especially
to the extent that it gets directed toward the customers
of those products. If there existed an open replacement
for SAS, for example, I can say without hesitation that
we would be using it. Hell, if there was a *commercial*
replacement for SAS, we'd probably be using it. There
simply isn't -- there isn't even anything close. Same
thing with Matlab [1] or Gauss. Yes, there are Scilab
and Octave, but those only implement the bulk of the
core functionality of Matlab. The Matlab toolboxes
are unique even in the world of commercial software.

If you have a choice between solving a problem today or
spending months writing the tool to solve the problem,
the decision will most likely be based on (a) how much
it would cost to develop the tool, plus the cost of not
having the solution for months (a cost that, absent
extensive analysis, is unlikely to be expressed in
monetary units), and (b) what it would cost to have the
tool today.

For many commercial tools, each side of this question
will be represented by large classes of problems
and circumstances. To the extent that organizations
commonly find that it would be both tolerable and more
cost-effective to wait for a locally-developed tool to
solve a particular problem, we are much more likely to
have an open-source tool available (Apache, anyone?) to
solve that sort of problem. But in the case of SAS,
for example, it appears that the people who find it
practical to build a replacement tool either don't find
it effective to release it as open source, don't find
it practical to build in generally-applicable form,
or simply don't exist.

The only other approach is, of course, to find a
different problem to solve, one that can be solved with
existing, free tools. I suspect that this often happens
in academia, but it is rarely practical in business
or government.

The same goes for closed hardware. I don't much
care about high-end graphics cards, but storage
is a big issue. I've recently been looking for new
storage for a sizable network, and am finding that the
option of affordable external, high-speed (FC class)
RAID controllers serving up generic, high-speed,
high-reliability (i.e. not SATA) disk, has pretty much
vanished from the market over the past year or so. As
has been mentioned, everyone wants you to use their
JBODs, their disk modules, and in some cases their
HBAs and closed-source drivers. And they want you to
pay dearly for it. I hardly find this acceptable, but
I honestly don't know what else to do except to decide
that capacity, throughput, reliability, availability and
manageability just aren't that important after all.

--Bob Drzyzgula

[1] Matlab is actually a poor example for this discussion
in that, to their credit, Mathworks in fact only
requires, beyond a 2.4 or 2.6 kernel, a specific glibc
version: 2.3.2.
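
As a rough illustration of the requirement in [1], the following
Python sketch checks a kernel release string and a glibc version
string against it. The helper names parse_version and
meets_requirements are hypothetical, and treating 2.3.2 as an exact
match rather than a minimum is an assumption; the footnote does not
say which was intended.

```python
import platform


def parse_version(text):
    """Return the leading dotted integers of a version string.

    For example '2.6.9-5.EL' becomes (2, 6, 9); anything after the
    first non-numeric character is ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            # Trailing vendor suffix such as '-5' ends the numeric part.
            break
    return tuple(parts)


def meets_requirements(kernel_release, glibc_version):
    """Check for a 2.4 or 2.6 kernel and glibc 2.3.2.

    Assumption: the glibc version must match exactly.
    """
    kernel_ok = parse_version(kernel_release)[:2] in ((2, 4), (2, 6))
    glibc_ok = parse_version(glibc_version) == (2, 3, 2)
    return kernel_ok and glibc_ok


if __name__ == "__main__":
    # platform.libc_ver() may return empty strings on non-glibc systems.
    libc_name, libc_version = platform.libc_ver()
    kernel = platform.release()
    print(kernel, libc_name, libc_version,
          meets_requirements(kernel, libc_version))
```

A check this small is about all the qualification a vendor with such
narrow requirements asks of the OS, which is the point of the footnote.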