[Beowulf] OS for 64 bit AMD

Robert G. Brown rgb at phy.duke.edu
Wed Apr 6 12:11:08 PDT 2005

On Tue, 5 Apr 2005, Tony Travis wrote:

> Robert G. Brown wrote:
> > [...]
> > Ah, but the PROPER and ORIGINAL meaning of axiom is "unprovable
> > assumption" upon which a system of logic and discourse is built.  The
> > final half-definition is the only one that is actually correct.
>
> > In fact, I've got a whole essay entitled "Axioms" on my personal
> > website.  Seriously.
>
> Hello, Robert.
>
> OK, I bow to your greater wisdom: I have now learned that axioms are:
>
> "...unprovable assumptions whose truth and falsehood cannot be assessed"
> [http://www.phy.duke.edu/~rgb/Philosophy/axioms.php]
>
> What struck me was that both Mark and Joe used the word "axiom"
> appropriately according to at least one dictionary:
>
> 	http://www.thefreedictionary.com/axiom

Oh, I have much the same definition from Webster in the very work cited
above -- the interesting thing (and reason I brought it up) is that
"axiom" in everyday usage HAS come to mean something that is
self-evidently true, without question, when the Greek/Latin root meaning
is actually in some sense the opposite -- something that is not
self-evidently true, but that is assumed to be true literally for the
sake of argument.

From Webster's Revised Unabridged Dictionary (1913) [web1913]:

Axiom, n.-- L. axioma, Gr.; that which is thought
worthy, that which is assumed, a basis of demonstration, a
principle, fr.; to think worthy, fr.; worthy, weighing as
much as; cf.; to lead, drive, also to weigh so much: cf F.
axiome. See Agent.
1. (Logic and Math.) A self-evident and necessary truth, or a
proposition whose truth is so evident at first sight that
no reasoning or demonstration can make it plainer; a
proposition which it is necessary to take for granted; as,
"The whole is greater than a part;" "A thing can not,
at the same time, be and not be."

2. An established principle in some art or science, which,
though not a necessary truth, is universally received; as,
the axioms of political economy.

"That which is assumed", "That which is thought worthy" is a fair
description of meaning, and "the basis of a demonstration" (or
argument:-) better still, but Webster makes a common and real mistake in
calling an axiom a self-evident truth in Mathematics or Logic.  One
important enough to discuss in some detail even on a list like this one
where it is a bit off topic but where a lot of people participate in
reasoning and debate.  Axioms shape discourse, and it is very important
to understand that to avoid spinning one's wheels or missing new truths,
and you can't know too much formal mathematics;-)

In Webster's very example, "The whole is greater than the part"
precludes the existence of negative numbers or the entirety of vector
algebra (where |\vec{A} + \vec{B}| can easily be less than both
|\vec{A}| and |\vec{B}|, and is always <= |\vec{A}| + |\vec{B}| by the
triangle inequality).
Curved space geometry follows only when one drops the "self-evident"
axiom that parallel lines never meet.  Nowhere is it MORE evident that
an axiom is that which is assumed as the basis for a demonstration or
formal reasoning and >>not<< absolute or self-evident truth than in
mathematics and logic.  "It's axiomatic that..." is another way of saying
"For the sake of argument, let's both assume that..."

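A minimal numeric sketch of the vector case.  The two vectors are chosen
here purely for illustration and are not from the original discussion;
the helper names norm and add are likewise hypothetical.

```python
import math

def norm(v):
    """Euclidean length of a 2-D vector given as an (x, y) tuple."""
    return math.hypot(v[0], v[1])

def add(u, v):
    """Component-wise sum of two 2-D vectors."""
    return (u[0] + v[0], u[1] + v[1])

# Illustrative vectors (an assumption, not taken from the post):
# they point in opposite directions along the x axis.
A = (3.0, 0.0)
B = (-2.0, 0.0)

whole = norm(add(A, B))  # |A + B|

# The "whole" is shorter than either "part" ...
print(whole < norm(A), whole < norm(B))
# ... yet the triangle inequality |A + B| <= |A| + |B| still holds.
print(whole <= norm(A) + norm(B))
```

Any pair of vectors with a negative dot product would do as well; the
point is only that "the whole is greater than the part" is an assumption
that holds in some systems and fails in others.
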
However, the actual usage in the discussion was sense 2, where calling
something an "axiom" (or using something as if it were an axiom) is a
political and argumentative step intended to short-circuit the reasoning
process.  Self-evident truths such as "All men were created equal" are
manifest and obvious falsehoods by any objective measure (and beg at
least three or four serious questions to boot, as in "men" (versus women
and children and dogs and fish), "created" (as opposed to evolved,
appeared out of nowhere, are a figment of my solipsistic imagination),
and "equal" (in precisely what abstract and projective geometry, given
that this usage is clearly not the same as what is meant when I say 1 is
equal to 2*(1/2) or my weight is equal to 980 Newtons)).

This sort of "axiomatic" reasoning is a powerful tool in any debate.
Here one makes a statement that one hopes to have ACCEPTED as being
"axiomatically true" as this begs the question and precludes any
possible argument.  The entire discussion on the subject of e.g.
abortion is rife with this sort of "axiomatic" reasoning, and
unsurprisingly such debate is not only unresolved, it is unresolvable as
the participants fail to agree on their basic axioms and will inevitably
be led to different, equally "valid", conclusions.  Semantically, the
argument is that of a typical five year old -- "Is so." "Is not."  "Is
so."  Not terribly useful.

> I don't think their discussion was pointless: Actually, I found it
> interesting and relevant to my own situation running openMosix on a

I agree that it wasn't useless and was enjoying it, at least up to the
point where I was apparently making Joe feel bad (which was never my
intention, and I apologize, Joe:-), but one aspect of it WAS useless --
the issue of
whether or not FC is or is not a "beta distribution".  Whether you call
it an issue of axioms or definitions, it was (and still is) clear that
Mark and Joe (and I:-) >>agreed<< about what Fedora Core "is" -- an
actual linux distribution, with its own decisioning process, its own
testing process (including both alpha and beta phases), its own
community-based repair mechanism (whether or not one finds it to be
efficient, it is certainly there).  It quacks like a duck, likes water,
and its mature females lay eggs -- call it a duck.

Joe calls it a "rolling beta distribution" because one of several
undoubted benefits Red Hat gets from funding it is the ability to see
what works and what doesn't and only select from it what works for their
(necessarily) ultra-conservative Enterprise release.  Mark agrees that
Red Hat does get to select from it (and from all the other linuces in
existence, and from new or different packages that may not have appeared
in FC) for EL, but insists that this doesn't make FC a "rolling beta
distribution".

(Parenthetically, it is fair to say that the many people who work on or
with FC who do NOT work for or even with Red Hat per se might take
offense at hearing their work portrayed as a "rolling beta distribution
for Red Hat" used as a pejorative term, just as Debian people get tired
of hearing that Debian is always either way behind (for the stable
version) or unstable (surprise) for the development version and linux
people in general get REALLY tired by the old saw that Microsoft is
"supported" and linux is not.  I'm equally sure Joe never meant to
offend any of them, either -- he was just making and emphasizing a
point.)

This is the point where things do get silly.  Two people agree as to
what Fedora core "is" in terms of all its functional aspects, but argue
over what it is to be called, which distracts from the real point, which
is/was whether FC was a suitable distro to choose for use on a cluster
and whether any of the specific issues that e.g. Joe encountered with
particular packages were problems with FC per se or with those packages.

There, anybody who has been on the list more than a few days KNOWS that
the answer is going to be "it depends" and "YMMV"; a useful
>>discussion<< illuminates what it depends ON, so users can answer very
reasonable questions such as "ok, so I might have to rebuild a kernel in
order to use a kernel-tainting Nvidia driver and still use FC".  It is
also useful to hear where something will not work under any
circumstances, and where particular users have reported problems and
gotten less than satisfactory responses (such as Greg's experience).

Instead Joe calls it a "beta" in some very large, very general sense in
which the >>entire distribution<< is a beta, since it clearly >>has its
own<< beta releases, as do (in many cases) the individual >>packages<<
from which it is comprised, and calling something that is post-beta for
an actual, formal beta test a "beta" is clearly not "technically"
correct.  Clearly he means to imply that FC as a whole is not
acceptable, as beta software is almost by definition full of
undiscovered bugs and not ready for prime time.  Beta is being used as
a term of indictment.

Mark insists that this isn't technically correct (and he's right) -- a
post-beta product is not, in any accepted sense of the term, a beta
product.  Also, ALL post-beta releases can still be full of bugs
depending on how broad the beta phases were and how complex the product;
hence the "gamma release" joke/extension of the series.  Show me a major
linux distribution that never had a release that was a "beta" in the
sense that a ton of stuff in the HUNDREDS of packages distributed was
more than a bit broken or buggy, and I'll show you a major distro that
doesn't need an update stream.  (Any takers?  I thought not.)  That
still leaves open the IMPLICIT indictment and question -- "is FC ready
for prime time"? or "Is RH to blame for FC's particular bug stream"?

Fortunately, we all know what BOTH of them mean, and fully realize BOTH
that RH gains various benefits from the de facto testing that occurs in
FC and that FC is not actually a beta for RHEL in anything like the
sense that the term is used in an actual software development cycle.  FC
>>has<< a beta phase, and its "gamma" phase >>does<< further refine the
release because bugs do get fixed however responsive or unresponsive the
formal FC "community support" mechanism is.  That doesn't save RH from
having to run an actual beta on their own EL products, though.  Let me
be clear in a way I hope everybody can agree on.  RHEL is most
definitely NOT just some late snapshot of a given FC release rebranded,
and that's precisely what it would have to be in order to make Joe's
assertion technically correct.

Whole releases of FC occur in between RHEL releases, and frankly with
only FC 1-3 under our belts it is really difficult for me to accept ANY
useful assertion concerning whether FC is going to be "stable" or
"unstable" or conservative or radical in the long run.  FC 1 was,
indeed, pretty much a beta release, but I personally have found FC 2 to
be every bit as stable/functional as any given RH x.y <= 9 ever was and
far MORE stable and useful than a number of memorable specific releases.
FC 3 I'm undecided about -- it works well enough but also has enough
differences that I'm a bit uncomfortable with the rate of change,
especially in X.

However, as always, YMMV, caveat emptor, maybe it sucks for you and not
for me, maybe it sucks for me and I'm too stupid to notice it.

By this standard Debian is just a "rolling beta for RHEL", as is SuSE,
as I'd bet significant amounts of money that packages that are primarily
maintained by Debian and SuSE developers, or by independent developers
(e.g. Ximian or the mozilla folks or the gnu folks), make it into any
given EL release without first passing through FC (and whole GENERATIONS
of FC occur without making it into EL).  They (like the FC packages) are
doubtless passed through their very own alpha and beta cycles by their
primary developers followed by a fairly standard process of repackaging
for EL specifically and subsequent alpha and beta testing all over.  The
most FC "buys" RH is MAYBE a simpler and more reliable port into EL and
fewer problems during the beta for certain overlapping packages, which I
actually think is a perfectly reasonable tradeoff and a good deal for
everybody concerned but is less than sufficient to convince me that this
makes FC automatically "unstable" or "unreliable" or brandable as a
"beta distribution".  Only time will prove one way or the other, and any
such conclusion will remain subject to change again over still more
time.

> interesting and relevant to my own situation running openMosix on a
> 64-node RH-9 based Beowulf cluster. I had a difficult time trying to
> upgrade our cluster from RH-8.0 to FC-2 because the Adaptec Ultra160
> drivers in FC-2 were broken, so I went to RH-9 instead.

That probably should read "were broken in the particular kernel snapshot
I tried to install".  Google fairly quickly turns up a few hits on this
problem (but not many) and several suggestions on how to proceed.  This
seems like something that would have rapidly and long since been
resolved, given the large number of adaptec users out there.  Still, I
have definitely seen problems (notably a broken USB subsystem) within
some of the FC kernel snapshots.  This doesn't make FC "broken" wrt
EL -- I've had BIGGER problems dealing with EL's broken/out of date
libraries in my own numerical code.  An old GSL alone is a show stopper
for HPC applications in my personal opinion.  RH has clearly interpreted
"stable" as meaning "not to be changed even in clearly positive ways"
(that is, "stagnant") in my opinion.  Fine for banks, fine for servers,
not so good for desktops or clusters expected to run a rapidly changing
mix of applications.

Note also that the adaptec problem is a linux kernel issue, not one with
FC per se.  I personally helped work on the linux adaptec drivers some
years ago when we got a bunch of brand new systems with unsupported
hardware integrated with the motherboards and assure you that the issues
are nontrivial (among other things, I recall that the core driver code
was shared between freebsd and linux and at the time adaptec itself did
not formally support linux, in ADDITION to the actual driver/kernel
issues that ALONE were nontrivial).  So it wouldn't surprise me in the
least if there were issues with any given driver in any given kernel
release and by extension any given distribution release that
inadvertently used that kernel release and hence driver.

Over the years and many, many revisions of various linux distributions
there is no doubt in my mind -- some kernel snapshots are broken on at
least some hardware combinations, and some of the broken snapshots make
it into distribution releases or updates.  Most are not COMPLETELY
broken (many would pass a beta, or the issues are with hardware that
isn't -- if one bothers to read the release notes -- technically on the
"supported" list), and most -- but not all -- get fixed fairly rapidly.
Still, sometimes one has to actually PARTICIPATE in the open source
kernel development process, or build from scratch a kernel/driver from
somebody else who does, even in these best of times, especially if your
hardware is new or a major change in the kernel source has broken
something.

Such problems generally transcend distribution.  RH and EL are in the
same boat as everybody else (Debian, SuSE, Mandrake, whoever) -- not
even an extensive beta has a ghost of a chance at revealing every
possible flaw in something as complex (and multiauthored) as a modern
linux kernel.  Note also that using things like RH-9 and its more or
less frozen kernels simply avoids one problem at the expense of others
and has a variety of negative sequelae in the long term -- lack of
support for X86_64, frozen-in bugs and security problems, lack of driver
support.  We live in an imperfect world, but in >>this<< imperfect world
you aren't helpless and have a number of choices.  FC is not a perfect
choice, but it is a perfectly reasonable choice (in my opinion) for many
users and environments (including cluster environments), while not for
others.

rgb

--
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu