[Beowulf] SGI to offer Windows on clusters

Robert G. Brown rgb at phy.duke.edu
Mon Apr 16 13:35:27 PDT 2007

On Mon, 16 Apr 2007, Joshua Baker-LePain wrote:

> On Mon, 16 Apr 2007 at 2:12pm, Robert G. Brown wrote
>> Try installing two year old Centos AT ALL on six-month-old hardware, and
>> I think that there is a very high probability that it will require a
>> much larger investment in time backporting kernels and worse.
> I think you underestimate the amount of driver back-porting RH puts into 
> point releases.  When I first got my dual woodcrest compute nodes, the 
> current CentOS point release (4.4) worked almost perfectly on them (a network 
> driver upgrade removed the "almost" from that statement).  Are the app 
> libraries out of date compared to Fedora?  Sure.  Are you more likely to have 
> success at this with "server" hardware than more desktop oriented hardware? 
> Sure.  But the point is that RH does roll a lot of new hardware support into 
> their enterprise distro as it ages.

And the other point is that it isn't just driver backporting that
matters.  There is chipset support and more.  There is HAL.  The problem
is especially pernicious (and visible) on laptops, but I've encountered
it multiple times on desktops as well.  It isn't just what they fix in
the update stream -- you still have to be able to INSTALL on the system,
and it is a wee bit difficult to install CentOS on a system when the
UNchanged install image kernel lacks the right chipset support and
network drivers.  Yes, I've wasted days trying to find a combination
that would make it work.  Yes, I've been burned multiple times.  Even if
you can manage to get a boot to work, in some cases "stability" is a
stochastic thing and a system will run for a while and then die when it
finally does the wrong thing.

This is directly associated with the wish for a minimal install -- I
have a system sitting upstairs right now that thinks that it has a
problem with a library that a) I've updated repeatedly so that I'm
certain that the image I'm installing from matches the one on Duke's
mirrors; b) I couldn't care less about anyway -- if I could easily figure
out what it is that thinks that it needs it I'd eliminate the toplevel
package and get the system to install.  IF any of the RH-derived distros
simply installed a barebones minimal core -- enough to bootstrap the
rest of a yum-based install in "phase two" from a standard package list
-- then my install wouldn't fail with a nasty message saying I had to
start all over again every single time I retry.  It would SUCCEED in
installing a minimal core, reboot into phase two, and kick me out into a
shell when it encountered a supposed dependency issue with the repo.  If
it were really really nicely set up, it would let me select the yum
option for installing everything that it can, and telling me what it
can't get through, and would even write the latter out into a nice
little package list upon which I could invoke yum (or not) to finish it
all off WITH A FULLY FUNCTIONAL SYSTEM at some later time when I figured
out what is wrong.
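
As a rough illustration of the "phase two" behavior described above, here
is a minimal sketch in Python.  It is not how any existing installer
works.  The function name, the default file name, and the `try_install`
callback are all hypothetical; the callback stands in for whatever
actually installs one package and reports success or failure.

```python
from typing import Callable, Iterable, List, Tuple


def phase_two_install(
    packages: Iterable[str],
    try_install: Callable[[str], bool],
    deferred_path: str = "deferred-packages.txt",
) -> Tuple[List[str], List[str]]:
    """Install every package that will install; record the rest for later.

    try_install is a hypothetical callback: it returns True if the package
    installed cleanly and False if it hit a dependency or repo problem.
    """
    installed: List[str] = []
    deferred: List[str] = []
    for name in packages:
        if try_install(name):
            installed.append(name)
        else:
            # Do not abort the whole install; just remember the failure.
            deferred.append(name)

    # Write the failures out as a plain package list, one name per line,
    # so they can be retried once the underlying problem is understood.
    with open(deferred_path, "w") as handle:
        for name in deferred:
            handle.write(name + "\n")
    return installed, deferred


if __name__ == "__main__":
    # Made-up package names; "brokenlib" plays the part of the bad library.
    done, later = phase_two_install(
        ["bash", "brokenlib", "vim"],
        lambda name: name != "brokenlib",
    )
    print("installed:", done)
    print("deferred:", later)
```

The point of the design is only that one bad package costs one line in
the deferred list rather than the whole install.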

The main reason I'm whining online like this is that I really see things
going in a really bad, really wrong way fast here.  Every new release of
FC or RH is bigger and more complex.  Very little intelligent effort
seems to have been expended in hierarchical decompositions of this
increasingly vast system and application space -- even into just this:
"systems" vs "application" space -- and as a consequence certain already
annoying but not quite critical problems are about to be magnified.

The sad thing is that this sort of hierarchical decomposition is the
thing that has forever given Unix its strength.  We've been talking
about it for days now.  Unix builds big complex tools out of many small
simple tools.  Unix builds big complex applications on top of many small
simple reusable shared libraries, minimally adding to what is already
there instead of trying to write hundred thousand code line monolithic
applications that have to "work all at once".  ABIs abound.  APIs
abound.  Interface standards abound.  Standard of practice is to not
reinvent wheels, to reuse code, to comply with standards, all of which
makes code relatively easy to write and debug because when I write:

    printf("Hello, world\n");

I generally don't have to debug "printf()".  Packages were INVENTED to
extend this sort of decomposition across whole functional sets of
applications, but over time the decomposition planes have been blurred,
mixed, twisted, as it has always been "easy" to just add a few more
packages to a set that all have to be rebuilt and tested as a monolithic
"distribution" anyway.

Well, it's about to stop being easy.  It may stop being POSSIBLE.  500
packages was easy.  1500 packages wasn't easy, but with enough effort it
was doable.  6000+ packages is doable only because it is effectively
decomposed into 1500+4500 or thereabouts.  7000, 9000, 11000 packages as
a single set is going to be a living nightmare.  FC "as a whole" will
simply "never" work for the whole package set in its lifetime, and
nobody will know what bugs and dependencies really need to be fixed
first as they are twined throughout the entire huge space.

This is something that I think that the Debian developers have done much
better with, and it is the only way they can manage the close to 20K
packages that make up "greater" Debian.

Whine, whine, whine.  We (in Linux) need to take a giant step BACKWARDS
and build a rock-solid common "a"-core, establish at least the "b"-level
decomposition that is likely to be common to all Linux distros, and then
build distro-specificity and divergence at the "c" and "d" levels only.
I'd like to see this so much so that one could install an ab-core system
using (e.g.) FC and then complete the install using Debian packages or
vice versa and have everything work "perfectly" as far as the core
interactions were concerned -- they can break like hell if you want at
the d-level packaging and dependency resolution, but ab-level
dependencies need to "by definition" be a solid, self-contained base
that one can stabilize and thereafter just not (easily) break.
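
One way to read the "self-contained base" requirement above is that no
package may depend on anything in a less fundamental level than its own.
The sketch below checks that rule in Python.  It is an illustration under
that assumption only: the level letters follow the a/b/c/d scheme in the
text, but the function, the data layout, and every package name are
hypothetical.

```python
from typing import Dict, List, Tuple

LEVEL_ORDER = "abcd"  # "a" is the core; "d" is the outermost layer


def layering_violations(
    level_of: Dict[str, str],
    depends_on: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (package, dependency) pairs that break the layering.

    A pair is a violation when the dependency sits in a higher (less
    fundamental) level than the package that requires it.
    """
    violations: List[Tuple[str, str]] = []
    for package, deps in sorted(depends_on.items()):
        package_rank = LEVEL_ORDER.index(level_of[package])
        for dep in deps:
            if LEVEL_ORDER.index(level_of[dep]) > package_rank:
                violations.append((package, dep))
    return violations


if __name__ == "__main__":
    # Made-up packages and level assignments, for illustration only.
    level_of = {
        "base-libc": "a",
        "pkg-tool": "b",
        "gui-toolkit": "c",
        "office-suite": "d",
    }
    depends_on = {
        "pkg-tool": ["base-libc", "office-suite"],  # b-level needing d-level
        "gui-toolkit": ["base-libc"],
        "office-suite": ["gui-toolkit"],
    }
    print(layering_violations(level_of, depends_on))
```

If a check like this came back empty for the a and b levels, the ab-core
could be built, tested, and frozen without reference to anything above it.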



Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
