[Beowulf] SGI to offer Windows on clusters

Robert G. Brown rgb at phy.duke.edu
Mon Apr 16 07:49:24 PDT 2007

On Sun, 15 Apr 2007, Ed Hill wrote:

> From a packaging (not a user) perspective there are a number of ways
> that merging Core+Extras was/is a big improvement.  Dependencies
> between packages (Core items could not depend on Extras) was, for
> instance, an annoying problem that now vanishes.

Fair enough, although pretty much by definition core items should not
ever HAVE to depend on extras -- it is what makes the core the core.

I've always liked the idea of the core remaining a VERY marginal set
that is pretty much "just enough" to bootstrap an install.  One of the
things that from time immemorial has bugged me about the red hat install
process is its complete lack of robustness, so that if for any reason it
fails in midstream, one pretty much has to start over.  This has always
been pretty silly.  The correct way for the install to have proceeded,
especially post yum, is for a minimal diskful installation of "the core"
to take place almost immediately, leaving the system bootable.  Then to
use yum OUTSIDE of the basic install mode to complete the installation,
because a yum install on a package list is essentially restartable.
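
A minimal sketch of why an install driven by a package list is restartable.
It is an illustration only, not how anaconda or yum are implemented:
install_one is a hypothetical stand-in for whatever installs one package,
and the package names are made up.  A pass skips anything already
installed, so rerunning after a failure only retries what is missing.

```python
def install_remaining(packages, installed, install_one):
    """Install every package not yet in `installed`; safe to rerun.

    `install_one` is a hypothetical callable that installs one package and
    raises OSError on failure (dropped connection, bad mirror, ...).
    Returns the list of packages that still need installing.
    """
    failed = []
    for pkg in packages:
        if pkg in installed:
            continue  # work done by an earlier, interrupted run is kept
        try:
            install_one(pkg)
        except OSError:
            failed.append(pkg)  # leave it for the next pass
        else:
            installed.add(pkg)
    return failed


def flaky_installer(fail_once):
    """Build a fake installer that fails the first time for some packages."""
    pending_failures = set(fail_once)

    def install_one(pkg):
        if pkg in pending_failures:
            pending_failures.discard(pkg)
            raise OSError("connection reset while fetching " + pkg)

    return install_one


if __name__ == "__main__":
    wanted = ["xorg", "gnome-desktop", "emacs", "gcc"]
    installed = {"xorg"}  # pretend the bootstrap install got this far
    installer = flaky_installer(fail_once={"emacs"})
    passes = 0
    while install_remaining(wanted, installed, installer):
        passes += 1  # rerun: only the failed packages are retried
    print(sorted(installed), "extra passes:", passes)
```

The contrast is with an all-or-nothing installer, where a failure on one
package throws away the progress already made on the others.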

Yes, one "can" with some effort do this by hand, using e.g. kickstart
(it is pretty difficult from the graphical installer, actually) but it
is a PITA because it isn't designed to work that way, it is designed for
you to select e.g. a "gnome workstation" from a checkbox list and do it
all.  On a fast, reliable LAN connection to a fast, reliable server
installing on known compatible, reliable hardware of course this works
just fine.  Over a relatively slow DSL link connecting to a heavily
loaded public server that is apt to reject a connection in midstream
onto hardware that may (gasp) have some bugs that get tweaked --
especially in the install kernel which is not necessarily the updated
kernel that actually works with the hardware once the install and update
are finished -- well, let's just say that I find myself even now cursing
and pulling out hair.

And then there is the stupidity of having to do an install and THEN a
yum update.  Using yum for all installation past a basic bootstrap
install and using the full repo set INCLUDING updates, one can actually
just install the up-to-date versions of all packages.  I know that this
is where everything is moving, but RPM-based installs cannot get to a
two-step (real "core" plus a post-install "everything else") fast enough
from my point of view.

So yes, it does worry me a bit that there will "just" be a core repo.
This flattens dependency resolution, sure, but by eliminating any sort
of functional/library groupings of packages it actually makes
maintenance of the entire suite a monolithic nightmare from the point of
view of debugging.  For example, I personally could easily see all of X
-- libraries, modules, and the basic X-based packages that depend only
on core libraries to build -- going into a repo all by itself.  In the
development process, one then a) updates core; b) ports/patches X on top
of "just core"; c) ports/patches anything else with mixed dependencies
built on top of "both core and X and this and that".  Having a
functional separation at the level of API and library is a good thing.

Whether or not this "has" to be done at the repo level, well, obviously
not.  Groups could do it.  But there are two aspects of groups -- one is
the human oriented grouping of items by human relevant category --
games, scientific, office, whatever.  Then there is the system oriented
grouping by dependency trees.  Those trees "should" be organized in such
a way that there are at least a few points of clear separation -- core
should be the root or primary trunk of that tree with NO dependencies
EVER on anything not in core (where core is minimal, NOT monolithic
which is yes the other way to ensure that this is true).  As one moves
up the trunk, there are a few other fairly clear forking points --
X-based software for example all comes back to an "X core".

Eventually -- possibly even fairly quickly -- one gets to levels where
the packages will have mixed dependencies across several of these second
level "cores" or provide library support to many different kinds of
application and clear separation is no longer possible, but it does seem
very useful to maintain this separation to the highest functional levels
possible to orthogonalize and decompose the global debugging problem to
some sort of minimal level of possible circular reference.
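
The two properties described above (core never depends on anything
outside core, and packages stack up in levels above it) can be checked
mechanically.  The following is a rough sketch under stated assumptions:
the dependency data is a toy dict invented for illustration, the function
names are hypothetical, and the graph is assumed to be acyclic (graphlib
raises CycleError otherwise).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def core_violations(requires, core):
    """Return (package, dependency) pairs where a core package needs non-core."""
    return sorted(
        (pkg, dep)
        for pkg in core
        for dep in requires.get(pkg, ())
        if dep not in core
    )


def layers(requires):
    """Assign each package a layer: 0 for no dependencies, else 1 + deepest dep.

    Raises graphlib.CycleError if the dependency graph is circular.
    """
    level = {}
    # static_order() yields every package after all of its dependencies.
    for pkg in TopologicalSorter(requires).static_order():
        deps = requires.get(pkg, ())
        level[pkg] = 1 + max((level[d] for d in deps), default=-1)
    return level


if __name__ == "__main__":
    # Toy data, not taken from any real distribution.
    requires = {
        "glibc": set(),
        "bash": {"glibc"},
        "libX11": {"glibc"},
        "xterm": {"libX11", "glibc"},
    }
    print(core_violations(requires, core={"glibc", "bash"}))
    print(layers(requires))
```

A non-empty result from core_violations is exactly the situation the
paragraph above says should never occur; the layer numbers give one crude
way to look for the "forking points" such as an X core.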

So it seems to me that there is a need for two distinct kinds of
grouping, only one of which can conveniently be accommodated by "package
groups".  Maybe multiple repos aren't a good way of separating out the
required functionality, although I personally think that if "scientific
linux" had been from the beginning NOT a distro but just a yum repo
built and maintained on TOP of a distro -- or better yet, packaged up to
be built and delivered on top of several distros -- it would have been
finished and in universal use years ago.  "Extras" isn't a great
solution either as it is already way too BIG -- recall that I started by
making this observation and the problem that caused me to make it isn't
really made better by making the repo the "extras" packages are in even
bigger.

The problem is inheritable.  Does unifying "everything" with "core" in
FC mean that "everything" will be unified in the next RHEL release built
from FC?  If so, then the cost of doing QA and support for RHEL
increases exponentially or worse with the number of packages added (just
as it really does with FC according to the argument above).  Maintaining
a logical and physical separation of "scientific linux" as an RPM repo
built >>on top of<< FC >>or<< Centos >>or<< RHEL >>or<< (the rpm-based
distro of your choice) means that one can safely and reliably install
RHEL or Centos on a cluster or LAN and then add SL on top via yum with a
neat isolation of the associated problems that may or may not arise -- a
broken application for example.  It makes it very easy to back off to a
functional core, to reinstall to a functional core, to determine where
the problem lies, and to be able to fix it.  Games are another obvious
example of something that can and should be in a repo all their own -- a
layer that NOTHING in the core should EVER depend on.  Office packages
ditto -- I love open office, but I definitely don't want something in
the actual core required to make e.g. a beowulf cluster node function to
depend on an open-office supplied library and again making "Office
Linux" into a repo and package suite would focus development attention
on what is there in a very appropriate way.

By keeping any or all of these things "separated" from the defended core
on the basis of library dependency decomposition AND function one makes
it easy to build and maintain systems based on modular package
selections.  I'm very concerned that flattening everything out and
relying on package groups on the one hand and internal dependency
resolution on the other (as has been done for many years now) to provide
functional decomposition in more than two dimensions will simply
perpetuate several problems that have existed for many years and that
plague RPM-based system users and managers.

To understand this problem and where it is headed, it might be really
useful to use rpm tools to map out the complete dependency matrix for
e.g. FC 6 and note how it has grown relative to e.g. FC 4 and FC 5.  If
one maps it out hierarchically looking for optimal decomposition points
and decomposition planes in the multivariate space thus represented it
would be even more useful.  With this in hand, one could then very
likely at least anticipate what hierarchical additions might be required
to accommodate the otherwise rapidly diverging complexity without an
attendant divergence in distributed management cost.
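
As a hedged sketch of what such a mapping might compute: given direct
package-to-package requirements, take the transitive closure and count how
many packages sit downstream of each one.  The input dict here is toy
data.  On a real system it could presumably be assembled from the output
of rpm -qa, rpm -qR and rpm -q --whatprovides, but that step is not shown
and the function names are hypothetical.

```python
def transitive_requires(requires):
    """Map each package to everything it needs, directly or indirectly."""
    closure = {}
    for pkg in requires:
        seen, stack = set(), list(requires[pkg])
        while stack:
            dep = stack.pop()
            if dep not in seen:  # the seen set also makes cycles harmless
                seen.add(dep)
                stack.extend(requires.get(dep, ()))
        closure[pkg] = seen
    return closure


def dependents_count(closure):
    """Count how many packages depend, directly or not, on each package."""
    counts = {pkg: 0 for pkg in closure}
    for deps in closure.values():
        for dep in deps:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


if __name__ == "__main__":
    # Toy data, not taken from any real distribution.
    requires = {"glibc": [], "libX11": ["glibc"], "xterm": ["libX11"]}
    closure = transitive_requires(requires)
    print(dependents_count(closure))
    # One crude size measure to compare across releases:
    print(sum(len(deps) for deps in closure.values()))
```

Running the same measurement on the package sets of successive releases
would give a first, rough number for how fast the dependency space is
growing, and packages with very high dependent counts are candidates for
the decomposition points discussed above.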

> But getting to your point about package segregation into named repos
> for end-user manageability -- I think I see what you want.  Perhaps
> it is something that can be better handled by improving the package
> "groups/categories" (aka "Comps") situation?  It's a topic that has been
> discussed within Fedora and it will hopefully get more attention (and
> better tools) as the total number of packages grows.

It needs more attention quite rapidly.  The complexity of the dependency
tree is (I suspect) highly nonlinear as a function of its size -- I'd
guess greater than exponential, although it is still inheriting benefits
from the de facto decompositions built into it by its development
history and the economic constraints inherent therein.  Add ten thousand
packages to what WAS the core -- which is obviously one of those de
facto decomposition points (with several others implicit therein) and
flatten everything and the full force of that complexity will rapidly
come out as a maintainer of a single package in that set will have 9999
ways to wreak dependency havoc and create deeply hidden bugs with no
obvious "owner".

My concern may be silly, of course.  The complexity problem will be
under considerable economic pressure to self-organize into functional
decompositions that keep some sort of lid on the otherwise enabled
(mathematical) catastrophe and sane, intelligent humans will probably
find a way to muddle through.  It does seem to me that it might be wiser
to do some numerical/statistical/mathematical studies of the
decomposition problem and ANTICIPATE the catastrophe and think now about
ways of doing better than just "muddling through".  I'd think that Red
Hat itself would pretty much demand this sort of foresight as a
condition of supporting FC development, being as how they will
willy-nilly inherit the real economic burden of maintenance problems
created by n-tupling (with n>4) the number of packages "in" RHEL in two
more years.

Of course they won't -- they'll do their OWN line-drawing and
decomposition right back into RHEL and an "extras" that consists of all
the FC packages that they don't want to be directly responsible for
supporting, but it does seem to me to be highly desirable to INTEGRATE
this sort of decomposition now rather than impose it a posteriori onto a
dependency space that will rapidly mix once the "requirement" that the
core build independently is relaxed and extended over the full FC
package space.

> If you have a desire to improve the situation the best place to start
> is:
>  http://fedoraproject.org/wiki/
> and volunteer to help with some aspect.

I'm a bit overcommitted -- I'm trying to get three or four distinct
applications I have written and personally support ready to go INTO FC
extras (which is not trivial because of the various disciplines this
requires!) AND I teach full time (and a half, this semester) AND I do a
bit of consulting AND I write far too much on this list AND I try to
have a life that doesn't involve JUST typing at my keyboard with
decreasing success.  So I'd much rather just predict doom and gloom in a
public forum of the possible consequences of unifying and flattening a
rather complex dependency space without thinking first about the
problems of dependency decomposition planes in an abstract space of
rather high dimensionality and the economics of hierarchical
decomposition of the associated debugging and maintenance problems.

That way if people have actually thought about these problems and have a
solution or have concrete empirical reasons to believe that they won't
become problems after all (rgb, you ignorant slut!) all is well and
good.  If not, well, maybe they WILL think about them and either
reconsider or deliberately engineer a solution to them that can be
expected to scale nicely up to 20+ kpkgs and eventually beyond.

This is serious business.  I can without even trying name a half dozen
times in the past that people built de facto scaling limits into
operating systems that would "never" be exceeded that were of course
exceeded.  We have been discussing the 32 bit problem.  Then there is
the Unix clock problem.  There are all sorts of places inside the Linux
kernel that went from being unsigned ints to being unsigned long long
ints (32 bit ints to 64 bit ints, at any rate) because yes, one CAN
receive more than 2^32 packets on a network between boots etc.  There is
the famous 10-bit boot address problem.  There are the limits in PID
space and UID space (which at one point were signed 16 bit ints and
probably still are AFAICT looking over PIDs returned by ps aux).  There
are limits in the number of open sockets and device handles and more.
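
Two of these limits are easy to put numbers on.  The sketch below assumes
a clock stored as seconds since 1970-01-01 UTC in a signed 32 bit integer,
and a packet counter incremented at an assumed rate of one million packets
per second; the rate and the function names are illustrative, not
measurements or real kernel interfaces.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_second(bits, signed=True):
    """Last instant representable by a seconds-since-1970 counter.

    Only meant for small widths such as 32; a 64 bit counter exceeds
    what Python's datetime can represent.
    """
    max_value = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=max_value)


def seconds_to_wrap(bits, events_per_second):
    """Time until an unsigned counter of the given width wraps around."""
    return 2 ** bits / events_per_second


if __name__ == "__main__":
    print("signed 32 bit clock ends:", last_second(32).isoformat())
    minutes = seconds_to_wrap(32, 1_000_000) / 60
    print("32 bit packet counter wraps after about %.0f minutes" % minutes)
```

At the assumed rate the 32 bit counter wraps in a little over an hour,
which is why "between boots" is not an exaggeration.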

Some of these limits have a "low cost" associated with pushing the
limits of scaling -- and many of them are disjoint linear problems that
scale at zero cost until they fail altogether and require a one-time
costly fix.  In a way they are NICER because of this.  In others -- the
problem associated with having to stat very large directories in a flat
FHS-dictated layout -- the potential scaling catastrophe has been
ameliorated faster than it developed by virtue of Moore's law (competing
exponents, as it were) and a human time constraint rather than a system
efficiency constraint.  In this case the economic impact of a poor
design decision is quite large, even given that the "cost" is mostly
opportunity cost time DONATED to the project and hence easily viewed as
an inexhaustible resource.



> Ed

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
