[Beowulf] hpl size problems
Robert G. Brown
rgb at phy.duke.edu
Thu Sep 29 09:54:44 PDT 2005
Joe Landman writes:
> No one is being needlessly spare. Just want the minimum OS footprint on
> the node as possible, and mount all the applications from a server.
> This way, if you really want to run Maple across your cluster, the
> installation is really easy. Any application you want can be placed in
> /apps, and mount it. Or if absolutely needed, mount /usr from the
> network. If you are doing that, you should just finish the transition
> and go diskless/ramdisk based.
Sure, this is a venerable and reliable administrative mode and back from
the mid 80's through the mid 90's was pretty much the exclusive way I
ran systems, at least Suns (our SGI's tended to be one-offs or two-offs
and we ran them with local installs). Some were diskless (SLC, ELC);
others were diskful but with mounted /usr, /home, and /usr/local (this
was a long time ago and /usr/local was commonly used for the homemade
stuff, a non-FSH-compliant legacy that persists in Makefiles to this
day:-).
Based on a fair bit of experience running it both ways, I don't think that you
are correct when you say that "Any application you want can be placed in
/apps" and run semi-diskless, as if doing so is a trivial exercise like
yum --installroot /apps followed by an export.
I'd assert that this is NOT risk free and painless and in fact likely to
break countless rpms. You're advocating the deliberate building of a
non-FSH compliant network of systems, something that I personally think
is a Really Really Bad Idea that will Come Back To Haunt You Yes It
Will.
A small list of the concerns:
* What if the app installs stuff into /etc? I mean /apps/etc? What if
it installs xinetd services, or configures stuff in sysconfig, or
installs a license and matching application path during the %post
(where --installroot makes it think your /apps is ROOT, not that the
leading order path is /apps)? Few RPMS (at least) are really
path-relocatable.
Heck, not even most TARBALLS are trivially path-relocatable, not without
a fair bit of hackery and/or autoconf magic. Oops.
* Don't forget to manage every user's MANPATH, LD_LIBRARY_PATH, and
PATH, which is more than just fixing it up in your system defaults in
/etc. Maybe they roll their own. Maybe the app you install is a library
and they want to statically link to it (not just an ld.so fix, that is).
What about all those include files (in e.g. the GSL) that are no longer
in /usr/include?
* And then there is Dependency Hell...
For an arbitrary list of applications grabbed at random from any distro
in the world, this is a nightmare beyond my imagining. Just deciding
what CAN be safely installed in /apps and still work and what has to be
installed in your FSH-compliant base is pretty nightmarish, and is one
reason that nobody I know "likes" stuff that installs EVEN into
FSH-compliant /opt. To work, it has to a) install stub scripts into
/usr/bin anyway or you end up having to tell every user to put
/opt/maple10/bin on their path and set seven other environment variables
and stuff STILL breaks; b) those stub scripts have to SET all of those
environment variables and stuff still breaks. Forget maple; look at
what PVM has to go through to be able to install into /usr/share but
still run as "pvm" out of /usr/bin. Look at stuff with compiled in
install paths, configuration paths, auxiliary data paths in general.
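To make the stub-script point concrete, here is a minimal Python sketch
that generates the kind of /usr/bin wrapper described above. It is an
illustration only, not how Maple or PVM actually ship: the function name
make_stub, the bin/lib/man layout under the application root, and the
MAPLE variable in the example are all assumptions.

```python
def make_stub(app_root, binary, extra_env=None):
    """Return the text of a /bin/sh wrapper for an app under app_root.

    Assumes (hypothetically) that the app keeps its executables in
    app_root/bin, its shared libraries in app_root/lib, and its man
    pages in app_root/man.
    """
    lines = [
        "#!/bin/sh",
        f"# Generated wrapper for {binary}; do not edit by hand.",
        f'APP_ROOT="{app_root}"',
        'export PATH="$APP_ROOT/bin:$PATH"',
        # Only add a separating colon if LD_LIBRARY_PATH was already set.
        'export LD_LIBRARY_PATH='
        '"$APP_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"',
        'export MANPATH="$APP_ROOT/man:$MANPATH"',
    ]
    # Any further application-specific variables, in a stable order.
    for name, value in sorted((extra_env or {}).items()):
        lines.append(f'export {name}="{value}"')
    # Replace the shell with the real binary, passing all arguments on.
    lines.append(f'exec "$APP_ROOT/bin/{binary}" "$@"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # MAPLE is a made-up variable name used purely for illustration.
    print(make_stub("/opt/maple10", "maple", {"MAPLE": "/opt/maple10"}),
          end="")
```

Even with a wrapper like this, anything with a compiled-in install path,
or any user who links statically against a library in the private tree,
is not helped; the wrapper only covers the run-time environment of that
one command.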
You're a bit safer exporting /usr or /opt from an installroot image.
This leaves you internally FSH compliant but still leaves you untangling
and managing all sorts of CORRECTLY IMPLEMENTED FSH COMPLIANT package
installations that didn't expect you to do anything like that. You're
still in Hell, with far more work required just sorting out what works
than dropping a file into either an autobuilt, class-typed kickstart
file or into an autoyum-updated group so that it just goes "poof" and
automagically appears on all nodes in its class type by the next day.
There are still a few tools lacking (AFAIK) that would REALLY streamline
this sort of class-typed grooming for workstation LAN or cluster. Or I
should say, they are probably there but still not common or openly
available. One that I'd like to see is one that takes an arbitrary
install and a suitable package list and nightly re-places the systems in
the class PRECISELY into compliance with the list. Removes ones that
are removed, adds ones that have been added, runs a class-typed %post
update script (where you can e.g. chkconfig on and off). So that one
can take a workstation, remove its monitor, change its class type in a
config file, and run a one-line command or come back the following day
post nightly cron and find that it is a node -- or vice versa -- without
needing a full reinstall. Obviously this would (only) work on top of an
invariant Base group and would probably need some thought as to the
details and options, but I think it would be a good thing that yum could
do NOW but it just doesn't have the surrounding stuff to do so written
yet.
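The core of such a tool is just a set difference between what is
installed and what the class says should be installed. A minimal sketch
in Python, assuming the installed and wanted package names have already
been gathered somehow (the function names, the package names in the
example, and the idea of passing the invariant Base group separately are
all illustrative assumptions, not an existing yum feature):

```python
def reconcile(installed, wanted, base=()):
    """Return (to_install, to_remove) to bring a host into compliance.

    installed: package names currently on the host
    wanted:    package names listed for the host's class
    base:      invariant Base packages that must never be removed
    """
    installed, wanted, base = set(installed), set(wanted), set(base)
    to_install = sorted((wanted | base) - installed)
    to_remove = sorted(installed - wanted - base)
    return to_install, to_remove


def yum_commands(to_install, to_remove):
    """Build (but do not run) the yum command lines for the changes."""
    cmds = []
    if to_remove:
        cmds.append(["yum", "-y", "remove", *to_remove])
    if to_install:
        cmds.append(["yum", "-y", "install", *to_install])
    return cmds


if __name__ == "__main__":
    # Hypothetical workstation being turned into a compute node.
    installed = {"glibc", "bash", "openssh-server", "xorg-x11", "firefox"}
    node_class = {"openssh-server", "lam"}
    base = {"glibc", "bash"}
    add, drop = reconcile(installed, node_class, base)
    for cmd in yum_commands(add, drop):
        print(" ".join(cmd))
```

A real version would still need the class-typed %post step and some care
about dependencies that yum pulls in or removes on its own, which is
where the "thought as to the details" comes in.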
This sort of automagic is routine at install time and indeed the same
tools could be modified to help with the middle layer. To see ways of
managing class-typed kickstart files that are automagically built, look
at e.g. glump:
http://linux.duke.edu/projects/mini/glump/
Define an xmlish glump file per type, add the host to the appropriate
host type, and the tool builds you a kickstart (or other) file in a form
that can be served for any purpose. But of course there are many ways
to do it.
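As a rough picture of the class-typed idea (and explicitly NOT glump's
actual file format or code), the following Python sketch maps hosts to
classes and renders the %packages and %post parts of a kickstart file
for a given host. The class names, host names, package lists, and the
use of %end terminators (which newer installers expect) are assumptions
made for the example.

```python
# Hypothetical class definitions; illustrative only.
CLASSES = {
    "node": {
        "packages": ["@base", "openssh-server"],
        "on": ["sshd"],
        "off": ["cups"],
    },
    "workstation": {
        "packages": ["@base", "openssh-server", "xorg-x11"],
        "on": ["sshd", "cups"],
        "off": [],
    },
}

# Hypothetical host-to-class assignments.
HOSTS = {"b01": "node", "b02": "node", "ws1": "workstation"}


def kickstart_fragment(host):
    """Render the package list and %post script for one host's class."""
    cls = CLASSES[HOSTS[host]]
    out = ["%packages"]
    out += cls["packages"]
    out += ["%end", "", "%post"]
    # Class-typed service configuration, run at the end of the install.
    out += [f"chkconfig {svc} on" for svc in cls["on"]]
    out += [f"chkconfig {svc} off" for svc in cls["off"]]
    out.append("%end")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(kickstart_fragment("b01"), end="")
```

Moving a host from one class to another is then a one-line change in the
host table, and the next generated file reflects it.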
I remember well and with no regrets managing all kinds of software that
one had to kludge together solutions for back in the /usr/local days.
This is one of the things that MOTIVATED application packaging and FSH
-- splitting up applications into multiple, non-standard paths does NOT
scale well and can NOT handle arbitrarily complex package installation
requirements without work, and the work required to MAKE it work is
immense and would be done slightly differently by every two people who
ever tried it (so all that work isn't even reusable). Bad, bad, bad.
So yeah, for maple /apps might work -- it is pre-wrapped to install in a
standalone directory in /opt from the beginning and you can probably
offload license management to a server and fix a few paths with a hungry
sed-script and be done. It would be path and dependency hell for nearly
everything else.
Note that this does NOT apply to full-diskless installs, where
--installroot creates a fully consistent tree with all dependencies in
the right places. Warewulf seems to do this very nicely. In fact,
getting glump to write a warewulf script to build the class diskless root
AND a nearly perfectly matching diskful kickstart file would provide
the best of both worlds and a full choice. Have a disk, want local?
Edit a couple of files and do a PXE install. Lose the disk and want
diskless while it is being run? Edit a couple of files and reboot. If
you preserve just a tiny bit of system install-state identity data (e.g.
/etc/ssh) and restore it appropriately, you don't even lose the system's
"identity" in the transition. The heck with clusters -- warewulf is
poised to become a general purpose LAN management tool that makes
diskful operation and system class damn near a runtime choice.
Cluster-by-night made simple and totally automagical.
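The "preserve a tiny bit of identity data" step can be sketched with
nothing more than Python's standard tarfile module. This is an
assumption-laden illustration, not part of warewulf or glump: the
function names are made up, and /etc/ssh is the only path taken from
the text above.

```python
import os
import tarfile


def save_identity(root, archive, paths=("etc/ssh",)):
    """Archive identity files (relative to root) before a reinstall."""
    with tarfile.open(archive, "w:gz") as tar:
        for rel in paths:
            src = os.path.join(root, rel)
            if os.path.exists(src):
                # Store under the relative name so it can be restored
                # beneath any new root (local disk or diskless image).
                tar.add(src, arcname=rel)


def restore_identity(archive, new_root):
    """Unpack a saved identity archive beneath the new system root."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(new_root)
```

For example, save_identity("/", "/srv/identity/b01.tar.gz") before the
transition and restore_identity("/srv/identity/b01.tar.gz", new_root)
afterwards, where the /srv/identity location is hypothetical. Only feed
restore_identity archives you created yourself, since extracting an
untrusted tarball as root is its own kind of sin.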
Since I'm being all righteous and assertive, let me go ahead and add to
my IMHO assertions of yesterday now and get it all over with. I'm
probably already in trouble with everybody, so I may as well go whole
hog and get all the pissed-offness-at-me done at once, right? :-)
Take the following for what it is worth (possibly nothing at all, that
is;-):
%< snip snip ------------------------------------------------------------
RGB's Commandments of Systems Management (cluster or not):
* Thou shalt not depart from the FSH or thou shalt be Smitten with a
Sucker Rod and those that manage after thee will come to Curse Thy Name.
* Needless Heterogeneity is a Great Evil and should be Put Aside by the
Righteous Administrator.
* Thou Shalt Not Hack the basic tools and methods associated with
managing thy Distribution or thou shalt be Tormented for an Eternity in
the Hell of Maintenance. [Exception is granted for the Developer, who
willingly accepts the Burden of Maintenance that others may be free from
sin, but even the Developer's soul stands in Peril when they Hack on a
production network.]
* Cursed Indeed is the One who Breaks a Dependency or Forces a package
against its will.
* The jabber of conflicting dependencies and random design decisions is
as a Clamorous Bell being Beaten By Monkeys with a Sucker Rod while
wearing it around your ears. Private installation paths are a tortuous
Maze crafted by the Devil filled with the Bones of those who seek to
Understand Them.
* Blessed are the Developers, for they create new blessing for the
select few.
* Blessed are the Packagers, as they Bless Us All with the fruit of
their labors.
* More Blessed still are those who Package >>correctly<< or Fix the
Improperly Packaged and hence save us from the Sin of Hacking and
Cursing the Name of the Packagers as we do so.
* It is a Sin to have to touch any single system by hand for any
purpose but the mechanical from the Original Install to the day it is
sent off to Surplus and Salvage.
* Yea, though our Packages be ill-conceived and rudely dependent, and
though our Distributions verily make Sinners Of Us All, we strive for
the Perfection of Perfect Scaling.
* The LSB is another Vision of Perfection, although it is opposed and
ignored by the Greedy and Foolish Alike. Wise men sigh and turn away on
the camel-trodden paths of distribution-specific sin or they won't get
their goods to market at all. Fools follow the needs of their
distribution gladly, seeking to bury themselves in the perfume-scented
distractions of the bazaar so that they never reach the still greater
perfume of the garden in which all flowers grow.
* Yum is great, Yum is good, let us Thank Seth ye Yummed Up Dude, as it
protects us in our innocence from lo the many Sins of the Distribution,
the Packaging, the LSB-Non-Compliance of the mortal world.
Installation Proverbs
* Install it Not if you will Need it Not.
* Install it Not if you don't Understand it and/or Accept the
Responsibility of Managing It.
* Subject to the above, Install all that You or Your Users MIGHT need,
provided that you are prepared to accept the Burden of Maintenance, per
System Class.
* Install it >>everywhere<< for the Class of Systems involved, lest
thou fall into the Sin of Heterogeneity.
* All that is to be Installed for a Class of Systems must be Packaged.
Period and I'm Not Kidding. Exceptions are granted for one-offs, e.g.
development platforms or special purpose machines. Exceptions are NOT
granted for most servers, even if they are one-offs, as even one-offs
need upgrading.
* Keep a Sucker Rod Handy for the schooling of Users, for they be Fools
and prone to Sin.
* A system that does not Know Itself and that cannot Control its Own
State is like unto the victim of a closed head accident who wasn't too
bright to begin with. It may be alive after a fashion, but it will
require much love to nurse it through the simplest of tasks. Euthanize
it that it may be Reborn, for this is a Doctrine of Rebirth and
Redemption, at least until the hardware perisheth.
* In case it wasn't clear from the above, running a system with Windows
is in and of itself a Great Sin as it Has No FSH, it Encourages
Heterogeneity, it is filled with Conflicting Dependencies so that
installing A breaks B without warning or recourse, and it is hardly ever
aware of or in control of its own state. Fortunately this sin is Easily
Redeemed.
* Just because a Computer Can Do Anything does not mean that it Should.
After all, a computer can run Windows after a fashion. It is therefore
permitted to Just Say No to unreasonable user demands.
* However, it is Wise to Say No Politely and with a full cost-benefit
analysis that shows the ignorant WHY you say No, unless the user is an
Inferior Lifeform in which case you can say no with your Sucker Rod or
while laughing hysterically and wiping tears from your eyes.
* Sometimes No turns out to be Yes After All. Sometimes Inferior
Lifeforms evolve to become department chairs or corporate CEOs.
Sometimes Hacking is the Lesser of Evils. Every Rule has its Exceptions
and YMMV. However, this also explains the existence of Windows and Much
Other Evil in the Universe and the Righteous will Obey the Commandments
and Grave the Proverbs onto their Eyelid Corners or they will Reap Much
Sorrow from this Vale of Tears.
%< snip snip ------------------------------------------------------------
This probably isn't exhaustive, but it's not a bad start.
So I say to thee Joe (standing on a hillside wearing a dirty robe while
grasping a pair of old shoes):
Sinner, Repent thy Evil Ways or thy Sole is in Peril!
:-)
rgb