[Beowulf] RedHat Satellite Server as a cluster management tool.

Thu Oct 14 11:54:15 PDT 2004

Robert,

So have you actually used the satellite server? My biggest problem with
using RHN has been how few deployments it has had. A lot of
people just naturally assume Red Hat is bad (hell, I even do; I use Debian
for all of my personal and corporate servers). But very few who
automatically take that stance have actually worked with the products
enough to give empirical evidence as to why.

It took a while to gather enough enthusiasm to evaluate it, and a couple
of months of solid testing before I could recommend it.  I've built about
half a dozen similar deployment/management tools at this point, each one
for a different customer (hence building six instead of just
improving on the same one).

Imaging is one thing, and yeah, kickstart is easy; no objections to that.
RHN just makes it a lot easier to deal with kickstart. It also gives you a
rather useful, if more enterprise-focused, management system for
(software|config) channels and server groups, plus a good way to combine
groups with unions & intersections.  I'm finding it especially
nice at one site where 1/2 of the servers are used for testing and
1/2 for the production environment.  Pushing new patches, scripts,
commands, and files to select sets of systems requires very little effort.
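
To illustrate the union/intersection idea, here is a minimal Python sketch.
It is not RHN's API: the group names and hostnames are made up, and plain
sets stand in for server groups.

```python
# Hypothetical server groups modeled as plain Python sets (not RHN's API).
testing = {"web01", "web02", "db01"}
production = {"web03", "web04", "db02"}
web = {"web01", "web02", "web03", "web04"}
db = {"db01", "db02"}

def select(*groups, mode="union"):
    """Combine groups by union or intersection and return a sorted host list."""
    if not groups:
        return []
    if mode == "union":
        combined = set.union(*groups)
    else:
        combined = set.intersection(*groups)
    return sorted(combined)

# Intersection: only the web servers that are also in testing.
test_web = select(testing, web, mode="intersection")
# Union: every database server plus everything in testing.
db_or_testing = select(db, testing, mode="union")

print(test_web)
print(db_or_testing)
```

A patch meant only for the testing web servers would go to test_web and
never touch production.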

Red Hat's configuration management system is actually really nice. They've
put a simple (but extensible) macro system into it, which lets you
keep one configuration file for all of the servers in a given class when
only a few things differ, with system-specific variables filled in
when servers pull configs from the gold server. Sure, you can do this
with cfengine or pikt, but uploading a config file to a webform is a lot
simpler than setting up cfengine/pikt and implementing it (I know this
from a lot of experience).
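
The general idea of one shared config file with per-system variables can be
sketched in a few lines of Python. This is only an illustration: the ${...}
placeholder syntax, the variable names, and the addresses are assumptions of
the sketch, not RHN's actual macro syntax.

```python
from string import Template

# One shared config file for a whole class of servers. The ${...}
# placeholders are this sketch's own convention, not RHN's macro syntax.
SHARED_CONFIG = Template(
    "ServerName ${hostname}\n"
    "Listen ${ip_address}:80\n"
    "MaxClients ${max_clients}\n"
)

# Hypothetical per-system variables, keyed by hostname.
SYSTEM_VARS = {
    "web01": {"ip_address": "10.0.0.11", "max_clients": "150"},
    "web02": {"ip_address": "10.0.0.12", "max_clients": "300"},
}

def render_config(hostname):
    """Fill in the shared template with one system's values."""
    values = dict(SYSTEM_VARS[hostname], hostname=hostname)
    # substitute() raises KeyError if a placeholder has no value,
    # which catches a missing variable before the config is deployed.
    return SHARED_CONFIG.substitute(values)

print(render_config("web01"))
```

Only SYSTEM_VARS changes from box to box; the template itself is maintained
once per server class.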

One thing missing from a yum/pxe/kickstart environment (which I'm
rather familiar with, currently managing 6 customers with a similar
environment) is an "already there" configuration/versioning
management system.  That was one of the key points in Red Hat's favor: the
fact that I can do at-will repurposing/reprovisioning (like turning a
100-server 30/70 app/db server environment into a 70/30 app/db
server environment in 5 minutes, without kickstarting and with zero manual
interaction).

In the end, it's probably just an apples/oranges comparison. In a science
lab/school cluster environment, a more manual process probably makes more
sense, because grad students are cheap and interns
are free. :) In a corporate world, the $28k I'd spend for a 100 server
environment to save a sysadmin's worth of time pays for itself 10 fold in
terms of environment consistency.
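
For anyone who wants to redo the arithmetic with their own numbers, here is
a small Python sketch. The $274/node/year figure is the one from the
original post quoted below; the sysadmin cost is a placeholder assumption,
not a number from this thread.

```python
def annual_rhn_cost(nodes, per_node_usd=274):
    """Yearly software cost at the per-node figure from the quoted post."""
    return nodes * per_node_usd

def net_savings(nodes, admins_saved, admin_cost_usd, per_node_usd=274):
    """Staffing savings minus licensing; admin_cost_usd is an assumed input."""
    return admins_saved * admin_cost_usd - annual_rhn_cost(nodes, per_node_usd)

cost = annual_rhn_cost(100)
print(cost)  # 27400, i.e. roughly the $28k/year mentioned above

# Placeholder: $80,000 per sysadmin is an assumption, not a thread figure.
print(net_savings(100, admins_saved=1, admin_cost_usd=80_000))
```

Hardware for the RHN server itself is not included here; it would be a
one-time cost on top.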

Either way, I'm not trying to evangelize, just relate my own experiences
and try to find the best solution for a given problem. What tools out
there are good for this type of situation, then? Thanks for the refs to
warewulf, I'm checking it out now.

>

> On Wed, 13 Oct 2004, Michael T. Halligan wrote:
>
>> Has anybody used (or tried to use) the RHN system as an HPC management
>> tool? I've implemented this successfully in a 100 host environment for
>> a customer of mine, and am in the process of re-architecting an
>> infrastructure with about 150 nodes. That's about as far as I've
>> gotten with it. Once I get past the cost, the poor documentation, and
>> "OK" support, I'm finding that it's actually a great (though slightly
>> immature) piece of software for the enterprise.  The ease of keeping
>> an infrastructure in sync, and the lowered workload for sysadmins
>
> <nuke warning="alert">
>
> I can only say "why bother".  Everything it does can be done easier,
> faster, and better with PXE/kickstart for the base install followed by
> yum for fine tuning the install, updates and maintenance (all totally
> automagical).  Yum is in RHEL, is fully GPL, is well documented, has a
> mailing list providing the active support of LOTS of users as well as
> the developers/maintainers, and is free as in air.  Oh, and it works
> EQUALLY well with Centos, SuSE, Fedora Core 2, and other RPM-based
> distros, and is in wide use in clusters (and LANs) across the country.
>
> With PXE/kickstart/yum, you just build and test a kickstart file for the
> basic node install (necessary in any event), bootstrap the install over
> the net via PXE, and then forget the node altogether.  yum automagically
> handles updates, and can also manage things like distributed installs
> and locking a node to a common specified set of packages.  It manages
> all dependencies for you so that things work properly.
>
> It takes me ten minutes to install ten nodes, mostly because I like to
> watch the install start before moving on to handle the rare install that
> is interrupted for some reason (e.g. a faulty network connection).  One
> can do a lot more than this much faster if you control the boot strictly
> from PXE so you don't even need to interact with the node on the console
> at all.  How much better than that can you do?
>
> Alternatively, there are things like warewulf and scyld where even
> commercial solutions probably won't work out to be much more (if any
> more) expensive.  Especially when you add in the cost of those two
> "beefy boxes acting as RHN servers".  What a waste!  We use a single
> repository to manage installs and updates for our entire campus (close
> to 1000 systems just in clusters, plus that many more in LANs and on
> personal desktops).  And the server isn't terribly beefy -- it is
> actually a castoff desktop being pressed into extended service, although
> we finally have plans to put a REAL server in pretty soon.
>
> I mean, what kind of load does a cluster node generally PLACE on a
> repository server after the original install?  Try "none" and you'd be
> really close to the truth -- an average of a single package a week
> updated is probably too high an estimate, and that consumes (let's see)
> something like 1 network-second of capacity between server and node a
> week with plain old 100BT.
>
> There are solutions that are designed to be scalable and easy to
> understand and maintain, and then there are solutions designed to be
> topdown manageable with a nifty GUI (and sell a lot of totally unneeded
> resources at the same time).  Guess which one RHN falls under.
> </nuke>
>
>   Flamingly yours (not at you, but at RHN)
>
>       rgb
>
>>
>> At 100 nodes, the pricing seems to be about $274/year per node
>> including licensing, entitlements, and the software cost of a RHN
>> server (add another $5k-$7k for a pair of beefy boxes to act as the
>> RHN server, though as far as I can tell, redhat's specs on the RHN
>> server are far exaggerated. I could get by with $2500 worth of servers
>> on that end for the environments I've deployed on).  So, in the end,
>> at $28k/year for an enterprise of 100 servers, it pays for itself: in
>> one environment it has meant being able to shrink the next year's
>> staffing needs by 2 people, and in another by one person.
>>
>> We have a 512 node render farm project we're bidding on for a new
>> customer, and I'm wondering how those in the beowulf community who
>> have used RHN satellite server perceive it. So far we're considering
>> LFS and Enfusion, which are both more HPC oriented, but I'm really
>> enjoying RHN as a management system.
>>
>> ----------------
>> BitPusher, LLC
>> http://www.bitpusher.com/
>> 1.888.9PUSHER
>> (415) 724.7998 - Mobile
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
> --
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>
>
>

-------------------
BitPusher, LLC
http://www.bitpusher.com/
1.888.9PUSHER
(415) 724.7998 - Mobile