Good Tutorial for Clusters
Robert G. Brown rgb at phy.duke.edu
Wed May 8 01:47:43 PDT 2002
- Previous message: Good Tutorial for Clusters
- Next message: Good Tutorial for Clusters
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 8 May 2002, Raju Mathur wrote:

> >>>>> "rgb" == Robert G Brown <rgb at phy.duke.edu> writes:
>
> rgb> [snip]
> rgb> These days building a "generic" cluster is often not more
> rgb> complex than installing an out-of-the-box distr, e.g. RH 7.3
> rgb> (released yesterday, hooray) and hand-picking the beowulfish
> rgb> packages already therein, such as pvm from the list of
> rgb> available RPM's.
>
> Curious: is redhat generically preferred for running Beowulf, or is
> that your personal choice? In general (no, I don't want to start
> another distro war here!) is there any particular reason for
> preferring one distribution over another /for running Beowulf/ ?

It is almost certainly not generically preferred. It is locally preferred. After all, here I sit about eight miles away from the RH corporate office (at least until they finish moving to Raleigh, the rats! at which point it will be more like 25 miles:-). Duke (dulug.duke.edu) has one of the primary RH mirrors -- we moved 1.5 TB of data off our mirror yesterday as folks grazed on 7.3.

We also have a linux genius named Seth Vidal who has built a fully automated RH installation site for the campus -- one can install RH onto any system on campus in about five minutes (plus the time required to navigate the setup panels), depending on load and bandwidth, and the local installs come preconfigured to automatically update themselves nightly so we basically never have unpatched systems on campus (security or functional updates both). The install setup fully supports DHCP and kickstart, so we can install beowulf nodes in about three minutes over 100BT back to the campus server with no hands at all.

We are thus so damn scalable that one person, in ADDITION to being the physics department primary sysadmin, "supports" close to 1000 linux boxes all over campus (Hmmm, I wonder how many there really are at this point:-).
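As a rough sanity check on the figures above, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 10^12 bytes), that "yesterday" means a full 24 hours, and that a 100BT link runs at its full nominal 100 Mbit/s with no protocol overhead; none of these assumptions is stated in the message, and the function names are made up for illustration.

```python
# Back-of-the-envelope bandwidth arithmetic.
# Assumptions (not from the message): decimal units, a full 24-hour day,
# and a 100BT link saturated at its nominal 100 Mbit/s with no overhead.

def average_mbit_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in Mbit/s needed to move total_bytes in seconds."""
    return total_bytes * 8 / seconds / 1e6


def max_bytes(link_mbit_per_s: float, seconds: float) -> float:
    """Upper bound on bytes a link of the given speed can carry in seconds."""
    return link_mbit_per_s * 1e6 / 8 * seconds


MIRROR_BYTES = 1.5e12        # 1.5 TB served from the mirror
DAY_SECONDS = 24 * 3600      # "yesterday", taken as a full day

# Average rate the mirror sustained over the day.
mirror_rate = average_mbit_per_s(MIRROR_BYTES, DAY_SECONDS)

# Most data a single node could pull in a three-minute install over 100BT.
node_ceiling = max_bytes(100, 3 * 60)

print(f"mirror average: {mirror_rate:.1f} Mbit/s")
print(f"node install ceiling: {node_ceiling / 1e9:.2f} GB")
```

Under those assumptions the mirror averaged roughly 139 Mbit/s over the day, and a three-minute node install can pull at most about 2.25 GB, so the install claim is plausible only for a reasonably lean package set.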
Of course, the dulug mailing list and a few other very good linuxoid humans provide additional support to newbies and others, but the sethbot is legendary for answering most questions (including some that are truly boneheaded), some slightly before they are asked. (I wonder how he DOES that...:-)

RH is also the base for Scyld.

Now, to prevent being Debianized, or Mandrake-curse'd, or SuSE-Q'd, or Slackwarified, I will openly and freely admit that in all probability one can create an equally scalable and transparent operation with those distros, possibly working a bit harder or a bit less hard (aye, that's the rub:-). However, we just happen to do RH, and at this point it is now VERY VERY VERY easy.

The sethbot is working on a rewrite of yup (the yellow dog update tool) that will make it even easier as well as much faster. We have real hopes of being able to yup-update a running system to 7.3, for example, without having to do a full reinstall (probably will need a reboot, of course, to manage the new kernel and might need a bit of extra configuration or reconfiguration to support new features, but the PACKAGES should all update correctly without rewriting their existing configurations or killing /etc, which is very nice). If this works, we'll probably require full (re)installs only at major distro releases (8.0, for example) and even there we're working on ways for a system to do an automated reinstall to a higher distribution number without losing the basic configuration data and preserving at least the same optional packages that were in the previous install.

At that point our scalability will be approaching the theoretical maximum. Complete linux idiots will be able to manage a network install into a reasonably bulletproof configuration, and once installed their systems will automatically do all of those update-thingies that are so critical to real security.
Unless they work actively to defeat it, their system will track all the minor version releases without having to do anything but reboot post update into the new kernel, and will be ABLE to do a major version release update without ruining their setup, although they may have to follow some instructions for that one.

Support will then consist of telling newbies to read the README on how to install, and developing x.x into the duke release form (we add this and that, test, and so forth before certifying it for yup-update or reinstall to all campus hosts). And bug fixes and answering questions, of course.

Overall, one person plus a good LUG will indeed be able to manage all the "wild" users (students and faculty on personal systems) and do almost all the work required to support departmental operations BUT the actual management of their network -- this doesn't, of course, remove the need for departmental administrators, it just makes their job MUCH easier.

   rgb

--
Robert G. Brown    http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email:rgb at phy.duke.edu
More information about the Beowulf mailing list
