[Beowulf] Which distro for the cluster?
Geoff Jacobs gdjacobs at gmail.com
Fri Dec 29 12:48:33 PST 2006
Robert G. Brown wrote:

<snip>

> The problem is REALLY evident for laptops -- there are really major
> changes in the way the kernel, rootspace, and userspace manage devices,
> changes that are absolutely necessary for us to be able to plug cameras,
> memory sticks, MP3 players, printers, bluetooth devices, and all of that
> right into the laptop and have it "just work".  NetworkManager simply
> doesn't work for most laptops and wireless devices before FC5, and it
> doesn't really work "right" until you get to FC6 AND update to at least
> 0.6.4.  On RHEL/Centos 4 (FC4 frozen, basically), well...

Laptops. *shudder*

How often do laptop manufacturers change their hardware configurations?
Every other week? I've found the optimal way to purchase laptop hardware
is with a good live cd for testing. No boot, no buy.

> One of the major disadvantages linux has had relative to WinXX over the
> years has been hardware support that lags, often by years, behind the
> WinXX standard.  Because of the way linux is developed, the ONLY way one
> can fix this is to ride a horse that is rapidly co-developed as new
> hardware is released, and pray for ABI and API level standards in the
> hardware industry in front of your favorite brazen idol every night
> (something that is unlikely to work but might make you feel better:-).
>
> The fundamental "advantage" of FC6 is that its release timing actually
> matches up pretty well against the frenetic pace of new hardware
> development -- six to twelve month granularity means that you can
> "usually" buy an off-the-shelf laptop or computer and have a pretty good
> chance of it either being fully supported right away (if it is older
> than six months) or being fully supported within weeks to months --
> maybe before you smash it with a sledgehammer out of sheer frustration.
>
> From what I've seen, ubuntu/debian has a somewhat similar aspect, user
> driven to get that new hardware running even more aggressively than with
> FC (and with a lot of synergy, of course, even though the two
> communities in some respects resemble Sunnis vs the Shiites in Iraq:-).

RGB must now go into hiding due to the fatwa against him.

> SINCE they are user driven, they also tend to have lots of nifty
> userspace apps, and since we have entered the age of the massive, fully
> compatible, contributed package repo I expect FC7 to provide something
> on the order of 10K packages, maybe 70% of them square in userspace (and
> the rest libraries etc).
>
> This might even be the "nextgen" revolution -- Windows cannot yet
> provide fully transparent application installation (for money or not)
> over the network -- they have security issues, payment issues,
> installshield/automation issues, permission issues, and
> compatibility/library issues all to resolve before they get anywhere
> close to what yum and friends (or debian's older and also highly
> functional equivalents) can do already for linux.  What the software
> companies that are stuck in the "RHEL groove" don't realize is that RPMs,
> yum and the idea of a repo enable them to set up a completely different
> software distribution paradigm, one that can in fact be built for and
> run on all the major RPM distros with minimal investment or risk on
> their part.  They don't "get it" yet.  When they do, there could be an
> explosion in commercial grade, web-purchased linux software and
> something of a revolution in software distribution and maintenance (as
> this would obviously drive WinXX to clone/copy).  Or not.
>
> Future cloudy, try again later.

<snip>

> Ooo, then you really don't like pretty much ANY of the traditional "true
> beowulf" designs.  They are all pretty much cream eggs.
> Hell, lots of
> them use rsh without passwords, or open sockets with nothing like a
> serious handshaking layer to do things like distribute binary
> applications and data between nodes.

How things have improved...

> Grid designs, of course, are
> another matter -- they tend to use e.g. ssh and so on but they have to
> because nodes are ultimately exposed to users, probably not in a chroot
> jail.  Even so, has anyone really done a proper security audit of e.g.
> pvm or mpi?  How difficult is it to take over a PVM virtual machine and
> insert your own binary?  I suspect that it isn't that difficult, but I
> don't really know.  Any comments, any experts out there?

Would compromising PVM frag a user or the whole system?

> In the specific case of my house, anybody who gets to where they can
> actually bounce a packet off of my server is either inside its walls and
> hence has cracked e.g. WPA or my DSL firewall or one of my personal
> accounts elsewhere that hits the single (ssh) passthrough port.  In all
> of these cases the battle is lost already, as I am God on my LAN of
> course, so a trivial password trap on my personal account would give
> them root everywhere in zero time.  In fact, being a truly lazy
> individual who doesn't mind exposing his soft belly to the world, if
> they get root anywhere they've GOT it everywhere -- I have root set up
> to permit free ssh between all client/nodes so that I have to type a
> root password only once and can then run commands as root on any node
> from an xterm as one-liners.
>
> This security model is backed up by a threat of physical violence
> against my sons and their friends, who have carefully avoided learning
> linux at anything like the required level for cracking because they know
> I'd like them to, and the certain knowledge that my wife is doing very
> well if she can manage to crank up a web browser and read her mail
> without forgetting something and making me get up out of bed to help her
> at 5:30 am.
> So while I do appreciate your point on a
> production/professional network level, it really is irrelevant here.

<snip>

> There are three reasons I haven't upgraded it.  One is sheer bandwidth.
> It takes three days or so to push FCX through my DSL link, and while I'm
> doing it all of my sons and wife and me myself scream because there
> ain't no bandwidth leftover for things like WoW and reading mail and
> working.  This can be solved with a backpack disk and my laptop -- I can
> take my laptop into Duke and rsync mirror a primary mirror, current
> snapshot, with at worst a 100 Mbps network bottleneck (I actually think
> that the disk bottleneck might be slower, but it is still way faster
> than 384 kbps or thereabouts:-).
>
> The second is the bootstrapping problem.  The system in question is my
> internal PXE/install server, a printer server, and an md raid
> fileserver.  I really don't feel comfortable trying an RH9 -> FC6
> "upgrade" in a single jump, and a clean reinstall requires that I
> preserve all the critical server information and restore it post
> upgrade.  At the same time it would be truly lovely to rebuild the MD
> partitions from scratch, as I believe that MD has moved along a bit in
> the meantime.
>
> This is the third problem -- I need to construct a full backup of the
> /home partition, at least, which is around 100 GB and almost full.
> Hmmm, it might be nice to upgrade the RAID disks from 80 GB to 160's or
> 250's and get some breathing room at the same time, which requires a
> small capital investment -- say $300 or thereabouts.  Fortunately I do
> have a SECOND backpack disk with 160 GB of capacity that I use as a
> backup, so I can do an rsync mirror to that of /home while I do the
> reinstall shuffle, with a bit of effort.
>
> All of this takes time, time, time.  And I cannot begin to describe my
> life to you, but time is what I just don't got to spare unless my life
> depends on it.
> That's the level of triage here -- staunch the spurting
> arteries first and apply CPR as necessary -- the mere compound fractures
> and contusions have to wait.  You might have noticed I've been strangely
> quiet on-list for the last six months or so... there is a reason:-)

Time. The great equalizer.

> At the moment, evidently, I do have some time and am kind of catching
> up.  Next week I might have even more time -- perhaps even the full day
> and change the upgrade will take.  I actually do really want to do it --
> both because I do want it to be nice and current and secure and because
> there are LOTS OF IMPROVEMENTS at the server level in the meantime --
> managing e.g. printers with RH9 tools sucks for example, USB support is
> trans-dubious, md is iffy, and I'd like to be able to test out all sorts
> of things like the current version of samba, a radius server to be able
> to drop using PSK in WPA, and so on.  So sure, I'll take your advice
> "any day now", but it isn't that simple a matter.

Walled gardens and VPNs for wireless access? Sweet.

<snip>

> No arguments.  But remember, you say "users" because you're looking at
> topdown managed clusters with many users.  There are lots of people with
> self-managed clusters with just a very few.  And honestly,
> straightforward numerical code is generally cosmically portable -- I
> almost never even have to do a recompile to get it to work perfectly
> across upgrades.  So YMMV as far as how important that stability is to
> users of any given cluster.  There is a whole spectrum here, no simple
> or universal answers.

<snip>

> Truthfully, it is trans great.  I started doing Unix admin in 1986, and
> have used just about every clumsy horrible scheme you can imagine to
> handle add-on open source packages without which Unix (of whatever
> vendor-supplied flavor) was pretty damn useless even way back then.
> They still don't have things QUITE as simple as they could be -- setting
> up a diskless boot network for pxe installs or standalone operation is
> still an expert-friendly sort of thing and not for the faint of heart or
> tyro -- but it is down to where a single relatively simple HOWTO or set
> of READMEs can guide a moderately talented sysadmin type through the
> process.
>
> With these tools, you can administer at the theoretical/practical limit
> of scalability.  One person can take care of literally hundreds of
> machines, either nodes or LAN clients, limited only by the need to
> provide USER support and by the rate of hardware failure.  I could see a
> single person taking care of over a thousand nodes for a small and
> undemanding user community, with onsite service on all node hardware.  I
> think Mark Hahn pushes this limit, as do various others on list.  That's
> just awesome.  If EVER corporate america twigs to the cost advantages of
> this sort of management scalability on TOP of free as in beer software
> for all standard needs in the office workplace... well, one day it will.
> Too much money involved for it not to.

<snip>

> I thought there was such a party, but I'm too lazy to google for it.  I
> think Seth mentioned it on the yum or dulug list.  It's the kind of
> thing a lot of people would pay for, actually.

<snip>

> And I _won't_ care...;-)

Come to think of it, the only way you can lose in such a contest is if
quality slips. Pretty much plusses across the board. :-D

> It took me two days to wade through extras in FC6, "shopping", and now
> there are another 500 packages I haven't even looked at a single time.
> The list of games on my laptop is something like three screenfuls long,
> and it would take me weeks to just explore the new applications I did
> install.
> And truthfully, the only reason I push FC is because (as noted
> above) it a) meets my needs pretty well; and b) has extremely scalable
> installation and maintenance; and c) (most important) I know how to
> install and manage it.  I could probably manage debian as well, or
> mandriva, or SuSE, or Gentoo -- one advantage of being a 20 year
> administrator is I do know how everything works and where everything
> lives at the etc level beneath all GUI management tool gorp layers
> shovelled on top by a given distro -- but I'm lazy.  Why learn YALD?
> One can be a master of one distro, or mediocre at several...

This is absolutely valid. There is no need to move to the latest
whiz-bang distro if what you're using works fine.

> Pretty much all of the current generation do this.  Yum yum.
>
> Where one is welcome to argue about what constitutes a "fast-moving"
> repository.  yum doesn't care, really.  Everything else is up to the
> conservative versus experimental inclinations of the admin.

How usable is the FC development repository?

> The last time I looked at FAI it was Not Ready For Prime Time and
> languishing unloved.  Of course this was a long time ago.  I'm actually
> glad that it is loved.  The same is true of replicators and system
> imagers -- I've written them myself (many years ago) and found them to
> be a royal PITA to maintain as things evolve, but at this point they
> SHOULD be pretty stable and functional.  One day I'll play with them, as
> I'd really like to keep a standard network bootable image around to
> manage disk crashes on my personal systems, where I can't quite boot to
> get to a local disk to recover any data that might be still accessible.
> Yes there are lots of ways to do this and I do have several handy but a
> pure PXE boot target is very appealing.
>
>>> Yes, one can (re)invent many wheels to make all this happen --
>>> package up stuff, rsync stuff, use cfengine (in FC6 extras:-), write
>>> bash or python scripts.  Sheer torture.
>>> Been there, done that, long
>>> ago and never again.
>> Hey, some people like this. Some people compete in Japanese game shows.
>
> Yes, but from the point of view of perfect scaling theory, heterogeneity
> and nonstandard anything is all dark evil.  Yes, many people like to
> lose themselves in customization hell, but there is a certain zen
> element here and Enlightenment consists of realizing that all of this is
> Illusion and that there is a great Satori to be gained by following the
> right path....
>
> OK, enough system admysticstration...;-)
>
>    rgb
>

--
Geoffrey D. Jacobs
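
For a rough sense of the bandwidth argument quoted above ("three days or so" over a link of "384 kbps or thereabouts" versus "at worst a 100 Mbps network bottleneck"), here is a back-of-the-envelope sketch. The function and constant names are made up for illustration, and it assumes decimal units (1 kbps = 1000 bits/s), a fully saturated link, and no protocol or disk overhead, so the numbers are ballpark only.

```python
# Back-of-the-envelope transfer arithmetic (illustrative names, not from the thread).
# Assumes decimal units, a saturated link, and zero protocol/disk overhead.

def implied_size_bytes(rate_bps: float, seconds: float) -> float:
    """Bytes moved by a link running flat out at rate_bps for the given time."""
    return rate_bps * seconds / 8


def transfer_seconds(size_bytes: float, rate_bps: float) -> float:
    """Seconds needed to move size_bytes over a link of rate_bps."""
    return size_bytes * 8 / rate_bps


DSL_BPS = 384_000          # "384 kbps or thereabouts"
LAN_BPS = 100_000_000      # "at worst a 100 Mbps network bottleneck"
THREE_DAYS = 3 * 24 * 3600  # "three days or so", in seconds

size = implied_size_bytes(DSL_BPS, THREE_DAYS)
lan_seconds = transfer_seconds(size, LAN_BPS)

print(f"Implied mirror size: {size / 1e9:.1f} GB")
print(f"Same data at 100 Mbps: {lan_seconds / 60:.1f} minutes")
```

Under those assumptions the three-day DSL pull corresponds to roughly 12 GB, which the 100 Mbps link would move in well under half an hour, so the disk is plausibly the real bottleneck, as suggested in the quoted text.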
