NIS?
Robert G. Brown rgb at phy.duke.edu
Fri Oct 5 08:38:06 PDT 2001
On Fri, 5 Oct 2001, Steven Timm wrote:

> > On Fri, Oct 05, 2001 at 08:28:56AM -0500, Steven Timm wrote:
> >
> > > The rsync script is a good idea and something we are thinking
> > > of implementing--only problem is...how do you handle the
> > > situation when a node happens to be down during a push?
> >
> > FSL uses the same generic mechanism that I use to keep all files in
> > sync. This means that when a node boots, it syncs before it returns to
> > service. There are many files that you want to maintain in synch (like
> > /etc/hosts.allow) which don't go in NIS. I would assume that systems
> > like "cfengine" (which the sysadmin community uses to keep
> > workstations configured) also do that.
>
> Is there some way to inhibit the sync if for some reason all
> workstations end up rebooting at once? Also, any way to force it
> manually?

There are lots of ways. This is a common enough administrative problem in any reasonably large domain. We used to use nightly cron scripts to do a variety of maintenance on systems in the department, and some of the script tasks would load server-shared resources (e.g. NFS, NIS and so forth). This was back in the 10Base days with slow disks and 4 MIPS servers (if you were lucky). Out of necessity, we developed ways of distributing the times of the cron hits to avoid the logjam.

One can easily do the same thing here: put host-specific delays into the boot scripts; put random delays into the boot script (which works well enough for a few hosts, but remember that Poisson random doesn't mean antibunched, and you want antibunched); or institute a low-overhead antibunching handshake (ask nicely for the transfer and, if the server says no, sleep a bit and ask again, e.g. via a simple xinetd daemon).

The problem is that you have to ask, and then write your own scripting or daemons after you hear the answers (all of which will work well enough with a bit of effort), because it isn't a standard tool or method.
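As a rough illustration of the three options above, here is a minimal Python sketch. It is not code from this thread: the function names, the 300-second window, and the yes/no `ask` callback (standing in for whatever the xinetd-style service would answer) are all assumptions made for the example.

```python
import random
import time
from typing import Callable


def indexed_delay(host_index: int, n_hosts: int, window_s: float) -> float:
    """Host-specific delay: evenly spaced (antibunched) offsets across the window."""
    return (host_index % n_hosts) * (window_s / n_hosts)


def random_delay(window_s: float, rng: random.Random) -> float:
    """Random delay: simple, but independent draws can still land close together."""
    return rng.uniform(0.0, window_s)


def sync_when_allowed(ask: Callable[[], bool],
                      do_sync: Callable[[], None],
                      retry_s: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      max_tries: int = 20) -> bool:
    """Handshake: ask the server; on "no", sleep a bit and ask again.

    `ask` and `do_sync` are hypothetical callbacks supplied by the caller,
    e.g. a small network query and an rsync invocation.
    """
    for _ in range(max_tries):
        if ask():
            do_sync()
            return True
        sleep(retry_s)
    return False


if __name__ == "__main__":
    # Node 3 of 10 with a 300 s window starts 90 s after boot.
    print(indexed_delay(3, 10, 300.0))
```

The indexed variant needs every node to know its own index (for instance from its hostname), which is the price of getting evenly spaced start times instead of random ones.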
This is really a re-lamentation of a longstanding problem that has often been lamented on this list -- we still lack a lot of "standard" tools for cluster management and this is one of them. What we all really want is an RPM with documents; what we've got is somewhat kludgy recipes. I wish I could help with the former but I'm up to my ears in aquatic reptiles and offput projects.

I'd strongly urge anyone who DOES tackle this problem to consider doing it "right" after really thinking it out, and turning their solution into a stable toolset. I personally think that a GPL antibunching etc_file_xfer daemon would be a gangbusters solution -- have the master daemon either fork a server to service each request OR return the requester a delay (computed from the number of pending requests and running measurements of the time of completion); have the clients respect the delay. That way each server can literally service requests as fast as possible (less the overhead of the original single queuing handshake).

It could run on top of ssh/rsync or rsh/rsync as your security and cluster require, and would be pretty trivial to write in either C (lowest server load) or perl (perhaps easier to code). I've got prototype code for the C daemon that could probably be hacked into this if anybody is interested.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
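The grant-or-delay decision described for the proposed etc_file_xfer daemon could look something like the following Python sketch. This is not the prototype C code mentioned in the message; the class name, the concurrency limit, and the exponential running average are assumptions chosen only to make the idea concrete.

```python
class XferScheduler:
    """Decide whether to serve a request now or tell the client how long to wait."""

    def __init__(self, max_active: int = 4, initial_avg_s: float = 10.0,
                 alpha: float = 0.2) -> None:
        self.max_active = max_active   # assumed limit on concurrent transfers
        self.active = 0                # transfers currently being served
        self.pending = []              # client ids that were told to wait, in order
        self.avg_s = initial_avg_s     # running estimate of transfer time
        self.alpha = alpha             # weight given to the newest measurement

    def request(self, client_id: str) -> float:
        """Return 0.0 to grant the transfer now, else a suggested delay in seconds."""
        if self.active < self.max_active:
            self.active += 1
            if client_id in self.pending:
                self.pending.remove(client_id)
            return 0.0
        if client_id not in self.pending:
            self.pending.append(client_id)
        position = self.pending.index(client_id) + 1
        # Each group of max_active waiting clients needs roughly one
        # average transfer time to drain (ceiling division).
        batches = -(-position // self.max_active)
        return batches * self.avg_s

    def finished(self, elapsed_s: float) -> None:
        """Record a completed transfer and update the running average."""
        if self.active > 0:
            self.active -= 1
        self.avg_s = (1.0 - self.alpha) * self.avg_s + self.alpha * elapsed_s
```

A real daemon would call `request` from its network handler, fork the rsync server on a 0.0 reply, and call `finished` when the child exits; clients that receive a positive value sleep that long and ask again.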
More information about the Beowulf mailing list
