NIS?
David Bussenschutt d.bussenschutt at mailbox.gu.edu.au
Sun Oct 7 21:45:13 PDT 2001
Hi all again. You all seem to be interested in making sure that:
1) if a node is down, it gets updated when it comes back up.
2) if all nodes come up at once, the network/master is not flooded
with requests.
Possible solutions and issues:
1) make clients "pull" files on boot (suggested on list)
* must add random delays to the pull so the server is not overloaded.
* must have localised scripts on client machines that perform the
"pull".
(what do you do when you want to update these scripts because the pull
isn't working properly? - you can't have them "pull" the new version!)
2) have the server push when the client is up again.
* server is never overloaded because it initiates all actions (and my
perl script is currently not multi-threaded)
* all management changes are in one place.
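For option 1, the random delay could look roughly like the sketch below. This is only an illustration, not something I run: the helper names `jitter_delay` and `pull_command`, the host name `master` and the 120-second window are all assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pick a random start-up delay of 0 .. $max_delay-1
# seconds, so nodes that boot together do not all hit the master at once.
sub jitter_delay {
    my ($max_delay) = @_;
    return int(rand($max_delay));
}

# Hypothetical helper: build the rsync "pull" command as an argument list,
# so it can be handed to system() without going through a shell.
sub pull_command {
    my ($master, $file) = @_;
    return ('/usr/bin/rsync', '-a', '-e', '/usr/bin/ssh -x',
            "root\@$master:$file", $file);
}

# Example use on a client at boot (commented out so nothing runs by accident):
# sleep jitter_delay(120);
# system(pull_command('master', '/etc/passwd')) == 0 or warn "pull failed\n";
```

The window only needs to be wide enough that the expected number of simultaneous pulls stays small; it does nothing about the "how do I update the pull script itself" problem noted above.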
Someone also asked me the following questions:
how do I force an update on all nodes?
* touch the file on the server that I want to update to nodes, and
let my perl daemon push it out.
how do I force an update on one node?
* I don't, I just do all nodes. I'm sure the script could be
improved if this was required.
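If a per-node update were required, one possible improvement (not in the script below) relies on the fact that each syncfiles daemon serves exactly one node: sending that daemon a signal could make it forget its recorded ctimes and re-push everything. A minimal sketch, assuming a USR1 handler and a hypothetical helper `apply_forced_update` that the main loop would call at the top of each pass:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same shape as the %files hash in syncfiles: file name => last pushed ctime.
my %files = ('/etc/passwd' => 0, '/etc/shadow' => 0, '/etc/group' => 0);
my $force = 0;

# The handler only sets a flag; the real work happens in the main loop.
$SIG{USR1} = sub { $force = 1 };

# Hypothetical helper: if the flag is set, reset every recorded ctime to 0
# so the next check treats all files as changed. Returns 1 if it reset
# anything, 0 otherwise.
sub apply_forced_update {
    my ($files, $force_ref) = @_;
    return 0 unless $$force_ref;
    $files->{$_} = 0 for keys %$files;
    $$force_ref = 0;
    return 1;
}

# In the main loop this would be called as:
# apply_forced_update(\%files, \$force);
```

Forcing one node would then be `kill -USR1 <pid>` on the daemon that was started for that node. A signal also interrupts the daemon's sleep, so the push should follow promptly.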
I have updated my perl program (below) so it handles the case where a
node/client goes down: it polls for the client periodically (waiting for
it to come back up), at a frequency that is settable at start time
(or defaults to a reasonable value).
At start time, I now just run:
syncfiles node1
syncfiles node2
syncfiles node3
etc..
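With more than a handful of nodes, those lines could be generated by a small launcher. The sketch below is an assumption-laden example: the install path `/usr/local/sbin/syncfiles`, the `node1..nodeN` naming scheme, and the helper names `node_names` and `launch_all` are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical install location; adjust to wherever syncfiles lives.
my $syncfiles = '/usr/local/sbin/syncfiles';

# Build the names node1 .. nodeN (naming scheme assumed from the lines above).
sub node_names {
    my ($count) = @_;
    return map { "node$_" } 1 .. $count;
}

# Start one syncfiles daemon per node. Each call returns quickly because
# syncfiles forks and its parent exits. Returns the nodes whose launch failed.
sub launch_all {
    my ($program, @nodes) = @_;
    my @failed;
    for my $node (@nodes) {
        # List form of system() runs the program directly, without a shell.
        system($program, $node) == 0 or push @failed, $node;
    }
    return @failed;
}

# Example (commented out so nothing is started by accident):
# my @failed = launch_all($syncfiles, node_names(16));
# warn "could not start syncfiles for: @failed\n" if @failed;
```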
David.
here is the 'syncfiles' program:
-------perl-code-cut-line-start-----------------
#!/usr/bin/perl -wT
#
# syncfiles
#
# usage: in a rc.sysinit or inittab on the master host, run:
# syncfiles clienthost [checkdelay] [retrydelay]
# Designed to automatically check every 'checkdelay' seconds whether certain
# files have been modified and, if modifications have been made, push them to
# the requested client host over a ssh connection. This script runs as a
# daemon by default.
# Requires:
# 1) to run as root for /etc/files, so you can access the shadow file
# 2) rsync and ssh must be available, and at the paths defined in this script
# 3) root ssh access without a password to the client (ie
#    .ssh/authorized_keys2 on remote hosts)
#
# written by David Bussenschutt Oct 5 2001 - free for everyone - no
# responsibility accepted.
# It could be made bigger and better, but I like the KISS principle.
# October 8 2001 improvement:
# The script will now retry the send if the client is not available...
# If the client host is uncontactable (or there are other ssh connection
# problems), then the delay is changed to the (hopefully longer) 'retrydelay'
# until the host is available again, so that the network isn't flooded with
# retrying/failing requests the whole time.
use strict; # for syntax checking
use POSIX; # for 'setsid'
#-------------USER DEFINABLE SETTINGS BEGIN--------------------------------
# files to update, and the initial ctime to use
my %files = ('/etc/passwd' => '0', '/etc/shadow' => '0', '/etc/group' => '0',);
# where are the rsync and ssh commands?
# rsync should also be in the same place on the remote hosts as on the
# primary host
# the path to these is hardcoded because this script runs as root
my $rsync = '/usr/bin/rsync';
my $ssh = '/usr/bin/ssh';
# client name should be provided on command line, if it's not, the script
# won't run.
my $client = $ARGV[0];
# how often do we check for changes to the ctime of the files?
# should be provided as second command line option, or default to 10 seconds.
my $checkdelay = $ARGV[1]||10;
# if we couldn't contact the host or had errors, how long till we retry?
# from cmd line or default to 60.
my $retrydelay = $ARGV[2]||60;
# dissociate from terminal (1=yes)?
my $DAEMON=1;
my $DEBUG=1; #(1=more output)
print "Running in DEBUG mode - modify script to turn off DEBUG and silence this output.\n" if $DEBUG;
#-------------USER DEFINABLE SETTINGS END----------------------------------
#------------ERROR CHECKING BEGIN---------------
# $rsync and $ssh are always defined above, so test that they are executable
die "rsync not found at $rsync\n" unless -x $rsync;
die "ssh not found at $ssh\n" unless -x $ssh;
die "remote host not defined on command line\n" unless defined $client;
# untaint the client name
if ($client =~ /^([-\w.]+)$/) { # alphanumerics, underscores, hyphens and dots only.
    $client = $1; # now untainted
} else {
    die "Really bad data in client hostname. Only alphanumerics, hyphens and dots allowed.\n";
}
# untaint the environment (taint mode refuses to run commands otherwise)
$ENV{'PATH'} = '';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
#------------ERROR CHECKING END---------------
#--------DAEMON CODE BEGIN-------------------------
# are we going to run as a daemon?
my $pid;
if ($DAEMON == 1){
    # only INT, TERM and HUP are REALLY needed.
    $SIG{TERM}=$SIG{INT}=$SIG{HUP}=\&signal_handler;
    # ignore $SIG{PIPE} as it's dangerous
    $SIG{PIPE}='IGNORE';
    # stdio is left open, so DEBUG output stays visible after detaching
    print "Dissociating from terminal and running as a daemon\n" if $DEBUG;
    # virtualise / to be in a 'safe' location
    #chroot("/var") or die " Couldn't chroot to /var: $!";
    # fork a child and let the parent exit.
    $pid=fork;
    die "Couldn't fork new process: $!" unless defined ($pid);
    exit if $pid;
    # dissociate the process from the terminal and don't be part of
    # my old process 'group'
    POSIX::setsid() or die "Can't start a new session: $!";
}
sub signal_handler {
    die "syncfiles: dying on signal\n";
}
#---------DAEMON CODE END------------------------
#----------------MAIN LOOP BEGIN-----------------------
my $delay = $checkdelay;
while (1) {
    sleep $delay;
    foreach my $file (keys %files){
        # if the ctime of the file is > the ctime recorded in %files, then
        # do the push and update the ctime in %files
        my $newctime;
        if (defined ( $newctime = (stat($file))[10] ) and $newctime > $files{$file}) {
            # list form of system() runs rsync directly, without a shell
            my @cmd = ($rsync, '-a', '-e', "$ssh -x", "--rsync-path=$rsync",
                       $file, "root\@$client:$file");
            print "@cmd " if $DEBUG;
            my $retval = system(@cmd);
            if ($retval==0) {
                # if system call returned ok, define a new 'last-updated' ctime.
                $files{$file} = $newctime;
                $delay = $checkdelay;
                print "-- done ok\n" if $DEBUG;
            } else {
                # if there was an error, then wait for a longer period
                # before trying again.
                $delay = $retrydelay;
                print "-- error occurred\n" if $DEBUG;
            }
        } elsif (!defined $newctime or $newctime<=1) {
            # file that should exist does not exist or other error
            die "error getting ctime of $file\n";
        } # else no update of this file needed.
    }
}
#----------------MAIN LOOP END-----------------------
-------perl-code-cut-line-end------------------
--------------------------------------------------------------------
David Bussenschutt Email: D.Bussenschutt at mailbox.gu.edu.au
Senior Computing Support Officer & Systems Administrator/Programmer
Location: Griffith University. Information Technology Services
Brisbane Qld. Aust. (TEN bldg. rm 1.33) Ph: (07)38757079
--------------------------------------------------------------------
Steven Timm <timm at fnal.gov>
10/05/01 11:28 PM
To: David Bussenschutt <d.bussenschutt at mailbox.gu.edu.au>
cc: beolist <beowulf at beowulf.org>
Subject: Re: NIS?
The rsync script is a good idea and something we are thinking
of implementing--only problem is...how do you handle the
situation when a node happens to be down during a push?
Steve
------------------------------------------------------------------
Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division/Operating Systems Support
Scientific Computing Support Group--Computing Farms Operations
On Fri, 5 Oct 2001, David Bussenschutt wrote:
> Slight side-bar here, but I think it relates:
>
> My chain of thought:
>
> 1) everyone agrees NIS works (even if it is arguable about the speed,
> reliability, security etc)
> 2) everyone agrees that it can cause/have problems in some situations -
> especially beowulf speed related ones.
> 3) the speed has to do with the synchronisation delays inherent in a
> bidirectional on-the-fly network daemon approach like NIS
> 4) many people prefer the files approach for speed/simplicity (ie to avoid
> problems in 3).
> 5) In a beowulf cluster, passwords shouldn't be changed on nodes, so a
> server push password system is all that's required - hence the files
> approach in 4).
> 6) why not have the best of both worlds? What we need is a little daemon
> on the server that pushes the passwd/shadow/group/etc files to the clients
> over a ssh link whenever the respective file is modified on the server.
> 7) How I suggest implementing this:
>
> The naive/simple approach:
> set up the clients so that root can ssh to them without a password (I
> suggest a ~/.ssh/authorized_keys2 file and a ~/.ssh/known_hosts2 file)
> root crontab entries that run the following commands periodically (as
> often as you require - depending on how much password latency you can live
> with)
> # first client
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/passwd root@client1:/etc/passwd
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/shadow root@client1:/etc/shadow
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/group root@client1:/etc/group
> # second client
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/passwd root@client2:/etc/passwd
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/shadow root@client2:/etc/shadow
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/group root@client2:/etc/group
> # etc
>
>
> The improved approach (a perl program I just wrote - tell me what you
> think?):
>
>
> --------------------------------------------------------------------
> David Bussenschutt Email: D.Bussenschutt at mailbox.gu.edu.au
> Senior Computing Support Officer & Systems Administrator/Programmer
> Location: Griffith University. Information Technology Services
> Brisbane Qld. Aust. (TEN bldg. rm 1.33) Ph: (07)38757079
> --------------------------------------------------------------------
>
>
>
>
> Donald Becker <becker at scyld.com>
> Sent by: beowulf-admin at beowulf.org
> 10/05/01 10:32 AM
>
>
> To: Tim Carlson <tim.carlson at pnl.gov>
> cc: Greg Lindahl <lindahl at conservativecomputer.com>, beolist
> <beowulf at beowulf.org>
> Subject: Re: NIS?
>
>
> On Thu, 4 Oct 2001, Tim Carlson wrote:
> > On Thu, 4 Oct 2001, Greg Lindahl wrote:
> >
> > > BTW, by slaves, do you mean "slave servers" or "clients"? There's a
> > > big difference. Having lots of slave servers means a push takes a
> > > while, but queries are uniformly fast.
> >
> > I meant clients.
> > 1 master, 50 clients.
> > The environment on the Sun side wasn't a cluster. 50 desktops.
>
> Completely different cases.
> Workstation clients send a few requests to the NIS server at random
> times.
> Cluster nodes will send a bunch of queries simultaneously.
>
> > Never had complaints about authentication delays. I just haven't seen
> > these huge NIS problems that everybody complains about.
>
> The problems are not failures, just dropped and delayed responses. A
> user might not notice an occasional ten second delay. When even trivial
> cluster jobs take ten seconds, you'll notice.
>
> > If you were running
> > 1000 small jobs in a couple of minutes I could imagine having problems
> > authenticating against any non-local mechanism.
>
> Hmmm, a reasonable goal is running a small cluster-wide job every
> second. I suspect the NIS delays alone take longer than one second with
> just a few nodes.
>
> > Our current cluster builds use http://rocks.npaci.edu/ for clustering
> > software. This system uses NIS. I know it is odd to hear of any other
> > system than Scyld on this list, but we have had good luck with NPACI
> > Rocks.
>
> We don't discourage discussions about other _Beowulf_ systems on this
> list. We have thought extensively about the technical challenges of
> building and running clusters, and are more than willing to share our
> experiences and solutions.
>
> Donald Becker becker at scyld.com
> Scyld Computing Corporation http://www.scyld.com
> 410 Severn Ave. Suite 210 Second Generation Beowulf Clusters
> Annapolis MD 21403 410-990-9993
