NIS?
David Bussenschutt
d.bussenschutt at mailbox.gu.edu.au
Sun Oct 7 21:45:13 PDT 2001
Hi all again. You all seem to be interested in making sure that:
1) if a node is down, then it gets updated when it comes up.
2) if all nodes come up at once, then you don't want the network/master
flooded with more requests.
Possible solutions and issues:
1) make clients "pull" files on boot (suggested on list)
* must add random delays to the pull so the server is not overloaded.
* must have localised scripts on client machines that perform the "pull".
(what do you do when you want to update these scripts because the pull
isn't working properly? - you can't have them "pull" the new version!)
2) have the server push when the client is up again.
* server is never overloaded because it initiates the actions (and my perl
script is currently not multi-threaded)
* all management changes are in one place.
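For option 1, the random delay could look something like the sketch below. It is only an illustration under assumptions: the names MASTER, MAX_DELAY, FILES, random_delay and pull_files are made up here, /dev/urandom is assumed to exist on the clients, and the clients are assumed to have passwordless root ssh access to the master (the reverse of what syncfiles needs).

```shell
#!/bin/sh
# Sketch of a client-side "pull" with a random start-up delay.
# MASTER, MAX_DELAY and FILES are assumed names, not part of syncfiles.
MASTER=${MASTER:-master}
MAX_DELAY=${MAX_DELAY:-120}
FILES="/etc/passwd /etc/shadow /etc/group"

# Print a random whole number of seconds in the range 0 .. max-1.
random_delay() {
    max=$1
    # two random bytes from the kernel, read as an unsigned integer (0..65535)
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( $n % $max ))
}

# Wait a random time, then pull each file from the master over ssh.
pull_files() {
    sleep "$(random_delay "$MAX_DELAY")"
    for f in $FILES; do
        /usr/bin/rsync -ae '/usr/bin/ssh -x' "root@$MASTER:$f" "$f" || return 1
    done
}

# pull_files    # uncomment to run from a boot script
```

Spreading the pulls over MAX_DELAY seconds keeps a whole-cluster reboot from hitting the master all at once, but it does not solve the problem of updating the pull script itself.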
Also, someone asked me the following questions:
how do I force an update on all nodes?
* touch the file on the server that I want to update to nodes, and
let my perl daemon push it out.
how do I force an update on one node?
* I don't, I just do all nodes. I'm sure the script could be
improved if this was required.
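The "touch" trick works because the daemon compares each file's ctime with the one it last pushed, and touch changes the ctime. A small sketch, assuming GNU stat (as on Linux); the helper names ctime_of and force_push are made up for illustration:

```shell
#!/bin/sh
# Print the ctime of a file in seconds since the epoch.
# Assumes GNU stat (as found on Linux); BSD stat uses different options.
ctime_of() {
    stat -c %Z "$1"
}

# "Force" a push: bump the ctime of each file so the daemon sees a change.
force_push() {
    for f in "$@"; do
        touch "$f" || return 1
    done
}

# Example: force_push /etc/passwd /etc/shadow /etc/group
```

Every running syncfiles daemon should then notice the newer ctime at its next check and push the file to its node.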
I have updated my perl program (below) so it handles the case where a
node/client goes down: it polls for the client periodically (waiting for
it to come back up), at a frequency that is settable at start time
(or defaults to a reasonable value).
At start time, I now just run:
syncfiles node1
syncfiles node2
syncfiles node3
etc..
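With more than a handful of nodes, the same thing could be done in a loop. This is a sketch only: the install path in SYNCFILES and the function name start_all are assumptions, and the node names are placeholders.

```shell
#!/bin/sh
# Start one syncfiles daemon per node.
# SYNCFILES is the path to the script; the default here is an assumption.
SYNCFILES=${SYNCFILES:-/usr/local/sbin/syncfiles}

# usage: start_all checkdelay retrydelay node...
start_all() {
    checkdelay=$1
    retrydelay=$2
    shift 2
    for node in "$@"; do
        # each call returns quickly because syncfiles forks into the background
        "$SYNCFILES" "$node" "$checkdelay" "$retrydelay" ||
            echo "failed to start syncfiles for $node" >&2
    done
}

# Example: start_all 10 60 node1 node2 node3
```

Since each daemon only watches one node, a node that is down slows down nothing but its own retries.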
David.
here is the 'syncfiles' program:
-------perl-code-cut-line-start-----------------
#!/usr/bin/perl -wT
#
# syncfiles
#
# usage: in a rc.sysinit or inittab on the master host, run:
#   syncfiles clienthost [checkdelay] [retrydelay]
# Designed to automatically check every 'checkdelay' seconds whether certain
# files have been modified and, if modifications have been made, push them to
# the requested client host over an ssh connection. This script runs as a
# daemon by default.
# Requires:
#  1) to run as root, so it can read the /etc files including the shadow file
#  2) rsync and ssh must be available, and at the paths defined in this script
#  3) root ssh access without a password to the client (ie
#     .ssh/authorized_keys2 on remote hosts)
#
# written by David Bussenschutt Oct 5 2001 - free for everyone - no
# responsibility accepted.
# It could be made bigger and better, but I like the KISS principle.
# October 8 2001 improvement:
# The script will now retry the send if the client is not available...
# If the client host is uncontactable (or there are other ssh connection
# problems), then the delay is changed to the (hopefully longer) 'retrydelay'
# until the host is available again, so that the network isn't flooded with
# retrying/failing requests the whole time.
use strict; # for syntax checking
use POSIX; # for 'setsid'
#-------------USER DEFINABLE SETTINGS BEGIN--------------------------------
# files to update, and the initial ctime to use
my %files = ('/etc/passwd' => '0', '/etc/shadow' => '0', '/etc/group' => '0',);
# where are the rsync and ssh commands?
# rsync should also be in the same place on the remote hosts as on the
# primary host
# the path to these is hardcoded because this script runs as root
my $rsync = '/usr/bin/rsync';
my $ssh = '/usr/bin/ssh';
# client name should be provided on command line, if it's not, the script
# won't run.
my $client = $ARGV[0];
# how often do we check for changes to the ctime of the files?
# should be provided as second command line option, or default to 10 seconds.
my $checkdelay = $ARGV[1] || 10;
# if we couldn't contact the host or had errors, how long till we retry?
# from cmd line or default to 60.
my $retrydelay = $ARGV[2] || 60;
# dissociate from terminal (1=yes)?
my $DAEMON = 1;
my $DEBUG = 1;   # (1=more output)
print "Running in DEBUG mode - modify script to turn off DEBUG and silence this output.\n" if $DEBUG;
#-------------USER DEFINABLE SETTINGS END----------------------------------
#------------ERROR CHECKING BEGIN---------------
# the paths are always defined above, so check that the programs really exist
die "rsync not found at $rsync\n" unless -x $rsync;
die "ssh not found at $ssh\n" unless -x $ssh;
die "remote host not defined on command line\n" unless defined $client;
# untaint the client name
if ($client =~ /^([-\w.]+)$/) {   # alphanumerics, underscores, hyphens and dots only.
    $client = $1;                 # now untainted
} else {
    die "Really bad data in client hostname. Only alphanumerics, hyphens and dots allowed.\n";
}
# untaint the environment that system() hands to the shell
$ENV{'PATH'} = '';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
#------------ERROR CHECKING END---------------
#--------DAEMON CODE BEGIN-------------------------
# are we going to run as a daemon?
my $pid;
if ($DAEMON == 1) {
    # only INT, TERM and HUP are REALLY needed.
    $SIG{TERM} = $SIG{INT} = $SIG{HUP} = \&signal_handler;
    # ignore $SIG{PIPE} as it's dangerous
    $SIG{PIPE} = 'IGNORE';
    # note: stdio is not closed here, so DEBUG output still reaches the terminal
    print "Dissociating from terminal and running as a daemon\n" if $DEBUG;
    # virtualise / to be in a 'safe' location
    #chroot("/var") or die "Couldn't chroot to /var: $!";
    # fork a child and let the parent exit.
    $pid = fork;
    die "Couldn't fork new process: $!" unless defined $pid;
    exit if $pid;
    # dissociate the process from the terminal and don't be part of
    # my old process 'group'
    POSIX::setsid() or die "Can't start a new session: $!";
}
sub signal_handler {
    die "syncfiles: dying on signal\n";
}
#---------DAEMON CODE END------------------------
#----------------MAIN LOOP BEGIN-----------------------
my $delay = $checkdelay;
while (1) {
    sleep $delay;
    foreach my $file (keys %files) {
        # if the ctime of the file is > the ctime stored in %files then
        #   do the push,
        #   update the ctime in %files
        my $newctime;
        if (defined($newctime = (stat($file))[10]) and $newctime > $files{$file}) {
            # build the command on one line; a newline inside it would split it in the shell
            my $cmd = "$rsync -ae '$ssh -x' --rsync-path='$rsync' $file root\@$client:$file";
            my $retval = system($cmd);
            print $cmd if $DEBUG;
            if ($retval == 0) {
                # if system call returned ok, define a new 'last-updated' ctime.
                $files{$file} = $newctime;
                $delay = $checkdelay;
                print " -- done ok\n" if $DEBUG;
            } else {
                # if there was an error, then wait for a longer period
                # before trying again.
                $delay = $retrydelay;
                print " -- error occurred\n" if $DEBUG;
            }
        } elsif (!defined $newctime or $newctime <= 1) {
            # file that should exist does not exist or other error
            die "error getting ctime of $file\n";
        } # else no update of this file needed.
    }
}
#----------------MAIN LOOP END-----------------------
-------perl-code-cut-line-end------------------
--------------------------------------------------------------------
David Bussenschutt Email: D.Bussenschutt at mailbox.gu.edu.au
Senior Computing Support Officer & Systems Administrator/Programmer
Location: Griffith University. Information Technology Services
Brisbane Qld. Aust. (TEN bldg. rm 1.33) Ph: (07)38757079
--------------------------------------------------------------------
Steven Timm <timm at fnal.gov>
10/05/01 11:28 PM
To: David Bussenschutt <d.bussenschutt at mailbox.gu.edu.au>
cc: beolist <beowulf at beowulf.org>
Subject: Re: NIS?
The rsync script is a good idea and something we are thinking
of implementing--only problem is...how do you handle the
situation when a node happens to be down during a push?
Steve
------------------------------------------------------------------
Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division/Operating Systems Support
Scientific Computing Support Group--Computing Farms Operations
On Fri, 5 Oct 2001, David Bussenschutt wrote:
> Slight side-bar here, but I think it relates:
>
> My chain of thought:
>
> 1) everyone agrees NIS works (even if it is arguable about the speed,
> reliability, security etc)
> 2) everyone agrees that it can have/cause problems in some situations -
> especially beowulf speed related ones.
> 3) the speed has to do with the synchronisation delays inherent in a
> bidirectional on-the-fly network daemon approach like NIS
> 4) many people prefer the files approach for speed/simplicity (ie to
> avoid problems in 3).
> 5) In a beowulf cluster, passwords shouldn't be changed on nodes, so a
> server push password system is all that's required - hence the files
> approach in 4).
> 6) why not have the best of both worlds? What we need is a little
> daemon on the server that pushes the passwd/shadow/group/etc files to the
> clients over a ssh link whenever the respective file is modified on the
> server.
> 7) How I suggest implementing this:
>
> The naive/simple approach:
> set up the clients so that root can ssh to them without a password (I
> suggest a ~/.ssh/authorized_keys2 file and a ~/.ssh/known_hosts2 file)
> root crontab entries that run the following commands periodically (as
> often as you require - depending on how much password latency you can
> live with)
> # first client
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/passwd root@client1:/etc/passwd
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/shadow root@client1:/etc/shadow
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/group root@client1:/etc/group
> # second client
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/passwd root@client2:/etc/passwd
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/shadow root@client2:/etc/shadow
> /usr/bin/rsync -ae 'ssh -x' --rsync-path='/usr/bin/rsync' /etc/group root@client2:/etc/group
> # etc
>
>
> The improved approach (a perl program I just wrote - tell me what you
> think?):
>
>
>
>
>
>
> Donald Becker <becker at scyld.com>
> Sent by: beowulf-admin at beowulf.org
> 10/05/01 10:32 AM
>
>
> To: Tim Carlson <tim.carlson at pnl.gov>
> cc: Greg Lindahl <lindahl at conservativecomputer.com>, beolist
> <beowulf at beowulf.org>
> Subject: Re: NIS?
>
>
> On Thu, 4 Oct 2001, Tim Carlson wrote:
> > On Thu, 4 Oct 2001, Greg Lindahl wrote:
> >
> > > BTW, by slaves, do you mean "slave servers" or "clients"? There's a
> > > big difference. Having lots of slave servers means a push takes a
> > > while, but queries are uniformly fast.
> >
> > I meant clients.
> > 1 master, 50 clients.
> > The environment on the Sun side wasn't a cluster. 50 desktops.
>
> Completely different cases.
> Workstation clients send a few requests to the NIS server at random
> times.
> Cluster nodes will send a bunch of queries simultaneously.
>
> > Never had complaints about authentication delays. I just haven't seen
> > these huge NIS problems that everybody complains about.
>
> The problems are not failures, just dropped and delayed responses. A
> user might not notice an occasional ten second delay. When even trivial
> cluster jobs take ten seconds, you'll notice.
>
> > If you were running
> > 1000 small jobs in a couple of minutes I could imagine having problems
> > authenticating against any non-local mechanism.
>
> Hmmm, a reasonable goal is running a small cluster-wide job every
> second. I suspect the NIS delays alone take longer than one second with
> just a few nodes.
>
> > Our current cluster builds use http://rocks.npaci.edu/ for clustering
> > software. This system uses NIS. I know it is odd to hear of any other
> > system than Scyld on this list, but we have had good luck with NPACI
> > Rocks.
>
> We don't discourage discussions about other _Beowulf_ systems on this
> list. We have thought extensively about the technical challenges
> building and running clusters, and are more than willing to share our
> experiences and solutions.
>
> Donald Becker becker at scyld.com
> Scyld Computing Corporation http://www.scyld.com
> 410 Severn Ave. Suite 210 Second Generation Beowulf Clusters
> Annapolis MD 21403 410-990-9993
>
>