NIS?


Tim Carlson tim.carlson at pnl.gov
Thu Oct 4 21:24:18 PDT 2001


On Thu, 4 Oct 2001, Donald Becker wrote:

> > If you were running
> > 1000 small jobs in a couple of minutes I could imagine having problems
> > authenticating against any non-local mechanism.
>
> Hmmm, a reasonable goal is running a small cluster-wide job every
> second.  I suspect the NIS delays alone take longer than one second with
> just a few nodes.

So I ran the following test on one of our small clusters.

The setup: 6 NIS client nodes with one NIS master (the front-end node)
and no NIS slave servers. The machines are dual 800 MHz Pentium IIIs
connected through a Fast Ethernet switch.
Forgive my sloppy C shell programming :)

The "script" is basically 100 rsh calls, each of which does some NIS work
to look up the ownership of a file.

I am doing an ls -l on /tmp, which contains only 3 or 4 files, but I own
two of them, so NIS is consulted for file ownership. I took NFS delays
out of the picture by going to /tmp.


#!/bin/tcsh
# Run "ls -l /tmp" on the host given as $1, 100 times in a row.
set i=0
while ($i < 100)
    rsh $1 ls -l /tmp > /dev/null
    set i=`expr $i + 1`
end

[tim@frontend-0 tim]$ time ./script compute-0-0
real    0m12.704s
user    0m0.520s
sys     0m0.440s

So if the job itself takes zero time and connecting to a machine takes
zero time, then the NIS overhead is at most about 1/8 of a second per
call. I ran this half a dozen times and the run time varied between 10
and 13 seconds.
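
The per-call figure is simply the wall-clock time divided by the number of
rsh calls. A minimal Python sketch of that arithmetic, using only the
numbers reported in this message (the variable names are illustrative):

```python
# Wall-clock time of one run and the number of rsh calls per run,
# both taken from the timing output in this message.
real_seconds = 12.704
calls = 100

# Upper bound on per-call NIS overhead: this charges the whole elapsed
# time to NIS, i.e. it assumes the job and the rsh connection cost nothing.
per_call = real_seconds / calls
print(f"{per_call:.3f} s per call")

# The observed spread of 10 to 13 seconds per run, expressed per call.
low, high = 10 / calls, 13 / calls
print(f"range: {low:.2f} to {high:.2f} s per call")
```

Since rsh connection setup and the remote ls are certainly not free, the
real NIS share is smaller than this bound.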

Now I point this script at 6 nodes at the same time (or at least as fast
as I can hit return in 6 xterms) and the mean time per run is about 31
seconds. That puts my potential NIS delay at a maximum of about 1/3 of a
second per call. But I have also launched 600 jobs in 31 seconds.
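
The same arithmetic for the 6-node run, as a rough Python sketch. The 31
second figure is the approximate mean quoted in this message, and the
variable names are illustrative:

```python
nodes = 6                 # scripts running concurrently, one per node
calls_per_node = 100      # rsh calls made by each script
mean_real_seconds = 31.0  # approximate mean wall-clock time per run

# Total remote commands launched across the cluster during the test.
total_jobs = nodes * calls_per_node

# Worst-case delay per call, charging all elapsed time to NIS.
per_call = mean_real_seconds / calls_per_node

# Aggregate launch rate seen from the front end.
throughput = total_jobs / mean_real_seconds

print(f"{total_jobs} jobs, {per_call:.2f} s per call, "
      f"{throughput:.1f} jobs per second overall")
```

So each call got slower under load (about 0.31 s instead of about 0.13 s),
while the aggregate rate went up to roughly 19 launches per second.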

Two examples from the larger test:

[tim@frontend-0 tim]$ date; time ./script compute-0-0
Thu Oct  4 21:06:53 PDT 2001

real    0m30.905s
user    0m0.600s
sys     0m0.540s

[tim@frontend-0 tim]$ date; time ./script compute-0-2
Thu Oct  4 21:06:52 PDT 2001

real    0m30.075s
user    0m0.530s
sys     0m0.710s


Output of "ps -ax | grep ypserv" on the master node, before and after
the test:
  639 ?        S     73:08 ypserv
  639 ?        S     73:10 ypserv
So ypserv used about 2 seconds of CPU time.
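
For anyone repeating the measurement, the ypserv cost can be worked out
from the two ps TIME fields. This is only a sketch: it assumes the TIME
column is cumulative CPU time in minutes:seconds, and that the two
samples bracket the 6-node test (600 rsh calls). The helper name
cpu_seconds is hypothetical.

```python
def cpu_seconds(ps_time: str) -> int:
    """Convert a ps TIME field of the form MM:SS into whole seconds."""
    minutes, seconds = ps_time.split(":")
    return int(minutes) * 60 + int(seconds)

before = cpu_seconds("73:08")
after = cpu_seconds("73:10")
delta = after - before          # ypserv CPU seconds used during the test

# Assumption: the samples were taken around the 6 x 100 call test.
rsh_calls = 600
per_call_ms = 1000 * delta / rsh_calls

print(f"ypserv used {delta} s of CPU, about {per_call_ms:.1f} ms per rsh call")
```

ps only reports whole seconds here, so the delta is good to about one
second either way, and the per-call figure is an order-of-magnitude
estimate rather than a precise cost.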


My first version of the script ran "touch /tmp/testfile" instead and
produced similar results. My /etc/nsswitch.conf files say "files nis"
and the only entry in /etc/passwd on the compute nodes is root.

I am willing to be enlightened as to how my test is flawed. I'll run
different tests if asked. Is my test too trivial?

Tim

Tim Carlson
Voice: (509) 376-0300
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support




