Creating user accounts....

Fri Feb 14 08:47:25 PST 2003

On Thu, 13 Feb 2003, Srihari Angaluri wrote:

> Is there any serious performance/scalability issue to using NIS,

Yes.  One difference between a cluster and a collection of workstations
is that a cluster-wide job will cause all nodes to be active at once.
Rather than getting scattered name service requests, all requests will
arrive at once.  More than about 25 clients will cause a NIS server to
drop requests, and even in the best case you end up with a serialization
point and slow-downs.
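
As a rough illustration of why simultaneous arrival matters, here is a
toy queue model in Python.  It is not a model of any real NIS server:
the function name, the backlog size, and the service rate are all made
up for this sketch.

```python
def dropped_requests(arrivals, backlog, served_per_tick):
    """Count requests dropped by a server with a bounded queue.

    arrivals        -- number of new requests arriving at each time tick
    backlog         -- hypothetical maximum number of queued requests
    served_per_tick -- hypothetical number of requests answered per tick
    """
    queued = 0
    dropped = 0
    for arriving in arrivals:
        queued += arriving
        if queued > backlog:
            # Anything beyond the backlog is lost outright.
            dropped += queued - backlog
            queued = backlog
        # The server then works through part of the queue.
        queued = max(0, queued - served_per_tick)
    return dropped


# 64 nodes asking at the same instant, as when a cluster-wide job starts.
burst = [64] + [0] * 63
# The same 64 requests scattered over time, as with workstations.
scattered = [1] * 64
```

With a backlog of 25 and two requests served per tick, the burst loses
requests while the scattered load loses none, even though the total
number of requests is identical.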

Current practice is to change the name system, both implementation
and administrative model, to match each installation.  People
    - use NIS when there is a rapidly changing user base,
    - explicitly copy files when there is a small user and node set, or
    - synchronize large, mostly-static files with 'rsync'.

The file approach is typically implemented with ad hoc scripts, which
work well for the people who wrote them but make it difficult to
trace an error back to the change that caused it.

We developed BeoNSS, a cluster name service, to address the
administration and scaling problems.  The advantages of our BeoNSS
system are that it
    - scales very well,
    - uses an unchanged administrative model,
    - allows a single point of administration
       (add users on just the master), and
    - may be extended to support multiple administrative domains
       (each master sharing a cluster might have its own user list).

> as
> opposed to copying the individual files to each and every node on the
> cluster? Is this even a desirable option for large clusters, for
> example? What if I need to add more accounts?

You missed a big one: what about machines that are down or
non-responsive (!) when you update?

If you are willing to deal with more complexity and administration, copying
files to cluster nodes with full installs will result in good scaling,
but... then you have administration issues.
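
If you do go the file-copying route, the down-node problem means the
script has to remember which nodes missed an update.  A minimal sketch
in Python, assuming rsync over ssh is already set up between the master
and the nodes; the function names, the default path, and the timeout
values are hypothetical, not taken from any existing tool.

```python
import subprocess


def push_file(node, path="/etc/passwd", timeout=10):
    """Copy one file to one node with rsync; return True on success."""
    try:
        result = subprocess.run(
            ["rsync", "-a", f"--timeout={timeout}", path, f"{node}:{path}"],
            capture_output=True,
            timeout=timeout * 2,  # hard limit in case rsync itself hangs
        )
    except (subprocess.TimeoutExpired, OSError):
        # Node did not answer in time, or rsync could not be started.
        return False
    return result.returncode == 0


def push_to_all(nodes, push=push_file):
    """Try every node and return the ones that did NOT get the update."""
    return [node for node in nodes if not push(node)]
```

The list returned by push_to_all() is what you would have to retry
later, once the missing nodes come back.  The 'push' argument exists
only so the loop can be exercised without a real cluster.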

We designed our architecture on the following principle: compute nodes
exist to perform computations for a master.  Using BeoNSS, compute nodes
only need to know about the users that are actively running jobs.  They
don't keep any persistent configuration that might be outdated.  Only
the master needs the full user list.

Look at this from an end-goal perspective: there shouldn't be an
administrative change or a compute node performance difference between
the master having 10 users, having 10,000 users, or supporting arbitrary
job submission from an even larger user base.  We still have to try to
keep UIDs less than 32767, but that's a compatibility issue with
old-style 16-bit UIDs, not a design limit.
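
Checking a user list against that UID ceiling is easy to script.  A
small sketch in Python that scans passwd-format lines; the function
name and constant are made up for illustration, and entries without a
numeric UID field (such as NIS '+' lines) are simply skipped.

```python
MAX_COMPAT_UID = 32767  # UIDs should stay below this for 16-bit compatibility


def oversized_uids(passwd_lines, limit=MAX_COMPAT_UID):
    """Return (name, uid) pairs whose UID is not below the limit."""
    offenders = []
    for line in passwd_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # malformed line or non-numeric UID field
        uid = int(fields[2])
        if uid >= limit:
            offenders.append((fields[0], uid))
    return offenders
```

On the master this could be run as
oversized_uids(open("/etc/passwd")) before adding a batch of accounts.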

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993