[Beowulf] Fwd: NIS limitations question
Donald Becker
becker at scyld.com
Mon Feb 6 14:36:51 PST 2006
On Sun, 5 Feb 2006, Walid wrote:
> I believe I have seen on this mailing list*, and other internet forums**, some
> limitations of NIS, but I have failed to find a documented limitation from
> Sun, or from the various Linux distributions. Did anyone try to research
> the scalability of NIS servers?
The scalability depends on many details of your environment, and the
timing of the requests.
Remember that NIS was designed for a workstation environment, where
human-rate requests generate asynchronous events. It wasn't designed for
a cluster environment where a single application generates queries from
every node simultaneously, and where the system state (e.g. number of
nodes) might change frequently and new applications expect the state to be
current.
There are ways to tune NIS (increase the backlog) to minimize the
observable problems. But that doesn't fix them, it only makes them
less obvious for the current cluster scale and application set.
> The reason I am asking: on a 256-node cluster using GigE with two Linux NIS
> slaves we do see lots of RPC timeouts. The moment we added an extra slave
> we have not experienced many, but on the other hand our Solaris NIS slaves
> handle triple the number of clients, and users have not reported problems.
>
> So my question: in these big clusters that have 256 nodes and more, what do
> people use for host and name lookups? How many NIS slaves, if any, do
> they deploy? And does anyone know how many concurrent connections an NIS
> server can handle?
We developed a cluster-specific name service / directory service called
BeoNSS. It uses knowledge about the cluster structure to cache, compute,
or avoid name lookups. Some examples:
Host map
We number cluster compute nodes sequentially starting at '0', and
map them to sequential IP addresses.
We then use names based on these numbers: node 23 is named
".23", with aliases "cluster.23", "23.cluster", "23.cluster0", and
"<prefix>23". BeoNSS knows these formats and returns the address
calculated from the known IP address of node 0 and other info (node
count, netmask, preferred interface, cluster name).
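Roughly, the computed lookup works like the sketch below. This is not
the actual BeoNSS code; the base address, cluster name, and name
patterns are made up for illustration:

    import ipaddress
    import re

    # Hypothetical address of node 0; BeoNSS derives this from the cluster
    # configuration (node count, netmask, preferred interface).
    NODE0_ADDR = ipaddress.IPv4Address("10.0.0.100")
    CLUSTER = "cluster"

    # Accept the name forms described above: ".23", "cluster.23", "23.cluster"
    PATTERNS = [
        re.compile(r"^\.(\d+)$"),
        re.compile(r"^%s\.(\d+)$" % CLUSTER),
        re.compile(r"^(\d+)\.%s0?$" % CLUSTER),
    ]

    def node_address(name):
        """Return the computed IP for a node name, or None (soft fail)."""
        for pat in PATTERNS:
            m = pat.match(name)
            if m:
                return NODE0_ADDR + int(m.group(1))
        return None   # "don't know, ask the next service on the list"

    print(node_address("cluster.23"))   # 10.0.0.123
    print(node_address("fileserver"))   # None -> fall back to NIS/DNS

The point is that node 23's address is arithmetic, not a map lookup, so
it costs nothing on the network and is always consistent with the
cluster configuration.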
Netgroup map
Netgroups are used for file server exports and security.
We use much the same approach to generate the list of compute node
names in the cluster.
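In the same spirit, a netgroup entry for the compute nodes can be
generated from the node count instead of being stored. A rough sketch
(the group and host names here are hypothetical; the (host,user,domain)
triples are standard netgroup syntax):

    def compute_netgroup(node_count, prefix="cluster"):
        """Generate a netgroup naming every compute node, e.g. for NFS exports."""
        triples = " ".join("(%s.%d,,)" % (prefix, n) for n in range(node_count))
        return "compute " + triples

    print(compute_netgroup(4))
    # compute (cluster.0,,) (cluster.1,,) (cluster.2,,) (cluster.3,,)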
Password and group
We send credentials out with each job, so that the process has a
preserved passwd and group entry. BeoNSS uses that information to
generate a getpwent() entry for the user and a synthetic entry for
"root". (Note that this approach automatically handles disjoint user
sets from multiple masters, and is one element of highly secure
servers, since the process doesn't have access to the list of
other users.)
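As a rough illustration of the passwd idea (the field layout is
standard passwd(5) syntax, but the capture-and-ship mechanism shown
here is invented for the sketch, not how BeoNSS actually packages
credentials):

    import os, pwd

    def synthetic_passwd(cred):
        """Build the only two passwd(5) lines a compute node needs to see."""
        root = "root:x:0:0:root:/root:/bin/sh"
        user = "%s:x:%d:%d:%s:%s:%s" % (cred["name"], cred["uid"], cred["gid"],
                                        cred["gecos"], cred["home"], cred["shell"])
        return root + "\n" + user

    # On the master: capture the submitting user's entry to ship with the job.
    p = pwd.getpwuid(os.getuid())
    cred = dict(name=p.pw_name, uid=p.pw_uid, gid=p.pw_gid,
                gecos=p.pw_gecos, home=p.pw_dir, shell=p.pw_shell)
    print(synthetic_passwd(cred))

The node-side getpwent()/getpwuid() answers then come from the shipped
entry, so no passwd map is consulted at all.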
These are not the only name services that BeoNSS provides, but they are
good examples of how a cluster-specific name service can make the cluster
faster, easier to scale and more consistent.
BeoNSS works with other name services. If a cluster requires other name
services, it's easy to configure them as fall-back services. This
works very well, since BeoNSS handles the really troublesome queries (an
application generating an all-to-all IP address map on each node
simultaneously, or libc looking up a user name at start-up from a 10,000
entry passwd map), while taking a negligible amount of time to return a
soft fail ("don't know, ask the next service on the list").
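To make the fall-back behavior concrete, here is a toy version of the
lookup order (real BeoNSS plugs into the libc name service switch
instead; the address and name pattern are made up):

    import ipaddress, re, socket

    BASE = ipaddress.IPv4Address("10.0.0.100")   # hypothetical node-0 address

    def lookup(name):
        """Answer computed node names locally; soft-fail to the next service."""
        m = re.match(r"^cluster\.(\d+)$", name)
        if m:
            return str(BASE + int(m.group(1)))   # no network traffic at all
        # Soft fail: defer to whatever the system is configured to ask next
        # (files, NIS, DNS, ...).
        return socket.gethostbyname(name)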
There are other approaches that clusters have used:
The most obvious is copying out files to each node's /etc/. This has the usual
problems of consistency and synchronization. You might think that you'll remember
to push out new copies with each update. But what about machines that are
down? Or booting? Or up but not responding right now?
I've seen systems that use NSCD, the Name Service Caching Daemon.
It's another "it seems to work for me, at least today" solution. Like
most caching systems, it reduces traffic in the common case. But it
doesn't handle update consistency, and won't handle the start-up backlog
and dropped-request problem.
--
Donald Becker becker at scyld.com
Scyld Software Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220 www.scyld.com
Annapolis MD 21403 410-990-9993