nfsd question on linux (rh 7.3)
mathog at mendel.bio.caltech.edu
Fri Jan 10 12:39:03 PST 2003
A simple job runs simultaneously on 20 compute nodes. Each job
reads the same files via NFS. In /etc/rc.d/init.d/nfs,
RPCNFSDCOUNT is set to 25, and ps -ef shows 25 nfsd processes.
(25 because there is 1 nfsd per possible client node, though some
of those nodes aren't in the compute cluster.)
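For reference, this is how the thread count can be checked and bumped
on a Red Hat 7.x-style layout (a sketch; the exact variable name
RPCNFSDCOUNT is assumed from the stock init script):

```shell
# Show the configured nfsd thread count in the init script
grep RPCNFSDCOUNT /etc/rc.d/init.d/nfs

# After editing the value (e.g. 25 -> 32), restart the service
/etc/rc.d/init.d/nfs restart

# Count the kernel nfsd threads actually running
ps -ef | grep -c '[n]fsd'
```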
Yet when the jobs are running, "top" typically shows only 3 or 4
of the nfsd processes getting any CPU time, and even then not
much: 4-5% CPU at most.
Why aren't more nfsd processes being brought in?
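One way to see how busy the thread pool actually is (assuming the
kernel exports the usual knfsd statistics) is the "th" line of
/proc/net/rpc/nfsd: the first field is the thread count, the second
counts how often every thread was busy at once, and the ten numbers
after it are roughly the seconds spent with 0-10%, 10-20%, ... of
the threads in use:

```shell
# Inspect nfsd thread-pool utilization on the server
grep ^th /proc/net/rpc/nfsd
```

If the histogram never reaches the high buckets, the extra nfsd
processes are simply idle because few requests are in flight.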
Anything else I can do to bump up throughput with NFS?
Nothing else is running. It's as if a small set of nfsd
servers is handling all the requests even though 25
are available. Essentially these
jobs just read the files from disk into memory, and then close
the files before processing anything. It appears that NFS is
rate limiting. There's about 330 MB
of data to be read across the 100baseT network to each node.
All jobs complete at roughly the same time.
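A quick back-of-envelope calculation (a sketch, assuming the server
itself sits on a single 100baseT link, which the post doesn't state)
suggests the wire, not the nfsd count, may be the limit:

```python
# Assumption: one 100 Mbit/s server NIC shared by all 20 clients.
data_per_node_mb = 330          # MB each node reads (from the post)
nodes = 20
wire_mb_per_s = 100 / 8         # 100 Mbit/s = 12.5 MB/s ceiling

total_mb = data_per_node_mb * nodes          # 6600 MB overall
best_case_s = total_mb / wire_mb_per_s       # time with the link saturated
per_node_mb_per_s = wire_mb_per_s / nodes    # fair share per client

print(total_mb, best_case_s, per_node_mb_per_s)
# -> 6600 528.0 0.625
```

At ~12.5 MB/s aggregate, a handful of nfsd threads can keep up with
very little CPU work, which would also explain the low %CPU seen in
top and why all jobs finish at roughly the same time.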
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech