[Beowulf] Transient NFS Problems in New Cluster

Prentice Bisbal prentice at ias.edu
Wed Feb 3 07:22:17 PST 2010



Jon Forrest wrote:
> I have a new cluster running CentOS 5.3.
> The cluster uses a Sun 7310 storage server
> that provides NFS service over a private
> 1Gb/s ethernet with 9K jumbo frames to the
> cluster.
> 
> We've noticed that a number of the compute
> nodes sometimes generate the
> 
> automount[15023]: umount_autofs_indirect: ask umount returned busy /home
> 
> message. When this happens the program running on the
> node dies. This has happened between 10 and 20 times.
> We're not sure what's going on on a node when this
> happens. Most of the time everything is fine and
> the home directories are automounted without problem.
> 
> I've googled for this problem and I see that other people
> have seen it too, but I've never seen a resolution,
> especially not for RHEL5.
> 
> The auto.master line for this mount is
> 
> /home  /etc/auto.home  --timeout=1200
> noatime,nodiratime,rw,noacl,rsize=32768,wsize=32768
> 
> The network interface configuration is
> 

Jon,

I had this exact same problem a couple of weeks ago after changing the
automounting scheme on our network, which required all nodes to reread
the automounter configuration. It only happened on a few nodes.

My only solution was to reboot the nodes with the problem. After
rebooting, 'service autofs reload' or 'service autofs restart' worked
without a problem.
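
To work out which nodes need the reboot, something like the following
might help. It is only a minimal sketch: the function name busy_umounts
is made up for illustration, and the default log path /var/log/messages
is an assumption about where syslog writes on these nodes. It counts the
"umount returned busy" messages per mount point in a syslog-format file.

```shell
#!/bin/sh
# Hypothetical helper (not part of autofs): count automount
# "umount returned busy" messages per mount point in a syslog file.
# The default path is an assumption; pass another file as $1 if needed.
busy_umounts() {
    logfile="${1:-/var/log/messages}"
    # The mount point is the last field of the message, e.g. /home.
    grep 'umount_autofs_indirect: ask umount returned busy' "$logfile" |
        awk '{ count[$NF]++ } END { for (m in count) print count[m], m }'
}
```

Run on each node, for example as `busy_umounts /var/log/messages`, it
prints one "count mountpoint" line per affected mount and nothing on
nodes that never logged the message. On an affected node,
`fuser -vm /home` may show what is holding the mount, though I can't say
whether that explains the failure described here.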

I'm sure that's not the answer you were looking for, but that's all I
got. Sorry. I suspect it's a bug in the automount daemon, but I can't
prove it.


-- 
Prentice


