[Beowulf] Troubleshooting NFS stale file handles

Ellis H. Wilson III ellis at ellisv3.com
Wed Apr 19 11:17:13 PDT 2017


On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
> Thanks for the suggestion(s). Just this morning I started considering
> the network as a possible source of error. My stale file handle errors
> are easily fixed by just restarting the nfs servers with 'service nfs
> restart', so they aren't as severe as you describe.

If restarting only the /server-side/ gets you back into a good state, 
that is an interesting tidbit.  Do you have some form of HA setup for 
NFS?  Automatic failover (sometimes set up with IP aliasing) in the face 
of network hiccups can occasionally confuse the clients if they aren't 
set up properly to follow the change.  A restart of the server will 
likely revert to using the primary, so the clients see everything as 
up and healthy again.  This situation varies so much between vendors 
that it's hard to say more without further details on your setup.
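
If it helps to narrow down which clients and mounts are affected before 
you restart anything, here is a minimal sketch in Python that reports 
paths whose stat() fails with ESTALE.  It assumes Python is available on 
the clients; the function names and the example paths are hypothetical 
placeholders of mine, not anything from your setup.

```python
import errno
import os


def is_stale(path):
    """Return True if stat() on path fails with ESTALE (stale file handle)."""
    try:
        os.stat(path)
    except OSError as exc:
        # Other failures (ENOENT, EACCES, ...) are not stale handles.
        return exc.errno == errno.ESTALE
    return False


def stale_paths(paths):
    """Return the subset of paths that currently report a stale handle."""
    return [p for p in paths if is_stale(p)]


if __name__ == "__main__":
    # Hypothetical mount points; replace with the NFS mounts on your clients.
    for path in stale_paths(["/mnt/nfs/home", "/mnt/nfs/scratch"]):
        print("stale:", path)
```

Running this on a few clients right after an error shows up, and again 
after the server restart, should at least tell you whether every client 
sees the same mounts go stale or only some of them do.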

Best,

ellis

P.S., apologies for the top-post last time around.

-- 
Ellis H. Wilson III, Ph.D.
      www.ellisv3.com

