[Beowulf] Troubleshooting NFS stale file handles

Prentice Bisbal pbisbal at pppl.gov
Thu Apr 20 14:01:36 PDT 2017


On 04/19/2017 03:21 PM, Jörg Saßmannshausen wrote:
> Hi Prentice,
>
> three questions (not necessarily to you and it can be dealt with in a different
> thread too):
>
> - why automount and not a static mount?
Well, I've been told that, in general, automounting reduces the load on
the servers, since the mounts only exist when they are needed. I'm
somewhat skeptical of that specific claim myself. In this case, though,
these /l/hostname directories aren't used by every job, so a majority of
the cluster nodes aren't actually serving out these dirs over NFS at any
given time, so there is some truth to it.

Automounting certainly makes life easier when your home directories or
project directories are spread over many different servers. There's no
need for a massive /etc/fstab, and it's easy to move directories from
server to server when needed without updating the /etc/fstab on every
single server, so there's that.

Just about everywhere I've worked, /home and project/shared directories
are automounted, and directories like /usr/local are statically mounted.
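
As a rough illustration of why this is less work than a big /etc/fstab, a
wildcard autofs map for a layout like the /l/hostname directories could
look something like the sketch below. The map file name /etc/auto.l, the
timeout, and the mount options are assumptions for illustration, not the
actual configuration on this cluster.

```
# /etc/auto.master (fragment)
# Hand everything under /l to the map file below; unmount after 5 idle minutes.
/l    /etc/auto.l    --timeout=300

# /etc/auto.l (hypothetical map file name)
# "*" matches whatever name is looked up under /l, and "&" is replaced by
# that same name, so /l/hostA is mounted from hostA:/local on demand.
*     -rw,hard     &:/local
```

With a map like this, adding a node needs no per-host entry at all, which
is the part a static fstab can't give you.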
> - do I get that right that the nodes themselves export shares to other nodes?
Exactly! It's not how I would do it. In fact, I think this is a horrible
idea, but I inherited it from those who came before me, and have to live
with it now.
> - has anything changed? I am thinking of something like more nodes added, new
> programs being installed, more users added, generally a higher load on the
> cluster.
Not on my end. After dealing with this problem for close to weeks, it
finally came out that a user changed his code a few days before these
problems started, but at the moment, there's no evidence that that
change broke things. I rebooted all the nodes yesterday as a 'hail
mary', and jobs have been running just fine ever since, so that's an
important clue to this mystery (some sort of resource exhaustion?).
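
Since a reboot wipes out the evidence of which mounts were actually
affected, something like the following sketch could be used to record
that before the next restart. It just runs stat on a path and looks for
a "Stale file handle" message in the error output. The function name
check_handle is made up for this example, and note that stat can block
on a hard-mounted share whose server is unresponsive.

```shell
#!/bin/sh
# Classify a path as "ok", "stale" (ESTALE reported by stat), or "error"
# (any other failure). Prints "<state> <path>" on stdout.
check_handle() {
    path=$1
    # Capture only stderr; LC_ALL=C keeps the error message in English.
    if err=$(LC_ALL=C stat -- "$path" 2>&1 >/dev/null); then
        echo "ok $path"
    else
        case $err in
            # Matches both "Stale file handle" and the older
            # "Stale NFS file handle" wording.
            *Stale*"file handle"*) echo "stale $path" ;;
            *) echo "error $path" ;;
        esac
    fi
}
```

For example, "for d in /l/*; do check_handle "$d"; done" would list the
state of whatever is currently mounted under /l.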
>
> One problem I had in the past with my 112 node cluster where I am exporting
> /home, /opt and one directory in /usr/local to all the nodes from the headnode
> was that the NFS-server on the headnode did not have enough spare servers
> assigned and thus was running out of capacity. That also led to strange
> behaviour which I fixed by increasing the numbers of spare servers.
In this case, there's probably only a single client accessing one of
these NFS shares at a time, maybe 2-3 at most, so I don't think it's
likely that the server is being overwhelmed by clients.
>
> The way I have done that was setting this in
> /etc/default/nfs-kernel-server
>
> # Number of servers to start up
> RPCNFSDCOUNT=32
>
> That seems to provide the right amount of servers and spare ones for me.
> Like in your case, the cluster was running stable until I added more nodes
> *and* users decided to use them, i.e. the load on the cluster went up. A more
> idle cluster did not show any problems, a cluster under 80 % load suddenly had
> problems.
>
> I hope that helps a bit. I am not an expert in NFS either and this is just
> my experience. I am also using Debian nfs-kernel-server 1:1.2.6-4 if that
> helps.

I think it's unlikely that this will fix my issue, but I'm not ruling 
anything out at this time. Thanks for the suggestion.
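
For anyone who wants to check whether the thread-count suggestion applies
to their own server, here is a minimal sketch that compares the number of
running nfsd threads with a target. It assumes a Linux kernel NFS server
that exposes the count in /proc/fs/nfsd/threads. The function name and the
optional second argument (an alternate file to read, handy for testing)
are made up for this example.

```shell
#!/bin/sh
# Compare the running nfsd thread count against a desired value.
# Usage: nfsd_thread_report DESIRED [THREADS_FILE]
# Returns 0 if running >= desired, 1 if below, 2 if the file is unreadable.
nfsd_thread_report() {
    desired=$1
    threads_file=${2:-/proc/fs/nfsd/threads}
    if [ ! -r "$threads_file" ]; then
        echo "cannot read $threads_file (is the NFS server running?)" >&2
        return 2
    fi
    running=$(cat "$threads_file")
    if [ "$running" -lt "$desired" ]; then
        echo "running=$running desired=$desired: below target"
        return 1
    fi
    echo "running=$running desired=$desired: ok"
}
```

Running "nfsd_thread_report 32" on the server would show whether it is
actually running the 32 threads that RPCNFSDCOUNT=32 asks for.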
>
> All the best from a sunny London
>
> Jörg
>
> On Wednesday 19 April 2017 Prentice Bisbal wrote:
>> On 04/19/2017 02:17 PM, Ellis H. Wilson III wrote:
>>> On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
>>>> Thanks for the suggestion(s). Just this morning I started considering
>>>> the network as a possible source of error. My stale file handle errors
>>>> are easily fixed by just restarting the nfs servers with 'service nfs
>>>> restart', so they aren't as severe as you describe.
>>> If a restart on solely the /server-side/ gets you back into a good
>>> state this is an interesting tidbit.
>> That is correct, restarting NFS on the server-side is all it takes to
>> fix the problem
>>
>>> Do you have some form of HA setup for NFS?  Automatic failover
>>> (sometimes set up with IP aliasing) in the face of network hiccups can
>>> occasionally goof the clients if they aren't set up properly to keep up
>>> with the change.  A restart of the server will likely revert back to
>>> using the primary, resulting in the clients thinking everything is
>>> back up and healthy again.  This situation varies so much between
>>> vendors it's hard to say much more without more details on your setup.
>> My setup isn't nearly that complicated. Every node in this cluster has a
>> /local directory that is shared out to the other nodes in the cluster.
>> The other nodes automount this remote directory as /l/hostname, where
>> "hostname" is the name of the owner of the filesystem. For example, hostB
>> will mount hostA:/local as /l/hostA.
>>
>> No fancy fail-over or anything like that.
>>
>>> Best,
>>>
>>> ellis
>>>
>>> P.S., apologies for the top-post last time around.
>> No worries. I'm so used to people doing that in mailing lists that I've
>> become numb to it.
>>
>> Prentice
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
