[Beowulf] quick note on Redhat NFS issues with NAS units
Joe Landman
landman at scalableinformatics.com
Wed Dec 29 05:29:16 PST 2004
Jan-Frode Myklebust wrote:
>On Sun, Dec 26, 2004 at 03:59:18PM -0500, Joe Landman wrote:
>
>
>>Folks:
>>
>> Been looking into why a Redhat EL3 WS x86_64 client hangs when
>>accessing a NAS based upon SuSE 9x.
>>
>>
>
>Great, thanks for this note!
>
>I've been struggling quite a bit myself with Rocks-3.3 on opteron
>(IBM e326), with AIX as file-server. I still don't quite understand
>exactly what caused my hangs, but after reverting back to udp, and
>default mount options plus increasing the number of lock-daemons on
>the AIX-server, I now have a stable NFS. Still struggling a bit with
>the NFS performance.
>
>Should maybe test if bcm lets me go back to nfs over tcp.
>
>
I may have spoken a bit early ... It works in my test environment, works
on the compute nodes, fails on the head node. I can mount and unmount,
and intr now works. I can see the top-most directory of the mount.
But if I descend one level (say into any subdirectory) and do an ls, or
anything else that does a stat, it hangs. Only on the head node. The
compute nodes work perfectly now. No hangs. None of the above-mentioned
behavior.
I may reload the head node. I will be trying to force replication of
this in my lab, but if I cannot, I will do the head node reload. I am
starting to suspect some sort of cached state (which is incorrect) on
the head node.
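
A minimal sketch of how the symptom above could be checked without wedging
an interactive shell: each stat/ls runs in a child process with a timeout,
so a hang is reported instead of blocking the caller. The function names,
the 5 second timeout, and the example mount path are assumptions for
illustration, not something from the original setup. It relies on the
mount being interruptible (intr); on a hard mount without intr the child
may not be killable and the probe itself could block.

```python
import os
import subprocess
import sys

# Child-side probe: stat the path, and list it if it is a directory.
# This mirrors "do an ls, or something that does a stat".
_CHILD_CODE = (
    "import os, sys; p = sys.argv[1]; os.stat(p); "
    "os.path.isdir(p) and os.listdir(p)"
)

def probe_path(path, timeout=5.0):
    """Return 'ok', 'error', or 'hang' for a stat/ls of path in a child process."""
    cmd = [sys.executable, "-c", _CHILD_CODE, path]
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after the timeout: treat this as a hang.
        return "hang"
    return "ok" if result.returncode == 0 else "error"

def probe_one_level(mount_point, timeout=5.0):
    """Probe every entry one level below mount_point; returns {name: status}."""
    results = {}
    for name in sorted(os.listdir(mount_point)):
        results[name] = probe_path(os.path.join(mount_point, name), timeout)
    return results

# Example (hypothetical mount point):
#   for name, status in probe_one_level("/mnt/nas").items():
#       print(name, status)
```

Running the same probe on the head node and on a compute node would show
whether only the head node reports "hang" for the subdirectories.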
>
>
>>ps: if there are some Redhat people reading the list, you know, we would
>>like some modern kernels, and not lots of backported stuff, not to
>>mention xfs, and other goodies ... (yeah, I know, wait till EL4, ...)
>>
>>
>>
>
>Maybe someone should do a kernel-2.6 roll for Rocks...
>
>
I just pulled down the ROCKS source trees with the intention of rolling
a 2.6 kernel (with XFS, Trond's and others' NFS patches, and Andi Kleen's
x86_64 bits). If I get this done soon I'll post a note looking for crash
dummies^H^H^H^H^H^H^H^H^H^H^H^H^H volunteers to help me test.
Joe
>
> -jf
>
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 612 4615