[Beowulf] Re: NFS Read Errors

Joe Landman landman at scalableinformatics.com
Tue Dec 4 13:22:32 PST 2007


David Mathog wrote:
> I missed the beginning of this thread - what were the parameters
> in /etc/fstab on the client?
> 
> Unless hard mounts are used it is possible for a block of 
> null bytes to end up in the file where data was supposed to be.
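
One way to check for the symptom David describes is to scan a suspect file for blocks made up entirely of zero bytes. What follows is a minimal sketch, not a definitive tool: the function name find_null_blocks and the 4096-byte block size are assumptions made for illustration (the size that matters in practice depends on the rsize/wsize the client negotiated), and a file that legitimately contains long runs of zeros will produce false positives.

```python
def find_null_blocks(path, block_size=4096):
    """Return the byte offsets of every block that consists only of zero bytes.

    block_size is an assumed value; pick one that matches the NFS transfer size.
    """
    offsets = []
    zero_block = bytes(block_size)  # block_size zero bytes, used for comparison
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # A short final block is compared against zeros of its own length.
            if block == zero_block[:len(block)]:
                offsets.append(offset)
            offset += len(block)
    return offsets
```

Running this against a copy of the file read over NFS and against the original on the server's local disk, then comparing the two lists, would show whether the zero blocks appear only on the NFS side.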

I think his issue is one of an over-zealous retry loop somewhere ...  He 
is using UDP mounts by default (he could do a "mount -o remount,tcp /path" 
to change to TCP, but I don't think this will help).

It sounded to me like a bad HD, but his local HD reads/writes seem ok 
(is this correct)?

It could be

	a) bad driver
	b) bad NIC
	c) bad PCI slot
	d) bad cable
	e) bad switch
	f) bad switch port
	g) other things :)
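
Several of the candidates above (NIC, cable, switch port) tend to show up as non-zero error counters on the interface. As a rough first pass, the sketch below parses the text of /proc/net/dev and reports interfaces with non-zero error, drop, or collision counts. The function names are hypothetical, and the 16-column layout is the one used by current Linux kernels; I am assuming the old 2.4 kernel here lays the file out the same way, so check the header line before trusting the numbers.

```python
# Column order of the counters in /proc/net/dev (assumed layout, see above).
FIELDS = [
    "rx_bytes", "rx_packets", "rx_errs", "rx_drop",
    "rx_fifo", "rx_frame", "rx_compressed", "rx_multicast",
    "tx_bytes", "tx_packets", "tx_errs", "tx_drop",
    "tx_fifo", "tx_colls", "tx_carrier", "tx_compressed",
]

# Counters that should normally stay at (or near) zero on a healthy link.
ERROR_FIELDS = [
    "rx_errs", "rx_drop", "rx_fifo", "rx_frame",
    "tx_errs", "tx_drop", "tx_fifo", "tx_colls", "tx_carrier",
]


def parse_net_dev(text):
    """Map interface name -> {counter name: value} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        values = rest.split()
        if len(values) != len(FIELDS) or not all(v.isdigit() for v in values):
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = dict(zip(FIELDS, (int(v) for v in values)))
    return stats


def suspect_interfaces(stats):
    """Return only the interfaces with non-zero error counters, and which ones."""
    suspects = {}
    for name, counters in stats.items():
        bad = {k: counters[k] for k in ERROR_FIELDS if counters[k] != 0}
        if bad:
            suspects[name] = bad
    return suspects
```

On the client and the server this could be used as suspect_interfaces(parse_net_dev(open("/proc/net/dev").read())), once before and once after a large NFS transfer. Counters that climb during the transfer point at the network path rather than the disk.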

The gear he was using is *old*, and the distro is a 2.4.20 based thing 
(RH9 I think?).

If it is worth the time and effort to hunt it down, I might suggest 
investing in a pair of new (different) NICs, putting them in a node with 
a crossover cable, and making sure he can pass data back and forth 
without issue.  Then see if the problem emerges when changing one thing at 
a time (or bisect the search space, but the list is short enough that 
either one would work well).
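
For the "pass data back and forth without issue" step, one possible check is to write a reproducible pseudo-random pattern on one side, move it across the link, and compare checksums on the other side. The sketch below is only an illustration under stated assumptions: the function names, the seed, and the chunk size are all made up here, and both ends must use the same seed, size, and chunk size so that they generate the same byte stream.

```python
import hashlib
import random


def pattern_chunks(seed, total_bytes, chunk_size=1 << 20):
    """Yield a deterministic pseudo-random byte stream in chunks."""
    rng = random.Random(seed)
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk_size, remaining)
        # getrandbits gives n*8 random bits, turned into exactly n bytes.
        yield rng.getrandbits(8 * n).to_bytes(n, "little")
        remaining -= n


def write_pattern(path, seed, total_bytes):
    """Write the pattern to path (for example, a file on the NFS mount)."""
    with open(path, "wb") as f:
        for chunk in pattern_chunks(seed, total_bytes):
            f.write(chunk)


def pattern_digest(seed, total_bytes):
    """SHA-256 of the pattern, computed without touching any file."""
    h = hashlib.sha256()
    for chunk in pattern_chunks(seed, total_bytes):
        h.update(chunk)
    return h.hexdigest()


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of the bytes actually stored in (or read back from) path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
```

Write the file from one machine with write_pattern, then on the other machine compare file_digest(path) with pattern_digest(seed, total_bytes). Reading the file back on the machine that wrote it may be served from the local cache and prove nothing about the wire, so the comparison should be done from the far end. Repeating the run after each single hardware change keeps the one-thing-at-a-time approach honest.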

> 
> Regards,
> 
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


