[Beowulf] serious NFS problem on Mandrake 10.0
David Mathog mathog at mendel.bio.caltech.edu
Fri Dec 3 16:07:38 PST 2004
>
> > cp /tmp/SAVELASTMEGABLAST.txt /tmp/TESTLAST.txt
> > mv /tmp/TESTLAST.txt ./TESTLAST.txt.$NODE
> > set `md5sum TESTLAST.txt.$NODE`
> > NEWMD=$1
> > /bin/rm ./TESTLAST.txt.$NODE
> > if [ "$NEWMD" != "$HOLDMD" ]
>
> hmm. you're doing both the writes and reads from the slave node here.
> was that part of your original description? I'm wondering about
> bad writes vs bad reads. what happens if you run the md5sum on
> the master instead?

It was originally found as corrupted data on the master. Then it was
confirmed that the data looked corrupted from the slave too, so the
script ran entirely on the slave. You do have a point though:
presumably the slave is rereading the data back across the net for
the md5sum, so there are two passes where it could go wrong, and the
script didn't check whether the corruption was of the same type in
each case.

I poked around in the kernel.org bugzilla, and this sounds like it may
be the same or a closely related problem. If so, it's still around in
2.6.9:

http://bugzilla.kernel.org/show_bug.cgi?id=3608

I'll try some of your suggested changes next week - not the sort of
thing to attempt late on a Friday...

Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
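
One way the bad-write versus bad-read question might be separated is to take a digest at each hop and compare all three: the known-good reference (HOLDMD in the quoted script), the digest the slave computes on the NFS copy, and the digest the master computes on the same file in the exported directory. The following is only a sketch under those assumptions. The function names md5_of and classify are hypothetical and are not part of the original script, and the slave-side result can still be affected by client caching, so the labels are hints rather than proof.

```sh
#!/bin/sh
# Hypothetical helpers, not from the original test script.

# Print only the hex digest of a file (md5sum prints "digest  filename").
md5_of() {
    md5sum "$1" | cut -d ' ' -f 1
}

# classify REF SLAVE MASTER
#   REF    - digest of the known-good local file
#   SLAVE  - digest computed on the slave from the NFS copy
#   MASTER - digest computed on the master from the exported copy
# Prints a label naming the most likely suspect.
classify() {
    ref=$1
    slave=$2
    master=$3
    if [ "$master" = "$ref" ]; then
        if [ "$slave" = "$ref" ]; then
            echo "ok"
        else
            # File on the server is good; the slave read it back wrong.
            echo "bad-read-on-slave"
        fi
    elif [ "$slave" = "$master" ]; then
        # Both sides agree on the same wrong data: it was stored wrong.
        echo "bad-write"
    elif [ "$slave" = "$ref" ]; then
        # Slave may have been served from its own cache, or the master
        # misread the file; needs a closer look.
        echo "master-differs"
    else
        # Three different digests: more than one pass went wrong.
        echo "inconsistent"
    fi
}
```

As a usage outline: on the slave, HOLDMD would come from md5_of on the file in /tmp and the slave digest from md5_of on ./TESTLAST.txt.$NODE after the mv; the master digest would come from running md5sum on the same file on the master before it is removed. How the master's digest gets back to the slave (a shared log, a remote shell, or by hand) is left open here.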
