Cluster programming...

Karl Bellve Karl.Bellve at umassmed.edu
Wed Jan 22 07:52:31 PST 2003


I am running into a problem with multiple writes to a single file via 
NFS.


An application is spawned on a number of nodes. When the nodes are done, 
each one writes to its own specific, non-overlapping area of an 
NFS-mounted file. I use fcntl(fd, F_SETLKW, &lck) to lock the file, which 
waits until the file can be locked for writing. fcntl() is capable of 
locking across NFS. However, some nodes fail to write their results to 
the file. It isn't the same nodes every time. I am not seeing any write 
errors. I tend to think it is an NFS caching issue. All writes are 
flushed via fsync() and close() before the lock is released.
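
For reference, a minimal sketch of the lock / write / flush / unlock 
sequence described above. It is an illustration under stated assumptions, 
not the application's actual code: the helper name write_region_locked() 
and its parameters are hypothetical, and it locks only the caller's byte 
range rather than the whole file (setting l_start and l_len to 0 would 
lock the whole file instead). Every return value is checked, including 
those of fsync() and close(), because deferred write errors may only be 
reported there.

```c
#define _XOPEN_SOURCE 700   /* for pwrite() and fsync() under strict modes */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): lock one byte range
 * of an existing file, write it, flush it, then release the lock.
 * Returns 0 on success, -1 if any step failed. */
int write_region_locked(const char *path, off_t offset,
                        const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) { perror("open"); return -1; }

    struct flock lck = {0};
    lck.l_type   = F_WRLCK;     /* exclusive (write) lock */
    lck.l_whence = SEEK_SET;    /* l_start is an absolute offset */
    lck.l_start  = offset;
    lck.l_len    = (off_t)len;  /* lock only this caller's region */

    /* F_SETLKW blocks until the lock is granted; retry if a signal
     * interrupts the wait. */
    while (fcntl(fd, F_SETLKW, &lck) < 0) {
        if (errno != EINTR) {
            perror("fcntl(F_SETLKW)");
            close(fd);
            return -1;
        }
    }

    int rc = 0;
    const char *p = buf;
    size_t done = 0;
    while (done < len) {        /* pwrite() may write fewer bytes than asked */
        ssize_t n = pwrite(fd, p + done, len - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR) continue;
            perror("pwrite");
            rc = -1;
            break;
        }
        done += (size_t)n;
    }

    /* Flush to the server while the lock is still held. */
    if (rc == 0 && fsync(fd) < 0) { perror("fsync"); rc = -1; }

    lck.l_type = F_UNLCK;       /* release the same byte range */
    if (fcntl(fd, F_SETLK, &lck) < 0) { perror("fcntl(F_UNLCK)"); rc = -1; }

    if (close(fd) < 0) { perror("close"); rc = -1; }
    return rc;
}
```

Each node would call this with its own offset and length, for example 
write_region_locked(path, rank * record_size, result, record_size), where 
rank and record_size are placeholder names. A return value of -1 from any 
node would at least show which step failed.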

The fileserver is a Red Hat 8.0 system. I upgraded to the latest kernel 
offered for RH8.0, which didn't fix the problem. I then compiled a new 
kernel (2.4.20), and that didn't fix it either. The nodes are Alphas 
running RH6.2.

I am thinking about alternative means of locking, but fcntl() should do 
the trick.






-- 
Cheers,



Karl Bellve, Ph.D.                   ICQ # 13956200
Biomedical Imaging Group             TLCA# 7938 		
University of Massachusetts
Email: Karl.Bellve at umassmed.edu
Phone: (508) 856-6514
Fax:   (508) 856-1840
PGP Public key: finger kdb at molmed.umassmed.edu




