Cluster programming...
Jakob Oestergaard
jakob at unthought.net
Thu Jan 23 23:57:39 PST 2003
On Wed, Jan 22, 2003 at 10:52:31AM -0500, Karl Bellve wrote:
>
> I am running into a little problem about multiple writes to a single
> file via NFS.
Ok, first of all that sounds like a bad idea to begin with.
Why not have each node write its own file, and run a "cat node.* >
bigfile" afterwards?
Quadratisch, praktisch, gut ;)
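Here is a minimal sketch of that idea in C, in case it helps. The helper names (write_node_file, concat_node_files) and the zero-padded "prefix.NNNN" naming are my own assumptions, not anything from your application. The padding is only there so that rank order and lexical order agree, which matters for a shell glob like node.* as well (otherwise node.10 sorts before node.2).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: node `rank` writes its result to its own file,
 * "<prefix>.<rank>", so no two nodes ever write to the same file. */
int write_node_file(const char *prefix, int rank, const void *buf, size_t len)
{
    char path[4096];
    FILE *f;
    size_t n;

    assert(prefix != NULL && buf != NULL && rank >= 0);
    snprintf(path, sizeof path, "%s.%04d", prefix, rank);

    f = fopen(path, "wb");
    if (!f)
        return -1;
    n = fwrite(buf, 1, len, f);
    if (fclose(f) != 0 || n != len)
        return -1;
    return 0;
}

/* Hypothetical helper: the C equivalent of "cat prefix.* > bigfile",
 * run once on a single machine after all nodes have finished. */
int concat_node_files(const char *prefix, int nnodes, const char *bigfile)
{
    static char chunk[65536];
    char path[4096];
    FILE *out = fopen(bigfile, "wb");
    int rank;

    if (!out)
        return -1;
    for (rank = 0; rank < nnodes; rank++) {
        FILE *in;
        size_t n;
        int bad;

        snprintf(path, sizeof path, "%s.%04d", prefix, rank);
        in = fopen(path, "rb");
        if (!in) {
            fclose(out);
            return -1;
        }
        /* Copy this node's file to the end of the combined file. */
        while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
            if (fwrite(chunk, 1, n, out) != n) {
                fclose(in);
                fclose(out);
                return -1;
            }
        }
        bad = ferror(in);
        fclose(in);
        if (bad) {
            fclose(out);
            return -1;
        }
    }
    return fclose(out) == 0 ? 0 : -1;
}
```

Each node would call write_node_file() with its own rank, and whoever collects the results calls concat_node_files() once at the end. No file is ever shared between writers, so there is nothing to lock.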
> An application is spawned on a number of nodes. When they are done, they
> all write to a specific, but non-overlapping area of the NFS mounted
> file.
If the parts are non-overlapping, I assume that the offset and data
length of each node's write is fixed - correct?
> I use fcntl (fd, F_SETLKW, &lck) to lock the file, or wait until it
> can lock the file for writing. fcntl() is capable of locking across NFS.
If I was correct above - why do you need to lock the file?
An lseek() + write() should do the trick as I see it - but maybe there's
something I don't see :)
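To make that concrete, here is a rough sketch of the lock-free variant. It assumes every node writes a fixed-size record at offset rank * record_len. The helper name write_slot and that layout are my guesses at your setup, not something you stated. Whether it behaves on your NFS mount is exactly the thing to test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* makes fsync() visible in strict modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write this node's fixed-size record into its own
 * region of the shared file, without taking any lock.
 * Assumed layout: node `rank` owns bytes [rank*record_len, (rank+1)*record_len). */
int write_slot(const char *path, int rank, const void *buf, size_t record_len)
{
    const char *p = buf;
    size_t left = record_len;
    off_t off;
    int fd;

    assert(path != NULL && buf != NULL && rank >= 0);

    /* No O_TRUNC and no O_APPEND: other nodes' regions must survive,
     * and the write must land at the offset we seek to. */
    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    off = (off_t)rank * (off_t)record_len;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* write() may be short or interrupted, so loop until all is written. */
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Push the data to the server before reporting success. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Checking the return values of write(), fsync() and close() is the important part here: over NFS a failed write may only be reported by fsync() or close(), not by write() itself.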
> However, some nodes fail to write their result to the file. It isn't the
> same nodes every time. I am not seeing any write errors. I tend to think
> it is a NFS caching issue. All writes get flushed before releasing the
> lock via fsync() and close().
>
> The fileserver is a Redhat 8.0 system. I upgraded to the latest kernel
> offered for RH8.0. That didn't fix the problem. I compiled a new kernel
> (2.4.20) and that didn't fix the problem. The nodes are Alphas running
> RH6.2.
>
> I am thinking about alternate means of locking but fcntl() should do the
> trick.
I completely agree with you that locking should work - and you have
already received many good suggestions from fellow 'wolfers on how to
test/check/improve the locking on your systems.
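For comparison, the locked sequence you describe (lock, write, fsync, unlock) would look roughly like the sketch below. This is only my reading of your description, not your actual code. The helper name locked_write_slot is made up, and I assume you lock just the byte range being written rather than the whole file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* makes fsync() visible in strict modes */
#endif
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: take a write lock on [off, off+len), write the
 * data there, flush it, then release the lock. `fd` must be open for
 * writing. Returns 0 on success, -1 on any failure. */
int locked_write_slot(int fd, off_t off, const void *buf, size_t len)
{
    struct flock lck;
    const char *p = buf;
    size_t left = len;
    int rc = 0;

    memset(&lck, 0, sizeof lck);
    lck.l_type = F_WRLCK;    /* exclusive (write) lock */
    lck.l_whence = SEEK_SET; /* l_start is an absolute offset */
    lck.l_start = off;
    lck.l_len = (off_t)len;  /* lock only this node's region */

    /* F_SETLKW blocks until the lock is granted; retry if a signal
     * interrupts the wait. */
    while (fcntl(fd, F_SETLKW, &lck) == -1) {
        if (errno != EINTR)
            return -1;
    }

    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        rc = -1;
    while (rc == 0 && left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Flush while still holding the lock, as in your description. */
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;

    /* Always release the lock, even if the write failed. */
    lck.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLK, &lck) == -1)
        rc = -1;
    return rc;
}
```

If every return value in a sequence like this comes back clean and data still goes missing, that points away from your application code and towards the lock daemons or client-side caching.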
What I'm curious about is whether you need locking at all. While it should
of course work, avoiding it would sidestep the problem completely.
--
................................................................
: jakob at unthought.net : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............: