[Beowulf] Help for terrible NFS write performance

Joe Landman landman at scalableinformatics.com
Fri Aug 21 11:09:18 PDT 2009

Cunningham, Dave wrote:
> Are you suggesting that running nosync in an hpc environment might be
> a good idea ?  We had that discussion with our integrator and were
> assured that running nosync was the path to damnation and would
> probably grow hair on our palms.

... that may be, but the flip side is that your write performance will
be terrible with sync.  It's a question of which risk is larger.

Sync forces the write to block, not returning to the user until the data
is committed to disk.  Strictly speaking even that is not true: the data
is committed to the cache on the drive unless you have turned off all
write caching.  Even then, drivers don't necessarily wait and verify that
the data got to disk.  They verify that the data was flushed to disk, but
not that the data on the disk is what you thought it was.
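
To get a feel for what waiting on the flush costs, here is a minimal
sketch for a local scratch file, assuming Python 3.  The function name,
block count and block size are hypothetical choices for illustration,
and a per-write fsync is only loosely analogous to what a sync export
asks of the server.  Note that os.fsync() has the same caveat as above:
it reports that the flush was done, not that the platters hold what you
expect.

```python
import os
import tempfile
import time


def timed_write(path, blocks=64, block_size=64 * 1024, sync_each=False):
    """Write `blocks` blocks of `block_size` bytes; return elapsed seconds.

    With sync_each=True every write is followed by flush() + os.fsync(),
    so the call does not move on until the kernel reports the flush done.
    """
    payload = b"\0" * block_size
    start = time.perf_counter()
    with open(path, "wb") as handle:
        for _ in range(blocks):
            handle.write(payload)
            if sync_each:
                handle.flush()             # push Python's buffer to the kernel
                os.fsync(handle.fileno())  # ask the kernel to flush to the device
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "sync_test.dat")
        buffered = timed_write(target, sync_each=False)
        synced = timed_write(target, sync_each=True)
        print(f"buffered: {buffered:.4f}s  fsync per write: {synced:.4f}s")
```

Run it against a file on the array in question rather than on /tmp if
you want numbers that mean anything for your setup.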

This said, we haven't seen this problem (corrupted fs data) as an issue 
for a crashed NFS server in a while (years).  YMMV.

> What are your thoughts on the tradeoff ?

Performance or "safety".  Sync doesn't guarantee that your data is on
disk; it merely guarantees that the relevant semantics have been
honored.

We have found that with a good journaling file system (not ext3), this
is not usually an issue.

Then again, you are using md RAID, which means you don't have a nice
battery-backed RAID behind you to cache IO ops, such as writes that
didn't finish making it to disk.  So unless you have turned off write
caching on the drives themselves, the sync is superfluous.
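
For reference, the export option is spelled async in exports(5); there
is no literal nosync keyword.  A minimal sketch of the change under
discussion, assuming Python 3 and using a hypothetical helper name
(use_async) that swaps the option inside the parenthesised option list
of a single exports line:

```python
import re


def use_async(exports_line):
    """Return the exports line with the `sync` option replaced by `async`.

    Only an option that is exactly `sync` is changed; everything else in
    the line, including spacing and other options, is left as it was.
    """
    def swap(match):
        options = match.group(1).split(",")
        options = ["async" if opt == "sync" else opt for opt in options]
        return "(" + ",".join(options) + ")"

    return re.sub(r"\(([^)]*)\)", swap, exports_line)


if __name__ == "__main__":
    line = "/export/working  *.cora.nwra.com(rw,sync,nohide)"
    print(use_async(line))
```

After editing /etc/exports by whatever means, exportfs -ra re-exports
the entries so the new options take effect.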

Bug me offline if you want to talk more about this.


> Dave Cunningham
> -----Original Message-----
> From: beowulf-bounces at beowulf.org
> [mailto:beowulf-bounces at beowulf.org] On Behalf Of Joe Landman
> Sent: Friday, August 21, 2009 10:38 AM
> To: Orion Poplawski
> Cc: Beowulf List
> Subject: Re: [Beowulf] Help for terrible NFS write performance
> Orion Poplawski wrote:
>> I'm trying to improve the terrible NFS (write in particular)
>> performance I'm seeing.  Pure network performance does not appear
>> to be an issue as I can hit 120MB/s reading which should be about
>> the limit for gigE. Perhaps the local disk performance is not what
>> it should be.  Any help would be greatly appreciated.  Using
>> bonnie++ for benchmarks.
>> Server:
>> Dual proc dual core opteron 2GHz
>> 8GB RAM
>> CentOS 4.7
>> kernel 2.6.9-78.0.22.plus.c4smp
>> 3 8-port Marvell MV88SX6081 SATAII controllers
>> sata_mv 3.6.2 driver
>> Ethernet controller: nVidia Corporation MCP55 Ethernet (rev a3)
>> MTU 8982
>> Arrays are linux md arrays of 6 disks with 2 on each controller.
>> 64k chunks.  ext3 filesystem.
> If I had to bet, ext3 would have much to do with this ... though,
> honestly, md RAID write performance over NFS is nothing to write home
> about.  We can get ~350MB/s on our DeltaV's, but this takes lots of
> work.
>> "working" - raid0 ST31000340AS 1TB drives
>> local perf: 224-240MB/s write, 135MB/s rewrite, 390-400MB/s read
>> "cora6" - raid5 ST31500341AS 1.5TB drives
>> local perf: 84MB/s write, 42MB/s rewrite, 161-166MB/s read
>> /etc/exports:
>> /export          *.cora.nwra.com(rw,sync,fsid=0)
>> /export/cora6    *.cora.nwra.com(rw,sync,nohide)
>> /export/working  *.cora.nwra.com(rw,sync,nohide)
> Ok.  There it is...  Sync.
> Don't need to see anything else.
> That is it.

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
