[Beowulf] how fast can NFS run?

Joe Landman landman at scalableinformatics.com
Tue Jan 31 20:56:40 PST 2006


Hi Bruce

Bruce Allen wrote:
> I'd like to know the fastest that anyone has seen an NFS server run, 
> over either a 10Gb/s ethernet link or a handful of link aggregated 
> (channel-bonded) Gb/s ethernet lines.

If you allow us to go into the world of NFS-alike things, the Panasas 
file system and server hit about 2 GB/s in some testing we did more 
than a year ago.

We ran the same problem with NFS on the same hardware (different code 
paths/file system namespace) and it suffered along at about 300 MB/s.

> 
> This would be with a small number of clients making large file 
> sequential reads from the same NFS host/server.  Please assume that the 
> NFS server has 'infinitely fast' disks.

This was ~32 compute nodes talking over a gigabit switch of some sort 
(Nortel I think).

> I am told by one vendor that "NFS can't run faster than 100MB/sec".  I 

Hmmmm....

Maybe theirs can't ...

...  or they are trying to sell you something ... :)

> don't understand or believe this.  If the server's local disks can 
> read/write at 300MB/s and the networking can run substantially faster 
> than 100 MB/s, I don't see any constraint to faster operation.  But 
> perhaps someone on this list can provide real-world data (or say why it 
> can't work).

.... ok, there are a number of different issues going on here:

a) the 300 MB/s (SATA II, right?) is the maximum theoretical interface 
speed.  You will only get close to it in pure buffer-to-memory 
transactions in specialized cases.  Normally you will see 50-70 MB/s 
from these disks for large-block sequential reads.  SATA also does a 
fair bit of interrupting... you need a *good* SATA controller, or you 
will see your interrupt rate go up 10x under heavy disk load.  Software 
RAID will increase this a bit as well.
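As a rough sanity check on point a), here is a sketch of the spindle count needed to actually sustain the interface-speed number (the 50-70 MB/s per-disk figure is from above; the striping efficiency is an assumption I made up for illustration):

```python
import math

PER_DISK = 60.0   # MB/s, mid-range of the 50-70 MB/s sequential figure
TARGET = 300.0    # MB/s, the SATA II interface speed being quoted
RAID_EFF = 0.85   # assumed software-RAID striping efficiency (a guess)

disks_needed = math.ceil(TARGET / (PER_DISK * RAID_EFF))
print(disks_needed)  # 6 spindles to sustain 300 MB/s of real reads
```

Point being: one disk on a 300 MB/s interface does not give you 300 MB/s.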

b) If this is gigabit, you get about 110 MB/s max in best-case 
scenarios, with the wind at your packets, along with a nice 
gravitational potential, and a good switch to direct packets by.  If 
this is IB, you should be able to see quite a bit higher, though your 
PCI is going to limit you.  PCI-e is better (and HTX is *awesome*).
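For reference, the gigabit ceiling falls out of the frame overheads; a quick sketch (assuming standard 40-byte TCP/IP headers with no options, and counting preamble, Ethernet header, FCS, and inter-frame gap on the wire):

```python
# Theoretical TCP goodput over gigabit Ethernet at a given MTU.
# Per-frame wire overhead: preamble+SFD (8) + Ethernet header (14)
# + FCS (4) + inter-frame gap (12) = 38 bytes; TCP/IP headers eat
# another 40 bytes out of the MTU (assuming no TCP options).
LINE_RATE = 125e6  # 1 Gb/s expressed in bytes per second

def goodput_mb_s(mtu):
    payload = mtu - 40      # TCP payload bytes carried per frame
    wire = mtu + 38         # bytes actually occupying the wire
    return LINE_RATE * payload / wire / 1e6

print(goodput_mb_s(1500))   # ~118.7 MB/s with standard frames
print(goodput_mb_s(9000))   # ~123.9 MB/s with jumbo frames
```

The ~110 MB/s you see in practice is consistent with this once ACK traffic, interrupt handling, and stack overhead take their cut.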

> Note: I am free to use modern versions of the NFS protocol, jumbo 
> frames, large rsize/wsize, etc.

We had some issues about a year ago (not revisited recently) with RHEL3, 
jumbo frames, and Broadcom gigabit adapters (the tg3 driver was flaky; 
bcm5700 was much more stable and faster).  We reported it to Red Hat, 
whose response at the time was basically "go away".  It wasn't an issue 
on the same hardware using other distros.

With NFS, you are moving through a protocol stack (NFS) as well as a 
transport stack (TCP/IP).  This is not cheap.  However, there can be a 
number of reasons why NFS appears slow for you or your vendor.

FWIW, we have customers with units we have built out that happily 
sustain 200-400 MB/s over NFS without complaining, over gigabit 
(multiple simultaneous clients hammering on the server).  There are 
multiple problems to overcome to get this working correctly and 
efficiently.

<speculation>

 From what I can see on a 4-way system, I think it could support at 
maximum about 2 GB/s of disk IO (DMA access to RAM) per CPU connected 
to an IO channel (most 4-ways have a single CPU connected to their IO 
channels).  The protocol is not cheap, and the processing overhead could 
easily pare this down to 600-900 MB/s over a fast enough network fabric.
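Put another way, that guess amounts to protocol processing eating roughly 55-70% of the raw DMA bandwidth (the 2 GB/s figure is the estimate above; the overhead fractions are just back-solved from the 600-900 MB/s range):

```python
RAW_DMA = 2000.0  # MB/s per IO-connected CPU (speculative figure above)

# Overhead fractions back-solved from the 600-900 MB/s guess.
for overhead in (0.55, 0.70):
    print(round(RAW_DMA * (1 - overhead)))  # 900 then 600 MB/s
```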

With some tweaking and tuning, you might be able to get this going a 
little faster.  You would need to speak to the IB folks, or the 10 GbE 
folks, to see what they are really seeing.  1 GB/s per adapter (10 GbE) 
is doable over PCIe/HTX (if there were HTX cards for it).  If they have 
RDMA and TCP offload capability, you will likely get a win and some 
better performance.

</speculation>



-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
