[Beowulf] Network Filesystems performance

Glen Dosey doseyg at r-networks.net
Fri Aug 24 11:32:23 PDT 2007


On Fri, 2007-08-24 at 19:23 +0200, Bogdan Costescu wrote:
> On Thu, 23 Aug 2007, Glen Dosey wrote:
> 
> > What really gets me is that while my NFS reads are around ~50MB/s , 
> > the writes are basically at wire speed, slowing down to and holding 
> > at about ~90MB/s when we exceed the 4GB file size. That would seem 
> > to indicate to me the server has no problem dealing with a saturated 
> > NIC and reasonably high I/O on the QLA2342 at the same time.
> 
> I beg to disagree with your conclusion. At least for networking the Tx 
> and Rx paths are quite different in what operations they do and 
> especially how many interrupts they need (due to possible interrupt 
> mitigation, in either driver or hardware); I think that for the SCSI 
> stack used to talk to a FC controller the situation is quite similar. 
> So, I think that a test that would better approximate the NFS read case 
> would be:
> - run the local reading test on the server (using dd)
> - run at the same time a network speed test (using f.e. ttcp) which 
> sends data from the server to a client

I agree with you in theory, but in practice it seems to have no effect.
dd can read from the local disk at 160MB/s while, at the same time, iperf
is streaming TCP from the NFS server to the NFS client at 960Mb/s.
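
Roughly, the combined test amounts to something like this (the hostname and
file path below are placeholders, not my actual setup):

   # on the NFS client: act as the iperf receiver
   iperf -s

   # on the NFS server: stream TCP to the client while simultaneously
   # reading the FC-backed file locally
   iperf -c nfs-client -t 120 &
   dd if=/path/to/large_testfile of=/dev/null bs=1M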


> 
> The slowing down that you mention above as happening at 4GB looks to 
> me like the server's memory used for caching fills up and the writing 
> to disk is then forced; so it's not the 4GB file-size that is changing 
> behaviour, but the caching that doesn't help anymore. 

Yes, that was exactly my point :) The NFS writes happen at wire speed
until the disk gets involved, at which point they settle at about 90MB/s.
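
For anyone wanting to reproduce the cache-fill behaviour locally, something
along these lines should show it (the mount point and sizes are placeholders;
drop_caches needs a 2.6.16 or later kernel):

   # flush dirty pages and drop the page cache on the server
   sync; echo 3 > /proc/sys/vm/drop_caches

   # buffered write: fast until the page cache fills, then drops to disk speed
   dd if=/dev/zero of=/mnt/fc/testfile bs=1M count=8192

   # O_DIRECT write: bypasses the cache and shows sustained disk speed
   dd if=/dev/zero of=/mnt/fc/testfile bs=1M count=8192 oflag=direct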


> 
> >>  50:      69688     963247    1227309     270953   IO-APIC-level  qla2xxx
> >>  58:      15112      96722      96347       7613   IO-APIC-level  qla2xxx
> >>  66:   47398161          0          0          0   IO-APIC-level  eth0
> 
> Seems like the interrupts for the NIC go to only one CPU, while the 
> ones for the FC controller are spread among CPUs. I think that memory 
> locality effects come into play here and reduce the speed. Did you 
> play by any chance with CPU affinity for interrupts ?

I have not changed any affinity settings. 
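
If it turns out to matter, the NIC interrupt can be pinned by hand through
/proc (IRQ 66 is taken from the /proc/interrupts output above; the CPU mask
value here is just an example, and irqbalance will undo it if it is running):

   # show which CPUs may currently service the eth0 interrupt
   cat /proc/irq/66/smp_affinity

   # restrict the eth0 interrupt to CPU1 (bitmask 0x2)
   echo 2 > /proc/irq/66/smp_affinity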

> 
> >> atop shows basically the same thing iostat does, which is that on the
> >> initial read the FC disk is about 85% utilized
> 
> This seems strange: for 40MB/s the disk is used at 85%, but then you 
> report getting 160MB/s ? It doesn't add up for me - do you have an 
> explanation ?

Per the iostat man page, %util is the percentage of CPU time during which
I/O requests were issued to the device (bandwidth utilization for the
device). Device saturation occurs when this value is close to 100%.

My assumption, having seen this before, is that the read requests coming
from NFS are smaller, so fewer of them can be merged, and a greater number
of requests ends up being sent to the disk for the same amount of data
transferred (see the per-request arithmetic after the second capture below).
At the same time, my guess is that the way iostat calculates %util makes
the curve slightly non-linear, so the last 5-10% of utilization accounts
for more than 10% of the performance as queue sizes and wait times grow
along with total data transfer.



Here are two iostat captures, each averaged over a 2-second period. The
numbers are representative of the entire test.
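
They were gathered with an extended-statistics run along the lines of:

   # extended per-device statistics, 2-second samples
   iostat -x 2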


This is what iostat shows when dd'ing from a file to /dev/null at 160MB/s.
Per your earlier suggestion, iperf was running at the same time, moving
960Mb/s of data off the server to keep the NIC busy.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.37    0.00   13.48   19.10    0.00   67.04

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     9.50  0.00 15.50     0.00   179.00    11.55     0.02    1.39   0.13   0.20
sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd            2225.00     0.00 376.00  0.00 326976.00     0.00   869.62     6.64   17.64   2.66  99.95
dm-0              0.00     0.00  0.00 21.50     0.00   172.00     8.00     0.03    1.26   0.09   0.20
dm-1              0.00     0.00  0.00  1.00     0.00     2.00     2.00     0.00    1.50   1.50   0.15
dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  0.00  2.50     0.00     5.00     2.00     0.00    0.80   0.20   0.05
dm-4              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00 2602.00  0.00 328000.00     0.00   126.06    48.77   18.79   0.38  99.95

This is what iostat shows when reading the same file from an NFS client at ~40MB/s.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.12   38.00    0.00   55.88

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd             970.35     0.00 914.07  0.00 118822.11     0.00   129.99     3.92    4.27   0.97  89.05
dm-0              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-4              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00 1885.43  0.00 119272.36     0.00    63.26    17.23    9.08   0.47  89.10
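
A quick bit of per-request arithmetic on those two captures (my own
back-of-the-envelope numbers, assuming 512-byte sectors) shows the merging
difference:

   local dd read:  326976 rsec/s / 376 r/s ~= 870 sectors ~= 435 KB per request
   NFS read:       118822 rsec/s / 914 r/s ~= 130 sectors ~=  65 KB per request

So the NFS read issues roughly 6-7 times as many requests to the disk per
unit of data, which is where I suspect the extra %util comes from.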









