[Beowulf] NFS share - IO rate

Henning Fehrmann henning.fehrmann at aei.mpg.de
Wed Apr 21 06:16:04 PDT 2010

Hi Bogdan,

On Wed, Apr 21, 2010 at 11:07:26AM +0200, Bogdan Costescu wrote:
> On Tue, Apr 20, 2010 at 9:36 PM, Henning Fehrmann
> <henning.fehrmann at aei.mpg.de> wrote:
> > Client A says I got the IO-rate Ra which is twice as big as the IO-rate of B:
> > Ra = 2 Rb.  The test on B took twice as long as on A.
> I look at this differently: the overall rate that the server has dealt
> with is given by the total amount of data transferred in the time
> taken by the slowest node. So if
> Ra=Da/Ta and Rb=Db/Tb then I consider Rt=(Da+Db)/max(Ta, Tb)

Yes, the rate of the slowest node times the number of nodes would give a
lower bound for the aggregate IO rate.
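
To make the two views concrete, here is a minimal Python sketch. The
numbers are made up for illustration (two clients, 1000 MB each, finishing
after 10 s and 20 s), and the function names are hypothetical. It assumes
that all clients start at the same time. With equal data per client, the
"slowest rate times number of nodes" figure coincides with Rt; summing the
per-client rates overstates what the server delivered.

```python
def aggregate_rate(transfers):
    """Rt = (sum of data) / (time of the slowest client).

    transfers: list of (data_moved, seconds) per client, assuming all
    clients started simultaneously.
    """
    total_data = sum(d for d, _ in transfers)
    wallclock = max(t for _, t in transfers)
    return total_data / wallclock


def slowest_times_n(transfers):
    """Rate of the slowest client multiplied by the number of clients."""
    slowest = min(d / t for d, t in transfers)
    return slowest * len(transfers)


# Hypothetical example: (MB moved, seconds) for clients A and B.
clients = [(1000.0, 10.0), (1000.0, 20.0)]

per_client = [d / t for d, t in clients]
print("per-client rates:", per_client)            # Ra = 2 * Rb
print("sum of client rates:", sum(per_client))    # overstates the server
print("aggregate Rt:", aggregate_rate(clients))
print("slowest * n:", slowest_times_n(clients))
```

If the clients move different amounts of data, the two figures no longer
have to agree, so the comparison is only meaningful for identical test
runs on every node.
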
> It's a similar view to the one I have about a parallel program: the
> real time (wallclock) of giving me the solution is what matters, not
> whatever built-in counters report. And this real time is the time
> taken by the slowest node (=the one which finished last, I'm not
> referring to the CPU speed...)
> > But one can also interpret the result in a different way.
> > Client A was doing its IO test and Client B got no bandwidth left at all.
> > Only after A finished the test, B has been served. This results in a twice as small
> > average rate on B.
> This shows a different point of view: you mention the average rate on
> B, I talk about what the server sees. So what are you actually
> interested in ? Do you have some rate specified by the manufacturer
> for the server that you want to compare with ? Or do you have some
> requirement of rate per node ?

The server runs Solaris 10 with SAM/QFS, and there are tools to measure
the IO rate. Currently, I can't say how reliable these tools are. Some of
them measure the IO rate on the individual discs, which makes no sense for
a RAID set. Doing these tests on the clients might give a better picture
of the usability of the cache system. Measuring the performance on the
server also wouldn't take into account the buffering done by the VFS or
NFS on the client side. Additionally, the IO rate under random seeks is
more important than the streaming rate.

In the bidding process we specified the read and write rates for random
seeks on the server, as induced and seen by many clients in parallel.
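
A client-side random-read measurement of this kind could be sketched
roughly as follows. This is only an illustration, not the tool used for
the tests discussed here: the function name, block size, and read count
are assumptions, and the file path has to be supplied on the command line.
The sketch reads fixed-size blocks at random block-aligned offsets and
reports bytes read and elapsed time.

```python
import os
import random
import sys
import time


def random_read_rate(path, block_size=4096, n_reads=1000, seed=0):
    """Read n_reads blocks at random aligned offsets in an existing file.

    Returns (total_bytes_read, elapsed_seconds).
    """
    rng = random.Random(seed)
    n_blocks = os.path.getsize(path) // block_size
    if n_blocks == 0:
        raise ValueError("file is smaller than one block")

    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer; the kernel page cache
    # and the NFS client cache are still in the path.
    with open(path, "rb", buffering=0) as f:
        for _ in range(n_reads):
            f.seek(rng.randrange(n_blocks) * block_size)
            total += len(f.read(block_size))
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    nbytes, secs = random_read_rate(sys.argv[1])
    print(f"{nbytes} bytes in {secs:.3f} s")
```

Because the client caches are still involved, the test file would need to
be much larger than the client's memory for the result to reflect the
server. Running the same script on many clients at once and combining the
results with the slowest client's time gives the parallel figure.
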

Thank you and cheers,
