[Beowulf] Slightly OT: storage performance

Joe Landman landman at scalableinformatics.com
Thu Nov 9 14:09:47 PST 2006

Hi folks:

   I am running some IOzone tests (along with bonnie++, spew, and our 
own RAFAP*), and I want to get a feel for what people consider "good" 
performance.  I know IOzone isn't really representative of workloads, 
and I personally abhor "benchmarks" which don't make at least an effort 
to reflect real workloads (which is an issue in and of itself, as there 
aren't any "standard" server workloads that I am aware of; everyone's 
will be different ...).

   My question is this:  apart from using huge file sizes to see raw 
disk performance, what do you consider good performance on the various 
tests, either in the huge file size regime, or in the cache interaction 
regime?  Basically, which tests are most meaningful to your workloads? 
Is raw disk performance really the most useful datum?  Or is it the 
corner cases that you are interested in?  Is the most important test 
case reading and changing one byte at random in a 1TB file, several 
hundred million times?  Or is it large block sequential IO?
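
   To make the random single-byte case concrete, here is a minimal 
sketch in C of what such a test might look like.  It is only an 
illustration under stated assumptions: the function name 
random_byte_updates, the xorshift64 offset generator, and the choice to 
invert each byte are all hypothetical and not taken from any of the 
tools mentioned above.  It assumes a POSIX system with pread/pwrite.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for pread/pwrite under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Small xorshift64 generator (hypothetical choice), used so that offsets
 * can cover files far larger than RAND_MAX bytes.  State must be nonzero. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Hypothetical helper: perform up to `count` read-modify-write cycles,
 * each on one byte at a pseudo-random offset in `path`.
 * Returns the number of completed updates, or -1 if the file cannot be
 * opened or is empty. */
long long random_byte_updates(const char *path, long long count, uint64_t seed)
{
    assert(seed != 0);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    long long done = 0;
    for (; done < count; done++) {
        /* Modulo introduces a tiny bias, which is ignored in this sketch. */
        off_t off = (off_t)(xorshift64(&seed) % (uint64_t)st.st_size);
        unsigned char b;

        if (pread(fd, &b, 1, off) != 1)   /* read one byte */
            break;
        b ^= 0xFF;                        /* change it */
        if (pwrite(fd, &b, 1, off) != 1)  /* write it back in place */
            break;
    }

    close(fd);
    return done;
}
```

   Timing a call such as random_byte_updates("bigfile", 1000000, 42) 
(file name and counts are placeholders) would give updates per second.  
Without O_DIRECT or a file much larger than RAM, a loop like this 
largely exercises the page cache rather than the disks.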

   Disclosure:  I am writing a white paper on something we are 
developing (bug me at SC), and I eschew completely meaningless numbers.

   If there is some sort of cutoff that people have between what they 
consider "eh" and "good", I would like to hear it.  You can email me 
offline if you want, and I will summarize later on.



* RAFAP == Read As Fast As Possible, a really simple C code that tries 
to hammer on the IO.
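
For readers who want a picture of what a "read as fast as possible" 
test involves, the following is a minimal sketch in C.  It is not the 
actual RAFAP source, which is not shown here; the function name 
read_as_fast_as_possible, its parameters, and the use of plain 
buffered read() calls are assumptions made for illustration.  It 
assumes a POSIX system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: read `path` sequentially from start to end in
 * `block_size`-byte chunks, discarding the data.
 * Returns the total number of bytes read, or -1 on error.
 * On success, the elapsed wall-clock time is stored in *seconds. */
long long read_as_fast_as_possible(const char *path, size_t block_size,
                                   double *seconds)
{
    assert(block_size > 0);

    char *buf = malloc(block_size);
    if (buf == NULL)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, block_size)) > 0)
        total += n;               /* keep reading until EOF or error */

    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (n < 0)
        return -1;

    *seconds = (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return total;
}
```

Dividing the returned byte count by *seconds gives a throughput 
figure, for example with a 1 MB block size for the large block 
sequential case.  Note that rereading a file smaller than RAM will 
mostly measure the page cache, which is the "cache interaction regime" 
rather than raw disk performance.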


