[Beowulf] GPFS on Linux (x86)

Thu Sep 14 09:18:54 PDT 2006

> We are currently running GigE as our interconnect.

did you mention the kind of compute/client load you've got?

> Basically we are currently running two NFS servers out to our web
> servers.

uh, that sounds fine - web traffic tends to be quite read-cache
friendly, which NFS does very nicely.

> We also are running three MySQL servers. The MySQL instances
> are segmented right now, but we are about to start an eval of
> Continuent's M//Cluster software.

have you measured the nature of your NFS and SQL loads?
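
one cheap way to start measuring is the per-device counters the kernel already keeps. the following is only a sketch, assuming the 14-field whole-disk layout of /proc/diskstats on 2.6 kernels (sector counts in 512-byte units); the function name, the device name "sda" and the sample line are made up for illustration, not taken from your machines.

```python
# Sketch: summarize the read/write mix from /proc/diskstats-style text.
# Assumes the 14-field whole-disk layout; sector counts are 512-byte units.

SECTOR_BYTES = 512


def summarize_diskstats(text, device):
    """Return read fraction and mean request sizes (KiB) for one device."""
    for line in text.splitlines():
        fields = line.split()
        # Skip short lines (e.g. old-style partition entries) and other devices.
        if len(fields) < 14 or fields[2] != device:
            continue
        reads, sectors_read = int(fields[3]), int(fields[5])
        writes, sectors_written = int(fields[7]), int(fields[9])
        total = reads + writes
        return {
            "read_fraction": reads / total if total else 0.0,
            "avg_read_kib": (sectors_read * SECTOR_BYTES / 1024 / reads
                             if reads else 0.0),
            "avg_write_kib": (sectors_written * SECTOR_BYTES / 1024 / writes
                              if writes else 0.0),
        }
    raise KeyError(device)


# Hypothetical sample line (invented numbers, not measured data).
SAMPLE = "   8       0 sda 3000 100 48000 9000 1000 50 64000 20000 0 15000 29000"

if __name__ == "__main__":
    print(summarize_diskstats(SAMPLE, "sda"))
```

the counters are cumulative since boot, so for a picture of the current load you'd take two snapshots some seconds apart and feed the differences through the same arithmetic.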

> As stated, our FS infrastructure leaves much to be desired.  The
> current setup involving NFS servers (Dell PE 2850 with 1TB of local
> storage on 10K SCSI disks) has not performed well.  We are constantly IO
> waiting.

but _why_?  heavy write load without async NFS (and writeback at the 
block level)?  with multiple local 10K scsi disks, you really shouldn't
be seek limited, especially if requests are coming over just gigabit.

> Another interesting thing is, each MySQL server is using an iSCSI block
> device from SATA II NAS servers that we built using generic Supermicro
> boards and Areca controllers.  Each of these boxes has approx 2.1TB of
> usable disk, and the performance has been surprisingly good.  The Areca
> 1160 controllers with 1GB cache are handling the load well, especially
> compared to our FS infrastructure of localized disks (I would have
> thought the opposite would be true),

to me that indicates your disk-local servers are misconfigured.
(which reminds me - dell has shipped some _astoundingly_ bad raid systems
marketed as high-end...)

> as the MySQL disk IO pattern
> would be mostly small random IO, and the FS is mostly read (serving up
> web pages).

but web pages will normally be nicely read-cached on the web frontends...

> We have made pretty much every last ounce of optimization we can on
> the NFS side (TCP, packet sizes, hugemem kernels, tried David Howells'
> fscache on web client side) but none has been the silver bullet we've
> been looking for, which led us down the parallel fs path.

how much memory do the web servers have?  if the bottleneck IO really
is mostly-read pages, then local dram will help a lot.
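
a rough way to answer that on a web frontend is to compare the size of the hot content against what /proc/meminfo reports. this is just a sketch: the function names, the working-set figure and the sample numbers are hypothetical, and "fits in RAM" is a crude upper bound since the kernel and the web server need memory too.

```python
# Sketch: compare a (hypothetical) working-set size with /proc/meminfo figures.

def parse_meminfo(text):
    """Map /proc/meminfo keys to their numeric values (kB for sized entries)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def cache_headroom(meminfo_text, working_set_kb):
    """Report how much RAM is page cache and whether the working set could fit."""
    info = parse_meminfo(meminfo_text)
    return {
        "cached_fraction_of_ram": info["Cached"] / info["MemTotal"],
        # Crude check: ignores memory needed by the kernel and applications.
        "working_set_fits_in_ram": working_set_kb <= info["MemTotal"],
    }


# Invented sample values, not taken from any real server.
SAMPLE_MEMINFO = """MemTotal:      4096000 kB
MemFree:        512000 kB
Buffers:        102400 kB
Cached:        2048000 kB
"""

if __name__ == "__main__":
    print(cache_headroom(SAMPLE_MEMINFO, working_set_kb=3_000_000))
```

on a live box you'd read the text from /proc/meminfo and get the working-set size from something like the total size of the files actually being served.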

> on Cluster FS that we seek to employ.. Yet in an effort to scale to
> the sky, we are going to try to do this correctly, rather than
> continually being reactive.

not to insult, but I find that the main problem is usually an insufficient
understanding of the workload, not a lapse in proactivity...