[Beowulf] Building new cluster - estimate

Matt Lawrence matt at technoronin.com
Mon Aug 4 13:54:19 PDT 2008

On Mon, 4 Aug 2008, Joe Landman wrote:

> This mirrors our experience, though RHEL stability under intense loads is 
> questionable IMO (talking about the kernel BTW).  We find that the missing 
> drivers, the omitted drivers, the backported drivers along with some odd and 
> often useless "features" (4k stacks anyone?) render the RHEL default kernels 
> (and by definition the CentOS kernels) less useful for HPC and storage tasks 
> than what we build.  Our current standard is a kernel which is rock 
> solid under load.  Working on a 2.6.26 based version now (even though I am on 
> vacation/holiday, I just updated it to address an observed 
> crashing issue with the RDMA server)

Since I plan to continue running CentOS, it sounds like building a much 
later kernel rpm is the way I want to approach the problem.  Will going to 
a much later kernel break any of the utilities?  Are there other problems 
I can expect to see?

What do you recommend for the kernel config?
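
For comparing a candidate config against the stock one, something like the
sketch below could flag options such as 4k stacks.  It assumes the usual
.config line formats ("CONFIG_X=value" and "# CONFIG_X is not set").  The
function name and the sample text are made up for illustration and are not
taken from any real CentOS or custom kernel config.

```python
import re


def parse_kernel_config(text):
    """Map each CONFIG_ option to its value; unset options map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Enabled options look like: CONFIG_FOO=y, CONFIG_FOO=m, CONFIG_FOO=123
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        # Disabled options are written as a comment: # CONFIG_FOO is not set
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            options[m.group(1)] = None
    return options


# Hypothetical sample, only to exercise the parser.
SAMPLE = """\
CONFIG_4KSTACKS=y
# CONFIG_XFS_FS is not set
CONFIG_EXT3_FS=m
"""

if __name__ == "__main__":
    for name, value in parse_kernel_config(SAMPLE).items():
        print(name, value if value is not None else "not set")
```

On a real system the text would come from the config file shipped with the
kernel package (typically under /boot) rather than from a sample string.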

> Combine this with the small upper limit of ext3 partition sizes, the file 
> size limits in ext3, the serialization in the journaling code (ext4 is 
> extents based to help deal with this), ext3 just doesn't make much sense in a 
> storage/HPC system (apart from possibly boot/root file system where 
> performance is less critical).  Yeah I have seen studies from folks who had 
> done 1E6 removes, file creates, and other things who claim xfs is slower than 
> ext3.  Yeah, those are bad benchmarks in that they really don't touch on real 
> end user use cases for the most part (apart from possibly large-scale mail 
> servers and other things like that).
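
For reference, the ext3 size limits mentioned above fall out of its on-disk
format.  A rough sketch of the arithmetic follows, assuming 32-bit block
numbers, 4-byte block pointers, 12 direct pointers plus single, double, and
triple indirect blocks, and a 32-bit per-file count of 512-byte sectors.
The function names are mine, and these are theoretical ceilings; what a
given distribution actually supports may be lower.

```python
def ext3_max_fs_bytes(block_size):
    """Theoretical filesystem limit: 2**32 addressable blocks."""
    return 2**32 * block_size


def ext3_max_file_bytes(block_size):
    """Theoretical file limit: indirect-block tree, capped by sector count."""
    ptrs = block_size // 4  # 4-byte block pointers per indirect block
    tree_blocks = 12 + ptrs + ptrs**2 + ptrs**3
    sector_cap = 2**32 * 512  # 32-bit count of 512-byte sectors
    return min(tree_blocks * block_size, sector_cap)


if __name__ == "__main__":
    TIB = 2**40
    for bs in (1024, 2048, 4096):
        print(bs,
              ext3_max_fs_bytes(bs) / TIB,
              ext3_max_file_bytes(bs) / TIB)
```

With 4 KiB blocks this works out to roughly 16 TiB per filesystem and 2 TiB
per file.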

I have never had any problems with ext3.  I had dinner with a friend who 
is an expert Linux sysadmin who was warning me to stay away from xfs.  He 
cited lots of fragmentation problems that routinely locked up his systems. 
I am willing to be convinced otherwise, but he is a very sharp fellow.

-- Matt
It's not what I know that counts.
It's what I can remember in time to use.
