[Beowulf] Building new cluster - estimate
Bill Broadley
bill at cse.ucdavis.edu
Tue Jul 29 10:14:07 PDT 2008
Bogdan Costescu wrote:
> On Tue, 29 Jul 2008, Chris Samuel wrote:
>
>> 1) Use a mainline kernel, we've found benefit of that
>> over stock CentOS kernels.
>
> Care to comment on this statement ?
>
2.6.18 (RHEL-5.2) is now almost 2 years old. One improvement since then that
I use heavily is ECC scrubbing. I don't like to run RAID arrays without it,
because silent errors can accumulate otherwise. Staying on the old kernel has
also created an ugly nest of backports inside and outside of Red Hat. So
things like sky2 gigE adapters are ugly to support (and don't have a driver
disk), and are especially hard to fix when you have to modify the installer
(CD or PXE) to work. I've seen similar problems with Intel e1000s (which are
always changing), InfiniPath, Areca cards, etc. There have also been tweaks
for NUMA, quad core, and related hardware. I'm guessing that's why, er, one of
the largest new clusters went with Fedora (TAC?).

In general I'd say that the new kernels do much better on modern hardware than
the ugly situation of downloading a random RPM, or waiting for official
support. Seems like quite a few companies (ATI, 3ware, Areca, Intel, AMD, and
many others I'm sure) are trying hard to improve the mainline kernel drivers.
I understand why RHEL doesn't change the kernel (stability, testing, etc.),
but I'm not sure it's the best fit for HPC-type applications, especially with
the pace of hardware changes these days.
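
For anyone who wants to try the scrubbing mentioned above, here is a minimal
sketch. It assumes the scrubbing in question is the consistency check of Linux
software RAID (md), which is driven through the sync_action and mismatch_cnt
files in sysfs. Hardware RAID cards have their own tools and are not covered.
The function names and the "md0" array name are made up for illustration.

```python
from pathlib import Path


def sync_action_path(array: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file that controls check/repair for an md array."""
    return Path(sysfs_root) / "block" / array / "md" / "sync_action"


def request_check(array: str, sysfs_root: str = "/sys") -> None:
    """Ask the md driver to start a read-only consistency check.

    On a real system this needs root, and the array (e.g. "md0") must exist.
    """
    sync_action_path(array, sysfs_root).write_text("check\n")


def mismatch_count(array: str, sysfs_root: str = "/sys") -> int:
    """Read the mismatch counter that the md driver updates during a check."""
    path = Path(sysfs_root) / "block" / array / "md" / "mismatch_cnt"
    return int(path.read_text().strip())
```

Something like request_check("md0") from a weekly cron job, followed later by
mismatch_count("md0"), would be one way to use it. The sysfs_root parameter is
only there so the functions can be pointed at a scratch directory for testing.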