How bleeding edge are people with kernels (Was Re: [Beowulf] impressions of Super Micro IPMI management cards?)

stephen mulcahy smulcahy at
Wed Nov 21 09:05:52 PST 2007

Brian Dobbins wrote:
>   I had at one point a simple script that would allow me to select a 
> kernel type at job submit time, it would load that up, reboot the nodes 
> with that kernel, and then run my job.  Sometimes this was incredibly 
> useful, as I found a difference of roughly 20-25% performance on one 
> particular code running on the same hardware, one with an /old/ 2.4 
> series and libc, and another with a more modern kernel + libc.  Even 
> now, as we're looking at a larger system, I'll probably put (in a static 
> fashion) one of the interactive nodes with a kernel supporting PAPI, and 
> quite possibly will put most of the compute nodes on a kernel with some 
> modifications for performance.
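The select-a-kernel-at-submit mechanism Brian describes could be sketched
roughly like this for a pxelinux-booted diskless setup. To be clear, this is
my guess at the shape of such a script, not his actual one: the TFTP root,
the hex node name, the NFS root path and the ipmitool step are all
placeholder assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: point one node's PXE config at a chosen kernel
# before power-cycling it, so the next boot picks that kernel up.
TFTP_ROOT=${TFTP_ROOT:-./tftpboot}   # real setups use e.g. /srv/tftp
NODE_HEX=0A000010                    # node IP in hex, pxelinux-style
KERNEL=${1:-vmlinuz-2.6.18-5-amd64}  # kernel chosen at job submit time

mkdir -p "$TFTP_ROOT/pxelinux.cfg"
cat > "$TFTP_ROOT/pxelinux.cfg/$NODE_HEX" <<EOF
DEFAULT hpc
LABEL hpc
  KERNEL $KERNEL
  APPEND root=/dev/nfs nfsroot=10.0.0.1:/srv/nfsroot ip=dhcp ro
EOF
echo "node $NODE_HEX will boot $KERNEL on next boot"
# then power-cycle the node and submit the job, e.g.:
#   ipmitool -H node01-bmc -U admin chassis power cycle
```

Presumably something like this would run from the scheduler's prologue,
writing one pxelinux.cfg file per allocated node before the reboot.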

Thanks for your response. We're running a diskless environment as well 
(it's a pretty small cluster - 20 nodes running a customised Debian). 
Performance is certainly interesting to me -- but stability is starting 
to become so also. We've squeezed a good bit out on the performance 
front by tweaking various components in the system including the MPI 
libraries and so on. So much so that the scientists I'm running the 
cluster for are largely happy with the performance (I suspect there 
could be another 5-10% lurking in there, but getting it out would 
probably involve a lot of my time and a lot of cluster downtime for 
testing/profiling ... so it feels like we're in the sweet spot at the 
moment).

So we're happy with performance, and now we'd like to run our models for 
weeks on end without any user intervention. What we have seen as we 
start doing this are some stability problems that have not been 
consistently reproducible so far and have left no traces in the logs (I 
might send a separate mail about these just to generally pick people's 
brains) -- the key point here though is that I have no idea at the 
moment if these are kernel level problems or hardware level problems.

We're running Debian's stable kernel 2.6.18-5-amd64 (for the diskless 
nodes, we're using the 2.6.18-5-amd64 kernel source, recompiled after 
stripping out all unnecessary drivers). My concern about rolling to 
2.6.22 or something in between is that we might get some performance 
benefits but we might also get more intermittent weird stability issues 
(the kind that may even be peculiar to our own hardware/software 
environment). I was just wondering what other people's take is -- clearly 
a lot depends on your own risk aversion level, how much time you have 
for testing and supporting what you deploy and so on. Thanks to all that 
have responded so far.

>   In case anyone is interested, I'm planning on bugging the National 
> Labs + Cray guys a bit more soon, and if they can't release or document 
> what they change, I'll set up a wiki about kernel stripping / tuning for 
> HPC workloads, and maybe the community can put together a decent 
> 'how-to' until the big guys can chime in.  If/when I find the time, I'll 
> also try to get some information on how much this can impact performance 
> on some modern code suites, but it might take a few weeks at least 
> before I'm able to do so. 

I'm not sure how much of the stuff that's relevant to tuning really big 
clusters would percolate down to the likes of myself, but I would be 
interested in taking a look at it anyway.

>   Disclaimer to all of the above - I haven't done much system-level 
> stuff in a long while now, so your mileage may vary considerably.  :)

Oh, I understand that all suggestions on beowulf include the standard 
"But it depends" disclaimer :)



Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland.  +353.91.751262
Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway)
