[Beowulf] Definition of HPC

Max R. Dechantsreiter max at performancejones.com
Wed Apr 24 08:02:27 PDT 2013


> it's very simple: pagecache and VM balancing is a very important part of the
> kernel, and has received a lot of quite productive attention over the years.
> I question the assumption that "rebooting the pagecache" is a sensible way to
> deal with memory-tuning problems.  it seems very passive-aggressive to me:
> as if there is an assumption that the kernel isn't or can't Do The Right Thing
> for HPC.

Sure, it's important - WITHIN a given job.  Why should a
new job's performance depend on what ran before?  (And in
most cases, the impact is negative, because the cached
pages are not the ones needed by the new job.)

> for sites where a single job is rolled onto all nodes and runs for a long
> time, then is entirely removed, sure, it may make sense.  rebooting entirely
> might even work better.  I'm mainly concerned with clusters which run a
> wide mixture of jobs, probably with multiple jobs sharing a node at times.

I would advise any user never to share a node with other
jobs.

> who says determinism is a good thing?  I assume, for instance, you turn off
> your CPU caches to obtain determinism, right?  I'm not claiming that variance
> is good, but why do you assume that the normal functioning of the pagecache
> will cause it?

Try it and see.
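
For anyone who wants to try it, here is a minimal sketch, assuming
Linux and Python 3.  It times reads of one file with its cached pages
evicted ("cold") and with them still resident ("warm").  The function
names (evict_file, timed_read, compare) are made up for illustration;
the eviction uses posix_fadvise(POSIX_FADV_DONTNEED), which is only a
hint to the kernel and affects a single file, so this is a stand-in
for dropping the whole pagecache, not the same thing.

```python
import os
import time


def evict_file(path):
    """Hint the kernel to drop this file's cached pages.

    Returns True if the hint was issued, False if the platform or
    filesystem does not support it. The kernel may ignore the hint.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages cannot be dropped, so flush first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def timed_read(path, chunk=1 << 20):
    """Read the whole file unbuffered; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start


def compare(path, trials=5):
    """Return (cold_times, warm_times), one pair of reads per trial."""
    cold, warm = [], []
    for _ in range(trials):
        evict_file(path)                   # attempt a cold start
        cold.append(timed_read(path)[1])   # first read after eviction
        warm.append(timed_read(path)[1])   # second read, pages cached
    return cold, warm
```

Call compare() on a file large enough to matter, on the filesystem the
job actually uses, and look at the spread of each list as well as the
means.  The node-wide equivalent, as root, is "sync" followed by
writing 3 to /proc/sys/vm/drop_caches.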


