[Beowulf] Definition of HPC

Mark Hahn hahn at mcmaster.ca
Thu Apr 18 18:21:27 PDT 2013


> Only for benchmarking?  We have done this for years on our production
> clusters (and SGI provides a tool that does this and more to clean up
> nodes).  We have this in our epilogue so that we can clean out memory
> on our diskless nodes so there is nothing stale sitting around that can
> impact the next user's job.
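
for readers following along, a minimal sketch of the kind of epilogue
step being described, assuming a Linux node and root privileges (not the
poster's actual script; the value written is per Documentation/sysctl/vm.txt):

    import os

    def drop_page_cache():
        # flush dirty pages first so nothing is lost when caches are dropped
        os.sync()
        # "3" drops pagecache plus dentries/inodes; "1" would drop pagecache only
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")

    if __name__ == "__main__":
        drop_page_cache()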

understood, but how did you decide that was actually a good thing?

if two jobs with similar file reference patterns run back-to-back, for
instance, drop_caches will cause quite a bit of additional IO delay,
since the second job has to re-read from disk data the first job had
already pulled into the page cache.
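
a rough way to check whether that cost actually matters for a given
input set is to time a warm re-read of the same file against a re-read
right after dropping caches.  a sketch (hypothetical path, needs root
for the drop step, same caveats as above):

    import os
    import time

    PATH = "/scratch/shared_input.dat"    # hypothetical shared input file

    def timed_read(path):
        start = time.monotonic()
        with open(path, "rb") as f:
            while f.read(1 << 20):        # read in 1 MiB chunks
                pass
        return time.monotonic() - start

    timed_read(PATH)                      # first read populates the page cache
    warm = timed_read(PATH)               # second read served mostly from cache

    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

    cold = timed_read(PATH)               # re-read after the drop goes back to disk
    print("warm re-read: %.3fs, cold re-read after drop: %.3fs" % (warm, cold))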

I guess the rationale would also be much clearer for certain workloads,
such as big-data reduction jobs: things like executables would have to
be re-fetched, but the presumably much larger input data might never be
re-referenced by following jobs.  it would have to be jobs that have a
lot of intra- but not inter-job read-only file re-reference, and where
clean-page scavenging is a noticeable cost.

I'm guessing this may have been a much bigger deal on strongly NUMA
machines of a certain era (high-memory ia64 SGI, older kernels).

regards, mark.
