[Beowulf] Definition of HPC
hahn at mcmaster.ca
Thu Apr 18 18:21:27 PDT 2013
> Only for benchmarking? We have done this for years on our production
> clusters (and SGI provides a tool for this and more to clean up nodes).
> We have this in our epilogue so that we can clean out memory on our
> diskless nodes so there is nothing stale sitting around that can impact
> the next user's job.
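for reference, the epilogue step being described presumably looks
something like this minimal sketch (the guard and the status message are
my additions; on a real node it runs as root, and 3 means drop pagecache
plus dentries/inodes):

```shell
#!/bin/sh
# sync first so drop_caches only discards clean pages; dirty pages
# are written back rather than lost
sync
# value 3 = free pagecache + dentries + inodes (Linux /proc interface)
if echo 3 2>/dev/null > /proc/sys/vm/drop_caches; then
    status=dropped
else
    # not root, or /proc/sys not writable (e.g. inside a container)
    status=skipped
fi
echo "drop_caches: $status"
```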
understood, but how did you decide that was actually a good thing?
if two jobs with similar file reference patterns run, for instance,
drop_caches will cause quite a bit of additional IO delay.
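the delay is easy to see: a warm page-cache read is memory-speed, and
dropping caches forces the next job to go back to disk. a rough
illustration (file path and size are arbitrary test values):

```shell
#!/bin/sh
f=$(mktemp)
# create a 64 MB test file
dd if=/dev/zero of="$f" bs=1M count=64 2>/dev/null
time cat "$f" > /dev/null   # may hit disk if the cache is cold
time cat "$f" > /dev/null   # second pass is served from the page cache
# after 'echo 3 > /proc/sys/vm/drop_caches' (as root), the next read
# would be slow again -- that re-fetch is the IO delay in question
bytes=$(wc -c < "$f")
rm -f "$f"
```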
I guess the rationale would also be much clearer for certain workloads,
such as big-data reduction jobs: things like executables would have to
be re-fetched, but the presumably much larger input data might never be
re-referenced by following jobs. it would have to be jobs with a lot of
intra- but not inter-job readonly file re-reference, and where
clean-page scavenging is a noticeable cost.
I'm guessing this may have been a much bigger deal on strongly NUMA
machines of a certain era (high-memory ia64 SGI, older kernels).