[Beowulf] non-stop computing
Guy Coates
guy.coates at gmail.com
Thu Oct 27 08:38:15 PDT 2016
BLCR and DMTCP should both be able to checkpoint a single-node job (single-
or multi-threaded) straight out of the box; you won't need to recompile any
of your binaries.
DMTCP does not require any kernel modules, and so you might find that
easier going if you are on a more recent kernel than BLCR supports. (DMTCP
also seems to do a better job handling MPI jobs than BLCR does, if you care
about those.)
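
As a rough illustration of the "no recompile" point, here is a minimal
Python sketch that only builds the command lines for running an unmodified
binary under DMTCP and for restarting it later. The helper names, the
./my_job binary, the one-hour interval and the ckpt directory are all
hypothetical; the --interval and --ckptdir options are as I understand them
from DMTCP's command-line help, so check `dmtcp_launch --help` on your
install before relying on them.

```python
import shlex


def dmtcp_launch_argv(command, interval_s=3600, ckpt_dir="ckpt"):
    """Build the argv for running `command` under DMTCP with periodic checkpoints.

    Hypothetical helper: it only assembles the command line, it does not run it.
    """
    return [
        "dmtcp_launch",
        "--interval", str(interval_s),  # seconds between automatic checkpoints (assumed flag)
        "--ckptdir", ckpt_dir,          # directory for checkpoint images (assumed flag)
        *command,                       # the unmodified application and its arguments
    ]


def dmtcp_restart_argv(image_paths):
    """Build the argv for resuming from one or more checkpoint image files."""
    return ["dmtcp_restart", *image_paths]


if __name__ == "__main__":
    # ./my_job is a placeholder for whatever binary you already have.
    print(shlex.join(dmtcp_launch_argv(["./my_job", "--input", "data.in"])))
```

To actually launch the job you would pass the resulting list to
subprocess.run() on a node where DMTCP is installed; the restart helper
takes the image file(s) that DMTCP wrote into the checkpoint directory.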
Thanks,
Guy
--
Dr. Guy Coates