[Beowulf] Re: Time limits in queues

Leif Nixon nixon at nsc.liu.se
Thu Jan 17 00:31:42 PST 2008


Craig Tierney <Craig.Tierney at noaa.gov> writes:

> Allowing users to run for days or weeks as SOP is begging for failure.

Define failure. Our time limit is typically somewhere around 5 or 6
days. Many codes don't have checkpointing, and it's often simply not
possible to add it because you don't have access to the source code.

With backfill scheduling, short and narrow jobs typically don't have
to wait *that* long, at least with the job mixture we see.
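
For anyone who hasn't looked at how backfill decides this, here is a
rough sketch of the idea in Python. It is a simplified EASY-style
backfill, not what any particular scheduler actually does: only the job
at the head of the queue gets a reservation, and a later job may start
now if it fits in the idle nodes and either ends before that
reservation or only uses nodes the head job won't need. The names
(Job, backfill_now) and the numbers in the example are made up for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int     # width: number of nodes requested
    walltime: int  # requested time limit, in hours


def backfill_now(total_nodes, running, queue, now=0):
    """Return the names of queued jobs that may start at time `now`.

    running: list of (end_time, nodes) for jobs already executing.
    queue:   list of Job in priority order (head first).
    Hypothetical helper; assumes requested walltimes are hard limits.
    """
    queue = list(queue)
    running = list(running)
    free = total_nodes - sum(n for _, n in running)
    started = []

    # Plain FCFS first: start jobs from the head while they fit.
    while queue and queue[0].nodes <= free:
        job = queue.pop(0)
        free -= job.nodes
        running.append((now + job.walltime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    head = queue[0]
    if head.nodes > total_nodes:
        raise ValueError("head job is wider than the whole machine")

    # Reservation for the head job: the earliest time at which enough
    # nodes will have been released (the "shadow time").
    avail = free
    shadow = now
    for end, n in sorted(running):
        avail += n
        shadow = end
        if avail >= head.nodes:
            break
    extra = avail - head.nodes  # nodes the head job leaves unused

    # Backfill: later jobs may start now if they cannot delay the head.
    for job in queue[1:]:
        if job.nodes > free:
            continue
        if now + job.walltime <= shadow:
            free -= job.nodes          # done before the reservation
            started.append(job.name)
        elif job.nodes <= extra:
            free -= job.nodes          # runs beside the head job
            extra -= job.nodes
            started.append(job.name)
    return started


# 8-node machine, 6 nodes busy for another 100 hours.
example_queue = [
    Job("wide", 8, 120),        # head job, must wait for the whole machine
    Job("short-a", 1, 10),
    Job("long-narrow", 1, 144),
    Job("short-b", 1, 50),
]
print(backfill_now(8, [(100, 6)], example_queue))
```

In this made-up case the two short, narrow jobs start immediately in
the two idle nodes, while the six-day narrow job has to wait because it
would still be running when the wide job's reservation comes due.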

-- 
Leif Nixon                       -            Systems expert
------------------------------------------------------------
National Supercomputer Centre    -      Linkoping University
------------------------------------------------------------


