[Beowulf] [EXTERNAL] Re: Interactive vs batch, and schedulers

Sat Jan 18 08:01:49 PST 2020

On Fri, Jan 17, 2020 at 11:10 AM Prentice Bisbal via Beowulf
<beowulf at beowulf.org> wrote:
>
> The problem with timeslicing is that when one job is pre-empted, its state needs to be stored somewhere so the next job can run. Since many HPC jobs are memory intensive, using RAM for this is not usually an option. Which leaves writing the state to disk. Since disk is many orders of magnitude slower than RAM, writing state to disk for timeslicing would ultimately reduce the throughput of the cluster. It's much more efficient to have one job "own" the nodes until it completes.

that's true, but only in a general sense. i used to have an old
quadrics machine that ran pbs (i think; i might be misremembering) and
it was set up for timeslicing.  the trick was that we locked the
allocation so that each core came with a fixed slice of ram and you
couldn't allocate more ram per core than we said.  since the nodes had
four times the amount of ram per core that each allocation could
have, we could effectively run four jobs in parallel (from a memory
standpoint).  the scheduler would switch the jobs around round-robin,
giving each one 100% of the cpus for some period of time.
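
A minimal sketch of the scheme described above, assuming a single node and a fixed number of memory-resident jobs. This is not PBS configuration or the original site's setup; the names `Job`, `TimeslicedNode`, `oversubscription`, and the slice counts are all hypothetical and exist only to illustrate the idea: cap RAM per core so that several jobs fit in memory at once, then hand the whole CPU set to one resident job at a time in round-robin order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int
    ram_per_core_gb: float
    remaining_slices: int  # how many time slices the job still needs


class TimeslicedNode:
    """Toy model of one node (hypothetical, for illustration only)."""

    def __init__(self, cores: int, ram_gb: float, oversubscription: int = 4):
        self.cores = cores
        self.max_resident = oversubscription
        # Cap each allocation at 1/oversubscription of the physical RAM per
        # core, so that many full-node jobs fit in memory simultaneously.
        self.ram_cap_per_core_gb = ram_gb / cores / oversubscription
        self.resident = deque()

    def admit(self, job: Job) -> bool:
        """Accept a job only if it respects the per-core RAM cap."""
        if job.cores > self.cores:
            return False
        if job.ram_per_core_gb > self.ram_cap_per_core_gb:
            return False
        if len(self.resident) >= self.max_resident:
            return False
        self.resident.append(job)
        return True

    def run(self) -> list:
        """Round-robin: each slice gives one resident job all the CPUs.

        Returns the order in which jobs received slices. No state is
        written to disk, because every admitted job stays in RAM.
        """
        order = []
        while self.resident:
            job = self.resident.popleft()
            order.append(job.name)
            job.remaining_slices -= 1
            if job.remaining_slices > 0:
                self.resident.append(job)  # back of the queue
        return order


node = TimeslicedNode(cores=4, ram_gb=64, oversubscription=4)
admitted_a = node.admit(Job("a", cores=4, ram_per_core_gb=4.0, remaining_slices=2))
admitted_b = node.admit(Job("b", cores=4, ram_per_core_gb=8.0, remaining_slices=1))
admitted_c = node.admit(Job("c", cores=4, ram_per_core_gb=2.0, remaining_slices=1))
schedule = node.run()
```

With these made-up numbers the per-core cap works out to 4 GB, so job "b" (8 GB per core) is refused at submission, while "a" and "c" stay resident and alternate on the CPUs. The point of the cap is that pre-emption never needs to push job state out to disk.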

was it ideal? probably not.  but it made the researchers happy,
because there was very little waiting in the queue.

now, we're primarily batch scheduling through htcondor.  slurm gets
preference if someone wants something interactive and htcondor just
fills in around the edges across all my clusters.  we're not doing
batch in slurm, just interactive (python/mpi/ML/etc.)