[Beowulf] first cluster

Christopher Samuel samuel at unimelb.edu.au
Thu Jul 15 21:09:36 PDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 16/07/10 11:29, Mark Hahn wrote:

> every distro I've seen leaves these at the default setting:
> vm.overcommit_memory=0.  this is basically the traditional
> setting that tells the kernel to feel free to allocate way
> too much memory, and to resolve memory crunches via OOM

Looking at the kernel code, if you set vm.overcommit_memory
to 0 (OVERCOMMIT_GUESS) then the kernel allows *each process*
to allocate up to 97% of the total of RAM+swap (the last 3% is
reserved for root, or processes with CAP_SYS_ADMIN).   The catch
is that (as highlighted) the limit is a per-process one, not a
system-wide one.

With it set to 1 (OVERCOMMIT_ALWAYS) there are no checks at all;
the check just returns 0 (OK), so any process can allocate as much
as it wants. You just don't know who or what will get OOM'd when
you actually come to use it. ;-)

With 2 (OVERCOMMIT_NEVER) you can never specify more than
your entire RAM+swap and the limit is applied across the
system.
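
To see which of the three modes a node is in, something like the
following should do. It is only a sketch, assuming a Linux /proc:
the function names are mine, not the kernel's. CommitLimit and
Committed_AS are the /proc/meminfo fields the kernel reports for the
system-wide limit and the amount currently committed; the limit is
only enforced in mode 2.

```python
from pathlib import Path

# Names used in the kernel source for the vm.overcommit_memory values.
OVERCOMMIT_MODES = {
    0: "OVERCOMMIT_GUESS",
    1: "OVERCOMMIT_ALWAYS",
    2: "OVERCOMMIT_NEVER",
}


def parse_meminfo(text):
    """Return {field: value} for the 'Name:   value [kB]' lines of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(meminfo):
    """CommitLimit minus Committed_AS, in kB (only enforced in mode 2)."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


def describe(mode_text, meminfo_text):
    """Map the raw sysctl value and meminfo text to (mode name, headroom in kB)."""
    mode = int(mode_text.strip())
    info = parse_meminfo(meminfo_text)
    return OVERCOMMIT_MODES.get(mode, "unknown"), commit_headroom_kb(info)


if __name__ == "__main__":
    mode_file = Path("/proc/sys/vm/overcommit_memory")
    if mode_file.exists():  # Linux only
        name, headroom = describe(mode_file.read_text(),
                                  Path("/proc/meminfo").read_text())
        print(f"{name}: {headroom} kB below CommitLimit")
```

A negative headroom is possible in modes 0 and 1, which is exactly
the situation where the OOM killer may end up being needed.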

We enforce RLIMIT_AS for MPI and single CPU processes by
setting pvmem limits in Torque in the default queue.

That doesn't work for SMP jobs, so we have an 'smp' queue
for them which sets mem= instead; this means that pbs_mom
monitors the children and kills them if they go over their
limits.
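
For anyone who wants to see what an RLIMIT_AS limit looks like from
the process's side, here is a rough sketch. It is not what Torque
does internally, just an illustration on Linux: the helper name
try_alloc and the exit status 42 are made up for the example. The
child caps its own address space and then tries to allocate more
than the cap, so the allocation fails inside the process rather than
the process being killed from outside.

```python
import subprocess
import sys

# Code run in the child: cap its own address space, then try to allocate.
CHILD = """
import resource, sys
limit, want = int(sys.argv[1]), int(sys.argv[2])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)   # the soft limit cannot exceed the hard limit
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    block = bytearray(want)
except MemoryError:
    sys.exit(42)               # the allocation was refused
"""

REFUSED = 42  # hypothetical exit status chosen for this example


def try_alloc(limit_bytes, want_bytes):
    """Run a child under RLIMIT_AS=limit_bytes and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(want_bytes)])
    return proc.returncode


if __name__ == "__main__":
    gib = 1 << 30
    status = try_alloc(1 * gib, 2 * gib)
    print("refused" if status == REFUSED else f"exit status {status}")
```

The limit is on address space, not resident memory, so the request
is refused at allocation time whether or not the pages would ever
have been touched.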

cheers!
Chris
- -- 
 Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computational Initiative
 Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkw/238ACgkQO2KABBYQAh88PQCfdmVZjYE2GznidzDNPOJ2zO6U
DbIAnjKaviRyxIIsNVmsS3zfgbM0M7uZ
=eLad
-----END PGP SIGNATURE-----


