[Beowulf] size of swap partition

Mikhail Kuzminsky kus at free.net
Tue Jun 10 10:35:46 PDT 2008


In message from Mark Hahn <hahn at mcmaster.ca> (Tue, 10 Jun 2008 
00:58:12 -0400 (EDT)):
... 
>for instance, you can always avoid OOM with the vm.overcommit_memory=2
>sysctl (you'll need to tune vm.overcommit_ratio and the amount of swap
>to get the desired limits.)  in this mode, the kernel tracks how much VM
>it actually needs (worst-case, reflected in Committed_AS in /proc/meminfo)
>and compares that to a commit limit that reflects ram and swap.
>
>if you don't use overcommit_memory=2, you are basically borrowing VM
>space in hopes of not needing it.  that can still be reasonable, considering
>how often processes have a lot of shared VM, and how many processes
>allocate but never touch lots of pages.  but you have to ask yourself:
>would I like a system that was actually _using_ 16 GB of swap?  if you
>have 16x disks, perhaps, but 16G will suck if you only have 1 disk.
>at least for overcommit_memory != 2, I don't see the point of configuring
>a lot of swap, since the only time you'd use it is if you were thrashing.
>sort of a "quality of life" argument.
>
>>> But what are the recommendations of modern practice?
>
>it depends a lot on the size variance of your jobs, as well as their
>real/virtual ratio.  the kernel only enforces RLIMIT_AS
>(vsz in ps), assuming a 2.6 kernel - I forget whether 2.4 did
>RLIMIT_RSS or not.
>
>if you use overcommit_memory=2, your desired max VM size determines 
>the amount of swap.  otherwise, go with something modest - memory size
>or so.  but given that the smallest reasonable single disk these days
>is probably about 320GB, it's hard to justify being _too_ tight.
:-) The disks we use in our nodes are SATA WD 10K RPM drives with 70 GB :-))

We didn't set overcommit_memory=2, but we do use a strongly restricted 
scheduling policy for SGE batch jobs, which run only a few applications. 
We have only batch jobs (no interactive ones) and, moreover, practically 
only *long batch jobs*. As a result, the total VM requested per node is 
equal to (or lower than) RAM, and there is practically zero swap activity. 
The only exceptions are (seldom executed) small test jobs, 
non-parallelized, mainly for checking input data. They use a small amount 
of RAM. So it looks to me that I may set the swap size even lower than 
1.5*RAM (I think RAM+4G = 20G will be enough).
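
For reference, the arithmetic behind the overcommit_memory=2 mode discussed
above can be sketched roughly as follows. This assumes the rule given in the
kernel documentation, CommitLimit = swap + RAM * vm.overcommit_ratio / 100
(huge pages ignored, default ratio 50). The function names are purely
illustrative, and the 16 GB of RAM is inferred from "RAM+4G = 20G".

```python
# Sketch only: function names are illustrative, not part of any tool.
# Assumes the documented kernel rule for vm.overcommit_memory=2:
#   CommitLimit = swap + RAM * vm.overcommit_ratio / 100
# (huge pages are ignored here for simplicity).

def commit_limit_gb(ram_gb, swap_gb, overcommit_ratio=50):
    """Return the commit limit in GB for the given RAM, swap and ratio."""
    return swap_gb + ram_gb * overcommit_ratio / 100.0


def swap_needed_gb(ram_gb, max_vm_gb, overcommit_ratio=50):
    """Return the swap in GB needed so the commit limit reaches max_vm_gb."""
    return max(0.0, max_vm_gb - ram_gb * overcommit_ratio / 100.0)


if __name__ == "__main__":
    ram = 16   # GB, inferred from "RAM+4G = 20G" (an assumption)
    swap = 20  # GB, the swap size proposed above
    # Commit limit with the default ratio of 50
    print(commit_limit_gb(ram, swap))
    # Swap required to allow committed VM equal to RAM
    print(swap_needed_gb(ram, max_vm_gb=16))
```

Without overcommit_memory=2 this limit is not enforced, so the numbers only
show what the strict mode would allow for a given swap size.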

In message from Walid <walid.shaari at gmail.com> (Tue, 10 Jun 2008 
19:27:43 +0300):
>Hi,
>For an 8GB dual socket quad core node, choosing in the kick start
>file --recommended instead of specifying size RHEL5 allocates 1GB of
>memory. our developers say that they should not swap as this will
>cause an overhead, and they try to avoid it as much as possible

OpenSuSE 10.3 recommends a swap size of only 2 GB, but I don't know 
whether the SuSE installer takes the server's RAM into account or not. 

Yours
Mikhail



