How to tell when a job is swapping?

Jakob Østergaard jakob at unthought.net
Tue Feb 19 08:31:01 PST 2002


On Tue, Feb 19, 2002 at 09:23:25AM -0500, Jeff Layton wrote:
> Good morning,
> 
>    For a while now I've been checking if a job is swapping
> on our clusters using bWatch. The nodes are dual CPU
> boxes and we run two MPI processes per node. I usually
> look at the load on the nodes to see if it is above 3.0
> (sometimes our code will peak out at about 2.3) and I
> look at the free swap space number (bWatch just cats the
> /proc/meminfo file).
>    I usually assume that if the free swap space falls below
> the maximum and load starts climbing that the node is
> swapping. However, when I talk to the user, he states that
> the code is running fine and the timing numbers are where
> they should be. So, I'm obviously interpreting something
> incorrectly (unless the job is really swapping but for some
> reason performance is unaffected).
>    Can someone give me a good way to check if a job
> is swapping? Maybe a URL?


You can test whether "a" job is swapping (some job on the node, not
a particular one) using vmstat.

See, you usually don't care that you're 2 GB into swap, as long as
it's rarely used data that's swapped out.  That has no performance
impact on the system.  What you care about is whether a job is
"thrashing", i.e. constantly moving pages in and out of swap.

A small quiz to illustrate my point:   Is this system loaded ?

[albatros:joe] $ free
             total       used       free     shared    buffers     cached
Mem:        513792     420720      93072          0       4696     111252
-/+ buffers/cache:     304772     209020
Swap:      2101000     718688    1382312
[albatros:joe] $ 

Oh, it has 512 MB of memory, and it's 718 MB into swap - oh horror !

Now look at vmstat:
[albatros:joe] $ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
16  2  2 718620   4912   4896 108560   0   0     0     2    7     3   2   5   1
12  5  2 718620   4884   4744 108024  40   0    64     0 3222  3654  84  16   0
17  2  1 718620   5224   4360 105528   0   0    48     0 3960  4195  82  18   0
19  6  2 718620   6244   4160 104908   0   0   168     0 3820  4368  75  25   0
 9  7  1 718620   5060   3932 100688 128   0   148  1672 3252  3514  77  23   0

The si and so numbers tell me how much is being swapped in and out - 
the swap space is almost idle here.

Conclusion:  This system is not stressed at all (wrt. swap space).
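
If you'd rather not eyeball the columns, the same check can be scripted.
What follows is only a rough sketch in Python, not a polished tool: the
function names and the threshold of 1000 are my own assumptions, and the
sample text is simply the vmstat output shown above. It finds the header
line containing "si" and "so", reads the numeric rows under it, and
averages the two columns, skipping the first row since vmstat's first
line reports averages since boot.

```python
# Sketch: decide from captured `vmstat 1` output whether a node is thrashing.
# All names and the threshold below are illustrative assumptions.

SAMPLE = """\
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
16  2  2 718620   4912   4896 108560   0   0     0     2    7     3   2   5   1
12  5  2 718620   4884   4744 108024  40   0    64     0 3222  3654  84  16   0
17  2  1 718620   5224   4360 105528   0   0    48     0 3960  4195  82  18   0
19  6  2 718620   6244   4160 104908   0   0   168     0 3820  4368  75  25   0
 9  7  1 718620   5060   3932 100688 128   0   148  1672 3252  3514  77  23   0
"""


def parse_vmstat(text):
    """Return one dict per data row, keyed by the vmstat column names."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        # The column header is the line that names both si and so.
        if "si" in fields and "so" in fields:
            header = fields
            continue
        # A data row has one integer per header column.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def swap_activity(rows, skip_first=True):
    """Average (si, so) over the samples, ignoring the since-boot first row."""
    samples = rows[1:] if skip_first and len(rows) > 1 else rows
    n = len(samples)
    return (sum(r["si"] for r in samples) / n,
            sum(r["so"] for r in samples) / n)


def looks_like_thrashing(rows, threshold=1000):
    """Hypothetical rule: sustained heavy traffic in BOTH directions."""
    si, so = swap_activity(rows)
    return si > threshold and so > threshold


if __name__ == "__main__":
    rows = parse_vmstat(SAMPLE)
    print("avg si, so:", swap_activity(rows))
    print("thrashing:", looks_like_thrashing(rows))
```

On the sample above the averages come out to 42 swapped in and 0
swapped out per interval, so the check reports no thrashing. The
threshold is a guess; pick one that matches what your disks and your
code can tolerate.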

-- 
................................................................
:   jakob at unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


