Linux memory leak?
Josip Loncaric
josip at icase.edu
Thu Feb 28 11:58:07 PST 2002
On our heterogeneous cluster, we run Red Hat 7.2 updated to stock i686
Linux kernels 2.4.9-21 or 2.4.9-21smp. Sometimes (e.g. after 14 days of
normal operation) our nodes report unusually high memory usage even
without any user processes active. This can happen on both single-CPU
and dual-CPU machines, and it also happened with previous 2.4
kernels. Here is an example:
# free
             total       used       free     shared    buffers     cached
Mem:        512444     449196      63248          0      70164      76332
-/+ buffers/cache:     302700     209744
Swap:      1060272     285492     774780
If I add up all RSS numbers reported by 'ps -e v' I get only about
20,500 KB, and yet this dual CPU system reports 302,700 KB RAM used
(without even counting buffers or cache). Apparently, only 'reboot' can
recover the missing 282,200 KB. Any ideas on tracking down where the
missing memory went?
Sincerely,
Josip
P.S. Here is more detail:
# cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 524742656 460013568 64729088 0 71929856 369848320
Swap: 1085718528 292343808 793374720
MemTotal: 512444 kB
MemFree: 63212 kB
MemShared: 0 kB
Buffers: 70244 kB
Cached: 76332 kB
SwapCached: 284848 kB
Active: 242464 kB
Inact_dirty: 188960 kB
Inact_clean: 0 kB
Inact_target: 131068 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 512444 kB
LowFree: 63212 kB
SwapTotal: 1060272 kB
SwapFree: 774780 kB
# ps -e v
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1 ? S 0:05 139 23 1392 480 0.0 init
2 ? SW 0:00 0 0 0 0 0.0 [keventd]
3 ? SWN 0:01 0 0 0 0 0.0 [ksoftirqd_CPU0]
4 ? SWN 0:01 0 0 0 0 0.0 [ksoftirqd_CPU1]
5 ? SW 0:08 0 0 0 0 0.0 [kswapd]
6 ? SW 0:00 0 0 0 0 0.0 [kreclaimd]
7 ? SW 0:00 0 0 0 0 0.0 [bdflush]
8 ? SW 0:00 0 0 0 0 0.0 [kupdated]
9 ? SW< 0:00 0 0 0 0 0.0 [mdrecoveryd]
13 ? SW 0:13 0 0 0 0 0.0 [kjournald]
88 ? SW 0:00 0 0 0 0 0.0 [khubd]
154 ? SW 0:01 0 0 0 0 0.0 [kjournald]
428 ? S 0:00 41 46 1485 504 0.0 /sbin/pump -i et
453 ? S 0:00 79 23 1452 644 0.1 syslogd -m 0
458 ? S 0:00 46 18 2077 508 0.0 klogd -2
478 ? S 0:00 83 25 1538 604 0.1 portmap
506 ? S 0:00 110 21 1590 616 0.1 rpc.statd
631 ? SL 0:03 24 234 1705 1936 0.3 ntpd -U ntp
685 ? S 0:00 20 12 1439 508 0.0 /usr/sbin/atd
703 ? S 0:00 32 232 2451 656 0.1 /usr/sbin/sshd
736 ? S 0:00 143 133 2138 820 0.1 xinetd -stayaliv
795 ? S 0:00 75 18 1573 624 0.1 crond
843 tty1 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
844 tty2 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
845 tty3 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
846 tty4 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
847 tty5 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
848 tty6 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
849 ? S 2:25 162 10 1429 584 0.1 /opt/sbin/cnm -i
850 ? S 1:39 243 484 1747 928 0.1 /bin/bash /opt/s
1105 ? SW 0:14 0 0 0 0 0.0 [rpciod]
1106 ? SW 0:00 0 0 0 0 0.0 [lockd]
11105 ? S 0:51 125 149 1794 1072 0.2 /usr/PBS/sbin/pb
24146 ? S 0:00 9 423 4804 1996 0.3 sendmail: accept
27052 ? S 0:00 0 400 39 172 0.0 /sbin/dhcpcd -n
27219 ? S 0:00 289 12 2243 1064 0.2 in.rlogind
27220 pts/0 S 0:00 288 16 2339 1120 0.2 login -- root
27221 pts/0 S 0:00 288 484 2047 1360 0.2 -bash
27314 ? S 0:00 168 9 1934 680 0.1 sleep 60
27315 pts/0 R 0:00 175 59 2588 716 0.1 ps -e v
# uptime
2:53pm up 14 days, 17:03, 1 user, load average: 0.00, 0.00, 0.00
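
For anyone who wants to repeat the comparison on their own nodes, here is a minimal Python sketch of the arithmetic described above: sum the RSS column of 'ps -e v' and subtract it, together with buffers and cache, from the "used" figure that 'free' reports. The function names (sum_rss_kb, unaccounted_kb) and the three-line sample are made up for illustration, and the 20,500 KB total is the approximate figure quoted in the message, not a recomputed value.

```python
# Sketch only: helper names and SAMPLE_PS are illustrative assumptions.

SAMPLE_PS = """\
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1 ? S 0:05 139 23 1392 480 0.0 init
2 ? SW 0:00 0 0 0 0 0.0 [keventd]
453 ? S 0:00 79 23 1452 644 0.1 syslogd -m 0
"""


def sum_rss_kb(ps_output: str) -> int:
    """Sum the RSS column (in KB) of `ps -e v` style output."""
    lines = ps_output.strip().splitlines()
    # Locate the RSS column from the header instead of hard-coding its index.
    rss_col = lines[0].split().index("RSS")
    total = 0
    for line in lines[1:]:
        fields = line.split()
        # Skip wrapped continuation lines that carry no numeric RSS field.
        if len(fields) <= rss_col or not fields[rss_col].isdigit():
            continue
        total += int(fields[rss_col])
    return total


def unaccounted_kb(used_kb: int, buffers_kb: int, cached_kb: int,
                   rss_total_kb: int) -> int:
    """Memory in use that is neither buffers, cache, nor process RSS."""
    return used_kb - buffers_kb - cached_kb - rss_total_kb


if __name__ == "__main__":
    print("RSS total of sample (KB):", sum_rss_kb(SAMPLE_PS))
    # Figures from the `free` output above; 20500 is the approximate RSS sum.
    print("Unaccounted (KB):", unaccounted_kb(449196, 70164, 76332, 20500))
```

On a live node the same function could be fed the captured output of 'ps -e v'. Note that summing RSS counts pages shared between processes more than once, so the sum tends to overstate what processes really hold, which makes the gap look smaller, not larger.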
--
Dr. Josip Loncaric, Research Fellow mailto:josip at icase.edu
ICASE, Mail Stop 132C PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134