Linux memory leak?
Huntsinger, Reid
reid_huntsinger at merck.com
Thu Feb 28 13:54:24 PST 2002
As far as I can tell, on later kernels (2.4.10, 2.4.13, 2.4.17) this is
mostly due to aggressive caching(?). You can get an idea of what's going
on by running a program that eats up lots of memory (e.g., malloc a big
block, then write over it repeatedly) and checking how long it takes. You
should notice that even when "free" reports lots of "used" memory with
nothing really running, the program runs nearly as fast as after a fresh
boot. The "used" pages are given up easily (not swapped out). This also
has the side effect of making "free" report a reasonable number.
However, on older kernels this didn't happen: lots of swap-out activity
would ensue and the malloc-and-write program would really bog down.
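A minimal sketch of the kind of memory eater I mean (the 400 MB size and
the pass count are arbitrary; scale them to the machine's RAM):

/* eat-mem.c: malloc a large block, then write over it repeatedly,
 * timing the passes.  Illustrative sketch only.
 * Compile with: gcc -O -o eat-mem eat-mem.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MEGS 400            /* arbitrary; size to your RAM */

int main(void)
{
    size_t size = (size_t)MEGS * 1024 * 1024;
    char *buf = malloc(size);
    time_t start = time(NULL);
    int pass;

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    /* Touch every page on each pass so the kernel must find real
     * frames for the whole block. */
    for (pass = 0; pass < 5; pass++)
        memset(buf, pass, size);
    printf("%d passes over %d MB in %ld seconds\n",
           pass, MEGS, (long)(time(NULL) - start));
    free(buf);
    return 0;
}

Run it once after a fresh boot and again when "free" shows the memory as
"used"; on the later kernels the times should be close.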
Reid Huntsinger
Date: Thu, 28 Feb 2002 14:58:07 -0500
From: Josip Loncaric <josip at icase.edu>
Reply-To: josip at icase.edu
Organization: ICASE
To: Beowulf mailing list <beowulf at beowulf.org>
Subject: Linux memory leak?
On our heterogeneous cluster, we run Red Hat 7.2 updated to stock i686
Linux kernels 2.4.9-21 or 2.4.9-21smp. Sometimes (e.g. after 14 days of
normal operation) our nodes report unusually high memory usage even
without any user processes active. This can happen on both single CPU
and on dual CPU machines, and it used to happen with previous 2.4
kernels. Here is an example:
# free
             total       used       free     shared    buffers     cached
Mem:        512444     449196      63248          0      70164      76332
-/+ buffers/cache:     302700     209744
Swap:      1060272     285492     774780
If I add up all RSS numbers reported by 'ps -e v' I get only about
20,500 KB, and yet this dual CPU system reports 302,700 KB RAM used
(without even counting buffers or cache). Apparently, only 'reboot' can
recover the missing 282,200 KB. Any ideas on tracking down where the
missing memory went?
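(For reference, a rough sketch of the kind of tally I mean -- summing
the resident-page field of /proc/PID/statm for every process, which
should match adding up the ps RSS column; illustrative only:)

/* sum-rss.c: total the resident set sizes of all processes, in kB.
 * Illustrative sketch; compile with: gcc -o sum-rss sum-rss.c */
#include <stdio.h>
#include <ctype.h>
#include <dirent.h>
#include <unistd.h>

int main(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    FILE *f;
    char path[64];
    long rss, pages = 0;

    if (proc == NULL) {
        perror("/proc");
        return 1;
    }
    while ((de = readdir(proc)) != NULL) {
        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                 /* not a PID directory */
        snprintf(path, sizeof path, "/proc/%s/statm", de->d_name);
        if ((f = fopen(path, "r")) == NULL)
            continue;                 /* process exited meanwhile */
        /* statm fields (in pages): size resident shared text lib data dirty */
        if (fscanf(f, "%*ld %ld", &rss) == 1)
            pages += rss;
        fclose(f);
    }
    closedir(proc);
    printf("total RSS: %ld kB\n", pages * (getpagesize() / 1024));
    return 0;
}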
Sincerely,
Josip
P.S. Here is more detail:
# cat /proc/meminfo
          total:     used:     free:   shared:  buffers:   cached:
Mem:   524742656 460013568  64729088         0  71929856 369848320
Swap: 1085718528 292343808 793374720
MemTotal: 512444 kB
MemFree: 63212 kB
MemShared: 0 kB
Buffers: 70244 kB
Cached: 76332 kB
SwapCached: 284848 kB
Active: 242464 kB
Inact_dirty: 188960 kB
Inact_clean: 0 kB
Inact_target: 131068 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 512444 kB
LowFree: 63212 kB
SwapTotal: 1060272 kB
SwapFree: 774780 kB
# ps -e v
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1 ? S 0:05 139 23 1392 480 0.0 init
2 ? SW 0:00 0 0 0 0 0.0 [keventd]
3 ? SWN 0:01 0 0 0 0 0.0 [ksoftirqd_CPU0]
4 ? SWN 0:01 0 0 0 0 0.0 [ksoftirqd_CPU1]
5 ? SW 0:08 0 0 0 0 0.0 [kswapd]
6 ? SW 0:00 0 0 0 0 0.0 [kreclaimd]
7 ? SW 0:00 0 0 0 0 0.0 [bdflush]
8 ? SW 0:00 0 0 0 0 0.0 [kupdated]
9 ? SW< 0:00 0 0 0 0 0.0 [mdrecoveryd]
13 ? SW 0:13 0 0 0 0 0.0 [kjournald]
88 ? SW 0:00 0 0 0 0 0.0 [khubd]
154 ? SW 0:01 0 0 0 0 0.0 [kjournald]
428 ? S 0:00 41 46 1485 504 0.0 /sbin/pump -i et
453 ? S 0:00 79 23 1452 644 0.1 syslogd -m 0
458 ? S 0:00 46 18 2077 508 0.0 klogd -2
478 ? S 0:00 83 25 1538 604 0.1 portmap
506 ? S 0:00 110 21 1590 616 0.1 rpc.statd
631 ? SL 0:03 24 234 1705 1936 0.3 ntpd -U ntp
685 ? S 0:00 20 12 1439 508 0.0 /usr/sbin/atd
703 ? S 0:00 32 232 2451 656 0.1 /usr/sbin/sshd
736 ? S 0:00 143 133 2138 820 0.1 xinetd -stayaliv
795 ? S 0:00 75 18 1573 624 0.1 crond
843 tty1 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
844 tty2 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
845 tty3 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
846 tty4 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
847 tty5 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
848 tty6 S 0:00 109 6 1381 368 0.0 /sbin/mingetty t
849 ? S 2:25 162 10 1429 584 0.1 /opt/sbin/cnm -i
850 ? S 1:39 243 484 1747 928 0.1 /bin/bash /opt/s
1105 ? SW 0:14 0 0 0 0 0.0 [rpciod]
1106 ? SW 0:00 0 0 0 0 0.0 [lockd]
11105 ? S 0:51 125 149 1794 1072 0.2 /usr/PBS/sbin/pb
24146 ? S 0:00 9 423 4804 1996 0.3 sendmail: accept
27052 ? S 0:00 0 400 39 172 0.0 /sbin/dhcpcd -n
27219 ? S 0:00 289 12 2243 1064 0.2 in.rlogind
27220 pts/0 S 0:00 288 16 2339 1120 0.2 login -- root
27221 pts/0 S 0:00 288 484 2047 1360 0.2 -bash
27314 ? S 0:00 168 9 1934 680 0.1 sleep 60
27315 pts/0 R 0:00 175 59 2588 716 0.1 ps -e v
# uptime
2:53pm up 14 days, 17:03, 1 user, load average: 0.00, 0.00, 0.00
--
Dr. Josip Loncaric, Research Fellow mailto:josip at icase.edu
ICASE, Mail Stop 132C PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134
--__--__--
Message: 10
Date: Thu, 28 Feb 2002 15:42:09 -0500 (EST)
From: Joshua Baker-LePain <jlb17 at duke.edu>
To: Josip Loncaric <josip at icase.edu>
cc: Beowulf mailing list <beowulf at beowulf.org>
Subject: Re: Linux memory leak?
On Thu, 28 Feb 2002 at 2:58pm, Josip Loncaric wrote:
> # free
>              total       used       free     shared    buffers     cached
> Mem:        512444     449196      63248          0      70164      76332
> -/+ buffers/cache:     302700     209744
> Swap:      1060272     285492     774780
>
> If I add up all RSS numbers reported by 'ps -e v' I get only about
> 20,500 KB, and yet this dual CPU system reports 302,700 KB RAM used
> (without even counting buffers or cache). Apparently, only 'reboot' can
> recover the missing 282,200 KB. Any ideas on tracking down where the
> missing memory went?
I've seen this behavior even after very little uptime. All you have to do
is have a process swap heavily. When that process goes away, it seems as
if what's left in swap also stays in memory. Further memory pressure
then makes stuff get paged *out* of swap.
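For what it's worth, a rough check of the /proc/meminfo figures in your
P.S. is consistent with this reading (assuming the leftover pages are
what SwapCached counts):

  used - buffers - cached    = 449196 - 70164 - 76332 = 302700 kB
  302700 - ~20500 (RSS sum) ~= 282200 kB unaccounted for
  SwapCached                 = 284848 kB

i.e. the "missing" memory is almost exactly the swap cache, which gets
reported as used even though the kernel can reclaim it.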
I tracked it down to an existing bugzilla report:
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=59002
There doesn't seem to be an official resolution from Red Hat yet, but a
custom-compiled 2.4.17 didn't show this behavior.
--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
--__--__--
Message: 11
Date: Thu, 28 Feb 2002 16:07:25 -0500 (EST)
From: "Robert G. Brown" <rgb at phy.duke.edu>
To: Beowulf Mailing List <beowulf at beowulf.org>
Subject: Motherboard query...
Dear Listers,
I'd like to request comments on a couple of dual Athlon motherboards.
We are considering both the Tyan Tiger 2466N (760 MPX) and the MSI K7D
Master (MS-6501) (also 760 MPX). Our local vendor "supports" MSI
motherboards (which just means that we deal with them rather than Tyan
in the event of a return, but which makes it reasonable to use the MSI
all things being equal). We are going with 760 MPX to get the 64/66 PCI
slots, of course -- we actually have a small stack of 2460 Tigers which
are not totally painless but which we've more or less tamed.
Any experiences yet, good or bad, with either motherboard? The vendor
is probably going to loan us an MSI-based dual to test, but there's
nothing like the experience of somebody actually running a cluster if
there is anybody out there already doing so.
I'd also like comments on RAID alternatives. We have a group who needs
about 500 GB of RAID. We just got a Promise UltraTrak100 TX8
(IDE-to-SCSI) RAID chassis that advertised itself as OS-independent
plug and play -- attach to the SCSI bus and go. The first unit we were
shipped didn't work under any OS. For the second, we got the vendor
(Megahaus) to verify function before shipping, and it does "work", but
it returns unbelievably poor performance at RAID 5 -- a (very) few
MB/sec -- under bonnie. From this we learned (among many things :-)
that vendors often quote performance numbers on a RAID from its RAID 0
configuration, which would be kind of funny if it weren't for the
murderous impulses it creates when you learn that their numbers are
some sort of cruel joke under RAID 5.
We are twisting Megahaus's arm to take it back and give us our money
back (they are complaining that it is more than thirty days since they
delivered the FIRST unit, but we've only had a working unit for about
two weeks and do not want it if its SCSI performance is that abysmal).
We are then stuck looking for an alternative at roughly the same cost.
Our alternatives seem to be:
a) Another IDE-RAID enclosure, perhaps from a better manufacturer.
However, at this point we're more than a bit concerned about the gap
between vendor performance claims and reality. There are vendors that
assert 100 MB/sec read rates, but we are concerned that they mean "at
RAID 0", which is useless to us. We need real-world loaded numbers at
RAID 5 (e.g. multiple instances of bonnie). Folks we know locally who
have e.g. zero-d chassis report real world throughput more like 20
MB/sec RW, but their boxes are a year or two old and may not reflect
current rates. 20 MB/sec is pretty much the LOWEST rate we could
tolerate in this application under multithreaded load, and we'd like
something better. Any enclosure/controllers out there that give
good-to-excellent performance that you'd care to recommend?
b) md-raid, either ide or scsi, on a straight linux server. We know
that this works remarkably well. We run md raid in the departmental
server (scsi, with a stack of 36 GB disks in RAID 5) and get excellent
performance -- ~40 MB/sec write throughput and even better for read.
Unfortunately large SCSI disks are still excessively expensive and we
don't have the budget to reach 500 GB with SCSI disks for this cluster.
IDE is cheap and easy, but we would like a bit of assurance that Linux
won't have (e.g. DMA) problems when dealing with 6-8 IDE controllers on
one bus (a sketch of the md setup we have in mind follows below). Is
anyone doing this? Good or bad experiences, hardware recommendations,
or gotchas all welcome.
c) SCSI RAID. Definitely works, definitely high performance, but also
the most expensive and again, we won't be able to afford to reach our
design spec with the money allocated to this ($5-6K total).
If we have to fall back to SCSI we will and will live with a smaller
RAID than we had hoped, but we'd very much like to first find out if
IDE-based RAID solutions (RAID 5 on ~500 GB total disk) with >20 MB/sec
worst-case write rates under heavy load exist.
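For concreteness, the raidtools-style md setup we'd be testing for
option b) -- a sketch only, with hypothetical device names, one drive
per controller channel to avoid master/slave contention:

# /etc/raidtab (hypothetical 4-disk RAID 5)
raiddev /dev/md0
        raid-level              5
        nr-raid-disks           4
        nr-spare-disks          0
        persistent-superblock   1
        parity-algorithm        left-symmetric
        chunk-size              64
        device                  /dev/hde1
        raid-disk               0
        device                  /dev/hdg1
        raid-disk               1
        device                  /dev/hdi1
        raid-disk               2
        device                  /dev/hdk1
        raid-disk               3

then "mkraid /dev/md0", mke2fs as usual, and beat on it with multiple
bonnies before trusting any numbers.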
TIA,
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
--__--__--
_______________________________________________
Beowulf mailing list
Beowulf at beowulf.org
http://www.beowulf.org/mailman/listinfo/beowulf
End of Beowulf Digest