Linux memory leak?
Huntsinger, Reid reid_huntsinger at merck.com
Thu Feb 28 13:54:24 PST 2002
As far as I can tell, on later kernels (2.4.10, 2.4.13, 2.4.17) this is
mostly due to aggressive caching. (?) You can get an idea of what's going
on by running a program to eat up lots of memory (e.g., malloc then write,
over and over) and checking how long it takes. You should notice that when
"free" reports lots of "used" memory but nothing is really running, the
program will run nearly as fast as after a fresh boot. The "used" pages are
easily given up (not swapped out). This also has the side-effect of making
"free" report a reasonable number.

However, on older kernels this didn't happen; lots of swapping-out activity
would ensue and the malloc-and-write program would really bog down.

Reid Huntsinger

Date: Thu, 28 Feb 2002 14:58:07 -0500
From: Josip Loncaric <josip at icase.edu>
Reply-To: josip at icase.edu
Organization: ICASE
To: Beowulf mailing list <beowulf at beowulf.org>
Subject: Linux memory leak?

On our heterogeneous cluster, we run Red Hat 7.2 updated to stock i686
Linux kernels 2.4.9-21 or 2.4.9-21smp. Sometimes (e.g. after 14 days of
normal operation) our nodes report unusually high memory usage even without
any user processes active. This can happen on both single CPU and on dual
CPU machines, and it used to happen with previous 2.4 kernels. Here is an
example:

# free
             total       used       free     shared    buffers     cached
Mem:        512444     449196      63248          0      70164      76332
-/+ buffers/cache:     302700     209744
Swap:      1060272     285492     774780

If I add up all RSS numbers reported by 'ps -e v' I get only about
20,500 KB, and yet this dual CPU system reports 302,700 KB RAM used
(without even counting buffers or cache). Apparently, only 'reboot' can
recover the missing 282,200 KB. Any ideas on tracking down where the
missing memory went?

Sincerely,
Josip

P.S.
Here is more detail:

# cat /proc/meminfo
        total:     used:     free:  shared:  buffers:    cached:
Mem:   524742656 460013568  64729088        0  71929856 369848320
Swap: 1085718528 292343808 793374720
MemTotal:       512444 kB
MemFree:         63212 kB
MemShared:           0 kB
Buffers:         70244 kB
Cached:          76332 kB
SwapCached:     284848 kB
Active:         242464 kB
Inact_dirty:    188960 kB
Inact_clean:         0 kB
Inact_target:   131068 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       512444 kB
LowFree:         63212 kB
SwapTotal:     1060272 kB
SwapFree:       774780 kB

# ps -e v
  PID TTY      STAT   TIME MAJFL   TRS   DRS   RSS %MEM COMMAND
    1 ?        S      0:05   139    23  1392   480  0.0 init
    2 ?        SW     0:00     0     0     0     0  0.0 [keventd]
    3 ?        SWN    0:01     0     0     0     0  0.0 [ksoftirqd_CPU0]
    4 ?        SWN    0:01     0     0     0     0  0.0 [ksoftirqd_CPU1]
    5 ?        SW     0:08     0     0     0     0  0.0 [kswapd]
    6 ?        SW     0:00     0     0     0     0  0.0 [kreclaimd]
    7 ?        SW     0:00     0     0     0     0  0.0 [bdflush]
    8 ?        SW     0:00     0     0     0     0  0.0 [kupdated]
    9 ?        SW<    0:00     0     0     0     0  0.0 [mdrecoveryd]
   13 ?        SW     0:13     0     0     0     0  0.0 [kjournald]
   88 ?        SW     0:00     0     0     0     0  0.0 [khubd]
  154 ?        SW     0:01     0     0     0     0  0.0 [kjournald]
  428 ?        S      0:00    41    46  1485   504  0.0 /sbin/pump -i et
  453 ?        S      0:00    79    23  1452   644  0.1 syslogd -m 0
  458 ?        S      0:00    46    18  2077   508  0.0 klogd -2
  478 ?        S      0:00    83    25  1538   604  0.1 portmap
  506 ?        S      0:00   110    21  1590   616  0.1 rpc.statd
  631 ?        SL     0:03    24   234  1705  1936  0.3 ntpd -U ntp
  685 ?        S      0:00    20    12  1439   508  0.0 /usr/sbin/atd
  703 ?        S      0:00    32   232  2451   656  0.1 /usr/sbin/sshd
  736 ?        S      0:00   143   133  2138   820  0.1 xinetd -stayaliv
  795 ?        S      0:00    75    18  1573   624  0.1 crond
  843 tty1     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  844 tty2     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  845 tty3     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  846 tty4     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  847 tty5     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  848 tty6     S      0:00   109     6  1381   368  0.0 /sbin/mingetty t
  849 ?        S      2:25   162    10  1429   584  0.1 /opt/sbin/cnm -i
  850 ?        S      1:39   243   484  1747   928  0.1 /bin/bash /opt/s
 1105 ?        SW     0:14     0     0     0     0  0.0 [rpciod]
 1106 ?        SW     0:00     0     0     0     0  0.0 [lockd]
11105 ?        S      0:51   125   149  1794  1072  0.2 /usr/PBS/sbin/pb
24146 ?        S      0:00     9   423  4804  1996  0.3 sendmail: accept
27052 ?        S      0:00     0   400    39   172  0.0 /sbin/dhcpcd -n
27219 ?        S      0:00   289    12  2243  1064  0.2 in.rlogind
27220 pts/0    S      0:00   288    16  2339  1120  0.2 login -- root
27221 pts/0    S      0:00   288   484  2047  1360  0.2 -bash
27314 ?        S      0:00   168     9  1934   680  0.1 sleep 60
27315 pts/0    R      0:00   175    59  2588   716  0.1 ps -e v

# uptime
  2:53pm  up 14 days, 17:03,  1 user,  load average: 0.00, 0.00, 0.00

--
Dr. Josip Loncaric, Research Fellow      mailto:josip at icase.edu
ICASE, Mail Stop 132C                    PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA              Tel. +1 757 864-2192  Fax +1 757 864-6134

--__--__--

Message: 10
Date: Thu, 28 Feb 2002 15:42:09 -0500 (EST)
From: Joshua Baker-LePain <jlb17 at duke.edu>
To: Josip Loncaric <josip at icase.edu>
cc: Beowulf mailing list <beowulf at beowulf.org>
Subject: Re: Linux memory leak?

On Thu, 28 Feb 2002 at 2:58pm, Josip Loncaric wrote

> # free
>              total       used       free     shared    buffers     cached
> Mem:        512444     449196      63248          0      70164      76332
> -/+ buffers/cache:     302700     209744
> Swap:      1060272     285492     774780
>
> If I add up all RSS numbers reported by 'ps -e v' I get only about
> 20,500 KB, and yet this dual CPU system reports 302,700 KB RAM used
> (without even counting buffers or cache). Apparently, only 'reboot' can
> recover the missing 282,200 KB. Any ideas on tracking down where the
> missing memory went?

I've seen this behavior even after very little uptime. All you have to do
is have a process swap heavily. When that process goes away, it seems as if
what's left in swap also stays in memory. Further memory pressure makes
stuff then get paged *out* of swap.

I tracked it down to an existing bugzilla report:

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=59002

There doesn't seem to be an official resolution from RedHat yet. But a
custom compiled 2.4.17 didn't show this behavior.
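
The accounting in the exchange above can be redone mechanically. The following is only a sketch in Python: the helper names (parse_meminfo, unexplained_kb) are made up for illustration, the sample text is copied from the /proc/meminfo listing in the original post, and the 20,500 kB RSS total is the poster's own approximate sum. It computes "used minus buffers and cache" the way free does and subtracts the RSS total, so the remainder can be compared against the SwapCached field.

```python
def parse_meminfo(text):
    """Return {field: kB} for lines shaped like 'Name:  123 kB'.

    Other lines (such as the byte-count table that 2.4 kernels print
    first) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            fields[parts[0][:-1]] = int(parts[1])
    return fields


def unexplained_kb(fields, rss_total_kb):
    """Memory in use beyond buffers, page cache and summed process RSS."""
    used = fields["MemTotal"] - fields["MemFree"]
    used_minus_cache = used - fields["Buffers"] - fields["Cached"]
    return used_minus_cache - rss_total_kb


# Values copied from the post; RSS_TOTAL_KB is the poster's approximate sum.
SAMPLE = """\
MemTotal:       512444 kB
MemFree:         63212 kB
Buffers:         70244 kB
Cached:          76332 kB
SwapCached:     284848 kB
"""
RSS_TOTAL_KB = 20500

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("unexplained kB:", unexplained_kb(info, RSS_TOTAL_KB))
    print("SwapCached kB: ", info["SwapCached"])
```

With the posted figures the unexplained remainder and SwapCached come out within a few MB of each other, which is consistent with the swap-cache explanation given above. That is a reading of the numbers in this thread, not a confirmed diagnosis.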
--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University

--__--__--

Message: 11
Date: Thu, 28 Feb 2002 16:07:25 -0500 (EST)
From: "Robert G. Brown" <rgb at phy.duke.edu>
To: Beowulf Mailing List <beowulf at beowulf.org>
Subject: Motherboard query...

Dear Liststers,

I'd like to request comments on a couple of dual Athlon motherboards. We
are considering both the Tyan Tiger 2466N (760 MPX) and the MSI K7D Master
(MS-6501) (also 760 MPX). Our local vendor "supports" MSI motherboards
(which just means that we deal with them rather than Tyan in the event of a
return, but which makes it reasonable to use the MSI all things being
equal). We are going with 760 MPX to get the 64/66 PCI slots, of course --
we actually have a small stack of 2460 Tigers which are not totally
painless but which we've more or less tamed.

Any experiences yet, good or bad, with either motherboard? The vendor is
probably going to loan us an MSI-based dual to test, but there's nothing
like the experience of somebody actually running a cluster, if there is
anybody out there already doing so.

I'd also like comments on RAID alternatives. We have a group who needs
about 500 GB of RAID. We just got a Promise UltraTrak100 TX8 (IDE-SCSI)
RAID chassis that advertised itself as OS-independent plug and play --
attach to SCSI bus and go. The first unit we were shipped didn't work under
any OS. For the second unit we got the vendor (Megahaus) to verify function
before shipping and it does "work", but it returns unbelievably poor
performance at RAID 5 -- a (very) few MB/sec -- under bonnie. From this we
learned (among many things:-) that vendors often quote performance numbers
on a RAID from its RAID 0 configuration, which would be kind of funny if it
weren't for the murderous impulses it creates when you learn that their
numbers are some sort of cruel joke under RAID 5.

We are twisting Megahaus's arm to take it back and give us our money back
(they are complaining that it is more than thirty days since they delivered
the FIRST unit, but we've only had a working unit for about two weeks and
do not want it if its SCSI performance is that abysmal). We are then stuck
looking for an alternative at roughly the same cost. Our alternatives seem
to be:

a) Another IDE-RAID enclosure, perhaps from a better manufacturer. However,
at this point we're more than a bit concerned about the gap between vendor
performance claims and reality. There are vendors that assert 100 MB/sec
read rates, but we are concerned that they mean "at RAID 0", which is
useless to us. We need real-world loaded numbers at RAID 5 (e.g. multiple
instances of bonnie). Folks we know locally who have e.g. zero-d chassis
report real-world throughput more like 20 MB/sec RW, but their boxes are a
year or two old and may not reflect current rates. 20 MB/sec is pretty much
the LOWEST rate we could tolerate in this application under multithreaded
load, and we'd like something better. Any enclosure/controllers out there
that give good-to-excellent performance that you'd care to recommend?

b) md-raid, either ide or scsi, on a straight linux server. We know that
this works remarkably well. We run md raid in the departmental server
(scsi, with a stack of 36 GB disks in RAID 5) and get excellent performance
-- ~40 MB/sec write throughput and even better for read. Unfortunately
large SCSI disks are still excessively expensive and we don't have the
budget to reach 500 GB with SCSI disks for this cluster. IDE is cheap and
easy, but we would like a bit of assurance that linux won't have (e.g. DMA)
problems when dealing with 6-8 ide controllers on one bus. Is anyone doing
this? Good, bad experiences, hardware recommendations or gotchas all
welcome.

c) SCSI RAID. Definitely works, definitely high performance, but also the
most expensive and again, we won't be able to afford to reach our design
spec with the money allocated to this ($5-6K total).

If we have to fall back to SCSI we will, and will live with a smaller RAID
than we had hoped, but we'd very much like to first find out if IDE-based
RAID solutions (RAID 5 on ~500GB total disk) with >20 MB/sec worst case
write rates under heavy load exist.

TIA,

rgb

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email:rgb at phy.duke.edu

--__--__--

_______________________________________________
Beowulf mailing list
Beowulf at beowulf.org
http://www.beowulf.org/mailman/listinfo/beowulf

End of Beowulf Digest
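
Returning to the memory question at the top of this message: the suggested experiment (grab a lot of memory, write over it repeatedly, and see how long it takes) could be sketched roughly as follows. This is a Python stand-in for the malloc-then-write program, not the program anyone in the thread actually ran. The names touch_pages and time_passes, the 4096-byte page size, and the 64 MiB default are all assumptions chosen for illustration.

```python
import time

PAGE_SIZE = 4096  # assumed page size in bytes (typical on i686)


def touch_pages(buf, page_size=PAGE_SIZE):
    """Write one byte into every page of buf; return the pages touched."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = (buf[offset] + 1) % 256
        touched += 1
    return touched


def time_passes(size_mib, passes=3):
    """Allocate size_mib MiB and time each full write pass in seconds."""
    buf = bytearray(size_mib * 1024 * 1024)
    timings = []
    for _ in range(passes):
        start = time.perf_counter()
        touch_pages(buf)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # 64 MiB is a placeholder; raise it toward the node's RAM size so the
    # kernel actually has to give up its "used" pages.
    for number, seconds in enumerate(time_passes(64), 1):
        print(f"pass {number}: {seconds:.3f} s")
```

The idea would be to run it once shortly after boot and again when "free" shows a lot of "used" memory on an idle node. Similar timings would suggest the pages are given up cheaply; a much slower run, particularly on the first pass, would point toward swapping.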
