bpsh and memory leak - wien - solution

Donald Becker becker at scyld.com
Thu Oct 3 05:39:02 PDT 2002


On Thu, 3 Oct 2002, Florent Calvayrac wrote:

> Donald Becker wrote:
> > OK, my guess is still that some output is consuming memory on the
> > RAMdisk root.
> 
> but why would it grow without bound? Like I said, when the RAMdisk
> is full, it is full! (And it is only 70 MB.)

As the ramdisk fills, it grows.  As there is less available memory, the
kernel gets much less efficient at dealing with new page requests.
This usually happens when you are running with swap, and the non-ramdisk
pages take their turn, one by one, in memory.
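
One way to check whether the root filesystem really is filling up is to
watch its usage while the job runs.  What follows is only a minimal
sketch, not something taken from this thread: the function names
fs_used_pct and warn_if_full and the 90% default threshold are made up
for illustration, and it relies only on the POSIX "df -P" output format.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the thread).

# Print the percentage of space used on the filesystem holding path $1.
# "df -P" keeps each filesystem on a single line, so the second line of
# output carries the numbers and field 5 is the capacity, e.g. "45%".
fs_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report usage of path $1 and flag it when it reaches the limit $2.
# The default of 90 is an arbitrary choice.
warn_if_full() {
    pct=$(fs_used_pct "$1")
    limit=${2:-90}
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $1 is ${pct}% full"
    else
        echo "$1 is ${pct}% full"
    fi
}
```

Calling "warn_if_full /" every few seconds in a loop shows whether usage
climbs as the program runs.  On a compute node the same "df -P /" could
presumably be run through bpsh, assuming df is usable there.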

> Actually, the solution was found this way: following a suggestion
> I found in the archives about exporting slave disks, I tried
> 
> bpsh 0 bash
> 
> then "top" fails miserably... but our program runs fine!

Hmmm, what is the behavior with
   bpsh -n ...

> So I wrote a small shell script with only one line (our program)
> and it's OK. I guess that the program is writing something
> to /dev/null, as was suggested about the NFS problem,
> and that when run on the remote node something about it does not work,
> but locally it's OK. At least that's my guess as to why this hack
> works... but it works. So now, if anyone is interested, I have a few
> patches to the scripts to run Wien97 interactively in parallel on a
> Scyld cluster (actually, Wien is parallelized by files and scripts;
> normally rsh/NFS is needed, not MPI).
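
For anyone who wants to experiment along the same lines, here is a
minimal sketch of a wrapper that runs a command with its standard input
taken from /dev/null.  It is not the poster's actual script, which
contained only the program itself; the explicit stdin redirect is an
addition in the spirit of the "bpsh -n" question above, on the
assumption that -n behaves like the rsh option of the same name.  The
function name run_detached is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper (the name is illustrative, not from the thread).

# run_detached CMD [ARGS...]
# Run CMD with stdin redirected from /dev/null, so it never waits on or
# reads from the caller's terminal or pipe.  stdout and stderr are left
# alone, and the exit status of CMD is passed back to the caller.
run_detached() {
    "$@" < /dev/null
}
```

Inside a wrapper script this would be used as "run_detached ./our_program",
where ./our_program is a placeholder for the real executable.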

Thanks for posting the solution.  On the driver mailing lists there are
often questions but no follow-ups when things work.

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993



