FW: MPICH, malloc, and my impending assault of one (1) beowulf cluster

Chris Richard Adams chrisa at ASPATECH.COM.BR
Thu Aug 2 09:39:54 PDT 2001



-----Original Message-----
From: Chris Richard Adams 
Sent: Thursday, August 02, 2001 12:38 PM
To: 'mundy erik'
Subject: RE: MPICH, malloc, and my impending assault of one (1) beowulf cluster


Hi Erik - 

Admitting your abusiveness is the first and hardest step toward
healing... congratulations! I don't have an answer to your problem, but
perhaps I will run into it soon. Unfortunately I am just trying to get
the pi code to run.  If I may ask...

1) Did you use the example that comes with the beowulf install - found
in /usr/mpi-beowulf/examples? That directory has a pi example, but I
can't get the Makefile to work.  Did you succeed? Can I just run MpiCC -o
pisamp cpi.c?

2) Do you know of any documentation for those examples, or could you
perhaps share a reference to an MPI example?  I know C, but nothing
about MPI, and just want some hello world examples.

Thanks,
Chris

-----Original Message-----
From: mundy erik [mailto:erik.mundy at HAMPTONU.EDU]
Sent: Wednesday, July 18, 2001 4:35 PM
To: 'beowulf at beowulf.org'
Subject: MPICH, malloc, and my impending assault of one (1) beowulf cluster


Hello, my name is Erik, and I am an MPICH abuser.

	I am running a simple one master, two slave Beowulf test cluster,
RHL 6.1, kernel 2.4.4, MPICH 1.2.1, NFS mount from master to slave on
old PII 400's.  MPICH is giving me some serious headaches - every MPI
program I execute with a malloc in it crashes with the good old
"p4 error: interrupt SIGSEGV: 11" message.  I have been experimenting
with the test programs that come with MPICH for simplicity; for example,
'cpi' runs well on all three computers.  It calculates pi, and I
rejoice.  Mpptest also works without a problem between any two of the
three computers.  But when I try to mpirun "sendrecv" or "overtake"
from examples/test/pt2pt (both of which use a malloc), MPICH gives it
the good old college try and then throws me the errors.  Normally I
would just try to do as much as humanly possible to ignore this
problem, but the code that this beowulf was designed for works when I
execute it on one computer, and crashes rather spectacularly with the
segmentation violation error when I try to mpirun it, even on just one
computer, leading me to think that there is some sort of conflict
between MPICH and malloc.
	Granted, these computers aren't exactly state-of-the-art - each has
only 128M ram with ~400M swap.  But that should be more than enough to
execute those simple examples.  Has anyone had trouble with the Linux
version of malloc in the past in a situation like this?  If you shudder
when you hear the words "malloc" and "MPICH" used in the same sentence,
please email me back.  This might be a bit difficult to track down, and
I'm really not the best man for the job; all I did was build a
beowulf :).  I've only been on this list for the last two months, but
it's taught me that if anyone can help, it's probably you guys.  I am
EXTREMELY appreciative of any assistance you can offer.

Thanks,

Erik 
erik.mundy at hamptonu.edu

PS - also, I should mention that yes, the code I am trying to run WAS
designed for use with MPI, and yes, I did patch MPICH with the bug fixes
from the Argonne page.  Sorry to take the obvious 'he's so dumb!'
solutions away... I'm hoping there's one more that maybe I'm just
missing :)

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf