[Beowulf] Running MPICH on an AMD Opteron dual-core cluster (72 CPUs)
Vadivelan Rathinasabapathy r.vadivelanrhce at gmail.com
Fri Dec 29 02:26:55 PST 2006
Dear all,
We have a problem running applications that are compiled with MPICH. Our
setup is a 16-node, 72-CPU AMD Opteron cluster with Rocks 4.1.2 and
RHEL 4.0 Update 4 installed.
We are trying to run a benchmark with the MPICH that came with the
Rocks installation. The run starts, and then the following error occurs after
some time:
p1_8544: p4_error: Timeout in Establishing connection to remote process: 0
rm_l_1_8667: (359.417969) net_send: could not write to fd=5, errno=104
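A side note on reading the second message: on Linux, errno 104 is ECONNRESET ("Connection reset by peer"), meaning the other end of the socket closed the connection. The short Python sketch below pulls the errno field out of such a line and looks up its symbolic name on the machine where it is run. The helper name `decode_errno` is only an illustration, not part of MPICH, and the mapping depends on the operating system, so it should be run on one of the cluster nodes.

```python
import errno
import os
import re

# The log line reported above, copied as-is.
LOG_LINE = "rm_l_1_8667: (359.417969) net_send: could not write to fd=5, errno=104"


def decode_errno(line):
    """Return (number, symbolic name, message) for an 'errno=N' field, or None.

    Hypothetical helper for illustration; the name lookup uses the errno
    table of the platform this script runs on.
    """
    match = re.search(r"errno=(\d+)", line)
    if match is None:
        return None
    number = int(match.group(1))
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


if __name__ == "__main__":
    print(decode_errno(LOG_LINE))
```

This only translates the error number; it does not say why the remote process dropped the connection.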
We have been trying this for the past two days and have not found a
solution.
We also downloaded the latest MPICH, 1.2.7p1, and configured it. With the
same test on this build, the code seems to run only on the
master server, no matter how many processors we request.
The same test works fine with LAM/MPI and Open MPI. Please suggest a
solution.
--
Thanks and Regards
R.Vadivelan
CMC Ltd,
Bangalore
r.vadivelanrhce at gmail.com
