[Beowulf] problem of mpich-1.2.7p1

Gus Correa gus at ldeo.columbia.edu
Tue Feb 2 17:58:01 PST 2010


PS - And don't run the programs as root!

Gus Correa

Gus Correa wrote:
> Hi Christian
> 
> Somehow your program was not attached to the message.
> 
> In any case, you didn't say anything about your "machinefile" contents.
> You need to list the nodes you want to use there.
> The command line will be something like this:
> 
> mpirun -np 4 -machinefile my_machinefile canon
> 
> "man mpirun" may help you with the details.
> (I assume you are using the mpirun that comes with MPICH-1.)
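
To make the machinefile step concrete, here is a minimal sketch. It assumes the node names cluster1 to cluster4 that tstmachines reported, and that the "canon" executable sits in the current directory on the NFS-shared path; adjust both to match your setup. The guard around mpirun is only there so the script does nothing harmful on a machine where MPICH or the program is missing.

```shell
#!/bin/sh
# Write a machinefile with one node name per line.
# The host names are taken from the tstmachines output in the original post.
cat > my_machinefile <<'EOF'
cluster1
cluster2
cluster3
cluster4
EOF

# Launch 4 processes across the listed nodes, as a regular user (not root).
# Assumption: "canon" is an executable in the current directory.
if command -v mpirun >/dev/null 2>&1 && [ -x ./canon ]; then
    mpirun -np 4 -machinefile my_machinefile ./canon
else
    echo "mpirun or ./canon not found; machinefile written only"
fi
```

Each process should then report the node it runs on, so you can check the placement against the machinefile.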
> 
> Having said that, I suggest that you move from MPICH-1 to
> OpenMPI or to MPICH2.
> MPICH-1 (mpich-1.2.7p1) is old, no longer maintained or supported,
> and often breaks on current Linux kernels.
> The MPICH developers also recommend upgrading to MPICH2.
> 
> OpenMPI and MPICH2 are free, easy to install, stable, up to date,
> and more efficient than MPICH-1.
> Upgrading to one of them is likely to save you trouble later,
> especially with your tight deadline.
> 
> See:
> http://www.open-mpi.org/
> http://www.mcs.anl.gov/research/projects/mpich2/
> 
> 
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
> 
> 
> christian suhendra wrote:
>> Hello guys,
>> I have installed mpich-1.2.7p1 on Ubuntu 9.04 and configured NFS and RSH.
>> I use device=ch_p4,
>> but when I run my program it does not seem to work. I get this result:
>> root@cluster3:/mirror/mpich-1.2.7p1# mpirun -np 1 canon
>> Process 0 of 1 on cluster3
>> Total Time: 4.316000 msecs
>> root@cluster3:/mirror/mpich-1.2.7p1# mpirun -np 4 canon
>> Process 0 of 4 on cluster3
>> Total Time: 21.552000 msecs
>> Process 2 of 4 on cluster2
>> Process 1 of 4 on cluster1
>> Process 3 of 4 on cluster1
>> root@cluster3:/mirror/mpich-1.2.7p1#
>>
>> The process only works on 1 node,
>> but when I test the machines, it connects to all nodes:
>> root@cluster3:/mirror/mpich-1.2.7p1# /mirror/mpich-1.2.7p1/sbin/tstmachines -v LINUX
>> Trying true on cluster1 ...
>> Trying true on cluster2 ...
>> Trying true on cluster3 ...
>> Trying true on cluster4 ...
>> Trying ls on cluster1 ...
>> Trying ls on cluster2 ...
>> Trying ls on cluster3 ...
>> Trying ls on cluster4 ...
>> Trying user program on cluster1 ...
>> Trying user program on cluster2 ...
>> Trying user program on cluster3 ...
>> Trying user program on cluster4 ...
>>
>> I don't know exactly where the problem is that keeps my program from
>> running on all nodes.
>> Please help me.
>> My deadline is about 1 week away.
>> I am very much expecting your help.
>>
>>
>> I attached my program listing so you can test it on your system.
>> Thank you very much.
>>
>>
>>
>>
>> regards
>> christian
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit 
>> http://www.beowulf.org/mailman/listinfo/beowulf
> 