[Beowulf] Running Different master and slave executables under MPI
J.Wood
j.wood at qmul.ac.uk
Sun May 1 04:12:21 PDT 2005
Hello,

When I ran MPI on a cluster of SGI workstations, I was able to load one executable onto the master processor and a different executable onto the slave processors. I want to do something similar on my College Cluster parallel machine, a stand-alone parallel computer with about 150 individual processors (cn1, cn2, ..., cn100, etc.). 'fe03.esc.qmul.ac.uk' below is the name of the cluster's front-end node.

Below, I attach the QSUB file and the myprocgroup file I use, together with the error message from the system. Can you help?
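What I am trying to set up is, roughly, a procgroup of the following form (a minimal sketch; the hostnames and paths here are placeholders, not my real files). The first line names the host where the master binary starts, with 0 additional local processes; each subsequent line starts one slave process running a different binary:

    master-host 0 /full/path/to/master_executable
    slave-host  1 /full/path/to/slave_executable
    slave-host  1 /full/path/to/slave_executable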
Best Regards,
Jim Wood
The QSUB file:

#!/bin/sh
#specify the number of nodes requested and the
# number of processors per node.
#PBS -l nodes=3:ppn=2,cput=5:00:00 -W x=\"NACCESSPOLICY:SINGLEJOB\"
NPROC=`wc -l < $PBS_NODEFILE`
echo "Allocated nodes are:"
cat $PBS_NODEFILE
echo "NUM PROC is: $NPROC"
cd /home/hep/wood/wmpi
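# Launch with MPICH's ch_p4 device; the procgroup file 'myprocgroup' lists which executable runs on which host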
mpirun -p4pg myprocgroup nlb.m.cryp.mod
The myprocgroup file:

fe03.esc.qmul.ac.uk 0 /home/hep/wood/wmpi/nlb.m.cryp.mod
fe03.esc.qmul.ac.uk 1 /home/hep/wood/wmpi/nlb.s.cryp.mod
fe03.esc.qmul.ac.uk 1 /home/hep/wood/wmpi/nlb.s.cryp.mod
fe03.esc.qmul.ac.uk 1 /home/hep/wood/wmpi/nlb.s.cryp.mod
fe03.esc.qmul.ac.uk 1 /home/hep/wood/wmpi/nlb.s.cryp.mod
fe03.esc.qmul.ac.uk 1 /home/hep/wood/wmpi/nlb.s.cryp.mod
The output and error message:

Allocated nodes are:
cn119
cn119
cn118
cn118
cn117
cn117
NUM PROC is: 6
rm_6571: p4_error: rm_start: net_conn_to_listener failed: 32877
p0_9640: p4_error: Child process exited while making connection to remote process on fe03.esc.qmul.ac.uk: 0
p0_9640: (18.562668) net_send: could not write to fd=4, errno = 32
P4 procgroup file is myprocgroup.