MPIrun errors
Joseph E. Rose, Jr.
jerosejr at cajunbro.com
Mon Jul 31 14:58:08 PDT 2000
I get the message below whenever I run the program on nodes across the
network, even after resetting the switch and the nodes. The program runs fine
on 1 or 2 processors (no network involved; dual-processor motherboard).
(master-node: root):/usr/local/mpi/bin/mpirun -v -machinefile nodes -np 4 fdtdrfhat < man3mm.dat
running /home/kevinj/d.Fortran/d.compile/d.execute/fdtdrfhat on 4 LINUX ch_p4 processors
Created /home/kevinj/d.Fortran/d.compile/d.execute/PI1995
rm_1_506: (0.343660) process not in process table; my_unix_id = 506 my_host=node
rm_1_506: (0.343828) Probable cause: local slave on uniprocessor without shared memory
rm_1_506: (0.343848) Probable fix: ensure only one process on node
rm_1_506: (0.343863) (on master process this means 'local 0' in the procgroup file)
rm_1_506: (0.343876) You can also remake p4 with SYSV_IPC set in the OPTIONS file
rm_1_506: (0.343889) Alternate cause: Using localhost as a machine name in the progroup
rm_1_506: (0.343904) file. The names used should match the external network names.
rm_1_506: p4_error: p4_get_my_id_from_proc: 0
rm_l_1_507: p4_error: interrupt SIGINT: 2
p0_2124: p4_error: net_recv read: probable EOF on socket: 118457
rm_l_2_507: p4_error: interrupt SIGINT: 2
rm_l_3_502: p4_error: interrupt SIGINT: 2
bm_list_2125: p4_error: interrupt SIGINT: 2
rm_506: p4_error: interrupt SIGINT: 2
p3_501: p4_error: interrupt SIGINT: 2
etc., etc., etc.
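
The hints in the output above name two things that can be checked in the host list: a loopback name such as localhost used as a machine name, and more than one process landing on a single node. The script below is a minimal sketch of such a check, not part of MPICH. The function names are hypothetical, and it assumes the machinefile has one host per line, optionally written as host:n, with "#" starting a comment. It only inspects names and cannot tell whether a node really is a uniprocessor.

```python
# Sketch: flag machinefile entries that the p4 hints warn about.
# Assumption: one host per line, optionally "host:n"; "#" starts a comment.
from collections import Counter

# Names that refer to the local loopback rather than an external network name.
LOOPBACK_NAMES = {"localhost", "localhost.localdomain", "127.0.0.1"}


def parse_machinefile(text):
    """Return the host names in order, skipping blanks, comments and ':n' suffixes."""
    hosts = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        hosts.append(line.split(":", 1)[0].strip())
    return hosts


def check_hosts(hosts):
    """Return warnings for loopback names and for hosts listed more than once."""
    warnings = []
    for host in hosts:
        if host.lower() in LOOPBACK_NAMES:
            warnings.append(f"{host}: loopback name; use the external network name")
    for host, count in Counter(hosts).items():
        if count > 1:
            warnings.append(f"{host}: listed {count} times; check it has enough CPUs")
    return warnings
```

To use it, read the file passed to -machinefile (here "nodes"), pass its text to parse_machinefile, and print whatever check_hosts returns. An empty result only means that neither pattern was found in the names.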
Thanks in advance,
Joe
Joseph E. Rose, Jr.
BRISK System Analyst
Illgen Simulation Technologies, Inc.
(210) 348-6886/ Cell: (210) 710-9060
San Antonio, Texas, U.S.A
"What you can do, or dream you can, begin it;
Boldness has genius, power, and magic in it."
--Goethe