mpich device=ch_p4 -comm=shared, P4_GLOBMEMSIZE segfault
Cabaniols, Sebastien
Sebastien.Cabaniols at compaq.com
Tue Feb 20 01:52:36 PST 2001
Hi Beowulf people,
I have several 4-CPU Linux Alpha machines (ES40) and I want to launch
some mpich jobs on them. I only have Fast Ethernet at the moment,
so I need the ch_p4 device to communicate between boxes and -comm=shared
to communicate efficiently within a box.
When I launch my job on one machine only, mpich complains about
the amount of memory allocated in shared memory:
p1_2061: (6.257812) xx_shmalloc: returning NULL; requested 2609648 bytes
p1_2061: (6.257812) p4_shmalloc returning NULL; request = 2609648 bytes
You can increase the amount of memory by setting the environment variable
P4_GLOBMEMSIZE (in bytes)
p1_2061: p4_error: alloc_p4_msg failed: 0
p0_2060: p4_error: interrupt SIGINT: 2
p2_2062: p4_error: interrupt SIGINT: 2
p3_2063: p4_error: interrupt SIGINT: 2
The message says to increase it by setting the P4_GLOBMEMSIZE environment
variable for all the processes involved (so I put it in my .bashrc),
but then my jobs can't start and give me a seg fault:
p2_1214: p4_error: interrupt SIGSEGV: 11
p0_1212: p4_error: interrupt SIGINT: 2
p3_1215: p4_error: interrupt SIGINT: 2
p1_1213: p4_error: interrupt SIGINT: 2
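For reference, setting the variable as the error message suggests could look like the sketch below. It assumes a bash login shell that reads .bashrc on every node; the 16 MB value is a placeholder chosen only for illustration, not the value that produced the seg fault above.

```shell
# Sketch of a .bashrc entry, assuming bash is the login shell on every node.
# The 16 MB size is a hypothetical example value, not a recommendation.
export P4_GLOBMEMSIZE=$((16 * 1024 * 1024))   # size in bytes

# Print the value so it can be checked in the shell that starts the job.
echo "P4_GLOBMEMSIZE=${P4_GLOBMEMSIZE}"
```

The variable has to be visible to the processes mpich starts, which is why it goes in a startup file rather than only in the interactive shell.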
I have also tried changing my system's shared-memory limits and checked them with ipcs -ml:
ipcs -ml
------ Shared Memory Limits --------
max number of segments = 128
max seg size (kbytes) = 1048576
max total shared memory (kbytes) = 16777216
min seg size (bytes) = 1
So I should be able to allocate up to 1 Gbyte in a single shared-memory
segment, when only about 2.6 Mbytes are requested.
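The comparison behind that statement can be written out as a small check. This is only a sketch of the arithmetic, using the request size from the mpich error and the "max seg size" reported by ipcs; the variable names are made up for the example.

```shell
# Values copied from the output above; variable names are illustrative only.
requested_bytes=2609648        # size reported by xx_shmalloc
max_seg_kbytes=1048576         # "max seg size (kbytes)" from ipcs -ml

# Convert the kernel limit from kbytes to bytes before comparing.
max_seg_bytes=$((max_seg_kbytes * 1024))
echo "max segment size: ${max_seg_bytes} bytes"

if [ "$requested_bytes" -le "$max_seg_bytes" ]; then
    echo "request fits within the kernel segment limit"
else
    echo "request exceeds the kernel segment limit"
fi
```

This only compares the request against the kernel limit shown by ipcs; it says nothing about the size of the pool controlled by P4_GLOBMEMSIZE, which is the setting the mpich message refers to.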
Do you have any ideas?
Thanks in advance
Sebastien Cabaniols