[Beowulf] shared memory error

Greg Lindahl lindahl at pbm.com
Mon Apr 18 22:28:43 PDT 2016


You might want to look at the semop manpage. EINVAL means something
particular. From the looks of it you could add some print statements
for debugging.
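
As a rough sketch of the kind of print statements I mean (this is not the DDI source; debug_semop and semop_errno_hint are made-up names): wrap the semop() call, and on failure dump the semid, the operations, and whether the set still exists. The hints paraphrase the Linux semop(2) page. Your message shows 1 operation and a positive semid, so of the EINVAL cases listed there, that seems to leave "the semaphore set doesn't exist", which the semctl(IPC_STAT) probe below would help confirm.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Short hints paraphrasing the ERRORS section of the Linux semop(2) page. */
const char *semop_errno_hint(int err)
{
    switch (err) {
    case EINVAL: return "semaphore set does not exist, semid < 0, or nsops <= 0";
    case EIDRM:  return "semaphore set was removed";
    case EFBIG:  return "sem_num out of range for this set";
    case E2BIG:  return "nsops larger than the SEMOPM limit";
    case EACCES: return "no permission on the semaphore set";
    case EINTR:  return "interrupted by a signal while blocked";
    default:     return strerror(err);
    }
}

/* Hypothetical wrapper: behaves like semop(), but reports details on failure. */
int debug_semop(int semid, struct sembuf *ops, size_t nops)
{
    assert(ops != NULL);

    int rc = semop(semid, ops, nops);
    if (rc == -1) {
        int saved = errno;  /* later calls may overwrite errno */

        fprintf(stderr, "semop(semid=%d, nsops=%zu) failed: errno=%d: %s\n",
                semid, nops, saved, semop_errno_hint(saved));
        for (size_t i = 0; i < nops; i++)
            fprintf(stderr, "  op[%zu]: sem_num=%u sem_op=%d sem_flg=0x%x\n",
                    i, (unsigned)ops[i].sem_num, (int)ops[i].sem_op,
                    (unsigned)ops[i].sem_flg);

        /* Probe whether the set still exists; the caller has to define
         * the semctl() argument union itself on Linux. */
        struct semid_ds ds;
        union semctl_arg {
            int val;
            struct semid_ds *buf;
            unsigned short *array;
        } arg;
        arg.buf = &ds;
        if (semctl(semid, 0, IPC_STAT, arg) == -1)
            fprintf(stderr, "  semctl(IPC_STAT) failed too: %s\n",
                    strerror(errno));
        else
            fprintf(stderr, "  set exists, %lu semaphore(s)\n",
                    (unsigned long)ds.sem_nsems);

        errno = saved;      /* keep semop()'s errno for the caller */
    }
    return rc;
}
```

If IPC_STAT fails as well, something removed the set while the job was still running, and the next step would be to find out what did that.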

On Mon, Apr 18, 2016 at 11:31:27PM +0100, Jörg Saßmannshausen wrote:
> Hi all,
> 
> sorry for the lack of reply, but I have done some testing and now have some 
> updates.
> 
> I am using GAMESS version 5 DEC 2014 (R1) compiled with gfortran 4.9.2 on 
> Debian Linux Jessie. ATLAS is used for BLAS/LAPACK. All the test jobs have 
> passed.
> 
> What I have noticed is that the problem is not really reproducible. The same 
> input file runs fine on my machine at home (same GAMESS version but 
> gfortran 4.7.2 and Debian Wheezy) but not on the machines at work. To make 
> things more interesting:
> - it might run perfectly OK on one machine but not on another. They are 
> identical nodes with an identical OS. All the nodes are installed 
> from one master image.
> - it might start and generate the error very soon
> - it might run for ages and suddenly generate the error
> - the binary from my home machine does generate the error on the machines at work
> 
> - I am lost in the mist. :-)
> 
> I cannot see a pattern here. I am still wondering whether my shared memory 
> settings are correct, as the only difference I can see between my 
> machine at home (48 GB of RAM) and the machines at work (64 GB of RAM) is the 
> memory. Having said that, as I have less RAM at home and it works there, I would 
> have thought that settings that are OK for less RAM should work with more 
> RAM as well.
> 
> Unfortunately I never got a reply from the GAMESS group, which usually means 
> nobody knows the answer. 
> 
> Any ideas?
> 
> All the best from a cold London
> 
> Jörg
> 
> 
> 
> 
> On Tuesday 05 April 2016 Rafael R. Pappalardo wrote:
> > Could you share with us the input file(s)? Which version of GAMESS-US?
> > 
> > On Monday, 4 April 2016 22:29:11 (CEST) Jörg Saßmannshausen wrote:
> > > Dear all,
> > > 
> > > I was wondering whether somebody might be able to shed some light on this
> > > problem I am having with a chemistry code (GAMESS-US):
> > > 
> > > DDI Process 15: semop return an error performing 1 operation(s) on semid
> > > 98307.
> > > semop errno=EINVAL.
> > > 
> > > This sometimes happens when I need quite a bit of memory for the Fortran
> > > code (1550000000 words). Originally I thought it had to do with the
> > > hardware I am running it on, but since then I have seen it all over the
> > > place, i.e. on some older Opterons and on some newer Ivy Bridge and
> > > Haswell CPUs.
> > > 
> > > It is not quite reproducible, unfortunately. A run might work ok for a
> > > few days and then the problem kicks in and the logfile explodes from
> > > around 14 MB to 17 GB, or it might just work.
> > > 
> > > Some system information: I am running Debian Jessie with gcc / gfortran
> > > version 4.9.2-10. The nodes have 64 GB of RAM and 16 or 20 cores. As the
> > > shared memory default settings in Linux are not suitable for GAMESS
> > > (there is a note in the documentation), I am using these settings on the
> > > 64 GB RAM machines:
> > > 
> > > kernel.shmmax = 6923000000
> > > kernel.shmall = 25165824
> > > kernel.shmmni = 32768
> > > 
> > > I have the feeling the problem lies buried in these settings, but my
> > > knowledge here is not sufficient to solve the problem. Could somebody
> > > point me in the right direction here?
> > > 
> > > All the best from London
> > > 
> > > Jörg
> 
> 
> -- 
> *************************************************************
> Dr. Jörg Saßmannshausen, MRSC
> University College London
> Department of Chemistry
> 20 Gordon Street
> London
> WC1H 0AJ 
> 
> email: j.sassmannshausen at ucl.ac.uk
> web: http://sassy.formativ.net
> 
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
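
On the quoted sysctl values: kernel.shmmax is in bytes, while kernel.shmall is counted in pages, so the two are easy to misread side by side. A small sketch that puts them in the same unit follows. It assumes 4096-byte pages and 8-byte GAMESS words, neither of which is stated in the thread, and the function names are made up. Under those assumptions shmall comes to 96 GiB, and 1550000000 words come to 12.4 GB, which is more than the 6923000000-byte shmmax. Whether that request actually lands in a single System V segment depends on how GAMESS splits its memory, which the thread does not say, so treat this only as something to check.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* kernel.shmall is a page count; convert it to bytes. */
uint64_t shmall_bytes(uint64_t shmall_pages, uint64_t page_size)
{
    assert(page_size > 0);
    return shmall_pages * page_size;
}

/* Assumption: one GAMESS word is 8 bytes (64-bit). */
uint64_t words_to_bytes(uint64_t words)
{
    return words * 8u;
}

/* Read one unsigned integer from a /proc/sys file; 0 on success, -1 on error. */
int read_sysctl_u64(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    unsigned long long v = 0;
    int ok = (fscanf(f, "%llu", &v) == 1);
    fclose(f);
    if (!ok)
        return -1;

    *out = (uint64_t)v;
    return 0;
}

/* Print the live limits next to a request given in words. */
void report_shm_limits(uint64_t words)
{
    uint64_t shmmax = 0, shmall = 0;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 ||
        read_sysctl_u64("/proc/sys/kernel/shmmax", &shmmax) != 0 ||
        read_sysctl_u64("/proc/sys/kernel/shmall", &shmall) != 0) {
        fprintf(stderr, "could not read shared memory limits\n");
        return;
    }

    printf("shmmax  = %" PRIu64 " bytes (largest single segment)\n", shmmax);
    printf("shmall  = %" PRIu64 " bytes (total, from %" PRIu64 " pages)\n",
           shmall_bytes(shmall, (uint64_t)page), shmall);
    printf("request = %" PRIu64 " bytes (%" PRIu64 " words)\n",
           words_to_bytes(words), words);
}
```

Calling report_shm_limits(1550000000) on one of the work nodes and on the home machine would show whether the limits really are the same on both.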


