[Beowulf] shared memory error

John Hearns John.Hearns at xma.co.uk
Tue Apr 5 02:29:31 PDT 2016


Perhaps a stupid reply...

Can you try running ipcs while the code is running to see which shared memory segments and semaphores are in use?
It won't solve your problem but it might shed some light on what is going on.
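
If it is easier to log this from a script while the job runs, a rough Python sketch along these lines reads the same information from /proc/sysvipc/sem (Linux only). The function names here are made up for illustration, and the check for semid 98307 simply reuses the number from the error message quoted below.

```python
def parse_sysvipc(text):
    """Parse the text of /proc/sysvipc/sem (or .../shm) into a list of dicts.

    The first non-empty line is the column header; every following line is
    one IPC object. Values are kept as strings exactly as the kernel prints them.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def semid_exists(text, semid):
    """Return True if a semaphore set with this semid appears in the table."""
    return any(int(row["semid"]) == semid for row in parse_sysvipc(text))


def read_sem_table(path="/proc/sysvipc/sem"):
    """Read the kernel's current semaphore table (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

Calling semid_exists(read_sem_table(), 98307) periodically during a run would show whether the semaphore set named in the error is still present at the time the failures start.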



-----Original Message-----
From: Beowulf [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jörg Saßmannshausen
Sent: 04 April 2016 22:29
To: Beowulf Mailinglist <beowulf at beowulf.org>
Subject: [Beowulf] shared memory error

Dear all,

I was wondering whether somebody might be able to shed some light on this problem I am having with a chemistry code (GAMESS-US):

DDI Process 15: semop return an error performing 1 operation(s) on semid 98307.
semop errno=EINVAL.

This sometimes happens when I need quite a bit of memory for the Fortran code (1550000000 words). Originally I thought it had to do with the hardware I am running it on, but I have since seen it all over the place, i.e. on some older Opterons and on some newer Ivy Bridge and Haswell CPUs.

It is not quite reproducible, unfortunately. A run might work ok for a few days and then the problem kicks in and the logfile explodes from around 14 MB to 17 GB, or it might just work.

Some system information: I am running Debian Jessie with gcc / gfortran version 4.9.2-10. The nodes have 64 GB of RAM and 16 or 20 cores. As the default shared memory settings in Linux are not suitable for GAMESS (there is a note in the documentation), I am using these settings on the 64 GB RAM machines:

kernel.shmmax = 6923000000
kernel.shmall = 25165824
kernel.shmmni = 32768
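
For reference, kernel.shmmax is the maximum size of a single segment in bytes, while kernel.shmall is the system-wide total in pages, so the two numbers are not in the same unit. The short Python sketch below converts them for comparison. It assumes 4 KiB pages (check with getconf PAGE_SIZE) and 8 bytes per word. Whether the 1550000000 words are actually requested as a single shared segment is not something I know, so the last comparison is only indicative.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages, the usual x86-64 default
WORD_SIZE = 8      # assumption: one word is 8 bytes

shmmax_bytes = 6923000000    # kernel.shmmax: max size of one segment, in bytes
shmall_pages = 25165824      # kernel.shmall: total shared memory, in pages
request_words = 1550000000   # memory requested for the Fortran code, in words

shmall_bytes = shmall_pages * PAGE_SIZE
request_bytes = request_words * WORD_SIZE


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


print(f"shmmax  : {gib(shmmax_bytes):6.2f} GiB per segment")
print(f"shmall  : {gib(shmall_bytes):6.2f} GiB in total")
print(f"request : {gib(request_bytes):6.2f} GiB")
print("request larger than shmmax:", request_bytes > shmmax_bytes)
```

Under these assumptions shmall works out to 96 GiB, which is more than the 64 GB of RAM in the nodes, while shmmax is about 6.4 GiB per segment.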

I have the feeling the problem lies buried in these settings, but my knowledge here is not sufficient to solve it. Could somebody point me in the right direction?

All the best from London

Jörg

--
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
20 Gordon Street
London
WC1H 0AJ

email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html