[Beowulf] g03 and centos 3.6

Mike Davis jmdavis1 at vcu.edu
Tue Jun 13 15:04:22 PDT 2006


Joe et al.

I'm not sure how many of you will be interested, but I think that the 
g03/Centos 3.6 issue is resolved.

In talking to David Hibbs in Australia (who had experienced a similar 
issue), I found out that the issue was likely a compiler bug. David had 
solved the problem by compiling with a pgi5 version.

Joe L. was very helpful today as well. I decided to try the compile with 
-O2 and -Mnosse. Sadly that resulted in a threads error. But it was 
definitely worth a try.

Final success came with an install of the pgi6.1 compilers and making 
sure that ulimit -s was set to unlimited.

So, CentOS 3.6 will run g03 D02 on the Opterons when compiled with pgi6.1. 
No special settings were necessary in my case.
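
For anyone who wants to check the stack limit before launching a job, here 
is a minimal sketch in Python. It only reads (and optionally raises) the 
limit that "ulimit -s" reports. The function names are my own for 
illustration and are not part of g03 or the PGI tools, and I am assuming a 
Linux host where the standard "resource" module is available.

```python
import resource


def format_limit(value):
    """Render a byte limit the way `ulimit -s` does: kB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kB"


def stack_limits():
    """Return the (soft, hard) stack size limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_STACK)


def raise_soft_stack_limit():
    """Raise the soft stack limit to the hard limit.

    This matches `ulimit -s unlimited` only when the hard limit is itself
    unlimited. Child processes started afterwards inherit the new limit.
    """
    _, hard = stack_limits()
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return stack_limits()[0]


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("stack soft limit:", format_limit(soft))
    print("stack hard limit:", format_limit(hard))
    if soft != resource.RLIM_INFINITY:
        # Hypothetical warning text; the working run above used an unlimited stack.
        print("note: stack is limited; consider 'ulimit -s unlimited' before running")
```

Run it in the same shell (or batch script) that starts g03, since the limit 
is per process and inherited by children.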


Mike Davis





Joe Landman wrote:
> Hi Mike:
> 
>   You hit a segfault.  What options did you use with G03?  On the last
> build we worked on with a customer, we had to restrict it to -O2
> -Mnosse or something like this.  Try to compile/link with -g to see if
> you can include symbols, and force a core dump so you can work backwards
> from there.
> 
>   Also, what does your "ulimit -a" report (or "limit" for the tcsh folks)?
> 
> Joe
> 
> Mike Davis wrote:
> 
>>Hey Everyone,
>>
>>I asked back in January about G03 running on platforms other than SuSE.
>>Thanks to everyone who responded. Now I'm trying to compile G03 D02 with
>>pgi 6 on Centos 3.5.
>>
>>The good news is that it compiles. It generates the needed 80 .exe
>>files. G03 generates output.
>>
>>The bad news is that every run fails at the same point.
>>
>>
>>IExCor=   0 DFT=F Ex=HF Corr=None ExCW=0 ScaHFX=  1.000000
>> ScaDFX=  1.000000  1.000000  1.000000  1.000000
>> IRadAn=      0 IRanWt=     -1 IRanGd=            0 ICorTp=0
>> NAtoms=   40 NActive=   40 NUniq=   40 SFac= 7.50D-01 NAtFMM=   80 NAOKFM=F Big=F
>> Leave Link  301 at Thu Jun  8 15:06:59 2006, MaxMem=   20000000 cpu:     0.0
>> (Enter /usr/global/g03/l302.exe)
>> NPDir=0 NMtPBC=     1 NCelOv=     1 NCel=       1 NClECP=     1 NCelD=     1
>>         NCelK=      1 NCelE2=     1 NClLst=     1 CellRange=     0.0.
>> One-electron integrals computed using PRISM.
>>
>>
>>
>>After that nothing... Below is an strace.
>>
>>
>>strace:
>>
>>[pid 10461] write(1, " One-electron integrals computed"..., 46) = 46
>>[pid 10461] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>>Process 10460 resumed
>>Process 10461 detached
>><... wait4 resumed> [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV}], 0, NULL) = 10461
>>rt_sigaction(SIGINT, {SIG_DFL}, NULL, 8) = 0
>>rt_sigaction(SIGQUIT, {SIG_DFL}, NULL, 8) = 0
>>rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
>>--- SIGCHLD (Child exited) @ 0 (0) ---
>>stat("/home/jmdavis/Gau-10460.inp", {st_mode=S_IFREG|0664, st_size=3980, ...}) = 0
>>unlink("/home/jmdavis/Gau-10460.inp")   = 0
>>munmap(0x2a9566d000, 4096)              = 0
>>exit_group(1)                           = ?
>>
>>David Hibbs in Australia was having a similar problem but I can't find
>>any answers to his query.
>>
>>
>>Mike Davis
> 
> 



