[Beowulf] OpenMP on AMD dual core processors
Nathan Moore
ntmoore at gmail.com
Fri Nov 21 07:38:29 PST 2008
Thanks a ton for the worked-out example!
I had a similar problem with gfortran, and it only appeared with large array
sizes (bigger than 4000x4000, as I recall). "ulimit" was no help, so I assume
there's a memory constraint built in somewhere. (As an aside, I once ran
into a similar problem with Perl: the release on Linux would only allow
200MB array sizes, but the version available on a Sun machine would allow
gigabytes.)
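
For scale, here is a rough back-of-the-envelope calculation of how much memory
square arrays of these sizes need. The element widths (4 and 8 bytes) are
assumptions, since neither message says which real kind was used, and the
helper name array_bytes is made up for this note. Nothing in the thread
establishes which limit is actually being hit; this only gives numbers to
compare against whatever limit turns out to matter (for instance, the soft
stack limit on Linux is often 8 MiB).

```python
# Rough memory footprint of an N x N array.
# Assumption: elements are 4 bytes (single precision) or 8 bytes (double).

MIB = 1024 * 1024


def array_bytes(rows, cols, bytes_per_element):
    """Return the raw storage needed for a rows x cols array, in bytes."""
    return rows * cols * bytes_per_element


if __name__ == "__main__":
    for n in (1000, 4000):
        for width in (4, 8):
            size = array_bytes(n, n, width)
            print(f"{n}x{n}, {width}-byte elements: {size / MIB:.1f} MiB")
```

If the arrays are local or automatic variables, it may be worth checking where
the compiler places them, but that is a guess rather than a diagnosis.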
On Fri, Nov 21, 2008 at 6:36 AM, Bill Broadley <bill at cse.ucdavis.edu> wrote:
> Fortran isn't one of my better languages, but I did manage to tweak your code
> into something that I believe works the same and is OpenMP friendly.
>
> I put a copy at:
> http://cse.ucdavis.edu/bill/OMPdemo.f
>
> When I used the PathScale compiler on your code it said:
> "told.f", line 27: Warning: Referenced scalar variable OLD_V is SHARED by default
> "told.f", line 29: Warning: Referenced scalar variable DV is SHARED by default
> "told.f", line 31: Warning: Referenced scalar variable CONVERGED is SHARED by default
>
> I rewrote your code to get rid of those. I didn't know some of the constants
> you mentioned (dy and Ly), so I just wrote my own initialization. I skipped
> the boundary conditions by just restricting the start and end of the loops.
>
> Your code seemed to be interpolating between the current iteration (i-1 and
> j-1) and the last iteration (i+1 and j+1). Not sure if that was intentional
> or not. In any case, I just processed the array v into v2; then, if it didn't
> converge, I processed the v2 array back into v. To make each loop iteration
> independent, I made converge a 1D array which stored the sum of that row's
> error. Then, after each array was processed, I walked the 1D array to see if
> we had converged. I exit when all pixels are below the convergence value.
>
> It scales rather well on a dual-socket Barcelona (AMD quad core). My version
> iterates a 1000x1000 array with a range of values from 0-200 over 1214
> iterations to within a convergence of 0.02.
>
> CPUs   time   Scaling
> =====================
>    1   54.51
>    2   27.75   1.96 faster
>    4   14.14   3.85 faster
>    8    7.75   7.03 faster
>
> Hopefully my code is doing what you intended.
>
> Alas, with gfortran (4.3.1 or 4.3.2), I get a segmentation fault as soon as I
> run. Same if I compile with -g and run it under the debugger. I'm probably
> doing something stupid.
>
>
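
For anyone following along without the Fortran source, the v/v2 scheme
described in the quoted message can be sketched roughly as below. This is not
the code at the URL above, just my reading of the description, written in
Python for brevity. Assumptions: a four-neighbor average as the update rule,
and a per-row maximum change as the error measure (the message describes a
per-row sum, so this is a variation that matches "all pixels below the
convergence value"). The names sweep, relax, and row_err are made up here.

```python
# Two-buffer (Jacobi-style) relaxation sketch.
# Assumptions: 4-neighbor averaging, per-row *maximum* change as the error.


def sweep(src, dst, row_err):
    """Read only from src, write only to dst, record each row's worst change.

    Each row i writes only dst[i] and row_err[i], so the outer loop carries
    no dependence between rows; it is the loop one would hand to OpenMP.
    """
    n_rows, n_cols = len(src), len(src[0])
    for i in range(1, n_rows - 1):  # skip boundary rows
        worst = 0.0  # private to this row
        for j in range(1, n_cols - 1):  # skip boundary columns
            new = 0.25 * (src[i - 1][j] + src[i + 1][j]
                          + src[i][j - 1] + src[i][j + 1])
            worst = max(worst, abs(new - src[i][j]))
            dst[i][j] = new
        row_err[i] = worst


def relax(v, tol, max_iters=100000):
    """Alternate v -> v2 and v2 -> v until every point changes by < tol.

    Note: v is reused as one of the two buffers, so it is modified.
    Returns (latest grid, number of sweeps performed).
    """
    v2 = [row[:] for row in v]  # copy, so boundaries match in both buffers
    row_err = [0.0] * len(v)
    src, dst = v, v2
    for it in range(1, max_iters + 1):
        sweep(src, dst, row_err)
        src, dst = dst, src  # the buffer just written becomes the source
        if max(row_err) < tol:  # walk the 1D array after the sweep
            return src, it
    return src, max_iters


if __name__ == "__main__":
    grid = [[0.0] * 6 for _ in range(6)]
    grid[0] = [200.0] * 6  # hot top edge, other edges held at 0
    result, iters = relax(grid, 0.02)
    print("sweeps:", iters)
```

Because each sweep reads from one array and writes to the other, no row ever
sees a half-updated neighbor, which is what removes the shared-scalar warnings
quoted above.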
--
- - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Assistant Professor, Physics
Winona State University
AIM: nmoorewsu
- - - - - - - - - - - - - - - - - - - - -