[Beowulf] OpenMP on AMD dual core processors
Bill Broadley bill at cse.ucdavis.edu
Fri Nov 21 04:36:43 PST 2008
Fortran isn't one of my better languages, but I did manage to tweak your code
into something that I believe works the same and is OpenMP-friendly.
I put a copy at:
http://cse.ucdavis.edu/bill/OMPdemo.f
When I used the pathscale compiler on your code it said:
"told.f", line 27: Warning: Referenced scalar variable OLD_V is SHARED by default
"told.f", line 29: Warning: Referenced scalar variable DV is SHARED by default
"told.f", line 31: Warning: Referenced scalar variable CONVERGED is SHARED by default
I rewrote your code to get rid of those warnings. I didn't know some of the
constants you mentioned (dy and Ly), so I just wrote my own initialization. I
skipped the boundary conditions by restricting the start and end of the loops.
Your code seemed to be interpolating between the current iteration (i-1 and
j-1) and the last iteration (i+1 and j+1). I'm not sure whether that was
intentional. In any case, I just processed the array v into v2; then, if it
hadn't converged, I processed the v2 array back into v. To make each loop
iteration independent, I made converge a 1-D array that stores the sum of
each row's error. After each array was processed, I walked the 1-D array to
see if we had converged. I exit when all pixels are below the convergence
value.
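
For anyone who wants to see the idea without reading the Fortran, here is a
rough sketch of the two-array scheme in plain Python. It is not a
translation of OMPdemo.f. The four-neighbour average, the use of a per-row
maximum (rather than a sum) as the row's error, and the names sweep and
solve are all assumptions made for illustration.

```python
def sweep(src, dst, row_err):
    """One pass: read only from src, write only to dst.

    Row i touches dst[i] and row_err[i] and nothing else, so rows could be
    handed to different threads without any shared scalar.
    """
    n_rows, n_cols = len(src), len(src[0])
    for i in range(1, n_rows - 1):        # boundary rows are skipped
        worst = 0.0
        for j in range(1, n_cols - 1):    # boundary columns are skipped
            # Assumed stencil: average of the four neighbours.
            new = 0.25 * (src[i - 1][j] + src[i + 1][j]
                          + src[i][j - 1] + src[i][j + 1])
            worst = max(worst, abs(new - src[i][j]))
            dst[i][j] = new
        row_err[i] = worst                # assumed: max change in this row


def solve(v, tol, max_iters=10000):
    """Alternate between two buffers until every row's error is below tol."""
    a = [row[:] for row in v]             # work on copies; input is untouched
    b = [row[:] for row in v]             # second buffer with same boundaries
    row_err = [0.0] * len(v)
    for it in range(1, max_iters + 1):
        sweep(a, b, row_err)
        a, b = b, a                       # newest values are now in a
        if max(row_err) < tol:            # walk the 1-D array after the pass
            return a, it
    return a, max_iters
```

The point of the layout is that nothing inside the row loop is written by
more than one row, which is what removes the SHARED-by-default scalars the
compiler complained about.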
It scales rather well on a dual-socket Barcelona (AMD quad-core) system. My
version iterates a 1000x1000 array with a range of values from 0-200 over
1214 iterations to within a convergence of 0.02.
CPUs   Time    Scaling
======================
1      54.51
2      27.75   1.96 faster
4      14.14   3.85 faster
8       7.75   7.03 faster
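
The scaling column is just the 1-CPU time divided by the N-CPU time. A small
Python sketch of that arithmetic follows; the function name scaling and the
extra efficiency figure (speedup divided by CPU count) are additions for
illustration, not something from the original runs.

```python
def scaling(times):
    """Map CPU count -> (speedup, efficiency) relative to the 1-CPU time.

    times: dict of CPU count -> measured time, in whatever unit was used.
    """
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in sorted(times.items())}


# Timings copied from the table above.
timings = {1: 54.51, 2: 27.75, 4: 14.14, 8: 7.75}
for n, (speedup, eff) in scaling(timings).items():
    print(f"{n} CPUs: {speedup:.2f}x speedup, {eff:.0%} efficiency")
```

By that measure the 8-CPU run is at roughly 88% parallel efficiency.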
Hopefully my code is doing what you intended.
Alas, with gfortran (4.3.1 or 4.3.2), I get a segmentation fault as soon as I
run it. The same happens if I compile with -g and run it under the debugger.
I'm probably doing something stupid.
