[Beowulf] Re: GPU Beowulf Clusters
Micha
michf at post.tau.ac.il
Mon Feb 1 15:56:44 PST 2010
On 01/02/2010 22:54, richard.walsh at comcast.net wrote:
>
> Jon Forrest <jlforrest at berkeley.edu> wrote:
>
> >On 2/1/2010 7:24 AM, richard.walsh at comcast.net wrote:
> >
> >> Coming in on this late, but to reduce this work load there is PGI's version
> >> 10.0 compiler suite which supports accelerator compiler directives. This
> >> will reduce the coding effort, but probably suffer from the classical
> >> "if it is easy, it won't perform as well" trade-off. My experience is
> >> limited, but a nice intro can be found at:
> >
> >I'm not sure how much traction such a thing will get.
> >Let's say you have a big Fortran program that you want
> >to port to CUDA. Let's assume you already know where the
> >program spends its time, so you know which routines
> >are good candidates for running on the GPU.
> >
> >Rather than rewriting the whole program in C[++],
> >wouldn't it be easiest to leave all the non-CUDA
> >parts of the program in Fortran, and then to call
> >CUDA routines written in C[++]. Since the CUDA
> >routines will have to be rewritten anyway, why
> >write them in a language which would require
> >purchasing yet another compiler?
>
> Mmm ... not sure I understand the response, but perhaps this response
> was to a different message ... ?? In any case, the PGI software supports
> accelerator directives for both C and Fortran, so for those languages I do
> not see a need to rewrite whole applications. The question presented is
> the same as always, what does the performance-programming effort function
> look like and how well does your code perform with directives to start
> with. The PGI model is also hardware generic and the code runs on
> the CPU in parallel when there is no GPU around I believe. What will
> gate interest is how well PGI compiler group does at delivering performance
> and how important portability is to the person developing the code.
>
As far as I know, PGI also has a CUDA Fortran, similar to CUDA C, not only a
directive-based approach, but I have to admit that I don't have any experience
with it.
As for why spend money on a compiler when the code has to be rewritten anyway:
even an expensive compiler is cheap compared with a programmer's time. Even at
a low programmer salary, the compiler costs at most two weeks' pay.
On the other hand, you have a programmer who already knows Fortran and a piece
of code that is already written and debugged in Fortran. For quite a few
programs, a first unoptimized version can be produced with very little work.
Just sorting through index-counting (off-by-one) bugs and memory-order bugs can
cost you a lot more than the compiler. Fortran is 1-based by default, while C is
0-based (actually Fortran 90/95 can use any index range for arrays). Fortran is
column-major while C is row-major. Do you know how much of a headache that can
bring into the porting?
Translating MATLAB code into Fortran is also much easier than into C because of
these issues.
> HMPP offers a similar proposition ...
>
> rbw
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf