[Beowulf] Multidimensional FFTs


Greg Lindahl lindahl at pathscale.com
Tue Feb 28 18:34:50 PST 2006


On Tue, Feb 28, 2006 at 01:26:51PM -0500, Bill Rankin wrote:

> There is a research group here at Duke doing some application  
> development and they are looking at implementing their codes in a  
> cluster environment.  The main problem is that 95% of their  
> processing time is taken up by medium to large sized 3D FFTs (minimum  
> 64 elements on an edge, 256k total elements).

That's a fairly small FFT on a parallel cluster. How many cpus do they
imagine using? Perhaps the easiest thing to do is to whip up some code
and invite people to benchmark it. The G-PTRANS and G-FFTE elements of
HPC Challenge are relevant but not many folks have submitted numbers.

Let's see: for 64**3, and 64 cpus with a 1D decomposition, there are
64**2 words per cpu, and a naive Alltoall will send one message of 64
words to each of the 63 other nodes. Then the message length is 1024
bytes (double precision complex, 16 bytes per word). I would disagree with Stu's
recommendations at this size due to the short message length, but I
don't know if 2D would be a better decomposition at this size. FFTW
version 2's MPI routines only do 1D decomposition.
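
The arithmetic above can be checked with a small sketch. This is not code from the thread: the function name `alltoall_message_bytes` and its parameters are made up for illustration, and it assumes a cubic grid split evenly into slabs, with each cpu sending an equal share of its slab to every other cpu.

```python
def alltoall_message_bytes(n, ncpus, bytes_per_word=16):
    """Estimate message sizes for a naive Alltoall in a 1D (slab) decomposition.

    n              -- elements along one edge of the n**3 grid
    ncpus          -- number of cpus sharing the grid (assumed to divide evenly)
    bytes_per_word -- 16 for double precision complex (two 8-byte doubles)

    Returns (words per cpu, words per message, bytes per message).
    """
    total_words = n ** 3
    if total_words % (ncpus * ncpus) != 0:
        raise ValueError("grid does not divide evenly across cpus")
    # Each cpu owns an equal slab of the grid.
    words_per_cpu = total_words // ncpus
    # The transpose splits that slab evenly across all cpus (one share stays local).
    words_per_message = words_per_cpu // ncpus
    return words_per_cpu, words_per_message, words_per_message * bytes_per_word


if __name__ == "__main__":
    # The 64**3 grid on 64 cpus discussed above.
    print(alltoall_message_bytes(64, 64))  # (4096, 64, 1024)
```

Under the same assumptions, dropping to 16 cpus gives 1024 words (16384 bytes) per message, which shows how quickly the message length shrinks as cpus are added at a fixed problem size.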

-- greg



