[Beowulf] What clusters are about - FFT

Tue Jul 18 08:50:09 PDT 2006

Hello All,

Together with Joel Vennes we're busy with some FFT code here. Plan is to 
port it after it has been optimized more to GMP (GNU MP).
Our first surprise was that latest version of GMP (4.2) is using pretty much 
the same method like we do, just it uses less bits
significant to calculate the FFT.

The entire FFT is integer based, because that loses the smallest amounts of 
bits significance when calculating at the 10 million 'limb'
(hundreds of millions of bits range) accuracy.

The first version will basically be working to calculate with huge integers 
(mpz_t in gmp), but effortless can get casted to work
for other datatypes as well (mpn generic FFT).

We would like to compare this FFT implementation with floating point 
implementations or SSE/SSE2 implementations. The math
to figure out how many bits significant a floating point FFT can use is 
compared to this integer implementation quite a bit different too.

It's very crystal clear what we do and how that will perform. Of course 
we're trying hard now to do less work as the disadvantage of
what we do is that we need to implement a modulo and of course we're 
rewriting using the 'modulo' instruction as that's lightyears slower
than a single multiply with some shifting and additions.

After all this works well i'll have a look at how to parallellize it; first 
within 1 node over all cores;
if GMP gets used a lot to do heavy calculating work then i'll consider 
porting it to MPI too.

Of course interesting is knowing in how well used GMP 4.2 is and how it 
compares to other implementations.
For example the default gmp you can get in ubuntu is gmp 3.3 which is pretty 
outdated and already quite slower
than todays 4.2.

http://www.swox.com/gmp/#DOWNLOAD

Which of you has an interest in math with big numbers and what 
libraries/code gets used there?

Thanks,
Vincent