Beowulf & FFT
Martin Siegert siegert at sfu.ca
Tue Jul 18 11:22:52 PDT 2000
- Previous message: Intel benchmarks fuel SDRAM vs. RDRAM debate
- Next message: Beowulf & FFT
Hi there,

this is probably related to the problems that have been discussed recently under the topic "Beowulf & Fluid Mechanics".

I've been trying to optimize FFTs (fast Fourier transforms) on a cluster of dual PII-400MHz PCs with switched fast ethernet. The results are not promising: the execution time for 100 forward and backward transforms for a system size of 400x400 is 26.61s (using gettimeofday) on a single processor, and for np processors I get (in seconds, using MPI_Wtime):

np   mpich-1.1.2   mpich-1.2.0   mpipro
 2      28.54         24.18       27.28
 4      33.00         29.29       24.50
 8      16.80         16.69       16.41

mpich-1.1.2 is compiled for device=ch_p4, whereas mpich-1.2.0 is compiled for ch_p4 with -comm=shared. The FFT routines are from the FFTW library (www.fftw.org). The fftw_mpi_test of the fftw distribution shows similar results.

My understanding is that most of the interprocess communication is due to an MPI_Alltoall in the matrix transpose routine. None of the MPI distributions seems to handle this particularly well :-(. For mpich, 4 processors are slower than 2 processors. It seems that communication between processes on the same node is so much faster that np=2 turns out to be faster than np=4. However, the effect is the same in mpich-1.1.2 and mpich-1.2.0; the latter does shared-memory communication. mpipro doesn't show the same effect.

Since I'm in the process of expanding the beowulf, I'm wondering whether switching to 133 MHz would improve the results (given that I can find a motherboard that supports ECC - we had that discussion). Any thoughts and comments on this are appreciated.

Somewhat off topic (those of you not interested in physics/stat.mech. may stop reading here :-)

I need fast FFTs to integrate PDEs using spectral methods. Those PDEs describe ordering dynamics of surfaces, pattern formation, etc. (similar to spinodal decomposition). The usual way of doing this kind of thing is to use finite differences. That's ok if anisotropies aren't important. If they are, however, finite differences are tricky, because they introduce artificial anisotropies. Spectral methods avoid this. However, if spectral methods turn out to be orders of magnitude slower than finite differences, I'm not sure what to do. For a single processor this isn't a problem, but when MPI comes into play the efficiency of all these algorithms must be reevaluated. We probably all learned something like "the Euler method is bad, implicit methods are superior". On a beowulf that isn't true anymore ...

If you have ideas how to get out of this dilemma, please let me know.

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
========================================================================
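As a rough reading of the timings quoted in the message, the speedup and parallel efficiency relative to the 26.61s single-processor run can be worked out as in the sketch below. The numbers are copied from the message; the function names (speedup, efficiency, summarize) are made up for illustration and are not part of FFTW or any MPI distribution. It also assumes that the gettimeofday and MPI_Wtime figures are directly comparable wall-clock times.

```python
# Timings in seconds, copied from the message: 100 forward and backward
# 400x400 transforms.
T_SERIAL = 26.61  # single processor, measured with gettimeofday

# Parallel runs, measured with MPI_Wtime, keyed by MPI distribution and np.
TIMINGS = {
    "mpich-1.1.2": {2: 28.54, 4: 33.00, 8: 16.80},
    "mpich-1.2.0": {2: 24.18, 4: 29.29, 8: 16.69},
    "mpipro": {2: 27.28, 4: 24.50, 8: 16.41},
}


def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel wall-clock time (values > 1 are faster)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Speedup divided by the number of processes (1.0 would be ideal)."""
    return speedup(t_serial, t_parallel) / nprocs


def summarize(t_serial, timings):
    """Return (distribution, np, speedup, efficiency) for every measurement."""
    rows = []
    for impl, by_np in timings.items():
        for nprocs in sorted(by_np):
            t = by_np[nprocs]
            rows.append((impl, nprocs,
                         speedup(t_serial, t),
                         efficiency(t_serial, t, nprocs)))
    return rows


if __name__ == "__main__":
    print(f"{'distribution':<12} {'np':>3} {'speedup':>8} {'efficiency':>11}")
    for impl, nprocs, s, e in summarize(T_SERIAL, TIMINGS):
        print(f"{impl:<12} {nprocs:>3} {s:>8.2f} {e:>11.2f}")
```

Read this way, the best case in the table (mpipro, np=8) is a speedup of about 1.6, i.e. roughly 20% efficiency, and several of the np=2 and np=4 runs are slower than the single-processor run.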
