availability of Memory compression routine
Kwan Wing Keung hcxckwk at hkucc.hku.hk
Wed Jul 17 19:30:07 PDT 2002
Dear Colleagues,

Recently I have been parallelizing a user's program that involves repeated mpi_broadcast of a big 2D array (around 1000*1000 complex*16) from the master to each compute slave.

The parallelization is now complete, but the parallel efficiency is not very high. With 4-5 processors, the run time drops by around 60% (i.e. to 40% of the original serial execution time). Adding more processors does not help: although the measured CPU time on each slave goes down, the wallclock time stays nearly flat. Most likely the network is already "saturated".

By using the Hermitian property, I can now halve the communication size (after communication, each slave generates the upper triangle locally from the elements of the lower triangle). This reduces the "saturated" time by a further 45%.

My question is whether there is a generic memory compression routine that can compress a big chunk of memory into a much smaller one, in the way "zip" or "compress" do. Of course, we are talking about compressing a variable in memory inside a standard Fortran program, NOT compressing a disk file.

With such a routine we could first compress the huge array and then use mpi_broadcast to send the compressed data. On receiving the compressed data, each slave would decompress it to recover the original data. In short, we would be trading extra local computation for less communication.

Any suggestion is wholeheartedly welcome.

W.K. Kwan
Computer Centre
HKU

p.s. I prefer the compression/decompression routines in pure F77 coding, i.e. with no recursion.
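
For illustration only, the scheme described above (send only the lower triangle, compress before the broadcast, decompress and rebuild on each slave) might look roughly like the sketch below. It is written in Python rather than the F77 asked for, it uses zlib as a stand-in for whatever compression routine is eventually chosen, and the helper names (pack_lower, unpack_hermitian, compress_for_broadcast, restore_after_broadcast) are hypothetical. The broadcast itself is not shown. For scale, a full 1000*1000 complex*16 array is 16,000,000 bytes, while the lower triangle including the diagonal is 500,500 elements, or 8,008,000 bytes. Note that general-purpose lossless compressors may shrink dense floating-point data only slightly, so the gain should be measured on real data first.

```python
import zlib
from array import array


def pack_lower(a):
    """Flatten the lower triangle (diagonal included) of a square
    Hermitian matrix into bytes: interleaved real/imag doubles."""
    n = len(a)
    buf = array("d")
    for i in range(n):
        for j in range(i + 1):
            buf.append(a[i][j].real)
            buf.append(a[i][j].imag)
    return buf.tobytes()


def unpack_hermitian(raw, n):
    """Rebuild the full n*n matrix from the packed lower triangle,
    filling the upper triangle with complex conjugates."""
    buf = array("d")
    buf.frombytes(raw)
    a = [[0j] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            z = complex(buf[k], buf[k + 1])
            k += 2
            a[i][j] = z
            if j != i:
                a[j][i] = z.conjugate()
    return a


def compress_for_broadcast(a, level=1):
    """Master side: pack, then compress. A low zlib level keeps the
    CPU cost small (assumption: speed matters more than ratio here)."""
    return zlib.compress(pack_lower(a), level)


def restore_after_broadcast(blob, n):
    """Slave side: decompress, then rebuild the Hermitian matrix."""
    return unpack_hermitian(zlib.decompress(blob), n)
```

In such a scheme the master would broadcast the length of the compressed buffer first and then the buffer itself, since the compressed size is not known in advance on the receiving side.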
More information about the Beowulf mailing list
