[Beowulf] Accelerator for data compressing

Mark Hahn hahn at mcmaster.ca
Thu Oct 2 19:31:48 PDT 2008


> Currently I generate nearly one TB data every few days and I need to pass it

Bill's right - 6 MB/s is really not much to ask from even a complex WAN.
I think the first thing you should do is find the bottleneck.  to me it
sounds like you have a sort of ropey path with a 100 Mbps hop somewhere.
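the arithmetic behind that 6 MB/s figure is worth spelling out - a quick sketch, assuming "every few days" means roughly two days and decimal units:

```python
# Back-of-the-envelope bandwidth needed to move 1 TB in ~2 days.
# The "2 days" and decimal-TB convention are assumptions, not from the post.
terabyte = 1e12                    # bytes (decimal TB)
seconds = 2 * 24 * 3600            # two days in seconds
rate_mb_s = terabyte / seconds / 1e6
print("required rate: %.1f MB/s" % rate_mb_s)  # ~5.8 MB/s
```

anything gigabit end-to-end handles that with an order of magnitude to spare, which is why a sustained shortfall points at a slow hop rather than raw capacity.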

> thinking about compressing it (most tiff format image data) as much as I can

tiff is a fairly generic container that can hold anything from horrible
uncompressed 4-byte-per-pixel data to jpeg or rle.  looking at the format you're
really using would be wise.  I'm guessing that if you transcode to png,
you'll get better compression than gzip/etc.  dictionary-based compression
is fundamentally inappropriate for most non-text data - not images, not
double-precision dumps of physical simulations, etc.  png's prediction
filters make it quite a lot smarter about most kinds of images than generic
compressors; note png itself is always lossless - if you can tolerate loss,
jpeg will shrink things much further.

hardware compression would be a serious mistake unless you've already 
pursued these routes.  specialized hardware is a very short-term and 
quite narrow value proposition.  I would always prefer to improve the 
infrastructure.

> The information transmitted in this electronic communication is intended only

uh, email is publication.

regards, mark hahn.


