[Beowulf] Performance penalty when using 128-bit reals on AMD64

NiftyOMPI Tom Mitchell niftyompi at niftyegg.com
Sat Jun 26 15:40:24 PDT 2010

On Fri, Jun 25, 2010 at 10:30 AM, Nathan Moore <ntmoore at gmail.com> wrote:
> I used GMP's arbitrary precision library (rational number arithmetic)
> for my thesis a few years back.  It was very easy to implement, but
> not fast (better on x86 hardware than sun/sgi/power as I recall).  I
> too am curious about the sort of algorithm that would require that
> much precision.

A lot can depend on the dynamic range of the
values being operated on. If there is a
mix of very large and very small values, odd
results can surface, especially in parallel code.
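
As a rough illustration of the point above (a sketch, not code from the
original post): the order of accumulation changes a floating point sum when
the magnitudes differ widely. The input values and the two-chunk split, which
stands in for a parallel reduction over two ranks, are assumptions chosen for
the demonstration.

```python
import math

# Assumed test data: two huge values that cancel, plus two small ones.
values = [1.0e16, 1.0, -1.0e16, 1.0]


def running_sum(xs):
    """Plain left-to-right accumulation in binary64."""
    total = 0.0
    for x in xs:
        total += x
    return total


# Serial order: the first 1.0 is absorbed by 1.0e16 and lost.
serial = running_sum(values)

# Stand-in for a two-rank reduction: each "rank" sums its own chunk,
# then the partial sums are combined. Both 1.0 terms are lost here.
partial_a = running_sum(values[:2])
partial_b = running_sum(values[2:])
chunked = partial_a + partial_b

# math.fsum tracks the rounding error and returns the correctly rounded sum.
reference = math.fsum(values)

print(serial, chunked, reference)  # 1.0 0.0 2.0
```

Same data, three answers, and only the grouping of the additions differs.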

Also, in basic statistics the squared values can sometimes
unexpectedly overflow a computation even when the "input" is well within range.
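
A small sketch of that overflow, under assumed data (not from the original
post): the inputs and the final standard deviation both fit comfortably in a
64-bit float, but the intermediate squares do not. The function names and the
rescaling approach are illustrative choices, not a recommendation of one
particular algorithm.

```python
import math

# Assumed inputs: large, but far below the binary64 maximum (about 1.8e308).
data = [1.0e200, 2.0e200, 3.0e200]


def naive_std(xs):
    """Population std via E[x^2] - E[x]^2; the squares overflow to inf."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum(x * x for x in xs)  # each x * x is about 1e400 -> inf
    return math.sqrt(sum_sq / n - mean * mean)  # inf - inf -> nan


def scaled_std(xs):
    """Same statistic, but rescale by the largest magnitude before squaring."""
    scale = max(abs(x) for x in xs)
    if scale == 0.0:
        return 0.0
    ys = [x / scale for x in xs]  # now every value lies in [-1, 1]
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return scale * math.sqrt(var)


print(naive_std(data))   # nan
print(scaled_std(data))  # close to 8.165e199
```

With 32-bit floats the same thing happens at much smaller magnitudes, since
anything above roughly 1.8e19 overflows when squared.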

It does make a lot of sense to test code and data with 32- and 64-bit
floating point to see if odd results surface. It would be nice
if systems+compilers had the option of 128- and even 256-bit
operations so code could be tested for sensitivity that
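
One way to try the 32 versus 64 bit comparison without special hardware or
compiler support is sketched below. It is an assumption-laden illustration:
the helper names are made up here, and 32-bit arithmetic is emulated by
rounding every intermediate result to binary32 with the standard struct
module rather than by running real single-precision code.

```python
import struct


def to_f32(x):
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f64(x):
    """Identity: Python floats are already binary64."""
    return x


def repeated_add(step, count, rounder):
    """Add `step` to a running total `count` times at the given precision."""
    step = rounder(step)
    total = rounder(0.0)
    for _ in range(count):
        total = rounder(total + step)
    return total


# The same kernel, run at two precisions; the exact answer is 10000.
sum32 = repeated_add(0.1, 100_000, to_f32)
sum64 = repeated_add(0.1, 100_000, to_f64)

print("binary32 error:", abs(sum32 - 10000.0))
print("binary64 error:", abs(sum64 - 10000.0))
```

If the two errors differ by many orders of magnitude, as they do here, the
kernel is sensitive to precision and deserves a closer look.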

I sort of wish precision was universally an application ./configure
option or a #define, and while I am dreaming, 128- and 256-bit versions
would just run...
A +24x slower run would validate a lot of codes and critical runs.
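
A minimal sketch of what "precision as a knob" could look like, assuming the
standard decimal module as a stand-in for wide floating point. The
PRECISION_DIGITS table and the function name are hypothetical, and decimal
digits only roughly correspond to the binary formats named in the keys; this
is not IEEE binary128 or binary256 arithmetic.

```python
from decimal import Decimal, localcontext

# Hypothetical mapping from a format label to significant decimal digits.
PRECISION_DIGITS = {
    "binary32-like": 7,
    "binary64-like": 16,
    "binary128-like": 34,
    "binary256-like": 71,
}

# Assumed data: a small spread on top of a large offset. The exact sample
# variance is 30.
samples = [10**9 + 4, 10**9 + 7, 10**9 + 13, 10**9 + 16]


def naive_variance(values, digits):
    """Sample variance via sum of squares, rounded to `digits` digits."""
    with localcontext() as ctx:
        ctx.prec = digits  # the single precision knob
        xs = [Decimal(v) for v in values]
        n = Decimal(len(xs))
        sum_x = sum(xs, Decimal(0))
        sum_sq = sum((x * x for x in xs), Decimal(0))
        return (sum_sq - sum_x * sum_x / n) / (n - 1)


for label, digits in PRECISION_DIGITS.items():
    print(label, naive_variance(samples, digits))
```

The two narrower settings lose the answer entirely to cancellation, while the
two wider ones agree on 30, which is the kind of cross-check a slow
high-precision validation run would provide.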

In a 30 second scan of GMP's arbitrary precision library I cannot tell
if 32- and 64-bit sizes fall out as equal in performance to native types.

        T o m   M i t c h e l l
