[Beowulf] Optimized math routines/transcendentals

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Mon May 2 06:47:22 PDT 2016


Running finite element codes for electromagnetics is a fine example of
where you need more than "single precision", and in which lots of
transcendental function calls are made (sin and cos, for the most part).
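As a toy illustration only (nothing to do with any particular FEM code;
the loop count and phase arguments below are arbitrary), here is what
happens if you accumulate a few million sin() evaluations in single vs.
double precision:

/* Toy comparison: accumulate sin() in float vs. double.  The number of
   terms and the arguments are arbitrary, chosen only to make the
   single-precision error visible. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const int n = 10000000;          /* number of terms, arbitrary */
    float  sum_f = 0.0f;             /* single-precision accumulator */
    double sum_d = 0.0;              /* double-precision accumulator */

    for (int i = 0; i < n; i++) {
        double x = 1e-3 * i;         /* arbitrary phase argument */
        sum_f += sinf((float)x);
        sum_d += sin(x);
    }

    printf("single precision sum: %.9f\n", (double)sum_f);
    printf("double precision sum: %.15f\n", sum_d);
    printf("difference          : %g\n", sum_d - (double)sum_f);
    return 0;
}

The gap you see reflects both the coarser float sin() values (including
their cruder range reduction at large arguments) and the rounding that
piles up in the float accumulator.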

You might look at the "theory of operation" document for the Numerical
Electromagnetics Code (NEC) because, as I recall, it has a discussion of
the sin/cos accuracy.

Another place where good precision is needed, albeit not in an HPC sort of
application (although HPC might be used to simulate it), is in a direct
digital synthesizer (DDS) or numerically controlled oscillator (NCO). One
synthesizes a series of samples from an idealized sinusoid by incrementing
the phase (represented as an integer, with the full scale being 2 pi) and
using the high-order bits to look up a value for sin/cos.  Alternatively,
the CORDIC method uses arithmetic to generate the sin/cos on the fly.
CORDIC, though, is notorious because it's a numerical integration, so the
longer you run it, the more error can accumulate.
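A minimal sketch of the table-lookup NCO/DDS idea in C (the table size,
word widths, and phase increment below are arbitrary choices for
illustration, not taken from any particular design):

/* Sketch of a table-lookup NCO/DDS.  Table size and data types are
   illustrative choices only. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define TABLE_BITS 12                       /* 4096-entry sine table */
#define TABLE_SIZE (1u << TABLE_BITS)

static float sine_table[TABLE_SIZE];

int main(void)
{
    const double two_pi = 6.283185307179586;

    /* Precompute one full cycle of sine. */
    for (uint32_t i = 0; i < TABLE_SIZE; i++)
        sine_table[i] = (float)sin(two_pi * (double)i / TABLE_SIZE);

    /* The phase is a 32-bit unsigned accumulator whose full scale
       represents 2*pi, so integer overflow wraps the phase exactly. */
    uint32_t phase = 0;

    /* The phase increment sets the output frequency:
       f_out = (increment / 2^32) * f_sample.  2^24 -> f_sample / 256. */
    const uint32_t increment = 0x01000000u;

    for (int k = 0; k < 16; k++) {
        /* High-order bits of the phase accumulator index the table. */
        uint32_t index = phase >> (32 - TABLE_BITS);
        printf("%2d  %+f\n", k, sine_table[index]);
        phase += increment;                 /* wraps modulo 2^32 */
    }
    return 0;
}

The attraction of the scheme is that the phase wraps at 2 pi exactly by
construction, so no range-reduction error accumulates; the sin/cos
accuracy is set by the table size (and any interpolation between entries).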

In deep space navigation, one uses fairly long precision: it's a numerical
integration problem, with big numbers and small additions.  For instance,
we regularly measure the distance to a spacecraft that's a billion km away
(e.g. Saturn) with an accuracy of centimeters, and its velocity to
cm/second.  That's one part in 1E10; the underlying "round trip light
time" measurement is actually made to a fractional accuracy of around
1E-15 to 1E-16, averaged over 1000 seconds.




James Lux, P.E.
Task Manager, DHFR Space Testbed
Jet Propulsion Laboratory
4800 Oak Grove Drive, MS 161-213
Pasadena CA 91109
+1(818)354-2075
+1(818)395-2714 (cell)
 





On 4/30/16, 12:09 PM, "Beowulf on behalf of C Bergström"
<beowulf-bounces at beowulf.org on behalf of cbergstrom at pathscale.com> wrote:

>I was hoping for feedback, from scientists, about what level of
>accuracy their codes or fields of study typically require. Maybe the
>weekend wasn't the best time to post.. hmm..
>
>On Sun, May 1, 2016 at 1:31 AM, Peter St. John <peter.st.john at gmail.com>
>wrote:
>> A bit off the wall, and not much help for what you are doing now, but
>> sooner or later we won't be hand-crafting ruthlessly optimal code; we'll
>> be training neural nets. You could do this now if you wanted: the
>> objective function is just accurate answers (which you get from
>> sub-optimal but mathematically correct existing code) and the wall clock
>> (faster is better), and you train with the target hardware. So in
>> principle it's easy, and if you look at how fast DeepMind trained AlphaGo
>> it begins to sound feasible to train for fast Fourier transforms or
>> whatever.
>> Peter
>>
>> On Fri, Apr 29, 2016 at 9:06 PM, William Johnson
>> <meatheadmerlin at gmail.com> wrote:
>>>
>>> Due to the finite nature of number representation on computers,
>>> any answer will be an approximation to some degree.
>>> To me, it looks to be a non-issue to some 15 significant digits.
>>> I would say it depends on how accurate you need to be.
>>> You could do long-hand general calculations that track percent error,
>>> and see how it gets compounded in a particular series of calculations.
>>>
>>> If you got right into the nuts and bolts of writing optimized functions,
>>> there are many clever ways to calculate common functions
>>> that you can find in certain math or algorithms & data structures texts.
>>> You would also need intimate knowledge of the target chipset.
>>> But it seems that would be way too much time in
>>> research and development to reinvent the wheel.
>>>
>>>
>>> On Fri, Apr 29, 2016 at 7:28 PM, Greg Lindahl <lindahl at pbm.com> wrote:
>>>>
>>>> On Sat, Apr 30, 2016 at 02:23:31AM +0800, C Bergström wrote:
>>>>
>>>> > Surprisingly, glibc does a pretty respectable job in terms of
>>>> > accuracy, but alas it's certainly not the fastest.
>>>>
>>>> If you go look in the source comments I believe it says which paper's
>>>> algorithm it is using... doing range reduction for sin(6e5) is
>>>> expensive to do accurately. Which is why the x86 sin() hardware
>>>> instruction does it inaccurately but quickly, and most people/codes
>>>> don't care.
>>>>
>>>> -- greg
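
(A quick way to see the range-reduction effect Greg describes, as a rough
sketch rather than glibc's actual algorithm: reduce the argument with
fmod() using only the double-precision value of 2*pi, then compare against
calling sin() on the full argument.  How large the difference comes out
depends on your libm.)

/* Naive range reduction vs. letting the library do it.  Illustrative
   only; the size of the discrepancy depends on the libm in use. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double x = 6.0e5;
    const double two_pi = 6.283185307179586;   /* 2*pi rounded to a double */

    /* Naive reduction: the rounding error in the stored 2*pi gets
       multiplied by roughly x / (2*pi) ~ 1e5, so about five decimal
       digits of the reduced argument are lost. */
    double reduced = fmod(x, two_pi);

    printf("sin(fmod(x, 2pi)) = %.17g\n", sin(reduced));
    printf("sin(x)            = %.17g\n", sin(x));
    printf("difference        = %.3g\n", sin(reduced) - sin(x));
    return 0;
}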