Of course I forgot the code... ;-)<br><br><div><span class="gmail_quote">2006/11/22, Ivan Paganini <<a href="mailto:ispmarin@gmail.com">ispmarin@gmail.com</a>>:</span><blockquote class="gmail_quote" style="margin-top: 0; margin-right: 0; margin-bottom: 0; margin-left: 0; margin-left: 0.80ex; border-left-color: #cccccc; border-left-width: 1px; border-left-style: solid; padding-left: 1ex">
Thank you John and Vincent for the answers! <br>Well, I did a little more research and tested a very simple program (attached). I ran ldd on it and got <br>+++++++++<br>asgard@marvin:~$ ldd a.out <br> libc.so.6 => /lib/libc.so.6 (0x00002b630e4d5000)
<br> /lib64/ld-linux-x86-64.so.2 (0x00002b630e3b8000)<br>+++++++++<br>So I think that my gcc is using 64-bit libraries.<br><br>My question now points in another direction: the SSE/SSE2 registers can hold, as Vincent said, 2 x 64-bit floats or 2 x 64-bit integers. Forgetting about the SIMD instructions (I am not using vectorization at any level yet), what uses two registers on a 32-bit processor (a 64-bit double) uses just one register on amd64 (or rather half of a 128-bit register, for a 64-bit float), right? Is gcc already using these amd64 registers? Or the PGI compilers, or the Sun Studio Express compilers? To be clearer, what I am expecting is that, if I need 10 digits, on 32 bits I would need 2 registers, and so get slower code (two cycles per instruction, etc.), while on 64 bits I would need only one register, and so my code will be faster. <br><br>And we scientists are worried about these roundoff errors ;-)
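<br><br>A note on the register question: 32-bit x86 code normally keeps a double in a single 80-bit x87 register rather than in two 32-bit integer registers, and gcc on x86-64 uses the SSE2 registers for scalar doubles by default, so the precision of a double is the same in both modes. The following is only a minimal sketch, assuming a C99 compiler and the limits from float.h; the function names are hypothetical and just illustrate how to query what the compiler provides:<br>

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Storage width of a double in bits (64 with gcc on x86 and x86-64). */
int double_storage_bits(void)
{
    return (int)(sizeof(double) * CHAR_BIT);
}

/* Significand width of a double in bits, including the implicit leading bit. */
int double_significand_bits(void)
{
    /* DBL_MANT_DIG counts base-FLT_RADIX digits; they are bits only for radix 2. */
    assert(FLT_RADIX == 2);
    return DBL_MANT_DIG;
}

/* Print size, guaranteed decimal digits and largest decimal exponent per type. */
void print_fp_limits(void)
{
    printf("float:       %2d bytes, %2d digits, max 10^%d\n",
           (int)sizeof(float), FLT_DIG, FLT_MAX_10_EXP);
    printf("double:      %2d bytes, %2d digits, max 10^%d\n",
           (int)sizeof(double), DBL_DIG, DBL_MAX_10_EXP);
    printf("long double: %2d bytes, %2d digits, max 10^%d\n",
           (int)sizeof(long double), LDBL_DIG, LDBL_MAX_10_EXP);
}
```

Calling print_fp_limits() from main() in a build with -m32 and in one with -m64 should print the same line for double; with gcc on Linux only the storage size of long double is expected to differ between the two.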
<br><br>Thank you!<br><br>Ivan<br><br><br><br><br><br><div><span class="gmail_quote">2006/11/22, Vincent Diepeveen <<a href="mailto:diep@xs4all.nl" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
diep@xs4all.nl</a>>:</span><div><span class="e" id="q_10f102364233a48d_1"><blockquote class="gmail_quote" style="margin-top: 0; margin-right: 0; margin-bottom: 0; margin-left: 0; margin-left: 0.80ex; border-left-color: #cccccc; border-left-width: 1px; border-left-style: solid; padding-left: 1ex">
<div bgcolor="#ffffff"><div><font face="Arial" size="2">hi Ivan,</font></div><div> </div><div><font face="Arial" size="2">there are different builds of gcc for your amd64 hardware: one that is 32 bits and one that is 64 bits.
</font></div><div><font face="Arial" size="2">For 64-bit doubles it doesn't matter which of the two you use, yet I'd advise going for the 64-bit version.</font></div><div> </div><div><font face="Arial" size="2">
To define variables just use:</font></div><div><font face="Arial" size="2"> "double"</font></div><div> </div><div><font face="Arial" size="2">A double is 64 bits in total when using gcc. Intel C++ sometimes uses less, but
</font></div><div><font face="Arial" size="2">I guess not on x64. That kind of cheating happens more for specfp and on chips that lack certain important instructions, such as Itanium, which lacks a lot.</font></div><div> </div><div>
<font face="Arial" size="2">The mantissa of a double takes 52 bits of the number's total length (53 bits of precision, counting the implicit leading bit).</font></div><div><font face="Arial" size="2">Then there is 1 bit for the sign and the other 11 for the exponent.</font></div><div> </div>
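<div><font face="Arial" size="2">The layout described above can be made visible by splitting a double into its three fields. This is a minimal sketch, assuming doubles are stored in IEEE 754 binary64 format (which holds for gcc on x86 and x86-64); split_double and double_fields are hypothetical names used only for illustration:</font></div>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a 64-bit double: 1 + 11 + 52 bits. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, biased by 1023 */
    uint64_t fraction;  /* 52 stored mantissa bits */
} double_fields;

/* Split a double into sign, biased exponent and stored mantissa bits. */
double_fields split_double(double x)
{
    uint64_t bits;
    double_fields f;

    /* Assumption: a double occupies exactly 64 bits on this platform. */
    assert(sizeof(double) == sizeof(uint64_t));

    /* memcpy avoids the aliasing problems of casting a double pointer. */
    memcpy(&bits, &x, sizeof bits);

    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.fraction = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

<div><font face="Arial" size="2">For 1.0 this gives sign 0, a biased exponent of 1023 and a zero fraction; the leading 1 of the mantissa is implicit and not stored.</font></div><div> </div>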
<div><font face="Arial" size="2">Each 128-bit SSE/SSE2 register is split up into vector lanes.</font></div><div><font face="Arial" size="2">So it holds either 2 independent, unrelated doubles of 64 bits, 4 floats of 32 bits, 4 integers of 32 bits, or 2 integers of 64 bits.
</font></div><div> </div><div><font face="Arial" size="2">If you don't have a professor's degree in spaghetti programming, then better stay away from hand-written SIMD (SSE/SSE2) and let the compiler generate it automatically for you.</font></div>
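<div> </div><div><font face="Arial" size="2">As an illustration of letting the compiler generate the SIMD code, here is the kind of loop an auto-vectorizer can usually handle. This is only a sketch, assuming C99 (for restrict) and a gcc with -ftree-vectorize enabled; daxpy is a hypothetical helper name, and whether the loop is actually vectorized depends on the compiler version and flags:</font></div>

```c
#include <assert.h>
#include <stddef.h>

/*
 * y[i] += a * x[i] for i in [0, n).
 * restrict promises the compiler that x and y do not overlap, which is
 * what allows it to process two doubles per 128-bit SSE2 register.
 */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    size_t i;

    assert(n == 0 || (x != NULL && y != NULL));

    /* Simple counted loop, unit stride, no branches: easy to vectorize. */
    for (i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

<div><font face="Arial" size="2">The source contains no SSE intrinsics at all, so the same code also builds unchanged for 32-bit targets or other architectures.</font></div>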
<div> </div><div><font face="Arial" size="2">On AMD hardware you will want to compile with these flags (in addition to the ones you want for SIMD):</font></div><div> </div><div><font face="Arial" size="2">gcc -O2 -march=k8 -mtune=k8 -o myprogram
</font></div><div> </div><div><font face="Arial" size="2">-O3 and higher are potentially buggy for most software I try, but if you have a deterministic way to test executables you can of course try it out. </font></div><div>
</div><div><font face="Arial" size="2">Let's not complain about those bugs too loudly. The GCC team are happy amateurs who, like all GNU folks, never followed any course on QAD (quality assurance testing), which is the most important thing about a product: test until it works reliably and bug-free. However, QAD and testing are for professionals rather than amateurs; we understand that very well, and instead cheer for the privilege of working with this freely available software.
</font></div><div> </div><div><font face="Arial" size="2">It seems the gcc 4.1 release was a snapshot deliberately lobotomized for AMD: pgo no longer helps much for most software on AMD hardware with gcc. Earlier 4.1
snapshots were, at least for my integer code, a lot faster with pgo (using -O3 in the speed tests, by the way). So it is not worth the trouble trying to get pgo working, as you need to check it for deterministic output too, which with
</font><font face="Arial" size="2">floating point is quite hard, I assume.</font></div><div> </div><div><font face="Arial" size="2">Some compiler tricks indeed cost a few digits of significance, or cause roundoff errors to show up sooner in the end result (which nearly no scientist ever notices, saying more about the scientist than about the different compilers)
</font><font face="Arial" size="2">, yet with the tips above that shouldn't happen too quickly, and in doubles you should have roughly 52 * log(2) / log(10) = 52 / 3.322 = 15.65, so 15 digits or so</font></div><div> </div><div><font face="Arial" size="2">
Good Luck!</font></div><div> </div><div><font face="Arial" size="2">Vincent</font></div><blockquote style="padding-right: 0px; padding-left: 5px; margin-left: 5px; border-left-color: #000000; border-left-width: 2px; border-left-style: solid; margin-right: 0px">
<div><span><div style="font-style: normal; font-variant: normal; font-weight: 400; font-size: 10pt; line-height: normal">----- Original Message ----- </div><div style="background-color: #e4e4e4; font-style: normal; font-variant: normal; font-weight: 400; font-size: 10pt; line-height: normal">
<b>From:</b> <a title="ispmarin@gmail.com" href="mailto:ispmarin@gmail.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">Ivan Paganini</a> </div><div style="font-style: normal; font-variant: normal; font-weight: 400; font-size: 10pt; line-height: normal">
<b>To:</b> <a title="beowulf@beowulf.org" href="mailto:beowulf@beowulf.org" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">beowulf@beowulf.org</a> </div><div style="font-style: normal; font-variant: normal; font-weight: 400; font-size: 10pt; line-height: normal">
<b>Sent:</b> Wednesday, November 22, 2006 12:01 PM</div><div style="font-style: normal; font-variant: normal; font-weight: 400; font-size: 10pt; line-height: normal"><b>Subject:</b> [Beowulf] Question about amd64 architecture and floating point operations
</div><div><br></div>Hello everybody at beowulf. Sorry about the _really_ newbie question, but after doing some tests and researching a little, a question arose while fooling around with amd64 (more precisely, an amd64 Athlon 4200 X2), gcc and Sun Studio 11. The architecture has 64-bit integer registers and 128-bit floating point registers, but my test programs in C just gave me the same precision that I got with an old Athlon 2400 XP (32 bits), that is, long double goes only up to 1x10^4961, even with the -m64 flag. I always imagined that I would get the double precision without the long double declaration (or, maybe, 40 bits of precision). What am I missing here? Is it the compiler (gcc
4.1, Sun Studio Express 11), the operating system (Ubuntu 64-bit Edgy), or just an error in my logic?<br><br>Thank you for your patience!<br clear="all"><br>-- <br>-----------------------------------------------------------
<br>Ivan S. P. Marin<br>Laboratório de Física Computacional<br><a href="http://lfc.ifsc.usp.br" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">lfc.ifsc.usp.br</a><br>Instituto de Física de São Carlos - USP
<br>---------------------------------------------------------- </span></div></blockquote></div></blockquote></span></div></div><br><br clear="all"><br>-- <br>-----------------------------------------------------------<div><span class="e" id="q_10f102364233a48d_3">
<br>Ivan S. P. Marin<br>Laboratório de Física Computacional <br><a href="http://lfc.ifsc.usp.br" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">lfc.ifsc.usp.br</a><br>Instituto de Física de São Carlos - USP
<br>---------------------------------------------------------- </span></div></blockquote></div><br><br clear="all"><br>-- <br>-----------------------------------------------------------<br>Ivan S. P. Marin<br>Laboratório de Física Computacional
<br><a href="http://lfc.ifsc.usp.br">lfc.ifsc.usp.br</a><br>Instituto de Física de São Carlos - USP<br>----------------------------------------------------------