[Beowulf] Power calculations, double precision, ECC and power of APU's

Fri Mar 22 13:07:52 PDT 2013

New thread, new disclaimer: I work for Intel, and sometimes even on the Intel(r) Xeon Phi(tm) coprocessor...

On 22 Mar 2013, at 16:14, Mark Hahn wrote:

>> No ECC might be an issue.
> 
> Phi definitely has a mode to enable ECC for onboard dram,
> similar to how Nvidia and AMD do it (somewhat reduced 
> performance and capacity, since the interface is a power
> of two bits wide.)  
Correct. There is ECC, and it is enabled by default. All of Intel's published benchmarks
are supposed to be run with ECC enabled, and a statement to that effect ought to
be included. (There should *certainly* be a statement if it is *not* enabled!)
Of course, Intel can't control how owners of the card choose to run or report their results.

> it says here:
> http://software.intel.com/en-us/articles/case-study-achieving-superior-performance-on-black-scholes-valuation-computing-using
> that Phi has ECC on L2 as well.  (afaik it's fairly common
> to do only parity on L1, especially for inclusive caches, 
> since corrupted lines can be refetched from the L+1 cache.)
> 
>> Public information states approximately 2 SP
>> ops per DP op. Sounds like the SIMD registers can do both, like a normal
>> x86 chip.
Exactly. 
A little Googling finds http://www.theregister.co.uk/2012/11/12/intel_xeon_phi_coprocessor_launch/
for instance, which is accurate AFAICT and says 
"This VPU is capable of processing eight 64-bit double-precision floating point operations or 
sixteen 32-bit single-precision operations in one clock cycle."
Not wanting to feed the troll, but I don't see why Vincent thinks that a 2:1 SP
to DP ratio is so bad, when some Googling suggests that the ratio for
various GPUs is rather higher (i.e. more SP biased).
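
To make the arithmetic behind that 2:1 figure explicit, here is a minimal sketch. It
assumes only the 512-bit vector width mentioned elsewhere in this thread and one
operation per lane per clock; the function and variable names are made up for
illustration, and it deliberately ignores clock rate, core count and fused multiply-add.

```python
# Sketch: how many SP and DP elements fit in one 512-bit vector register.
# Assumption: one operation per lane per clock; FMA, clock speed and core
# count are ignored, so these are lane counts, not FLOP rates.

VECTOR_WIDTH_BITS = 512  # vector width as quoted in the thread


def lanes(vector_bits: int, element_bits: int) -> int:
    """Return the number of elements of element_bits that fit in vector_bits."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the vector width")
    return vector_bits // element_bits


dp_lanes = lanes(VECTOR_WIDTH_BITS, 64)  # 64-bit double precision
sp_lanes = lanes(VECTOR_WIDTH_BITS, 32)  # 32-bit single precision
sp_to_dp_ratio = sp_lanes / dp_lanes

print(f"DP lanes: {dp_lanes}, SP lanes: {sp_lanes}, SP:DP = {sp_to_dp_ratio:.0f}:1")
```

The lane counts match the eight DP / sixteen SP operations per clock in the quote
above, so the 2:1 ratio is just what you get when SP and DP share the same registers.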

> Phi implements standard x86 integer, x87 (!), and the Phi-specific
> 512b-wide mode.  (I wish they'd just call it MMX512 or something, 
> rather than inventing a new, inconsistent name.  MMX, SSE, SSE2, SSE3,
> but then SSSE3, then back to SSE4, then AVX and now IMICPHIAVX++ ;)
> although I'm excited to get my hands on a Phi, I can't help thinking
> about how it seems a little rushed.  not supporting any of the *SSE*
> levels, for instance.
It would clearly be nice if the Phi had all of those features, but
* You wouldn't get good performance if you used them
* They take up space on the die; how many cores would you give up to have them?
  (Remember that for the codes you're serious about you won't be using them...)

--
-- Jim
--
James Cownie <jcownie at cantab.net>
