[Beowulf] Power calculations , double precision, ECC and power of APU's
jcownie at cantab.net
Fri Mar 22 13:07:52 PDT 2013
New thread, new disclaimer: I work for Intel, and sometimes even on the Intel(r) Xeon Phi(tm) coprocessor...
On 22 Mar 2013, at 16:14, Mark Hahn wrote:
>> No ECC might be an issue.
> Phi definitely has a mode to enable ECC for onboard dram,
> similar to how Nvidia and AMD do it (somewhat reduced
> performance and capacity, since the interface is a power
> of two bits wide.)
Correct. There is ECC, and it is enabled by default. All of Intel's published benchmarks
are supposed to be run with ECC enabled, and a statement to that effect ought to
be included. (There should *certainly* be a statement if it is *not* enabled!)
Of course Intel can't control how owners of the card choose to run or report their results.
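To put a rough number on Mark's "reduced capacity" point, here is a minimal sketch.
It assumes a textbook SECDED (single-error-correct, double-error-detect) Hamming code
whose check bits are stored in the same DRAM as the data. That is an assumption for
illustration only: the actual ECC scheme and overhead on the Phi are not stated in this
thread, and the function names are made up.

```python
def secded_check_bits(data_bits):
    """Check bits for a Hamming SECDED code protecting data_bits of data.

    A Hamming code needs the smallest r with 2**r >= data_bits + r + 1;
    one extra overall parity bit adds double-error detection.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


def usable_fraction(data_bits):
    """Fraction of raw capacity left for data if check bits are stored in-band.

    ASSUMPTION: check bits share the same DRAM as the data, because a
    power-of-two-wide interface has no spare bits to carry them.
    """
    check = secded_check_bits(data_bits)
    return data_bits / (data_bits + check)


if __name__ == "__main__":
    for width in (32, 64, 128):
        print(width, secded_check_bits(width), round(usable_fraction(width), 3))
```

Under those assumptions a 64-bit word needs 8 check bits, so about one ninth of the raw
capacity goes to ECC. Real implementations may use wider code words or a different
layout and so pay less.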
> it says here:
> that Phi has ECC on L2 as well. (afaik it's fairly common
> to do only parity on L1, especially for inclusive caches,
> since corrupted lines can be refetched from the L+1 cache.)
>> Public information states approximately 2 SP
>> ops per DP op. Sounds like the SIMD registers can do both, like a normal
>> x86 chip.
A little Googling finds http://www.theregister.co.uk/2012/11/12/intel_xeon_phi_coprocessor_launch/
for instance, which is accurate AFAICT and says
"This VPU is capable of processing eight 64-bit double-precision floating point operations or
sixteen 32-bit single-precision operations in one clock cycle."
Not wanting to feed the troll, but I'm not clear why Vincent thinks that a 2x SP
to DP ratio is so bad, when some Googling suggests that the ratio for various
GPUs is rather higher (i.e. more SP biased).
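The 2x figure falls straight out of the register width. A small sketch of the lane
arithmetic, assuming the 512-bit vector width mentioned elsewhere in this thread (the
names here are illustrative, not anything from Intel's documentation):

```python
# Lane arithmetic for a fixed-width vector unit.
# ASSUMPTION: 512-bit registers, per the "512b-wide mode" remark in this thread.
VECTOR_BITS = 512


def lanes(element_bits, vector_bits=VECTOR_BITS):
    """Number of elements of the given width that fit in one vector register."""
    return vector_bits // element_bits


dp_lanes = lanes(64)   # 64-bit double-precision elements
sp_lanes = lanes(32)   # 32-bit single-precision elements
sp_to_dp_ratio = sp_lanes / dp_lanes

if __name__ == "__main__":
    print(dp_lanes, sp_lanes, sp_to_dp_ratio)
```

That gives 8 DP lanes and 16 SP lanes, matching the numbers quoted from The Register,
so the SP:DP ratio is simply the ratio of the element widths.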
> Phi implements standard x86 integer, x87 (!), and the Phi-specific
> 512b-wide mode. (I wish they'd just call it MMX512 or something,
> rather than inventing a new, inconsistent name. MMX, SSE, SSE2, SSE3,
> but then SSSE3, then back to SSE4, then AVX and now IMICPHIAVX++ ;)
> although I'm excited to get my hands on a Phi, I can't help thinking
> about how it seems a little rushed. not supporting any of the *SSE*
> levels, for instance.
It would clearly be nice if the Phi had all of those features, but
* You wouldn't get performance if you used them
* They take space; how many cores would you give up to have them?
(Remembering that for the codes you're serious about you won't be using them...)
James Cownie <jcownie at cantab.net>