[Beowulf] Nehalem Xeons

Bill Broadley bill at cse.ucdavis.edu
Tue Oct 14 19:19:26 PDT 2008

Ellis Wilson wrote:
> Joe Landman wrote:
>> Kilian CAVALOTTI wrote:
>>> Do you, by any chance, have any substantial performance figure to make 
>>> us drool? :)
>> Intel has asked that no benchmarks be published by people with units.
> One wonders why they distributed them in the first place if they didn't 
> intend to excite people about their performance prior to releasing them.

Heh, well they want to excite people... important people... who in exchange 
sign NDAs.  Not to mention that those people provide feedback on performance, 
stability, BIOS compatibility, operating systems, drivers, compilers, 
applications, etc., across more variations than Intel could hope to replicate 
in the lab.

>   With processors I don't think it's for "debugging" or stability checks 
> since that should be well simulated (owing to the high cost of CPU molds 
> costs millions itself).

It's hard to predict when a show stopper will show up.  Nvidia, AMD, and Intel 
(and likely most everyone else) have learned hard lessons in this area.  Indeed, 
companies do spend big $$$ trying to make sure that each silicon revision is 
bug free... hardly a guarantee though.

In any case, if you google around there's a fair bit of performance information 
on Nehalem chips.  STREAM performance has been mentioned, along with unlabeled 
charts of relative performance on SPEC CPU among other benchmarks, and 
preproduction benchmarks on a variety of things.  Public info and fuzzy IDF 
slides seem to conclude:
* 2.6, 3.0, and 3.2 GHz clock bins or so
* roughly 5-25% higher IPC (per thread) on many workloads
* Dramatically better memory system
* 4 cores/8 threads first, more variations later.
* 3 memory channels per socket.
* On chip memory controller
* lower memory latency than current Intels or Opterons.
* slightly HIGHER power use per socket than current Intel.

So nothing really earth shattering for the single socket market, but very 
healthy competition (unlike the current CPUs) in the 2-4 socket market.

Of course that leaves tons of interesting questions; pressure your favorite 
vendor if you can't wait.  Although there is some info at:

Personally I'm most interested in when hyperthreading helps (hopefully it's a 
better implementation of SMT than the P4 had) and exactly how the memory 
system works.  Things like how fast do 1, 2, 4, 8, or 16 threads fetch a random 
cache line?  A sequential one?  From L1, L2, L3, and main memory?  Things like:
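
For what it's worth, the usual way to get at the random vs. sequential 
cache-line question is a pointer chase: link the cache lines of a buffer into 
one big cycle, follow the links, and divide elapsed time by the number of hops. 
Here is a minimal single-threaded sketch of that idea in Python.  It is not 
from Intel or the IDF slides; the names build_chain, chase, and LINE_SLOTS are 
made up for illustration, and the 64-byte line size is an assumption.  
Interpreter overhead will dominate the absolute numbers, so treat it as showing 
the method only.

```python
import random
import time
from array import array

# Assumption: 64-byte cache lines and 8-byte array entries, so every
# LINE_SLOTS-th entry starts a new line (array alignment is not guaranteed).
LINE_SLOTS = 8


def build_chain(n_lines, sequential, seed=0):
    """Link n_lines cache lines into a single cycle.

    The first entry of each line holds the index of the next line to visit.
    sequential=True visits lines in address order, otherwise in shuffled order.
    """
    order = list(range(n_lines))
    if not sequential:
        random.Random(seed).shuffle(order)
    nxt = array("q", [0]) * (n_lines * LINE_SLOTS)
    for k, line in enumerate(order):
        nxt[line * LINE_SLOTS] = order[(k + 1) % n_lines] * LINE_SLOTS
    return nxt


def chase(nxt, steps):
    """Follow the chain for `steps` hops; return (final index, ns per hop)."""
    i = 0
    t0 = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]  # each load depends on the previous one
    elapsed = time.perf_counter() - t0
    return i, elapsed * 1e9 / steps


if __name__ == "__main__":
    # Working-set sizes picked to land roughly in L1, L2, L3, and main memory
    # on a typical machine; adjust for the cache sizes of the chip under test.
    for size_kib in (16, 256, 4096, 65536):
        n_lines = size_kib * 1024 // (LINE_SLOTS * 8)
        for sequential in (True, False):
            chain = build_chain(n_lines, sequential)
            _, ns = chase(chain, 1_000_000)
            kind = "sequential" if sequential else "random"
            print(f"{size_kib:6d} KiB  {kind:10s}  {ns:8.1f} ns/hop")
```

Because each load depends on the one before it, the hardware can't overlap the 
fetches, which is what makes this a latency test and not a bandwidth test.  
Real numbers, and the 1 to 16 thread runs, would need the same loop in C with 
one chain per thread.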

Current rumors claim the desktop chip (Core i7) is due in week 46, but recent 
news claims week 47, around Nov 17th.  No idea when the workstation/server 
version will be out, but the desktop chip should at least give a good idea of 
where the server version will be performance-wise.
