[Beowulf] Thought that this might be of interest

Vincent Diepeveen diep at xs4all.nl
Mon Nov 6 08:46:12 PST 2006


Thanks for your answer.

Please show us prime95 results (iteration times alone are enough) to
back up your marketing talk about the floating-point unit of
existing K8 chips.

Paper benchmarks of parts that are not even close to being available
in shops happen a bit too often, IMHO.

Even an old 2.4 GHz P4 is faster than my dual-core 2.4 GHz Opteron here for
prime95 and LLR.

A single core of Core2 is 2 times faster than that 2.4 GHz P4.

RAM latency is not that interesting for most SIMD applications such as
prime95. Good latency matters more for integer applications, but even
then we're talking about 5%-10% of the system time at most.

Furthermore, Core2 has an L2 cache 2 times bigger than K8's, which hides
quite some latency.

Last but not least, you should also go sit in a room with no contact
with the rest of the world and ponder the following statement:

"Imagine you are a programmer and the scaling of your number-crunching
application is totally dependent upon bandwidth; shouldn't you take a
different job than programming?"

Vincent

----- Original Message ----- 
From: "Richard Walsh" <rbw at ahpcrc.org>
To: "Vincent Diepeveen" <diep at xs4all.nl>
Cc: "Beowulf Mailing List" <beowulf at beowulf.org>
Sent: Monday, November 06, 2006 4:35 PM
Subject: Re: [Beowulf] Thought that this might be of interest


> Vincent Diepeveen wrote:
>> Thanks for your info, this is very helpful.
>>
>> So until end 2007 the core2 annihilates any opteron system.
>    Nope.  Dual-core socket F does quite a bit to even the score
>    on floating-point, just with the DDR2 latency and bandwidth
>    improvements.  I would not use the word annihilate ... but,
>    Woodcrest still dominates for integer.
>>
>> Except of course when you're interested in just measuring bandwidth.
>>
>> So the K8L should then take the performance lead back from core2.
>    While buying Woodcrest now might make sense on a raw performance
>    basis, I would want to consider carefully the direction that AMD takes
>    you in through 2007.  The socket compatibility of the quad-core
>    Barcelona is an important consideration, especially considering what
>    8 cores on a board probably means for the Intel Clovertown.  A key
>    question, as always, is how well your code uses cache.
>>
>> Wasn't the K8L also going to do 4 instructions per cycle, like core2
>> does?
>    That is not what I have read.  See the presentation ... and other
>    articles on Barcelona.  I see only 3-way superscalar.
>> For my chess application of course more instructions per cycle means 
>> faster.
>    Yes.  I can see that the bandwidth, instruction intensity, and integer
>    profile of a chess application might favor the Woodcrest.  On the other
>    hand, a 4-socket Socket 1207 motherboard with 8 channels to memory and
>    16 cores might be a better alternative at scale than an Intel SMP
>    system.
>
>    rbw
>
> -- 
>
> Richard B. Walsh
>
> "The world is given to me only once, not one existing and one
> perceived. The subject and object are but one."
>
> Erwin Schroedinger
>
> Project Manager
> Network Computing Services, Inc.
> Army High Performance Computing Research Center (AHPCRC)
> rbw at ahpcrc.org  |  612.337.3467



