[Beowulf] More AMD rumors
deadline at eadline.org
Mon Nov 19 10:06:06 PST 2012
> On Mon, 19 Nov 2012, Vincent Diepeveen wrote:
>> If you measure memory latency at all 8 cores at the same time, it's
>> even more horrible.
> Thanks for a remarkably clear and useful reply, Vincent. This nearly
> precisely mirrors my own measurements with a more floating point
> intensive task. The larger i7-3770 cache and its 8 operational contexts
> (it is a four core system but it maintains two completely independent
> contexts per core, IIRC) seem to give it an overwhelming advantage over
> the FX with its eight "real" cores but much smaller cache. Interesting
> to see that this continues with the (I assume) integer/logic intensive
> chess code.
This may be of interest:
> Basically, the i7 looks like a butt-kicking good processor, with the one
> problem being that it doesn't look like a multiprocessing cpu (at least
> I can't find a dual i7 motherboard, although in principle it appears to
> be possible, leaving one with Xeons that don't LOOK like they would
> perform as well although I'd be interested in information on that as
> well).
> At the moment, single processor i7's look like they might actually be
> the world's fastest, at least on a per core basis. OTOH, it might well
> be that putting two of them on a single board would horribly saturate
> the memory bus and cause memory management collisions and worse and cost
> them their advantage.
> I'm getting ready to do some very data intensive stuff -- terabyte-scale
> datasets being chewed to pieces basically -- to the point where my
> "cluster" will probably be a pile of RAIDs each with its own private
> copy of the datasets in question and equipped with an i7 motherboard,
> which seems odd somehow (as the i7 motherboards aren't generally
> configured as "server" motherboards) but the Xeons all run at lower
> clock and are older technology.
Intel has a single-socket Xeon (E3-12XX series, Sandy/Ivy Bridge)
that will work on single-socket motherboards. Designed mostly for
the small office/home server, these have more "server" features,
basically ECC, and cost slightly more than the i5/i7 series. They
are lower power as well.
> Comments from anyone else?
>>> I would have hoped that AMD would dig in and innovate and
>>> regain at least parity if not the lead, because it is good for the
>>> industry for Intel to have serious competition, but while Intel could
>>> make money and survive as second best to AMD, AMD can't make any money
>>> as second best to Intel...
>> Of course we must split HPC performance into 2 worlds.
>> In fact there are 3, but let's do a rough 2-world division:
>> a) floating point or vectorized performance (can be integers as well)
>> We skip a): the manycores have won there.
>> b) integer performance, non-vectorized
>> For integers and branches, take a huge program like Diep.
>> More is better.
>> i7-3960X-EE         : 2.0  million chess positions a second (12 logical cores)
>> i7-980x turbo       : 1.85 million chess positions a second (12 logical cores)
>> i7-3770k            : 1.47 million chess positions a second (8 logical cores)
>> AMD Phenom X6 1100T : 1.34 million chess positions a second (6 cores)
>> AMD Phenom X6 1090T : 1.30 million chess positions a second (6 cores)
>> FX-8150             : 1.22 million chess positions a second (8 mini cores)
>> The FX-8150 is AMD's latest 'bulldozer' CPU.
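The list above can be normalized by the number of hardware contexts each chip exposes. What follows is only a rough sketch of that arithmetic in Python, using the figures exactly as quoted. The names `results` and `per_context` are made up for illustration, and the i7-3770k count is assumed to be 8 logical cores because the quoted line is cut off. Hyperthreaded logical cores are not equivalent to physical cores, so treat the output as a crude comparison only.

```python
# Throughput figures as quoted in the post, in chess positions per second,
# paired with the core count given on the same line.
# Assumption: the i7-3770k entry is truncated in the post; 8 logical cores
# is assumed here.
results = {
    "i7-3960X-EE": (2.00e6, 12),      # logical cores
    "i7-980x turbo": (1.85e6, 12),    # logical cores
    "i7-3770k": (1.47e6, 8),          # logical cores (assumed)
    "Phenom X6 1100T": (1.34e6, 6),   # physical cores
    "Phenom X6 1090T": (1.30e6, 6),   # physical cores
    "FX-8150": (1.22e6, 8),           # "mini cores"
}


def per_context(positions_per_sec, contexts):
    """Positions per second divided evenly over the hardware contexts."""
    return positions_per_sec / contexts


per_ctx = {cpu: per_context(*figures) for cpu, figures in results.items()}

# Print highest per-context throughput first.
for cpu, rate in sorted(per_ctx.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cpu:16s} {rate / 1e3:7.1f}k positions/s per context")
```

On these numbers the FX-8150 comes out lowest per context, which is consistent with the point that it does not beat the older Phenom design.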
>> The problem is that the new-generation FX-8150, on a NEW process
>> technology and with 2 billion transistors or so (caches counted;
>> that is the figure from AMD's initial press release, not the later
>> one where, by creatively not counting things, they reached 1.2
>> billion), is not beating their own old design.
>> Furthermore another big problem is power usage.
>> Under full load:
>> Phenom X6 1090T : 69.6 watt,
>> Phenom X6 1100T : 92 watt
>> We see how the 1100T already was clocked a tad too high by AMD, which
>> explains the huge power increase.
>> Now the FX-8150 : 115.2 watt
>> As if Moore's Law guaranteeing progress doesn't exist...
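Putting the full-load power figures next to the throughput figures for the three AMD parts gives a crude positions-per-second-per-watt number. This is a minimal sketch of that division, assuming the quoted wattages and throughputs were measured under comparable load; the post does not say how the power was measured. The names `amd` and `positions_per_watt` are hypothetical.

```python
# (chess positions per second, watts under full load), both as quoted above.
# Assumption: both figures refer to the same kind of full-load run.
amd = {
    "Phenom X6 1090T": (1.30e6, 69.6),
    "Phenom X6 1100T": (1.34e6, 92.0),
    "FX-8150": (1.22e6, 115.2),
}


def positions_per_watt(positions_per_sec, watts):
    """Throughput divided by full-load power draw."""
    return positions_per_sec / watts


per_watt = {cpu: positions_per_watt(*figures) for cpu, figures in amd.items()}

for cpu, value in per_watt.items():
    print(f"{cpu:16s} {value:8.0f} positions/s per watt")
```

By this measure the 1090T delivers roughly 18,700 positions per second per watt, the 1100T about 14,600, and the FX-8150 about 10,600.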
>> As for you, maybe multiplication was important in many of the
>> benchmarks you did. Each minicore has its own multiplication unit.
>> Sounds good, huh?
>> So far the good news. The problem is that this unit is also over
>> 2 times slower...
>> Please note that Bulldozer does have AVX. From benchmarks we know
>> that both Intel and AMD, with this Bulldozer, have tried to
>> optimize performance for games, especially games using AVX.
>> It's not doing badly there in fact, though worse than the quad-core
>> Intels. I don't want a quad-core chip though.
>> I want a million cores.
>>>> Mailscanner: Clean
>>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>>>> To change your subscription (digest mode or unsubscribe) visit
>>> Robert G. Brown http://www.phy.duke.edu/~rgb/
>>> Duke University Dept. of Physics, Box 90305
>>> Durham, N.C. 27708-0305
>>> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu