[Beowulf] Barcelona vs. Woodcrest, computational chemistry research

Vincent Diepeveen diep at xs4all.nl
Wed Sep 26 05:55:10 PDT 2007

Hi Andrew,

Things largely depend upon your workload and number of sockets.

For integer code, Intel definitely has about 50% higher IPC and will most 
likely clock higher too. We were shocked to see that Sjeng in SPECint gets 
anywhere from 10% worse to 5% better IPC (depending on whether you take the 
base or the peak result) on the 1.9GHz 2-socket Barcelona machine than on 
its older K8 brother, which is listed at SPECint as well (clocked at 3GHz).

It is not yet clear what causes this big difference. One possible reason is 
that, for Sjeng, a load from memory cannot be done speculatively ahead of a 
write, whereas Core 2 can actually do this.

In floating point I'd argue that Intel will have problems equalling 
Barcelona's IPC, yet Intel clocks so much higher that running a test there 
should be easy.
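
As a rough back-of-the-envelope aid for the IPC versus clock tradeoff: if 
you model per-core throughput as IPC times clock frequency (an assumption 
that ignores memory bandwidth and everything else), the IPC advantage the 
lower-clocked chip needs in order to break even is just the ratio of the 
clocks. The function name below is made up for illustration; the 1.9 and 
2.66 GHz figures are the ones from the original question.

```python
def breakeven_ipc_ratio(slow_clock_ghz: float, fast_clock_ghz: float) -> float:
    """IPC advantage the slower-clocked CPU needs to match the faster one.

    Assumes throughput = IPC * clock, which is a simplification: it ignores
    memory bandwidth, cache behaviour and anything else workload specific.
    """
    if slow_clock_ghz <= 0 or fast_clock_ghz <= 0:
        raise ValueError("clock frequencies must be positive")
    return fast_clock_ghz / slow_clock_ghz


# 1.9 GHz Barcelona against 2.66 GHz Woodcrest, as in the original question.
ratio = breakeven_ipc_ratio(1.9, 2.66)
print(f"Barcelona needs {ratio:.2f}x Woodcrest's IPC to break even")
```

So under this simple model Barcelona would need roughly 40% higher IPC on 
your code just to draw level per core, which is something only a benchmark 
of your own application can confirm or refute.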

Additionally, AMD already has a working quad-core CPU as of now, whereas 
Intel doesn't yet. So whether Intel will be in time producing one and 
scaling it to 3GHz by the start of 2008 we cannot foresee yet, whereas for 
AMD it should now be a more straightforward procedure to produce one clocked 
at 3GHz+ at 45 nm.

It would be a very positive surprise to me if Intel can get even close to 
AMD's IPC in floating point.

Yet ordering CPUs right now might not be such a clever idea, knowing that a 
few months from now both Intel and AMD might have quad-core CPUs in the 
shops that are clocked at least 50% higher than today's.

I'd argue that if you have no need for nodes with more than 1 socket, Intel 
is the most interesting as of now. At 2 sockets it is very unclear what's 
faster right now in floating point; at 4 sockets AMD is the absolute king, 
of course. Having 16 cores on 1 memory controller until end 2009 (make that 
2010 until it's in the shops), which is basically what Intel offers, is 
hard to see as a scalable solution. AMD with 4 memory controllers 
definitely does better there.
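
To put the 4-socket comparison in numbers, here is a tiny sketch of how many 
cores end up sharing each memory controller. It assumes 4 sockets with 4 
cores each, one shared controller on the Intel side and one controller per 
socket on the AMD side, with memory traffic spread evenly; the helper name 
is made up for illustration and says nothing about actual bandwidth.

```python
def cores_per_controller(total_cores: int, memory_controllers: int) -> float:
    """Average number of cores competing for one memory controller."""
    if total_cores <= 0 or memory_controllers <= 0:
        raise ValueError("core and controller counts must be positive")
    return total_cores / memory_controllers


SOCKETS = 4           # assumed 4-socket node
CORES_PER_SOCKET = 4  # assumed quad-core parts
total = SOCKETS * CORES_PER_SOCKET

intel_share = cores_per_controller(total, 1)      # one shared controller
amd_share = cores_per_controller(total, SOCKETS)  # one controller per socket

print(f"Intel: {intel_share:.0f} cores per memory controller")
print(f"AMD:   {amd_share:.0f} cores per memory controller")
```

The more cores that share one controller, the less of its bandwidth each 
core can count on, which is why this matters mostly for memory-bound codes.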

If your software doesn't depend much on the memory controller, then of 
course it's very hard to foresee what will be faster in 2008.

As of now, Barcelona is clocked very low. Just like Itanium solutions, I 
don't take it very seriously unless you need 4-socket machines, where it is 
worth very serious consideration.

Another question I actually have is: when will we see DDR3 RAM in machines?

We see GPUs using DDR3 RAM right now. With so many cores so soon, not a 
single machine actually has enough bandwidth to the RAM as compared to a 

For number crunching, DDR3 RAM might be very interesting, and solutions with 
many memory controllers for 1 CPU are really interesting as well.

In that sense Intel is totally outdated, introducing CSI somewhere at the 
end of 2009 in the x64 market, so it will probably be in production by 2010.

On the other hand, Barcelona IMHO is a big bummer in terms of performance, 
simply because it clocks at only 1.9GHz.

----- Original Message ----- 
From: "andrew holway" <andrew at moonet.co.uk>
To: <beowulf at beowulf.org>
Sent: Wednesday, September 26, 2007 12:51 PM
Subject: [Beowulf] Barcelona vs. Woodcrest, computational chemistry research

> Hi,
> I am looking at chips for a small 20 node cluster for computational
> chemistry. I'm trying to get my head around the various merits of the
> different ways the caches work.
> Considering the fairly chunky bits of data that are going to be going
> through, does anyone have any opinion on how the new 1.9 Barcelona
> will perform against a 2.66 Woodcrest? Is the Intel front side bus
> going to be a significant bottleneck for this type of application?
> Cheers
> Andy
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
