[Beowulf] Which is better: 32x2 or 64x1

Michael Will mwill at penguincomputing.com
Thu Apr 27 16:18:07 PDT 2006

It does not matter as long as you buy from us.


No, really, it depends on your applications. Are they CPU intensive, RAM
intensive, I/O intensive?
Do they have a lot of interprocess communication going on? In what size

The technical parameters that those questions let you weigh are:
1. core speed
The fastest dual core CPU you can buy is the 285 (2x 2.6 GHz) and the
fastest single core is the 254 (1x 2.8 GHz).
For some reason we already have the 4-way Opteron 856 at 3.0 GHz but not the
So generally speaking, the dual core CPUs are one or two speed steps
behind the single core versions.

Price-wise, in the middle of the range, the Opteron 250 (one 2.4 GHz core
per CPU) costs about the same as the Opteron 265 (two 1.8 GHz cores per
CPU). If you only run one thread on the node, the dual core system
delivers only 75% of the speed of the single core system.
However, if you run four threads on the dual core system, then the total
performance in the best case could be 150% of the single core system
(4 x 1.8 GHz against 2 x 2.4 GHz), if it scales linearly
and there is no contention for I/O or memory.
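
As a rough sketch of that comparison, the snippet below estimates the
relative speed of the two dual-socket nodes for a given number of busy
threads. It assumes throughput scales purely with clock speed and core
count, which real applications rarely do; the function names are made up
for illustration.

```python
# Back-of-envelope comparison of two dual-socket nodes, assuming throughput
# scales linearly with clock speed and core count (no I/O or memory contention).

def node_throughput(sockets, cores_per_socket, ghz, threads):
    """Aggregate GHz in use when `threads` busy threads run on the node."""
    active_cores = min(threads, sockets * cores_per_socket)
    return active_cores * ghz

# Clock speeds from the text: Opteron 250 (1 x 2.4 GHz), Opteron 265 (2 x 1.8 GHz).
single_core_node = dict(sockets=2, cores_per_socket=1, ghz=2.4)
dual_core_node = dict(sockets=2, cores_per_socket=2, ghz=1.8)

def relative_speed(threads):
    """Dual core node throughput as a fraction of the single core node's."""
    return (node_throughput(threads=threads, **dual_core_node)
            / node_throughput(threads=threads, **single_core_node))

for threads in (1, 2, 4):
    print(f"{threads} thread(s): {relative_speed(threads):.0%}")
```

Plugging in the clock speeds and prices of the parts you are actually
quoted gives a quick first filter before running real benchmarks.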

2. memory architecture.
A node with two single core Opteron CPUs has one memory controller per
core with two channels each, whereas
a node with two dual core Opteron CPUs still has only one memory
controller per CPU socket, so two cores
share one memory controller. That can lead to a slowdown if all four
cores are accessing RAM most of the time.
Examples are signal processing style applications that mostly iterate
over data that does not fit into cache and so
hit the RAM all the time.
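
To put a number on the sharing, here is a minimal sketch of the peak
memory bandwidth each busy core can count on. The 3.2 GB/s per channel
figure is an assumption (the theoretical peak of DDR400) and is not from
this post; sustained bandwidth will be lower, and the even split between
cores is a simplification.

```python
# Peak memory bandwidth available per busy core, assuming each socket has one
# memory controller and its bandwidth is split evenly among that socket's cores.

GB_PER_S_PER_CHANNEL = 3.2  # assumption: DDR400 theoretical peak, not from the post

def bandwidth_per_core(channels_per_controller, cores_per_socket):
    """Even share of one socket's controller bandwidth, in GB/s per core."""
    controller_bw = channels_per_controller * GB_PER_S_PER_CHANNEL
    return controller_bw / cores_per_socket

single = bandwidth_per_core(channels_per_controller=2, cores_per_socket=1)
dual = bandwidth_per_core(channels_per_controller=2, cores_per_socket=2)
print(f"single core: {single:.1f} GB/s per core")
print(f"dual core:   {dual:.1f} GB/s per core")
```

The halving only bites when both cores on a socket are streaming from RAM
at the same time; cache-friendly codes will not notice it.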

3. I/O
3.1 interconnect
If you happen to have a job that uses four cores, and it can run
within a single node with two dual core CPUs,
doing message passing in RAM instead of having to go through Ethernet
or InfiniBand between two separate
single core nodes, you will see a speedup. There also are 

The more typical case is that either you don't do much
interprocess communication, in which case it does not matter, or
you do a lot of it, in which case having four processes going through
the same NIC instead of just two is a disadvantage (network I/O bound

If you have up to 8 processes talking in one group, buying quad-opteron
dual core nodes will result in the fastest

3.2 disk I/O 
The same goes for accessing local disk from four threads instead of two:
dual core is a disadvantage if you
are I/O bound. One way to mitigate that is to use 1U nodes with four
drives and have each process use its own
drive with a separate filesystem for local scratch storage.
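
One way this could look in practice is sketched below: each process on a
node picks its scratch directory from its node-local rank. The mount
points /scratch0 to /scratch3 and the scratch_dir helper are hypothetical,
and how you obtain the local rank depends on your job launcher.

```python
import os

# Hypothetical mount points, one filesystem per physical drive in the node.
SCRATCH_MOUNTS = ["/scratch0", "/scratch1", "/scratch2", "/scratch3"]

def scratch_dir(local_rank, mounts=SCRATCH_MOUNTS):
    """Pick a scratch directory so that local processes spread across drives."""
    mount = mounts[local_rank % len(mounts)]
    return os.path.join(mount, f"rank{local_rank}")

for rank in range(4):
    print(rank, scratch_dir(rank))
```

With four processes and four drives, every process gets a drive to
itself; with more processes than drives they wrap around and share.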

Michael Will / SE Technical Lead / Penguin Computing /
-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of hpc at gurban.org
Sent: Wednesday, April 26, 2006 9:05 AM
To: beowulf at beowulf.org
Subject: [Beowulf] Which is better: 32x2 or 64x1


I'm going to build a Beowulf Linux cluster for my college. They want a
64-core cluster with Opteron processors.
Now I want to know which is better (performance & price)?
  * 64 x Opteron Single Core
  * 32 x Opteron Dual Core


Gurban M. Tewekgeli
