[Beowulf] dual-core benefits?

Tahir Malas tmalas at ee.bilkent.edu.tr
Fri Sep 23 05:59:12 PDT 2005

> -----Original Message-----
> From: Mark Hahn [mailto:hahn at physics.mcmaster.ca]
> Sent: Friday, September 23, 2005 12:07 AM
> To: Tahir Malas
> Cc: beowulf at beowulf.org
> Subject: Re: [Beowulf] dual-core benefits?
> > 1. The scalability of our program is not so good, less than 20 for 32
> > nodes (measured on a single node system). So we don't plan to go beyond
> > 16 nodes. (which makes 32 processors due to dual-node usage)
> is this scalability assuming a slow interconnect like gigabit?
Yes, gigabit on Pentium 4 cluster.
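
To put that scalability figure in perspective, here is a minimal sketch in Python. It assumes Amdahl's law applies, which is only an approximation because communication overhead on gigabit usually grows with node count. It also takes the quoted upper bound of 20 as the measured speedup. The function names are illustrative only.

```python
def parallel_efficiency(speedup, nodes):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / nodes


def amdahl_serial_fraction(speedup, nodes):
    """Solve S = 1 / (f + (1 - f) / N) for the serial fraction f."""
    return (nodes / speedup - 1.0) / (nodes - 1.0)


def amdahl_speedup(serial_fraction, nodes):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


measured_speedup = 20.0  # upper bound from the post ("less than 20")
measured_nodes = 32

eff = parallel_efficiency(measured_speedup, measured_nodes)
f = amdahl_serial_fraction(measured_speedup, measured_nodes)
s16 = amdahl_speedup(f, 16)  # assumption: same serial fraction at 16 nodes

print(f"efficiency at 32 nodes: {eff:.3f}")
print(f"implied serial fraction: {f:.4f}")
print(f"predicted speedup at 16 nodes: {s16:.1f}")
```

Under these assumptions the efficiency at 32 nodes is at most about 62%, and a serial fraction of only about 2% is enough to explain it.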

> have you considered when it would be appropriate to go to something fast?
Yes, that is probably something we will consider after trying gigabit with
two network interfaces per motherboard.

> (myrinet, infinipath and quadrics are my favorite, though the latter
> especially is always difficult to squeeze into a budget.)  it's really
> excellent if you can characterize your code based on distribution of
> packet sizes, so you can trade off the latency/bandwidth properties of
> various interconnect options.  any recognizable communication patterns
> (esp nearest-neighbor) can pay off as well.
> > 2. Memory requirement is huge; we will use 4GB memory per node for the
> > time being and increase this to 16 GB later. So we need fast CPUs and
> > efficient usage of memory.
> these days, that's not huge - after all, 1GB dimms are definitely
> "above the knee" (in the linear region, price-wise.)  what I find is that
> there is (continued) divergence between small and large-memory kinds of
> applications.  people who do MC-type stuff continue to need only a few
> handfuls of MB, whereas memory-intensive apps would like 1000x as much.
But we are still limited to 16 GB per node (for dual-node mbs.), and for
that we have to use costly 2GB RAM modules, which are around $1000. So the
memory cost is likely to be larger than the CPU cost.
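
As a rough illustration of that memory budget, here is a small Python sketch. It assumes the $1000 figure is the price of one 2GB module and that all 16 planned nodes are filled to 16 GB. Both are assumptions, not quotes from a vendor.

```python
node_memory_gb = 16      # target memory per node, from the post
module_size_gb = 2       # module size, from the post
module_price_usd = 1000  # assumption: the quoted price is per module
node_count = 16          # planned node count, from the post

modules_per_node = node_memory_gb // module_size_gb
memory_cost_per_node = modules_per_node * module_price_usd
memory_cost_cluster = memory_cost_per_node * node_count

print(f"modules per node: {modules_per_node}")
print(f"memory cost per node: ${memory_cost_per_node}")
print(f"memory cost for {node_count} nodes: ${memory_cost_cluster}")
```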
> that's why you need to figure out why your scalability is poor.
> on a multiprocessor system, you effectively have a pretty fast, if small,
> interconnect.  if your code can take advantage of that, then going
> dual-core could well be a win.  for instance, if your code is limited
> by short-message, point-to-point latency, then increasing "SMP-ness"
> should help a lot, especially if you are assuming mere gigabit.

Well, actually I'm still not sure about this. The CPUs inside the node will
communicate fast, but then the network will be a bottleneck?
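
One way to reason about this is the usual latency-plus-bandwidth cost model for a single message. The Python sketch below compares an intra-node transfer with a gigabit transfer. The latency and bandwidth values are placeholder assumptions chosen only for illustration, not measurements of any real system, and the function names are hypothetical.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_mb_s):
    """Time for one message: fixed latency plus size divided by bandwidth.

    With MB = 1e6 bytes, 1 MB/s equals 1 byte per microsecond, so
    bytes / (MB/s) is already in microseconds.
    """
    return latency_us + message_bytes / bandwidth_mb_s


def half_time_size_bytes(latency_us, bandwidth_mb_s):
    """Message size at which latency and transfer time are equal."""
    return latency_us * bandwidth_mb_s


# Placeholder (latency in us, bandwidth in MB/s) pairs. ASSUMED values.
links = {
    "intra-node (shared memory)": (1.0, 1000.0),
    "gigabit ethernet": (50.0, 100.0),
}

for size in (64, 1024, 65536, 1048576):
    for name, (latency, bandwidth) in links.items():
        t = transfer_time_us(size, latency, bandwidth)
        print(f"{size:>8} bytes over {name}: {t:10.1f} us")
```

With numbers like these, short messages are dominated by latency, so only the traffic that stays inside a node gets faster. Messages that must cross the network still pay the gigabit cost, so the gain depends on how much of the communication can be kept between processes on the same node.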
> obviously, if your code scales poorly because it's bottlenecked on memory,
> then dual-core is a bad idea.  (actually, if it's bottlenecked on memory
> _latency_, that might not necessarily be true...)
I think this is not the case. Though it requires a lot of memory, it is
quite compute-intensive.

> DC doesn't change memory issues: AMD claims that the chips are slightly
> more efficient (slightly higher aggregate streaming bandwidth), but it
> seems to be a very small factor.  especially if you compare to e-rev
> singlecore chips.  there is a noticeable difference vs older revs,
> especially with lots of memory, since older chips drop down as low as
> PC1600 for a sufficient number of memory banks in use (dimm sides,
> basically.)
> it's not automatic - dual-core is just SMP-in-one-package.  the advantage
> is mainly that DC lets you amortize the other components in the system.
> for programs which are truly limited by memory bandwidth, you really
> don't want to amortize the memory, so DC is a loss in this case.
> bear in mind that DC is also lower clock than SC.
