[Beowulf] $2500 cluster. What it's good for?

Mon Dec 20 05:56:39 PST 2004

On Sun, 19 Dec 2004, Jim Lux wrote:

--snip--

> 
> This brings up an interesting optimization question. Just like in many
> things (I'm thinking RF amplifiers in specific) it's generally cheaper/more
> cost effective to buy one big thing IF it's fast enough to meet the
> requirements. Once you get past what ONE widget can do, then, you're forced
> to some form of parallelism or combining smaller widgets, and to a certain
> extent it matters not how many you need to combine (to an order of
> magnitude).   The trade comes from the inevitable increase in system
> management/support/infrastructure to support N things compared to supporting
> just one. (This leaves aside high availability/high reliability kinds of
> things).
> 
> So, for clusters, where's the breakpoint?  Is it at whatever the fastest
> currently available processor is?   This is kind of the question that's been
> raised before.. Do I buy N processors now with my grant money, or do I wait
> a year and buy N processors that are 2x as fast and do all the computation
> in the second of two years?  If one can predict the speed of future
> processors, this might guide you whether you should wait for that single
> faster processor, or decide that no matter if you wait 3 years, you'll need
> more than the crunch of a single processor to solve your problem, so you
> might as well get cracking on the cluster.
> 
> Several times, I've contemplated a cluster to solve some problem, and then,
> by the time I had it all spec'd out and figured out and costed, it turned
> out that I'd been passed by AMD/Intel, and it was better just to go buy a
> (single) faster processor.  There are some interesting power/MIPS trades
> that are non-obvious in this regime, as well as anomalous application
> environments where the development cycle is much slower (not too many "Rad
> Hard" Xeons out there).
> 
> There are also inherently parallel kinds of tasks where you want to use
> commodity hardware to get multiples of some resource, rather than some
> special purpose thing (say, recording multi-track audio or the
> aforementioned video wall). Another thing is some sort of single input
> stream, multiple parallel processes for multiple outputs. High performance
> speech recognition might be an example.
> 
> What about some sort of search process with applicability to casual users
> (route finding for robotics or such...)
> 

Jim,

Here is my "soap box" speech about this issue. 

The question of a cluster versus next year's processor has always been a
worthwhile consideration. For modestly parallel programs, say 3-4 times
faster on 6-8 processors, this is definitely an issue. If, however, you
are seeing a 30-40 times speedup on 60-80 processors (on a problem that
will not fit on a workstation), then next year's model will not help
much. Of course, the cost of going 30-40 times faster may be an issue
for some.

For small clusters this is more of an issue. For instance, on our $2500
cluster, we have eight 1.75 GHz Semprons and 2304 MB of RAM. Using a very
naive argument that we have 14.00 GHz to apply to a problem (or some other
metric that is 8 times a single CPU), then if we can achieve 50%
scalability (4 times faster on 8 CPUs) we are getting 7 GHz out of the
system. I would *guess* that this is close to a dual-CPU desktop box. Of
course, highly scalable applications would push the cluster ahead.
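
To make the naive arithmetic above explicit, here is a small Python
sketch. The function name and the idea of an "effective GHz" figure are
only illustrative; clock rate is a crude stand-in for real application
performance.

```python
def effective_ghz(n_cpus, clock_ghz, efficiency):
    """Naive 'GHz delivered' estimate: aggregate clock times parallel efficiency."""
    return n_cpus * clock_ghz * efficiency

# Numbers from the $2500 cluster: eight 1.75 GHz Semprons.
aggregate = effective_ghz(8, 1.75, 1.0)   # perfect scaling
at_half = effective_ghz(8, 1.75, 0.5)     # 50% scalability (4x on 8 CPUs)

print(f"aggregate: {aggregate:.2f} GHz, at 50% efficiency: {at_half:.2f} GHz")
```

Raising the efficiency argument toward 1.0 shows how quickly a highly
scalable code pulls ahead of the single-box figure.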

Now, it gets more interesting when you ask "Well, should I wait for next
year and get faster processors for my $2500 cluster?" As always, it
depends. If all that changes is faster CPUs (and let's assume the memory
gets faster as well), then with the same interconnect, GigE, the
scalability of some applications drops and a cluster may not be the best
choice.
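
A toy Python model of that effect, under loudly stated assumptions: run
time is perfectly divisible compute work plus a fixed communication cost
that does not shrink when the CPUs get faster. The work and
communication numbers are made up, chosen only so that today's case
lands on the 50% scalability used above.

```python
def speedup(work, n_cpus, cpu_speed, comm_time):
    """Speedup over one CPU of the same speed, with a fixed communication cost."""
    t_single = work / cpu_speed
    t_cluster = work / (n_cpus * cpu_speed) + comm_time
    return t_single / t_cluster

# Hypothetical values: 100 units of work, 12.5 time units spent on the network.
today = speedup(100.0, 8, 1.0, 12.5)      # current CPUs
next_year = speedup(100.0, 8, 2.0, 12.5)  # CPUs twice as fast, same GigE

print(f"speedup today: {today:.2f}x, with 2x faster CPUs: {next_year:.2f}x")
```

In this model the cluster still finishes sooner with the faster CPUs,
but its advantage over a single box of the same generation shrinks,
because the network cost becomes a larger share of the run time.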

These types of arguments have been important "parallel computing" issues
for quite some time. However, they rested on a Moore's Law assumption
that single CPU speed would keep increasing. This assumption has held up
until now. The introduction of dual-core processors is an indication that
scaling up frequency is harder than scaling out processors. So now the
question will become: is it better to have two quad boxes (two
dual-socket motherboards with dual-core processors), four dual boxes
(four single-socket motherboards with dual-core processors), or eight
single boxes? Who knows?
What I do know is that the issues we have been talking about on this
little list will very soon become big issues for the rest of the market.

Doug
----------------------------------------------------------------
Editor-in-chief                   ClusterWorld Magazine
Desk: 610.865.6061                            
Fax:  610.865.6618                www.clusterworld.com