[Beowulf] A Cluster of Motherboard.

Mark Hahn hahn at physics.mcmaster.ca
Thu Nov 10 09:09:18 PST 2005


> Yeah, I think that Jim's observation that you should think carefully
> about the diminishing returns of building a freeform caseless cluster is

it's a very seductive idea - I feel the pull towards hackerish approaches
myself.  I think the main attraction is that the hardware _starts_out_ 
so remarkably cheap, in contrast to the final prices of a "professional"
cluster.

for instance, suppose someone pays $US 7M for a 1500-cpu cluster.
jeez, that sounds horrible, nearly $5k per cpu!  let's add it up:
$500 for the cpu, half of a motherboard, half a chassis/ps.
by that reckoning, street prices are only ~20% of the actual cost.
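
a rough sketch of that arithmetic, in case it helps.  the $7M total, the
1500 cpus and the $500 cpu price are the figures above; the motherboard
and chassis/ps prices are hypothetical placeholders I picked for
illustration (assuming dual-socket nodes), not quoted street prices.

```python
# back-of-envelope check of the per-cpu arithmetic.
# total price, cpu count and cpu price come from the post; the
# motherboard and chassis/ps prices are hypothetical placeholders.

TOTAL_PRICE_USD = 7_000_000
CPU_COUNT = 1500

CPU_PRICE_USD = 500            # from the post
MOTHERBOARD_PRICE_USD = 400    # assumption: dual-socket board
CHASSIS_PS_PRICE_USD = 400     # assumption: chassis plus power supply


def paid_per_cpu(total, cpus):
    """price actually paid, spread evenly over every cpu."""
    return total / cpus


def street_per_cpu(cpu, motherboard, chassis_ps):
    """parts cost per cpu when two cpus share one board and one chassis/ps."""
    return cpu + motherboard / 2 + chassis_ps / 2


paid = paid_per_cpu(TOTAL_PRICE_USD, CPU_COUNT)
street = street_per_cpu(CPU_PRICE_USD, MOTHERBOARD_PRICE_USD, CHASSIS_PS_PRICE_USD)
fraction = street / paid

print(f"paid per cpu:   ${paid:,.0f}")
print(f"street per cpu: ${street:,.0f}")
print(f"street / paid:  {fraction:.0%}")
```

with those placeholder prices the parts come to about $900 per cpu
against roughly $4,667 paid, i.e. about 19%.  change the two assumed
prices and the ratio moves, but it stays far below 100% for any
plausible commodity parts.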

that ignores many serious things.  for one, random hardware from
pricewatch doesn't come with 3-year 9-5 NBD support, and, in any case,
probably has noticeably lower MTBF.  further, this is a large cluster,
intended for large, tightly-coupled jobs.  that means lots of high-end
networking and fileserving hardware, not gigabit and a couple NFS boxes.

Google's approach is excellent, since they are not running large/tight
jobs, and have no reason to demand GB/s interconnect or multi GB/s writes 
to a single file.  they are a great example of taking advantage of 
computers by the pound; the point here is that HPC can't do that as 
easily.

some people can.  for instance, if you're into running billions of tiny
MC simulations, you're close enough to Google's workload that you could 
copy them.  you'd want to look into questions of manageability, of course,
especially WRT service.  and you'd still need to worry about overall
power/space/cooling issues.

in summary, subtracting the chassis sounds smart, but really only makes
sense if you follow through with the rest - cheap motherboard, cheap cpu,
minimal cpu, minimal network, cheap labor, workload that is embarrassingly
parallel, and not long-running...

regards, mark hahn.





