[Beowulf] Themes for a talk on beowulf clustering
Lux, Jim (337C)
james.p.lux at jpl.nasa.gov
Mon Mar 4 16:06:19 PST 2013
I think the change in scale over the past 10-15 years is interesting, and especially the changes in architecture that result from this.
Going from 8-16 processors to 1000s is a big change. Bisection bandwidth on your comm fabric becomes an issue. How do you boot? 8 processors can be booted sequentially or simultaneously from a server. For 1000 you need a "better way". How do you feed files to/from a 1000-processor cluster?
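To put rough numbers on the boot question, here is a toy model and nothing more. It compares a server booting nodes one after another with a scheme where every node that already has the image passes it on to one more node per round. The 30 seconds per image transfer and the doubling scheme itself are assumptions for illustration, not a description of any particular cluster's boot mechanism.

```python
def sequential_boot_seconds(nodes, seconds_per_node):
    """One server hands the image to each node in turn."""
    return nodes * seconds_per_node


def doubling_rounds(nodes):
    """Rounds needed when every image holder serves one new node per round.

    The boot server counts as the first holder, so the number of
    holders doubles each round.
    """
    holders = 1  # the boot server
    rounds = 0
    while holders - 1 < nodes:  # holders - 1 = nodes booted so far
        holders *= 2
        rounds += 1
    return rounds


SECONDS_PER_NODE = 30  # assumed transfer time, purely illustrative

for n in (8, 1000):
    seq = sequential_boot_seconds(n, SECONDS_PER_NODE)
    dbl = doubling_rounds(n) * SECONDS_PER_NODE
    print(f"{n} nodes: sequential {seq} s, doubling {dbl} s")
```

Under these assumptions the sequential time grows linearly with node count while the doubling scheme grows with its logarithm, which is why the small-cluster approach stops being tolerable at 1000 nodes.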
Issues with checkpoint/restart/reliability. We had a project here at JPL looking at replacing the big 70 meter dishes with an array of, say, 100 6-12 meter dishes. Replacing the single custom box with lots of commodity things (6-12 meter antennas are stamped out by the hundreds). Very Beowulf'y in concept.
Turns out that cryocoolers (needed to keep the receiver at a nice toasty 4 kelvin) aren't really a mass-produced item, and at the observed failure rates, you'd have a hard time keeping enough of them working to do what you needed. A failure rate of once a month (or something.. I don't know what the actual rates are) on the 70m antenna means you can have a spare and swap it in, and then you basically have a month to fix the broken one. With 100 antennas and a cryocooler MTBF of 0.5 years, you'll have 4 broken coolers at any given time.
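As a rough sanity check on the cooler arithmetic, here is a minimal sketch of the steady-state estimate: expected broken units is about the fleet failure rate times the repair time, which holds when the repair time is short compared with the MTBF. The 100 antennas and 0.5 year MTBF are the figures above; the repair turnaround is not stated, so the one-week and one-month values below are assumptions. A one-week turnaround happens to reproduce a figure of about 4.

```python
def expected_broken(units, mtbf_years, repair_years):
    """Approximate steady-state count of failed units.

    failures per year across the fleet = units / MTBF
    each failure stays broken for repair_years
    (approximation valid when repair_years is much smaller than mtbf_years)
    """
    failures_per_year = units / mtbf_years
    return failures_per_year * repair_years


ANTENNAS = 100      # from the post
MTBF_YEARS = 0.5    # from the post

one_week = 1.0 / 52.0    # assumed repair turnaround
one_month = 1.0 / 12.0   # assumed repair turnaround

print(round(expected_broken(ANTENNAS, MTBF_YEARS, one_week), 1))
print(round(expected_broken(ANTENNAS, MTBF_YEARS, one_month), 1))
```

With these inputs the fleet sees 200 failures a year, so a month-long repair cycle that is comfortable for a single 70m antenna would leave many more coolers down at once in the array.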
Another theme: the practical differences in experience between assembling a toy cluster of 4-8 processors and simulating it with VM instances on a single machine. What you learn from the former that you don't get from the latter (the importance of labeling cables, for instance).
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Andrew Holway
Sent: Sunday, March 03, 2013 12:24 AM
Subject: [Beowulf] Themes for a talk on beowulf clustering
I am giving a talk on Beowulf clustering to a local LUG and was wondering if you had some interesting themes that I could talk about.
ta for now.