[Beowulf] Interesting google server design

Simon Hogg seth at hogg.org
Thu Apr 2 06:16:22 PDT 2009

Robert G. Brown wrote:

>On Wed, 1 Apr 2009, Ellis Wilson wrote:
>> Beyond that, building an entire cluster of that size into a large
>> shipping container is genius - given access and resources to a crane and
>> proper machining expertise.  But then again if you're building a cluster
>> of that size I suppose the crane, shipping container and a few welders
>> are not going to be your biggest worries.
>> Also interesting is that they use Gigabyte - I haven't been entirely
>> impressed with them and that is for purely desktop use.  Perhaps their
>> server grade boards are sufficiently better in quality to make them worthwhile at
>> that scale.
>IIRC Google doesn't use "server grade" anything.  They use OTC parts and
>do a running computation on failure rates and optimize price performance
>dynamically.  They are truly industrial scale production here.  For them
>servicing/replacing a system is cheap:  Box dies.  Employee notes this,
>grabs box from Big Stack of Boxes, carries it to dead box, removes dead
>box, replaces it with new working box, presses power switch, walks away.
>Problem solved.

This may have changed, but way back when, I remember being told that Google *don't* replace dead nodes; they just turn them off.  Supposedly it wasn't cost-effective to repair them or cannibalize them for other nodes.

As I say, this was a good few years ago now, so the economics now may be different (or my original info might have been based on hearsay).


