[Beowulf] Servers Too Hot? Intel Recommends a Luxurious Oil Bath

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Wed Sep 5 10:05:11 PDT 2012


I'm not sure that Google actually does servicing per se. They mark a failed node dead and just move on.  The cost to service (or even to diagnose) it is probably higher than the cost of just overprovisioning.

Jim Lux


-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Robert G. Brown
Sent: Wednesday, September 05, 2012 6:15 AM
To: Ellis H. Wilson III
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] Servers Too Hot? Intel Recommends a Luxurious Oil Bath

On Tue, 4 Sep 2012, Ellis H. Wilson III wrote:

> Yes, Google does house these containers in a fairly basic building, 
> but there is no reason I can think of why it couldn't put them out in 
> the open and run all wires, etc, into the ground instead.  I think 
> they just put them in a building for convenience to the maintainers, 
> rather than for some property of the building itself that would enable 
> the containers to work better.

Google in particular, though, lives and dies by means of instantaneous access to parts.  A computer is to them as a mere neuron is to us -- nodes fail in their cluster at the rate of many a day, and are replaced almost immediately, the way they have things set up.  This is multiply economical for them: minimum downtime; minimum human costs (because it is EASY and FAST for them to pop a node out and a new one in); and minimum hardware costs, because IIRC a "node" for them is literally a motherboard, memory, and CPU that just fits into a harness in the trailers.  I don't think they even bother with a proper enclosure per motherboard.  Over the counter, commodity, cheap, almost hardware agnostic.



Which are all reasons that it would be a terrible idea for Google to fill the containers with any sort of gas, immerse the nodes in oil, or use any sort of non-contained direct-contact liquid to cool them.  It would take ten times as long to replace a node, literally.  It would (very probably) mean that they'd have to "mess" with the nodes in some way when putting them in -- I don't see normal CPU cooling fans moving oil, for example.  Or there would be custom plumbing to a per-CPU, per-Mobo water-cooled sink that wouldn't work, or would have to be replaced, if they changed Mobo.  Or there would be the fire/explosion risk and the need to pump down an entire container in order to replace a single motherboard, which might come with a need to SHUT DOWN the entire container while this was going on.


