another radical concept...Re: [Beowulf] Cooling vs HW replacement

William Dieter william.dieter at gmail.com
Wed Jan 19 21:59:22 PST 2005


On Tue, 18 Jan 2005 12:52:50 -0800, Jim Lux <james.p.lux at jpl.nasa.gov> wrote:
> There's also the prospect, not much explored in clusters, but certainly used
> in modern laptops, etc. of dynamically changing computation rate according
> to the environment. If the temperature goes up, maybe you can slow down the
> computations (reducing the heat load per processor) or just turn off some
> processors (reducing the total heat load of the cluster).  Maybe you've got
> a cyclical temperature environment (that sealed box out in the dusty
> desert), and you can just schedule your computation appropriately (compute
> at night, rest during the day).

For non-real-time systems, this is similar to the heterogeneous load
balancing problem, where the load balancer does not have complete
control over the load on the machine (e.g., cycle-scavenging systems
like Condor, where a user can sit down and start running jobs).  The
difference is that the capacity of the machine changes when the system
scales the frequency and voltage, rather than because of newly
arriving jobs.  If the temperature changes slowly enough, you could
probably predict when a rebalancing will be needed before it actually
is.
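
To make that concrete, here is an untested Python sketch of
speed-weighted partitioning; the node names and MHz figures are made
up, and a real balancer would refresh them from monitoring data and
rebalance when the shares drift too far:

    def partition(work_items, node_speeds):
        """Assign work item counts to nodes in proportion to current speed."""
        total = sum(node_speeds.values())
        shares = {n: int(len(work_items) * s / total)
                  for n, s in node_speeds.items()}
        # Hand any rounding remainder to the fastest node.
        fastest = max(node_speeds, key=node_speeds.get)
        shares[fastest] += len(work_items) - sum(shares.values())
        return shares

    # Hypothetical current speeds in MHz after DVFS, from monitoring.
    node_speeds = {"node0": 2000, "node1": 1800, "node2": 1200}
    print(partition(list(range(100)), node_speeds))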

To work most efficiently, the computational task would have to know
how to arbitrarily partition the problem and migrate pieces of work
between nodes.  A simpler approach for applications that are not so
easily divided might be to monitor temperature locally and scale
voltage and frequency to keep temperature under a predetermined limit.
However, if one machine is warmer than the others (maybe it is near a
window or far from a vent), it would slow down the entire application.
Assuming only one application is running (or at least only one that
you care about), there is no point in having some nodes run faster
than others.  The system could distribute speed-scaling information so
that all nodes slow down to the speed of the slowest, keeping them
more or less in sync and reducing the total amount of heat generated.
With less heat generated the room would cool off, and the warmest node
(and all the others) could speed up somewhat.
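
A minimal sketch of that "slow everyone to the slowest" loop, assuming
mpi4py is available for the coordination; read_temp() and
set_cpu_freq() are hypothetical stand-ins for whatever sensor and DVFS
interfaces the nodes actually have:

    from mpi4py import MPI
    import time

    TEMP_LIMIT = 70.0               # degrees C; made-up threshold
    FREQS_MHZ = [1200, 1600, 2000]  # available speeds; made-up steps

    comm = MPI.COMM_WORLD

    def read_temp():
        # Stand-in: a real node would read lm_sensors, ACPI, or IPMI here.
        return 65.0

    def set_cpu_freq(mhz):
        # Stand-in: a real node would program its DVFS interface here.
        pass

    while True:
        # Each node picks the fastest speed that keeps it under the limit...
        my_freq = FREQS_MHZ[-1] if read_temp() < TEMP_LIMIT else FREQS_MHZ[0]
        # ...then all nodes agree on the global minimum, staying in sync.
        global_freq = comm.allreduce(my_freq, op=MPI.MIN)
        set_cpu_freq(global_freq)
        time.sleep(10)              # re-check every 10 seconds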

Nodes could also slow down below their advertised speed when they have
less work to do.  The Transmeta Efficeon processors already do
something like this: they reduce speed when a certain percentage of
the CPU time is idle.  Or, if the system notices that a job is
memory-bandwidth limited, it could slow the CPU down to match the
memory speed.
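
On Linux you could approximate the same policy in software with the
cpufreq "userspace" governor.  In this untested sketch the sysfs path
and frequency steps are examples, not values checked against any
particular machine; the idle fraction comes from /proc/stat:

    import time

    SETSPEED = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
    FREQS_KHZ = [1000000, 1400000, 1800000]   # example steps, in kHz

    def cpu_times():
        # First line of /proc/stat: "cpu user nice system idle ..."
        with open("/proc/stat") as f:
            fields = [int(x) for x in f.readline().split()[1:]]
        return fields[3], sum(fields)         # idle jiffies, total jiffies

    idle0, total0 = cpu_times()
    while True:
        time.sleep(5)
        idle1, total1 = cpu_times()
        idle_frac = (idle1 - idle0) / float(total1 - total0)
        idle0, total0 = idle1, total1
        # Mostly idle -> lowest step; mostly busy -> highest step.
        step = min(int((1 - idle_frac) * len(FREQS_KHZ)), len(FREQS_KHZ) - 1)
        with open(SETSPEED, "w") as f:
            f.write(str(FREQS_KHZ[step]))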

> This kind of "resource limited" scheduling is pretty common practice in
> spacecraft, where you've got to trade power, heat, and work to be done and
> keep things viable.
> 
> There are very well understood ways to do it autonomously in an "optimal"
> fashion, although, as far as I know, nobody is brave enough to try it on a
> spacecraft, at least in an autonomous way.
> 
> Say you have a distributed "mesh" of processors (each in a sealed box), but
> scattered around, in a varying environment.  You could move computational
> work among the nodes according to which ones are best capable at a given
> time.  I imagine a plain with some trees, where the shade is moving around,
> and, perhaps, rain falls occasionally to cool things off.  You could even
> turn on and off nodes in some sort of regular pattern, waiting for them to
> cool down in between bursts of work.

This would be especially true if each node runs off battery power, for
example in a sensor network.  In addition to the reliability issues,
the amount of energy that can be extracted from the battery drops
sharply if the battery temperature is too high.  Jobs could migrate to
a new node just before each node gets too hot, as long as the network
is dense enough to still cover the sensed phenomenon.  The main
limitation would be how fast the nodes can cool off in the low-power
mode.  For example, if it takes twice as long for a node to cool down
as it does to heat up, you would need two idle nodes for each active
node.
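
Put another way, a node can be active for a fraction
t_heat/(t_heat + t_cool) of the time, so you need at least
t_cool/t_heat + 1 nodes per active role.  A toy rotation schedule,
with made-up timing constants:

    import itertools, time

    T_HEAT = 60.0    # seconds a node can work before hitting its limit (made up)
    T_COOL = 120.0   # seconds it needs to cool back down (made up)
    nodes = ["node0", "node1", "node2"]
    assert len(nodes) >= T_COOL / T_HEAT + 1

    for active in itertools.cycle(nodes):
        print("migrating job to %s" % active)   # stand-in for real migration
        time.sleep(T_HEAT)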
 
Bill.
-- 
Bill Dieter.
Assistant Professor
Electrical and Computer Engineering
University of Kentucky
Lexington, KY 40506-0046


