[Beowulf] BMW Shifts Supercomputing To Iceland To Save Emissions

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Mon Oct 15 14:46:53 PDT 2012

On 10/15/12 11:07 AM, "Mark Hahn" <hahn at mcmaster.ca> wrote:

>> Mind you, I'm a huge fan of small clusters under a single person's control,
>>where nobody is watching to see if you are making 'effective utilization'
>>and you can do whatever you want.  A personal supercomputer, as it were.
>>But I recognize that for much of the HPC world, clusters are managed in the
>>same way as big iron mainframes were in the 70s,
>I think you're being a bit disingenuous here.  dedicated/personal
>clusters are perfectly sensible when the workload is non-bursty
>or somehow otherwise high-duty-cycle.  or perhaps when you're
>talking about resources cheap enough to hand out like pencils.
>(that is, let's be honest: cheap enough to waste.)

That's a good way to describe it: cheap enough to waste (or, cheaper to
let the computer idle waiting for the user than to let the user idle
waiting for the computer).

>a larger, shared resource pool is ideal for bursty/low-DS environments.

Especially if the bursts are "plannable"..

>as far as I can see, there are really only a couple problems with this:
>- many people and most environments have a mixture of burstiness.


>- schedulers are not awesome at managing latency of either flavor
>   when both are mixed, especially in the presence of poor resource
>   requirements (bad runtime estimates, poor memory requirements, etc.)

Yes.. (whether the scheduler is human or algorithmic)

>- resource granularity becomes even more of a problem: serial jobs
>   "contaminate" nodes for parallel use or high vs low mem, etc.
>- very short runtime limits permit more rebalancing of resources,
>   but are incredibly harmful to most people's productivity.
>- preemption (SIGSTOP/SIGCONT) seems to be a relatively little-used
>   way to optimize for latency - enough so that it simply does not work
>   right on major non-free schedulers.
>- it's hard to get people to treat storage as ephemeral :(

If it's under my desk, it doesn't have to be ephemeral.  I can leave all
my temporary data on the node disks...

>- big resources are also big budget targets :(

Yes, but I think it's more of an issue of "size of capital expenditure" vs
"size of organization where expenditure is reviewed"...

For instance, a $10K expenditure might be reviewed very locally, and might
be "big" compared to, say, a new ergonomic desk chair at $800.  But in the
context of something like JPL's entire budget of $1.5B, $10K is below the
noise floor.  So once you've got your $10K machine, if it sits idle in your
office half the time, you only really have to explain it to a few people.

OTOH, if it's a million dollar piece of hardware, and it's sitting in your
office idle, you're going to be explaining that to a substantially larger
group of people.
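
On the preemption point quoted above, here is a minimal sketch of what
SIGSTOP/SIGCONT suspension looks like from the outside.  It is not how any
particular scheduler does it: the function name preempt_demo and the
sleeping child process standing in for a "low-priority job" are made up for
illustration, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys
import time


def preempt_demo(pause_seconds=0.5):
    """Suspend and resume a stand-in job; return (alive_while_stopped, alive_after_resume)."""
    # Hypothetical low-priority job: a child Python process that just sleeps.
    job = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        # Preempt: the kernel stops scheduling the process, but it keeps
        # its memory and open files.
        job.send_signal(signal.SIGSTOP)
        time.sleep(pause_seconds)  # a high-priority job would run here
        alive_while_stopped = job.poll() is None

        # Resume: the job continues from where it was stopped.
        job.send_signal(signal.SIGCONT)
        alive_after_resume = job.poll() is None
        return alive_while_stopped, alive_after_resume
    finally:
        # Clean up the stand-in job so the demo leaves nothing behind.
        job.kill()
        job.wait()


if __name__ == "__main__":
    print(preempt_demo())
```

Note that a stopped job gives back CPU but not memory, so this only helps
latency when the preempting job fits alongside the suspended one.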

I suppose it all comes down to how much reduction of *my* labor cost there
is by having the computer available, when I want it, as opposed to, say,
tomorrow after calling to make an appointment for supercomputer time.

Recognizing, of course, that "I've got it now" is not going to have
anywhere near the horsepower of "request it for tomorrow".
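
To put rough numbers on that tradeoff, here is a back-of-the-envelope
sketch.  The $10K and $1M hardware figures are the ones used above; the
$100/hour labor rate is purely an assumed value for illustration, and the
function name break_even_hours is made up.  It ignores power, admin time,
and depreciation.

```python
def break_even_hours(hardware_cost, labor_rate_per_hour):
    """Hours of avoided user waiting needed for the hardware to pay for itself."""
    if labor_rate_per_hour <= 0:
        raise ValueError("labor rate must be positive")
    return hardware_cost / labor_rate_per_hour


if __name__ == "__main__":
    assumed_rate = 100.0  # dollars per hour; an assumption, not a real figure
    for cost in (10_000, 1_000_000):
        hours = break_even_hours(cost, assumed_rate)
        print(f"${cost:,} machine: {hours:,.0f} hours of waiting avoided to break even")
```

Under that assumed rate, the desk-side box pays for itself after about 100
hours of my not waiting, which is easy to reach; the million-dollar machine
needs 10,000 hours, which one person cannot plausibly justify alone.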

