[Beowulf] Is there really a need for Exascale?

Mark Hahn hahn at mcmaster.ca
Wed Nov 28 22:37:06 PST 2012


> "consumer" products. And then, there's the "Win on Sunday, sell on
> Monday" effect, which I don't think needs any explanation.

there's a premise here which I think is mistaken.  this theory depends
on the F1 circuit having a very different cost-effectiveness requirement:
that team X will spend whatever it takes to gain an advantage, and 
that once developed, the advantage can transfer to consumer products.
does exascale really have a blank check to develop exotic whatever-it-takes
technologies?  in a very meaningful sense, HPC is all about
cost-effectiveness, even at high scale.  you want a computer 10x faster
so you don't have to wait as long - there's no whatever-it-takes there.

> An even more cynical view says that the HPC vendors lobby the government
> to believe exascale is important so the government invests in it and
> subsidizes their R&D.

oh, I don't think that's a stretch at all.  governments love to be sold
on the idea that they're special and that, because of it, they need to
invest in preserving that specialness.  one word: military.

> In my opinion, the new technology driven by the move to petascale,
> exascale, etc, will ultimately be valuable to us consumers, but to your

I claim that tech is largely fungible these days.  chip-stacking
was first mass-produced for phones afaik.  but anyone who wants to do it
can buy the expertise or simply outsource to get it done.
the trick is in how you put the pieces together.

> average researcher, having a decent-sized cluster that they have a lot
> of access to is more valuable than a large, shared system like Blue
> Waters or something similar, that must be shared with hundreds or thousands
> of other researchers.

well, that's stretching things a lot.  some people have steady workloads,
and for them, owning dedicated hardware makes the most sense.  most people,
though, have bursty demand, and sharing large resource pools is ideal.
but "large resource pool" has nothing to do with exascale - the size of 
your shared machines is really a question of average job sizes and minimizing
fragmentation.  (so, for instance, if I had money to spend on 1M cores,
I'd definitely spend most of it on clusters of a few hundred nodes - 
say, O(10k) cores.)
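
to put a toy number on that fragmentation point, here's a little first-fit
packing sketch in python.  the job-size mix (90% jobs of a few hundred cores,
10% a few thousand) and the greedy packing policy are invented purely for
illustration, not measurements of any real workload:

import random

random.seed(0)

def sample_job():
    # hypothetical bursty mix: 90% smallish jobs, 10% bigger ones.
    # these sizes are made up for illustration only.
    if random.random() < 0.9:
        return random.choice([64, 128, 256, 512])
    return random.choice([2048, 4096, 8192])

def stranded_fraction(cluster_cores, total_cores, jobs):
    # greedy first-fit of a biggest-first job backlog into identical
    # clusters; returns the fraction of cores left stranded once nothing
    # more fits - a crude proxy for fragmentation.
    free = [cluster_cores] * (total_cores // cluster_cores)
    for j in jobs:
        for i, f in enumerate(free):
            if f >= j:
                free[i] -= j
                break
    return sum(free) / total_cores

backlog = sorted((sample_job() for _ in range(20000)), reverse=True)
total = 1_000_000
for size in (10_000, 100_000, 1_000_000):
    print("%9d-core clusters: %.2f%% of cores stranded"
          % (size, 100 * stranded_fraction(size, total, backlog)))

real schedulers, job mixes and topologies are obviously messier than this,
but the shape of the result is the point: once the cluster is much larger
than a typical job, stranded cores are already negligible, so the packing
argument tops out long before exascale.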

I think your negative reference to big shared systems is actually more 
a comment on how big systems seem to inspire BOFHishness - that it's 
hard to get access to them.  their keepers quite naturally incline
toward a job mix that shows off the bigness to its best advantage:
"I successfully argued to spend extra to make it big, so I don't want 
it overrun by hordes of mundane-sized jobs!"  unfortunately this 
seems to go along with the kind of foot-dragging that gives us application
processes that cycle at 1/year (here in Canada at least, not that we have
decent, non-spasmodic HPC funding...)


