[Beowulf] core diameter is not really a limit

Mon Jun 17 07:37:02 PDT 2013

> I've remembered to check the cluster monkey feed, and
> seen
> http://www.clustermonkey.net/Opinions/the-core-diameter.html

well, it's an amusing musing on scale, but I don't think it helps
anyone make decisions, practical or strategic.

> The assumption made here is that every node needs to be able
> to talk to every other node within the assembly.

yes, all-to-all connectivity *and* fixed latency.

> I think there is a large class of problems where direct
> long-distance communication is not necessary. E.g. if

sure, but there are also interconnects which do not follow 
the premises of this article (mesh/lattice/torus ones).

I don't really have a sense for how large a class you're talking about,
though.  from a general-purpose/flexibility standpoint, a purely 
local-interacting system, even if 3d, would seem to have issues.
for instance, if you had discrete fileservers, how would they connect
to a giant cube that is partitioned into job-subcubes?  obviously 
scheduling jobs so that they're in contiguous sub-volumes is much
more constrained than on a better-connected cluster, as well.
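
to put rough numbers on both points, here is a minimal sketch, not a model
of any real machine or scheduler.  it assumes a 3d torus with wraparound
links, a fixed cost of one hop per link, and jobs that need an axis-aligned
cubic block of nodes.  the names torus_hops, torus_diameter and
free_placements are made up for illustration.

```python
from itertools import product


def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a torus with wraparound links."""
    return sum(min(abs(x - y), n - abs(x - y)) for x, y, n in zip(a, b, dims))


def torus_diameter(dims):
    """Worst-case hop count between any two nodes: half the ring length per axis."""
    return sum(n // 2 for n in dims)


def free_placements(busy, dims, size):
    """Count origins where a size^3 block fits with no busy node inside.

    Assumption: blocks are axis-aligned and do not wrap around the torus edge.
    """
    count = 0
    for origin in product(*(range(n - size + 1) for n in dims)):
        block = product(*(range(o, o + size) for o in origin))
        if not any(node in busy for node in block):
            count += 1
    return count


if __name__ == "__main__":
    dims = (8, 8, 8)
    # latency is not fixed: it depends on where the two nodes sit
    print(torus_hops((0, 0, 0), (7, 7, 7), dims))
    print(torus_hops((0, 0, 0), (4, 4, 4), dims))
    print(torus_diameter(dims))
    # one busy node in a 4x4x4 machine already rules out many 2x2x2 placements
    print(free_placements({(1, 1, 1)}, (4, 4, 4), 2))
```

the worst-case hop count grows with the edge length of the machine rather
than staying fixed, and a single busy node removes every block placement
that would overlap it, which is the scheduling constraint mentioned above.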

I saw a conference talk by David Turek (IBM exascale) recently, wherein he
was advocating coming up with a new model that gives up the kind of extreme
modularity that has been traditional in computing.  it's a bit of a strawman,
but the idea is that you have specialized cpus that mostly compute 
and do not have much storage/state, talking to various permutations
of random-access storage (cache, shared cache, local and remote dram),
all talking over a dumb-pipe network to storage, which does no computation,
just puts and gets.  (this was at HPCS, which was a slightly odd mashup
of bigdata (ie, hadoop-like) and HPC (mostly simulation-type crunching)).

all this in the context of the sort of picojoule-pinching
that exascale theorists worry about.  it wasn't really clear 
what concrete changes he was advocating, but since higher capacity
clearly causes systems of greater extent, he advocates spreading 
the computation everywhere.  compute in the storage, compute in the network, 
presumably compute in memory.  the pJ approach argues that computations
involving some state should happen as close to that state as possible.

I'm skeptical that just because flops are cheap, we need to put them 
everywhere.  OTOH, the idea of putting processors into memory has always
made a lot of sense to me, though it certainly changes the programming 
model.  (even in OO or functional models, there is a "self" in the program...)

regards, mark hahn