[Beowulf] [tt] One million ARM chips challenge Intel bumblebee
hahn at mcmaster.ca
Sat Jul 16 13:19:43 PDT 2011
>>> How long is it going to take to wire them all up? And how fast are they
>>> going to fail? If there's a MTBF of one million hours, that's still one
>>> failure per hour.
> They do address some of that in ftp://ftp.cs.man.ac.uk/pub/amulet/papers/SBF_ACSD09.pdf
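for reference, the failure arithmetic quoted above can be sketched as below. this is only a rough model: it assumes independent units with a constant failure rate (exponential lifetimes), and whether the 1M-hour MTBF applies per core or per package is an assumption either way. the function names are made up for illustration.

```python
import math

def expected_failures_per_hour(n_units, mtbf_hours):
    """Expected failures per hour across n_units independent units."""
    return n_units / mtbf_hours

def prob_any_failure(n_units, mtbf_hours, window_hours=1.0):
    """Probability of at least one failure in the window (exponential model)."""
    rate = expected_failures_per_hour(n_units, mtbf_hours)
    return 1.0 - math.exp(-rate * window_hours)

if __name__ == "__main__":
    mtbf = 1_000_000  # hours, the figure from the quoted question
    # Unit counts: 1M cores (the headline number) or ~50k packages.
    for label, n in (("1M cores", 1_000_000), ("50k packages", 50_000)):
        rate = expected_failures_per_hour(n, mtbf)
        print(f"{label}: {rate:.2f} failures/hour, "
              f"P(at least one in an hour) = {prob_any_failure(n, mtbf):.2f}")
```

counting packages rather than cores as the failing unit changes the rate by the cores-per-package factor, so which one the MTBF describes matters a lot.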
the 1m proc seems to be referring to cores, of which their current SOC
has 20/chip, and there are 4 chips on their current test board:
hmm, that article says 18 cores (maybe reduced for yield), with stacked dram.
I'm not sure what the other companion chip on the test board is.
anyway, compare it to the K computer: 516096 compute cores, 64512 packages,
versus 50k packages for Spinnaker. Spinnaker will obviously put more
chips onto a single board (board links are more reliable than connectors,
as well as more power-efficient.) Spinnaker has 6 links for a 2d toroidal
mesh (not 3d for some reason) - K also uses a 6-link mesh.
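a quick sketch of the package arithmetic behind that comparison, using only the numbers in this thread (516096 cores in 64512 packages for K, 1M cores at 20 or 18 per chip for Spinnaker). the helper name is made up, and treating all cores on a chip as usable is an assumption.

```python
def packages_needed(total_cores, cores_per_package):
    """Smallest number of packages that provides total_cores (ceiling division)."""
    return -(-total_cores // cores_per_package)

K_CORES = 516_096
K_PACKAGES = 64_512
SPINNAKER_CORES = 1_000_000  # the "one million" in the subject line

if __name__ == "__main__":
    print("K cores per package:", K_CORES // K_PACKAGES)
    # 20 cores/chip is the figure quoted in the thread, 18 the one in the article.
    for cores_per_chip in (20, 18):
        print(f"Spinnaker at {cores_per_chip} cores/chip:",
              packages_needed(SPINNAKER_CORES, cores_per_chip), "packages")
```

at 20 cores/chip this gives the 50k packages mentioned above; at 18 it is somewhat more.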
obviously, off-board links need a connector, but if I were designing either
box I'd have each board plug into a per-rack backplane, again, to avoid
dealing with cables. if you have a per-rack sub-mesh anyway, it should
be 3d, shouldn't it?
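one way to put a number on the 2d-vs-3d question is worst-case hop count for the same node count and the same 6 links per node. the sketch below does a breadth-first search on a small torus. the link layouts are assumptions, not taken from either machine: the 2d case uses the four axis neighbours plus one diagonal pair, the 3d case uses the six axis neighbours.

```python
from collections import deque

def torus_diameter(shape, offsets):
    """Worst-case hop count on a torus with the given per-node link offsets.

    Every node looks the same on a torus, so one BFS from the origin is enough.
    """
    origin = tuple(0 for _ in shape)
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for off in offsets:
            # Wrap each coordinate around its dimension.
            nxt = tuple((c + d) % n for c, d, n in zip(node, off, shape))
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values())

def axis_links(ndim):
    """+/-1 along each axis: 4 links in 2d, 6 links in 3d."""
    links = []
    for axis in range(ndim):
        for step in (1, -1):
            links.append(tuple(step if i == axis else 0 for i in range(ndim)))
    return links

LINKS_2D_4 = axis_links(2)
LINKS_2D_6 = LINKS_2D_4 + [(1, 1), (-1, -1)]  # assumed diagonal pair
LINKS_3D_6 = axis_links(3)

if __name__ == "__main__":
    # Both layouts have 4096 nodes and 6 links per node.
    print("2d, 64x64:", torus_diameter((64, 64), LINKS_2D_6), "hops worst case")
    print("3d, 16x16x16:", torus_diameter((16, 16, 16), LINKS_3D_6), "hops worst case")
```

this only measures hop count; it says nothing about wiring cost or how the boards would actually be laid out in a rack.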
in the abstract, it seems like Spinnaker would want a 3d mesh to better model
the failure effect in the brain (which is certainly not 2d nearest-neighbor!)
in fact, if you wanted to embrace brain-like topologies, I'd think a
flat-network-neighborhood would be most realistic (albeit cable-intensive,
but we're not afraid of failed cables, since the brain isn't!)