[Beowulf] [tt] One million ARM chips challenge Intel bumblebee

Douglas Eadline deadline at eadline.org
Fri Jul 8 20:33:55 PDT 2011


>> > It's all about ultimate scalability.  Anybody with a moderate
>> > competence (certainly anyone on this list) could devise a scheme to
>> > use 1000 perfect processors that never fail to do 1000 quanta of work
>> > in unit time.  It's substantially more challenging to devise a scheme
>> > to do 1000 quanta of work in unit time on, say, 1500 processors with
>> > a 20% failure rate.  Or even in 1.2*unit time.
>> >
>>
>> Just to be clear - I wasn't saying this was a bad idea. Scaling up to
>> this size seems inevitable. I was just imagining the team of admins who
>> would have to be working non-stop to replace dead processors!
>>
>> I wonder what the architecture for this system will be like. I imagine
>> it will be built around small multi-socket blades that are hot-swappable
>> to handle this.
>
>
>
> I think that you just anticipate the failures and deal with them.  It's
> challenging to write code to do this, but it's certainly a worthy
> objective. I can easily see a situation where the cost to replace dead
> units is so high that you just don't bother doing it: it's cheaper to just
> add more live ones to the "pool".

I wrote about the programming issue in a series of three articles
(conjecture, never really tried it, if only I had the time ...).
The first article links (at the end) to the other two.

  http://www.clustermonkey.net//content/view/158/28/

And yes, disposable "nodes": just like a failed cable in a large
cluster, you route a new one and don't worry about unbundling
a huge cable tree. I assume there will be a high level of
integration, so there may be "nodes" that are left for dead but
remain integrated into a much larger blade.
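
As a rough illustration of the scheme quoted above (1000 quanta of
work on 1500 processors with a 20% failure rate), here is a minimal
Python sketch. The function name run_with_failures, the round-based
failure model (a node dies permanently, with a fixed probability, in
the round it is used), and the fixed seed are all hypothetical choices
made only for illustration; none of this is taken from the articles.

```python
import random
from collections import deque

def run_with_failures(n_tasks=1000, n_nodes=1500, fail_rate=0.2, seed=0):
    """Count the rounds ("unit times") needed to finish n_tasks when each
    node dies for good with probability fail_rate in the round it is used.
    Assumed toy model: one quantum per node per round, no replacements."""
    rng = random.Random(seed)
    pending = deque(range(n_tasks))   # quanta of work not yet completed
    alive = n_nodes                   # dead nodes are never replaced
    done = 0
    rounds = 0
    while pending and alive > 0:
        rounds += 1
        # Hand one quantum to each live node, up to the work remaining.
        batch = [pending.popleft() for _ in range(min(alive, len(pending)))]
        for task in batch:
            if rng.random() < fail_rate:
                alive -= 1            # node is left for dead
                pending.append(task)  # its quantum goes back in the queue
            else:
                done += 1
    return done, rounds, alive

if __name__ == "__main__":
    done, rounds, alive = run_with_failures()
    print(f"finished {done} quanta in {rounds} rounds, {alive} nodes alive")
```

In this toy model the failed quanta are simply re-queued and picked up
by the surviving nodes in later rounds, so the job finishes as long as
the pool does not run dry; nothing is ever repaired or swapped out.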

--
Doug

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>





