[Beowulf] Roadrunner picture

Andrew Robbie (GMail) andrew.robbie at gmail.com
Wed Jul 16 11:42:00 PDT 2008


On Tue, Jul 15, 2008 at 12:35 AM, Josip Loncaric <josip at lanl.gov> wrote:
>
> Another good link:
>
> http://www.lanl.gov/orgs/hpc/roadrunner/rrtechnicalseminars2008.shtml

As I was reading the slides, one question leapt out at me: they have a
huge IB network connecting every 'node', but instead of connecting
directly to storage, it connects to an IB-to-10GigE bridge/IO
processor board. Why? Avoiding the protocol conversion would be
inherently simpler, and I've seen nothing to indicate that 10GigE is
faster or cheaper (certainly not cheaper!).

Is this (Ethernet) a Panasas requirement? Particularly as there is one
paper directly relating checkpoint time (IO performance) to overall
throughput, I would have thought IO was quite a central requirement. I
can see not wanting to change an existing storage backend though.

It would be great to see some graphs of bus contention (Cell<->PPC,
PPC<->IB<->Cell three-way, etc.) for the various codes, rather than just
latency/bandwidth figures. And maybe GFlops/MMM (mythical man-month)?
(Or have MMMs been related to Watts in some fashion?)

\Andrew


