[Beowulf] MPI, fault handling, etc.

Douglas Eadline deadline at eadline.org
Thu Mar 10 12:44:29 PST 2016


> I will supper C's "hater" listing effort just to keep a spot light on the
> important subject.
>
> The question is not MPI is efficient or not. Fundamentally, all
> electronics
> will fail in unexpected ways. Bare metal computing was important decades
> ago but detrimental to large scale computing. It is simply flawed for
> extreme scale computing.
>
> The Alan Fekete, Nancy Lynch, John Spinneli's impossible proof is the
> fundamental "line in the sand" that cannot be crossed.
>
> The corollary of that proof is that it is impossible to detect failure
> reliably either. Therefore, those efforts for for runtime
> detection/repair/reschedule are also flawed for extreme scale computing.
>

Well on that note, I suppose we should just call it day.
Although, some thought Godel would put the whole math thing
out of business as well.

--
Doug




> Justin
>
> On Thu, Mar 10, 2016 at 8:44 AM, Lux, Jim (337C)
> <james.p.lux at jpl.nasa.gov>
> wrote:
>
>> This is interesting stuff.
>> Think back a few years when we were talking about checkpoint/restart
>> issues: as the scale of your problem gets bigger, the time to checkpoint
>> becomes bigger than the time actually doing useful work.
>> And, of course, the reason we do checkpoint/restart is because it’s
>> bare-metal and easy.  Just like simple message passing is “close to
>> the
>> metal” and “straightforward”.
>>
>> Similarly, there’s “fine grained” error detection and correction:
>> ECC
>> codes in memory; redundant comm links or retries.  Each of them imposes
>> some speed/performance penalty (it takes some non-zero time to compute
>> the
>> syndrome bits in a ECC, and some non-zero time to fix the errored
>> bits… in
>> a lot of systems these days, that might be buried in a pipeline, but the
>> delay is there, and affects performance)
>>
>> I think of ECC as a sort of diffuse fault management: it’s pervasive,
>> uniform, and the performance penalty is applied evenly through the
>> system.
>> Redundant (in the TMR sense) links are the same way.
>>
>> Retries are a bit different.  The “detecting” a fault is diffuse and
>> pervasive (e.g. CRC checks occur on each message), but the correction of
>> the fault is discrete and consumes resources at that time.  In a system
>> with tight time coupling (a  pipelined systolic array would be the sort
>> of
>> worst case), many nodes have to wait to fix the one that failed.
>>
>> A lot depends on the application: tighter time coupling is worse than
>> embarrassingly parallel (which is what a lot of the “big data” stuff
>> is:
>> fundamentally EP, scatter the requests, run in parallel, gather the
>> results).
>>
>> The challenge is doing stuff in between:  You may have a flock with
>> excess
>> capacity (just as ECC memory might have 1.5N physical storage bits to be
>> used to store N bits), but how do you automatically distribute the
>> resources to be failure tolerant.   The original post in the thread
>> points
>> out that MPI is not a particularly facile tool for doing this.  But
>> I’m not
>> sure that there is a tool, and I’m not sure that MPI is the root of
>> the
>> lack of tools.    I think it’s that moving from close to the metal is
>> a
>> “hard problem” to do in a generic way.  (The issues about 32 bit
>> counts are
>> valid, though)
>>
>>
>> James Lux, P.E.
>>
>> Task Manager, DHFR Space Testbed
>>
>> Jet Propulsion Laboratory
>>
>> 4800 Oak Grove Drive, MS 161-213
>>
>> Pasadena CA 91109
>>
>> +1(818)354-2075
>>
>> +1(818)395-2714 (cell)
>>
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>>
>
> --
> Mailscanner: Clean
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>


-- 
Doug

-- 
Mailscanner: Clean



More information about the Beowulf mailing list