[Beowulf] Checkpointing using flash

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Mon Sep 24 07:55:24 PDT 2012


Yes indeed, but I'm hopeful.  50 years ago, FORTRAN was hot stuff compared to assembler, and things like virtual memory and paging were "wouldn't it be nice" kinds of ideas.  40 years from now, hopefully, there will be optimizing compilers for 10^9 processing nodes, etc.

Jim Lux

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Eugen Leitl
Sent: Monday, September 24, 2012 2:11 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Checkpointing using flash

On Sat, Sep 22, 2012 at 09:29:25PM +0000, Lux, Jim (337C) wrote:

> I think the future is in explicitly recognizing that you have to pass 
> messages serially and designing algorithms that are tolerant of things 
> like missing messages, variable (but bounded) latency (or heck, 
> latency at all).

Computational physics pretty much demands this. 
 
> Once you've got a generalized fast approach using message passing, 
> it's very scalable.

But human programming doesn't scale across 10^6 to 10^9 asynchronous nodes with less than a GByte of memory each, where determinism is computationally more expensive than a stochastic good-enough result.

Of course the physical modelers won't bat an eyelash, but the common programmer who is still trying to figure out this multithreading thing will be out to lunch.
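
For what it's worth, here is a rough sketch of the kind of loss-tolerant message passing discussed in this thread. It is not anyone's production code, just a toy: a 1-D Jacobi-style relaxation where each cell is treated as a "node" that only knows the last value it actually heard from each neighbour. Messages are dropped at random, so cells routinely compute with stale data. The function name relax_with_drops, the parameters (n, rounds, drop_prob, seed), and the fixed 0.0/1.0 boundary values are all made up for illustration.

```python
import random


def relax_with_drops(n=16, rounds=5000, drop_prob=0.3, seed=0):
    """Toy 1-D relaxation in which neighbour messages are randomly lost.

    Cell 0 is pinned at 0.0 and cell n-1 at 1.0 (assumed boundary values).
    Interior cells average the most recent values they have *received*,
    which may be stale when a message was dropped.
    """
    rng = random.Random(seed)
    x = [0.0] * n
    x[-1] = 1.0

    # left_seen[i] / right_seen[i]: last value cell i heard from its
    # left / right neighbour. Entry 0 of left_seen and entry n-1 of
    # right_seen are never used (the boundary cells are not updated).
    left_seen = [0.0] + x[:-1]
    right_seen = x[1:] + [1.0]

    for _ in range(rounds):
        # Communication phase: every cell sends its value both ways,
        # and each message is independently lost with drop_prob.
        for i in range(n):
            if i + 1 < n and rng.random() >= drop_prob:
                left_seen[i + 1] = x[i]
            if i - 1 >= 0 and rng.random() >= drop_prob:
                right_seen[i - 1] = x[i]

        # Compute phase: interior cells use whatever they last heard.
        for i in range(1, n - 1):
            x[i] = 0.5 * (left_seen[i] + right_seen[i])

    return x


if __name__ == "__main__":
    for i, v in enumerate(relax_with_drops()):
        print(i, round(v, 6))
```

With enough rounds the values settle onto the straight-line ramp between the two boundary values, the same answer a lossless synchronous sweep would give. The dropped messages only cost extra iterations; nothing needs to be retransmitted or acknowledged.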




