[Beowulf] Does computation threaten the scientific method?

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Thu Mar 29 07:24:38 PDT 2012

It is an interesting problem, and one relevant to HPC, if only because HPC
tends to be used in applications where errors can have big effects
(although one could argue that a bad one-page spreadsheet in the hands of a
Fortune 10 CEO might result in a bad decision with huge impacts..)

With respect to "the scientific method": is it really different from, say,
a very complex or esoteric math proof?  The SM relies on (attempted)
replication for validation, and in some ways it's more a funding thing
(that is, if your results depend on 100 work-years of software
development, replicating requires investing 100 work-years).  "Big
science" has always had that problem: do you build a copy of the LHC?

And, given the small number of people actually doing it, would a
replication really be an independent replication?  Odds are, some of the
same developers would wind up on the new project, because of the limited
pool of people who really know, say, numerical weather modeling.  (Again,
this is partly funding: if there were public investments of trillions of
dollars in numerical weather modeling, there would be a veritable army of
numerical weather modelers emerging from the halls of academe from which
to choose for your replication effort.  We don't, and there aren't.)

And one has to be very careful about measures of defect density (usually
given as defects/KLOC).  Not only are there the "what's a LOC?" questions,
but different defects have different consequences, and the reporting of
defects varies as a result. The LOC question is particularly an issue with
the use of autogenerated code, although I tend to think this is no
different than a compiler. Do you count source lines or machine
instructions? If you make the (big) leap that the compiler is validated,
you're working on the assumption that there's some underlying defect rate
per "human operation".

At JPL, we have a defect density of 0.1 to 4 defects/KSLOC spanning
development test, system test, and operations.  Mostly, we're at less
than 0.5 defects/KSLOC in operational use.  Comparing with DoD data,
it's in the same ballpark (DoD reports 0.2 to 0.6).  But this is for
"flight software" (that which runs on a computer in the spacecraft),
which has a much more rigorous development process, over a shorter time
span, than a lot of what the article was talking about.  The data is
somewhat skewed by the fact that, until very recently, flight code was
small because of limited processor and memory space.  Our productivity is
gradually increasing, but not by leaps and bounds. We generated about 100
lines of code per work-month in the early 80s and about 50% more right
now, and that's at the low end of the 1-5 lines of code per work-hour you
see bandied about. I would attribute the low rate to the large amount of
oversight and rigor.
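As a sanity check on those productivity numbers (with my own assumption of roughly 160 work hours per work-month; that figure is not in the post):

```python
# Back-of-envelope check of the JPL productivity numbers above.
# Assumption (mine): a work-month is about 160 work hours (40 h/week * 4 weeks).

loc_per_month_1980s = 100                            # "about 100 lines of code per work-month in the early 80s"
loc_per_month_now = int(loc_per_month_1980s * 1.5)   # "about 50% more right now"
hours_per_month = 160                                # assumed

loc_per_hour = loc_per_month_now / hours_per_month
print(loc_per_month_now)        # 150
print(round(loc_per_hour, 2))   # 0.94 -- just below the 1-5 LOC/work-hour range bandied about
```

Which is indeed at (in fact slightly under) the low end of the 1-5 LOC/hour figure.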

The shuttle software at $500M/500KSLOC is an interesting thing.  Let's say
that was mostly done in the early 80s, when a high-end work hour cost
about $50 (= $50K/yr salary plus overheads).  $500M is then 10M work
hours, so they achieved the spectacularly low productivity of 0.05
SLOC/hour.  The shuttle software is widely known to have had VERY low
productivity.  What's also interesting is that it's not clear that they
achieved a commensurate reduction in defect rate (that is, is the defect
rate 20 times lower?).  Nancy Leveson, among others, has some papers
talking about this.

I suspect that most of the software being described in the article doesn't
have anywhere near the process rigor, nor does it have the inspectability,
of the flight software at JPL.  In general, I think it's probably tested
by comparing results with expectation or with actual data, and the code is
adjusted to make the model output match the observed data.  Whether the
code adjustments are actually "model parameters" or "unfounded logic
changes" is sort of the question. And, of course, the "software" is really
a conglomeration of libraries, other people's programs, etc.  There are
pieces of the puzzle that are no doubt rigorously verified: for instance,
not only is the source code for the Numerical Electromagnetics Code (NEC)
published, but there are hundreds of pages of theory-of-operation
documentation, as well as lots of published validation studies comparing
against theoretical calculation and empirical measurement.  I suspect
that something like LAPACK is pretty well validated as well.  But I would
imagine that there is little *published* information on the validation of
the overall assemblage.  I think a lot of people just assume the library
or popular tool "just works" and probably don't stop to ask, "well,
how would we know if NEC or LAPACK were wrong?"

I'll speculate, too, that while a given research effort and the scientists
attached to it may have a long duration (and, so, retain some corporate
knowledge of the software architecture, design, and validation), there is
more turnover among the software developers, and knowledge
retention/transfer probably isn't all that good.  An interesting
observation from JPL is that the vast majority of the 1000+ people
developing software don't have any training in software development
processes/engineering/etc. beyond practical experience (which is a form
of training).  This is, in part, because most software development is
quite diffuse. Over half the software development projects are smaller
than 2 work-years, and software development is just part of the overall
bigger task.  That is, the scientist or engineer is doing software
development as a tool to do their job, not as a job in itself.  I suspect
a similar phenomenon is true in academe.  How much "I'll just whip up
this Matlab module to analyze the data" winds up being the basis for
"production code"?

On 3/29/12 4:58 AM, "Douglas Eadline" <deadline at eadline.org> wrote:

>I am glad some one is talking about this. I have wondered
>about this myself, but never had a chance to look into it.
