<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
To borrow from an old joke, I'd say the short answer is "No.", and
the long answer? "Nooooooooooo."<br>
<br>
Reproducibility is an interesting issue - on the surface, it seems
like a binary thing: something is or is not reproducible. In
reality, though, things are almost never duplicated exactly, and
there exists some fuzzy threshold at which point things are
considered good enough to be a reproduction. I can go down to a
local store and buy a print of the Mona Lisa and, to me, it might be
a really great reproduction, yet even writing that sentence has some
art critic screaming in agony. Similarly, in computing, if I run
some model on two different systems and get two different results,
that can either be indicative of a potential issue or it can be
completely fine, because those differences are below a certain
threshold and thus the runs were, in scientific terms,
'reproducible' with respect to each other.<br>
<br>
On a small scale (meaning a lab, code or project), this is a key
issue - I've seen grad students and faculty alike be dismayed by
trivial differences, and when this happens, more often than not the
mentality is, "My first results are correct - make this code give
them back to me", without understanding that the later, different
results are quite possibly just as valid, if not more so.
Back in the early Beowulf days, I remember switching some codes from
an RS/6000 platform to an x86-based one. The internal precision of
the x86 FPU was 80 bits, not 64, so sequences of FP math could
produce small differences unless that extended precision was
specifically disabled via compiler switches - which a lot of people
did, not because the situation was carefully considered, but because
leaving it on gave 'wrong' results. Another example would be an
algorithm that was orders of magnitude faster than one previously in
use, but wasn't adopted because ultimately the results were
different. The catch here? Reordering the input data while still
using the original algorithm gave similarly different answers - the
nature of the code was that single runs were useless, and ensemble
runs were a necessity. <br>
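<br>
As an aside, here's a minimal sketch (my illustration, not taken from
any of the codes above) of why this sort of thing happens:
floating-point addition isn't associative, so summing the same values
in a different order - or carrying a different internal precision
between operations - can change the last few bits of the result.
Plain C is enough to show it:<br>
<pre wrap="">
/* fp_order.c - illustrative only: summing the same IEEE-754 doubles
 * in two different orders can give slightly different answers,
 * because floating-point addition is not associative. */
#include &lt;stdio.h&gt;

#define N 1000000

static double x[N];

int main(void)
{
    double forward = 0.0, backward = 0.0;

    for (int i = 0; i &lt; N; i++)
        x[i] = 1.0 / (i + 1.0);          /* harmonic-series terms */

    for (int i = 0; i &lt; N; i++)          /* largest terms first */
        forward += x[i];
    for (int i = N - 1; i &gt;= 0; i--)     /* smallest terms first */
        backward += x[i];

    printf("forward  = %.17g\n", forward);
    printf("backward = %.17g\n", backward);
    printf("diff     = %g\n", forward - backward);
    return 0;
}
</pre>
On the x87-era hardware mentioned above, compiling the same code with
and without something like GCC's -ffloat-store (which forces
intermediates back out of the 80-bit registers) moves those last bits
around in exactly the same way - neither answer is more 'correct'
than the other.<br>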
<br>
Ultimately, the issues here come down to the common perception of
computers - "They give you THE answer!" - versus the reality of
computers - "They give you AN answer!", with the latter requiring
additional effort to provide some error margin or statistical
analysis of results. That extra analysis happens in certain
computational disciplines far more often than in others.<br>
<br>
On the larger scales - whether reproducibility is an issue in
scientific <i>fields</i> - again, I'd say the answer is no. The
scientific method is resilient, but it never made any claims to be
'fast'. Would it speed things up to have researchers publish their
code and data? Probably. Or, rather, it'd certainly speed up the
verification of results, but it might also inhibit new approaches to
doing the same thing. Some people here might recall Michael
Abrash's "Graphics Programming Black Book", which had a wonderful
passage where about a word-counting program. It focused explicitly
on performance tuning, with the key lesson being that nobody thought
there was a better way of doing the task... until someone showed
there was. And that lead to a flurry of new ideas. Similarly,
having software that does things in a certain way often convinces
people that that is THE way of doing things, whereas if they knew it
could be done but not how, newer methods might develop. There's
probably some happy medium here, since having so many different
codes, most with a single author who isn't a software developer by
training, seems less efficient and flexible than one large code with
good documentation, a good community and the ability to fold in many
of the methods previously scattered across those one-off codes.<br>
<br>
In other words, we can probably do better, but science itself isn't
threatened by the inefficiency in verifying results, or even bad
results - in the absolute worst case, with incorrect ideas being
laid down as the foundation for new science and no checking done on
them, progress will happen until it can't... at which point people
will backtrack until they discover the underlying principle they
thought was correct, and fix it. The scientific method is a bit
like a game of chutes and ladders in this respect.<br>
<br>
In a lot of ways, though, I think computational science has it
better than other disciplines. There was news earlier this week [1]
about problems reproducing some early-stage cancer research -
specifically, Amgen tried to reproduce 53 'landmark' conclusions,
and was able to do so with only 11% of them. Again, that's OK - it
will correct itself, albeit in slow fashion, but what's interesting
here is that these sorts of experiments, especially those involving
mice (and often other wet-lab methods), don't have something like
Moore's Law making them more accessible over time. To reproduce a
study involving the immune system of a mouse, I need mice. And I
need to wait the proper number of days. Yet with computational
science, what today may take a top-end supercomputer can probably be
done in a few years on a departmental cluster. A few years after
that? Maybe a workstation. In our field, data doesn't really
change or degrade over time and the ability to analyze it in
countless different ways becomes more and more accessible all the
time.<br>
<br>
In short (hah, nothing about this was short!), can we do better with
our scientific approaches? Probably. But is the scientific method
threatened by computation? Nooooooooo. :-)<br>
<br>
That's my two cents,<br>
- Brian<br>
<br>
[1]
<a class="moz-txt-link-freetext" href="http://vitals.msnbc.msn.com/_news/2012/03/28/10905933-rethinking-how-we-confront-cancer-bad-science-and-risk-reduction">http://vitals.msnbc.msn.com/_news/2012/03/28/10905933-rethinking-how-we-confront-cancer-bad-science-and-risk-reduction</a><br>
Or, more directly (if you have access to Nature) :
<a class="moz-txt-link-freetext" href="http://www.nature.com/nature/journal/v483/n7391/full/483531a.html">http://www.nature.com/nature/journal/v483/n7391/full/483531a.html</a><br>
<br>
(PS. The one thing which can threaten science is a lack of
education - it can decrease the signal-to-noise ratio of 'good'
science, amongst other things. That's a whole essay in itself.)<br>
(PPS. This was a long answer, and yet not nearly long enough... but
I didn't want to be de-invited from future Beowulf Bashes by writing
even more!)<br>
<br>
<br>
On 3/29/2012 7:58 AM, Douglas Eadline wrote:
<blockquote
cite="mid:7219832b87625990e515bb6f9ddb621d.squirrel@mail.eadline.org"
type="cite">
<pre wrap="">
I am glad some one is talking about this. I have wondered
about this myself, but never had a chance to look into it.
<a class="moz-txt-link-freetext" href="http://www.isgtw.org/feature/does-computation-threaten-scientific-method">http://www.isgtw.org/feature/does-computation-threaten-scientific-method</a>
</pre>
</blockquote>
<br>
</body>
</html>