[Beowulf] Does computation threaten the scientific method?

Vincent Diepeveen diep at xs4all.nl
Thu Mar 29 17:45:44 PDT 2012


Brian - using medical examples is probably not a good idea in this
discussion.
Introducing a new medicine is roughly a three-phase process to get it
from the lab to the market;
getting approval at the political level is the hardest part.

Now a big problem is simply that you can already get a 'go' there
with 95% confidence - or around 200 persons
who react positively to whatever you were doing, in the SHORT TERM.

That's fairly little, actually, especially for the trillions in turnover
that psychiatric medicines altogether make nowadays.

So to speak, that's science from 1946.

The computation risk there is a different one - namely that in the short
term cocaine always works - and that's basically what kids get now:
the new medicines, for example for ADHD, look frighteningly similar to
cocaine and in the long term have the same side effects.

Yet they get prescribed massively, including to kids who do not need them.

The real problem is not so much the computation, but rather the
government rules that allow something onto the market based upon 200
short-term cases.
If you already sell a product for billions, even later corrections
to what you do will not soon make it into the mainstream, so it will
get used
for years and years until someone says STOP.

That STOP is very tough to give if they produced synthetic cocaine
and in the short term got through with the 200 cases that reacted
positively.
This is more of a government problem, of course - they are still 65+
years behind.

So mixing LAPACK statements with pharmaceuticals is not a good idea,
I'd say. Even if you're a lousy scientist everywhere else, you can still
go produce a new medicine and sell it worldwide.

Government should really take action there, and also modify that DSM
classification once again so that fewer people get diagnosed.
It's not always the doctors who want to put medicines into someone;
for teachers it's very convenient to do so, and one of the reports -
though I don't know whether it's accurate, it does seem so - said
33% more children are now diagnosed thanks to a few changes in the
classification system! As usual, of course, that was taken over from
the USA, as each individual small nation in Europe is too small to do
things like that on its own, with sometimes devastating consequences.

Vincent

On Mar 29, 2012, at 6:22 PM, Brian Dobbins wrote:

>
> To borrow from an old joke, I'd say the short answer is "No.", and  
> the long answer?  "Nooooooooooo."
>
> Reproducibility is an interesting issue - on the surface, it seems  
> like a binary thing: something is or is not reproducible.  In  
> reality, though, things are almost never duplicated exactly, and  
> there exists some fuzzy threshold at which point things are  
> considered good enough to be a reproduction.  I can go down to a  
> local store and buy a print of the Mona Lisa and, to me, it might  
> be a really great reproduction, yet even writing that sentence has  
> some art critic screaming in agony.  Similarly, in computing, if I  
> run some model on two different systems and get two different  
> results, that can either be indicative of a potential issue or it  
> can be completely fine, because those differences are below a  
> certain threshold and thus the runs were, in scientific terms,  
> 'reproducible' with respect to each other.
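>
> As a minimal, purely illustrative sketch - the tolerance values below
> are invented placeholders, not taken from any real code - such a
> "fuzzy threshold" comparison between two runs might look like this in C:
>
>   /* approx_equal.c - hypothetical reproducibility check; real
>      thresholds must come from the science (solver tolerance,
>      discretization error, ...), not from this file. */
>   #include <math.h>
>   #include <stdio.h>
>
>   static int approx_equal(double a, double b, double rel_tol, double abs_tol)
>   {
>       /* "Reproducible" here means: within an absolute floor, or within
>          a relative fraction of the larger magnitude. */
>       double diff  = fabs(a - b);
>       double scale = fmax(fabs(a), fabs(b));
>       return diff <= fmax(abs_tol, rel_tol * scale);
>   }
>
>   int main(void)
>   {
>       double run_a = 1.0000000000000002;   /* made-up results from     */
>       double run_b = 1.0000000000000004;   /* two hypothetical systems */
>
>       if (approx_equal(run_a, run_b, 1e-12, 1e-15))
>           printf("agree within tolerance: 'reproducible'\n");
>       else
>           printf("differ beyond tolerance: worth investigating\n");
>       return 0;
>   }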
>
> On a small scale (meaning a lab, code or project), this is a key  
> issue - I've seen grad students and faculty alike be dismayed by  
> trivial differences, and when this happens, more often than not the  
> mentality is, "My first results are correct - make this code give  
> them back to me", without understanding that the later, different  
> results are quite possibly equally valid, and possibly more so.   
> Back in the early Beowulf days, I remember switching some codes  
> from an RS/6000 platform to an x86-based one; the internal
> precision of the x86 FPU was 80 bits, not 64, so sequences of FP
> math could produce small differences unless extended precision was
> specifically disabled via compiler switches.  Which a lot of people
> did, not because the situation was carefully considered, but
> because with it on, the code gave 'wrong' results.  Another example
> would be an algorithm that was orders of magnitude faster than one  
> previously in use, but wasn't adopted because ultimately the  
> results were different.  The catch here?  Reordering the input data  
> while still using the original algorithm gave similarly different  
> answers - the nature of the code was that single runs were useless,  
> and ensemble runs were a necessity.
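>
> As a toy illustration - this is not any of the codes mentioned above -
> the fragility is easy to see: floating-point addition is not
> associative, so merely summing the same numbers in a different order,
> or letting the x87 FPU keep 80-bit intermediates versus forcing 64-bit
> rounding (e.g. GCC's -mfpmath=387 vs -mfpmath=sse, or -ffloat-store),
> can change the last bits of a result without either answer being "wrong":
>
>   /* fp_order.c - toy demo of order-dependent floating-point sums */
>   #include <stdio.h>
>   #define N 1000000
>
>   int main(void)
>   {
>       static double x[N];
>       for (int i = 0; i < N; i++)
>           x[i] = 1.0 / (double)(i + 1);      /* terms of many magnitudes */
>
>       double fwd = 0.0, bwd = 0.0;
>       for (int i = 0; i < N; i++)      fwd += x[i];   /* large terms first */
>       for (int i = N - 1; i >= 0; i--) bwd += x[i];   /* small terms first */
>
>       printf("forward  sum = %.17g\n", fwd);
>       printf("backward sum = %.17g\n", bwd);
>       printf("difference   = %.3g\n", fwd - bwd);     /* typically nonzero */
>       return 0;
>   }
>
> Both sums are equally legitimate approximations of the same quantity.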
>
> Ultimately, the issues here come down to the common perception of  
> computers - "They give you THE answer!" - versus the reality of  
> computers - "They give you AN answer!", with the latter requiring  
> additional effort to provide some error margin or statistical  
> analysis of results.  Such analysis happens in certain computational
> disciplines far more often than in others.
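>
> A hypothetical minimal version of that extra effort - the five numbers
> below are invented stand-ins for repeated or perturbed runs - is simply
> to report an ensemble mean with a spread instead of a single value:
>
>   /* ensemble_stats.c - report "AN answer" with an error margin */
>   #include <math.h>
>   #include <stdio.h>
>
>   int main(void)
>   {
>       double runs[] = { 42.01, 41.97, 42.05, 42.00, 41.99 };  /* made up */
>       int n = (int)(sizeof runs / sizeof runs[0]);
>
>       double mean = 0.0;
>       for (int i = 0; i < n; i++) mean += runs[i];
>       mean /= n;
>
>       double var = 0.0;
>       for (int i = 0; i < n; i++) var += (runs[i] - mean) * (runs[i] - mean);
>       double sd = sqrt(var / (n - 1));       /* sample standard deviation */
>
>       printf("result = %.3f +/- %.3f  (n = %d)\n", mean, sd, n);
>       return 0;
>   }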
>
> On the larger scales - whether reproducibility is an issue in  
> scientific fields - again, I'd say the answer is no.   The  
> scientific method is resilient, but it never made any claims to be  
> 'fast'.  Would it speed things up to have researchers publish their  
> code and data?  Probably.  Or, rather, it'd certainly speed up the  
> verification of results, but it might also inhibit new approaches  
> to doing the same thing.  Some people here might recall Michael  
> Abrash's "Graphics Programming Black Book", which had a wonderful  
> passage about a word-counting program.  It focused explicitly
> on performance tuning, with the key lesson being that nobody  
> thought there was a better way of doing the task... until someone  
> showed there was.  And that led to a flurry of new ideas.
> Similarly, having software that does things in a certain way often  
> convinces people that that is THE way of doing things, whereas if  
> they knew it could be done but not how, newer methods might  
> develop.  There's probably some happy medium here, since having so  
> many different codes, mostly with a single author who isn't a  
> software developer by training, seems less efficient and flexible  
> than a large code with good documentation, a good community and the  
> ability to use many of those methods previously in the one-off codes.
>
> In other words, we can probably do better, but science itself isn't  
> threatened by the inefficiency in verifying results, or even bad  
> results - in the absolute worst case, with incorrect ideas being  
> laid down as the foundation for new science and no checking done on  
> them, progress will happen until it can't... at which point people  
> will backtrack until they discover the underlying principle they
> thought was correct, and fix it.  The scientific method is a
> bit like a game of chutes and ladders in this respect.
>
> Ultimately, in a lot of ways, I think computational science has it  
> better than other disciplines.  There was news earlier this week  
> [1] about problems reproducing some early-stage cancer research -  
> specifically, Amgen tried to reproduce 53 'landmark' conclusions,  
> and were only able to do so with 11% of them.  Again, that's OK -  
> it will correct itself, albeit in slow fashion, but what's  
> interesting here is that these sorts of experiments, especially  
> those involving mice (and often other wet-lab methods), don't have  
> something like Moore's Law making them more accessible over time.   
> To reproduce a study involving the immune system of a mouse, I need  
> mice.  And I need to wait the proper number of days.  Yet with  
> computational science, what today may take a top end supercomputer  
> can probably be done in a few years on a departmental cluster.  A  
> few years after that?  Maybe a workstation.  In our field, data  
> doesn't really change or degrade over time and the ability to  
> analyze it in countless different ways becomes more and more  
> accessible all the time.
>
> In short (hah, nothing about this was short!), can we do better  
> with our scientific approaches?  Probably.  But is the scientific  
> method threatened by computation?  Nooooooooo.  :-)
>
> That's my two cents,
>   - Brian
>
> [1] http://vitals.msnbc.msn.com/_news/2012/03/28/10905933-rethinking-how-we-confront-cancer-bad-science-and-risk-reduction
>      Or, more directly (if you have access to Nature):
>      http://www.nature.com/nature/journal/v483/n7391/full/483531a.html
>
> (PS.  The one thing which can threaten science is a lack of  
> education - it can decrease the signal-to-noise ratio of 'good'  
> science, amongst other things.  That's a whole essay in itself.)
> (PPS.  This was a long answer, and yet not nearly long enough...  
> but I didn't want to be de-invited from future Beowulf Bashes by  
> writing even more!)
>
>
> On 3/29/2012 7:58 AM, Douglas Eadline wrote:
>>
>> I am glad someone is talking about this. I have wondered about
>> this myself, but never had a chance to look into it.
>> http://www.isgtw.org/feature/does-computation-threaten-scientific-method
>



