[Beowulf] Stroustrup regarding multicore
Vincent Diepeveen
diep at xs4all.nl
Wed Aug 27 08:47:46 PDT 2008
On Aug 26, 2008, at 10:34 PM, Tim Cutts wrote:
>
> On 26 Aug 2008, at 2:29 pm, Perry E. Metzger wrote:
>
>> I think part of the issue is that most people doing scientific
>> computing don't have computer science backgrounds, which is a
>> shame.
>
> There is an unwritten recruitment rule, certainly in my field of
> science, that the programmer "must understand the science", and
> actually being able to write good code is very much a secondary
> requirement.
I couldn't disagree more.
Maybe your judgement is not objective.
Let's be objective about any serious software. Only a few people will
learn to program really well and manage to find their way around complex
codes. Learning that usually takes about 10 years. Someone with a PhD or
Master's or whatever in some science is usually capable of explaining
and understanding things.
Becoming a very good low level programmer is a lot harder than
learning a few more algorithms that can solve a specific problem.
Understanding how to write efficient parallel code is especially hard.
I spoke with a guy who last week figured out some stuff that was used
on supercomputers in the 90s, and he concluded it was very inefficient.
Those were *professors* in computer science and math who were involved
in that.
A single good low-level programmer can quite often speed things up by
a factor of 50.
In fact I remember a statement from a programmer who was hired in
Germany to do some physics work; after a few years he had managed a
speedup of a factor of 1000+ over the original software. It was
actually more than a factor of 1000, it was an exponential speedup.
In his opinion: "getting a speedup less than a factor 1000
in scientific number crunching software you can do with your eyes
closed".
Knowing everything about efficient caching and hashing, and how to
divide that over the nodes without paying the full latency or losing a
factor of 50+ to MPI messaging alone, is simply a full-time expertise
in itself. You can find far fewer people who can do that than the huge
number of people who can explain the field's science to you.
Note that bioinformatics is a bad example to mention: it eats a grand
total of < 0.5% of system time on supercomputers, and even that system
time is hardly used efficiently. There is just not much to calculate
there compared to math, physics and everything to do with the weather,
from the climate X years from now to earthquake prediction.
Physics by itself eats 50%+ of all supercomputer time.
> I think this grew out of the last 20 years of exponentially
> increasing computer power which meant that in many fields you could
> write crappy code and just wait for hardware improvements to make
> it faster. This is particularly the case in fields such as
> bioinformatics where the field came into existence since the days
> of very limited memory and very slow machines, so they never
> experienced the world when writing tight code was essential (I
> started programming in 1984, so I can barely remember those days
> either). This is further hindered by the fact that no-one doing a
> masters in Bioinformatics learns a compiled language. They learn
> things like Java, R, perl, python and ruby.
>
> There is light at the end of the tunnel, though. I'm beginning to
> see signs that people are starting to be hired primarily as
> programmers, and not scientists. This is usually in areas where
> the scientists have hit a brick wall in terms of performance, and
> with exponentially increasing data quantities, had nowhere else to
> go. I expect this to gradually expand over the next couple of
> years, but there's going to be a lot of pain in the meantime -
> particularly for those of us building and running the systems, who
> will tend to get the blame when we supply a 10,000 core cluster and
> the scientists find their code doesn't run any faster than it does
> on the current 1,000 core system. "It's a more powerful system, it
> must be your fault it's not working"
>
> Tim
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome
> Research Limited, a charity registered in England with number
> 1021457 and a company registered in England with number 2742969,
> whose registered office is 215 Euston Road, London, NW1
> 2BE.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
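The 1,000-core versus 10,000-core complaint quoted above can be roughly
illustrated with Amdahl's law. This is only a sketch: the 1% serial
fraction is an assumed figure, not a measurement from anyone's code in
this thread, and the function name amdahl_speedup is made up for the
example. Real codes also lose time to messaging and latency, which this
simple model ignores.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup over a single core under Amdahl's law.

    serial_fraction is the share of run time that cannot be parallelised.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


# Assumed for illustration only: 1% of the run time is strictly serial.
SERIAL_FRACTION = 0.01

if __name__ == "__main__":
    for cores in (1_000, 10_000):
        speedup = amdahl_speedup(SERIAL_FRACTION, cores)
        print(f"{cores:>6} cores: {speedup:.1f}x over one core")
```

Under that assumption, ten times as many cores buys less than 10%
extra speed, which is the kind of result that gets blamed on the
cluster rather than on the code.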