[Beowulf] cluster softwares supporting parallel CFD computing

William Harman wharman at prism.net
Fri Sep 15 13:46:20 PDT 2006


Agree completely, and let's remember that the best algorithm is directly
related to the hardware platform that it is implemented on.  I have worked
with an algorithm specialist who has a unique understanding of how code gets
converted into the electrons that execute it.  This relationship is
becoming more apparent as companies integrate FPGA technology into their
solutions, and let's not forget RAM disc, interconnect, storage
technologies, and others.  A person who has mastered this relationship is
truly a rare bird.  But that is the academic side of the question.  The
practical side is the cost/performance metric (which includes the profit
motive in most cases), and more and more it is easier to throw low-cost
hardware at the problem than to pursue the more elegant solution, at least
at the macro level.  When you get to the top sites, however, this seems to
go completely in reverse, as the National Labs, etc., need the more
elegant solution to solve the 'grand challenge' problems.

I've also known some software guys who think they are one step from the
'throne' just because they can write some code and it executes.

Bill Harman,
P - (801) 572-9252  F - (801) 571-4927
wharman at prism.net
 
------ High Performance Computing Solutions ------

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On
Behalf Of Toon Knapen
Sent: Friday, September 15, 2006 8:05 AM
To: Patrick Geoffray
Cc: 'Beowulf List'; Mark Hahn
Subject: Re: [Beowulf] cluster softwares supporting parallel CFD computing

I agree that in general the quality of the parallelism in most codes is
rather low, unfortunately. But it is hard to prove that much can be gained
by improving that quality.

Let me elaborate. When developing an app that needs to run fast, one first
needs to look at using the best algorithm to get the job done.
While implementing the algorithm, attention must be paid to the app being
stable (no use in having a fast app that crashes all the time).
And finally you start optimizing. But while using a better algorithm might
give you a 50% boost, performance increases due to code optimization are
generally only marginal. Basically, changes early in the development process
will have a big effect on performance, while changes late in the development
process will have minor effects.
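
To make the distinction concrete, here is a minimal Python sketch of a toy
problem (counting pairs of values that sum to a target). The problem and the
function names are assumptions chosen purely for illustration; they do not
come from any CFD code. The "tuned" version keeps the same algorithm and only
trims redundant work, while the "hashed" version changes the algorithm. Each
function also returns how many loop steps it performed.

```python
def count_pairs_naive(values, target):
    """O(n^2): visit every ordered pair (i, j), then halve the hit count."""
    steps = 0
    hits = 0
    n = len(values)
    for i in range(n):
        for j in range(n):
            steps += 1
            if i != j and values[i] + values[j] == target:
                hits += 1
    # Each unordered pair was counted twice, once as (i, j) and once as (j, i).
    return hits // 2, steps


def count_pairs_tuned(values, target):
    """Still O(n^2): same algorithm, but each unordered pair is visited once."""
    steps = 0
    hits = 0
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
            if values[i] + values[j] == target:
                hits += 1
    return hits, steps


def count_pairs_hashed(values, target):
    """O(n): a different algorithm that remembers how often each value was seen."""
    steps = 0
    hits = 0
    seen = {}
    for v in values:
        steps += 1
        # Every earlier occurrence of (target - v) forms one pair with v.
        hits += seen.get(target - v, 0)
        seen[v] = seen.get(v, 0) + 1
    return hits, steps


if __name__ == "__main__":
    data = list(range(1000))
    for fn in (count_pairs_naive, count_pairs_tuned, count_pairs_hashed):
        pairs, steps = fn(data, 999)
        print(f"{fn.__name__}: pairs={pairs} steps={steps}")
```

On 1000 values the naive version performs 1,000,000 inner steps, the tuned
one 499,500, and the hashed one 1,000. The tuning gain is a constant factor
that stays the same as the input grows; the gain from the algorithm change
grows with the input. The ratios are specific to this toy problem and are not
a measurement of any real application.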

For instance, I wonder if any real-life application got a 50% boost by just
changing the switch (and the corresponding MPI implementation). Or, what
exactly is the speedup observed when switching from switch A to switch B on a
real-life application?
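
One way to bound the answer is an Amdahl-style estimate. The sketch below
assumes a simplified model that is not in the original discussion: a fixed
fraction of the run time is spent in communication, only that fraction is
affected by the new switch, and communication does not overlap with
computation. The function and parameter names are hypothetical.

```python
def overall_speedup(comm_fraction, comm_speedup):
    """Whole-application speedup when only communication gets faster.

    comm_fraction: share of the original run time spent communicating (0..1).
    comm_speedup:  factor by which the new switch speeds up that share (> 0).
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if comm_speedup <= 0.0:
        raise ValueError("comm_speedup must be positive")
    return 1.0 / ((1.0 - comm_fraction) + comm_fraction / comm_speedup)


def min_comm_fraction_for(target_speedup):
    """Smallest communication share for which the target is reachable at all,
    i.e. the share needed even if the new network were infinitely fast."""
    if target_speedup < 1.0:
        raise ValueError("target_speedup must be at least 1")
    return 1.0 - 1.0 / target_speedup


if __name__ == "__main__":
    # Hypothetical example: 20% of the time in communication, switch 2x faster.
    print(overall_speedup(0.2, 2.0))
    # Communication share required before a 1.5x (50%) boost is even possible.
    print(min_comm_fraction_for(1.5))
```

Under this model a 50% boost from a switch change alone requires the
application to spend at least one third of its run time in communication,
and that bound is only reached with an infinitely fast network. With 20% of
the time in communication and a switch twice as fast, the overall gain is
about 11%.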

toon


Patrick Geoffray wrote:
> Hi Mark,
> 
> Mark Hahn wrote:
>> all these points are accurate to some degree, but give a sad 
>> impression of the typical MPI programmer.  how many MPI programmers 
>> are professionals, rather than profs or grad students just trying to 
>> finish a calculation?
>> I don't know, since I only see the academic side.
> 
> I think that the sample of MPI codes or traces that I have seen so far 
> is a good representation of the academic, labs and commercial sides.
> It's pretty bad. I am sure there are many reasons, but a few come to mind:
> 
> * a lot of codes in academia and at the labs are written directly by 
> the scientist, physicist, chemist, whatever. They are experts in their
> domain, but they don't know how to write good code. Doesn't matter if
> it's parallel or sequential, they don't know how to do it right. In
> their defense, they never really learned, and they are doing the best
> they can. However, they really should work with professional 
> programmers. It's paradoxical that physicists would use the service of 
> a statistician to help them make sense of their experimental data, but 
> they don't want help for computer science.
> It's interesting to note that there has always been this push from 
> high in the food chain to bypass the human computer science expertise: 
> it was automagic compilers (OpenMP, HPF and family) in the past, it's 
> "high-productivity" languages now.
> 
> * On the commercial side, the codes are quite old, at least in their
> design. You can see traces of a port from SHMEM to MPI, with Barriers
> a-lot-and-often. You see collective communications done by hand, I
> guess because the implementation of the collectives sucked at the
> time. You see a shameful amount of unexpected messages, the kind
> where the receive is just a little too late, typical of a code that
> was designed for a slow network, relatively. In short, it looks like
> they minimize the investment in code maintenance.
> 
> 
>> for academics, time-to-publish is the main criterion, which doesn't 
>> necessarily mean well-designed or tuned code.  taking a significant
> 
> I don't know if time is really the constraint here. For grad
> students, sure, but I would not think that more time would help with
> profs. A good programming book maybe, but they are too proud to read
> those :-)
> 
> Patrick
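
As a rough illustration of why collectives "done by hand" can hurt, the
Python sketch below counts communication steps for a broadcast to all ranks.
It assumes the hand-written version is a plain loop of sends from the root,
which the quoted message does not actually state, and it compares that with
a binomial-tree scheme of the kind library broadcasts often use. It is a
step-count model only, with hypothetical function names, and makes no MPI
calls.

```python
def linear_broadcast_steps(num_ranks):
    """Root sends to every other rank, one send after another."""
    return max(num_ranks - 1, 0)


def tree_broadcast_rounds(num_ranks):
    """Binomial tree: in each round, every rank that already holds the data
    forwards it to one rank that does not, so coverage doubles per round."""
    have_data = 1  # only the root holds the data at the start
    rounds = 0
    while have_data < num_ranks:
        have_data = min(2 * have_data, num_ranks)
        rounds += 1
    return rounds


if __name__ == "__main__":
    for p in (8, 64, 1024):
        print(p, linear_broadcast_steps(p), tree_broadcast_rounds(p))
```

For 64 ranks the linear loop needs 63 sequential sends from the root, while
the tree finishes in 6 rounds. Real costs also depend on message size and
network topology, which this model ignores.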


--
Toon Knapen

------------------------------------------------
Check out our training program on acoustics and register on-line at
http://www.fft.be/?id=35