[Beowulf] Why one might want a bunch o' processors under your desk.

Jim Lux James.P.Lux at jpl.nasa.gov
Mon May 9 17:34:05 PDT 2005


At 01:40 PM 5/9/2005, Vincent Diepeveen wrote:
>At 05:49 PM 5/6/2005 -0700, Jim Lux wrote:
> >Today I was running a lot of antenna models, using a method of moments code
> >called NEC4 (in FORTRAN).
> >Just to describe the computational task for context:
> >
> >The antenna I am modeling consists of 9 patches in a square grid, the
> >middle one of which is excited.
> >
> >
> >
> >What I DON'T want to do is rewrite (or even recompile) the antenna modeling
> >code. It works, it's been validated, it's been optimized (to a certain
> >extent), and besides, my job is to use the code, not to rewrite it for
> >parallel computing.
>
>You know, I can get very sad reading that.
>
>I worked really hard for 1.5 years (for several months I worked 7 days a
>week, from 9 AM to 11 PM or later) to get a hard-to-parallelize
>algorithm to work on a 512-processor SGI Origin 3800, without being able
>to test on the machine.
>
>If you can get system time on a 1024-processor machine, how many CPU
>hours is that? It means that the organisation in question is spending tens
>of thousands of dollars of system time on you, and probably even more on
>the salaries of the organisation's people who look after the machine.
>
>You aren't even prepared to do the hard work to make the program run
>more efficiently within the system time you are given?
>
> >And yes, there are approximations, better modeling codes, etc.
> >available.  But again, I'd like to avoid having to track them down,
> >validate them, and so forth. I want to run my tried and true (but slow)
> >code, faster.
> >
> >I suspect that I am not alone.  There are probably hundreds of people who
> >have similar kinds of problems, and would be well served by a desktop or
> >personal supercomputer.
> >
> >Flame On!!
>
>If you are not prepared to modify the software,
>then basically I'm missing the point of the problem presented.
>
>Any way to run it more efficiently involves reprogramming the software.
>
>Matrix-type stuff is very well suited to parallelization.

Actually, this describes the basic problem in the high-performance 
computing area very well. The people who have jobs that "need" HPC don't 
have the skills, time, or resources to modify their code to use some 
particular computational resource.

So you have a resource (a very high-performance computational system) that 
goes begging for work, because there's some other "non-free" resource 
needed to use it effectively (that is, skilled software people). I 
should point out that JPL's 1024-processor Dell Xeon cluster is actually 
heavily used, as are the Cray and the SGI machines, so my comments are of a 
general nature.

And, yes, the organization IS paying hundreds of thousands of dollars to 
provide a shared resource, just as it pays for the buildings, the library, 
and so forth.  And, none of these resources are "free", even if they come 
as part of the institutional overhead.

But at some point you have to decide whether to allocate your resources 
to developing software or to working on your particular problem, for which 
the software is merely a tool.  You do a cost-benefit analysis: do I spend 
a work-month of time parallelizing some code, so that the remaining 4 
months' worth of work takes only 2 months? Or do I just soldier on with the 
old slow code, and adapt my working style to making overnight runs?
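
To make the trade concrete, here's a back-of-the-envelope sketch (in
Python, purely for illustration; the month and speedup figures are just
the guesses from the example above, not measurements):

    # Break-even estimate for the example above.
    # All numbers are assumptions for illustration, not measurements.
    remaining_work_months = 4.0  # work left if we keep the old serial code
    porting_months        = 1.0  # guessed effort to parallelize the code
    speedup               = 2.0  # guessed payoff: remaining work runs 2x faster

    as_is    = remaining_work_months                             # 4 months
    parallel = porting_months + remaining_work_months / speedup  # 1 + 2 = 3 months

    print(f"soldier on:  {as_is:.1f} months")
    print(f"parallelize: {parallel:.1f} months")
    # The port only wins by a month, and only if the effort and speedup
    # estimates hold and a skilled developer is available right now.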

Then there's also the situation that even if you DID have the money, you 
might not have the people.  It's very difficult to "buy" a few weeks of a 
skilled developer's time. If they're skilled, they're probably 
busy and fully subscribed. If I have to wait a month for them to fit me 
into their schedule, I might as well have kept running the old slow code 
and be partway to my end point.

And then there's the granularity-of-purchase problem.  If the 10 skilled 
developers are already fully occupied, my little one-work-month increment 
of work would require hiring a whole additional person, which my little 
research task could not afford.

Add to this the fact that, for most codes, it would probably take many, 
many work-months to significantly improve and modify them. It's a full-time 
job in itself. And that's assuming you have sufficient visibility into the 
code to do it.  What if you're stuck with a tool that is ONLY available as 
a compiled program (and such things are not particularly 
uncommon)?  Imagine trying to modify OpenOffice to use base 9 instead of 
base 10.  Sure, the source is available, and the actual change might be 
quite simple, once you knew where to change it.  The problem is that it 
would probably take you a year to find the 4 or 5 essential routines, and 
to make sure that everything still worked after you were done.


So... the trick is to find a way to make cluster (or super) computing 
usable in a transparent fashion.  This is one reason why people buy 
mainframes, after all: you can run the same old code, faster. It's the 
original concept that Cray had: run your unchanged FORTRAN program, a LOT 
faster.  It's also the concept behind a system I worked on back in the 
80s, where the idea was to build an 80286 emulator out of fast ECL, so that 
IBM PC software could be run lots faster.  Not particularly clever, but 
still, elegant in a kind of perverse way.
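
One hedged illustration of "run the unchanged code, faster": when the
workload is really many independent runs (say, a sweep of antenna model
input decks), the unmodified executable can simply be farmed out across
whatever processors are available.  The sketch below is only an assumption
of what such a driver might look like; the "nec4" binary name and the file
layout are made up for illustration and are not the real NEC4 interface:

    # Hypothetical driver: run an unmodified serial code (here called
    # "nec4") on many independent input decks at once.  The binary name
    # and file naming scheme are illustrative assumptions.
    import glob
    import subprocess
    from multiprocessing import Pool

    def run_case(deck):
        # Each case is a completely independent invocation of the old code.
        out = deck.replace(".nec", ".out")
        subprocess.run(["nec4", deck, out], check=True)
        return out

    if __name__ == "__main__":
        decks = sorted(glob.glob("cases/*.nec"))
        with Pool() as pool:              # one worker per local CPU
            results = pool.map(run_case, decks)
        print(f"finished {len(results)} runs")

On a cluster, the same idea usually shows up as an array of batch jobs
handed to the scheduler rather than a local process pool, but either way
the modeling code itself never gets touched or recompiled.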

If the reconfiguration takes maybe an hour or two of setting up 
(because that's essentially what it takes to install a new software 
package), you'll find that people are willing to do it.  But if it takes 
weeks and weeks, you won't get many takers.

It's not laziness, nor a lack of desire, just a lack of appropriate resources.








