[Beowulf] Why one might want a bunch o' processors under your desk.
Jim Lux James.P.Lux at jpl.nasa.gov
Mon May 9 17:34:05 PDT 2005
At 01:40 PM 5/9/2005, Vincent Diepeveen wrote:
>At 05:49 PM 5/6/2005 -0700, Jim Lux wrote:
> >Today I was running a lot of antenna models, using a method of moments code
> >called NEC4 (in FORTRAN).
> >Just to describe the computational task for context:
> >
> >The antenna I am modeling is 9 patches, in a square grid, the middle one of
> >which is excited.
> >
> >
> >
> >What I DON'T want to do is rewrite (or even recompile) the antenna modeling
> >code. It works, it's been validated, it's been optimized (to a certain
> >extent), and besides, my job is to use the code, not to rewrite it for
> >parallel computing.
>
>You know, i can get very sad reading that.
>
>I worked for 1.5 years real hard (i have worked several months, 7 days a
>week, from 9 AM to 11 PM or later even) to get a hard to parallellize
>algorithm to work on a 512 processor SGI origin3800, without being able to
>test on the machine.
>
>If you can get system time on a 1024 processor machine for how many cpu
>hours is it? That means that the organisation in question is spending on
>you tens of thousands of dollars of system time and probably even more to
>salaries of the organisations guarding the machine.
>
>You aren't even prepared to do hard work to let the program run more
>efficient within the system time given?
>
> >And yes, there are approximations, better modeling codes, etc.
> >available. But again, I'd like to avoid having to track them down,
> >validate them, and so forth. I want to run my tried and true (but slow)
> >code, faster.
> >
> >I suspect that I am not alone. There are probably hundreds of people who
> >have similar kinds of problems, and would be well served by a desktop or
> >personal supercomputer.
> >
> >Flame On!!
>
>If you are not prepared to modify the software,
>then basically i'm missing the point of the problem presented.
>
>Any way to run it more efficient involves re-programming the software.
>
>Matrix type stuff is very well possible to parallellize.
Actually, this describes the basic problem in the high performance computing area very well. The people who have jobs that "need" HPC don't have the skills, time, or resources to modify their code to use some particular computational resource. So you have a resource (a very high performance computational system) that goes begging for work, because some other "non-free" resource (that is, skilled software people) is needed to use it effectively.

I should point out that JPL's 1024 processor Dell Xeon cluster is actually heavily used, as are the Cray and the SGI machines, so my comments are of a general nature. And, yes, the organization IS paying hundreds of thousands of dollars to provide a shared resource, just as it pays for the buildings, the library, and so forth. And none of these resources are "free", even if they come as part of the institutional overhead.

But, at some point, you have to decide whether to allocate your resources to developing software, or to working on your particular problem, for which the software is merely a tool. You do a cost-benefit analysis: do I spend a work month parallelizing some code, so that the remaining 4 months' worth of work takes only 2 months? Or do I just soldier on with the old slow code, and adapt my working style to making overnight runs?

Then, there's also the situation that even if you DID have the money, you might not have the people resources. It's very difficult to "buy" a few weeks' time of a skilled developer. If they're skilled, they're probably busy and fully subscribed. If I have to wait a month for them to fit me into their schedule, I might as well have been running the old slow code, and be partway to my end point.

And then there's the granularity of purchase problem. If the 10 skilled developers are already fully occupied, my little one-work-month increment of work would require hiring a whole additional person, which my little research task could not afford.
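The cost-benefit question above (one work month of parallelizing so that 4 months of work takes 2) can be put into a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes the speedup applies uniformly to all remaining work and that nothing else (waiting for a developer, validation) adds to the schedule.

```python
def total_months(remaining_work, port_cost=0.0, speedup=1.0):
    """Calendar months to finish: porting effort plus the (possibly faster) runs.

    remaining_work: months of work at the old, slow speed
    port_cost:      months spent modifying the code (0 if left untouched)
    speedup:        assumed factor by which the modified code is faster
    """
    return port_cost + remaining_work / speedup


def break_even_work(port_cost, speedup):
    """Smallest amount of remaining work (in months) at which porting pays off.

    Porting wins when port_cost + W / speedup < W,
    i.e. when W > port_cost * speedup / (speedup - 1).
    """
    if speedup <= 1.0:
        raise ValueError("porting never pays off without a speedup above 1")
    return port_cost * speedup / (speedup - 1.0)


if __name__ == "__main__":
    print(total_months(4))         # soldier on with the old code
    print(total_months(4, 1, 2))   # one month of porting, then 2x faster
    print(break_even_work(1, 2))   # work needed before porting is worth it
```

With the numbers from the message, porting finishes in 3 months instead of 4, and it only pays off at all when more than 2 months of slow-code work remain. Any delay in getting a developer started eats directly into that one-month margin.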
Add to this the fact that for most codes, it would probably take many, many work months to significantly improve and modify them. It's a full time job in itself. And that's assuming that you have sufficient visibility into the code to do it. What if you're stuck with a tool that is ONLY available as a compiled program (and such things are not particularly uncommon)?

Imagine trying to modify OpenOffice to use Base 9 instead of Base 10. Sure, the source is available, and the actual change might be quite simple, once you knew where to make it. The problem is that it would probably take you a year to find the 4 or 5 essential routines, and to make sure that everything still worked after you were done.

So... the trick is to find a way to make cluster (or super) computing usable in a transparent fashion. This is one reason why people buy mainframes, after all. You can run the same old code, faster. It's the original concept that Cray had: run your unchanged FORTRAN program, a LOT faster. It's the original concept behind a system I worked on back in the 80s, where the idea was to build an 80286 emulator out of fast ECL, so that IBM PC software could be run lots faster. Not particularly clever, but still, elegant in a kind of perverse way.

If the reconfiguration extends to maybe an hour or two of setting up (because that's essentially what it takes to install a new software package), you'll find that people are willing to do it. But if it takes weeks and weeks, you'll not get many takers. It's not laziness, nor a lack of desire, just a lack of appropriate resources.
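One way "transparent" use can look when the work is many independent runs of the same program (such as a batch of antenna models) is to farm the runs out to several processors and leave the program itself untouched. The sketch below assumes the program takes its input file as a single command-line argument; how NEC4 is actually invoked is not stated here, so the command and the file names are placeholders, and `run_one` and `run_sweep` are hypothetical helper names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(command, input_path):
    """Run the unmodified program once on one input file.

    command is the program and any fixed arguments, as a list of strings.
    Returns (input_path, exit_code).
    """
    result = subprocess.run(
        list(command) + [input_path],
        capture_output=True,  # keep each run's output out of the terminal
        text=True,
    )
    return input_path, result.returncode


def run_sweep(command, input_paths, workers=4):
    """Run the program over many inputs, at most `workers` at a time.

    Threads are enough here: each one only waits on an external process,
    so the operating system spreads the real work across the processors.
    Results come back in the same order as input_paths.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda path: run_one(command, path), input_paths))
```

A call such as `run_sweep(["./antenna_model"], ["patch_01.in", "patch_02.in"], workers=2)` (placeholder names) would then keep two processors busy without recompiling anything. This only helps when the runs are independent; it does nothing to speed up a single long run.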
