[Beowulf] vectors vs. loops

Joe Landman landman at scalableinformatics.com
Wed Apr 27 04:53:05 PDT 2005

Hi Vincent:

   As I am sure you are well aware, not all algorithms map well onto a 
distributed framework.  Some algorithms require huge global shared 
memories.  Some require extremely low-latency interconnects.  Modern 
clusters with very low-latency (single-digit microsecond), high-bandwidth 
(a substantial fraction of 1 GB/s) interconnection fabrics can work quite 
well on a rather large subset of problems.  It is not possible to run 
every code on them, though.

   As for rewriting, that may be a good point.  To make the best use of 
a particular architecture, you need to adapt your algorithms to take 
advantage of the features you have available.  Unfortunately, there is a 
time cost to adaptation, and also a productivity loss, as you have to 
spend your time (a zero-sum game) verifying results as well as porting.  
So, again as with other problems, it becomes a cost-benefit problem.  
The millions of $/euros/... spent on maintaining systems may be small 
compared to the costs of rewriting.  Not everything is as simple as 
dropping in an FFT library.
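
   For the cases where the "drop in an FFT" rewrite does apply, here is 
a minimal sketch of what it amounts to: a direct O(n^2) transform next to 
a recursive radix-2 FFT that computes the same result in O(n log n).  The 
function names (dft_direct, fft_radix2, max_abs_diff) are made up for 
illustration, the FFT assumes a power-of-two length, and a real code would 
call a tuned library rather than anything like this.

```python
import cmath


def dft_direct(x):
    """Direct discrete Fourier transform: O(n^2) operations."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT: O(n log n) operations.

    Illustrative only; assumes len(x) is a power of two.
    """
    n = len(x)
    if n <= 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def max_abs_diff(a, b):
    """Largest elementwise difference, used to verify the port."""
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    signal = [float(i % 7) for i in range(64)]
    print(max_abs_diff(dft_direct(signal), fft_radix2(signal)))
```

   Note that even in this easy case the comparison step is part of the 
job: the rewritten routine has to be checked against the old one, which 
is the verification cost mentioned above.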

   Moreover, things like the Cell architecture are interesting, but the 
point I was trying to make is that compilers need to become very smart 
to hide the complexity of how you need to program it.  This is true of 
many architectures.  Sure, you could write all of your algorithms in 
assembly if you want.  They wouldn't be portable, but they would be fast 
(I tend to note that this "Law" indicates that portable codes are not 
fast, and fast codes are not portable).

   It would be risky to assume that the Cell will auto-magically run 
your codes 10x or 100x faster without significant effort to make them 
adapt to the underlying architecture.

Vincent Diepeveen wrote:
> Way easier is a few libraries that do the entire matrix or invariant
> calculation for you.
> What I am always amazed about is that today's supercomputers cost tens of
> millions and maintenance even more (as there are sysadmin staffs of tens of
> people, and I had to work with many organisations, for example, all with paid
> dudes who just add to the bureaucracy, this in order to run on an outdated
> supercomputer with 500 MHz processors, still equipped with OFF-chip L2 cache
> !!!, whereas my competitors ran on 2.8 GHz Xeon MPs).
> Yet the majority of the code running on it is so childishly written.
> This while I've been programming for many years to have my code run
> in parallel very well. It also runs on PCs and must perform really well too
> when running on a single CPU, competing with tens of other programs.
> Government could easily afford to pay a few people to make code run
> ideally on those machines.
> There are several really good programmers.
> What I've seen running there, it would not be nice to start describing it
> here, but some really have no clue about programming.
> They program something in 1980, and for the next 30 years it is running and
> eating system time. That system time, because of the bureaucracy, is
> in practice very expensive, so a coder speeding them up big time would be a
> big help.
> A good coder can make the majority of the software run faster on a PC than
> it runs on those supercomputers.
> I met one programmer who 10 years or so ago helped a German project speed up;
> he managed to speed up a certain program by a factor of 1 million for his
> department, by adding fast Fourier transforms to the code.
> One coder saving tens of millions of Deutsche Marks.
> At 10:10 PM 4/26/2005 -0400, Andrew Piskorski wrote:
>>On Tue, Apr 26, 2005 at 10:01:25AM -0500, Ben Mayer wrote:
>>>I actually just did a small study of how well students in a parallel
>>>computing class write parallel codes on X1 with MPI and UPC. One of
>>>the things that stood out is that they tended to do odd things in
>>>their loops that inhibit code from vectorizing.
>>So, why were these students writing loops in the first place?  If the
>>goal is to generate vectorized code, wouldn't it make more sense to
>>use a language or library which directly supports vector commands?
>>E.g., although they're used for serial not parallel programming, S and
>>R are vector oriented in a pleasantly convenient way.  There do exist
>>languages and libraries specifically intended for vector programming,
>>like CVL or NESL, right, so, are they not useful?
>> http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/scandal/public/papers/cvl.html
>> http://www-2.cs.cmu.edu/~scandal/
>>Andrew Piskorski <atp at piskorski.com>
>>Beowulf mailing list, Beowulf at beowulf.org
>>To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
