[Beowulf] Re: vectors vs. loops

Thu Apr 28 07:50:12 PDT 2005

In message from Ben Mayer <bmayer at gmail.com> (Wed, 27 Apr 2005 
10:42:50 -0500):
>> However, most code doesn't vectorize too well (even, as you say, with
>> directives), so people would end up getting 25 MFLOPs out of 300 MFLOPs
>> possible -- faster than a desktop, sure, but using a multimillion dollar
>> machine to get a factor of MAYBE 10 in speedup compared to (at the time)
>> $5-10K machines.
>
>What the people who run these centers have told me is that a
>supercomputer is worth the cost if you can get a speedup of 30x over
>serial. What do others think of this?
>
>> The moral of this particular story is to NOT try to force code onto a
>> vector environment unless it is, really, a vector task.  Indeed, don't
>> force code into a PARALLEL environment (e.g. into PVM or MPI) unless it
>> is a NON-TRIVIAL parallel task (I spent a lot of time rewriting my code
>> as master-slave stuff in PVM, only to finally realize that EP tasks are
>> more easily managed by just running the damn jobs independently via e.g.
>> a script and accumulating results with other scripts, because writing
>> ROBUST PVM (or MPI) code -- code that can survive a casual reboot or
>> interruption of any particular node -- is Not Easy).
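For the embarrassingly parallel case described above, a minimal sketch of
the "just run the jobs independently and collect the results" approach
might look like the following. It is only an illustration: the worker
command (a child Python process that squares its seed) and the names
run_job and run_all are hypothetical stand-ins for a real job, and a real
setup would probably dispatch to other nodes (e.g. via ssh or a batch
queue) rather than to local processes.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(seed):
    """Run one independent job as its own process; return its result or None."""
    # Hypothetical worker: a child Python process that prints seed squared.
    cmd = [sys.executable, "-c", f"print({seed} ** 2)"]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True,
                             timeout=60, check=True)
        return int(out.stdout.strip())
    except (subprocess.SubprocessError, ValueError):
        # A job that crashes, times out, or prints garbage is recorded as
        # failed; it does not take the other jobs down with it.
        return None


def run_all(seeds, workers=4):
    """Run every job independently and split the seeds into done and failed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_job, seeds))
    done = {s: r for s, r in zip(seeds, results) if r is not None}
    failed = [s for s, r in zip(seeds, results) if r is None]
    return done, failed
```

Because each job is a separate process with no communication between
jobs, anything that lands in the failed list can simply be rerun later,
which is the robustness that is hard to get from a master-slave PVM or
MPI program.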
>
>:) I needed to do some CHARMM runs this summer. The X1 did not like it
>much (neither did I, but when the code is making references to punch
>cards and you are trying to run it on a vector super, I think most
>would feel that way). I ended up running it in parallel by a method
>similar to yours. Worked great!
>
>> If it IS a vector (or nontrivial parallel, or both) task, then the
>> problem almost by definition will EITHER require extensive "computer
>> science" level study -- work done with Ian Foster's book, Almasi and
>> Gottlieb for parallel and I don't know what for vector as it isn't my
>> area of need or expertise and Amazon isn't terribly helpful (most books
>> on vector processing deal with obsolete systems or are out of print, it
>> seems).
>
>So what we should really be trying to do is match code to the
>machine. One of the problems that I have run into is that unless one
>is at a large center there are only one or two machines that provide
>computing power. Where I am from we have an X1 and a T3E. Not a very good
>choice between the two. There should be a cluster coming up soon,
>which will give us the options that we need, i.e. vector or cluster.
>
>The manual for the X1 provides some information and examples. Are the
>Apple G{3,4,5} the only processors that have real vector units? I have
>not really looked at SSE(2), but remember that they were not full
>precision.

SSE2 provides 64-bit (double-precision) floating point, which is enough
in most cases. But if by "full" you mean 80-bit, you are right.
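
To put rough numbers on the difference, here is a small sketch. It does
not touch SSE2 itself; it only inspects the 64-bit double format that
Python floats use. The 64-bit significand width of the 80-bit x87
extended format is written in as a constant (a property of that format,
not something queried from the hardware), and the helper name
decimal_digits is my own.

```python
import math
import sys

# Significand width of a 64-bit IEEE 754 double, as reported by Python.
DOUBLE_MANT_BITS = sys.float_info.mant_dig

# Significand width of the 80-bit x87 extended format (assumed constant).
X87_MANT_BITS = 64


def decimal_digits(mant_bits):
    """Decimal digits guaranteed to survive a round trip through the format."""
    return math.floor((mant_bits - 1) * math.log10(2))


def absorbs_one_at_2_to_53():
    """True if adding 1.0 to 2**53 is lost to rounding in a 64-bit double."""
    big = float(2 ** 53)
    return big + 1.0 == big
```

On an IEEE 754 platform decimal_digits(DOUBLE_MANT_BITS) gives 15 and
decimal_digits(X87_MANT_BITS) gives 18, so the 80-bit format carries
about three more decimal digits.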

Yours
Mikhail Kuzminsky
Zelinsky Institute of Organic Chemistry
Moscow

>
>> For me, I just revel in the Computer Age.  A decade ago, people were
>> predicting all sorts of problems breaking the GHz barrier.  Today CPUs
>> are routinely clocked at 3+ GHz, reaching for 4 and beyond.  A decade
>
>I just picked up a Sempron 3000+, 1.5GB RAM, 120GB HD, CD-ROM, video,
>10/100 + Intel PRO/1000 for $540 shipped. I was amazed.