[Beowulf] Re: vectors vs. loops
Joe Landman
landman at scalableinformatics.com
Wed Apr 27 16:26:10 PDT 2005
Ben Mayer wrote:
>>However, most code doesn't vectorize too well (even, as you say, with
>>directives), so people would end up getting 25 MFLOPs out of 300 MFLOPs
>>possible -- faster than a desktop, sure, but using a multimillion dollar
>>machine to get a factor of MAYBE 10 in speedup compared to (at the time)
>>$5-10K machines.
>
>
> The people who run these centers have told me that a
> supercomputer is worth the cost if you can get a speedup of 30x over
> serial. What do others think of this?
I thought a good new machine should be 4-10x the current speed of your
old machine. A "supercomputer" is a hard thing to define in general
terms. You could look at it as "at least 2 orders of magnitude faster
than what you can do today" without significant effort (e.g. without
rewriting your 100k line code from scratch) ...
[...]
> So what we should really be trying to do is matching code to the
> machine.
"Portable code is not fast, fast code is not portable"
There is a price for every decision. How hard are you willing to work
to make your code fast? How much time (or money) are you willing to
spend to do this?
[...]
> The manual for the X1 provides some information and examples. Are the
> Apple G{3,4,5} the only processors that have real vector units? I have
> not really looked at SSE(2), but remember that they were not full
> precision.
Altivec are just SIMD units, but with a saner instruction set design
than SSEx. Here is hoping that SSE4 will have a real maximum/minimum
function. :(
Not sure what you mean by full precision. SSE2 has a variety of
formats, and the ISA design makes it hard to get data in and out of the
SIMD registers. Packing/unpacking are very expensive.
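As a rough illustration of the max/min and pack/unpack points, here is a minimal sketch in C. It assumes one possible reading of the complaint: SSE/SSE2 offer a lane-wise float maximum (`_mm_max_ps`) but no single instruction that reduces the four lanes of a register to one value, so the last step has to be built from shuffles. The function name `max_f32` is hypothetical, and the code falls back to a plain scalar loop when the compiler does not define `__SSE2__`.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE float ops */
#endif

/* Hypothetical helper: return the largest of n floats (n must be >= 1). */
float max_f32(const float *x, size_t n)
{
    size_t i = 0;
    float best;

    assert(n >= 1);
    best = x[0];

#if defined(__SSE2__)
    if (n >= 4) {
        /* Pack: load 4 floats into one SIMD register (unaligned load). */
        __m128 vmax = _mm_loadu_ps(x);
        __m128 t;

        /* Lane-wise maximum over the array, 4 floats per step. */
        for (i = 4; i + 4 <= n; i += 4)
            vmax = _mm_max_ps(vmax, _mm_loadu_ps(x + i));

        /* Horizontal step: no single instruction folds the 4 lanes,
           so swap halves, take max, swap neighbours, take max. */
        t = _mm_shuffle_ps(vmax, vmax, _MM_SHUFFLE(1, 0, 3, 2));
        vmax = _mm_max_ps(vmax, t);
        t = _mm_shuffle_ps(vmax, vmax, _MM_SHUFFLE(2, 3, 0, 1));
        vmax = _mm_max_ps(vmax, t);

        /* Unpack: move lane 0 back into a scalar float. */
        best = _mm_cvtss_f32(vmax);
    }
#endif

    /* Scalar tail (or the whole array when SSE2 is unavailable). */
    for (; i < n; ++i)
        if (x[i] > best)
            best = x[i];

    return best;
}
```

Calling `max_f32(a, 6)` on a six-element array takes one SIMD step over the first four values, then the two shuffle/max pairs, and finishes the last two elements in the scalar tail. The shuffle sequence is the part a "real" horizontal maximum would replace.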
>>For me, I just revel in the Computer Age. A decade ago, people were
>>predicting all sorts of problems breaking the GHz barrier. Today CPUs
>>are routinely clocked at 3+ GHz, reaching for 4 and beyond. A decade
>
>
> I just picked up a Sempron 3000+, 1.5GB RAM, 120GB HD, CD-ROM, video,
> 10/100 + intel 1000 Pro for $540 shipped. I was amazed.
Well we are going to run into some thermodynamical (structural
stability) limits pretty soon. At some feature size (haven't done the
calculations, guessing in the 10-30nm region) the defect formation
energies will become comparable to the thermal energy. When this
happens, the devices do a pretty good job of destroying themselves,
usually with threading dislocations (this happened in the early days of
blue LEDs). The usual tricks to stabilize structures get harder at
smaller sizes, and the electronic structure effects of surface
deformations underneath the wires lead to some interesting electronic
responses. I have doubts that we will ever see 1 atom wires. Then
again, things like carbon nanotubes and other self assembling bits are
quite intriguing. And they are small and quite rigid.
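To put a number on "comparable to the thermal energy", here is a small sketch in C that computes k_B*T in eV and the ratio of a defect formation energy to it. The Boltzmann constant is the standard value in eV/K. The function names are hypothetical, and any formation energy passed in is a placeholder for illustration, not a measured value for a particular material or feature size.

```c
/* Boltzmann constant in eV per kelvin. */
#define K_B_EV_PER_K 8.617333262e-5

/* Thermal energy k_B * T in eV for a temperature given in kelvin. */
double thermal_energy_ev(double temp_kelvin)
{
    return K_B_EV_PER_K * temp_kelvin;
}

/* Ratio E_f / (k_B * T): how many units of thermal energy it costs
   to form one defect. e_form_ev is an assumed, illustrative input. */
double formation_to_thermal_ratio(double e_form_ev, double temp_kelvin)
{
    return e_form_ev / thermal_energy_ev(temp_kelvin);
}
```

At 300 K, `thermal_energy_ev(300.0)` is about 0.026 eV, so a placeholder formation energy of 1 eV sits roughly 39 times above the thermal energy. The concern above is what happens as that ratio shrinks toward order one.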
Still, I am waiting for bosonic computation (photons). Enough of these
fermions (electrons/holes). Massively parallel by design.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615