[Beowulf] [landman at scalableinformatics.com: Re: [Bioclusters] FPGA in bioinformatics clusters (again?)]

Joe Landman landman at scalableinformatics.com
Sat Jan 14 16:58:50 PST 2006

Jim Lux wrote:
> At 08:52 AM 1/14/2006, Eugen Leitl wrote:

> using masses of FPGAs is fundamentally different than using masses of 
> computers in a number of ways:

There are a range of options for application acceleration, and no one 
solution fits all models.  The rest of my posts in the other list 
covered this.

> 1) The FPGA is programmed at a lower conceptual level if you really want 
> to get the benefit of performance.  Sure, you can implement multiple 
> PowerPC cores on a Xilinx, and even run off the shelf PPC software on 
> them, however, it would be cheaper and faster to just get Power PCs.

I think that buying Xilinx cores to run PPC software is a serious abuse 
of the power of the FPGA.  Basically an FPGA is a circuit.  A digital 
one, but still a circuit.  If your application maps well into this, then 
you can realize excellent benefits (with appropriate caveats, 
specifically, YMMV)

> There's a world of difference between modifying someone's program in 
> FORTRAN or C and modifying something in Verilog or VHDL.  Fundamentally, 

It is a very different paradigm.

> the FPGA is a bunch of logic gates, and not a sequential VonNeumann 
> computer with a single ALU.  The usage model is different.

s/different/very different/

> 2) Most large FPGAs have very high bandwidth links available 
> (RocketIO for Xilinx for instance), although, they're hardly a 
> commodity generic thing with well defined high level methods of use 
> (e.g. not like using sockets).  You're hardly likely to rack and stack 
> FPGAs.

There are some system on a chip + FPGA folks making this more 
interesting and much easier.  Have a look at Stretch 
(www.stretchinc.com).  There are others.  It's still not "plug and go" 
but it is getting better.

> 3) Hardware failure probabilities are probably comparable between FPGAs 
> and conventional CPUs.  However, you're hardly likely to get the 
> economies of scale for FPGAs. 

From some of the builders of these cards I have spoken to, run rates of 
100 cards are considered large.

> Megagate FPGAs are in the multikilobuck 
> range: just for the chip. 

Yup :(

> There isn't a well developed commodity mobo 
> market for FPGAs, that you can just pick your FPGA, and choose from 
> among a half dozen boards that are all essentially identical 
> functionally.  

Well there are "some" but the FPGA world started in signal processing, 
so lots of the "generic" boards you can choose from have lots of other 
stuff you don't need.  And you are right, there is no standard interface 
to memory, or to IO.  This means that if you have a new version of the 
board, you have to port to your new version.  Similar to putting down a 
new BIOS to a degree, just different.

> What "generic" FPGA boards exist are probably in the 
> multikilobuck area as well.

I haven't seen too many "generic" boards for HPC.  Lots for DSP and related.

> 4)  There are applications for which FPGAs excel, but it's likely that a 
> FPGA solution to that problem is going to be very tailored to that 
> solution, and not particularly useful for other problems.  The FPGA may 
> be a perfectly generalized resource, but the system into which it is 
> soldered is not likely to be so.
> Joe's analogy to video coprocessors is apt.  Very specialized to a 
> particular need, where they achieve spectacular performances, especially 
> in an operations per dollar or operations per watt sense.  However, they 
> are very difficult to apply in a generalized way to a variety of problems.
> Of course, the video coprocessor is actually an ASIC, and is essentially 
> hardwired for a particular (set) of algorithms.  You don't see many 
> video cards out there with an FPGA on them, for the reason that the 
> price/performance would not be very attractive.

The interesting thing is the utilization of the shaders by folks seeking 
higher performance math and pipeline processing.  This suggests that for 
a well enough designed system, with an effectively standardized way to 
access the resources, there may be places where a 
micro-accelerator/co-processor might add value to a code.  The early 
x86-x87 pairs were like this.  An attached co-processor.  I don't think 
you are going to want to try to implement a large general purpose 
processor on an FPGA, as they don't have enough gates to be interesting. 
  But a highly specialized co-processor that has some sort of highly 
focused functionality has been useful in other areas.
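
A rough way to judge whether such an attached co-processor adds value to a 
code is an Amdahl-style estimate.  The sketch below is only that: the 
function name, the fractions, the speedups and the transfer-overhead term 
are all assumptions made up for illustration, not measurements from any 
real card.

```python
def overall_speedup(fraction_offloaded, accel_speedup, transfer_overhead=0.0):
    """Amdahl-style estimate of whole-application speedup.

    All times are normalized so the original runtime is 1.0.
    fraction_offloaded: share of the original runtime moved to the co-processor.
    accel_speedup:      how much faster that share runs on the co-processor.
    transfer_overhead:  extra time (same normalized units) spent moving data
                        to and from the co-processor (hypothetical parameter).
    """
    if not 0.0 <= fraction_offloaded <= 1.0:
        raise ValueError("fraction_offloaded must be between 0 and 1")
    if accel_speedup <= 0.0:
        raise ValueError("accel_speedup must be positive")
    if transfer_overhead < 0.0:
        raise ValueError("transfer_overhead must be non-negative")

    # Time left on the host + time on the accelerator + data movement cost.
    new_time = ((1.0 - fraction_offloaded)
                + fraction_offloaded / accel_speedup
                + transfer_overhead)
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical cases: the kernel runs 100x faster on the co-processor.
    for frac in (0.5, 0.9, 0.99):
        print(f"offload {frac:.0%}: {overall_speedup(frac, 100.0):.2f}x overall")
```

The point of the estimate is that the part of the code left on the host, 
plus any data movement, bounds the overall gain no matter how fast the 
co-processor itself is.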

> (mind you, if you've got the resources, and a suitable small set of 
> problems you're interested in, developing ASICs is the way to go.  
> D.E.Shaw Research is doing just this for their computational chemistry 
> problems.)

ASICs cost more than FPGAs to develop, and you are committed to a design. 
  ASICs will be much less expensive in high volumes.
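
The trade-off can be sketched as a break-even volume: the unit count at 
which the ASIC's higher up-front (NRE) cost is paid back by its lower 
per-chip cost.  Every dollar figure below is a made-up placeholder, and 
the function name is hypothetical; plug in real quotes to get a real 
answer.

```python
import math


def break_even_units(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Smallest volume at which total ASIC cost <= total FPGA cost.

    Total cost is modeled as NRE + units * unit_cost for each option.
    Returns None if the ASIC never catches up (its unit cost is not lower).
    """
    if asic_unit >= fpga_unit:
        return None
    extra_nre = asic_nre - fpga_nre
    if extra_nre <= 0:
        # The ASIC is cheaper from the very first unit.
        return 0
    return math.ceil(extra_nre / (fpga_unit - asic_unit))


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    units = break_even_units(asic_nre=1_000_000, asic_unit=50,
                             fpga_nre=100_000, fpga_unit=2_000)
    print(f"ASIC pays off at about {units} units")
```

Below that volume the FPGA is the cheaper route, and it keeps the option 
of changing the design.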

> 5) FPGA development tools are wretchedly expensive, compared to the 
> tools for "software". It's a more tedious, difficult and expensive 
> development process.

Actually the compilers are getting reasonable (using Pathscale EKO as a 
definition of reasonable); they aren't 100x or 10x that price. 
Something in the 2-5x region.  However, and this is important, the 
compilers aren't as good (yet) as humans for this work.

> There's a lot more software developers than FPGA designers out there, so 
> it's harder to find someone to do your FPGA.  It's not really a dollar 
> issue (top pros in both are both in the $100K/yr salary range) it's 
> finding someone who's interested in working on YOUR project.
> Sure, there are some basic "free-ish" tools out there for some FPGAs, 
> but if you want to do the equivalent of programming in a high level 
> language, you're going to be forking out $100K/yr in tools.
> Resynthesizing your FPGA design is often a lot slower than recompiling 
> your program.  Also, because it's basically a set of logic gates, there 
> typically isn't the concept of recompiling just part and relinking.  
> It's resynthesize EVERYTHING, and then you have to revalidate all the 
> timings, regenerate test vectors, etc.  There's no equivalent of 
> "patching".

In general yes.  Have a look at the Stretch bit.  It is quite 
interesting.  They have some issues as well (their FPGA is small).  But 
it is intriguing.

> James Lux, P.E.
> Spacecraft Radio Frequency Subsystems Group
> Flight Communications Systems Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
