[Beowulf] [landman at scalableinformatics.com: Re: [Bioclusters] FPGA in bioinformatics clusters (again?)]
Eugen Leitl
eugen at leitl.org
Sat Jan 14 08:52:51 PST 2006
----- Forwarded message from Joe Landman <landman at scalableinformatics.com> -----
From: Joe Landman <landman at scalableinformatics.com>
Date: Sat, 14 Jan 2006 11:45:14 -0500
To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
Subject: Re: [Bioclusters] FPGA in bioinformatics clusters (again?)
User-Agent: Thunderbird 1.5 (Windows/20051201)
Reply-To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
[in order to keep this list at a higher quality, let's keep the
marketing side to a minimum and simply debate the merits]
Kathleen wrote:
>From what I'm learning, cross communication is less likely an issue in
>transactional-based problems. Protein-folding would be a great example of a
>non-transactional problem that is tightly coupled (needs cross
>communication).

???
Protein folding, depending upon the algorithm used, could be expressed
in terms of a lattice model, a force field model with or without
electrostatics and rigid atoms, or more complex models with
electrostatics and quantum effects, etc. The computational cost of
these models varies tremendously. If you are attempting to do a
conformational search over a large molecule, you really want to set up
a huge sample and search space. This is an embarrassingly parallel
problem. Not tightly coupled. Very little in-algorithm communication.
Most of it consists of sending results back to the main process.
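A minimal sketch of that embarrassingly parallel pattern, in Python;
score_conformation here is a hypothetical stand-in for whatever energy
model (lattice, force field, ...) is actually in use:

  # Sketch: embarrassingly parallel conformational search.
  # Each worker evaluates one conformation independently; the only
  # communication is the result returned to the main process.
  import random
  from multiprocessing import Pool

  def score_conformation(seed):
      # Hypothetical scoring: a real code would build a conformation
      # from the seed and evaluate its energy model here.
      rng = random.Random(seed)
      energy = sum(rng.uniform(-1.0, 1.0) for _ in range(1000))
      return seed, energy

  if __name__ == "__main__":
      with Pool() as pool:
          results = pool.map(score_conformation, range(100000))
      best = min(results, key=lambda r: r[1])
      print("best conformation seed/energy:", best)

No worker ever talks to another worker, which is exactly why this kind
of search scales out so easily.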
[ ... the rest of the marketing bits deleted ...]
>Can a user intermix FPGAs with COTS-based technology?
Yes. Easily. This is being done today.
>What if one FPGA on a board fails?
Buy a new one. They are around the same price as a node or two. Most
are covered by a warranty of some sort.
>How easy
>is it to swap that out?
One version: pop the top of the unit off, pull the card out, put the
new one in, pop the top back on, reboot.
Another version: disconnect the USB2 cable and connect the new one.
Another version: disconnect the network wire/optical cable and connect
the new one.
...
>Do you swap out just the failed FPGA or the whole
>board?
Same debugging model as the node. FPGAs don't tend to fail
individually; you debug at the same level as the node (i.e., the
subsystem). I don't see people pulling the northbridge off their
motherboards when it fails. I see them replacing the motherboard. This
is no different.
>Who would swap that out?
Anyone who knows how to put cards into their machine. That includes the
vast majority of people on this list, and cluster users in general.
>What happens to the work allocated to the
>failed FPGA,
This is up to the scheduling software. If configured as such, this is
not a problem. It is identical to the issue of what happens when a node
fails, as the FPGA is part of the node.
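A toy sketch of that requeue-on-failure behavior (illustrative only;
run_on_node and the ConnectionError failure signal are assumptions
here, not any real scheduler's API):

  # Sketch: a scheduler re-queues work when a node (FPGA included)
  # fails; the FPGA is just part of the node, as noted above.
  from collections import deque

  def schedule(jobs, nodes, run_on_node):
      queue = deque(jobs)
      while queue:
          if not nodes:
              raise RuntimeError("all nodes have failed")
          job = queue.popleft()
          node = nodes[hash(job) % len(nodes)]
          try:
              run_on_node(node, job)
          except ConnectionError:
              # Node is down: retire it and put the job back in line.
              nodes.remove(node)
              queue.append(job)

Real schedulers implement this with their own failure-detection and
requeue policies; the point is only that nothing FPGA-specific is
needed.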
>does it get dynamically re-directed? What is the max # of FPGAs
>for a single board and does each board cross communicate?
Given that a single FPGA is 100x (one hundred times) faster at pairwise
sequence alignment than the host CPU, how many FPGAs do you think you
need? Similarly, other codes see anywhere from 10x to 100x the
performance of a single node. The cost for this performance is
excellent, and as volumes increase, the cost gets even better.
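Back-of-the-envelope arithmetic, with the per-node rate as a purely
illustrative assumption:

  # Sketch: rough throughput arithmetic under the quoted speedups.
  # All numbers below are assumptions, not measurements.
  cpu_rate = 1_000       # assumed alignments/sec for one CPU node
  fpga_speedup = 100     # the 100x figure quoted above
  cluster_nodes = 32     # assumed cluster size

  fpgas_needed = cluster_nodes * cpu_rate / (fpga_speedup * cpu_rate)
  print(f"{fpgas_needed:.2f} FPGAs match a {cluster_nodes}-node cluster")
  # -> 0.32: a single board already outruns this hypothetical cluster.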
The correct model for looking at FPGAs or any other coprocessor in
machines is to consider them to be just like a video card. Most people
on this list (or using a cluster) would have no trouble dealing with a
defunct video card, and they are not going to pull off the GPU if it
fails. Moreover, the video card is 10-100x faster at rendering OpenGL
graphics than the host processor. Again, the cost for this performance
is excellent.
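In code, the attached-processor model is just dispatch with a CPU
fallback; a hypothetical sketch (fpga_align stands in for a vendor
driver call, and nothing here names a real API):

  # Sketch: attached-processor dispatch, analogous to GPU rendering
  # with a software-rasterizer fallback.
  def cpu_align(seq_a, seq_b):
      # Trivial stand-in for a real CPU alignment implementation.
      return sum(a == b for a, b in zip(seq_a, seq_b))

  def align(seq_a, seq_b, fpga_align=None):
      if fpga_align is not None:
          try:
              return fpga_align(seq_a, seq_b)  # fast path: coprocessor
          except RuntimeError:
              pass                             # card failed: fall back
      return cpu_align(seq_a, seq_b)           # host CPU always works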
Even more importantly, this model, the attached processor, is well
established in the PC/server industry. We (HPC and cluster users
collectively) have attached processors of all sorts: graphics, RAID,
SCSI, SATA, network, low-latency network (Infinipath, InfiniBand,
Myrinet), ... and we have a long history of leveraging the
highest-performing and best price/performance systems. I don't see
attached processing replacing clusters or HPC systems; I see it
augmenting them, in much the same way that RAID cards augment IO and
graphics cards augment displays. The people who need and care about
performance will likely be interested in these systems.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
----- End forwarded message -----
--
Eugen* Leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE