[Beowulf] RE: [Bioclusters] FPGA in bioinformatics clusters (again?)

Mon Jan 16 15:43:42 PST 2006

But BLAST is only a small part, and arguably the easiest part, of
genomics work. The advantages of parallelization and/or SMP come into
play when attempting to assemble the genome. Phred/Phrap can do the work
but starts to slow even large machines when you're talking 50k+
sequences (which it wants to be in one folder). A quiz for the Unix
geeks out there: what happens when a folder has 50,000 files in it? Can
you say SLOOOOOOOOOWWWW?
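
One common workaround for very large flat directories is to spread the
files over subdirectories keyed by a hash of the file name. The sketch
below only illustrates that idea: the names shard_path and
shard_directory are hypothetical, the two-hex-digit prefix (256
subdirectories, so roughly 200 files each for 50,000 files) is an
arbitrary choice, and whether a given assembler can be pointed at such a
layout is not something this sketch addresses.

```python
import hashlib
import os
import shutil


def shard_path(root, filename, width=2):
    """Return root/<prefix>/filename, where <prefix> is the first
    `width` hex digits of the MD5 of the file name (hypothetical scheme)."""
    prefix = hashlib.md5(filename.encode("utf-8")).hexdigest()[:width]
    return os.path.join(root, prefix, filename)


def shard_directory(src, dst, width=2):
    """Move every regular file in `src` into a hashed subdirectory of `dst`.

    Returns the number of files moved.
    """
    # Collect the names first so the directory is not modified while
    # it is being scanned.
    with os.scandir(src) as entries:
        names = [entry.name for entry in entries if entry.is_file()]

    moved = 0
    for name in names:
        target = shard_path(dst, name, width)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(os.path.join(src, name), target)
        moved += 1
    return moved
```

Because the prefix is derived from the file name alone, any file can be
found again later by calling shard_path with the same name, without
listing a directory.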

Mike Davis

Lukasz Salwinski wrote:

> Michael Will wrote:
>
>> I have always been amazed at the promises of massivelyparallel. Now
>> their technique is so good they don't even need the source code to
>> parallelize?
>>
>> ...but if I tell you how, I would have to kill you...
>>
>> Michael Will 
>
>
> uh.. just a quick comment on bioinformatics and parallelizing things...
>
> please note that most bioinformatics problems are already
> embarrassingly parallel and, with new genomes showing up at an
> amazing rate, getting more and more so. Thus, in most cases, it just
> doesn't make much sense to parallelize anything - if one's got to
> run 300x4000 blasts against a library of 300x4000 sequences (i.e. 300
> genomes, 4000 genes/proteins, all vs all) the simplest solution -
> a lot of nodes, blast optimized for a single cpu and a decent queuing
> system - will ultimately win (as long as one stays within the same
> architecture; FPGAs are a different story ;o)
>
> lukasz
>
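
The "lots of nodes plus a queuing system" approach described above
amounts to cutting the query set into independent chunks and submitting
each chunk as its own single-CPU job. A minimal sketch of the chunking
arithmetic follows; the function name chunk_ranges and the chunk size of
1000 queries per job are assumptions for illustration, and the actual
job submission depends on whatever queuing system is in use.

```python
def chunk_ranges(total, chunk_size):
    """Yield half-open (start, end) index ranges covering 0..total."""
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)


# 300 genomes x 4000 genes/proteins, as in the example above.
total_queries = 300 * 4000

# Each range is one independent job: BLAST queries [start, end)
# against the full library, with no communication between jobs.
jobs = list(chunk_ranges(total_queries, 1000))
```

With these numbers the 1,200,000 queries become 1,200 jobs that can run
in any order on any free node.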