[Beowulf] RE: [Bioclusters] FPGA in bioinformatics clusters (again?)
ctierney at hypermall.net
Mon Jan 16 16:29:18 PST 2006
Mike Davis wrote:
> But BLAST is only a small part, and arguably the easiest part, of
> genomics work. The advantages of parallelization and/or SMP come into
> play when attempting to assemble the genome. Phred/Phrap can do the
> work but starts to slow even large machines when you're talking 50k+
> sequences (which it wants to be in one folder). A quiz for the Unix
> geeks out there: what happens when a folder has 50,000 files in it?
> Can you say SLOOOOOOOOOWWWW?
First, pick the right filesystem.
Second, rewrite your code so you don't have 50k+ files in one directory.
There must be some straightforward way to solve the problem if
you have too many files in one directory.
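One possible approach, as a minimal sketch only: spread the files over subdirectories keyed by a hash of the file name, so no single directory grows large. The names `shard_path` and `store` and the file name pattern are hypothetical, and this assumes the code that reads the files can be changed to use the same lookup; whether Phred/Phrap can be made to work with such a layout is not something this sketch addresses.

```python
import hashlib
from pathlib import Path


def shard_path(root, filename, levels=1, width=2):
    """Map filename to root/<hash prefix>/filename (hypothetical layout).

    With levels=1 and width=2 there are 256 possible subdirectories,
    so 50,000 files average roughly 195 files per directory.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of `width` hex characters each.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, filename)


def store(root, filename, data):
    """Write data (bytes) under the sharded path, creating directories as needed."""
    path = shard_path(root, filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

Because the subdirectory is computed from the file name alone, a reader finds a file by calling `shard_path` with the same arguments, without scanning any directory.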
> Mike Davis
> Lukasz Salwinski wrote:
>> Michael Will wrote:
>>> I have always been amazed at the promises of massively parallel. Now
>>> technique is so good they don't even need the source code to
>>> ...but if I tell you how I would have to kill you...
>>> Michael Will
>> uh.. just a quick comment on bioinformatics and parallelizing things...
>> please note, that most of the bioinformatic problems are already
>> embarrassingly parallel and, with the new genomes showing up at an
>> amazing rate, getting more and more so. Thus, in most cases, it just
>> doesn't make much sense to parallelize anything - if one's got to
>> run 300x4000 blasts against a library of 300x4000 sequences (i.e. 300
>> genomes, 4000 genes/proteins, all vs all) the simplest solution -
>> a lot of nodes, blast optimized for a single cpu and a decent queuing
>> system will ultimately win (as long as one stays within the same
>> architecture; FPGAs are a different story ;o)
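To make the "embarrassingly parallel" point concrete, here is a rough sketch of cutting the 300x4000 query set into independent batches, one per queue job. The batch size and the use of integer IDs in place of real sequence identifiers are assumptions for illustration; the actual blast invocation and queue submission are left out on purpose.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# 300 genomes x 4000 genes/proteins = 1,200,000 query sequences.
n_queries = 300 * 4000

# Hypothetical choice: 4000 queries per job, i.e. one job per genome.
jobs = list(batches(range(n_queries), 4000))

# Each entry in `jobs` needs no data from any other entry, so each can
# be handed to the queuing system as a separate single-cpu job.
print(len(jobs))  # 300
```

No job waits on another, so throughput scales with the number of nodes the queue can keep busy.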