[Beowulf] RE: [Bioclusters] FPGA in bioinformatics clusters (again?)

Joe Landman landman at scalableinformatics.com
Mon Jan 16 17:03:14 PST 2006


Hi Craig:

Craig Tierney wrote:
> Mike Davis wrote:
> 
>> But BLAST is only a small part and arguably the easiest part of 
>> genomics work. The advantages of parallelization and/or SMP come into 
>> play when attempting to assemble the genome. Phred/Phrap can do the 
>> work but starts to slow even large machines when you're talking 50k+ 
>> sequences (which it wants to be in one folder). A quiz for the Unix 
>> geeks out there: what happens when a folder has 50,000 files in it? 
>> Can you say SLOOOOOOOOOWWWW?
>>
> First, pick the right filesystem.
> Second, rewrite your code so you don't have 50k+ files in one directory.
> There must be some straightforward way to solve the problem if
> you have too many files in one directory.

Lots of the informatics codes were not written with such input (or 
database) scaling in mind.  For them, 10-100 files in a directory isn't 
much of a problem.  It's when you start to scale up that the bugs and 
surprises start.
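
One common way to avoid a single 50k+ file directory, roughly what Craig's second suggestion points at, is to spread the files over hashed subdirectories. The following is only a minimal sketch of that idea, not anything Phred/Phrap does itself. The function names (`shard_dir`, `shard_directory`), the two-level layout, and the choice of MD5 are assumptions made for illustration, and a code that insists on one flat folder would still need to be changed (or given symlinks) to read such a layout.

```python
import hashlib
import os
import shutil


def shard_dir(name, levels=2, width=2):
    """Return a relative subdirectory (e.g. 'ab/cd') derived from a hash of name.

    Hypothetical helper: MD5 is used only to spread names evenly,
    not for any security purpose.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(*parts)


def shard_directory(src, dst, levels=2, width=2):
    """Move every regular file in src into hashed subdirectories under dst.

    Returns the number of files moved. dst is assumed to lie outside src.
    """
    # Snapshot the names first so the directory is not modified mid-scan.
    with os.scandir(src) as entries:
        names = [entry.name for entry in entries if entry.is_file()]

    for name in names:
        target_dir = os.path.join(dst, shard_dir(name, levels, width))
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(os.path.join(src, name), os.path.join(target_dir, name))
    return len(names)
```

With two levels of two hex characters each there are 65,536 possible leaf directories, so 50,000 files end up at about one file per directory on average. A file is found again by recomputing `shard_dir(name)` from its name, so no separate index is needed.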


Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615



