[Beowulf] RE: [Bioclusters] FPGA in bioinformatics clusters (again?)

bella at carolina.rr.com
Mon Jan 16 17:36:56 PST 2006


Mike Davis wrote:

> But BLAST is only a small part and arguably the easiest part of 
> genomics work. The advantages of parallelization and/or SMP come into 
> play when attempting to assemble the genome. Phred/Phrap can do the 
> work but starts to slow even large machines when you're talking 50k+ 
> sequences (which it wants to be in one folder). A quiz for the Unix 
> geeks out there: what happens when a folder has 50,000 files in it? 
> Can you say SLOOOOOOOOOWWWW?
>
> Mike Davis
>
Sorry... I just couldn't let this one go by.  And no offense meant to 
anyone but...

Many times I have found users and application folks making inordinately 
and (in my opinion) unacceptably large numbers of files in 
sub-directories on one of "my" UNIX or Linux boxes. 

I simply gently take them aside and have a little "prayer meetin'" with 
them.  There is always a way to fix this kind of problem by consulting 
with the applications folks, and helping them see a better way.  That's 
why God made "mkdir(2)".
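
For what it's worth, here is roughly the kind of thing I mean: a minimal sketch, not a drop-in fix. It spreads the files from one flat directory across a couple of hundred sub-directories, picking the bucket from a checksum of the file name. The function names (shard_name, shard_files) and both directory arguments are made up for illustration, and it assumes the application can be taught to look in the bucketed layout afterwards.

```python
import os
import shutil
import zlib


def shard_name(filename, width=2):
    """Return a short, stable bucket name derived from the file name."""
    checksum = zlib.crc32(filename.encode("utf-8"))
    # First `width` hex digits of the CRC: width=2 gives up to 256 buckets.
    return format(checksum, "08x")[:width]


def shard_files(flat_dir, sharded_root, width=2):
    """Move every regular file in flat_dir into sharded_root/<bucket>/."""
    # Collect the names first so we are not moving files mid-scan.
    with os.scandir(flat_dir) as entries:
        names = [entry.name for entry in entries if entry.is_file()]

    moved = 0
    for name in names:
        bucket = os.path.join(sharded_root, shard_name(name, width))
        os.makedirs(bucket, exist_ok=True)
        shutil.move(os.path.join(flat_dir, name), os.path.join(bucket, name))
        moved += 1
    return moved
```

With width=2, 50,000 files come out at a couple of hundred per sub-directory on average, and any file can be found again by recomputing shard_name() on its name.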

In my opinion, if this "Phred/Phrap" thingy (about which I KNOW NOTHING 
- all disclaimers apply) _absolutely_  requires one to place 50,000 (or 
more) files in a single sub-directory... and therefore is slow... the 
application is simply broken.  Contact the developers, or get the 
source... and we'll go fix it.

My 1 & 1/2 cents worth.

Arthur Bell
Senior UNIX/Linux System Administrator



