Diskless for Bioinformatics?

William Pearson wrp at virginia.edu
Fri Jun 22 13:17:49 PDT 2001

> From: Brian LaMere <blamere at diversa.com>
> To: "Beowulf List (E-mail)" <beowulf at beowulf.org>
> Subject: diskless clients?  beowulf-newbie seeks advice
> Date: Fri, 22 Jun 2001 10:55:08 -0700
> why does every guide around talk about diskless clients?  I mean...disks are
> stinkin cheap nowadays...
> I have ~$150,000 to make a test cluster (with WAY more if the test cluster
> shows worth) but the boss-man wants to go with nodes which aren't exactly
> "commodity" in my book.  dual p3-1000 with 1.25Gb ram, 15krpm 18Gb drives.
> The things cost $8k+ each...tried to explain that 148 $1k machines would way
> out perform 16 $8k machines, but...oh well.  These boxes take up 1u, which
> seems to be their main selling point (HP's lp1000r).  Fortunately, these
> boxes are down to $6.5k now in cost (dropped a bit since we bought them a
> couple months back), but still...

There is a lot to be said for dual-processor machines for bioinformatics.
Many of the applications, like HMMER, FASTA, and perhaps BLAST, run out of
cache (the inner loops fit in cache, so the two CPUs rarely contend for
memory), so you really get a 2X speedup on 2 CPUs.  We purchased SuperMicro
1U dual PIII machines with 18 GB SCSI (10K rpm) disks and 512 MB RAM for
about $3K each about 4 months ago.

> Just a little genetics research firm, needing some serious horsepower to
> start running big hammer and blast jobs.  The data we have now is just the
> bare minimum we need to get by, but if we had things like a working beowulf
> the scientists upstairs would start making, since they'd be able to use it,
> much more data.

I think you want disks.  They make it easier to debug a node separately,
and for your BLAST applications (which will not run in parallel; you must
run many separate instances) each node can read the database from its own
local disk, so all 16, 32, or 64 CPUs can load it independently.
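
To make "many separate instances" concrete, here is a minimal sketch of one
way to divide the work: deal the query sequences out round-robin into one
chunk per CPU, then run an independent BLAST process on each chunk against
the node's local copy of the database.  The function name split_fasta and
the round-robin scheme are illustrative assumptions, not something taken
from BLAST itself, and the step that actually launches the jobs is left out
because it depends on the site.

```python
from typing import List


def split_fasta(text: str, n_chunks: int) -> List[str]:
    """Deal FASTA records round-robin into n_chunks strings.

    Hypothetical helper for illustration: each returned string is a
    valid FASTA fragment that one independent search job could use as
    its query file.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Collect records; a record starts at a '>' header line.
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.startswith(">"):
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif current and line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")

    # Round-robin assignment keeps chunk sizes within one record of
    # each other (by count, not by sequence length).
    chunks: List[List[str]] = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return ["".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    demo = ">a\nMKV\n>b\nGGH\nLLK\n>c\nPPQ\n"
    for n, part in enumerate(split_fasta(demo, 2)):
        print(f"--- chunk {n} ---")
        print(part, end="")
```

Each chunk would be written to its own file and handed to one search
process.  Because the jobs share nothing but the read-only database,
a copy on every node's local disk lets them all start without queuing on
a single file server.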

Bill Pearson
