Parallel BLAST
Steve Gaudet
SGaudet at turbotekcomputer.com
Mon Apr 15 11:11:55 PDT 2002
> -----Original Message-----
> From: William R. Pearson [mailto:wrp at alpha0.bioch.virginia.edu]
> Sent: Sunday, April 14, 2002 10:32 PM
> To: beowulf at beowulf.org
> Subject: Parallel BLAST
>
>
>
> > Why is it that BLAST is not available for MPI/PVM? I would think
> > clusters would be the perfect host for such an application.
> > Is it that there is no need because BLAST is already so fast and
> > no one wants to break the database out onto node-resident disks?
> > Or is it that BLAST is kept running on single-processor or
> > shared-memory machines so that the DB is always in memory, ready to
> > roll without loading, and doing the same for a cluster is not worth
> > it because the same trick is difficult to do on a node given the
> > current way clusters are built? I assume the same is true for FASTA?
>
> I suspect that BLAST is not available for MPI/PVM because (1) it is
> too fast, and (2) there is not much demand for it.
>
> 95% of the time, BLAST is almost an in-memory grep (the other 5% of
> the time it is working on the things it is looking for). Sequence
> comparison is embarrassingly parallel, and very easily threaded.
> Distributing the sequence databases and collecting results has more
> overhead (there probably aren't many distributed grep programs
> either). FASTA is 5-10X slower than BLAST, and Smith-Waterman is
> another 5-20X slower than FASTA. Here, the communications overhead is
> low, and distributed systems work OK for FASTA, and great for
> Smith-Waterman (where the overhead fraction is very small).
>
> Of course, it is a lot easier to compile a threaded program, and just
> run it, than it is to install and configure the MPI or PVM environment
> and the programs to run in it. Bioinformatics software is often run
> by computer savvy biologists, not high-performance computing folks,
> and not having to install and configure PVM/MPI is a big advantage.
> The NCBI probably does not make a PVM/MPI parallel BLAST because there
> is very little demand for it, and it does not meet their computational
> needs.
--------------
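To put rough numbers on the overhead argument quoted above, here is a minimal sketch of a fixed-overhead speedup model. The relative search times use the low ends of the quoted ranges (FASTA at 5X BLAST, Smith-Waterman at 5X FASTA). The overhead value and node count are pure assumptions picked for illustration, not measurements, and the function and constant names are hypothetical.

```python
def speedup(compute_time, overhead, nodes):
    """Speedup when compute divides evenly across nodes and the
    distribution/collection overhead is a fixed cost per search."""
    return compute_time / (compute_time / nodes + overhead)

# Relative single-CPU search times, in units of one BLAST search.
# Low ends of the quoted ranges: FASTA = 5x BLAST, Smith-Waterman = 5x FASTA.
RELATIVE_TIME = {"BLAST": 1.0, "FASTA": 5.0, "Smith-Waterman": 25.0}

OVERHEAD = 0.5  # ASSUMPTION: fixed overhead equal to half a BLAST search
NODES = 16      # ASSUMPTION: cluster size

def speedup_table(overhead=OVERHEAD, nodes=NODES):
    """Return {program: modelled speedup} for the relative times above."""
    return {name: speedup(t, overhead, nodes)
            for name, t in RELATIVE_TIME.items()}

if __name__ == "__main__":
    for name, s in speedup_table().items():
        print(f"{name:15s} speedup on {NODES} nodes: {s:.1f}x")
```

Under these assumed numbers the same fixed overhead swallows most of the gain for BLAST while Smith-Waterman keeps most of it, which is the shape of the argument above. Real overhead depends on how the database is distributed and will not be a constant.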
There's also a commercial version from Turbogenomics.
http://www.turbogenomics.com
Offering:
1) Ready to go, plug-n-play solution for parallel BLAST
2) Expertise and 20+ years of experience in parallel computing
3) Dynamic database splitting feature to take advantage of computers that
have less memory than the size of the database
4) Smart load balancing - achieve linear to superlinear speedup
5) No modification made to the NCBI BLAST algorithm to ensure identical
results with the non-parallel version
6) Easy drop-in update whenever NCBI releases newer versions of their
algorithm
7) Excellent support
8) 30-day money-back guarantee
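For readers wondering what database splitting (item 3) involves in general, here is a minimal generic sketch. It is not the Turbogenomics implementation, whose details are not described here. It assumes the database is already parsed into (name, sequence) pairs, and the helper names chunks_needed and split_database are hypothetical. The greedy rule (longest sequence first, always into the lightest chunk) is one common balancing heuristic, chosen for illustration.

```python
import heapq

def chunks_needed(db_bytes, node_memory_bytes):
    """Smallest number of chunks such that each fits in one node's memory
    (ceiling division; ignores per-chunk index overhead)."""
    return -(-db_bytes // node_memory_bytes)

def split_database(records, n_chunks):
    """Partition (name, sequence) records into n_chunks lists with roughly
    equal total residue counts.

    Greedy heuristic: take sequences longest first and always add the next
    one to the chunk that currently holds the fewest residues.
    """
    chunks = [[] for _ in range(n_chunks)]
    # Min-heap of (residues_so_far, chunk_index).
    heap = [(0, i) for i in range(n_chunks)]
    heapq.heapify(heap)
    for name, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        load, i = heapq.heappop(heap)
        chunks[i].append((name, seq))
        heapq.heappush(heap, (load + len(seq), i))
    return chunks
```

Each chunk would then be searched on its own node and the hit lists merged. Note that BLAST E-values depend on the size of the database searched, so a real tool has to correct for the chunk size to reproduce the results of a single unsplit search.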
Cheers,
Steve Gaudet
Linux Solutions Engineer
.....
<(©¿©)>
===================================================================
| Turbotek Computer Corp. tel:603-666-3062 ext. 21 |
| 8025 South Willow St. fax:603-666-4519 |
| Building 2, Unit 105 toll free:800-573-5393 |
| Manchester, NH 03103 e-mail:sgaudet at turbotekcomputer.com |
| web: http://www.turbotekcomputer.com |
===================================================================