Sequence analysis (blast/fasta/hmmer) on Beowulfs
William Pearson
wrp at virginia.edu
Sat Aug 11 11:12:44 PDT 2001
The fasta package (ftp.virginia.edu/pub/fasta/fasta3.shar.Z) provides
versions of virtually all of its programs (including SSEARCH for
Smith-Waterman) for either the PVM or MPI environment, so I would
expect the MPI versions to work on a Scyld/Beowulf cluster. Earlier
versions of the programs required the sequence databases be visible to
the worker nodes, but with the current version of the PVM/MPI parallel
programs, only the manager/host process needs to have access to the
databases.
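
The manager/worker arrangement described above can be sketched roughly as follows. This is an illustration only, not code from the fasta package: it uses Python threads as a stand-in for PVM/MPI worker nodes, the scoring function is a toy shared k-mer count rather than the FASTA or Smith-Waterman algorithm, and every function name is hypothetical. The point is that only the manager parses the database text, and workers receive sequence records as data.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Manager-side parsing: return a list of (name, sequence) records."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def toy_score(query, subject, k=3):
    """Toy similarity (assumption): count of distinct k-mers in both sequences."""
    def kmers(s):
        return {s[i:i + k] for i in range(len(s) - k + 1)}
    return len(kmers(query) & kmers(subject))


def worker(query, batch):
    """Worker: score the records it was handed; it never opens the database."""
    return [(name, toy_score(query, seq)) for name, seq in batch]


def manager_search(query, db_text, n_workers=3):
    """Manager: read the database once, deal records out, merge the scores."""
    records = parse_fasta(db_text)
    batches = [records[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(worker, [query] * n_workers, batches)
    hits = [hit for batch in results for hit in batch]
    # Best score first; ties broken by name so the order is deterministic.
    return sorted(hits, key=lambda h: (-h[1], h[0]))
```

Calling manager_search("ACGTAC", db_text) with db_text holding FASTA-formatted records returns (name, score) pairs sorted by score. In a real PVM/MPI program the batches would travel as messages to other nodes instead of being passed to threads.
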
We have just upgraded our Linux cluster to RedHat 7.1, and the MPI
versions of the programs no longer work on more than two nodes (they
work on 2 nodes just fine, but with 3 or more, MPI does not start up
properly).
Somewhere on the WWW is a reference to a parallel implementation of
BLAST. We worked on this several years ago, but I do not believe there
is a generally available PVM/MPI implementation of the current BLAST
version. Parallel versions of BLAST use threads on shared-memory
machines, and there are Perl scripts that automatically send out
individual sequences to individual machines in a cluster and collect the
results. This might be a bit of a challenge under Scyld, because the
databases would have to be visible on the cluster/node machines.
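
The scatter-and-collect pattern those scripts follow can be sketched as below. This is a hypothetical outline in Python, not one of the Perl scripts mentioned: it only plans which query goes to which node and merges the returned results in the original order. The remote execution step (running BLAST on each node against a locally visible database) is deliberately left out, and the node names and result values are placeholders.

```python
def assign_queries(query_ids, nodes):
    """Round-robin plan: query i is assigned to node i mod len(nodes)."""
    if not nodes:
        raise ValueError("need at least one node")
    plan = {node: [] for node in nodes}
    for i, qid in enumerate(query_ids):
        plan[nodes[i % len(nodes)]].append(qid)
    return plan


def collect(results_by_node, query_ids):
    """Merge per-node {query_id: result} dicts back into original query order."""
    merged = {}
    for node_results in results_by_node.values():
        merged.update(node_results)
    missing = [q for q in query_ids if q not in merged]
    if missing:
        raise KeyError(f"no result returned for: {missing}")
    return [(q, merged[q]) for q in query_ids]
```

Each node would run its assigned queries independently, which is why every node needs its own view of the databases in this scheme.
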
There is also a PVM implementation of HMMER available from Sean Eddy's
group, I believe.
Bill Pearson