[Bioclusters] Questions on mpiBLAST
Aaron Darling
darling at cs.wisc.edu
Thu Feb 3 15:39:11 EST 2005
I'd like to make a brief addendum to Jason's excellent reply...
Jason Gans wrote:
> Hello,
>
> There are a number of reasons for the results you show below.
>
> 1) Load balancing.
>
> The latest version of mpiBLAST uses a master node and
> a scheduler node. Hence if you run mpiBLAST on 16 nodes, only 14 worker
> nodes will be performing the actual BLAST search (i.e. the heavy
> lifting).
>
> If you format your database into 16 fragments, 12 worker nodes will be
> assigned 1 fragment each and 2 worker nodes will get 2 fragments. This
> is fine for a large query (and may actually improve load balancing) but
> for a small query the nodes that must search 2 fragments will be the
> rate limiting step in your calculation.
>
> You're better off formatting your database into 14 fragments (so that
> every worker node searches a single fragment).
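To make the arithmetic above concrete, here is a small sketch of how fragments spread over workers. It assumes a plain even (round-robin) split, which is only an illustration and not mpiBLAST's actual scheduling code; the function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Worker processes left over when one process is the master and one is
// the scheduler, as described for the mpiBLAST version discussed above.
int worker_count(int total_processes) {
    return std::max(0, total_processes - 2);
}

// Spread `fragments` as evenly as possible over `workers` and return how
// many fragments each worker searches. Illustrative even split only.
std::vector<int> fragments_per_worker(int fragments, int workers) {
    assert(fragments >= 0 && workers > 0);
    std::vector<int> load(workers, 0);
    for (int f = 0; f < fragments; ++f) {
        ++load[f % workers];
    }
    return load;
}

// For a small query the busiest worker bounds the total run time.
int max_fragments_on_one_worker(int fragments, int workers) {
    const std::vector<int> load = fragments_per_worker(fragments, workers);
    return *std::max_element(load.begin(), load.end());
}
```

With 16 processes and 16 fragments, two of the 14 workers search two fragments each, so a small query takes roughly two fragment-searches of wall time; with 14 fragments every worker searches exactly one.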
The "scheduler" process performs almost no work, so to really optimize
performance on a 16 node cluster one could try formatting the database
into 15 fragments and running 17 processes. Of course, care must be
taken that the node which runs two processes is running the scheduler
and either a worker or the writer (output) process. The best way I can
think of to achieve this would be adding the following at line 178 of
mpiblast.cpp (version 1.3.0):
scheduler_process = node_count - 1;
That will set the last MPI process to be the scheduler. AFAIK, mpich
(and possibly other MPI implementations) will wrap around to the first
node when assigning processes beyond the number of nodes given in the
mpich configuration. The net result is that the scheduler process
and the writer process end up on the same cluster node.
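The wrap-around can be checked on paper with a few lines of code. This is only a sketch: it assumes ranks are placed on nodes in simple round-robin order (rank r on node r modulo the number of nodes) and that the writer is rank 0. The names node_for_rank, ranks_sharing_node, process_total and node_total are invented for the example and do not appear in mpiblast.cpp; check how your MPI launcher actually places processes before relying on it.

```cpp
#include <cassert>
#include <vector>

// Assumed placement rule: rank r runs on node (r % node_total), so ranks
// beyond the end of the node list wrap around to the first node.
int node_for_rank(int rank, int node_total) {
    assert(rank >= 0 && node_total > 0);
    return rank % node_total;
}

// All other ranks that land on the same node as `rank` when
// `process_total` processes are started on `node_total` nodes.
std::vector<int> ranks_sharing_node(int rank, int process_total, int node_total) {
    std::vector<int> shared;
    const int node = node_for_rank(rank, node_total);
    for (int r = 0; r < process_total; ++r) {
        if (r != rank && node_for_rank(r, node_total) == node) {
            shared.push_back(r);
        }
    }
    return shared;
}
```

Under those assumptions, 17 processes on 16 nodes with the scheduler moved to the last rank (16) put the scheduler on node 0 alongside rank 0 only, and every worker keeps a node to itself.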
>
> 2) Run time depends not just on the length of the query, but on the
> sequence composition of the query as well.
>
> A query sequence that is "similar" to a large number of database
> sequences will take longer to search than a query sequence that is
> "similar" to only a small number of database sequences.
>
> The reason for this is two-fold: (a) The BLAST algorithm only fully
> aligns two sequences if it first identifies identical sub-sequences of
> length W or greater. (b) The time that mpiBLAST spends formatting the
> BLAST output is proportional to the number of database entries that
> match the query (not the query length).
>
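Point (a) can be illustrated with a toy check for a shared exact word of length W. This is a deliberately simplified sketch, not the lookup-table code NCBI BLAST actually uses, and the function name is hypothetical; it only shows why a query that shares words with many database sequences triggers many more full alignments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Returns true if `query` and `subject` share at least one identical
// substring of length w. Only such pairs would go on to full alignment.
bool shares_exact_word(const std::string& query,
                       const std::string& subject,
                       std::size_t w) {
    assert(w > 0);
    if (query.size() < w || subject.size() < w) {
        return false;
    }
    // Index every length-w word of the query.
    std::unordered_set<std::string> words;
    for (std::size_t i = 0; i + w <= query.size(); ++i) {
        words.insert(query.substr(i, w));
    }
    // Scan the subject for any word that is also in the query.
    for (std::size_t j = 0; j + w <= subject.size(); ++j) {
        if (words.count(subject.substr(j, w)) > 0) {
            return true;
        }
    }
    return false;
}
```

Counting how many database sequences pass this test for a given query gives a rough feel for how much alignment and output-formatting work that query will cause.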
One additional factor that can significantly impact the run time is the
length of DB sequences that your queries hit. By default, versions
1.2.x and 1.3.0 of mpiBLAST transmit the *entire* database sequence over
the wire, not just the portion of the sequence used in the resulting
alignment. Nucleotide databases like nt or the human chromosome DB
contain sequences several MB in length, which can result in LOTS of
network traffic. Long sequences are not usually a problem with protein
sequence databases. Fortunately, a workaround exists for blastn
searches. The command-line option --disable-mpi-db will prevent workers
from transmitting sequences over the network. Instead, the writer
process reads only the necessary parts of the sequence from the database
on shared storage (i.e. it reads a small amount of data from NFS instead
of a large amount of data from worker nodes).
Summary: to get good performance, always use --disable-mpi-db when
performing blastn searches on databases with large sequence entries like
nt and human chromosomes.
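A back-of-the-envelope comparison shows why this matters. The sketch below simply adds up how many residues would be shipped if whole database sequences are sent versus how many the alignments actually cover. The Hit struct and function names are invented for this illustration and are not mpiBLAST data structures; plug in lengths from your own results.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One database hit, described only by the lengths that matter here.
struct Hit {
    std::uint64_t subject_length;  // full length of the database sequence
    std::uint64_t aligned_length;  // part of it covered by the alignment
};

// Residues moved when each hit's entire database sequence is transmitted.
std::uint64_t residues_sent_whole(const std::vector<Hit>& hits) {
    std::uint64_t total = 0;
    for (const Hit& h : hits) {
        total += h.subject_length;
    }
    return total;
}

// Residues actually needed to print the alignments.
std::uint64_t residues_needed(const std::vector<Hit>& hits) {
    std::uint64_t total = 0;
    for (const Hit& h : hits) {
        assert(h.aligned_length <= h.subject_length);
        total += h.aligned_length;
    }
    return total;
}
```

For example, two hits of 400 and 250 bases against sequences of 5 MB and 3 MB mean about 8,000,000 residues on the wire for 650 residues of useful alignment.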
A nice feature for a future mpiBLAST release would be workers
transmitting only the aligned portion of the bioseq to the writer
instead of the entire bioseq...
-Aaron