[Bioclusters] multiple inputs to MPIBLAST

Lik Mui lmui at stanford.edu
Wed Mar 9 16:27:10 EST 2005


Hello, I tried to feed multiple inputs to mpiblast (all in a single FASTA
file).  I found that when the number of inputs is > 15, mpiblast's
performance deteriorates greatly.  For example, using a single head node, I
get blastall output in about 20 seconds.  When I feed 20 input sequences
to MPIBLAST on a 24-node cluster, the result takes 3 minutes to come
back.  This is hardly super-linear.

I am running on a 24-node Platform ROCKS cluster with MPICH 1.2.6 and the
latest MPIBLAST, 1.3.0.

Can anyone explain why this happens, or how to keep MPIBLAST from slowing
down with multiple inputs?

Thanks in advance.

           Lik Mui


p.s. because my genome db is about 1 GB, it seems to make sense to process a
batch of inputs together with a single read of the db.  Hence, I am running
multiple input files.  If this is not correct reasoning, please comment.
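
To narrow down where the slowdown starts, one option is to split the
multi-sequence FASTA file into batches of a chosen size and time each batch
separately.  The following is only a minimal sketch in Python, not part of
mpiblast: the function names read_fasta_records and batch_records are
hypothetical helpers, and it assumes plain FASTA input where every record
starts with a ">" header line.

```python
from typing import Iterable, Iterator, List


def read_fasta_records(text: str) -> Iterator[str]:
    """Yield each FASTA record (header plus sequence lines) as one string.

    Hypothetical helper: assumes every record begins with a '>' header line.
    """
    record: List[str] = []
    for line in text.splitlines():
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "\n".join(record) + "\n"
            record = []
        if line.strip():  # skip blank lines
            record.append(line)
    if record:
        yield "\n".join(record) + "\n"


def batch_records(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group records into lists of at most batch_size records each."""
    batch: List[str] = []
    for rec in records:
        batch.append(rec)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch
```

Each batch can then be written to its own query file and run on its own, so
that timings for, say, 5, 10, 15 and 20 inputs can be compared against the
same database.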
