[Bioclusters] Questions on mpiBLAST

Xiaowu Gai xgai at genome.chop.edu
Thu Feb 3 11:45:45 EST 2005

Hi Everyone:

We have a 16-node Xserve cluster, with 2GB memory on each node and dual
processors.  I was able to install mpiBLAST on it, along with LAM/MPI.
However, the performance I saw in some test runs was not very good, and the
results were quite confusing.  Here is what I did:

1.) I formatted the nt database:

mpiformatdb -N 16 -i nt

2.) I ran mpiblast on one, two, five, ten, twenty, and more sequences
(about 500bp each) with the command:

time mpirun N mpiblast -p blastn -d nt -i single.fa -o blast_results.

Here are the numbers:

Single: 1m39.054s
Two: 0m11.009s
Five: 0m16.021s
Ten: 0m46.591s
Twenty: 3m7.541s
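
To put these numbers on a common footing, here is a minimal Python sketch that converts the timings listed above to seconds and divides by the number of query sequences. It was not part of the test runs; the names `timings`, `to_seconds` and `per_query` are made up for illustration, and the only data it uses are the five values above.

```python
# Timings copied from the list above, keyed by number of query sequences.
timings = {
    1: "1m39.054s",
    2: "0m11.009s",
    5: "0m16.021s",
    10: "0m46.591s",
    20: "3m7.541s",
}


def to_seconds(stamp):
    """Convert a `time`-style string such as '1m39.054s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def per_query(timings):
    """Return {n_queries: (total_seconds, seconds_per_query)}."""
    result = {}
    for n, stamp in timings.items():
        total = to_seconds(stamp)
        result[n] = (total, total / n)
    return result


if __name__ == "__main__":
    for n, (total, each) in per_query(timings).items():
        print(f"{n:>2} queries: {total:8.3f} s total, {each:7.3f} s per query")
```

Per query, this works out to roughly 99.1 s, 5.5 s, 3.2 s, 4.7 s and 9.4 s for one, two, five, ten and twenty sequences respectively.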

I am quite confused.  First of all, the performance is not that impressive.
Secondly, the numbers do not make sense to me.  Why does a single-sequence
query take so much longer than a two-sequence query (BTW, I reran the
single-sequence query right after the two-sequence query and got similar
results)?  And the query of five takes only 5 seconds more than the query of
two, and so on.

I am afraid that I have done something wrong and would really appreciate any



