[Bioclusters] split database with blast

Daniel Xavier de Sousa danielucg at yahoo.com.br
Tue Nov 21 18:41:20 EST 2006

Hi all,


I need some help with parallel BLAST, and I would be grateful if anyone could help me.


I have been working with parallel BLAST using a split database.


With WU-BLAST I have no problem searching just one part of the database
while keeping the statistics correct: using the DBRECMIN and DBRECMAX
parameters, I run BLAST on a virtual split of the database. Each job
searches only its piece of the full database, and the e-values come out right.


But I really want to make everything work with NCBI BLAST. I know about the
mpiBLAST solution and about using a file with a list of GI numbers, but
neither is ideal: the first requires changing the BLAST source code, and
the second requires GI numbers in the FASTA identifiers.


So, my questions are:

1) Does anybody know another way to run BLAST on a split database without
changing the e-values relative to a run against the whole database?


2) If not, is the difference between the e-values obtained with the whole
database and with part of the database (using the -z and -Y parameters of
NCBI BLAST) important?
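
The dependence of the e-value on database size can be illustrated with the Karlin-Altschul relation E = m * n * 2^(-S'), where m is the query length, n is the database length in residues, and S' is the bit score. The sketch below is a simplified model, not the actual NCBI BLAST code: it leaves out the effective-length (edge) corrections that BLAST applies, and the lengths, the bit score, and the `db_len_override` argument are hypothetical, chosen only for illustration.

```python
def expected_value(query_len, db_len, bit_score, db_len_override=None):
    """Simplified Karlin-Altschul estimate: E = m * n * 2**(-S').

    db_len_override stands in for telling BLAST to use a different
    database length than the one it actually searched.
    No edge (effective-length) corrections are applied.
    """
    n = db_len if db_len_override is None else db_len_override
    return query_len * n * 2.0 ** (-bit_score)


# Hypothetical numbers, for illustration only.
QUERY_LEN = 300                  # residues in the query
FULL_DB_LEN = 1_000_000_000      # residues in the whole database
HALF_DB_LEN = FULL_DB_LEN // 2   # residues in one of two equal pieces
BIT_SCORE = 250.0                # bit score of one hit

e_full = expected_value(QUERY_LEN, FULL_DB_LEN, BIT_SCORE)
e_half_uncorrected = expected_value(QUERY_LEN, HALF_DB_LEN, BIT_SCORE)
e_half_corrected = expected_value(
    QUERY_LEN, HALF_DB_LEN, BIT_SCORE, db_len_override=FULL_DB_LEN
)

print(f"whole database:            {e_full:.3g}")
print(f"half database, no override: {e_half_uncorrected:.3g}")
print(f"half database, overridden:  {e_half_corrected:.3g}")
```

In this simplified model, searching half of the database halves the e-value, and overriding the database length with the full size restores the whole-database value exactly. Any residual difference in a real run would presumably come from the length corrections that the model leaves out.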


For example, I ran one query sequence against the whole database and against just part of the database using the -z parameter. The results were:


Query      Subject    E-value (NR)    E-value (NR/2, using -z)
SeqQuery   Seq1DB     4e-66           6e-66
SeqQuery   Seq2DB     4e-38           5e-38


Is this difference relevant?
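
One way to put the numbers above in perspective is to express each pair as a ratio and as a shift in orders of magnitude. The sketch below uses only the four e-values quoted in the table; the variable and function names are illustrative.

```python
import math

# (subject, e-value against whole NR, e-value against NR/2 with -z),
# copied from the table above.
results = [
    ("Seq1DB", 4e-66, 6e-66),
    ("Seq2DB", 4e-38, 5e-38),
]


def compare(e_whole, e_split):
    """Return the ratio split/whole and the same ratio on a log10 scale."""
    ratio = e_split / e_whole
    return ratio, math.log10(ratio)


for subject, e_whole, e_split in results:
    ratio, log_shift = compare(e_whole, e_split)
    print(f"{subject}: ratio = {ratio:.2f}, "
          f"shift = {log_shift:.2f} orders of magnitude")
```

The ratios are 1.5 and 1.25, so both shifts are well under one order of magnitude, and the ranking of the two hits is the same in both runs.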



Daniel Xavier - PUC - Rio de Janeiro - Brazil
