Re: [Bioclusters] split database with blast
Daniel Xavier de Sousa
danielucg at yahoo.com.br
Mon Nov 27 16:41:59 EST 2006
Hi Lucas
Thank you so much for your opinion.
I will test dBlast, but I think (I read the paper) its solution is the same as mpiBLAST's, that is, a patch to NCBI BLAST.
Well, I'm studying this issue, so if you have anything else, please tell me.
So I will wait for opinions from other users of the list.
Bye
Daniel Xavier
----- Original Message ----
From: Lucas Carey <lcarey at odd.bio.sunysb.edu>
To: HPC in Bioinformatics <bioclusters at bioinformatics.org>
Sent: Monday, November 27, 2006 14:05:47
Subject: Re: [Bioclusters] split database with blast
Hi Daniel,
It is non-trivial to get correct effective database sizes with NCBI BLAST, as it involves processing both query sequences and database sequences. Your best bet is to use a package that can split the databases and return correct e-values. mpiBLAST is one, but dBlast is another if for some unfathomable reason you don't like mpiBLAST.
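To illustrate why the database size matters, here is a minimal sketch of the Karlin-Altschul relation E = K * m * n * exp(-lambda * S). The K and lambda values, the score, and the sequence lengths are all illustrative assumptions, and the sketch deliberately ignores the length adjustment that BLAST computes from the actual query and database sequences, which is likely why an override such as -z gets close to, but does not exactly reproduce, the whole-database values.

```python
import math

# Assumed, illustrative statistical parameters (not taken from this thread).
LAMBDA = 0.267
K = 0.041


def evalue(raw_score, query_len, fragment_len, z=None):
    """Simplified E = K * m * n * exp(-lambda * S).

    fragment_len is the size of the database piece actually searched.
    z, if given, overrides it with the size of the whole database,
    mimicking what an effective-database-length override is meant to do.
    No edge-effect (length adjustment) correction is applied here.
    """
    db_len = z if z is not None else fragment_len
    return K * query_len * db_len * math.exp(-LAMBDA * raw_score)


full_db = 1_000_000_000   # hypothetical residues in the whole database
half_db = full_db // 2    # one of two equal fragments
score, qlen = 600, 300    # hypothetical raw score and query length

e_full = evalue(score, qlen, full_db)
e_half_uncorrected = evalue(score, qlen, half_db)
e_half_corrected = evalue(score, qlen, half_db, z=full_db)

print(f"whole database:        {e_full:.2e}")
print(f"half, no override:     {e_half_uncorrected:.2e}")
print(f"half, size overridden: {e_half_corrected:.2e}")
```

In this simplified model, searching half of the database without the override halves the e-value for the same hit, while supplying the full size restores it.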
However, depending on what you're doing, e-value differences may not matter. In my personal opinion, there is no difference between e-36 and e-40, so the differences you are talking about are negligible.
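One way to put numbers on this is to compare e-values on a log10 scale. The short sketch below uses the two pairs of values from the table in the original question (quoted further down); the function name and the dictionary layout are just for illustration.

```python
import math

# E-values from the table in the original question:
# (whole NR, NR/2 with the -z parameter).
pairs = {
    "Seq1DB": (4e-66, 6e-66),
    "Seq2DB": (4e-38, 5e-38),
}


def log10_gap(e_whole, e_split):
    """Absolute difference between two e-values, in orders of magnitude."""
    return abs(math.log10(e_split) - math.log10(e_whole))


gaps = {hit: log10_gap(*values) for hit, values in pairs.items()}
for hit, gap in gaps.items():
    print(f"{hit}: {gap:.2f} orders of magnitude apart")
```

Both gaps are a small fraction of one order of magnitude, far less than the four orders separating e-36 from e-40.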
-Lucas
On Tuesday, November 21, 2006 at 15:41 -0800, Daniel Xavier de Sousa wrote:
>
>
> Hi all,
>
> I need some help with parallel BLAST. I will be happy if anyone can help me.
> I have worked with parallel BLAST using a split database.
>
> I don't have a problem running on part of the database and getting correct statistics when
> I use WUBLAST, because I use the DBRECMAX and DBRECMIN parameters and
> run BLAST on a virtually split database, taking just one piece of the whole
> database, and the e-values come out right.
>
> But I really want to make everything work in NCBI BLAST. I know the mpiBLAST solution
> and the GI number list file. But these solutions aren't so good:
> the first because the BLAST source has to be changed, and the second
> because it requires that you use GI numbers in the FASTA identifier.
>
> So, my question is:
>
> 1) Does anybody
> know another solution for running BLAST on a split database
> without changing the e-values relative to a run on the whole database?
>
> If not, is the difference between the e-value with the whole database and with part of the database (using the -z and -Y parameters of NCBI BLAST) very important?
>
> For example, I processed one sequence against the whole database and against just part of the database, using the -z parameter; the result was:
>
> Query     Subject   e-value (NR)   e-value (NR/2, using -z)
> SeqQuery  Seq1DB    4e-66          6e-66
> SeqQuery  Seq2DB    4e-38          5e-38
>
> Is this difference relevant?
> Thanks,
>
> Daniel Xavier - PUC - Rio de Janeiro - Brazil
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters