[BiO BB] All-against-all protein sequence comparison

Iddo Friedberg idoerg at burnham.org
Thu Dec 16 16:47:15 EST 2004


Use the NCBI toolkit, and write a script around bl2seq for the
all-vs-all comparison.
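
A minimal sketch of such a wrapper, in Python. It assumes the legacy
NCBI toolkit's bl2seq is on your PATH, that each input file passed to
bl2seq holds a single sequence (hence the splitting step), and that
"all-vs-all" means every protein of genome A against every protein of
genome B. The function names and file-naming scheme are made up for
illustration.

```python
import itertools
import subprocess
import tempfile
from pathlib import Path


def parse_fasta(text):
    """Return a list of (identifier, sequence) tuples from FASTA text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_single_fasta(records, directory, prefix):
    """Write one FASTA file per record; return the list of file paths."""
    paths = []
    for index, (name, seq) in enumerate(records):
        path = Path(directory) / f"{prefix}_{index}.fasta"
        path.write_text(f">{name}\n{seq}\n")
        paths.append(path)
    return paths


def bl2seq_command(query_path, subject_path, out_path, evalue=10.0):
    """Build a bl2seq call: blastp, tabular output (-D 1), E-value cutoff -e."""
    return [
        "bl2seq", "-p", "blastp",
        "-i", str(query_path), "-j", str(subject_path),
        "-e", str(evalue), "-D", "1", "-o", str(out_path),
    ]


def all_vs_all(fasta_a, fasta_b, out_dir):
    """Run bl2seq for every (protein in A, protein in B) pair."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        a_files = write_single_fasta(parse_fasta(Path(fasta_a).read_text()), tmp, "a")
        b_files = write_single_fasta(parse_fasta(Path(fasta_b).read_text()), tmp, "b")
        for qa, qb in itertools.product(a_files, b_files):
            out = out_dir / f"{qa.stem}_vs_{qb.stem}.tsv"
            subprocess.run(bl2seq_command(qa, qb, out), check=True)
```

Something like all_vs_all("genomeA.fasta", "genomeB.fasta", "results")
would then leave one tabular file per pair in results/. This starts one
process per pair, so it gets slow quickly as the genomes grow.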

If the genomes are really large, I would first cluster each genome 
at 90% sequence identity with CD-HIT, to remove redundancy.
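
A rough sketch of the clustering step, again in Python. It assumes the
cd-hit executable is on your PATH; the function names and file names
are hypothetical. The -c option sets the identity threshold and -n the
word size (5 is the value CD-HIT suggests for thresholds between 0.7
and 1.0).

```python
import subprocess


def cd_hit_command(in_fasta, out_fasta, identity=0.9):
    """Build a cd-hit call that clusters proteins at the given identity."""
    if not 0.7 <= identity <= 1.0:
        # Word size 5 is only appropriate for thresholds in this range.
        raise ValueError("identity must be between 0.7 and 1.0 for -n 5")
    return [
        "cd-hit",
        "-i", str(in_fasta),    # input FASTA with all proteins of one genome
        "-o", str(out_fasta),   # output FASTA with one representative per cluster
        "-c", str(identity),    # sequence identity threshold
        "-n", "5",              # word size
    ]


def cluster_genome(in_fasta, out_fasta, identity=0.9):
    """Run cd-hit and return the path of the non-redundant FASTA file."""
    subprocess.run(cd_hit_command(in_fasta, out_fasta, identity), check=True)
    return out_fasta
```

Run it once per genome, for example
cluster_genome("genomeA.fasta", "genomeA_nr90.fasta"), and feed the
non-redundant files to the all-vs-all step. CD-HIT also writes the
cluster membership to a .clstr file next to the output.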

I wouldn't use one genome as a database and the other as a query 
pool, because that would skew your BLAST statistics and give you 
false-positive hits. I would go with all-vs-all pairwise BLAST.
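
To illustrate how much the reported statistics depend on what you
search against, here is a small Python sketch using the usual
approximation E = m * n * 2^(-S'), where S' is the bit score, m the
query length and n the length of whatever is being searched. It ignores
the length corrections BLAST applies, and the lengths and score below
are invented numbers, so treat it as a rough guide only.

```python
def expected_hits(bit_score, query_length, search_space_length):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_length * search_space_length * 2.0 ** (-bit_score)


# Same alignment (same bit score), two different search spaces.
bit_score = 40.0
query_length = 300
search_spaces = (
    ("single 300-residue subject", 300),
    ("database of 3,000,000 residues", 3_000_000),
)
for label, n in search_spaces:
    e_value = expected_hits(bit_score, query_length, n)
    print(f"{label}: E ~ {e_value:.2e}")
```

The E-value scales linearly with the size of the search space, so
E-values from pairwise runs and from database searches are not directly
comparable.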

./I


Dr. Christoph Gille wrote:

>The NCBI toolkit works well.
>I can loop over all proteins in one genome
>and run BLAST against the other.
>
>
>  
>
>>Hi, All
>>
>>
>>I have been working on obtaining the BLAST E-values for all-against-all
>>protein sequences of two genomes. Is there a tool or script for this
>>function? Any suggestions will be helpful.
>>
>>Thanks,
>>
>>
>>Anne
>>_______________________________________________
>>BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
>>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>>
>>
>>    
>>


-- 

Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930
http://ffas.ljcrf.edu/~iddo



