[BiO BB] All-again-all protein sequence comparison

Ian Donaldson idonalds at blueprint.org
Fri Dec 17 09:37:46 EST 2004

Dear Anne

A pre-computed all-against-all BLAST comparison of the proteins in NCBI's nr
database is available at


These results are also available via a remote API (in Perl/Java/C/C++).

See http://www.blueprint.org/seqhound/seqhound_documentation.html
for how to get started with this API, if it meets your needs.

Best regards


-----Original Message-----
From: bio_bulletin_board-bounces at bioinformatics.org
[mailto:bio_bulletin_board-bounces at bioinformatics.org]On Behalf Of Iddo
Sent: December 16, 2004 4:47 PM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] All-again-all protein sequence comparison

Use the NCBI toolkit: write a script around bl2seq for the all-vs-all comparison.
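
A minimal sketch of such a wrapper, in Python. It assumes the legacy NCBI
toolkit's bl2seq binary is on the PATH, that its -D 1 option gives tabular
output with the E-value in the 11th column, and that "all-vs-all" here means
every protein of genome A against every protein of genome B. The function
names (read_fasta, bl2seq_command, parse_tabular, all_vs_all) are made up for
this illustration and are not part of any toolkit.

```python
import itertools
import subprocess
import tempfile
from pathlib import Path


def read_fasta(path):
    """Return a list of (identifier, sequence) tuples from a FASTA file."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                # Keep only the first word of the header as the identifier.
                name = (line[1:].split() or ["unnamed"])[0]
                chunks = []
            elif line:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def bl2seq_command(query_file, subject_file, evalue=10.0):
    """Build a bl2seq call for one protein pair (assumed flags: -p -i -j -e -D)."""
    return ["bl2seq", "-p", "blastp",
            "-i", str(query_file), "-j", str(subject_file),
            "-e", str(evalue), "-D", "1"]


def parse_tabular(text):
    """Extract (query, subject, evalue) from tabular output, skipping '#' lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) >= 12:
            hits.append((fields[0], fields[1], float(fields[10])))
    return hits


def all_vs_all(fasta_a, fasta_b, runner=subprocess.run):
    """Run bl2seq on every A-by-B protein pair and collect the reported hits."""
    genome_a, genome_b = read_fasta(fasta_a), read_fasta(fasta_b)
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        file_a, file_b = Path(tmp) / "a.fasta", Path(tmp) / "b.fasta"
        for (id_a, seq_a), (id_b, seq_b) in itertools.product(genome_a, genome_b):
            # bl2seq takes files, so each pair is written out before the call.
            file_a.write_text(f">{id_a}\n{seq_a}\n")
            file_b.write_text(f">{id_b}\n{seq_b}\n")
            done = runner(bl2seq_command(file_a, file_b),
                          capture_output=True, text=True, check=True)
            results.extend(parse_tabular(done.stdout))
    return results
```

Calling all_vs_all("genomeA.faa", "genomeB.faa") (hypothetical file names)
would return a list of (query, subject, E-value) tuples. One process per pair
is slow, so this only makes sense for small genomes or after removing
redundancy.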

If the genomes are really large, I would first cluster each genome at 90%
sequence identity with CD-HIT to remove redundancy.
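
The clustering step could be scripted roughly as follows. This is a sketch
that assumes the cd-hit executable is on the PATH and uses its -i, -o, -c and
-n options; the word size of 5 is the value CD-HIT's documentation recommends
for identity thresholds between 0.7 and 1.0. The ".nr90" output suffix and the
function names are arbitrary choices for this example.

```python
import subprocess


def cdhit_command(fasta_in, fasta_out, identity=0.9):
    """Build a cd-hit call that clusters proteins at the given identity."""
    if not 0.7 <= identity <= 1.0:
        # Word size 5 is only appropriate for thresholds in this range.
        raise ValueError("this sketch only supports identity between 0.7 and 1.0")
    return ["cd-hit", "-i", str(fasta_in), "-o", str(fasta_out),
            "-c", str(identity), "-n", "5"]


def cluster_genomes(fasta_paths, runner=subprocess.run):
    """Cluster each genome separately; return the representative-sequence files."""
    outputs = []
    for path in fasta_paths:
        out = f"{path}.nr90"  # hypothetical naming convention
        runner(cdhit_command(path, out), check=True)
        outputs.append(out)
    return outputs
```

The files returned by cluster_genomes would then be the input for the pairwise
comparison, in place of the full genomes.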

I wouldn't go with the strategy of having one genome as a database and
another as a query pool, because that would skew your BLAST statistics
and give you false-positive hits. I would go with the all-vs-all pairwise
comparison.


Dr. Christoph Gille wrote:

>The NCBI toolkit works well.
>I can loop over all proteins in one genome
>and run BLAST against the other.
>>Hi, All
>>I have been working on obtaining the BLAST E-values for an all-against-all
>>comparison of the protein sequences of two genomes. Is there a tool or
>>script for this? Any suggestions will be helpful.
>>BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org


Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930
