[Biodevelopers] Blast not symmetrical?
Malay
mbasu at mail.nih.gov
Thu Jan 18 12:15:44 EST 2007
Michael Nuhn wrote:
> Hello, Everybody!
>
> While I was trying to track down a "bug" in my program I found out that the
> blast program (Blastn v2.2.11) is not symmetrical, that is:
>
> If I blast a query sequence Q against a database S (1 sequence), I get a
> result set B(S,Q).
>
> If I do the blast the other way around, that is, I use S as query sequence
> and blast it against the database Q, I get a result B(Q,S).
>
> And the problem is: B(S,Q) and B(Q,S) are not equal. Each blast set has some
> blast hits that the other does not have and also some blast hits that have
> one common coordinate but end at another.
>
> Both blasts were made with the blast defaults, no filter was used. The two
> sequences are large (~2Mb each, the sequences are genomes). According to the
> statistics used in blast (at least the part I understand), it should not
> play a role which sequence is the query and which is the subject.
>
> Does anyone have an explanation for this? Since I don't really have a clue
> at where to start, hints and wild guesses are also appreciated.
>
> Thanks in advance,
> Michael.
>
> _______________________________________________
> Biodevelopers mailing list
> Biodevelopers at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biodevelopers
This is a very well known phenomenon. You can read more about it here:
http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
It seems counter-intuitive because people confuse the BLAST bit score
with the raw alignment score. The raw alignment score will be the same
for both alignments, no matter which sequence is the query. The BLAST
bit score, however, is then calculated from the raw alignment score,
taking into account the background distribution of the amino acids of
the query sequence. Because the two query sequences differ in
composition, the bit scores will be different (and so will the E-values,
which are calculated from the bit scores). I guess that's an
over-simplified explanation.
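
To make that concrete, here is a minimal Python sketch of the conversions
given in the tutorial linked above: bit score S' = (lambda*S - ln K) / ln 2
and E = m * n * 2^(-S'). The function names and the lambda/K values below are
made up for illustration only; they do not come from a real BLAST run. The
values BLAST actually used are printed in the statistics at the end of its
output.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits: m * n * 2^(-bit score)."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    raw = 100  # the same raw alignment score in both directions

    # Hypothetical (lambda, K) pairs standing in for the two search directions.
    params_q_as_query = (0.30, 0.10)
    params_s_as_query = (0.32, 0.13)

    for label, (lam, k) in [("Q as query", params_q_as_query),
                            ("S as query", params_s_as_query)]:
        bits = bit_score(raw, lam, k)
        # Both sequences are assumed to be about 2 Mb, as in the question.
        print(label, "bits = %.2f" % bits,
              "E = %.3g" % e_value(bits, 2000000, 2000000))
```

With identical parameters the two directions would give identical bit
scores, so any difference in this sketch comes only from lambda and K.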
--
Malay K Basu
www.malaybasu.net