[Bioclusters] Re: Help on BLAST

Chris Dwan (CCGB) bioclusters@bioinformatics.org
Tue, 27 Aug 2002 08:08:09 -0500 (CDT)


> 	e-value = { m n 2^(-bit_score) }
> 
> I think you forgot a minus. This is the equation found in the blast
> tutorial.

You are exactly correct.  I apologize for any confusion caused by my typo.
Yours is the correct formula.  Seriously, thank you for limiting the
spread of my error.
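
For the archive, here is a minimal Python sketch of the corrected formula.
The function name and the sample numbers are made up for illustration, and
I am assuming the tutorial's usual reading of m as the query length and n
as the database length.

```python
def evalue_from_bits(bit_score, m, n):
    """E-value from a bit score: E = m * n * 2^(-bit_score).

    m and n are assumed to be the query length and the database length;
    neither is defined in this thread.
    """
    return m * n * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Illustrative values only: a 32 x 32 search space and a bit score of 10.
    print(evalue_from_bits(10, 32, 32))
```

Note the sign: dropping the minus makes the E-value grow with the bit
score, which is backwards.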

> Recalculating the scores like this does create significant rounding
> errors, mostly because of the number of significant digits in the blast
> output.

How significant is "significant" for your purposes? In my experience with 
recombining BLAST reports, the errors introduced were rarely greater than
an order of magnitude.  Further, the erroneous values seem to be monotonic
with the originals, i.e., the list stays in the same order.
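
A rough Python sketch of why the order survives, assuming E-values are
recomputed as m * n * 2^(-bit_score) from the bit scores printed in the
report.  The hit names, scores, and lengths are hypothetical, and this
models only the rounding of the printed bit score, not any other source
of difference between reports.

```python
def evalue_from_bits(bit_score, m, n):
    """E = m * n * 2^(-bit_score), with m and n assumed to be lengths."""
    return m * n * 2.0 ** (-bit_score)


def worst_case_factor(step):
    """Largest multiplicative E-value error when the bit score has been
    rounded to the nearest `step` (the score is off by at most step / 2)."""
    return 2.0 ** (step / 2.0)


# Hypothetical hits: (name, bit score as printed in the report).
hits = [("hitA", 250.0), ("hitB", 61.2), ("hitC", 35.8)]
m, n_combined = 400, 5_000_000  # made-up query and combined database lengths

recalculated = [(name, evalue_from_bits(bits, m, n_combined))
                for name, bits in hits]

# Best hit first: ascending E-value versus descending bit score.
by_evalue = [name for name, _ in sorted(recalculated, key=lambda t: t[1])]
by_bits = [name for name, _ in sorted(hits, key=lambda t: -t[1])]

if __name__ == "__main__":
    print(by_evalue == by_bits)    # same ranking either way
    print(worst_case_factor(1.0))  # bit score rounded to an integer
    print(worst_case_factor(0.1))  # bit score rounded to one decimal place
```

For a fixed m and n the recomputed E-value is a strictly decreasing
function of the bit score, so sorting by either one gives the same list.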

E-values on individual hits vary more than an order of magnitude when I
update BLAST reports that are more than a year out of date against the
public repositories.  They also vary slightly depending on the bit-width
(32 vs. 64) of the architecture I'm using.

I don't lose too much sleep trying to pass the "diff" test with my BLAST
reports.  It's a heuristic algorithm anyway.  I think that the time is
more effectively spent giving my users access to another search
methodology.  

-Chris