[Bioclusters] Re: Help on BLAST

Wim Glassee bioclusters@bioinformatics.org
Wed, 28 Aug 2002 11:34:37 +0200

> > Recalculating the scores like this does create significant rounding
> > errors, mostly because of the number of significant digits in the
> > output.
> How significant is "significant" for your purposes? In my experience
> recombining BLAST reports, the errors introduced were rarely greater
> than an order of magnitude.  Further, the erroneous values seem to be
> consistent with the originals:  i.e., the list is in the same order.
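
To put a rough number on the rounding effect discussed above: assuming the
textbook Karlin-Altschul relation E = m * n * 2^(-S'), with S' the bit score,
m the query length and n the database length (effective-length corrections
ignored), a bit score printed with limited precision shifts the recomputed
E-value by a bounded factor. A minimal Python sketch; the function names and
the example lengths are made up for illustration:

```python
def evalue_from_bits(bit_score, query_len, db_len):
    """E = m * n * 2**(-S'); simplified, no effective-length correction."""
    return query_len * db_len * 2.0 ** (-bit_score)


def rounding_factor(bit_score, query_len, db_len, digits=0):
    """Ratio of the E-value recomputed from the printed (rounded) bit score
    to the E-value computed from the unrounded bit score."""
    exact = evalue_from_bits(bit_score, query_len, db_len)
    printed = evalue_from_bits(round(bit_score, digits), query_len, db_len)
    return printed / exact


def worst_case_factor(digits=0):
    """Largest possible ratio when the bit score is rounded to `digits`
    decimals: the score is off by at most half a unit in the last place."""
    return 2.0 ** (0.5 * 10.0 ** (-digits))


if __name__ == "__main__":
    # Hypothetical hit: bit score 50.44, 1 kb query, 1 Gb database.
    print(rounding_factor(50.44, 1_000, 1_000_000_000, digits=0))
    print(worst_case_factor(0), worst_case_factor(1))
```

Under these assumptions, a bit score printed as a whole number can move a
recomputed E-value by at most a factor of about 1.41, well below an order of
magnitude. If the E-values themselves are printed with only one or two
significant digits, that adds a separate error on top.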

Point taken. I take it you use this kind of merging yourself. Have you
had a chance to think about my last question? About the differences in
-actual- results, hits and hsps. I've had differences on several
occasions. E.g. a hit residing in the middle of a 5k subquery of a 100k
query, where blasting both gives different hsps in the smaller and the
bigger blast. I'd really like a second opinion on this.

> E-values on individual hits vary more than an order of magnitude when I
> update BLAST reports that are more than a year out of date against the
> public repositories.  They also vary slightly depending on the word size
> (32 vs. 64) of the architecture I'm using.

Very interesting. Could the first case you state have anything to do
with the fact that the actual database is probably a lot bigger now than
when you first did your blast, which would naturally give a different
E-value? The last case seems unavoidable.
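
As a rough illustration of the database-size effect: in the simplified
Karlin-Altschul relation E = m * n * 2^(-S'), the E-value of a hit with an
unchanged bit score scales linearly with the database length n. The sketch
below assumes that simplified relation (no effective-length or
composition-based corrections); the function name and the database sizes are
hypothetical:

```python
def rescale_evalue(evalue, old_db_len, new_db_len):
    """Rescale an E-value reported against a database of old_db_len residues
    to a database of new_db_len residues, assuming the same bit score."""
    if old_db_len <= 0 or new_db_len <= 0:
        raise ValueError("database lengths must be positive")
    return evalue * (new_db_len / old_db_len)


if __name__ == "__main__":
    # Hypothetical: the database grew tenfold since the original search.
    print(rescale_evalue(1e-5, 1_000_000_000, 10_000_000_000))
```

In this simplified picture, a tenfold growth of the database raises the
E-value of the same alignment by exactly one order of magnitude, so a shift
of more than that would need more than a tenfold growth or a change in the
alignment itself.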


PS Pretty productive mailing list!

> I don't lose too much sleep trying to pass the "diff" test with my
> reports.  It's a heuristic algorithm anyway.  I think that the time is
> more effectively spent giving my users access to another search
> methodology.
> -Chris
