> > Recalculating the scores like this does create significant rounding
> > errors, mostly because of the number of significant digits in the BLAST
> > output.
>
> How significant is "significant" for your purposes? In my experience with
> recombining BLAST reports, the errors introduced were rarely greater than
> an order of magnitude. Further, the erroneous values seem to be monotonic
> with the originals: i.e., the list is in the same order.

Point taken. I take it you use this kind of merging yourself. Have you had a
chance to think about my last question, the one about differences in -actual-
results (hits and HSPs)? I've had differences on several occasions, e.g. a hit
residing in the middle of a 5k subquery of a 100k query, where blasting both
gives different HSPs in the smaller and the bigger BLAST. I'd really like a
second opinion on this.

> E-values on individual hits vary more than an order of magnitude when I
> update BLAST reports that are more than a year out of date against the
> public repositories. They also vary slightly depending on the bit-width
> (32 vs. 64) of the architecture I'm using.

Very interesting. Could the first case you mention be explained by the actual
database probably being a lot bigger now than when you first ran your BLAST?
That would naturally give a different E-value. The last case seems
unavoidable.

Wim

PS Pretty productive mailing list!

> I don't lose too much sleep trying to pass the "diff" test with my BLAST
> reports. It's a heuristic algorithm anyway. I think that the time is
> more effectively spent giving my users access to another search
> methodology.
>
> -Chris
>
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
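
To put rough numbers on the two effects discussed above (database growth and limited digits in the report), here is a minimal Python sketch. It assumes the standard Karlin-Altschul relation E = m * n * 2^(-S') for a bit score S', query length m and database length n, and it ignores the effective-length (edge-effect) corrections that BLAST applies, so the results are only approximate. The function names and the example lengths are hypothetical and not part of BLAST or any library.

```python
def evalue_from_bit_score(bit_score, query_len, db_len):
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    Assumption: no effective-length correction is applied.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def rescale_evalue(evalue, old_db_len, new_db_len):
    """Rescale an E-value to a different database size.

    Under the relation above, E is linear in the database length n,
    so a database that has grown k-fold gives a k-fold larger E-value
    for the same alignment.
    """
    return evalue * (new_db_len / old_db_len)


if __name__ == "__main__":
    # Hypothetical example: the database quadrupled since the old report.
    old_e = 1e-10
    print(rescale_evalue(old_e, old_db_len=1e9, new_db_len=4e9))

    # Rounding effect: a bit score printed as 50.0 could lie anywhere in
    # [49.95, 50.05), so the recomputed E-value is uncertain by a factor
    # of 2**0.1 (about 1.07) from the printed digits alone.
    hi = evalue_from_bit_score(49.95, query_len=1000, db_len=1e9)
    lo = evalue_from_bit_score(50.05, query_len=1000, db_len=1e9)
    print(hi / lo)
```

Under these assumptions, database growth alone needs a tenfold size increase to shift an E-value by an order of magnitude, while rounding of a bit score reported to one decimal place contributes only a few percent.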