[Bioclusters] sensitivity & blast
Chris Dwan
cdwan at bioteam.net
Wed Apr 6 14:24:54 EDT 2005
> Could you suggest whether we are on the right track? What is the right
> approach to set a uniform sensitivity for all inputs?
E-values already incorporate statistics to eliminate (normalize for) a
number of factors, including query size. Getting rid of that
normalization is possible, but not necessarily a good idea unless you
know exactly what you're doing.
E-values for identical HSPs grow in proportion to the product of the
sizes of the query and the target set. The rationale is that the same
hit becomes more likely to occur by random chance in a larger sample of
sequence, so the same HSP becomes less statistically interesting as the
query and the target set grow.
This leads to your observation that you must increase the E-value
threshold to keep getting the same hits.
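
To make the scaling concrete, here is a rough sketch in Python using the
standard Karlin-Altschul form E = K * m * n * exp(-lambda * S). The
function names and the K and lambda values are my own illustrative
assumptions, not anything taken from your runs; real values depend on
the scoring matrix and gap penalties, and BLAST also applies
effective-length corrections that this ignores.

```python
import math

# Assumed, illustrative Karlin-Altschul parameters. Real values depend on
# the scoring system and are printed in the BLAST report footer.
LAMBDA = 0.267
K = 0.041


def expected_hits(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Approximate E-value: K * m * n * exp(-lambda * S).

    Ignores the effective-length (edge effect) corrections BLAST applies.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def rescaled_threshold(threshold, old_query_len, old_db_len,
                       new_query_len, new_db_len):
    """Threshold that admits the same raw scores after the search space changes.

    Hypothetical helper: scales the cutoff by the ratio of search spaces.
    """
    old_space = old_query_len * old_db_len
    new_space = new_query_len * new_db_len
    return threshold * new_space / old_space


if __name__ == "__main__":
    # The same HSP (fixed raw score, fixed query) against growing targets.
    for db_len in (1e6, 1e7, 1e8):
        e_value = expected_hits(100, 300, db_len)
        print(f"db_len={db_len:.0e}  E={e_value:.3g}")
```

Under this approximation, multiplying the target size by ten multiplies
the E-value of an unchanged HSP by ten, which is why a fixed threshold
stops reporting it.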
The question you seem to be asking is "find me all of the HSPs that fit
some criterion, regardless of their statistical significance." The
question that BLAST is designed to answer is "find me most of the
statistically significant HSPs for some particular search, and extend
them to build up gapped local alignments."
If you're willing to share your goal in running these searches, the
list might be able to suggest alternative tools better suited to your
problem.
-Chris Dwan
The BioTeam