I would like to point out that the E-value is NOT a probability, and it can
exceed 1 for non-significant alignments. Here's a quote from NCBI's BLAST
pages:

"The Expect value (E) is a parameter that describes the number of hits one
can "expect" to see just by chance when searching a database of a particular
size. It decreases exponentially with the Score (S) that is assigned to a
match between two sequences. Essentially, the E value describes the random
background noise that exists for matches between sequences. For example, an
E value of 1 assigned to a hit can be interpreted as meaning that in a
database of the current size one might expect to see 1 match with a similar
score simply by chance."

Regards,
Os

Quoting Martin Heusel <mheusel at gmail.com>:

> Hi Noel,
>
> yepp i meant '-e' sorry. For the number of sequences returned i never
> get all sequences of a database. Even for a small database with 500
> sequences only around 300 are given back. The e-value was set to
> 1000000. What i recently learned is that blastpgp only makes
> approximations for computing the e-values for speed up reasons. It
> computes only the first Taylor term of a Taylor approx. of exp() of
>
> E = K m n exp(-lambda score)
>
> which only makes sense for not too small scores. So my assumption is
> that for too small scores and lambdas the approx. gives way too high
> e-values exceeding the -e threshold. By definition an e-value is a
> probability and should not go beyond 1.
>
> Regards
>
> Martin
>
> On 2/25/07, Noel Faux <Noel.Faux at med.monash.edu.au> wrote:
> > Hi Martin,
> >
> > This was probably a typo, but, I think you need '-e' not '-E' to set the
> > e-value cutoff for the returned results. When I wanted all results I
> > set -b to the size of the subject database and -e 100000. The e-value
> > never reached that, so PSI-BLAST returned all results.
> >
> > Cheers
> > Noel
> >
> > Martin Heusel wrote:
> > > Hi,
> > >
> > > i'm wondering if it's possible to get all sequences of a large
> > > database ranked by E-value or score from PSI-BLAST with a query.
> > > Normally PSI-BLAST stops outputting after a couple of sequences even
> > > if one sets the output parameters -b or -E to very high values. Is it
> > > possible in general or are there computational limits (time etc.) in
> > > figuring out the right scores or E-values when it comes to the many
> > > sequences with very low identity? Thanks for any advice.
> > >
> > > Martin
> > >
> >
> > _______________________________________________
> > Biodevelopers mailing list
> > Biodevelopers at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/biodevelopers
> >
>
> --
> + Neural Processing Group Technical University Berlin
> + http://ni.cs.tu-berlin.de
> + Institute of Bioinformatics Johannes Kepler University Linz
> + http://www.bioinf.jku.at/
>
> + In the beginning was the WORD, and the WORD was UNSIGNED,
> + and the main(){} was without form and void
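
P.S. To make the difference between an E-value and a probability concrete,
here is a minimal Python sketch of the formula from the thread,
E = K m n exp(-lambda score). It is only an illustration, not what blastpgp
does internally. The values of K and lambda, the query length m, and the
database length n are assumptions chosen for the example, and the conversion
P = 1 - exp(-E) is the usual Karlin-Altschul relation, which is not quoted
anywhere in the thread.

```python
import math


def expect_value(score, m, n, K=0.041, lam=0.267):
    """Expected number of chance hits with at least this score.

    Implements E = K * m * n * exp(-lambda * score).
    K and lam are illustrative assumptions, not values read from BLAST.
    """
    return K * m * n * math.exp(-lam * score)


def p_value(e_value):
    """Probability of seeing at least one chance hit: P = 1 - exp(-E).

    expm1 keeps precision when E is very small.
    """
    return -math.expm1(-e_value)


# Hypothetical search: a 250-residue query against roughly 500 sequences
# of 300 residues each.
m = 250
n = 150_000

for score in (20, 40, 80):
    e = expect_value(score, m, n)
    print(f"score={score:3d}  E={e:.3g}  P={p_value(e):.3g}")
```

Under these assumptions E grows far beyond 1 for low scores, while P stays
between 0 and 1. For very small E the two are nearly equal, which may be
where the idea that the E-value is a probability comes from.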