[Bioclusters] sensitivity & blast
Pamela Culpepper
pculpep at hotmail.com
Wed Apr 6 17:27:49 EDT 2005
Chris,
You might be interested in what we are working on --
http://www.lifeformulae.com
Pam
>From: Chris Dwan <cdwan at bioteam.net>
>Reply-To: "Clustering, compute farming & distributed computing in life
>science informatics" <bioclusters at bioinformatics.org>
>To: "Clustering, compute farming & distributed computing in life science
>informatics" <bioclusters at bioinformatics.org>
>Subject: Re: [Bioclusters] sensitivity & blast
>Date: Wed, 6 Apr 2005 16:58:36 -0400
>
>
>BLAST is not a black box, and its function need not be determined by
>experiment:
>
>- An excellent reference on the algorithm:
>http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
>- The source code: ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/ncbi.tar.Z
>- O'Reilly published an entire book on BLAST, whose author is active on
>this list.
>
>Yes, the search space defaults to the product of the query length (m) and
>the target set length (n). The -Y option overrides that search space.
>
>Alignment Score depends only on the alignments and the substitution matrix.
>Bit score normalizes for values specific to the substitution matrix.
>Expect value normalizes out query and target set size.
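
[Editor's sketch] The relationship between the three quantities above can be written out using the standard Karlin-Altschul formulas: bit score S' = (lambda*S - ln K) / ln 2, and E = (search space) * 2^(-S'). The Python below is a minimal illustration, not BLAST's actual code. The function names are hypothetical, the lambda and K values are placeholders to be replaced with the ones printed in your own BLAST report, and the default search space is the simple product m*n described in this message (no length corrections are applied).

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score using the matrix-specific lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def search_space(query_len, db_len, override=None):
    """Product of query length (m) and database length (n), unless overridden.

    `override` plays the role of a fixed search space such as the one
    given with -Y; it is an illustrative parameter, not a BLAST API.
    """
    return override if override is not None else query_len * db_len


def expect_value(bits, space):
    """Expected number of chance hits with at least this bit score."""
    return space * 2.0 ** (-bits)


if __name__ == "__main__":
    lam, k = 0.267, 0.041      # placeholder values; read yours from the BLAST report
    bits = bit_score(100, lam, k)
    db_len = 50_000_000        # hypothetical database size in residues

    for query_len in (200, 2000):
        default_e = expect_value(bits, search_space(query_len, db_len))
        fixed_e = expect_value(bits, search_space(query_len, db_len, override=12345))
        print(query_len, default_e, fixed_e)
```

With the default search space the E value grows tenfold when the query is ten times longer, while the bit score is unchanged. With a fixed search space the E value is the same for both queries.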
>
>Keep in mind as well: BLAST is a heuristic algorithm with no knowledge of
>any structure beyond primary sequence. If increased sensitivity is the
>goal, you will get much greater mileage by using an algorithm that takes
>structure into account, or one that uses more than pairwise
>alignments.
>
>However, taken very literally, your answer is correct. If the goal is to
>remove query length as a factor in E value, the "-Y" option is the way to
>go.
>
>-Chris Dwan
> The BioTeam
>
>On Apr 6, 2005, at 4:39 PM, Pamela Culpepper wrote:
>
>>orks as follows.
>>In the absence of -Y, the "effective search space" is the product of the
>>query sequence length
>>and the total database length. It affects the calculation of the
>>expectation value but not the score.
>>It will thus vary with the query sequence length.
>>Using "-Y 12345" sets the above "effective search space" to 12345,
>>constant for each query
>>sequence. To make the
>
>_______________________________________________
>Bioclusters maillist - Bioclusters at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters