-------- Original Message --------
Subject: [blast-announce] [BLAST-Announce #053] BLAST 2.2.13 released
Date: Tue, 13 Dec 2005 10:53:34 -0500

Notes for the 2.2.13 release

Standalone BLAST 2.2.13 is now available from the BLAST download page.
Major changes include:

* New engine now available in blastall
* Statistical parameter change
* Bug fixes

New engine available in blastall

Blastall now has support for a new version of the BLAST engine that can be enabled by adding "-V F" to the blastall command-line. This option will probably be the default in future versions. There are a few situations where it is very advantageous to use the new engine:

1. Large word-sizes with a BLASTN search. The new engine uses the "stride" idea of AGBLAST and this can lead to a considerable speedup for large word sizes. For a run of a typical mRNA sequence (u00001) with a word size of 25 the new code runs about twice as fast as the old code. Note that the AG "stride" has been available in megablast since the 2.2.10 release. This enhancement is platform-independent.

2. Searching multiple queries at once. The new engine will search multiple queries by scanning the database once, rather than once for each query. The speedup will depend upon the queries being searched and what part of the time is spent scanning the databases vs. actual computations (e.g., extensions etc.). Typically this feature is most important if a number of short queries (e.g., mRNA's or EST's) are being searched with blastn or if a tblastn search is performed. This feature is partially supported in the old code with the -B option as well as by megablast.

3. For very large queries. The memory management (especially during the dynamic programming phase) has been improved and this may allow searches with lots of matches or large queries that used to fail to now run to completion.
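As an illustration of the "-V F" switch described above, here is a minimal Python sketch that only assembles a blastall command line; it does not run anything. The helper name blastall_argv, the database name "nt" and the query file "queries.fa" are hypothetical placeholders, not part of the announcement. The -p, -d, -i and -W options are the usual blastall program, database, input and word-size options.

```python
from typing import List, Optional


def blastall_argv(program: str,
                  database: str,
                  query: str,
                  word_size: Optional[int] = None,
                  new_engine: bool = True) -> List[str]:
    """Build (but do not execute) a blastall argument list.

    Hypothetical helper for illustration only.
    """
    argv = ["blastall", "-p", program, "-d", database, "-i", query]
    if word_size is not None:
        # Large word sizes are where the announcement reports the biggest gain.
        argv += ["-W", str(word_size)]
    if new_engine:
        # "-V F" enables the new engine, as stated in the release notes.
        argv += ["-V", "F"]
    return argv


if __name__ == "__main__":
    # "nt" and "queries.fa" are placeholder names.
    print(" ".join(blastall_argv("blastn", "nt", "queries.fa", word_size=25)))
    # prints: blastall -p blastn -d nt -i queries.fa -W 25 -V F
```

A query file holding several sequences would be passed the same way; according to the notes above, the new engine then scans the database once for all of them.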

Statistical parameter change

Megablast, blastall and bl2seq have until now allowed users to select arbitrary gap existence and extension penalties for a blastn type search. This has been convenient for users but has led to the unfortunate situation that searches with some parameter sets were significantly overestimating the statistical significance of matches. To address this problem the proper statistical parameters for a number of reward/penalty/gap existence/gap extension values have been calculated.

The parameters that might cause an issue here are -r (match reward), -q (mismatch penalty), -G (gap existence cost), and -E (gap extension cost). If you do not change these, then nothing will change for you.

Please email blast-help at ncbi.nlm.nih.gov with any questions, bug reports, or requests for different parameter sets.

Below are listed the supported combinations. Note that above a certain gap existence and extension penalty any value is permitted, as the statistics for ungapped searches can be used. These are marked as "ungapped threshold" below.
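Before the full list, here is a minimal Python sketch of how the acceptance rule described above might be checked for one reward/penalty pair. The values are those given in this announcement for match = 1, mismatch = -2. The function name gap_costs_supported is hypothetical. The reading of "above a certain gap existence and extension penalty" as "both G and E at or above the threshold values" is an assumption, not something the announcement spells out.

```python
from typing import List, Tuple

# (G, E) pairs copied from this announcement for match = 1, mismatch = -2.
SUPPORTED_1_2: List[Tuple[int, int]] = [(1, 2), (0, 2), (3, 1), (2, 1), (1, 1)]
# The pair marked "ungapped threshold" for the same reward/penalty.
UNGAPPED_THRESHOLD_1_2: Tuple[int, int] = (2, 2)


def gap_costs_supported(gap_open: int,
                        gap_extend: int,
                        supported: List[Tuple[int, int]] = SUPPORTED_1_2,
                        threshold: Tuple[int, int] = UNGAPPED_THRESHOLD_1_2) -> bool:
    """Return True if (G, E) is listed or reaches the ungapped threshold.

    Assumption: the threshold applies when BOTH costs are at or above
    the threshold pair; this is one reading of the announcement.
    """
    if (gap_open, gap_extend) in supported:
        return True
    t_open, t_extend = threshold
    return gap_open >= t_open and gap_extend >= t_extend


if __name__ == "__main__":
    for pair in [(2, 1), (5, 2), (0, 1), (4, 1)]:
        print(pair, gap_costs_supported(*pair))
```

Under that assumption, (2, 1) is accepted because it is listed, (5, 2) because it reaches the threshold, and (0, 1) and (4, 1) are rejected.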

For match = 1, mismatch = -4 the supported combinations are:

   G    E
  --------
   1    2
   0    2
   2    1
   1    1
   2    2   (ungapped threshold)

For match = 2, mismatch = -7 the supported combinations are:

   G    E
  --------
   2    4
   0    4
   4    2
   2    2
   4    4   (ungapped threshold)

For match = 1, mismatch = -3 the supported combinations are:

   G    E
  --------
   1    2
   0    2
   2    1
   1    1
   2    2   (ungapped threshold)

For match = 2, mismatch = -5 the supported combinations are:

   G    E
  --------
   2    4
   0    4
   4    2
   2    2
   4    4   (ungapped threshold)

For match = 1, mismatch = -2 the supported combinations are:

   G    E
  --------
   1    2
   0    2
   3    1
   2    1
   1    1
   2    2   (ungapped threshold)

For match = 2, mismatch = -3 the supported combinations are:

   G    E
  --------
   4    4
   2    4
   0    4
   3    3
   6    2
   5    2
   4    2
   2    2
   6    4   (ungapped threshold)

For match = 1, mismatch = -1 the supported combinations are:

   G    E
  --------
   3    2
   2    2
   1    2
   0    2
   4    1
   3    1
   2    1
   4    2   (ungapped threshold)

For match = 5, mismatch = -4 the supported combinations are:

   G    E
  --------
  10    6
   8    6
  25   10   (ungapped threshold)

For match = 4, mismatch = -5 the supported combinations are:

   G    E
  --------
   6    5
   5    5
   4    5
   3    5
  12    8   (ungapped threshold)

Bug fixes

* A bug has been fixed in formatdb. This bug occurred when the -o option was not used, meaning that the FASTA definition lines of the input file were not parsed, and multiple database volumes were generated. The bug normally did not become apparent to the user until the BLAST run at which point the BLAST binary (e.g., blastall) would produce messages containing "ObjMgrChoice: pointer [0] type [1] not found".

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615