[BiO BB] Fwd: [blast-help] Refined nucleotide BLAST matrix

Yannick Wurm idh at poulet.org
Wed Feb 23 03:25:18 EST 2005


Here is the reference Peter mentioned, kindly pointed out by 
Wayne Matten at NCBI.


@article{States1991Improved-Sensit,
	Abstract = {Scoring matrices for nucleic acid sequence comparison that 
are based on models appropriate to the analysis of molecular sequencing 
errors or biological mutation processes are presented. In mammalian 
genomes, transition mutations occur significantly more frequently than 
transversions, and the optimal scoring of sequence alignments based on 
this substitution model differs from that derived assuming a uniform 
mutation model. The information from sequence alignments potentially 
available using an optimal scoring system is compared with that 
obtained using the BLASTN default scoring. A modified BLAST database 
search tool allows these, or other explicitly specified scoring 
matrices, to be utilized in computationally efficient queries of 
nucleic acid databases with nucleic acid query sequences. Results of 
searches performed using BLASTN's default score matrix are compared 
with those using scores based on a mutational model in which 
transitions are more prevalent than transversions.},
	Author = {David J. States and Warren Gish and Stephen F. Altschul},
	Date-Added = {2005-02-23 09:14:28 +0100},
	Date-Modified = {2005-02-23 09:15:41 +0100},
	Journal = {METHODS: A Companion to Methods in Enzymology},
	Url = {http://blast.wustl.edu/doc/ntmats.pdf},
	Month = {August},
	Number = {1},
	Pages = {66-70},
	Title = {Improved Sensitivity of Nucleic Acid Database Searches Using 
Application-Specific Scoring Matrices},
	Volume = {3},
	Year = {1991}}
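
To make the abstract's point concrete, here is a minimal Python sketch of a nucleotide scoring matrix that penalizes transitions (A<->G, C<->T) less than transversions. The score values (+5, -1, -4) and all function names are illustrative assumptions of mine, not the values or code from the paper; the text layout only loosely resembles the matrix files on the NCBI FTP site.

```python
# Sketch of a transition/transversion-aware nucleotide scoring matrix.
# ASSUMPTION: the three scores below are made up for illustration; they are
# not the values derived by States, Gish and Altschul (1991).
MATCH = 5
TRANSITION = -1     # purine<->purine or pyrimidine<->pyrimidine
TRANSVERSION = -4   # purine<->pyrimidine

BASES = "ACGT"
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_score(a, b):
    """Score one aligned pair of bases."""
    if a == b:
        return MATCH
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return TRANSITION
    return TRANSVERSION


def build_matrix():
    """Return the full 4x4 matrix as a nested dict: matrix[a][b]."""
    return {a: {b: substitution_score(a, b) for b in BASES} for a in BASES}


def format_matrix(matrix):
    """Render the matrix as text: a header row of bases, then one row per base."""
    header = "  " + " ".join(f"{b:>3}" for b in BASES)
    rows = [
        f"{a} " + " ".join(f"{matrix[a][b]:>3}" for b in BASES)
        for a in BASES
    ]
    return "\n".join([header] + rows)


def score_ungapped(seq1, seq2, matrix):
    """Sum pair scores over two equal-length, already aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(matrix[a][b] for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    m = build_matrix()
    print(format_matrix(m))
    print(score_ungapped("ACGT", "GCGA", m))
```

With a uniform mismatch penalty the A/G and T/A mismatches in the last line would cost the same; here the transition is penalized less, which is the kind of distinction the paper argues for.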


Thanks again!
-yannick

Begin forwarded message:
> From: "Matten, Wayne (NIH/NLM)"
> Date: 22 February 2005 21:23:29 GMT+01:00
> To: 'Yannick Wurm' <Yannick.Wurm at unil.ch>, 
> "'blast-help at ncbi.nlm.nih.gov'" <blast-help at ncbi.nlm.nih.gov>
> Subject: RE: [blast-help] Refined nucleotide BLAST matrix
>
> Hello,
>  
> I believe the reference that Peter mentions is this one:
>  
> http://blast.wustl.edu/doc/ntmats.pdf
>  
> Peter summed up the "hack" very well. You might need other command-line 
> options; turning off the low-complexity filter comes to mind. But you 
> can get blastp, within blastall, to run as long as you format the 
> database as a protein database and use a matrix name already in the 
> /data directory. You might also get some ideas from here:
>  
> ftp://ftp.ncbi.nlm.nih.gov/blast/matrices/
>  
> e.g., NUC4.4.
>  
>
> Best regards,
> Wayne
>
> <><><><>>><>>>>><><>>><>
> Wayne Matten
> NCBI User Services

. . . . . . . . . . . . . . . . . .
yannick.wurm at unil.ch
+41.21.692.4157
PhD student, Department of Ecology and Evolution
Université de Lausanne, Switzerland
