[Bioclusters] blast query
Michael Cariaso
bioclusters@bioinformatics.org
Wed, 15 Sep 2004 01:22:49 -0400
Che, Anney (NIH/NIAID) wrote:
> Does anyone know how to set a filter in blast to omit the replicated
> hits?
>
> Thanks, Anney
Hope this helps. I keep a text file, 'gilist.txt', in this format:
GI#1 <tab> a description of the sequence
GI#2 <tab> a description of the sequence
GI#3 <tab> a description of the sequence
When I want a filter that will only see 'mouse' sequences, I run this
command:
grep mouse gilist.txt | cut -f 1 > filter.txt
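To illustrate what that pipeline produces, here is a small sketch on sample data. The GI numbers, descriptions, and the file names sample_gilist.txt and sample_filter.txt are made up for the example; only the grep and cut step comes from the command above.

```shell
# Build a tiny sample list: made-up GI numbers, a tab, then a description.
printf '1000001\tMus musculus (house mouse) actin mRNA\n' >  sample_gilist.txt
printf '1000002\tHomo sapiens actin mRNA\n'               >> sample_gilist.txt
printf '1000003\tmouse hemoglobin alpha chain\n'          >> sample_gilist.txt

# Keep the lines that mention "mouse", then keep only column 1 (the GI).
grep mouse sample_gilist.txt | cut -f 1 > sample_filter.txt

# Show the resulting filter: one GI per line.
cat sample_filter.txt
```

This should leave 1000001 and 1000003 in sample_filter.txt. Note that grep is case-sensitive by default, so a description that says "Mouse" would be missed unless you use grep -i.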
Then run blast as you normally would, but with an extra -l switch:
blastall -p blastn -d database -i query -l filter.txt
Optionally, if you'll be using the filter several times, you may want to
make a binary version, which will allow blast to run faster. You can use
the binary file in place of 'filter.txt' for a little extra speed:
formatdb -F filter.txt -B filter.bin
followed by
blastall -p blastn -d database -i query -l filter.bin
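If the text list changes now and then, one possible convenience is to rebuild the binary list only when it is missing or older than the text list. This is a sketch, not part of blast: the function name refresh_gilist is hypothetical, and it assumes your shell's test command supports -nt (bash and dash do). The formatdb call is the same one shown above.

```shell
# Rebuild the binary GI list only when it is missing or out of date.
# refresh_gilist is a hypothetical helper name, not a blast tool.
# Usage: refresh_gilist filter.txt filter.bin
refresh_gilist() {
    txt=$1
    bin=$2
    # -nt is true when the text list is newer than the binary list.
    if [ ! -f "$bin" ] || [ "$txt" -nt "$bin" ]; then
        formatdb -F "$txt" -B "$bin"
    fi
}

# Typical use before a search:
#   refresh_gilist filter.txt filter.bin
#   blastall -p blastn -d database -i query -l filter.bin
```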
Since this is bioclusters, I'll mention that this also works very well
with mpiblast.
Michael Cariaso
Bioinformatics Developer
Bethesda, MD
cariaso at yahoo dot com